Estimates every Software Developer should know
It’s alright even if you don’t, I just wanted a catchy header. I recently got back into system design and found some things I had either overlooked or forgotten over the years. This is one of the tidbits I wanted to share and also keep for myself. PS: This is inspired by the book System Design Interview by Alex Xu.
While designing systems, you will at times need to make some “back-of-the-envelope” calculations. To do so efficiently, you need to know some of the estimates and approximations we use for space and time in the world of computing. This blog collects them.
The Power of 2
While estimating data storage we have to go back to basics. Computers store data as bits, so data sizes line up naturally with powers of 2. An ASCII character takes one byte, which is 8 bits. So, in terms of bytes, you can think of data sizes this way:
Power of 2 | Approximate Value (bytes) | Full name | Short name |
10 | 1 Thousand | 1 Kilobyte | 1KB |
20 | 1 Million | 1 Megabyte | 1MB |
30 | 1 Billion | 1 Gigabyte | 1GB |
40 | 1 Trillion | 1 Terabyte | 1TB |
50 | 1 Quadrillion | 1 Petabyte | 1PB |
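To see how rough the "approximate value" column really is, here is a minimal Python sketch that compares each exact power of 2 with its rounded decimal figure. The names `UNITS`, `exact_bytes` and `rounding_error_percent` are made up for this illustration and are not from the book.

```python
# Hypothetical helpers: compare exact powers of two with the rounded
# decimal values used for quick estimates.
UNITS = [
    (10, "KB", 10**3),   # ~1 thousand
    (20, "MB", 10**6),   # ~1 million
    (30, "GB", 10**9),   # ~1 billion
    (40, "TB", 10**12),  # ~1 trillion
    (50, "PB", 10**15),  # ~1 quadrillion
]


def exact_bytes(power: int) -> int:
    """Exact number of bytes for 2**power."""
    return 2 ** power


def rounding_error_percent(power: int, approx: int) -> float:
    """How far (in percent) the exact value sits above the rounded one."""
    return (exact_bytes(power) - approx) / approx * 100


if __name__ == "__main__":
    for power, unit, approx in UNITS:
        error = rounding_error_percent(power, approx)
        print(f"2^{power} = {exact_bytes(power):,} bytes (1 {unit}), "
              f"{error:.1f}% above {approx:,}")
```

The gap grows with each step, from about 2.4% at a kilobyte to about 12.6% at a petabyte, which is usually fine for an estimate but worth remembering.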
Latency Numbers
These estimates tell you roughly how long a typical computer operation takes, given current hardware capabilities. The numbers were compiled and shared by Dr. Jeff Dean from Google.
Operation | Latency | Notes |
L1 cache reference | 0.5 ns | |
Branch mispredict | 5 ns | |
L2 cache reference | 7 ns | 14x L1 cache |
Mutex lock/unlock | 25 ns | |
Main memory reference | 100 ns | 20x L2 cache, 200x L1 cache |
Compress 1K bytes with Zippy | 3,000 ns ~ 3 us | |
Send 1K bytes over 1 Gbps network | 10 us | |
Read 4K randomly from SSD* | 150 us | ~1GB/sec SSD |
Read 1 MB sequentially from memory | 250 us | |
Round trip within same datacenter | 500 us | |
Read 1 MB sequentially from SSD* | 1,000 us ~ 1 ms | ~1GB/sec SSD, 4X memory |
Disk seek | 10 ms | 20x datacenter roundtrip |
Read 1 MB sequentially from disk | 20 ms | 80x memory, 20X SSD |
Send packet CA->Netherlands->CA | ~150 ms | |
Pictorial Representation -
Obviously, these numbers have changed over time, thanks to Moore’s Law. You can see how hardware has advanced and how these numbers have shifted in the interactive chart here - https://colin-scott.github.io/personal_website/research/interactive_latency.html
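The latency numbers are most useful when you plug them into a quick calculation. Below is a rough Python sketch of that idea. The dictionary `LATENCY_NS` copies a few rows from the table, while the key names and the helper `sequential_read_seconds` are my own assumptions for illustration. It also assumes read time scales linearly with size, which real hardware only approximates.

```python
# A few rows from the latency table, in nanoseconds.
# Key names are hypothetical, chosen only for this sketch.
LATENCY_NS = {
    "l1_cache_ref": 0.5,
    "main_memory_ref": 100,
    "read_1mb_memory": 250_000,        # 250 us
    "datacenter_round_trip": 500_000,  # 500 us
    "read_1mb_ssd": 1_000_000,         # 1 ms
    "disk_seek": 10_000_000,           # 10 ms
    "read_1mb_disk": 20_000_000,       # 20 ms
}


def sequential_read_seconds(megabytes: float, medium: str) -> float:
    """Estimate a sequential read, assuming time grows linearly with size.

    `medium` is one of "memory", "ssd" or "disk".
    """
    per_mb_ns = LATENCY_NS[f"read_1mb_{medium}"]
    return megabytes * per_mb_ns / 1e9  # nanoseconds -> seconds


if __name__ == "__main__":
    for medium in ("memory", "ssd", "disk"):
        seconds = sequential_read_seconds(1000, medium)
        print(f"Read 1,000 MB from {medium}: ~{seconds:.2f} s")
```

With these numbers, reading 1,000 MB sequentially comes to about 0.25 s from memory, 1 s from SSD and 20 s from disk. That order-of-magnitude gap is usually what matters in a design discussion.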
Service Availability Numbers
The availability of a system is measured as a percentage. Third-party solution providers like AWS, GCP and Azure publish these percentages in a Service Level Agreement (SLA), which formally defines the level of uptime a solution or managed service will deliver. An SLA of 100% would mean the service never goes down during its entire operational life. No one can promise that, for obvious reasons, so most services fall between 99% and 100%, and we judge a service by how close to 100% it gets. SLAs are usually talked about in terms of the “number of nines”: a Five-Nines SLA means 99.999% availability, and Three-Nines means 99.9%.
Here’s a comparison of SLA levels and the downtime each one allows.
Availability % | Downtime per day | Downtime per year |
99% | 14.40 Minutes | 3.65 Days |
99.9% | 1.44 Minutes | 8.77 Hours |
99.99% | 8.64 seconds | 52.60 Minutes |
99.999% | 864.0 milliseconds | 5.26 Minutes |
99.9999% | 86.40 milliseconds | 31.56 Seconds |
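The downtime figures are just the unavailable fraction multiplied by the length of the period. Here is a small Python sketch of that arithmetic. The function name `downtime_seconds` is made up for this example, and it assumes a year of 365.25 days, which is what the 8.77 hours figure for 99.9% implies.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: average year length, including leap years
SECONDS_PER_YEAR = DAYS_PER_YEAR * SECONDS_PER_DAY


def downtime_seconds(availability_percent: float, period_seconds: float) -> float:
    """Allowed downtime in seconds for a given availability over a period."""
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * period_seconds


if __name__ == "__main__":
    for sla in (99, 99.9, 99.99, 99.999, 99.9999):
        per_day = downtime_seconds(sla, SECONDS_PER_DAY)
        per_year = downtime_seconds(sla, SECONDS_PER_YEAR)
        print(f"{sla}%: {per_day:.3f} s/day, {per_year / 60:.2f} min/year")
```

Each extra nine divides the allowed downtime by ten, so you can rebuild the whole table from the 99% row if you ever forget it.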
Well, that’s all for this one. This blog will serve mostly as a cheatsheet for me, and I will keep adding numbers as I find them useful and/or interesting. So bookmark it if you want to. 🍃