Estimates every Software Developer should know

It’s alright even if you don’t; I just wanted a catchy header. I recently got back into system design and found some things that I had either overlooked or forgotten over the years. This is one of the tidbits I wanted to share and also keep for myself. PS: This is inspired by the book System Design Interview by Alex Xu.

While designing systems, you will at times need to make some “back-of-the-envelope” calculations. To do so efficiently, you need to know some of the estimates and approximations we use for space and time in computing. This blog collects them.

The Power of 2

While estimating data storage we have to go back to basics. Computers store data as bits, which is why data sizes line up with powers of 2. A byte is 8 bits, and an ASCII character takes one byte. So in terms of bytes, you can think of data sizes this way:

Power of 2	Approximate Value	Full name	Short name
10	1 Thousand	1 Kilobyte	1KB
20	1 Million	1 Megabyte	1MB
30	1 Billion	1 Gigabyte	1GB
40	1 Trillion	1 Terabyte	1TB
50	1 Quadrillion	1 Petabyte	1PB
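To make the table concrete, here is a minimal sketch of a storage estimate in Python. The workload (10 million posts a day, 1 KB each, kept for 5 years) and the helper name `human_readable` are made up for illustration; swap in your own numbers.

```python
# Units as powers of two, matching the table above.
KB = 2 ** 10
MB = 2 ** 20
GB = 2 ** 30
TB = 2 ** 40
PB = 2 ** 50


def human_readable(num_bytes: int) -> str:
    """Format a byte count using the largest unit that fits."""
    for name, size in (("PB", PB), ("TB", TB), ("GB", GB), ("MB", MB), ("KB", KB)):
        if num_bytes >= size:
            return f"{num_bytes / size:.2f} {name}"
    return f"{num_bytes} B"


# Hypothetical workload, assumed purely for this example.
posts_per_day = 10_000_000
bytes_per_post = 1 * KB
retention_days = 5 * 365

total_bytes = posts_per_day * bytes_per_post * retention_days
print(human_readable(total_bytes))
```

For a quick mental version, round everything to powers of 10 first (10^7 posts x 10^3 bytes x roughly 2 x 10^3 days) and only reach for exact powers of 2 when the precision matters.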

Latency Numbers

These estimates tell you roughly how long a typical computer operation takes given current hardware capabilities. This information was compiled and shared by Dr. Jeff Dean from Google.

Operation	Latency	Notes
L1 cache reference	0.5 ns
Branch mispredict	5 ns
L2 cache reference	7 ns	14x L1 cache
Mutex lock/unlock	25 ns
Main memory reference	100 ns	20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy	3,000 ns ~ 3 us
Send 1K bytes over 1 Gbps network	10 us
Read 4K randomly from SSD*	150 us	~1GB/sec SSD
Read 1 MB sequentially from memory	250 us
Round trip within same datacenter	500 us
Read 1 MB sequentially from SSD*	1,000 us ~ 1 ms	~1GB/sec SSD, 4X memory
Disk seek	10 ms	20x datacenter roundtrip
Read 1 MB sequentially from disk	20 ms	80x memory, 20X SSD
Send packet CA->Netherlands->CA	~150 ms
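These numbers become useful once you start multiplying them. Below is a rough sketch that estimates how long it takes to read 1 GB sequentially from memory, SSD, and disk using the per-megabyte figures from the list. The constant names and the `sequential_read_ns` helper are my own, and the model (one optional seek plus a linear per-MB cost) is a simplifying assumption, not a benchmark.

```python
# Time units expressed in nanoseconds.
NS = 1
US = 1_000 * NS
MS = 1_000 * US

# Figures taken from the latency list above.
READ_1MB_MEMORY = 250 * US
READ_1MB_SSD = 1 * MS
READ_1MB_DISK = 20 * MS
DISK_SEEK = 10 * MS


def sequential_read_ns(megabytes: int, per_mb_ns: int, setup_ns: int = 0) -> int:
    """Estimate a sequential read: one-off setup cost plus a linear per-MB cost."""
    return setup_ns + megabytes * per_mb_ns


size_mb = 1024  # 1 GB

estimates = {
    "memory": sequential_read_ns(size_mb, READ_1MB_MEMORY),
    "ssd": sequential_read_ns(size_mb, READ_1MB_SSD),
    "disk": sequential_read_ns(size_mb, READ_1MB_DISK, setup_ns=DISK_SEEK),
}

for medium, total_ns in estimates.items():
    print(f"{medium}: {total_ns / MS:.0f} ms")
```

The absolute values will be off on any real machine, but the ratios (memory about 4x faster than SSD, SSD about 20x faster than disk) are what you usually need in a design discussion.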

Pictorial representation: [Figure: Latency Numbers]

Obviously these numbers have changed over time, thanks to Moore’s Law. You can see how hardware advances have shifted them year by year in the interactive chart here - https://colin-scott.github.io/personal_website/research/interactive_latency.html

Service Availability Numbers

The availability of a system is measured as a percentage of uptime. Third-party solution providers like AWS, GCP and Azure publish these percentages in a Service Level Agreement (SLA), which formally defines the level of uptime a solution or managed service will deliver. An SLA of 100% would mean the service never goes down during its entire operational lifetime. No one can promise that, so most SLAs fall between 99% and 100%, and we judge a service by how close to 100% it gets. This is usually expressed as the “number of nines”: a Five-Nines SLA means 99.999% availability, and similarly Three-Nines means 99.9%.

So here’s a comparison of each SLA and the downtime it allows.

Availability %	Downtime per day	Downtime per year
99%	14.40 Minutes	3.65 Days
99.9%	1.44 Minutes	8.77 Hours
99.99%	8.64 Seconds	52.60 Minutes
99.999%	864.0 Milliseconds	5.26 Minutes
99.9999%	86.40 Milliseconds	31.56 Seconds
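The table is just the unavailable fraction multiplied by the length of the period, so you can compute it for any SLA. Here is a small sketch; the function name `downtime_seconds` is mine, and the 365.25-day year is an assumption chosen because it reproduces the 8.77 hours figure for 99.9%.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years
SECONDS_PER_YEAR = SECONDS_PER_DAY * DAYS_PER_YEAR


def downtime_seconds(availability_percent: float, period_seconds: float) -> float:
    """Maximum downtime allowed in a period for a given availability percentage."""
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * period_seconds


for sla in (99, 99.9, 99.99, 99.999, 99.9999):
    per_day = downtime_seconds(sla, SECONDS_PER_DAY)
    per_year = downtime_seconds(sla, SECONDS_PER_YEAR)
    print(f"{sla}%: {per_day:.4f} s per day, {per_year / 60:.2f} min per year")
```

A handy rule of thumb falls out of this: every extra nine divides the allowed downtime by 10.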

Well, that’s all for this one. This blog will serve more as a cheatsheet for me, and I will keep adding numbers as I find them useful and/or interesting. So you can bookmark this if you want to. 🍃
