Hard Upper Limit on Memory Latency

 
 
 

In [NoBugs12], we discussed upper limits on feasible memory size. It was found that even if every single atom of silicon implements one bit of memory, implementing 2^128 bytes will take 54 cubic kilometers of silicon, and to implement 2^256 bytes there won't be enough atoms in the observable universe. Now we will proceed to analyse the upper limit of speed for huge amounts of memory (amounts which are lower than the absolute limits mentioned above, but still much higher than anything in use now). In other words, we'll try to answer questions like, "Is it realistic to expect 2^100-byte RAM to have access times which are typical for modern DDR3 SDRAM?"
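
As a quick back-of-the-envelope check of the quoted figure, the following Python sketch reproduces the 54 cubic kilometers (the constants are the rounded values introduced in the Analysis section below):

    # One atom of silicon per bit; constants as used later in this article.
    N_A   = 6.02e23    # Avogadro's number, 1/mol
    V_mSi = 12e-6      # molar volume of silicon, m^3/mol

    bits = 2**128 * 8                  # 2^128 bytes expressed in bits
    volume_m3 = bits * V_mSi / N_A     # minimum volume at one atom per bit
    print(volume_m3 / 1e9)             # ~54 cubic kilometers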

[Cartoon: Low Latencies between Planets??]

Assumptions

We need to agree on the assumptions on which we will rely during our analysis. First, let's assume that RAM is still made out of silicon, and that each bit of RAM requires at least one atom of silicon to implement it. This is an extremely generous assumption (current implementations use several orders of magnitude more atoms than that). Second, let's rely on the assumption that nothing (including information) can possibly travel faster than the speed of light in vacuum. This is a well-known consequence of the invariance of the speed of light and of causality (many will call it a scientific fact rather than an assumption or hypothesis, but we won't argue about the terms here).

Analysis

Let's consider memory which holds B bytes, and let's assume that each bit is implemented by one single atom of silicon. Then this memory will occupy a minimum possible volume of

    \[ V_{min} = \frac{V_{mSi}}{N_A} \times B \times 8 \]

where N_A is Avogadro's number (6.02×10^23 mol^-1), and V_mSi is the molar volume of silicon (12×10^-6 m^3/mol). Therefore, our 2^100-byte RAM (2^100 being approximately equal to 1.27×10^30) will take at least 200 cubic meters of silicon. Now let's assume that whatever device needs access to our RAM has dimensions which are negligible compared to the size of the RAM silicon, so we can consider access to our RAM as coming from a single point (let's name this point the 'access point'). Now let's arrange our RAM around the access point in a sphere (a sphere being the optimal shape for our purposes). Such a sphere will have a radius of

    \[ R_{min} = \sqrt[3]{\frac{V}{\frac{4}{3}\pi}} = \sqrt[3]{\frac{6}{\pi} \times \frac{V_{mSi}}{N_A} \times B} \]

Therefore, for our 2^100-byte RAM, a silicon sphere implementing it will have a radius of at least 3.7 meters. Now let's find out how long it will take an electromagnetic wave to travel through R_min and back (to account for the time it takes a request to reach the location where the data is stored, and for the data to come back):

    \[ T_{min} = \frac{2\times R_{min}}{c} = \frac{2}{c} \times \sqrt[3]{\frac{6}{\pi} \times \frac{V_{mSi}}{N_A} \times B} \quad\quad \textrm{(*)} \]

where c is the speed of light (strictly speaking, we should take the speed of electromagnetic waves in silicon, but as we're speaking about lower bounds, and the difference is relatively small for our purposes, we can safely use the speed of light in vacuum, 3×10^8 m/s). Substituting, we find that for our example 2^100-byte RAM, T_min equals approximately 25 nanoseconds.
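
The whole chain of calculations is easy to reproduce; here is a minimal Python sketch of formula (*), using the same rounded constants as in the text:

    from math import pi

    N_A   = 6.02e23    # Avogadro's number, 1/mol
    V_mSi = 12e-6      # molar volume of silicon, m^3/mol
    c     = 3e8        # speed of light in vacuum, m/s

    def t_min(B):
        """Minimum guaranteed round-trip latency (formula (*)), in seconds,
        for B bytes of RAM implemented with one atom of silicon per bit."""
        V = V_mSi / N_A * B * 8                      # V_min: minimum volume, m^3
        R = (V / (4.0 / 3.0 * pi)) ** (1.0 / 3.0)    # R_min: radius of the silicon sphere, m
        return 2.0 * R / c                           # there and back at the speed of light

    B = 2**100
    print(V_mSi / N_A * B * 8)    # ~200 m^3
    print(t_min(B) * 1e9)         # ~25 ns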

This means that (given our assumptions above) there is a hard limit of 25 nanoseconds on the minimum possible guaranteed latency of 2^100-byte RAM. While the number may look low, we need to realize that typical modern RAM latencies are more than an order of magnitude lower than that: for example, the typical latency for DDR3 SDRAM is 1–1.5 ns (approximately 20 times less than our theoretical limit for 2^100-byte RAM).

Now we can ask another question: "What is the maximum memory size for which we can realistically expect latencies typical of modern DDR3 SDRAM?" Using formula (*), we can calculate it as approximately 2^90 bytes. That is, even if each bit is implemented by a single atom of silicon, 2^90-byte RAM is the largest RAM which can possibly have latencies comparable to modern DDR3 SDRAM.
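
The same sketch can be turned around: solving formula (*) for B given a target latency gives the maximum size directly (the 1.5 ns target below is simply the upper end of the DDR3 range quoted above):

    from math import pi, log2

    N_A, V_mSi, c = 6.02e23, 12e-6, 3e8

    def max_bytes(t):
        """Largest RAM size (in bytes) whose guaranteed latency can still be t seconds,
        assuming one atom of silicon per bit (formula (*) solved for B)."""
        R = c * t / 2.0                 # farthest point a signal can reach and return from in time t
        V = 4.0 / 3.0 * pi * R**3       # volume of the corresponding silicon sphere
        return V * N_A / V_mSi / 8.0    # bytes which fit into that volume

    print(log2(max_bytes(1.5e-9)))   # ~88, i.e. roughly the 2^90 order of magnitude quoted above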

Generalization

If (as is currently the case) each bit is implemented with N atoms of silicon, our formula (*) will become

    \[ T_{min} = \frac{2}{c} \times \sqrt[3]{\frac{6}{\pi} \times \frac{V_{mSi}}{N_A} \times N \times B} \]

allowing the calculation of latency limits depending on the technology in use. For example, if for our 2^100-byte RAM every bit is represented by 1000 atoms of silicon (which is comparable, by order of magnitude, to the technologies used in modern RAM), the best possible latency becomes 250 ns. As for the largest memory which can have latencies comparable to modern DDR3 SDRAM (given a 1000-atoms-per-bit implementation), it is approximately 2^80 bytes.
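
Extending the earlier sketch with the N-atoms-per-bit factor makes the N^(1/3) scaling visible directly (the only change is the extra N in the volume term):

    from math import pi

    N_A, V_mSi, c = 6.02e23, 12e-6, 3e8

    def t_min(B, N=1):
        """Minimum guaranteed latency in seconds for B bytes of RAM,
        with each bit implemented by N atoms of silicon."""
        V = V_mSi / N_A * B * 8 * N                  # minimum volume grows N-fold
        R = (3.0 * V / (4.0 * pi)) ** (1.0 / 3.0)    # radius of the silicon sphere
        return 2.0 * R / c

    print(t_min(2**100) * 1e9)          # ~25 ns with one atom per bit
    print(t_min(2**100, 1000) * 1e9)    # ~250 ns with 1000 atoms per bit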

Further considerations

It should be mentioned that latency is not the only parameter which determines memory performance; another very important parameter is memory bandwidth, which is not affected by our analysis. It should also be mentioned that T_min is in fact the minimum latency we can guarantee for all the bits stored: bits stored closer to the access point will have lower latencies than T_min.

Another practical consideration is caching – our analysis did not take caching into account, and for most common access patterns caching will improve average latencies greatly.
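
As a purely illustrative sketch (the hit rate and cache latency below are made-up numbers, not measurements), a cache in front of our huge RAM can pull the average latency far below T_min, while the guaranteed worst case stays at T_min:

    def average_latency_ns(hit_rate, cache_latency_ns, t_min_ns=25.0):
        """Average (not guaranteed!) latency with a cache in front of the huge RAM.
        Illustration only: hit_rate and cache_latency_ns are hypothetical inputs."""
        return hit_rate * cache_latency_ns + (1.0 - hit_rate) * t_min_ns

    print(average_latency_ns(0.99, 1.0))   # ~1.24 ns on average; the guaranteed worst case remains 25 ns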

Conclusions

One interesting consequence of our analysis is that current silicon technology has already come very close to hard physical limits (as opposed to the technological limits which dominated electronics for decades), and that even relativistic effects may come into play when trying to improve things further along the lines of Moore's law. While this is well known to those dealing with bleeding-edge electronics, it is usually ignored by people in the software industry, where it is quite common to extrapolate Moore's law to last for centuries. On the other hand, approaching such hard physical limits may signal an end to the usual year-on-year growth in the number of cores/RAM/HDD sizes/…, and such an end may have very significant effects on the future of the software industry.

While it is unclear whether it will be a Good Thing or a Bad Thing for people in the industry, what is clear is that such an end would be quite a drastic change for the software development industry as a whole.



Acknowledgements

This article was originally published in Overload Journal #116 in August 2013 and is also available separately on the ACCU website. Re-posted here with the kind permission of Overload. The article has been re-formatted to fit your screen.

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.
