In my first two CS101 posts (here and here) we discussed the basic electronic circuits used to compute logic conditions and basic arithmetic. At the end of my last post I alluded to a third element we need before we can construct something worth calling a ‘computer’. That element is memory.
Memory gives us the ability to have the current computation be influenced by the result of past computation. It’s what allows us to compose sequences of basic operations to produce ad-hoc, complex computations. These sequences of instructions are called programs.
Computers need memory to store two kinds of data: the programs - the sequences of instructions that define the computations - and the inputs and outputs of those computations. Both types of information are, as always with computation, reducible to long sequences of 0s and 1s. So the basic building block of computer memory is a way to store a 0 or a 1.
As you recall, electronic circuits can have two states, represented by the presence or absence of electrical current on a wire, and we can call these states “1” and “0”. A memory bit can be constructed using such a circuit, as long as it has the additional property of stability. That is, it must be able to sustain whatever state it’s in. So, if it has current (representing a 1) it has to continue to have current until we deliberately tell it otherwise, and vice versa. It can’t, for example, just let the current decay on its own, thus spontaneously turning a 1 into a 0.
There is, in fact, a way to construct these memory circuits using the basic boolean components discussed in my first CS101 post. These circuits, however, maintain their value, 0 or 1, only as long as there is current available. In other words, the memory loses all its state when you switch the power off. Outside of the most basic calculations, this causes a huge problem: how do we keep important information around for a long time?
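One classic way to build such a stable circuit is the SR ("set-reset") latch: two NOR gates cross-coupled so that each gate's output feeds the other's input. Here's a minimal Python sketch of the idea; the gate-level simulation and the settling loop are my own illustration, not something from the original circuit diagrams:

```python
# A toy simulation of an SR latch built from two cross-coupled NOR
# gates -- one standard way to get a stable 1-bit memory out of plain
# boolean components.

def nor(a, b):
    return int(not (a or b))

def sr_latch(s, r, q, q_bar):
    """Run the cross-coupled NOR gates until the outputs settle,
    then return the new stable (q, q_bar) pair."""
    for _ in range(4):  # a few rounds are enough to reach a stable state
        q, q_bar = nor(r, q_bar), nor(s, q)
    return q, q_bar

q, q_bar = 0, 1                         # start in the "0" state
q, q_bar = sr_latch(1, 0, q, q_bar)     # pulse S: sets the bit to 1
q, q_bar = sr_latch(0, 0, q, q_bar)     # release both inputs...
print(q)                                # ...and the 1 is remembered: prints 1
q, q_bar = sr_latch(0, 1, q, q_bar)     # pulse R: resets the bit
print(q)                                # prints 0
```

The key step is the middle one: with both inputs released, the circuit keeps feeding its own output back into itself, which is exactly the "sustain whatever state it's in" property we asked for above.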
The answer is to save the data to long-term, durable memory, so that even when the power goes off, the information isn’t lost. An example of long-term memory is a hard drive, where the information is stored on spinning platters coated with a magnetic material divided into many tiny areas, each set up so that its magnetic field points in one direction or another, representing a 0 or a 1 respectively. The material on the platter doesn’t require power to sustain its magnetic field, and so the data is preserved even without power. Then, when power returns we can copy the data from the disk back into the electronic memory, and continue with computation.
Latency and Bandwidth
You may well be thinking, “Why even bother with electronic memory? Why not use durable memory directly?”
The problem is that durable memory devices (hard drives, solid-state drives and so on) are incredibly slow in comparison with electronic memory. There are two manifestations of slowness, when it comes to memory: latency and bandwidth.
Latency is a measure of how long it takes to find the data. Or, in other words, how long it takes to retrieve the first bit we need. For example: RAM has latency on the order of 100 nanoseconds. Hard drive latency, however, is on the order of 10 milliseconds, because a hard drive has moving parts that have to be positioned mechanically. Finding a piece of data on disk is 100,000x slower than main memory! That is a huge difference.
Bandwidth is a measure of how long it takes, once we’ve found the data, to retrieve it all. Or, in other words, how many bits we can transfer into or out of memory per second. RAM has bandwidth on the order of 4 gigabytes/sec, while hard drives have bandwidth on the order of 40 megabytes/sec, which is 100x slower. Not quite as huge a difference as with latency, but still very significant in some cases.
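A rough back-of-the-envelope model ties the two together: the time to fetch a chunk of data is the latency (finding the first bit) plus the size divided by the bandwidth (streaming the rest). A small sketch, using the approximate figures above:

```python
# Rough model: fetch time = latency + size / bandwidth.
# The numbers are the order-of-magnitude figures from the text.

def fetch_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    return latency_s + size_bytes / bandwidth_bytes_per_s

RAM  = dict(latency_s=100e-9, bandwidth_bytes_per_s=4e9)   # ~100 ns, ~4 GB/s
DISK = dict(latency_s=10e-3,  bandwidth_bytes_per_s=40e6)  # ~10 ms, ~40 MB/s

for size in (4 * 1024, 100 * 1024 * 1024):                 # 4 KB vs. 100 MB
    ram, disk = fetch_time(size, **RAM), fetch_time(size, **DISK)
    print(f"{size:>11} bytes: RAM {ram:.2e} s, disk {disk:.2e} s, "
          f"disk ~{disk / ram:,.0f}x slower")
```

Running this shows why the distinction matters: for a small 4 KB read the latency term dominates and disk comes out thousands of times slower, while for a 100 MB read the bandwidth term dominates and the gap shrinks to roughly the 100x bandwidth ratio.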
A Snail’s Pace
The concepts of latency and bandwidth are relevant not just to memory, but to any case where we move data, the obvious example being networks. And indeed in many cases a computer on a network can use another computer as a form of memory: sending data over the network for a receiving computer to store, and then requesting the data back over the network later.
An extreme and evocative illustration of the difference between latency and bandwidth is the “snail pulling a chariot of DVDs” meme. That photo originated from a tongue-in-cheek experiment based on a computation similar to the following:
Say our “memory” is a box of DVDs with data burned on them, and for whatever reason, we decide that it’s a good idea to retrieve data by having a snail pull DVDs along.
An African land snail moves at a pace of about 0.5 inches per second. The snail+chariot is about 7.5 inches long, so if we send snails out single-file we can send a new one every 7.5/0.5 = 15 seconds. This means we can send out four chariots, or eight DVDs, a minute.
A common DVD has a capacity of around 4 GB, which means the snail system can deliver 8 x 4 = 32 GB / minute, which is about 500 MB/sec. This is a huge amount of bandwidth! By contrast, a fast broadband connection for home internet might be 10 MB/sec. And if we replace those DVDs with Blu-ray discs, which can have a capacity of 40GB or more, we’re looking at 5 GB/sec of bandwidth, which is faster than pretty much any network out there, all on the backs of snails.
The latency of the snail system, however, is absurdly slow. If you’re just a mile away from the origin of the data, you’ll wait about 35 hours for the first bit of data to arrive.
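The snail arithmetic, spelled out (using the same rough estimates as above):

```python
# The snail bandwidth/latency calculation from the text, step by step.
# All figures are the rough estimates used above.

snail_speed_in_per_s = 0.5
chariot_length_in    = 7.5
dvds_per_chariot     = 2
dvd_capacity_gb      = 4.0

seconds_between_chariots = chariot_length_in / snail_speed_in_per_s   # 15 s
chariots_per_minute      = 60 / seconds_between_chariots              # 4
gb_per_minute = chariots_per_minute * dvds_per_chariot * dvd_capacity_gb  # 32
bandwidth_mb_per_s = gb_per_minute * 1000 / 60                        # ~533
print(f"bandwidth: ~{bandwidth_mb_per_s:.0f} MB/sec")

# Latency: how long the first chariot takes to crawl one mile.
mile_in_inches = 5280 * 12
latency_hours = mile_in_inches / snail_speed_in_per_s / 3600          # ~35
print(f"latency over 1 mile: ~{latency_hours:.0f} hours")
```

Swap in 40 GB Blu-ray discs for `dvd_capacity_gb` and the same pipeline delivers about ten times the bandwidth, while the latency doesn't change at all: bandwidth is about how much you carry per trip, latency about how long the trip takes.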
The Memory Hierarchy
Another big difference between different forms of memory is price. Hard drives are significantly cheaper per GB than RAM, even though the prices of both keep dropping dramatically (I’m not clear on what the current cost of snails is).
Therefore modern computers have an entire hierarchy of memory components, to provide various different tradeoffs between price, size, latency, bandwidth and durability. These range from tiny chunks of very fast memory embedded on the CPU itself, to RAM-based main memory, to SSDs and hard drives. Even tape drives are still used sometimes for backup. Like the snails, they have very high latency but also very high bandwidth, which is appropriate when you want to dump a large amount of data out efficiently, but don’t mind the restoration process being clunky.
Typically, the faster the memory (in latency and/or bandwidth) the less of it you have, due to cost or engineering constraints. So data needs to move up the hierarchy, from slow/cheap/durable memory to fast/expensive/volatile memory, as needed, and then be demoted back down when possible to free up room in the fast memory for other data.
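That promote/demote dance can be sketched with a toy two-tier store: a small least-recently-used (LRU) cache standing in for the fast tier, backed by a larger “slow” store. (The class, its policy, and its sizes are hypothetical, purely for illustration; real systems layer many such tiers.)

```python
from collections import OrderedDict

# A toy two-tier store: a fixed-size fast tier (think RAM) in front of
# a large, durable slow tier (think disk). Reads promote data into the
# fast tier; when the fast tier fills up, the least-recently-used entry
# is demoted (dropped -- the durable copy still lives in the slow tier).

class TwoTierStore:
    def __init__(self, fast_capacity):
        self.fast = OrderedDict()       # small fast tier, LRU-ordered
        self.slow = {}                  # big durable tier
        self.fast_capacity = fast_capacity

    def write(self, key, value):
        self.slow[key] = value          # durable copy first
        self._promote(key, value)

    def read(self, key):
        if key in self.fast:            # fast-tier hit
            self.fast.move_to_end(key)  # mark as recently used
            return self.fast[key]
        value = self.slow[key]          # miss: fetch from the slow tier...
        self._promote(key, value)       # ...and promote it
        return value

    def _promote(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        if len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)   # demote least-recently-used
```

Writing to the slow tier before promoting is one simple answer to the durability question above: even if the fast tier vanishes at power-off, every value has already been persisted.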
A holy grail of hardware is a memory subsystem with the performance of RAM, but the durability and price of disk. But until such a thing is invented, a good part of the art of building systems, such as those that power your favorite web sites, is that of managing the tradeoffs between the advantages and disadvantages of these various types of memory. Great systems are characterized by their ability to shunt data around in clever ways in order to maximize speed without compromising data safety if the power goes out in the datacenter.