The photo above shows the tiny silicon die inside the 8008 package. (Click the image for a higher resolution photo.) You can barely see the wires and transistors that make up the chip. The squares around the outside are the 18 pads that are connected to the external pins by tiny bond wires. You can see the text "8008" on the right edge of the chip and "© Intel 1971" on the lower edge. The initials HF appear on the top right for Hal Feeney, who did the chip's logic design and physical layout. (Other key designers of the 8008 were Ted Hoff, Stan Mazor, and Federico Faggin.)
Inside the chip
The diagram below highlights some of the major functional blocks of the chip. On the left is the 8-bit Arithmetic/Logic Unit (ALU), which performs the actual data computations.3 The ALU uses two temporary registers to hold its input values. These registers take up significant area on the chip, not because they are complex, but because they need large transistors to drive signals through the ALU circuitry.
Below the registers is the carry look ahead circuitry. For addition and subtraction, this circuit computes all eight carry values in parallel to improve performance.2 Since the low-order carry depends on just the low-order bits, while the higher-order carries depend on multiple bits, the circuit block has a triangular shape.
The triangular layout of the ALU is unusual. Most processors stack the circuitry for each bit into a regular rectangle (a bit-slice layout). The 8008, however, has eight blocks (one for each bit) arranged haphazardly to fit around the space left by the triangular carry generator. The ALU supports eight simple operations.3
In the center of the chip is the instruction register and the instruction decoding logic that determines the meaning of each 8-bit machine instruction. Decoding is done with a Programmable Logic Array (PLA), an arrangement of gates that matches bit patterns and generates the appropriate control signals for the rest of the chip. On the right are the storage blocks. The 8008's seven registers are in the upper right. In the lower right is the address stack, which consists of eight 14-bit address words. Unlike most processors, the 8008's call stack is stored on the chip instead of in memory. The program counter is just one of these addresses, making subroutine calls and returns very simple. The 8008 uses dynamic memory for this storage
The physical structure of the chip is very close to the block diagram in the 8008 User's Manual (below), with blocks located on the chip in nearly the same positions as in the block diagram.
The structure of the chip
What does the die photo show? For our purposes, the chip can be thought of as three layers. The diagram below shows a closeup of the chip, pointing out these layers. The topmost layer is the metal wiring. It is the most visible feature, and looks metallic (not surprisingly). In the detail below, these wires are mostly horizontal. The polysilicon layer is below the metal and appears orange under the microscope.
The foundation of the chip is the silicon wafer, which appears purplish-gray in the photo. Pure silicon is effectively an insulator. Regions of it are "doped" with impurities to create semiconducting silicon. Being on the bottom, the silicon layer is difficult to distinguish, but you can see black lines along the border between doped silicon and undoped silicon. A few vertical silicon "wires" are visible in the photo.4
Transistors are the key component of the chip, and a transistor is formed where a polysilicon wire crosses doped silicon. In the photo, the polysilicon appears as a brighter orange where it forms a transistor.
Why an 18 pin chip?
One inconvenient feature of the 8008 is it only has 18 pins, which makes the chip slower and much more difficult to use. The 8008 uses 14 address bits and 8 data bits so with 18 pins there aren't enough pins for each signal. Instead, the chip has 8 data pins that are reused in three cycles to transmit the low address bits, high address bits, and data bits. A computer using the 8008 requires many support chips to interact with this inconvenient bus architecture.5
There was no good reason to force the chip into 18 pins. Packages with 40 or 48 pins were common with other manufacturers, but 16 pins was "a religion at Intel".6 Only with great reluctance did they move to 18 pins. By the time the 8080 processor came out a few years later, Intel had come to terms with 40-pin chips. The 8080 was much more popular, in part because it had a simpler bus design permitted by the 40-pin package.
Power and data paths in the chip
The data bus provides data flow through the chip. The diagram below shows the 8-bit data bus of the 8008 with rainbow colors for the 8 data lines. The data bus connects to the 8 data pins along the outside of the upper half of the chip. The bus runs between the ALU on the left, the instruction register (upper center), and the registers and stack on the right. The bus is split on the left with half along each side of the ALU.
The red and blue lines show power routing. Power routing is an under-appreciated aspect of microprocessors. Power is routed in the metal layer due to its low resistance. But since there is only one metal layer in early microprocessors, power distribution must be carefully planned so the paths don't cross.7 The diagram above shows Vcc lines in blue and Vdd lines in red. Power is supplied through the Vcc pin on the left and the Vdd pin on the right, then branches out into thin, interlocking wires that supply all parts of the chip.
The register file
To show what the chip looks like in detail, I've zoomed in on the 8008's register file in the photo below. The register file consists of an 8 by 7 grid of dynamic RAM (DRAM) storage cells, each using three transistors to hold one bit.8 (You can see the transistors as the small rectangles where the orange polysilicon takes on a slightly more vivid color.) Each row is one of the 8008's seven 8-bit registers (A, B, C, D, E, H, L). On the left, you can see seven pairs of horizontal wires: the read select and write select lines for each register. At the top, you can see eight vertical wires to read or write the contents of each bit, along with 5 thicker wires to supply Vcc. Using DRAM for registers (rather than the more common static latches) is an interesting choice. Since Intel was primary a memory company at the time, I expect they chose DRAM due to their expertise in the area.
How PMOS works
The 8008 uses PMOS transistors. To simplify slightly, you can think of a PMOS transistor as a switch between two silicon wires, controlled by a gate input (of polysilicon). The switch closes when its gate input is low and it can pull its output high. If you're familiar with the NMOS transistors used in microprocessors like the 6502, PMOS may be a bit confusing because everything is backwards.
A simple PMOS NAND gate can be constructed as shown below. When both inputs are high, the transistors are off and the resistor pulls the output low. When any input is low, the transistor will conduct, connecting the output to +5. Thus, the circuit implements a NAND gate. For compatibility with 5-volt TTL circuits, the PMOS gate (and thus the 8008) is powered with unusual voltages: -9V and +5V.
For technical reasons, the resistor is actually implemented with a transistor. The diagram below shows how the transistor is wired to act as a pull-down resistor. The detail on the right shows how this circuit appears on the chip. The -9V metal wire is at the top, the transistor is in the middle, and the output is the silicon wire at the bottom.
History of the 8008
The 8008's complicated story starts with the Datapoint 2200, a popular computer introduced in 1970 as a programmable terminal. (Some people consider the Datapoint 2200 to be the first personal computer.) Rather than using a microprocessor, the Datapoint 2200 contained a board-sized CPU build from individual TTL chips. (This was the standard way to build a CPU in the minicomputer era.) Datapoint and Intel decided that it would be possible to replace this board with a single MOS chip, and Intel started the 8008 project to build this chip. A bit later, Texas Instruments also agreed to build a single-chip processor for Datapoint. Both chips were designed to be compatible with the Datapoint 2200's 8-bit instruction set and architecture.
Around March 1971, Texas Instruments completed their processor chip, calling it the TMC 1795. After delaying the project, Intel finished the 8008 chip later, around the end of 1971. For a variety of reasons, Datapoint rejected both microprocessors and built a faster CPU based on newer TTL chips including the 74181 ALU chip. TI tried unsuccessfully to market the TMC 1795 processor to companies such as Ford, but ended up abandoning the processor, focusing on highly-profitable calculator chips instead. Intel, on the other hand, marketed the 8008 as a general-purpose microprocessor, which eventually led to the x86 architecture you're probably using right now. Although TI was first with the 8-bit processor, it was Intel who made their chip a success, creating the microprocessor industry.
The diagram above summarizes the "family tree" of the 8008 and some related processors.10 The Datapoint 2200's architecture was used in the TMC 1795, the Intel 8008, and the next version Datapoint 220011. Thus, four entirely different processors were built using the Datapoint 2200's instruction set and architecture. The Intel 8080 processor was a much-improved version of the 8008. It significantly extended the 8008's instruction set and reordered the machine code instructions for efficiency. The 8080 was used in groundbreaking early microcomputers such as the Altair and the Imsai. After working on the 4004 and 8080, designers Federico Faggin and Masatoshi Shima left Intel to build the Zilog Z-80 microprocessor, which improved on the 8080 and became very popular.
The jump to the 16-bit 8086 processor was much less evolutionary. Most 8080 assembly code could be converted to run on the 8086, but not trivially, as the instruction set and architecture were radically changed. Nonetheless, some characteristics of the Datapoint 2200 still exist in today's x86 processors. For instance, the Datapoint 2200 had a serial processor, processing bytes one bit at a time. Since the lowest bit needs to be processed first, the Datapoint 2200 was little-endian. For compatibility, the 8008 was little-endian, and this is still the case in Intel's processors. Another feature of the Datapoint 2200 was the parity flag, since parity calculation was important for a terminal's communication. The parity flag has continued to the x86 architecture.
The 8008 is architecturally unrelated to Intel's 4-bit 4004 processor12. The 8008 is not an 8-bit version of the 4-bit 4004 in any way. The similar names are purely a marketing invention; during its design phase the 8008 had the unexciting name "1201".
How the 8008 fits into the history of semiconductor technology
The 4004 and 8008 both used silicon-gate enhancement-mode PMOS, a semiconductor technology that was only used briefly. This puts the chips at an interesting point in chip fabrication technology.
The 8008 (and modern processors) uses MOS transistors. These transistors had a long path to acceptance, being slower and less reliable than the bipolar transistors used in most computers of the 1960s. By the late 1960s, MOS integrated circuits were becoming more common; the standard technology was PMOS transistors with metal gates. The gates of the transistor consisted of metal, which was also used to connect components of the chip. Chips essentially had two layers of functionality: the silicon itself, and the metal wiring on top. This technology was used in many Texas Instruments calculator chips, as well as the TMC 1795 chip (the chip that had the same instruction set as the 8008).
A key innovation that made the 8008 practical was the self-aligned gate—a transistor using a gate of polysilicon rather than metal. Although this technology was invented by Fairchild and Bell Labs, it was Intel that pushed the technology ahead. Polysilicon gate transistors had much better performance than metal gate (for complex semiconductor reasons). In addition, adding a polysilicon layer made routing of signals in the chip much easier, making the chips denser. The diagram below shows the benefit of self-aligned gates: the metal-gate TMC 1795 is bigger than the 4004 and 8008 chips combined.
Shortly afterwards, semiconductor technology improved again with the use of NMOS transistors instead of PMOS transistors. Although PMOS transistors were easier to manufacture initially, NMOS transistors are faster, so once NMOS could be fabricated reliably, they were a clear win. NMOS led to more powerful chips such as the Intel 8080 and the Motorola 6800 (both 1974). Another technology improvement of this time was ion-implantation to change the characteristics of transistors. This allowed the creation of "depletion-mode" transistors for use as pull-up resistors. These transistors improved chip performance and reduced power consumption. They also allowed the creation of chips that ran on standard five-volt supplies.13 The combination of NMOS transistors and depletion-mode pull-ups was used for most of the microprocessors of the late 1970s and early 1980s, such as the 6502 (1975), Z-80 (1976), 68000 (1979), and Intel chips from the 8085 (1976) to the 80286 (1982).
In the mid 1980s, CMOS took over, using NMOS and PMOS transistors together to dramatically reduce power consumption, with chips such as the 80386 (1986), 68020 (1984) and ARM1 (1985). Now almost all chips are CMOS.14
As you can see, the 1970s were a time of large changes in semiconductor chip technology. The 4004 and 8008 were created when the technological capability intersected with the right market.
How to take die photos
In this section, I explain how I got the photos of the 8008 die. The first step is to open the chip package to expose the die. Most chips come in epoxy packages, which can be dissolved with dangerous acids.
Since I would rather avoid boiling nitric acid, I took a simpler approach. The 8008 is also available in a ceramic package (above), which I got on eBay. Tapping the chip along the seam with a chisel pops the two ceramic layers apart. The photo below shows the lower half of the ceramic package, with the die exposed. Most of the metal pins have been removed, but their positions in the package are visible. To the right of the die is a small square; this connects ground (Vcc) to the substrate. A couple of the tiny bond wires are still visible, connected to the die.
Once the die is exposed, a microscope can be used to take photographs. A standard microscope shines the light from below, which doesn't work well for die photographs. Instead, I used a metallurgical microscope, which shines the light from above to illuminate the chip.
I took 48 photographs through the microscope and then used the Hugin stitching software to combine them into one high-resolution image (details). Finally, I adjusted the image contrast to make the chip's structures more visible. The original image (which is approximately what you see through the microscope) is below for comparison.
I took detailed die photos of the 8008 that reveal the circuitry it used. While the 8008 wasn't the first microprocessor or even the first 8-bit microprocessor, it was truly revolutionary, triggering the microprocessor revolution and leading to the x86 architecture that dominates personal computers today. In future posts, I plan to explain the 8008's circuits in detail to provide a glimpse into the roots of todays computers.
Notes and references
According to the oral history of the 8008, photos of the 8008 were obtained in October / November 1971 (page 6). Chip designer Federico Faggin mentions that toward the end of 1971, "everything was working except for a few errors." Faggin then debugged a problem with the dynamic memory losing data, making it ready for production (page 9). ↩
The 8008's ALU supports eight operations: add, subtract, add with carry, subtract with carry, AND, OR, XOR, and compare. It also implements left and right shift and rotate operations. The 8008 also has increment and decrement instructions, extending the Datapoint 2200's instruction set. ↩
Because silicon has higher resistance than polysilicon, most chips use the polysilicon and metal layers for wiring, not the silicon layer. The 4004 and 8008 chips are unusual in that they prefer to use the silicon layer for wiring rather than polysilicon. I expect this was due to the recent introduction of polysilicon: before polysilicon, routing needed to be done in the silicon layer and perhaps the chip designers were sticking with the older layout techniques. ↩
The 8008 required 20 support chips according to chip architect Federico Faggin. In contrast, the 4004 and earlier MOS computers such as the Four Phase and CADC were designed with a small number of MOS chips that worked together without extra "glue chips". In this sense, the 8008 was a step backwards architecturally, saying "here's the CPU, you figure out how to make a computer out of it." ↩
For details on Intel's insistence on 16 pins, see Oral History of Federico Faggin, page 55-56. It was only when the 1103 memory chip required 18 pins that Intel reluctantly moved beyond 16 pins. And that was treated by Intel like "the sky had dropped from heaven," resulting in "so many long faces". ↩
If two metal lines need to cross, one of them can be routed under the other by using the polysilicon layer. To be low resistance, this cross-under must be relatively wide, so cross-unders are avoided if possible. ↩
The 8008 registers use the "3T1C" cell: three transistors and one capacitor (details). The circuit doesn't physically contain a separate capacitor, but uses the gate capacitance of the transistor. One unusual feature of the 8008 cell is it uses one wire for both reading and writing the bit, while the typical 3T cell has separate wires for reading and writing. The 4004 had separate wires, but the design changed slightly in the 8008. ↩
Pull-up resistors in later chips such as the 6502 were implemented using depletion-mode NMOS transistors. These yielded more faster, more efficient logic. They were also wired differently, with the gate connected to the output rather than the power rail. ↩
The second version of the Datapoint 2200 had a totally new implementation of the processor, still built from TTL chips. While the first version had a serial ALU (processing one bit at a time), the second version operated in parallel using 74181 ALU chips. As a result, the second version was much faster. ↩
Interestingly, in 2007 Intel started using metal gates again in order to scale transistors further (details). In a way, semiconductor technology has gone full circle, back to metal gates, although now unusual metals such as hafnium are used. ↩