IBM paperweight teardown: Reverse-engineering 1970s memory chips

I recently received a vintage IBM paperweight from the early 1970s that showcases some memory chips.1 When IBM started using integrated circuits in the late 1960s, they packed the chips in square metal modules called Monolithic Systems Technology (MST). The paperweight illustrates the manufacturing steps for an MST module as a silicon wafer was cut into silicon dies, mounted on a square ceramic substrate, and wrapped in a thumbnail-sized metal package.

The paperweight contains a silicon wafer, four dies, and an MST module in various stages of assembly. The paperweight is somewhat yellowed with age. Click this image, or any other, for a larger photo.

Because the dies are encased in clear Lucite, it's possible to closely examine their circuitry and understand them better. The photo below is a closeup of the edge of the silicon wafer and the four dies inside the paperweight. The two larger dies are the same as the dies on the wafer. The two smaller dies are the same, but one is visibly damaged.2 For this blog post, I took detailed die photos using a microscope and reverse-engineered the smaller chip. My conclusion is that the larger chips are 1-kilobit static RAM chips, while the smaller ones are memory sense amplifiers.

Closeup of the dies and wafer inside the paperweight.

IBM System/370

These chips were probably used in IBM's popular System/370 line of mainframe computers. In 1964, IBM introduced the extremely successful System/360 family of mainframes. This product line was modernized in 1970 with the announcement of System/370, which was constructed from integrated circuits (unlike the System/360) and moved from magnetic core memory to semiconductor memory. The paperweight illustrates both of these changes: integrated circuits and semiconductor memory.

To understand the scale of a System/370 computer, the rendering below shows a System/370 Model 145. The Model 145 was a "medium-scale" machine in the middle of the System/370 family.3 The Model 145 is notable as IBM's first computer that used semiconductor main memory. The computer is very large by modern standards, filling the blue cabinets below. One cabinet holds the CPU while another holds 256 kilobytes of memory chips. This computer predates the microprocessor, so the CPU is built gate-by-gate from many boards of integrated circuits. The Model 145 weighed over a ton, cost $5 million to $10 million (in current dollars), and was roughly as fast as an IBM PC (1981).

Rendering of a System/370 Model 145. The computer is the large blue cabinet along the wall. The white unit at the back is disk storage, while a card reader is in the foreground. Image by Oliver.obi, CC BY-SA 3.0.

The MST modules

In the earlier System/360, IBM didn't use integrated circuits, but instead used hybrid modules called SLT. For the System/370 IBM moved to integrated circuits, which they called "monolithics". While most companies packaged integrated circuits in rectangular plastic or ceramic packages, IBM retained the half-inch-square metal packages of SLT, calling it MST, for Monolithic Systems Technology.4 MST was a big improvement over the earlier hybrid SLT, about ten times more reliable and 4 to 8 times as dense. These MST integrated circuits were very simple by modern standards, with 32 transistors per module implementing about six gates, so thousands of integrated circuits were required to implement the computer.

The MST modules were manufactured in large quantities with automated production techniques. The sequence of components in the paperweight (below) illustrates the steps. On the left, the round silicon wafer is cut into individual dies. On the right, the square ceramic substrate has 16 holes for pins. Next, a printed-circuit pattern is applied to the substrate to connect the integrated circuit to the module's pins.5 In the third step, 16 pins are soldered to the substrate. Next, the silicon die and the ceramic substrate are combined, with the silicon die mounted upside-down in the center of the ceramic substrate. Note how small the silicon die is, compared to the size of the package. The module is reflow-soldered, with contacts on the silicon die soldered directly to the substrate.6 Finally, the module is encased in metal, producing a half-inch square module. These modules give IBM's integrated circuits a unique appearance, distinct from the plastic or ceramic DIP integrated circuits used by other manufacturers.

The steps to manufacture an MST module.

The MST modules were tightly packed on circuit cards, such as the memory card below. The square module in combination with a four-plane printed circuit board provides considerably higher density than the circuit boards of other manufacturers at the time, which typically used DIP integrated circuits and 2-layer PCBs.

An IBM memory card packed with MST modules.

The memory wafer and chip

The silicon wafer in the paperweight is 2 inches in diameter, a size that was introduced in 1969. Wafer sizes have steadily increased since then and modern chip fabrication is done with much larger 300 mm (12") wafers.7 The wafer contains 177 dies; using a microscope, I created the die photo below of one of them. Curiously, the wafer is only partially manufactured; it looks like only one of the nine mask layers was constructed. Because this photo is taken from the wafer, you can see the test circuitry and alignment patterns in between the dies.

Die photo of one of the memory chips on the wafer. It is only partially manufactured. The part number "DLM1" is visible on it.
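
As a rough sanity check on the wafer numbers above, dividing the wafer area by the die count gives an upper bound on the die size. This is only a sketch: it assumes a full 2-inch (50.8 mm) circle and ignores the wafer flat, edge losses, and the scribe lanes between dies, so the real dies are somewhat smaller. The square-die shape is also an assumption.

```python
import math

WAFER_DIAMETER_MM = 2 * 25.4   # 2-inch wafer, from the text
DIE_COUNT = 177                # dies on the wafer, from the text

def max_die_area_mm2(diameter_mm: float, dies: int) -> float:
    """Upper bound on die area: total wafer area shared equally among the dies."""
    wafer_area = math.pi * (diameter_mm / 2) ** 2
    return wafer_area / dies

area = max_die_area_mm2(WAFER_DIAMETER_MM, DIE_COUNT)
side = math.sqrt(area)  # side length if the die were square (an assumption)
print(f"at most {area:.1f} mm^2 per die, about {side:.1f} mm on a side")
```

The bound works out to a little over 11 mm² per die, or a square a bit more than 3 mm on a side.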

The paperweight also contains completed individual dies so I created the die photo below. The regular grid of memory cells is visible in the middle of the chip, with support circuitry around the edge. From studying the die and counting the cells, I think this is a 1-kilobit static RAM chip. Note the solder balls around the edge of the die, which allowed the chip to be soldered directly to the ceramic substrate. With 25 solder balls, this chip was probably mounted in an MST package with a 5×5 grid of pins.

Die photo of the memory chip.

Taking microscope photos is difficult when the die is encased in Lucite, so I wasn't able to see the circuitry under high magnification. As a result, I couldn't reverse-engineer this chip in detail.8 I was able to measure the feature size on the die as about 6µm, a process introduced around 1971.

The sense amplifier chip

The smaller die in the paperweight is much simpler with much larger components. I took the die photo below and found it contains 32 NPN transistors along with resistors. This chip is partially analog and also uses a type of logic called ECL. I believe the chip is a differential amplifier, a sense amplifier to read the signals from the memory chip. This would explain why the two chips are packaged together in the paperweight.

Die photo of the bipolar integrated circuit. The left and right sides are approximately mirror-images, with two copies of the same circuit.

In the die photo above, the silicon of the die is gray. Parts of the silicon were doped with arsenic, boron, or phosphorus to create regions with different semiconductor properties. The black lines in the silicon are boundaries between different doping levels. The yellowish regions are metal wiring on top of the silicon, connecting the various components together. The large black circles are the solder balls to connect the die to the MST substrate.

The diagram below is a detail from the chip, showing two types of resistors and a transistor. The upper resistor is made from a line of higher-resistance N-type silicon, with metal contacts connected to either end. This forms a 65Ω resistor. The lower resistor has six contacts, providing multiple resistance values depending on where the metal lines are attached. It uses P-type silicon for the resistive element, providing hundreds of ohms of resistance. (There's a bit more internal structure to the resistors, but I'll ignore it.)

Two resistors and a transistor as they appear on the die.
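
To make the multi-tap idea concrete, here is a toy model of a six-contact resistor. The evenly spaced taps and the 100 Ω per segment are hypothetical values chosen for illustration; the text only says the P-type resistor provides "hundreds of ohms", and the real tap positions weren't measured.

```python
from itertools import combinations

# Hypothetical values for illustration only; the real tap positions and
# resistance per segment are not known from the die photo.
TAP_POSITIONS = [0, 1, 2, 3, 4, 5]   # six contacts, assumed evenly spaced
OHMS_PER_SEGMENT = 100.0             # assumed resistance between adjacent taps

def tap_resistance(a: int, b: int) -> float:
    """Resistance between taps a and b of a uniform resistive strip."""
    return abs(TAP_POSITIONS[a] - TAP_POSITIONS[b]) * OHMS_PER_SEGMENT

# Every pair of taps that the metal layer could connect to.
choices = {(a, b): tap_resistance(a, b) for a, b in combinations(range(6), 2)}
distinct_values = sorted(set(choices.values()))
print(len(choices), "tap pairs give", distinct_values)
```

Under these assumptions, one strip of silicon offers 15 possible tap pairs and five distinct resistances, selected purely by where the metal layer makes contact.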

The transistors are bipolar NPN transistors, but their structure is a bit more complex than the typical NPN transistor. Physically, they have two bases and two collectors wired together to reduce current density, so you'll see five metal connections to each transistor. The diagram below shows the cross-section structure of the transistor. The five metal connections on top of the cross-section correspond to the five connections on the transistor above. The collector, base, and emitter are connected to N-P-N layers, forming the NPN transistor.9 The P+ ring provides isolation around the transistor.

This diagram shows the internal structure of the chip's transistors, based on patent 3539876.

By recognizing the components on the die and tracing out the wiring, the circuit can be reverse-engineered. However, if you look at the die closely, you'll see that many components are not connected. The reason is that IBM used a technique called "master slice" to produce a variety of integrated circuits without custom-designing each one.10 The idea was to use a common silicon die with multiple transistors and resistors. By modifying the metal layer (which was relatively inexpensive), the components could be wired into the desired circuits. This is also why the resistors had multiple taps, so they could be wired to obtain different values as needed.

The differential amplifier and ECL logic

Logic circuits can be built in a wide variety of ways. Almost all computers today use a logic family called CMOS (complementary metal-oxide-semiconductor), building gates out of MOS transistors. However, the IBM System/370 used a high-performance11 logic family known as Emitter-coupled logic (ECL), which IBM called Current-Switch Emitter Follower (CSEF).12 ECL was invented at IBM in 1956 for use in IBM's high-performance transistorized computers.
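
The speed advantage can be put in rough numbers using the figures from note 11: a swing of about 0.8 volts for ECL, about 3.2 volts for TTL, and signals that slew at roughly 1 volt per nanosecond. The sketch below is just that arithmetic; it ignores everything else that contributes to gate delay.

```python
SLEW_RATE_V_PER_NS = 1.0   # "about 1 volt per nanosecond", from note 11

def transition_time_ns(swing_volts: float, slew: float = SLEW_RATE_V_PER_NS) -> float:
    """Time for a signal to traverse the full logic swing at a fixed slew rate."""
    return swing_volts / slew

ecl_ns = transition_time_ns(0.8)   # ECL swing from note 11
ttl_ns = transition_time_ns(3.2)   # typical TTL swing from note 11
print(f"ECL: {ecl_ns:.1f} ns, TTL: {ttl_ns:.1f} ns, ratio {ttl_ns / ecl_ns:.0f}x")
```

By this estimate, the smaller swing alone saves a couple of nanoseconds per transition.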

ECL is based on a differential pair, a circuit that amplifies the difference between two inputs. (This circuit is also the basis of op-amps.) The idea behind a differential pair (below) is that a fixed current flows through the circuit. If the left input is a higher voltage than the right, the left transistor will turn on and most current will flow through the left branch (red). Conversely, if the right input is a higher voltage than the left, the right transistor will turn on and most current will flow through the right branch (blue). The differential pair provides amplification because a small difference in the inputs will create a large shift in the current.

A differential pair amplifies the difference between the two inputs.
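
The current steering can be sketched with the textbook model of an ideal bipolar differential pair, where the split depends exponentially on the input difference. This is a simplification rather than a model of the actual chip: it ignores base currents and component mismatch, and the thermal voltage of 26 mV is the usual room-temperature approximation.

```python
import math

V_T = 0.026  # thermal voltage in volts near room temperature (approximate)

def split_current(v_left: float, v_right: float, i_tail: float):
    """Return (i_left, i_right) for an ideal bipolar differential pair."""
    i_left = i_tail / (1.0 + math.exp(-(v_left - v_right) / V_T))
    return i_left, i_tail - i_left

for dv_mv in (0, 10, 50, 100):
    left, right = split_current(dv_mv / 1000, 0.0, i_tail=1.0)
    print(f"left input higher by {dv_mv:3d} mV: left {left:.0%}, right {right:.0%}")
```

With equal inputs the current splits evenly, while a difference of just 100 mV steers roughly 98% of it through one branch.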

The above circuit is used as an amplifier in the chip, but with a few modifications it also forms an ECL gate. For a gate, the voltage into one branch is fixed at a reference voltage, midway between the "0" level and the "1" level. Thus, if the input is higher than the reference voltage, it will be considered a "1", and lower will be a "0". (MST chips used ground as the reference voltage.4) The ECL circuit below is an inverter, since if the input is high, the current through the left resistor will pull the output low. To improve performance, the bottom resistor has been replaced with a current sink circuit (purple). The current through the current sink is set by an external bias voltage (VCS).

The differential pair can be modified to produce an ECL inverter.
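
A behavioral sketch of the inverter follows. The reference at ground comes from the text; the ±0.4 V logic levels are hypothetical, picked to match the roughly 0.8 V swing mentioned in note 11. The model reuses the ideal differential-pair equation and ignores the voltage shift that a real emitter follower adds.

```python
import math

V_T = 0.026     # thermal voltage in volts (approximate)
V_REF = 0.0     # reference at ground, as the text says for these MST chips
V_HIGH = 0.4    # hypothetical "1" level in volts
V_LOW = -0.4    # hypothetical "0" level in volts

def ecl_inverter(v_in: float) -> float:
    """Idealized inverter output: the more current the input side takes,
    the further the output is pulled down from V_HIGH toward V_LOW."""
    fraction_in_input_branch = 1.0 / (1.0 + math.exp(-(v_in - V_REF) / V_T))
    return V_HIGH - (V_HIGH - V_LOW) * fraction_in_input_branch

def as_bit(v: float) -> int:
    """Interpret a voltage as a logic value by comparing it to the reference."""
    return 1 if v > V_REF else 0

for v_in in (V_LOW, V_HIGH):
    v_out = ecl_inverter(v_in)
    print(f"in {v_in:+.2f} V ({as_bit(v_in)}) -> out {v_out:+.2f} V ({as_bit(v_out)})")
```

An input exactly at the reference leaves the output midway between the two levels, which is why the logic levels need to sit clearly on either side of it.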

A buffer (green) has been added to the output above. The buffer circuit is called an emitter follower since the output is taken from the transistor's emitter and the output follows the input. This is why IBM used the name Current-Switch Emitter Follower for this logic family.

The sense amplifier chip's circuitry

I reverse-engineered the chip's circuitry and found it contains two copies of the circuit below. This circuit is a differential amplifier, probably used as a sense amplifier to amplify the outputs from the memory chip and convert them to logic signals.13

The chip takes two inputs, a negative input and a positive input, and produces a logic-level output. The circuit is a bit complicated, but I'll try to explain the highlights. The differential amplifiers (discussed earlier) are the core of the chip. The input signals are buffered and then go into the lower amplifier (green box). The outputs from that amplifier go into the upper amplifier. Cascading two amplifier stages in this way makes the chip very sensitive, providing a large degree of amplification.

Reverse-engineered schematic of the sense amplifier chip.
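
To see why cascading two stages makes the circuit so sensitive, the sketch below chains two idealized differential stages. The 0.4 V maximum swing per stage is a hypothetical value, and treating both stages as identical is an assumption; the real component values were not extracted from the die.

```python
import math

V_T = 0.026        # thermal voltage in volts (approximate)
MAX_SWING = 0.4    # hypothetical maximum output of one stage, in volts

def stage(v_diff: float) -> float:
    """Ideal differential stage: linear for small inputs, limiting for large ones."""
    return MAX_SWING * math.tanh(v_diff / (2 * V_T))

def sense_amp(v_diff: float) -> float:
    """Two stages in cascade (assumed identical), like the lower and upper amplifiers."""
    return stage(stage(v_diff))

for mv in (1, 5, 10):
    v = mv / 1000
    print(f"{mv:2d} mV in -> one stage {stage(v) * 1000:5.1f} mV, "
          f"two stages {sense_amp(v) * 1000:5.1f} mV")
```

With these assumed values, a 10 mV input difference is already enough to drive the second stage most of the way to its limit, which is the behavior wanted from a sense amplifier feeding logic.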

The yellow boxes are buffers, using the emitter-follower circuit described earlier. One buffer is used on each input and one on the output. The purple box is an ECL gate. I believe it is used to latch the amplifier's value by feeding the output back in. The current sink transistors are colored blue to distinguish them. They provide a constant current to the differential amplifiers and other circuits.

Conclusion

Well, this is a lot of analysis for a paperweight. But this paperweight provides an interesting window into IBM's technology of 1974.14 In particular, it illustrates IBM's transition to integrated circuits and semiconductor memory for the System/370 mainframes. It also explains IBM's unique construction technique for integrated circuits, packaging them on a ceramic wafer in a square metal can, a technology they called MST. Finally, the paperweight's 1-kilobit memory chip shows the amazing progress that memory technology has made over the past decades, giving us megabit chips and now multi-gigabit chips.

Thanks to @magnetic_tape for sending me the paperweight. Thanks to Mark Smotherman for information on MST. I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed. If you're interested in old IBM technology, see my recent post on an IBM Technology Box, covering 1948 to 1986.

Notes and references

  1. The text "Essones" on the paperweight refers to IBM's semiconductor plant in Corbeil-Essones, a suburb of Paris. IBM opened this plant in 1964, Europe's biggest semiconductor factory at the time. 

  2. Curiously, one of the dies in the paperweight is damaged and has a corner missing. Note that it's not simply broken, since the metal layer and the silicon doping don't go to the edge. Probably this die is from the edge of the wafer so it didn't get fully exposed. With the incomplete wafer and the truncated die, it's clear that they were using junk parts in the paperweights.

    One die in the paperweight is damaged.

  3. For a while, IBM used a rational numbering system for the System/370 models, with computer power increasing with the model number. Model numbers ranged from the low-end Model 115 to the high-end Model 195. However, the numbering system fell apart in the late 1970s as systems were assigned seemingly-random numbers such as 3031, 4361, 3090, and 9370. Despite having the biggest number, the 9370 was a low-end machine. See IBM's 360 and Early 370 Systems for a detailed history of the System/370. 

  4. IBM had multiple versions of MST logic for different products; some versions used different voltages. MST-1 uses ground as the upper voltage, -4 volts as the lower voltage, and -1.32 volts as the ECL reference voltage. (Because ECL circuits are more sensitive to fluctuations in the upper voltage, ECL families often assign that level to ground, making the lower voltage negative.) MST-2 shifts the levels so the reference level is ground; the upper voltage is +1.25V and the lower is -3V.

    I couldn't find much information on the other MST variants, but for reference I'll summarize what I did find. MST-2 was used in the S/370 Models 145 and 155, while MST-4 was a high-performance version developed by Texas Instruments and used in the S/360 Model 85. The S/370 Model 168 used MST-1, MST-2, MST-4, and MST-A. The System/3 used MST-10. The IBM 3889 OCR machine, 3350 Disk Storage, and 3704 Communications Controller used MST-1 and MST-E. The IBM 3031 used MST-1, MST-2, MST-4, MST-4E, MST-E, and MST-A. Other versions included MST-195 and MST-255. 

  5. The MST ceramic substrate provides the interface between two circuitry scales: the printed circuit board scale with 0.125-inch pin spacing, and the integrated circuit scale with 0.01-inch solder ball spacing. The pattern on the MST ceramic substrate has some interesting subtleties; each power pin is connected to three solder balls, allowing more current into the IC. The trace for V- crosses the chip, providing two connections on one side and one on the other. The trace for V+ extends into the middle of the IC to provide additional power connections.

    Diagram showing how the chip is mounted on the ceramic substrate. (The chip image has been mirrored to account for it being mounted upside down.)

    For some reason, MST uses two different pin-numbering schemes. The 12-pin SLT numbering was extended by spiraling pins 13 through 16 into the middle. But the more common MST pin names are A01 through D04.

  6. IBM called the chip mounting technique "controlled-collapse chip connections" or C-4. It used a controlled volume of solder to make electrical and mechanical contact with the module. During soldering, the chip was pulled into alignment with the module fingers by surface tension, similar to how a surface-mount device is soldered today. For more details, see Design of Logic Circuit Technology for IBM System/370 Models 145 and 155.

  7. Information on wafer sizes is here and on Wikipedia

  8. The photo below is the best resolution I could get of the memory cells. I believe this is six memory cells; I put a box around one. The circuitry in two rows is connected as shown in blue. This is probably two cross-coupled inverters, a standard circuit for a static RAM cell.

    Closeup of six memory cells in the memory chip.

  9. The diagram below provides more details on the construction and dimensions of the transistors.

    The transistors in the MST chips have a single base and collector but have two base and collector connections to reduce current density. Image from Design of Logic Circuit Technology for IBM System/370 Models 145 and 155.

  10. The master slice approach used a fixed silicon layout with transistors and resistors, but changed the metal interconnections to create different chips, a process called "personalization". The diagram below, from patent 3539876, shows a silicon layout used for IBM's master slice integrated circuits. If you match up the resistors and transistors, this diagram is almost identical to the die in the paperweight. There are a few differences, though. In particular, the die has an extra pin on the left and right, with slight resistor changes to accommodate them. Design of Monolithic Circuit Chips (1966) describes the origins of the master slice approach. Even in 1966, they were using computer-assisted design for integrated circuits.

    The die structure from patent 3539876 is almost identical to the chip.

  11. ECL gates obtained much of their speed advantage because the transistors were not completely turned on (i.e. saturated). This allowed the transistors to switch the current path rapidly. Additionally, the difference between a "0" voltage and a "1" voltage was small (about 0.8 volts), so signals could switch between the two voltages quickly. In comparison, TTL gates typically had a difference of about 3.2 volts between a "0" and a "1", requiring more time to switch. (Signals could typically switch at about 1 volt per nanosecond, so a larger voltage swing caused nanoseconds of delay.) On the other hand, the small voltage swings of ECL made the circuits more sensitive to electrical noise.

  12. For more details on ECL logic and how IBM used it, see Design of Logic Circuit Technology for IBM System/370 Models 145 and 155.

  13. I'm not completely sure of the role of this chip. I searched extensively, but couldn't find any documentation on it. IBM's MST modules are described in detail in MST-2 Module Data (1974). Inconveniently, the chip's part number (2551667) doesn't appear in this document (although nearby part numbers such as 2551665 are described). Thus, I had to study the circuit to determine its function. At first I expected it to be a standard logic gate. However, neither the two amplification stages nor the complementary inputs made sense for a logic gate. Another possibility was that it converted differential signals (such as from the Differential Current Switch logic family) into ECL signals. That would explain the differential inputs, but not the two stages of amplification.

    I think it's most likely that the chip is acting as a sense amplifier for the memory chip, amplifying the memory chip's output and turning it into a logic level. The 370 Model 145 hardware manual (page 3-9) describes a sense latch module used with its memory, so external sense amplifiers were used in System/370. The chip pin that I've labeled "latch" may be used to feed back the output to latch it, or it may be used as an enable pin or to reset the latch; without seeing the surrounding circuitry, I'm not sure.

    Intel also produced memory chips that required external sense amplifiers; see the Intel 1103 and Intel 2105. Intel produced sense amplifier chips, the 3208 and 3408 Hex Sense Amplifiers, specifically to provide external sense amplifiers for memory. One motivation for external sense amplifiers was that memory chips were built with MOSFET transistors, but bipolar transistors produced better amplifiers. Later memory chips, though, included the sense amplifiers on the chip.

  14. I'm guessing that the module is from 1974. Based on the technology, the paperweight is from the early 1970s. The module is labeled with the code "1 425C404". My theory is that the second digit "4" indicates the year, dating the module to 1974. IBM's modules are usually labeled with three lines of text, but there's no solid information on the meanings. The first line is the part number. The second line is believed to indicate the manufacturing location. (So "IBM 52" would indicate Essones, France. Although a reader tells me that IBM 52 was Poughkeepsie or Fishkill NY, while IBM 29 was Essex.) The third line is believed to be a date/lot code. Studying an extensive collection of cards, the digit after the 1 appears to be the year. For instance, some codes start with "1712" for 1977, "1 949" and "1925" for 1979, "1-005" and "1 031" for 1980, "1-106" for 1981, "1 205" for 1982, "1 444" for 1984, "1 865" for 1988, "1912" for 1989. But other modules have codes starting with "1 E52", "1 F09", and "1 H27" so it's not quite that simple. There also are a few codes like "1 8450" for 1984, suggesting they also used 2-digit year codes. It's possible that different sites used different codes. 
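
The date-code theory in note 14 (the digit after the leading "1" is the last digit of the year) can be sketched as a small parser. The function name and the choice to return both the 1970s and 1980s candidates are assumptions made for illustration; the note itself says there is no solid information on what the codes mean.

```python
import re

def candidate_years(code: str):
    """Return possible years for a module code under the 'digit after the 1'
    theory, or an empty list if the code doesn't fit the pattern."""
    match = re.match(r"^1[ -]?(\d)", code)
    if not match:
        return []
    digit = int(match.group(1))
    # A single digit can't distinguish decades, so list both plausible ones.
    return [1970 + digit, 1980 + digit]

for code in ("1 425C404", "1712", "1-005", "1 E52"):
    print(repr(code), candidate_years(code))
```

Codes with a letter after the "1", such as "1 E52", fall outside the theory and return nothing. A code like "1 8450", which the note reads as a two-digit 1984, would be misread by this one-digit rule, so the sketch shares the theory's limitations.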

Reverse-engineering the clock chip in the first MOS calculator

In 1969, Sharp introduced the first calculator built from high-density MOS chips, the QT-8D, followed by the handheld Sharp EL-8, the world's smallest calculator at the time.1 These calculators were high-end products, selling for $345 (about $1800 today). Integrated circuits at the time couldn't fit the entire calculator on one chip, so these calculators contained five ICs: an arithmetic chip, a decimal point chip, a keypad/display chip, a control chip, and a clock chip.

This blog post discusses the clock chip and how it generated the unusual four-phase clock signals required by the calculator. The die photo below, provided by calculator researcher Francois Gueissaz, shows the silicon die of the clock chip. The silicon substrate has a purple tint while the doped, conductive silicon is green. The metal layer on top is white. Around the edges, seven thin bond wires connect the die to the external pins.2 This chip has about 200 transistors and implements just a dozen moderately complex logic gates. While the density of this chip is absurdly low by modern standards, it illustrates the progress of MOS integrated circuits in the late 1960s.

Die photo of the CG2341 clock generator. This photo (and many others) courtesy of Francois Gueissaz.

Although computers now all use MOS integrated circuits, the path to MOS was rocky, with MOS integrated circuits viewed as slow and unreliable in the 1960s.4 Handheld calculators were a good match for the characteristics of MOS, though: they needed to be compact and lightweight with low power consumption, but computational speed was not important. In 1969, the Japanese calculator company Sharp signed a $30 million deal with Rockwell for this MOS-based calculator chipset, the largest MOS order in history at the time. The five chips were implemented by the Autonetics division of Rockwell.3

The Sharp EL-8 calculator. Note the unusual 8-segment display for the digits. Photo by Mister rf (CC BY-SA 4.0).

Although the Sharp calculator (above) was handheld, you can see that it was rather thick and chunky, with unusual 8-segment vacuum fluorescent display tubes for its display. The photo below shows the circuit board inside the calculator. The board is dominated by the four large integrated circuits with circular golden lids. These integrated circuits were packaged as 42-pin ceramic ICs with staggered pins. Unlike modern printed circuit boards, the traces on this board are curved, showing its hand-drawn layout.

The circuit board for the Sharp EL-8 calculator. The clock IC is the small metal-can package in the middle. Photo from Mister rf (CC BY-SA 4.0).

The clock IC is packaged in the small 10-pin metal can, marked with a blurry Rockwell logo (the inset shows the logo). This part number is CG1121 (probably standing for Clock Generator) and is similar to the CG2341 I examined. The date code 7047 indicates this IC was manufactured in the 47th week of 1970, i.e. late November.

The clock integrated circuit was packaged in a 10-pin metal can. The logo on the integrated circuits isn't clear, but it is the Rockwell logo as shown in the inset.
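
Decoding a YYWW date code like 7047 is simple enough to sketch. The function below assumes a 1900s year and ISO-style week numbering (weeks starting on Monday); whether the manufacturer counted weeks exactly this way is an assumption, so the result could be off by a few days.

```python
from datetime import date, timedelta

def decode_date_code(code: str):
    """Split a YYWW code into the first and last day of that week.
    Assumes a 1900s year and ISO week numbering (weeks start on Monday)."""
    year = 1900 + int(code[:2])
    week = int(code[2:])
    monday = date.fromisocalendar(year, week, 1)
    return monday, monday + timedelta(days=6)

start, end = decode_date_code("7047")
print(start.isoformat(), "to", end.isoformat())
```

Under these assumptions, week 47 of 1970 runs from November 16 to November 22.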

Cutting the top off the metal can integrated circuit reveals the tiny silicon die. Although the metal can has 10 pins, only seven pins are wired to the die. The metal tab at the top of the photo indicates pin 1 of the integrated circuit.

The metal can of the CG2341 with the lid removed, showing the silicon die inside.

Why do the calculator chips require a complex four-phase clock? In 1966, Autonetics invented a technique for building logic circuits called four-phase logic. Unlike standard static logic gates, these logic gates held values dynamically using the capacitance of the wiring. The four-phase clock stepped the gates through sequences of precharging and then computing the logic function. This sounds complicated, but four-phase logic had ten times the density of standard logic gates, as well as using 1/10 the power and having 10 times the speed. As a result, many early high-density MOS chips used four-phase logic.5
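
The precharge-then-compute idea can be illustrated with a toy model of a single dynamic gate. This is a deliberately simplified sketch, not the Autonetics circuit: it collapses the clocking into two steps for one gate, whereas the real four-phase scheme staggers these steps across successive gates. The class and method names are invented for illustration.

```python
class DynamicGate:
    """Toy model of one dynamic logic stage. The output node is just stored
    charge (a bool here), not a continuously driven signal."""

    def __init__(self, logic):
        self.logic = logic          # decides whether the node discharges
        self.node_charged = False

    def precharge(self):
        """Clock step that unconditionally charges the output node."""
        self.node_charged = True

    def evaluate(self, *inputs):
        """Clock step that conditionally discharges the node."""
        if self.logic(*inputs):
            self.node_charged = False

    def output(self):
        """Between clock steps, the wiring capacitance holds the last value."""
        return self.node_charged

# A NAND-style stage: the node discharges only when both inputs are active.
gate = DynamicGate(lambda a, b: a and b)
for a, b in [(False, False), (True, False), (True, True)]:
    gate.precharge()
    gate.evaluate(a, b)
    print(a, b, "->", gate.output())
```

The key point is that nothing drives the output once the evaluate step ends; the value survives only as stored charge until the next precharge, which is why the clock has to keep running.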

Constructing transistors, resistors, and capacitors

Transistors are the key component of the chip. The diagram below shows a metal-gate PMOS transistor, the (somewhat primitive) type of transistor used in this IC. At the bottom, two regions of silicon (green) are doped to make them conductive, forming the source and drain of the transistor. The gate is formed by a metal strip between the silicon regions, separated from the silicon by a thin layer of insulating oxide. (These three layers, Metal, Oxide, and Semiconductor, give the MOS transistor its name.) The transistor can be considered a switch between the source and drain, controlled by the gate. To simplify the behavior, a PMOS transistor turns on when the gate is pulled negative (-25 volts), while the transistor turns off when the gate is at 0 volts. (These early PMOS transistors required an inconveniently large negative voltage.)

Structure of a PMOS metal-gate transistor.

The photos below show transistors on the die as they appear under a microscope. The silicon and metal layers match the diagram above; the doped silicon is greenish while the metal layer on top is white. The gate is formed where the metal and silicon overlap, with a faint oval where the oxide is thinned. These transistors are three different sizes: the wider transistors allow higher current. The transistors are carefully sized in the circuits based on the required current.

Three transistors of various sizes, as seen on the die.

The next important component is the resistor; the photo below shows three resistors. These resistors may look like transistors, and that's because they are transistors. While the transistors above were widened to support more current, these transistors are made longer so the long path reduces the current flow through the transistors. This makes them act as resistors. The metal gate of these transistors is tied to -25 volts, so the transistors are always on, rather than operating as switches.

Resistors of various sizes.

The final important component of the integrated circuit is the capacitor. A capacitor is formed by using metal for one plate and doped silicon (green) for the other plate, separated by the insulating oxide layer. The photo below shows two small capacitors and one large capacitor, at the same scale. The large capacitor is used in the output circuitry; the metal stripes above and below it are transistors that drive it.

Two small capacitors and one very large capacitor.

Implementing an inverter and NAND gate

With these components, logic gates can be constructed. The schematic below shows how an inverter is implemented in the IC. The layout of the schematic matches the die image underneath, so hopefully the transistors and capacitor can be recognized. If the input is low, the input transistor turns on, pulling the output to ground (i.e. high). If the input is high, the input transistor turns off and the "bootstrap load", the tricky circuit on the right, pulls the output to -25V (i.e. low). Thus, the circuit inverts the input.

An inverter using a bootstrap load.

Conceptually, you can think of the bootstrap load as a pull-down resistor. The implementation is complex to compensate for the poor characteristics of transistors at the time. The capacitor acts as a charge pump, providing a necessary voltage boost when the circuit switches. (For more details on bootstrap loads, see my earlier article.)

The implementation of a NAND gate is similar to the inverter above, but with multiple input transistors in parallel. If any input is low, the corresponding input transistor turns on, pulling the output to ground (i.e. high), as required by a NAND gate.
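
The inverter and NAND behavior described above can be sketched as a tiny switch-level model. This is only an idealized illustration: the function names and the halfway switching point are assumptions made for the example, not measurements from the chip, and the bootstrap load is treated as a simple pull-down to -25 volts.

```python
GND = 0.0      # ground, which is logic "high" in this PMOS scheme
VNEG = -25.0   # negative supply, which is logic "low"

def pmos_on(gate_v):
    """Idealized switch: the transistor conducts when its gate is pulled negative.

    The halfway switching point is a hypothetical simplification.
    """
    return gate_v < VNEG / 2

def inverter(inp):
    # Input transistor on: output tied to ground. Otherwise the load pulls it to -25 V.
    return GND if pmos_on(inp) else VNEG

def nand(*inputs):
    # Input transistors in parallel: any one that is on pulls the output to ground.
    return GND if any(pmos_on(v) for v in inputs) else VNEG

for a in (VNEG, GND):
    for b in (VNEG, GND):
        print(f"a={a:6.1f} V  b={b:6.1f} V  ->  nand={nand(a, b):6.1f} V")
```

Reading ground as 1 and -25 volts as 0, the output is 0 only when both inputs are 1, which is the NAND function.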

The NAND delay gate

The die photo below shows the functional blocks of the clock chip. Eight NAND gates (red) form an oscillating 4-bit shift register. Four gates (yellow) generate the four-phase clock signals from the shift register outputs. Finally, four output driver circuits (orange) amplify these signals to produce high-current outputs.

The clock chip die with key components labeled.

The main building block of the clock chip is a NAND gate that has a delay when its output goes low. This delay creates the timing of the clock signal.6 The diagram below shows how the gate is constructed; the schematic corresponds to the layout of the circuit on the die. The delay makes this circuit somewhat complex and partially analog, but I'll try to explain it.

The NAND delay gate uses an R-C circuit to provide the delay. For simplicity, the bootstrap load is represented by a resistor.

The NAND circuit is in the upper right; two input transistors and a bootstrap load implement the NAND circuit described earlier. The output of the NAND gate goes through a resistor-capacitor circuit. This delays the output as the capacitor slowly charges through the resistor. The speed of the clock is controlled by the bias pin, which sets a threshold voltage. This voltage controls the point in the resistor-capacitor curve when the level switching transistor turns on.7 By lowering the voltage on the bias pin, the transistor switches sooner, increasing the clock speed. The typical clock speed is 60 kHz, a slow clock even compared to early microprocessors, but calculators didn't require much speed.
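
To get a feel for how the switching point on the resistor-capacitor curve sets the delay, here is a rough sketch using the standard RC charging formula. The resistor and capacitor values are hypothetical, picked only to give microsecond-scale numbers; the article does not state the real component values, and the switching point is modeled simply as a fraction of the full voltage swing.

```python
import math

def rc_delay(r_ohms, c_farads, fraction):
    """Time for an RC node to cover `fraction` of its full voltage swing."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return -r_ohms * c_farads * math.log(1.0 - fraction)

# Hypothetical component values, for illustration only.
R = 1e6      # 1 megohm
C = 2e-12    # 2 picofarads

for fraction in (0.3, 0.5, 0.7):
    delay_us = rc_delay(R, C, fraction) * 1e6
    print(f"switch at {fraction:.0%} of the swing: {delay_us:.2f} microseconds")
```

Moving the switching point earlier on the curve shortens the delay, which is the mechanism by which the bias pin speeds up the clock.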

When the level switching transistor turns on, it pulls the buffer high,8 driving the inverter's output low. The inverter has a bootstrap load to provide sufficient output current. Finally, the output is fed back to the bias circuit, probably to sharpen the transition and provide hysteresis. To summarize, this complex circuit implements a delayed NAND gate. It is the key functional block of the chip, repeated ten times.

The clock shift register

The clock is built from a 4-stage shift register. The idea is that each stage of the shift register shifts its bit to the right, after a delay. The bit on the right is inverted and shifted into the left side of the shift register. Thus, the shift register implements a ring counter, first shifting in 1's at the left and then shifting in 0's: the bit pattern is 0000, 1000, 1100, 1110, 1111, 0111, 0011, 0001, and back to 0000. This complete cycle corresponds to one 60 kilohertz clock cycle for the calculator.

The schematic below shows how the shift register is built from eight cross-coupled NAND gates with delay, using the circuit described earlier. Each pair of NAND gates forms a latch, storing either a 0 or a 1. The latch outputs are labeled Q0 through Q3, while the inverted outputs are labeled Q̅0 through Q̅3. The outputs from each latch are connected to the inputs of the next stage, so the bits are shifted to the right. Note that the wires from the last stage back to the first stage are crossed; this causes the bit to be inverted. If the delay is decreased (through the bias pin), the speed of the shift register increases, increasing the clock speed.

The 4-stage shift register.

The shift register must be initialized to the proper state, which is the job of the reset gate. When the shift register is powered up, the reset gate initializes the latches to hold zeros by pulling the lower inputs to the latches low.
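
The shift-and-invert behavior (often called a Johnson or twisted-ring counter) is easy to check with a short simulation. This is a logical sketch only: it ignores the analog delays and the latch internals, and the function name is made up for the example. The step-time estimate assumes the eight steps are averaged evenly over one 60 kHz cycle, which glosses over the unequal rise and fall delays.

```python
def johnson_states(stages=4):
    """Return one full cycle of a twisted-ring counter, starting from the reset state."""
    bits = [0] * stages              # the reset gate leaves every latch holding 0
    states = []
    for _ in range(2 * stages):      # the pattern repeats after 2 * stages shifts
        states.append("".join(str(b) for b in bits))
        # Shift right; the inverted rightmost bit enters on the left.
        bits = [1 - bits[-1]] + bits[:-1]
    return states

states = johnson_states()
print(", ".join(states))

# Average time per shift if the whole cycle takes one 60 kHz clock period.
clock_hz = 60e3
step_us = 1e6 / (clock_hz * len(states))
print(f"average step time: {step_us:.2f} microseconds")
```

The printed sequence should match the bit pattern listed earlier, returning to 0000 after eight shifts.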

Output circuit

The output circuitry generates the four clock phase outputs from the shift register values. Two phases come from the last shift register stage and its complement. The other two phases are more complex. An unwired "select" pin selects between two outputs for these pins; presumably this pin was wired in other versions of the clock chip to provide different clock signals for a different calculator. In the normal case, these clock outputs are formed by NANDing together two shift register outputs to produce a shorter pulse.

The output circuit produces four clock outputs from the shift register values.

The photo below shows one of the output buffers. The output signal enters at the left, travels through the buffer circuitry, and exits the chip through the bond wire on the right. The right half consists of two large transistors to provide the high output currents: one transistor pulls the output up to ground, while the other transistor pulls the output down to -25V. The remainder of the circuitry amplifies the small internal signal so it can drive the output transistors. Note the large bootstrap capacitor near the center; it helps drive one of the output transistors. There are also much smaller bootstrap capacitors in the upper left. This output buffer circuit is repeated four times, once for each output pin.

One output buffer as it appears on the die.

The output buffer transistors must be large due to an unusual characteristic of four-phase logic. Normal clocked logic uses the clock signals for timing, while the logic gates are connected to power and ground. In four-phase logic, however, the clock signals provide the power for the logic gates; there are no separate power and ground connections. When the gates are precharged and discharged by the clock signals, this provides the power for the gates. Thus, four-phase logic requires relatively high-current clock signals, since they are powering the circuits.9

To see the chip in action, the oscilloscope trace below shows the four clock outputs as measured from the chip. The yellow and blue traces are the main phases; note that the active (low) parts do not overlap. The magenta and green outputs are active during the first part of the yellow and blue phases, respectively. These clocks are used to precharge the logic circuits. (The clock phases match those on Wikipedia's four-phase article, except the polarity is reversed because of the PMOS transistors.)

Oscilloscope trace showing the four output phases from the clock chip.

Conclusion

Rockwell fit a calculator onto five chips, making the handheld calculator possible. However, Texas Instruments, Mostek, and other companies soon fit all the circuitry onto a single chip, creating the calculator-on-a-chip. Selling calculators was highly profitable for a short time and 11 million calculators were sold in the US in 1974. Although calculators sold for hundreds of dollars in 1969, competition and the improvements in technology caused calculator prices to plummet to $15 by 1975. The profit margin collapsed during the "calculator wars"; Texas Instruments alone lost $16 million in 1975.4

Although the calculator market was risky, the massive sales of calculators provided an important boost to MOS chip technology in the early 1970s, and thus the computer industry. In particular, microprocessors started with the Intel 4004, a chip designed for a calculator. And microcontrollers were created out of Texas Instruments' line of calculator chips. While a chip such as the CG2341 clock generator is trivial by modern standards with about 200 transistors, it provides a historical window into how chips were constructed in the early days of MOS ICs.

Thanks to Francois Gueissaz for doing all the hard work of obtaining the calculator ICs, decapping them, and providing me with die photos and other information. I announce my latest blog posts on Twitter, so follow me at kenshirriff. I also have an RSS feed.

Notes and references

  1. See this interesting vintage commercial for the Sharp EL-8 calculator for more information. 

  2. Measuring the die photo, I believe this chip uses a 15 µm process, so the transistors and features are very large by modern standards. (This is why five chips were required to implement the calculator.) In comparison, many modern chips use a 14 nm process, so the width of a modern transistor is roughly 1000 times smaller, and the area is roughly a million times smaller. This shows the amazing progress in silicon technology described by Moore's Law. 

  3. It's hard to follow the spin-offs and acquisitions of the companies involved. Autonetics was founded as the research laboratory for North American Aviation in 1945. Among other things, Autonetics developed guidance computers for the Minuteman missile. Although North American Aviation is mostly forgotten now, it was a major aerospace company, building everything from the P-51 Mustang in World War II to the command and service module for the Apollo landing. It merged with Rockwell in 1967, becoming North American Rockwell. In 1970, about 800 employees from Autonetics were split off to form North American Rockwell MicroElectronics to develop and manufacture commercial integrated circuits. This later became Rockwell Semiconductor, then spun off into Conexant, which was later acquired by Synaptics. Rockwell was sold to Boeing in 1996.

    Sharp, on the other hand, started as Hayakawa Metal Works in 1924, eventually being renamed Sharp Corporation in 1970. (The name came from the Ever-Sharp mechanical pencil, one of Hayakawa's early inventions.) Foxconn bought the majority of Sharp in 2016; Foxconn, also known as Hon Hai Precision Industry, is a Taiwanese electronics manufacturer. Although best known for manufacturing the iPhone for Apple, Foxconn is estimated to manufacture 40% of the world's consumer electronics. 

  4. Much of the historical information in this post comes from the books To the Digital Age and History of Semiconductor Engineering. These books provide a detailed look at the rise of MOS integrated circuits. 

  5. One of the main proponents of four-phase logic was Lee Boysel, who founded a company Four-Phase Systems around it. The company built 24-bit computers, which were some of the earliest MOS-based computers. Boysel's EECS presentation describes the advantages of four-phase logic. 

  6. One important characteristic of the delayed NAND gate is that the delay is much larger when the output goes low than when the output goes high. This ensures that the output clock phases do not overlap while active (low). This is necessary for four-phase logic to ensure that logic gates don't conflict with each other. 

  7. The level switching transistor (like other PMOS transistors) will turn on when the gate voltage is lower than the source voltage by Vt (the transistor's threshold voltage). Thus, by controlling the bias voltage on the transistor's source, the transistor can be made to turn on sooner or later, controlling the frequency. 

  8. Note that the buffer circuit is constructed "backward" compared to a standard PMOS inverter. A PMOS inverter has the transistor connected to ground with a load resistor to -25V, while the buffer has the transistor connected to -25V and the load resistor to ground. I think it is constructed this way to shift the voltage levels from the level switching transistor. 

  9. Although the four-phase clocks power the logic gates, the chips also have regular power and ground connections. These power the output pins since the current demands are too large to be reasonably satisfied by the clocks. 

How the bootstrap load made the historic Intel 8008 processor possible

Near the end of 1972, Intel introduced their first 8-bit microprocessor, the 8008. Decades later, this processor still influences computing; you probably use an x86 processor that is a descendant of the 8008. One unusual feature of the 8008 processor is its use of a "bootstrap load" or "bootstrap capacitor", a special capacitor circuit to improve performance.1 Federico Faggin, who led the development of the 8008, is the main character in this story; he invented a new way to fabricate bootstrap capacitors for the Intel 4004 and 8008 processors and says it "proved essential to the microprocessor realization" and "without [the bootstrap load], there was no microprocessor."

Die photo of the 8008 microprocessor. (Click for a larger image.)
The initials HF appear on the top right for Hal Feeney, who did the chip's logic design and physical layout.

My photo above shows the tiny silicon die inside the 8008 package. You can barely see the wires and transistors that make up the chip. There are 90 bootstrap capacitors, visible as small yellow rectangles, especially in the upper center. The squares around the outside are the 18 pads that are connected to the external pins by tiny bond wires. 18 pins is a very small number for a microprocessor, but Intel was bizarrely committed to small packages at the time.2 This required inconvenient tradeoffs; the lack of multiple power pins was one factor forcing the use of bootstrap loads.

The 8008 processor's history is more complex than you might expect. Its roots are the Datapoint 2200, a popular computer introduced in 1970 as a programmable terminal. Created before the microprocessor, the Datapoint 2200 contained a board-sized CPU built from individual TTL chips. Datapoint talked with both Intel and Texas Instruments about replacing the processor board with a single MOS chip. Texas Instruments created the TMX 1795 processor in March 1971, while Intel created the 8008 around the end of 1971, but Datapoint rejected both chips for a variety of reasons. Texas Instruments abandoned the TMX 1795 after their attempts to market it failed. Intel, on the other hand, marketed the 8008 as a general-purpose microprocessor, creating the microprocessor industry.

(You might wonder how the Intel 4004 fits into this story. The Intel 4004 is architecturally unrelated to the 8008 in almost every way; despite the similar names, the 8008 is not an 8-bit version of the 4-bit 4004. After the Intel 4004 was launched in 1971, much of the 4004 team (including Faggin, Hoff, Mazor, and Feeney) moved over to the 8008 project. Because the 4004 and 8008 processors were built by the same team with the same PMOS3 process, they have some layout and circuit-level similarities, in particular the bootstrap load circuit.)

Why the bootstrap load?

The purpose of the bootstrap load is to get extra voltage out of a transistor when necessary. To explain this, I'll start by showing how an inverter works when implemented in a processor. The diagram below shows an inverter, built from a PMOS3 transistor and a load resistor (which is actually a transistor). If the input to the inverter is 0 (low), the lower transistor turns on, pulling the output high (1). But if the input is 1 (high), the output transistor turns off. In that case, the load resistor pulls the output low (0). Thus, the input signal is inverted.

How an inverter is constructed from PMOS transistors. The upper symbol indicates a PMOS transistor that is acting as a load resistor. (Based on the 8008 datasheet.)

The diagram below shows the physical implementation of an inverter in the 8008 processor. The first die photo shows the inverter as it appears in the chip. The horizontal metal wiring on top provides VDD and the input to the circuit. For the second photo, I dissolved the metal layer to reveal the two transistors that form the circuit. The schematic on the right matches the physical layout of the transistors on the die but otherwise corresponds to the schematic above. Because creating resistors in an integrated circuit is inconvenient, the load resistor is implemented by a transistor.

How an inverter appears in the 8008 processor.

There's a complication from using a transistor as a load resistor: these MOS transistors have a property called the threshold voltage VT. The problem is that when you try to pull a signal low, the transistor can't pull it all the way low. Although you'd like the signal to get pulled down to VDD (-9 volts), the threshold voltage (say -5 volts)9 means that you can only get the signal down to -4 volts. (This is one of the reasons why the 8008 requires a much larger voltage (14 volts overall, from +5 volts to -9 volts) than modern integrated circuits; if you tried to run it at 5 volts, the threshold voltage would consume the entire signal.)

The diagram below explains the threshold voltage in more detail. VD, VG, and VS are the voltages on the drain, gate, and source respectively. VGS is the voltage between the gate and the source. The transistor will turn on if VGS < VT, the threshold voltage. (Inconveniently, most of these voltages are negative in a PMOS transistor, which makes things confusing.) The problem is that with a gate voltage of -9 volts and a threshold voltage of -5 volts, the transistor will only be on if VS is higher than -4 volts. Thus, the transistor can't pull VS lower than -4 volts. The only way to get VS lower is if you had a more-negative gate voltage, at least -14 volts in this case. Some chips solve this by using an additional voltage supply to provide more voltage to the gate, such as the Intel 8080 or the HP Nanoprocessor.

VD, VG, and VS are the voltages on the transistor's drain, gate, and source respectively. VGS is the voltage difference between the gate and source.
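
The threshold arithmetic above can be written out as a two-line calculation. This is a simplified sketch that takes the turn-on condition VGS < VT at face value, with the article's example values of -9 volts and -5 volts; the function names are made up for the illustration.

```python
def lowest_source_voltage(v_gate, v_t):
    """Most negative source voltage reachable: the transistor is on only while VG - VS < VT."""
    return v_gate - v_t

def gate_voltage_needed(v_source_target, v_t):
    """Gate voltage required for the source to reach the target voltage."""
    return v_source_target + v_t

V_GATE = -9   # gate tied to VDD
V_T = -5      # example threshold voltage from the text

print(lowest_source_voltage(V_GATE, V_T))   # -4
print(gate_voltage_needed(-9, V_T))         # -14
```

With the gate at -9 volts the source stalls at -4 volts, and reaching -9 volts would take a gate voltage of -14 volts, which is the gap the bootstrap load is designed to close.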

The threshold voltage isn't much of a problem when you're dealing with inverters and other gates, because the voltage levels are restored by each gate. However, there are two places where the threshold voltage is a problem: superbuffers and pass transistor logic. In these circuits (described in the footnote4), the threshold voltage drop happens twice, yielding an output that is too weak. Since these circuits are common in processors, a solution was needed: the bootstrap load. It is a way of generating more voltage for the gate to overcome the threshold voltage so the transistor can pull its output all the way to VD.

How the bootstrap load works

The bootstrap load is essentially a charge pump circuit that uses a bootstrap capacitor to boost the gate voltage. The diagram below shows the basic idea of a charge pump. On the left, a capacitor is charged to -9 volts from a voltage source. If you disconnect the voltage source and then re-connect the negative side to the capacitor as shown on the right, the capacitor retains its charge of -9 volts. However, since the lower side of the capacitor is now at -9 volts, the upper side of the capacitor is now at -18 volts. The bootstrap load uses this -18 volts as the gate voltage, sufficient to overcome the threshold voltage.

A charge pump. On the left, the capacitor is charged to -9 volts. On the right, the bottom of the capacitor is connected to -9 volts, yielding -18 volts on top of the capacitor.

The diagram below shows the bootstrap load circuit. The circuit is similar to the inverter described earlier, but with the addition of a capacitor and a transistor. In the first diagram, a 0 input turns on the lower transistor (Q1), yielding a 1 output (+5 volts). Meanwhile, Q3 acts as a load resistor, pulling the top of the capacitor to -4 volts (not -9 volts due to the threshold voltage.) This results in -9 volts stored across the capacitor.

How the bootstrap load circuit works.

The second and third diagrams show what happens with a 1 input. The lower transistor Q1 turns off, allowing Q2 to pull the output low. With a regular inverter, -4 volts is as low as the output can go (second diagram). However, as explained earlier, the capacitor still holds -9 volts, so the top of the capacitor must be -13 volts. With -13 volts on the gate of Q2, Q2 will continue to pull the output lower, until the circuit ends up as shown on the right, with the output pulled all the way down to -9 volts. Note that the source can't get pulled down any lower than the drain, regardless of the gate voltage. (In comparison, the simple inverter described earlier could only pull the output down to -4 volts.)5
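
The voltages in this walkthrough can be reproduced with a few lines of arithmetic. The sketch below is idealized: it assumes the capacitor holds its charge perfectly (no leakage or stray capacitance), uses the example threshold of -5 volts, and uses function names invented for the illustration.

```python
def plain_load_low(v_dd, v_t):
    """Lowest output of the simple inverter: the load's gate sits at VDD, so one VT is lost."""
    return v_dd - v_t

def bootstrap_load(v_high, v_dd, v_t):
    """Return (capacitor voltage, gate voltage as a function of output, final output)."""
    # Input 0: the output sits at v_high while Q3 precharges the top of the capacitor,
    # losing one threshold voltage on the way.
    v_top = v_dd - v_t
    v_cap = v_top - v_high
    # Input 1: the gate of Q2 rides on the capacitor, so its gate-source voltage
    # stays at v_cap while the output falls.
    if not v_cap < v_t:
        raise ValueError("capacitor voltage too small to keep the load transistor on")

    def gate_at(v_out):
        return v_out + v_cap

    return v_cap, gate_at, v_dd   # the output stops at VDD, the drain voltage

v_cap, gate_at, v_final = bootstrap_load(v_high=5, v_dd=-9, v_t=-5)
print("plain inverter low level:", plain_load_low(-9, -5))   # -4
print("capacitor voltage:", v_cap)                           # -9
print("gate with output at -4 V:", gate_at(-4))              # -13
print("final output:", v_final, "gate:", gate_at(v_final))   # -9 and -18
```

Because the gate always sits 9 volts below the output, the gate-source voltage never rises above the threshold, so the load transistor stays on until the output reaches -9 volts.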

The image below shows part of Intel's schematic for the 4004 processor, showing the circuit for a standard load and the circuit for the bootstrap load, indicated by a "B" next to the resistor.

Representation of the bootstrap load on the Intel 4004 schematic. The resistor with "B" symbolizes the bootstrap load circuit next to it.

The silicon-gate bootstrap load

So far, I've discussed the bootstrap load, which was extensively used with MOS circuitry, and was patented by North American Rockwell in 1966. The invention necessary for the 4004 and 8008 processors was the extension of the bootstrap load to silicon-gate integrated circuits.

One of the key inventions that made the 8008 practical was the self-aligning silicon gate transistor.6 The diagram below shows the structure of an MOS transistor. Early MOS integrated circuits used metal-gate7 transistors, which used metal, typically aluminum, instead of polysilicon for the gate. But at Fairchild in 1968, Faggin and Klein invented a practical way to make transistors with silicon gates. This may seem like a trivial difference, but silicon-gate transistors were better than metal-gate transistors in three important ways. First, the electrical properties of silicon-gate transistors are much better than metal-gate transistors, running faster and at lower power. Second, polysilicon provided a second layer for routing signals, making integrated circuit layouts much more compact.

Structure of a PMOS transistor.

Finally, polysilicon permitted construction of self-aligned transistors, which play an important part in the bootstrap capacitor story. Integrated circuits are constructed through a sequence of processing steps, using optical masks and photo-sensitive resist to create patterns on the surface. An integrated circuit with metal-gate transistors is constructed from the bottom up. First, the source and drain regions are doped with impurities to form P-type silicon, as shown below. In a later step, the metal gate is created between the source and the drain, using a different mask. The tricky part is making sure the gate is lined up with the source and the drain; if there's a gap, the transistor won't work. Thus, a metal gate is made larger than necessary so it will still cover the gate channel, even if the alignment of the layers is slightly off. Unfortunately, this overlap creates capacitance and harms performance.

How a photomask is used to dope regions of silicon.

On the other hand, the self-aligned gate is created in the opposite order. The polysilicon gate is created first. In a later step, the source and drain regions are doped. However, a mask isn't used to separate the source and drain from the gate. Instead, the gate itself blocks doping of the region in between the source and drain. Thus, the source and drain are automatically "self-aligned" with the gate, eliminating the excess capacitance from a too-large gate. (Why couldn't metal gates be self-aligned? Because doping the silicon requires high temperatures that would melt the metal, but polysilicon can handle the heat.)

Although self-aligned silicon gates are a major improvement over metal gates, there was one drawback: capacitors. With metal-gate transistors, a capacitor could be easily constructed by using metal and doped silicon as the plates: a large metal layer on top, doped silicon underneath, and a thin insulating oxide layer in between. (In other words, a transistor with a large gate is used as a capacitor.) With self-aligned gates, the polysilicon gate could be used as a capacitor plate in place of the metal layer. However, in the self-aligned process, the polysilicon gate blocks doping of the silicon underneath, which is good for a transistor but bad for a capacitor, since you can't dope the silicon under the polysilicon plate. (You could use an extra manufacturing step to dope the capacitor plates before creating the polysilicon gate, but this extra step would increase the cost.)

Faggin invented a solution that made capacitors practical with self-aligned gates.8 He realized that if you bias the capacitor correctly, the charge on the upper plate will create a conductive region in the silicon underneath it, even without any doping. He tried this at Fairchild and discovered that it worked. This solved the problem of how to use a bootstrap load with self-aligned silicon-gate transistors.

Closeup of a bootstrap load circuit in the 8008.

The photo above zooms in on one of the bootstrap load circuits in the 8008, used in an inverter. The diagram below shows the underlying silicon after removing the metal layer. The bootstrap capacitor is constructed by a layer of polysilicon (pinkish) over the underlying silicon, forming the capacitor plates. The transistor on the right inverts the input. The capacitor is charged by the transistor in the lower left. The load transistor is in the middle; the capacitor provides the boosted voltage to its gate. The transistors have varying sizes depending on their roles. The inverting transistor is the largest since it provides the most current. The transistor that charges the capacitor is very small in comparison because a small current can keep the capacitor charged.

The circuitry of an inverter with a bootstrap load.

This bootstrap load technique was extensively used in the 4004 and 8008 processors. The diagram below shows the bootstrap loads in the 8008 processor, indicated with a red box. The 8008 has 90 bootstrap loads, so it is a significant circuit. Many bootstrap loads are around the periphery of the chip to help drive the output pins. The instruction register (upper center) uses bootstrap loads to drive the relatively large instruction decoder (center). At the right, bootstrap loads drive the register storage (upper right) and stack storage (lower right). Other miscellaneous circuits throughout the processor also use bootstrap loads.

The bootstrap loads in the 8008 are indicated by red boxes.

Conclusion

A final question is whether the bootstrap load was a key invention that made the microprocessor possible (as embodied in the 4004 and 8008) or whether the microprocessor was inevitable regardless of features such as the bootstrap load. One view is that "the buried contact and particularly the bootstrap load, were indispensable to obtain the required speed within the available power budget." Feeney said in an 8008 oral history "that being limited on pins, limited on power supplies, whatever, that the bootstrap load became very, very critical." On the other hand, the development of the microprocessor seemed an inevitable, incremental process to many. Lee Boysel of Four-Phase Systems said in 1970,10 "The computer-on-a-chip is no big deal. It's almost here now... I've no doubt the whole computer will be on one chip within five years." Hal Feeney of Intel said, "at the time in the early 1970s, late 1960s, the industry was ripe for the invention of the microprocessor."

In the narrow sense, the bootstrap load made the 4004 and 8008 possible with their given size, performance, and power consumption. The bootstrap load also illustrates how the microprocessor is not a single invention, but the aggregation of many smaller inventions that made it possible. However, looking at the broader picture, microprocessors would have been only slightly hampered if the bootstrap capacitor didn't exist. There were many alternatives such as four-phase logic, static logic, higher gate voltages, an additional power supply, or using an extra mask for the capacitors. The Texas Instruments TMX 1795 provides a direct comparison, since it was built at the same time as the 8008 with the same architecture, but using metal-gate transistors instead of silicon-gate. The diagram below shows that the TMX 1795 was considerably larger than the 8008, and it had somewhat worse performance, but the point is that microprocessors would have proceeded essentially the same without the bootstrap load. In any case, by 1974, the switch to NMOS transistors and improvements in threshold voltages made bootstrap loads unnecessary. My conclusion is that the bootstrap load was a helpful innovation, but microprocessors would have proceeded along a similar path even without this invention. Once technology permitted a few thousand transistors to be constructed on an integrated circuit, the single-chip CPU was inevitable.

Comparative die sizes of the TMX 1795, 4004 and 8008 microprocessors. Note that the 4004 and 8008 are nearly the same size, while the TMX 1795 is more than twice as large. The top third of the TMX 1795 is instruction decoding and control logic, the middle is the 8-bit ALU, and the bottom is storage (stack and registers). TMX 1795 die photo courtesy of Computer History Museum.

If you're interested in the 8008, my previous article has a detailed discussion of the 8008's architecture and more die photos; I also explain the 8008's ALU. I announce my latest blog posts on Twitter, so follow me at kenshirriff. I also have an RSS feed.

Notes and references

  1. Bootstrap loads in the Intel 4004 are discussed by Insanity 4004 here and here.

  2. In his oral history, Faggin describes Intel's fixation on 16-pin packages. When a memory chip required 18 pins instead of 16, it was "like the sky had dropped from heaven. I never seen so [many] long faces at Intel, over this issue, because it was a religion in Intel; everything had to be 16 pins, in those days. Everything had to be 16 pins... It was a completely silly requirements to have 16 pins." At the time, other manufacturers were using 40- and 48-pin packages, so there was no technical limitation, just a minor cost saving from the smaller package. 

  3. The classic microprocessors such as the 8080, 6502, and Z80 were built with NMOS transistors. The earlier 4004 and 8008 used PMOS transistors, which were easier to manufacture but had poorer performance. If you're familiar with NMOS logic, PMOS logic is a mirror world, where everything is backward. PMOS used negative voltages, which were also significantly higher than the 5 volts used by standard TTL. For compatibility with TTL levels, the 8008 ran with Vcc at +5V and Vdd at -9V, so it could produce TTL-compatible outputs of roughly 0 volts and 5 volts. (See the datasheet for more details.) The 4004 required a 15-volt supply, typically Vdd = -10V and Vss = +5V. Confusingly, the 4004 defined logic "0" as the more positive voltage and logic "1" as the more negative voltage (datasheet). 

  4. The "superbuffer" replaces the load resistor with an active transistor and is used when more current is required, for instance to drive an internal bus or an output pin. The upper transistor is driven by an inverter, so it is on when the lower transistor is off. Instead of the weak current from the load resistor/transistor, this transistor provides a high current. The problem is that the threshold voltage limits the voltage from the upper transistor. With a regular inverter, the inverter output loses VT, so it will provide -4 volts to the upper transistor's gate. Losing another VT there yields an insufficient output voltage of +1 volt instead of the desired -9 volts.

    A superbuffer provides a fast, high-current output in both directions.

    The second case where the threshold voltage drop is a problem is with a pass transistor, used for dynamic logic. The diagram below illustrates a simple pass transistor circuit. When the control signal is low, the transistor is active, passing the input signal through to the output. But when the control signal is high, the transistor stops passing the input. Instead, the previous value is held by the circuit's capacitance (shown in gray) so the output holds its previous value. Thus, pass transistors provide an efficient way of implementing temporary storage. The problem with pass transistors is the threshold voltage. If the control signal on the gate comes from a regular gate, the "on" voltage will be -4 volts due to the threshold voltage loss. The pass transistor causes a second threshold voltage loss, so the lowest it can pull its output is +1 volt, not enough for reliable operation.

    A simple pass-transistor circuit.

    The bootstrap load fixes these problems. By putting a bootstrap load on the inverter in the superbuffer or on the circuit controlling the pass transistor, the drive voltage will be close to -9 volts. Now there is only a single threshold voltage drop, leaving the output at about -4 volts, sufficiently negative for reliable operation. 

  5. This discussion of the bootstrap load is a simplified explanation. The real circuit is affected by stray capacitance, transistor leakage, and other factors, so the output wouldn't be all the way to VDD. One thing I'd like to point out, though, is that you might expect the capacitor's charge to leak out through Q3 as fast as it charged. Although Q3 is treated as a resistor, it also acts as a diode, blocking the capacitor from discharging. (With the capacitor more negative, the roles of Q3's source and drain are reversed and it no longer conducts.) 

  6. The silicon-gate bootstrap capacitor exemplifies the paths of information between companies at the dawn of the microprocessor era. Practical silicon gate technology was created at Fairchild (with some earlier roots). When employees (including Faggin) left Fairchild for Intel, they took this knowledge with them. (In some cases they also took "lots and lots of Fairchild internal confidential documents"; see the Shima oral history.) From Intel, ideas spread to other companies, such as when Faggin left Intel to found Zilog, basing the Zilog Z80 on the Intel 8080.  

  7. Interestingly, in 2007 Intel started using metal gates again in order to scale transistors further (details). In a way, semiconductor technology has gone full circle, back to metal gates, although now unusual metals such as hafnium are used. 

  8. In The Making of the First Microprocessor, Federico Faggin says, "bootstrap load was a very popular circuit design trick used in just about all MOS dynamic circuits of that time. It made possible an output signal swing that was not only equal to the power supply voltage, but was also faster than possible with normal MOS loads for the same power dissipation." Faggin describes how he invented the bootstrap load in the 4004 oral history (p11) and the 8008 oral history (p8). Also see Faggin's The MOS silicon gate technology and the first microprocessors, which describes how the bootstrap load is needed for a two-phase design, and how silicon gate technology didn't support capacitors. Faggin's site also describes the bootstrap load, as does mosgate.

  9. The threshold voltage depends on various properties of the integrated circuit including the gate material and the oxide thickness. I couldn't find a specific value for the threshold voltage in the 8008 processor, but -5 volts seems like the right ballpark (and is a conveniently round number). The book MOSFET in Circuit Design discusses threshold voltages for P-channel devices.  

  10. The bootstrap load illustrates the social process through which people are assigned credit for inventions and the construction of reputation. Although Faggin had a key role in the 4004 and 8008 processors, "when he left to found Zilog he got temporarily written outside of the Intel history." (See Intel disowns Faggin and Interview with Stan Mazor.) Faggin states, "They tried to erase my name from all of my contributions, including the silicon gate technology and the first microprocessor, and attribute them to others." After lobbying efforts by Faggin's wife and the pro-Faggin website intel4004.com, Intel reluctantly gave Faggin more credit. Faggin eventually received various awards including the National Medal of Technology and Innovation in 2010, so in the end he received his (deserved) recognition.

    The point is that credit is not assigned objectively, but is shaped by corporate and personal forces and by who tells the story. (Wikipedia is one modern arena for these conflicts.) One corrective is the book History of semiconductor engineering, which covers many of the key people in the history of integrated circuits, with little regard for the "generally accepted" history. I should make it clear that I am drawing most heavily on Faggin's writings for background on the bootstrap load, so this blog post should not be viewed as an "objective" view of who should get credit for it. It looks like the silicon-gate bootstrap load was invented simultaneously at National Semiconductor; patent 3912948 filed in 1971 by Dilip Bapat describes an identical silicon-gate bootstrap load circuit.
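
As a postscript to note 4, the threshold-voltage arithmetic can be checked with a few lines of Python. This is only a rough sketch under the round numbers used in the notes: Vdd = -9 volts and a threshold voltage of about -5 volts (a ballpark figure from note 9, not a measured value). The function names are my own for illustration, and the model ignores stray capacitance, leakage, and the other real-world effects mentioned in note 5.

```python
# Illustrative round numbers from the notes (assumptions, not measurements):
VDD = -9.0  # 8008 negative supply, in volts
VT = -5.0   # approximate PMOS threshold voltage, in volts

def after_threshold_drop(gate_voltage, vt=VT):
    """Most negative voltage a PMOS transistor can pass for a given gate voltage."""
    return gate_voltage - vt

def output_voltage(bootstrapped):
    """Output of a superbuffer pull-up or pass transistor driven by an inverter."""
    # With a bootstrap load, the driving inverter swings (nearly) all the way to
    # Vdd; with a regular load, it loses one threshold voltage first.
    drive = VDD if bootstrapped else after_threshold_drop(VDD)
    # The driven transistor then loses one more threshold voltage.
    return after_threshold_drop(drive)

regular = output_voltage(bootstrapped=False)      # -9 -> -4 -> +1
bootstrapped = output_voltage(bootstrapped=True)  # -9 -> -4

print(f"regular load:   {regular:+.0f} V")
print(f"bootstrap load: {bootstrapped:+.0f} V")
```

With these assumed values, the regular load leaves the output at +1 volt after two threshold drops, while the bootstrap load leaves a single drop and an output of roughly -4 volts.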