Showing posts with label reverse-engineering. Show all posts
Showing posts with label reverse-engineering. Show all posts

Strange chip: Teardown of a vintage IBM token ring controller

IBM used some unusual techniques in its integrated circuits, and one of the most visible is packaging them in square metal cans. I've been studying these chips recently, since there's not a lot of information about them. I opened up the large metal chip—1.5" on a side—from the token ring network board below. This chip turned out to be stranger and more interesting than I expected, combining analog circuitry, a custom microprocessor, and complex logic. The internal packaging was also unconventional: instead of the bond wires used by most manufacturers to connect the silicon die, IBM used a "flip-chip" technique, soldering the die upside down onto a ceramic substrate. Instead of pads, the chip had solder balls across its surface, giving it an unexpected layout and appearance. In the blog post, I discuss this chip in detail.

The IBM 4/16 ISA token ring board. Click this photo (or any other) for a larger version.

The IBM 4/16 ISA token ring board. Click this photo (or any other) for a larger version.

The token ring network was introduced by IBM in 1985,1 a local-area network technology that competed with Ethernet and other network systems. In a token ring network, the computers are wired in a ring, with each computer receiving packets from the previous computer and transmitting them to the next computer in the loop. To give a computer access to the network, a special three-byte token circulates in the ring. When a computer receives the token, it can transmit a network packet to the next computer in the ring. The packet travels around the ring until it comes back to the original computer. That computer discards the packet and sends out the token in its place, giving another computer a chance to transmit data. In comparison, an Ethernet network lets computers transmit at any time; if two transmit at the same time, the collision is detected and they try again a bit later. A token ring network had the advantage of avoiding collision, making it more deterministic and fair and providing better performance on a congested network.

IBM's use of square metal cans goes back to the early 1960s with IBM's SLT modules (Solid Logic Technology). Because IBM didn't think integrated circuits were mature enough at the time, they used small hybrid modules with a few transistors, diodes, and resistors mounted on a ceramic substrate. These half-inch-square SLT modules were packaged in an aluminum can for protection, giving IBM circuit boards a unique appearance. In the late 1960s, IBM moved to integrated circuits2 but they kept the ½" metal cans instead of the rectangular ceramic or epoxy packages used by other manufacturers. As integrated circuits required more pins, IBM increased the package size, leading to the bulky 1.5" package that I examined.

To examine the integrated circuit, I removed it from the board with a hot air gun. In the photo below, you can see the grid of pins underneath the chip. The chip is labeled with the part number is 50G6144. The "ESD" suffix indicates an electrostatic-sensitive device that can be damaged by static electricity and requires special handling. The next line, IBM 9352PQ, is a code for the manufacturing site. The final line, 194390074M, shows that the chip was manufactured in 1994 during the 39th week of the year.

The integrated circuit is packaged in a square aluminum can, 1.5" on a side.

The integrated circuit is packaged in a square aluminum can, 1.5" on a side.

Cutting off the aluminum lid reveals the silicon die inside. The chip is mounted upside down as a flip chip, soldered directly to the connections on the ceramic substrate. Thus, you can't see the chip's circuitry, just the underside of the silicon die. IBM called this mounting technology controlled collapse chip connection or C4.3 (In comparison, most manufacturers mounted a silicon die right side up and connected it to the pins with tiny bond wires.) Tiny printed-circuit traces connect the module's 175 pins to the die.

The integrated circuit with the metal lid removed, showing the silicon die on the ceramic substrate.

The integrated circuit with the metal lid removed, showing the silicon die on the ceramic substrate.

I removed the die from the substrate with the hot-air gun and then dissolved the solder balls with a mixture of hydrogen peroxide and vinegar. By taking numerous photos with a metallurgical microscope, I created the die photo below. The black circles on the die are the positions of the solder balls, more irregular than you might expect. They are not around the edge of the die (as with bond pads), but overlap the circuitry. The chip is fairly large, about 9×7.9 mm, with features of about 1µm. Note the horizontal rows of circuitry; these are standard cells, which I will discuss below.

Die photo of the chip. Click this (or any other) image for a larger version.

Die photo of the chip. Click this (or any other) image for a larger version.

The pattern of solder balls is more visible in Antoine Bercovici's photo below. There are rows three-deep of solder balls along the four sides, as well as rows through the middle of the chip and more in the corners. Roughly speaking, the solder balls around the edges are for signals, while the solder balls in the middle distribute power and ground. Note the tangled metal wiring on top of the chip that connects the solder balls to the underlying circuitry.4

Die photo showing the solder balls and upper metal clearly. Courtesy of Antoine Bercovici.

Die photo showing the solder balls and upper metal clearly. Courtesy of Antoine Bercovici.

The photo below shows a closeup of the ceramic substrate that holds the die; compare the pattern to the die above.5 The die was soldered to the rectangular array of contacts in the middle, while the large circles around the edge of the photo are the pins of the chip. Note the dense, complex wiring pattern between the pins and the tiny contacts. The wiring traces are extremely thin (about 30µm), with thicker traces from power and ground. The contacts form a complex pattern. most are in a rectangular array, three deep. However, there are also rows of contacts through the center of the chip, connected alternately to power and ground by the thick traces inside the rectangle, and a few scattered contacts. The contact pattern on the substrate was optimized for the layout of this particular chip. Power distribution was a particular concern.

A closeup of the ceramic substrate showing where the die is mounted.

A closeup of the ceramic substrate showing where the die is mounted.

It's interesting to consider the hierarchy of connections between the coarse 0.1" grid of the chip's pins and the tiny 1µm features on the chip. At the top level, the pin spacing is 0.1" in a 14×14 grid. The solder balls have a spacing of 0.01", so the ceramic substrate reduces the spacing by a factor of 10. The solder balls are connected to the wiring on top of the die, spaced at 0.001", increasing the density by another factor of 10. The top wiring is connected to the underlying wiring on the chip, with a spacing of 0.0001", another factor of 10. Finally, the feature size on the die is about 1µm, another factor of 2.

With this type of packaging, you can visualize the die position by looking at the underside of the IC (below). Because the chip is soldered directly to the substrate, there are no pins where the chip is attached. Thus, the spot with no pins indicates the position of the die.

Underside of the package.

Underside of the package.

Inside the chip

The die photo below shows the chip with most of the metal layers dissolved, making the transistor structure underneath visible. The chip has three main components: a 16-bit microprocessor CPU, an analog front end for the network signals, and 24,000 logic gates for the main functionality. The chip also has some buffer RAM at the left, and I/O drivers in the middle and bottom. (IBM originally implemented the token ring interface with six analog and digital chips. To decrease cost, they put all the functionality onto a single chip, resulting in the combination of analog and digital circuitry.)

The die with major components labeled. The metal layer has been removed to show the circuitry underneath.

The die with major components labeled. The metal layer has been removed to show the circuitry underneath.

The block diagram below shows the complex functionality of the chip. Starting in the upper right, the analog front end circuitry communicates with the ring. The analog front end extracts the clock and data from the network signals. The protocol handler implements the low-level token ring protocol: it decodes data, breaks packets into frames and performs error checking. Network data is moved between on-chip buffers and the external RAM by the shared RAM control. Finally, a custom 16-bit microprocessor implements the data link layer protocols and controls the chip.

Block diagram of the chip, from IBM's paper.

Block diagram of the chip, from IBM's paper.

Standard-cell logic

The chip's logic is implemented with a CMOS standard cell library and consists of about 24,000 gates. The idea of standard-cell logic is that each function (such as a NAND gate or latch) has a standard layout. These cells can then be combined by automated design tools to create the desired logic. (This is in contrast to older methodologies, where the designer would lay out each transistor individually, either on paper or using design software.) Standard cells make chip design much easier, since software can do the circuit synthesis, layout, and routing, However, the design isn't as flexible or optimized as a fully-custom circuit.

The standard cell layout is visible on the chip, with the cells arranged in uniform rows, connected by horizontal and vertical wiring. The diagram below magnifies the die to zoom in on five rows of standard-cell logic, and then a single row, to show how small the cells are on the die.

Zooming in on the die shows rows of standard cell logic. Another zoom shows the details of the logic.

Zooming in on the die shows rows of standard cell logic. Another zoom shows the details of the logic.

The standard cell below implements a 3-input NAND gate, and I'll explain how it is constructed.6 There are 6 PMOS transistors on top and 6 NMOS transistors on the bottom. The transistors are formed from a region of doped silicon at the top and another at the bottom. Vertical lines of polysilicon, a special type of silicon, form the transistor gates. Polysilicon is also used for vertical wiring inside the cell. The chip has three layers of metal: the bottom layer is used for horizontal wiring, the middle layer is used for vertical wiring, and the top layer connects to the solder balls. Horizontal metal wiring connects the transistors inside the cell and connects the cell to other cells. The two thick horizontal metal wires provide power and ground for the cell. The second, vertical metal layer provides vertical wiring across and between cells. This layer also implements the power connections between the solder balls and the horizontal power wiring visible here. The round dots are connections between layers (silicon, polysilicon, or metal). The schematic on the right matches the layout of the cell.

Closeup of a cell that implements a NAND gate.

Closeup of a cell that implements a NAND gate.

In the schematic below, I've removed the redundant transistors and rearranged the layout to make the NAND circuit more clear. If all inputs are 1, the NMOS transistors at the bottom turn on, pulling the output low. If any input is 0, a PMOS transistor turns on, pulling the output high. Thus, the circuit implements a NAND gate.

Schematic of the 3-input NAND gate.

Schematic of the 3-input NAND gate.

To summarize, standard-cell logic provides a convenient, automated way of implementing logic. A small number of standardized cells implement the basic logic functions. These cells are arranged in rows and wired together to create the desired logic. (From the teardown perspective, standard-cell logic is somewhat disappointing, since the high-level structure is not visible; it's just a bunch of uniform cells.)

The logic circuitry includes some static RAM buffers to hold network data. These were custom-implemented (as were the I/O drivers) instead of using standard cells. The photo below shows a block of RAM cells.

One of the RAM buffers on the chip.

One of the RAM buffers on the chip.

Inside the CPU

The chip contains a 16-bit CMOS control microprocessor that was custom-designed by IBM7 and contains about 10,000 gates. This processor handles the network protocol, controls transmit and receive operations, and manages the shared memory. It runs at 5.34 megahertz and performs about 3 MIPS (million instructions per second). The microprocessor runs code from an EPROM on the board. IBM calls this "microcode", but it's unclear if this is microcode in the usual sense or just firmware instructions.

The CPU, with main functional blocks labeled. The metal layer has been removed.

The CPU, with main functional blocks labeled. The metal layer has been removed.

The CPU is built with standard-cell logic (except for the RAM and ALU), but curiously the cell layout is entirely different from the rest of the chip, presumably because it had different designers. The photo below compares the CPU's logic (left) with the other logic (right). The CPU fits 7 rows of logic in the same vertical space that holds 4 rows of the regular logic. On the other hand, the logic on the right appears to be much dense horizontally.

Comparison of the CPU's standard-cell logic (left) with the rest of the chip (right), at the same scale.

Comparison of the CPU's standard-cell logic (left) with the rest of the chip (right), at the same scale.

One design feature of the CPU that's visible on the die is its use of multiple PLAs (programmable logic arrays) for instruction decode and control. (Looking at the photo, I count nine small PLAs and a large PLA in the corner.) A PLA provides a structured and dense way of implementing logic (typically AND-OR logic). More importantly, PLAs also provided flexibility and the ability to easily change the design. In the PLA below, 12 signals enter at the lower left. The matrix above converts these to 11 signals that pass to the right. The second matrix generates 8 outputs. The contents of the PLA are visible as the pattern in the metal layer. Since the PLA could be modified by changing the chip's metal layer, bug fixes could even be done after the silicon had been etched.

One of the many PLAs in the CPU.

One of the many PLAs in the CPU.

The CPU contains memory cells for register storage (which they call a 16×16 cache). This RAM design is different from the RAM design in the logic circuitry.

Memory cells in the CPU.

Memory cells in the CPU.

Analog circuitry

The chip contains a block of analog circuitry implemented in CMOS. This circuitry "performs signal conversion and clock recovery functions as well as detecting and compensating for line impairments". This circuitry includes resistors, capacitors, MOS transistors with special properties, and other components.8 The analog block uses a variety of circuits such as op-amps, switched-capacitor amplifiers, voltage references, peak detectors, a charge pump, voltage-controlled-oscillator, and phase-locked loop.

Die photo showing part of the analog circuitry.

Die photo showing part of the analog circuitry.

One challenge in the design was to minimize "jitter" in the clock signal extracted from the network data. Because each node retransmitted the data, jitter would accumulate as a packet traversed the ring, so each node had to be accurate. They used a variety of techniques to keep noise out of the signal such as providing separate power and ground for the analog circuitry, using differential signals in the circuitry, and keeping logic signals away from the analog circuitry.

The analog circuitry made the chip much more complex to manufacture and test.9 The capacitors and special transistors required special process steps during manufacturing. Manufacturing tolerances were also much tighter since process variations could change the electrical characteristics enough to make the analog circuitry stop performing. Some of the analog circuitry was too sensitive to be tested on the wafer and couldn't be tested until the chip was packaged, making failed chips much more costly. Even so, IBM found it worthwhile to put the analog circuitry on the chip.

Shrinking the chip

IBM originally made the token ring chip in 1988. The chip I examined is a smaller version from 1994. The photo below compares the two chips on a 1 mm grid; the older, larger chip is on the left. Note that both chips have the same microprocessor block (upper left corner) and the same analog block (lower left / upper right corner). The height of the standard cell logic rows is much smaller in the newer chip, probably how they shrunk the logic. The solder balls on the left connect to the underlying circuitry, while the solder balls on the right are routed all over the chip by a third layer of metal.

Comparison of the two chips. Photo courtesy of Antoine Bercovici.

Comparison of the two chips. Photo courtesy of Antoine Bercovici.

The analog section from the old chip was copied to the new chip unchanged, but the connections to solder balls are very different, showing the change in wiring techniques. In the old chip (left), the solder balls are on top of metal pads that are connected to the circuitry. The layout is similar to integrated circuits that use wire bonding and bond pads. In the old chip (right), the solder ball grid is not anchored to the underlying chip architecture, but follows its own constraints. A new layer of metal connects the solder balls to the pads. The pads remain in their atavistic positions, despite being unused in the new chip.

Comparison of the analog section of the old chip and the new chip. The color of the chips is different due to lighting.

Comparison of the analog section of the old chip and the new chip. The color of the chips is different due to lighting.

The token ring board

I'll just say a bit about the token ring board that contains this chip. The board is an ISA card from 1994. The IBM chip dominates the board, but there are also numerous other chips, largely 74F-series TTL. There's also a square (and curiously thick) Lattice chip, probably a GAL (Generic Array Logic). A GAL is a programmable logic chip, combining AND/OR logic with flip-flops. A Signetics chip with an IBM label on top is probably a field-programmable logic array (FPLA). Despite all the complexity of the IBM chip, the board requires a lot of programmable logic and simple logic ICs, mostly to interface to the computer's ISA bus. The board has 64 kilobytes of RAM to store network data, two Toshiba TC55329 32K×9 bit static RAM chips. This RAM is accessible both by the network card and by the host PC. The code for the internal microprocessor is contained in an EPROM chip on the board, an AMD 27C1024 chip holding 128 kilobytes as 16-bit words. The EPROM chip has an adhesive label on it with the IBM part number 73G2042, indicating the microcode version.

The token ring board plugs into a PC's ISA slot.

The token ring board plugs into a PC's ISA slot.

The right side of the board holds the analog circuitry to interface with the network. Five pulse transformers provide electrical isolation between the interface board and the potentially-dangerous voltages of the network. Two bypass relays disconnect the card from the ring when not in use, preserving the ring's connectivity. There are also two transistor arrays along with resistors and capacitors to condition the network signals before passing them to the token ring chip. The card connects to the network via an RJ-45 connector that can be used with unshielded twisted-pair (UTP) cable. It also has a DB-9 connector on the back that can be used with shielded twisted-pair (STP).11

In the 1980s, many different local area networking standards were competing including Ethernet, Token Ring, Datapoint's ARCnet, Apple's LocalTalk, Omninet, and Econet. By the early 1990s, Ethernet won due to a combination of factors: much lower cost (about 1/5 the cost of Token Ring), less complexity leading to faster technological improvement (such as 100 Mb/s Ethernet and switched Ethernet), and a wider ecosystem than IBM provided.10 The complexity of the chip reflects the complexity of Token Ring and illustrates that IBM's technological edge in the 1980s was a double-edged sword: although it initially gave Token Ring a large performance advantage, the simpler technology of Ethernet eventually won.12

The IBM logo is in the lower-left corner of the die, along with the mysterious codename "PINEGR SH".

The IBM logo is in the lower-left corner of the die, along with the mysterious codename "PINEGR SH".

Thanks to Antoine Bercovici for die photos and information. Thanks to my Twitter readers for discussion. I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed.

Notes and references

  1. IBM's token ring network was inspired by ring network research from the 1970s, such as the Cambridge Ring

  2. IBM called their integrated circuits MST, Monolithic System Technology. 

  3. The diagram below illustrates the complex construction of a solder ball on the die. Thin layers of aluminum, chromium, copper, and gold are put on the silicon to obtain the necessary properties, followed by a layer of lead-tin solder, which is reflowed to form the balls. The chromium bonds to the oxide layer, while the copper provides solderability and the gold protects the copper from oxidizing.

    Diagram of a solder pad, from this paper.

    Diagram of a solder pad, from this paper.

     

  4. The metal wiring on the top layer of the chip looks like a mess, but there is some structure behind it. The diagram below shows a small section of this wiring, colored to show the structure. The solder balls are shown in yellow. The red and blue traces transmit power and ground from the solder balls across the chip. These traces connect with the vertical strips of metal wiring that transmit power and ground throughout the chip. The other wiring connects the signal solder balls to the I/O drivers, converging in a narrow band in groups of four. Most of the solder balls are positioned with little regard for the underlying circuitry; the top metal layer provides the "glue" between them and the integrated circuit itself. The result is the peculiar metal pattern visible on top of the chip.

    The colored lines show how the top layer of metal wiring connects the solder balls to the chip.

    The colored lines show how the top layer of metal wiring connects the solder balls to the chip.

    In most integrated circuits, the I/O drivers are around the edges of the chip next to the bond pads. However, in this chip, most of the I/O drivers stretch in a line across the middle of the chip (indicated above). More I/O drivers are at the bottom of the chip next to the CPU, probably connected to it directly.

    The photo below shows three I/O drivers, side by side. The metal layers have been mostly removed to reveal the silicon underneath. These drivers are fairly complex. The top half contains large drive transistors to provide relatively high-current outputs, along with smaller control transistors. The lower half contains reddish serpentine resistors made out of polysilicon. These resistors help protect the sensitive gates of the input transistors from static discharges. For output pins, these resistors are disconnected. The middle resistor, however, is connected to the input transistor near the bottom.

    Die photo of three I/O drivers.

    Die photo of three I/O drivers.

     

  5. The die is flipped over when soldered to the substrate. This needs to be kept in mind when comparing the die and the substrate. For instance, the two extra power connections for the CPU are in the lower right of the die but the lower left of the substrate. (Just a note to avoid potential confusion.) 

  6. I'm not sure which transistors are NMOS and which are PMOS in the gate. I'm assuming the PMOS are on top and it's a NAND gate, but it could be the other way around, in which case it's a NOR gate. 

  7. The processor is described as using IBM's "universal controller (UC) architecture" but there's very little information about this architecture. Wikipedia claims this architecture consisted of UC0 (8-bit), UC.5 (16-bit), and U1 (32-bit), with upwards compatibility. An alt.folklore.computers thread and this page provide a bit more information. 

  8. The analog circuitry contains small loops of various sizes that I was unable to identify. They are only connected on one end and have nothing underneath, so they don't seem to be inductors. Twitter readers suggested probe points, disconnected circuitry, or reflective delay lines, but their function remains unclear.

    Three of the loops on the die.

    Three of the loops on the die.

     

  9. The designers were very proud of the testability of the chip, writing a paper about the testing methodology, and a second paper about testing the analog circuitry. The chip includes a boundary scan feature (kind of like JTAG) and built-in self-test features, as well as mechanisms to isolate the analog block and the CPU for separate testing. 

  10. Much of the information about this chip comes from A 16-Mbit/s adapter chip for the IBM token-ring local area network. That article describes an earlier version of the chip, so I can't be sure everything is accurate when applied to this chip. (It appears to me that the chips are the same apart from the smaller size of the newer chip.) One source says the two chips are compatible. The older chip has part number 51F1439 while the chip I examined is 50G6144.

    For information on Token Ring, the book The Triumph of Ethernet: Technological Communities and the Battle for the LAN Standard discusses the competition between network protocols in great detail. You might also like Foone's Twitter thread on Token Ring. Interestingly, one of the original "ENIAC Women", Jean Bartik, wrote a 1984 article on Token Rings—"IBM's Token Ring: Have the Pieces Finally Come Together?"—but unfortunately I haven't been able to locate a copy. 

  11. Token Ring cables could be joined using the "IBM Data Connector", a curious type of connector. The connectors are known as hermaphroditic because two connectors can be joined without worrying about male and female ends. The connectors were nicknamed "Boy George" connectors after the androgynous singer, which seems questionable by current standards. (The nickname may also be motivated by the BOGR text on the connector, which I think indicates the black, orange, green, and red wires.)

    IBM Data Connector. Photo from Redgrittybrick, (CC BY-SA 3.0).

    IBM Data Connector. Photo from Redgrittybrick, (CC BY-SA 3.0).

     

  12. The book The Innovator's Dilemma describes how a low-end but innovating technology can defeat an advanced, entrenched technology. I haven't investigated Token Ring versus Ethernet enough to be sure this model applies, so consider it a hypothesis. 

A one-bit processor explained: reverse-engineering the vintage MC14500B

The Motorola MC14500B1 is a 1-bit processor introduced in 1976. While a 1-bit processor might seem almost useless,2 it was marketed as an Industrial Control Unit for applications that made simple decisions based on Boolean logic, for example, air conditioning, motor control, or traffic lights.

The die photo below shows the processor under a microscope. This silicon appears greenish, while the white lines on top are the metal layer that wires the transistors together. The 16 black spots around the edges are the bond wires that connect the chip to its 16 external pins. The MC14500B has roughly 500 transistors, very few for a microprocessor. In comparison, the popular 8-bit Z-80 microprocessor, also released in 1976, had 8500 transistors. Even the first microprocessor, the 4-bit Intel 4004 (1971), contained 2250 transistors.

The die of the MC14500B with functional blocks labeled. The pins are labeled around the outside. Die photo from
siliconpr0n
(CC BY 4.0).

The die of the MC14500B with functional blocks labeled. The pins are labeled around the outside. Die photo from siliconpr0n (CC BY 4.0).

You might think that a 1-bit processor would only support two instructions, making it impractical. However, like many processors, the MC14500B uses different sizes for data and instructions. Although it used one bit for data, its instructions were 4 bits, giving it a small but usable instruction set of 16 instructions.3

The MC14500B has an unusual architecture, making it more of a building block than a complete microprocessor. In particular, the chip doesn't include any support for memory or addresses; it didn't even have a program counter. The program counter, instruction fetches, jumps, subroutine calls, and I/O needed to be implemented with external circuitry.4 This is a key reason that the chip was so simple. (The other reason, of course, was that it only supported one bit.)

Since the MC14500B was designed for industrial control applications, you'd expect it to be a microcontroller, but it's the opposite of a microcontroller in many ways. A typical microcontroller is a computer-on-a-chip including RAM and ROM, with strong I/O support, providing a single-chip solution. The MC14500B, however, requires multiple external chips to make it usable.

The MC14500B comes in a 16-pin DIP integrated circuit, much smaller than the 40-pin packages commonly used for microprocessors at the time. The "CP" suffix indicates a plastic package. Photo from siliconpr0n
(CC BY 4.0).

The MC14500B comes in a 16-pin DIP integrated circuit, much smaller than the 40-pin packages commonly used for microprocessors at the time. The "CP" suffix indicates a plastic package. Photo from siliconpr0n (CC BY 4.0).

The block diagram below shows the internal structure of the chip. The Data pin in the upper left provides the single-bit I/O line. It feeds into the Logical Unit (LU), which implements 1-bit Boolean logic functions such as AND and OR. The result is stored in the Result Register (RR), the chip's main storage register. The chip has an on-board oscillator OSC that uses an external resistor to control the clock speed. (The chip runs at up to 1 megahertz, faster than I expected.) The Instruction Register stores the 4-bit instruction; the circuitry to decode an instruction occupies the majority of the chip. The JMP, RTN, FLAG O, and FLAG F pins are activated by the corresponding instructions, but the functionality must be implemented externally. Note the lack of a program counter or address pins.

Block diagram of the MC14500B. From the datasheet.

Block diagram of the MC14500B. From the datasheet.

The motivation for making such a stripped-down processor was to provide a low-cost alternative for applications that didn't require a full microprocessor. In 1977, the MC14500B cost $7.58 in quantities of 100 ($32 in current dollars), which seems expensive. However, at the time, an 8080A CPU cost $20 and a Z80 cost $50 ($85 and $215 in current dollars) so there was a significant cost saving to the MC14500B.5 However, the steady fall of processor prices soon made the MC14500B less attractive.

How CMOS logic is implemented

The chip was one of the first processors built from CMOS circuitry,6 a low-power logic family now used in almost all processors. CMOS (complementary MOS) circuitry uses two types of transistors, NMOS and PMOS, working together. The diagram below shows how a PMOS transistor is constructed. The transistor can be considered a switch between the source and drain, controlled by the gate. The source and drain (green) consist of regions of silicon doped with impurities to change its semiconductor properties and called P+ silicon. The gate consists of an aluminum layer, separated from the silicon by a very thin insulating oxide layer.7 (These three layers—Metal, Oxide, Semiconductor—give the MOS transistor its name.) The PMOS transistor turns on when the gate is pulled low.

Structure of a PMOS transistor. An NMOS transistor has the same structure, but with N-type and P-type silicon reversed.

Structure of a PMOS transistor. An NMOS transistor has the same structure, but with N-type and P-type silicon reversed.

An NMOS transistor has the opposite construction from PMOS: the source and drain consist of N+ silicon embedded in P silicon. The operation of an NMOS transistor is also opposite from the PMOS transistor: it turns on when the gate is pulled high. Typically PMOS transistors pull the drain (output) high, while NMOS transistors pull the drain low. In CMOS, the transistors act in complementary fashion, pulling the output high or low as needed.

Because the NMOS transistor is built in P silicon, but the silicon die itself is N silicon, the NMOS transistors are surrounded by a tub or well of P silicon. The cross-section diagram below shows how the NMOS transistor on the right is embedded in the well of P-type silicon. The NMOS and PMOS transistors both require a bias voltage connection to the underlying silicon substrate to block signals from escaping from the transistors.8 These bias connections can be seen scattered across the chip.

Cross-section of CMOS transistors.

Cross-section of CMOS transistors.

The basic CMOS gate is an inverter, shown below. It is constructed from a PMOS transistor and an NMOS transistor acting in opposite (i.e. complementary) fashion. When the input is low, the PMOS transistor (top) turns on, pulling the output high. When the input is high, the NMOS transistor (bottom) turns on, pulling the output low.

A CMOS inverter is constructed from a PMOS transistor (top) and an NMOS transistor (bottom).

A CMOS inverter is constructed from a PMOS transistor (top) and an NMOS transistor (bottom).

The diagram below shows how an inverter, outlined in red, appears on the die. Note that a single inverter takes a visible part of the die. The next image zooms in on the inverter; the metal wiring is visible as the white lines, while the silicon is mostly obscured. The third image shows the silicon layer after removing the metal with acid. Note how the metal gate lines up with silicon underneath. The circular contacts or vias connect the metal layer to the silicon.

How an inverter appears on the die. The middle image shows the metal layer. The metal was removed for the last image to show the underlying silicon.

How an inverter appears on the die. The middle image shows the metal layer. The metal was removed for the last image to show the underlying silicon.

The inverter consists of a PMOS transistor on top and an NMOS transistor below, connected together as described in the schematic earlier. In the diagram below, the silicon regions have been colored to show how they form transistors. Note that the source and drain aren't necessarily discrete, but can merge with neighboring transistors.

The inverter consists of silicon regions doped to form PMOS and NMOS transistors.

The inverter consists of silicon regions doped to form PMOS and NMOS transistors.

Other logic gates are constructed using the same concepts as the inverter, but with additional transistors. In a NAND gate, the PMOS transistors on top are in parallel, so the output will be pulled high if either input is 0. The NMOS transistors on the bottom are in series, so the output will be pulled low if both inputs are 1. Thus, the circuit implements the NAND function. (Note how the PMOS and NMOS transistors act in complementary fashion.) The NOR gate is implemented similarly, swapping the series and parallel transistors. The chip also uses more complex gates, discussed in the footnote.9

A NAND gate and a NOR gate are constructed in CMOS by putting transistors in series and parallel.

A NAND gate and a NOR gate are constructed in CMOS by putting transistors in series and parallel.

Transmission gate

Another key circuit in the processor is the transmission gate. This acts as a switch, either passing a signal through or blocking it. The schematic below shows how a transmission gate is constructed from two transistors, an NMOS transistor and a PMOS transistor. If the enable line is high, both transistors turn on, passing the input signal to the output. If the enable line is low, both transistors turn off, blocking the input signal. The schematic symbol for a transmission gate is shown on the right.

A transmission gate is constructed from two transistors. The transistors and their gates are indicated. The schematic symbol is on the right.

A transmission gate is constructed from two transistors. The transistors and their gates are indicated. The schematic symbol is on the right.

The photo below shows how a transmission gate appears on the die. This photo shows the metal layer, so the underlying silicon is difficult to see. The two transistors are outlined. Note that an inverter has the same input to both gates, so one transistor turns on at a time. In the transmission gate, however, the gates have opposite inputs, so the transistors turn on or off together.

A transmission gate as it appears on the die.

A transmission gate as it appears on the die.

Flip-flop

By combining inverters and transmission gates, an important circuit called the flip-flop is constructed. A flip-flop stores one bit, controlled by a clock signal. The flip-flops have a key role in the chip as they keep the processor synchronized to the clock.

A flip-flop is based on a latch built from two inverters, below By connecting two inverters in a loop, the circuit can store either a 0 or a 1. If the input to an inverter is a 1, it outputs a 0; this causes the other inverter to output a 1, feeding back to the first inverter. Thus, the circuit is stable in either the 0 or 1 state.

Two cross-coupled inverters can store a 0 or a 1.

Two cross-coupled inverters can store a 0 or a 1.

The circuit above requires two changes to form a useful flip-flop. First, it requires a way of storing a value in the latch. This is solved in a brute-force way. One of the inverters uses a weak, low-current transistor.10 An input signal can override this signal, forcing the inverters into the desired state. This input is controlled by a transmission gate: when the gate is active, the input signal is stored in the inverter latch. When the transmission gate is inactive, the inverter latch loop holds the value.

A flip-flop is constructed from two latches separated by transmission gates.

A flip-flop is constructed from two latches separated by transmission gates.

The second change is that two inverter latches are used. The first is controlled by the clock, while the second is controlled by the inverted clock. While the clock is high, the input value passes into the first inverter latch. But when the clock goes low, the transmission gates switch state: the first transmission gate blocks any additional changes, while the second transmission gate passes the value from the first inverter latch to the second latch and thus the output. In effect, the flip-flop grabs the input value when the clock switches low, and holds this output until the next time the clock switches low.

The diagram below shows one of the flip-flops in detail (specifically the instruction register bit I3). It consists of four inverters and two transmission gates, as described earlier but arranged top-to-bottom. The left half consists of the well of P-type silicon for the NMOS transistors, while the right half holds the NMOS transistors. As a result, the inverters and transmission gates have one transistor on each side. This forces some of the gates to have their two transistors widely separated, as seen below.

Implementation of a flip-flop, as seen on the die. This photo shows the metal layer.

Implementation of a flip-flop, as seen on the die. This photo shows the metal layer.

The diagram below shows the locations of the chip's flip-flops. Each flip-flop takes up a substantial part of the chip, despite being the simple circuit described above. This should give you an idea of the small amount of circuitry in the chip. On the left, the 4-bit instruction register consists of four flip-flops. The IEN and OEN registers, as well as two flip-flops to control write operations are on the right. At the bottom, six flip-flops hold the values for the Flag O, RTN, Flag F, and JMP pins, as well as buffering the data mux and holding the instruction skip state. The Result Register (RR) in the lower right is a more complex latch; it has circuitry to hold its existing value as well as a reset circuit.

Locations of the flip-flops on the die.

Locations of the flip-flops on the die.

The logic unit

The logic unit (LU) performs 1-bit operations.11 It takes a bit from the data pin, a bit from the RR register, and stores the result in the RR. It implements seven functions: load a bit into the RR, load the complement into the RR, logical AND, logical AND complement, logical OR, logical OR complement, and exclusive NOR. The table below summarizes these operations. Each column shows the results of an operation, for the four possible input combinations.

The logic unit implements seven different operations, using the RR bit and data pin bit as inputs.

The logic unit implements seven different operations, using the RR bit and data pin bit as inputs.

The colored rectangles indicate how the logic unit is implemented internally. A green rectangle indicates the data value is copied to the output. An orange rectangle indicates the complement of the data value is copied to the output. A blue rectangle indicates that the RR register is OR'd into the result.

The diagram below shows the three complex logic gates that implement this table. (These are AND-OR-INVERT and OR-AND-INVERT gates, but I've removed the inverters to simplify the explanation.) The appropriate inputs (i.e. colors) are selected based on the instruction and the value of the RR register. The upper-left gate is active for the combinations of instruction and RR value that use the inverted data value (i.e. orange). The lower-left gate is active for the combinations that use data (i.e. green). The right gate selects inverted data, RR, and data as appropriate, and ORs them together to form the final result, stored in the RR register.

The implementation of the logic unit. Note that the colors are associated with conceptual paths, not separate gates.

The implementation of the logic unit. Note that the colors are associated with conceptual paths, not separate gates.

I've looked at a lot of Arithmetic-Logic Units (ALUs) before, and this implementation is rather unusual. The main factor is that it doesn't perform arithmetic operations, so it's not dealing with sums, differences, carry-in, and carry-out. It's also one bit wide, rather than 8 or 16 bits. Due to these factors, it's implemented with the gates shown above, rather than a more typical combination of adders. Another interesting thing about the implementation is that the logic unit's circuitry is mixed in with the instruction decoding circuitry, rather than physically separating the two, as in most processors. (See the die photo at the top of the article.)

Instruction decoding

Much of the chip is devoted to instruction decoding, converting a 4-bit opcode into an instruction signal. Although many microprocessors, such as the 6502, use a Programmable Logic Array (PLA) for instruction decoding, the MC14500B doesn't use anything structured like that. Instead, it just has a bunch of gates. First, it decodes pairs of instruction bits (bit 0 with bit 1, and bit 2 with bit 3) into their combinations. Then it combines these signals for the full decoding.

For instance, one signal is generated if instruction bits I3 and I2 are high, by NOR of I3' and I2'. Another signal is generated if I1 is high and I0 is low, by NOR of I1' and I0. Combining these two signals with a NAND gate generates a signal that is low for the SKZ (skip on zero) instruction which has the opcode 1110. This signal is fed into the instruction skip circuitry to implement the instruction. The other instructions are decoded by similar combinations of gates.

Control flow

The MC14500B uses several techniques to provide control flow in programs. Its conditional instruction is SKZ, which skips the next instruction if the RR register is zero. The chip implements the skip instruction by setting a flip-flop if the next instruction should be skipped. If so, the next instruction is overridden by the opcode 0000 through some gates on the instruction pins. This opcode corresponds to the NOP O instruction. This instruction normally energizes the O pin, but the skip circuit suppresses this too. The result is that the skip circuit suppresses the next instruction, and then execution continues.

The chip has opcodes for jump (JMP) and return from subroutine (RTN). These instructions don't do much other than energizing the JMP or RTN pins.12 These operations must be implemented by external circuitry if desired.

The chip provides an unusual technique for implementing larger conditional code blocks. Write operations are controlled by the OEN (Output ENable) flip-flop. To suppress a block of code, the OEN flip-flop can be cleared. The code will still be executed, but it won't have any effect since the output is disabled, so it acts like an IF-THEN block. Similarly, the IEN (Input Enable) instruction will disable the input. These instructions provide conditional execution even if the hardware isn't implemented for the jump instruction.

Finally, a recommended implementation is to wire the F pin to the program counter's reset line. Then, the NOP F instruction will cause the program counter to return to the start of the code. This permits a processing loop to be implemented very simply.

Conclusion

The MC14500B processor is simple enough that its circuitry can be reverse-engineered and understood. To summarize the chip's operations, it takes a 4-bit instruction, which is stored in the instruction register (four flip-flops) and then decoded (using a large number of gates). A logic instruction takes a value from the RR register and the data pin. The "Logic Unit" uses three complex gates in a clever arrangement to perform the selected Boolean operation, and the result is stored back in the RR register. The processor has other flip-flops to handle write operations and other instructions. Execution is controlled by the on-chip clock.

The MC14500B is an unusual processor, handling just one bit of data, while off-loading functionality such as the program counter. Although a one-bit processor might seem like a joke at first, it had genuine uses for implementing logic-based industrial controllers. It seems like an evolutionary dead-end, though. Larger 4-bit and 8-bit microcontrollers were very popular, while the MC14500B was a niche product.

Thanks to David of Usagi Electric for driving the MC14500B analysis project and thanks to John McMaster for decapping the chips and creating the MC14500B images (CC BY 4.0). I first heard about the MC14500B from jonsen back in 2013. I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed.

Notes and references

  1. You might wonder why the MC14500B ends with a "B". Motorola used the B suffix to indicate a buffered chip, while UB indicated an unbuffered chip. For instance, the MC14001UB NOR gate took its output directly from the NOR circuit, while the MC14001B added a double-inverter buffering stage to the output. The buffering stage provided better noise immunity, but the unbuffered chips were faster and better for semi-analog circuits such as oscillators. (See Motorola 1978 Databook p4-3 for discussion.) Motorola was often haphazard with using the B suffix on their part numbers. However, Motorola also used MC14500 to refer to a family of assorted CMOS chips, so the "B" was necessary to distinguish between the MC14500 family and the MC14500B processor chip. Thus, Motorola never used MC14500 without the suffix to refer to the processor chip, as far as I can tell.

    Also see A Strong Commitment to Complementary MOS (1972), an interesting article from when CMOS was just starting its growth, and it was unclear how successful it would be. In this article, Motorola described its line of CMOS products, which it called "McMOS". Motorola had the 14000 series of standard parts, and the 14500 series for newer, more complex designs. Thus, the 14500 series included parts such as the MC14501 triple gate, MC14514 latch/decoder, and the MC14518 dual up counters.

    For more information on the MC14500B processor, see the datasheet and the detailed MC14500B handbook

  2. Note that the MC14500B is not a bit-slice processor, but intended for 1-bit applications. The idea of a bit-slice processor is that the processor chip is designed for 2 or 4 bits, for example, but you combine multiple chips to build a 16-bit processor, for instance. Each processor handles a slice of the complete word. Many systems were built in the 1970s with bit-slice processors such as the Am2901. Bit-slice processors were popular when an individual chip couldn't hold the circuitry for a complete processor, so it needed to be partitioned across multiple chips. 

  3. The table below gives the full instruction set for the MC14500B. It uses a 4-bit instruction so it has 16 different instructions.

    The MC14500B instruction set. From the datasheet.

    The MC14500B instruction set. From the datasheet.

     

  4. The MC14500B omitted a lot of circuitry that you expect to find in a processor, requiring it to be implemented separately. While the chip has opcodes for Jump and Subroutine Return, it doesn't do anything other than activate an external pin for those instructions; supporting subroutines required an external chip. The chip has a single data in/out pin; that pin is typically connected to multiple I/O devices with an external multiplexer/demultiplexer. Finally, while the chip itself has a 4-bit instruction set, a system typically added more instruction bits to address I/O devices or memory. A full system could use 8-bit instructions with four bits going to the processor and four bits selecting the I/O port or storage location. Alternatively, the system could use a larger program counter, more instruction bits, or external RAM depending on the application.

    Motorola sold chips that could work with the MC14500B to build a complete system. The MC14512 Input Selector could multiplex eight inputs into the processor, while the MC14599B Output Latch provided eight outputs or eight bits of storage. The program counter could be implemented by an MC14516B Program Counter (a 4-bit up/down counter chip) or two. 

  5. The 8-bit HP Nanoprocessor from 1974 was another low-cost, minimal processor. However, it was more complex than the MC14500B with about 10 times as many transistors. The Nanoprocessor included a program counter, subroutine support, and multiple registers. Like the MC14500B, the Nanoprocessor omitted arithmetic operations. 

  6. Early CMOS microprocessors include the 12-bit Intersil 6100 (1974) and the 8-bit RCA 1802 COSMAC (1974). The 1802 is said to be the first CMOS microprocessor. Mainstream microprocessors didn't switch to CMOS until the mid-1980s. 

  7. The MC14500B used metal-gate transistors, with aluminum forming the transistor gate. These transistors were not as advanced as the silicon-gate transistors that were developed in the late 1960s. Silicon gate technology was much better in several ways. First, silicon-gate transistors were smaller, faster, and more reliable. Second, silicon-gate chips had a layer of polysilicon wiring in addition to the metal wiring; this made chip layouts about twice as dense. In comparison, many of the signals on the MC14500B have long, winding paths in the metal layer due to the difficulties of routing with a single metal layer. The Intel 4004 used silicon gates in 1971, so the MC14500B was far behind technologically to use metal gates in 1976. I assume this was done for cost reasons. 

  8. The bias voltage makes the boundary between a transistor and the substrate act as a reverse-biased diode, so current can't flow across the boundary. Specifically, for a PMOS transistor, the N-silicon substrate is connected to +5 volts. For an NMOS transistor, the P-silicon well is connected to ground. A P-N junction acts as a diode, with current flowing from P to N. But the bias voltages put P at ground and N at +5, blocking any current flow. The result is that the substrate can be considered an insulator, with current restricted to the N+ and P+ doped regions (to simplify a bit). 

  9. More complex gates can be created by combining transistors in series and parallel. The AND-OR-INVERT gate below is an example. Because of the way CMOS works, this gate is constructed as a single gate (rather than three); it's no more difficult to make it than a NAND gate. CMOS gates, however, require inversion on the output, so you can't make an AND or OR gate directly.

    An AND-OR-INVERT gate implemented with CMOS.

    An AND-OR-INVERT gate implemented with CMOS.

     

  10. The current that a transistor can provide is proportional to the ratio between the gate length (the distance between the source and drain) and the gate width. The diagram below shows two transistors, a strong one on top and a weak one below. The strong transistor has a wide gate so it can provide a relatively high current. (Imagine slicing the transistor into parallel transistors, each one providing current.) The weak transistor has a narrow, but long gate, so it provides a much smaller current. (Think of the current needing to travel a longer distance through the current-blocking gate.) Based on the ratio of sizes, the upper transistor will pass about 16 times as much current as the lower transistor.

    The top transistor is a strong transistor, while the bottom transistor is a weak transistor. This photo shows the silicon after dissolving the metal layer.

    The top transistor is a strong transistor, while the bottom transistor is a weak transistor. This photo shows the silicon after dissolving the metal layer.

     

  11. The processor has a logic unit (LU), not an arithmetic/logic unit (ALU), because it doesn't perform any arithmetic operations. The handbook (chapter 14) explains how you can use the chip's logic instructions to do arithmetic if you really need to; it takes 12 operations to do a 1-bit add, repeated N times for an N-bit addition. 

  12. The return from subroutine instruction (RTN) causes the next instruction to be skipped. The motivation is that external circuitry can push the current address on the stack when doing a subroutine call. When popped for a return, this address will still point to the subroutine call instruction. By skipping the instruction after a return, the MC14500B avoids an infinite loop. (The external circuitry could, of course, increment the return address but that would have required more hardware.)  

IBM paperweight teardown: Reverse-engineering 1970s memory chips

I recently received a vintage IBM paperweight from the early 1970s that showcases some memory chips.1 When IBM started using integrated circuits in the late 1960s, they packed the chips in square metal modules called Monolithic Systems Technology (MST). The paperweight illustrates the manufacturing steps for an MST module as a silicon wafer was cut into silicon dies, mounted on a square ceramic substrate, and wrapped in a thumbnail-sized metal package.

The paperweight contains a silicon wafer, four dies, and an MST module in various stages of assembly. The paperweight is somewhat yellowed with age. Click this image, or any other, for a larger photo.

The paperweight contains a silicon wafer, four dies, and an MST module in various stages of assembly. The paperweight is somewhat yellowed with age. Click this image, or any other, for a larger photo.

Because the dies are encased in clear Lucite, it's possible to closely examine their circuitry and understand them better. The photo below is a closeup of the edge of the silicon wafer and the four dies inside the paperweight. The two larger dies are the same as the dies on the wafer. The two smaller dies are the same, but one is visibly damaged.2 For this blog post, I took detailed die photos using a microscope and reverse-engineered the smaller chip. My conclusion is that the larger chips are 1-kilobit static RAM chips, while the smaller ones are memory sense amplifiers.

Closeup of the dies and wafer inside the paperweight.

Closeup of the dies and wafer inside the paperweight.

IBM System/370

These chips were probably used in IBM's popular System/370 line of mainframe computers. In 1964, IBM introduced the extremely successful System/360 family of mainframes. This product line was modernized in 1970 with the announcement of System/370, which was constructed from integrated circuits (unlike the System/360) and moved from magnetic core memory to semiconductor memory. The paperweight illustrates both of these changes: integrated circuits and semiconductor memory.

To understand the scale of a System/370 computer, the rendering below shows a System/370 Model 145. The Model 145 was a "medium-scale" machine in the middle of the System/370 family.3 The Model 145 is notable as IBM's first computer that used semiconductor main memory. The computer is very large by modern standards, filling the blue cabinets below. One cabinet holds the CPU while another holds 256 kilobytes of memory chips. This computer predates the microprocessor, so the CPU is built gate-by-gate from many boards of integrated circuits. The Model 145 weighed over a ton, cost $5 to 10 million (in current dollars), and was roughly as fast as an IBM PC (1981).

Rendering of a System/370 Model 145. The computer is the large blue cabinet along the wall. The white unit at the back is disk storage, while a card reader is in the foreground. Image by Oliver.obi, CC BY-SA 3.0.

Rendering of a System/370 Model 145. The computer is the large blue cabinet along the wall. The white unit at the back is disk storage, while a card reader is in the foreground. Image by Oliver.obi, CC BY-SA 3.0.

The MST modules

In the earlier System/360, IBM didn't use integrated circuits, but instead used hybrid modules called SLT. For the System/370 IBM moved to integrated circuits, which they called "monolithics". While most companies packaged integrated circuits in rectangular plastic or ceramic packages, IBM retained the half-inch-square metal packages of SLT, calling it MST, for Monolithic Systems Technology.4 MST was a big improvement over the earlier hybrid SLT, about ten times more reliable and 4 to 8 times as dense. These MST integrated circuits were very simple by modern standards, with 32 transistors per module implementing about six gates, so thousands of integrated circuits were required to implement the computer.

The MST modules were manufactured in large quantities with automated production techniques. The sequence of components in the paperweight (below) illustrates the steps. On the left, the round silicon wafer is cut into individual dies. On the right, the square ceramic substrate has 16 holes for pins. Next, a printed-circuit pattern is applied to the substrate to connect the integrated circuit to the module's pins.5 In the third step, 16 pins are soldered to the substrate. Next, the silicon die and the ceramic substrate are combined, with the silicon die is mounted upside-down in the center of the ceramic substrate. Note how small the silicon die is, compared to the size of the package. The module is reflow-soldered, with contacts on the silicon die soldered directly to the substrate.6 Finally, the module is encased in metal, producing a half-inch square module. These modules give IBM's integrated circuits a unique appearance, distinct from the plastic or ceramic DIP integrated circuits used by other manufacturers.

The steps to manufacture an MST module.

The steps to manufacture an MST module.

The MST modules were tightly packed on circuit cards, such as the memory card below. The square module in combination with a four-plane printed circuit board provides considerably higher density than the circuit boards of other manufacturers at the time, which typically used DIP integrated circuits and 2-layer PCBs.

An IBM memory card packed with MST modules.

An IBM memory card packed with MST modules.

The memory wafer and chip

The silicon wafer in the paperweight is 2 inches in diameter, a size that was introduced in 1969. Wafer sizes have steadily increased since then and modern chip fabrication is done with much larger 300 mm (12") wafers.7 The wafer contains 177 dies; using a microscope, I created the die photo below of one of them. Curiously, the wafer is only partially manufactured; it looks like only one of the nine mask layers was constructed. Because this photo is taken from the wafer, you can see the test circuitry and alignment patterns in between the dies.

Die photo of one of the memory chips on the wafer. It is only partially manufactured. The part number "DLM1" is visible on it.

Die photo of one of the memory chips on the wafer. It is only partially manufactured. The part number "DLM1" is visible on it.

The paperweight also contains completed individual dies so I created the die photo below. The regular grid of memory cells is visible in the middle of the chip, with support circuitry around the edge. From studying the die and counting the cells, I think this is a 1-kilobit static RAM chip. Note the solder balls around the edge of the die, which allowed the chip to be soldered directly to the ceramic substrate. With 25 solder balls, this chip was probably mounted in an MST package with a 5×5 grid of pins.

Die photo of the memory chip.

Die photo of the memory chip.

Taking microscope photos is difficult when the die is encased in Lucite, so I wasn't able to see the circuitry under high magnification. As a result, I couldn't reverse-engineer this chip in detail.8 I was able to measure the feature size on the die as about 6µm, a process introduced around 1971.

The sense amplifier chip

The smaller die in the paperweight is much simpler with much larger components. I took the die photo below and found it contains 32 NPN transistors along with resistors. This chip is partially analog and also uses a type of logic called ECL. I believe the chip is a differential amplifier, a sense amplifier to read the signals from the memory chip. This would explain why the two chips are packaged together in the paperweight.

Die photo of the bipolar integrated circuit. The left and right sides are approximately mirror-images, with two copies of the same circuit.

Die photo of the bipolar integrated circuit. The left and right sides are approximately mirror-images, with two copies of the same circuit.

In the die photo above, the silicon of the die is gray. Parts of the silicon were doped with arsenic, boron, or phosphorus to create regions with different semiconductor properties. The black lines in the silicon are boundaries between different doping levels. The yellowish regions are metal wiring on top of the silicon, connecting the various components together. The large black circles are the solder balls to connect the die to the MST substrate.

The diagram below is a detail from the chip, showing two types of resistors and a transistor. The upper resistor above is made from a line of higher-resistance N-type silicon, with metal contacts connected to either end. This forms a 65Ω resistor. The lower resistor has six contacts, providing multiple resistance values depending on where the metal lines are attached. It uses P-type silicon for the resistive element, providing hundreds of ohms of resistance. (There's a bit more internal structure to the resistors, but I'll ignore it.)

Two resistors and a transistor as they appear on the die.

Two resistors and a transistor as they appear on the die.

The transistors are bipolar NPN transistors, but their structure is a bit more complex than the typical NPN transistor. Physically, they have two bases and two collectors wired together to reduce current density, so you'll see five metal connections to each transistor. The diagram below shows the cross-section structure of the transistor. The five metal connections on top of the cross-section correspond to the five connections on the transistor above. The collector, base, and emitter are connected to N-P-N layers, forming the NPN transistor.9 The P+ ring provides isolation around the transistor.

This diagram shows the internal structure of the chip's transistors, based on patent 3539876.

This diagram shows the internal structure of the chip's transistors, based on patent 3539876.

By recognizing the components on the die and tracing out the wiring, the circuit can be reverse-engineered. However, if you look at the die closely, you'll see that many components are not connected. The reason is that IBM used a technique called "master slice" to produce a variety of integrated circuits without custom-designing each one.10 The idea was to use a common silicon die with multiple transistors and resistors. By modifying the metal layer (which was relatively inexpensive), the components could be wired into the desired circuits. This is also why the resistors had multiple taps, so they could be wired to obtain different values as needed.

The differential amplifier and ECL logic

Logic circuits can be built in a wide variety of ways. Almost all computers today use a logic family called CMOS (complementary metal-oxide-semiconductor), building gates out of MOS transistors. However, the IBM System/370 used a high-performance11 logic family known as Emitter-coupled logic (ECL), which IBM called Current-Switch Emitter Follower (CSEF).12 ECL was invented at IBM in 1956 for use in IBM's high-performance transistorized computers.

ECL is based on a differential pair, a circuit that amplifies the difference between two inputs. (This circuit is also the basis of op-amps.) The idea behind a differential pair (below) is that a fixed current flows through the circuit. If the left input is a higher voltage than the right, the left transistor will turn on and most current will flow through the left branch (red). Conversely, if the right input is a higher voltage than the left, the right transistor will turn on and most current will flow through the right branch (blue). The differential pair provides amplification because a small difference in the inputs will create a large shift in the current.

A differential pair amplifies the difference between the two inputs.

A differential pair amplifies the difference between the two inputs.

The above circuit is used as an amplifier in the chip, but with a few modifications it also forms an ECL gate. For a gate, the voltage into one branch is fixed at a reference voltage, midway between the "0" level and the "1" level. Thus, if the input is higher than the reference voltage, it will be considered a "1", and lower will be a "0". (MST chips used ground as the reference voltage.4) The ECL circuit below is an inverter, since if the input is high, the current through the left resistor will pull the output low. To improve performance, the bottom resistor has been replaced with a current sink circuit (purple). The current through the current sink is set by an external bias voltage (VCS).

The differential pair can be modified to produce an ECL inverter.

The differential pair can be modified to produce an ECL inverter.

A buffer (green) has been added to the output above. The buffer circuit is called an emitter follower since the output is taken from the transistor's emitter and the output follows the input. This is why IBM used the name Current-Switch Emitter Follower for this logic family.

The sense amplifier chip's circuitry

I reverse-engineered the chip's circuitry and found it contains two copies of the circuit below. This circuit is a differential amplifier, probably used as a sense amplifier to amplify the outputs from the memory chip and convert them to logic signals.13

The chip takes two inputs, a negative input and a positive input, and produces a logic-level output. The circuit is a bit complicated, but I'll try to explain the highlights. The differential amplifiers (discussed earlier) are the core of the chip. The input signals are buffered and then go into the lower amplifier (green box). The outputs from that amplifier go into the upper amplifier. Cascading two amplifier stages in this way makes the chip very sensitive, providing a large degree of amplification.

Reverse-engineered schematic of the sense amplifier chip.

Reverse-engineered schematic of the sense amplifier chip.

The yellow boxes are buffers, using the emitter-follower circuit described earlier. One buffer is used on each input and one on the output. The purple box is an ECL gate. I believe it is used to latch the amplifier's value by feeding the output back in. The current sink transistors are colored blue to distinguish them. They provide a constant current to the differential amplifiers and other circuits.

Conclusion

Well, this is a lot of analysis for a paperweight. But this paperweight provides an interesting window into IBM's technology of 1974.14 In particular, it illustrates IBM's transition to integrated circuits and semiconductor memory for the System/370 mainframes. It also explains IBM's unique construction technique for integrated circuits, packaging them on a ceramic wafer in a square metal can, a technology they called MST. Finally, the paperweight's 1-kilobit memory chip shows the amazing progress that memory technology has made over the past decades, giving us megabit chips and now multi-gigabit chips.

Thanks to @magnetic_tape for sending me the paperweight. Thanks to Mark Smotherman for information on MST. I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed. If you're interested in old IBM technology, see my recent post on an IBM Technology Box, covering 1948 to 1986.

Notes and references

  1. The text "Essones" on the paperweight refers to IBM's semiconductor plant in Corbeil-Essones, a suburb of Paris. IBM opened this plant in 1964, Europe's biggest semiconductor factory at the time. 

  2. Curiously, one of the dies in the paperweight is damaged and has a corner missing. Note that it's not simply broken, since the metal layer and the silicon doping don't go to the edge. Probably this die is from the edge of the wafer so it didn't get fully exposed. With the incomplete wafer and the truncated die, it's clear that they were using junk parts in the paperweights.

    One die in the paperweight is damaged.

    One die in the paperweight is damaged.

     

  3. For a while, IBM used a rational numbering system for the System/370 models, with computer power increasing with the model number. Model numbers ranged from the low-end Model 115 to the high-end Model 195. However, the numbering system fell apart in the late 1970s as systems were assigned seemingly-random numbers such as 3031, 4361, 3090, and 9370. Despite having the biggest number, the 9370 was a low-end machine. See IBM's 360 and Early 370 Systems for a detailed history of the System/370. 

  4. IBM had multiple versions of MST logic for different products; some versions used different voltages. MST-1 uses ground as the upper voltage, -4 volts as the lower voltage, and -1.32 volts as the ECL reference voltage. (Because ECL circuits are more sensitive to fluctuations in the upper voltage, ECL families often assign that level to ground, making the lower voltage negative.) MST-2 shifts the levels so the reference level is ground; the upper voltage is +1.25V and the lower is -3V.

    I couldn't find much information on the other MST variants, but for reference I'll summarize what I did find. MST-2 was used in the S/370 Models 145 and 155, while MST-4 was a high-performance version developed by Texas Instruments and used in the S/360 Model 85. The S/370 Model 168 used MST-1, MST-2, MST-4, and MST-A. The System/3 used MST-10. The IBM 3889 OCR machine, 3350 Disk Storage, and 3704 Communications Controller used MST-1 and MST-E. The IBM 3031 used MST-1, MST-2, MST-4, MST-4E, MST-E, and MST-A. Other versions included MST-195 and MST-255. 

  5. The MST ceramic substrate provides the interface between two circuitry scales: the printed circuit board scale with 0.125-inch pin spacing, and the integrated circuit scale with 0.01-inch solder ball spacing. The pattern on the MST ceramic substrate has some interesting subtleties; each power pin is connected to three solder balls, allowing more current into the IC. The trace for V- crosses the chip, providing two connections on one side and on on the other. The trace for V+ extends into the middle of the IC to provide additional power connections.

    Diagram showing how the chip is mounted on the ceramic substrate. (The chip image has been mirrored to account for it being mounted upside down.)

    Diagram showing how the chip is mounted on the ceramic substrate. (The chip image has been mirrored to account for it being mounted upside down.)

    For some reason, MST uses two different pin-numbering schemes. The 12-pin SLT numbering was extended by spiraling 13-16 into the middle. But the more common MST pin names are A01 through D04.

     

  6. IBM called the chip mounting technique "controlled-collapse chip connections" or C-4. It used a controlled volume of solder to make electrical and mechanical contact with the module. During soldering, the chip was pulled into alignment with the module fingers by surface tension, similar to how a surface-mount device is soldered today. For more details, see Design of Logic Circuit Technology for IBM System/370 Models 145 and 155

  7. Information on wafer sizes is here and on Wikipedia

  8. The photo below is the best resolution I could get of the memory cells. I believe this is six memory cells; I put a box around one. The circuitry in two rows is connected as shown in blue. This is probably two cross-coupled inverters, a standard circuit for a static RAM cell.

    Closeup of six memory cells in the memory chip.

    Closeup of six memory cells in the memory chip.

     

  9. The diagram below provides more details on the construction and dimensions of the transistors.

    The transistors in the MST chips have a single base and collector but has two base and collector connections to reduce current density. Image from Design of Logic Circuit Technology for IBM System/370 Models 145 and 155.

    The transistors in the MST chips have a single base and collector but has two base and collector connections to reduce current density. Image from Design of Logic Circuit Technology for IBM System/370 Models 145 and 155.

     

  10. The master slice approach used a fixed silicon layout with transistors and resistors, but changed the metal interconnections to create different chips, a process called "personalization". The diagram below, from patent 3539876, shows a silicon layout used for IBM's master slice integrated circuits. If you match up the resistors and transistors, this diagram is almost identical to the die in the paperweight. There are a few differences, though. In particular, the die has an extra pin on the left and right, with slight resistor changes to accommodate them. Design of Monolithic Circuit Chips (1966) describes the origins of the master slice approach. Even in 1966, they were using computer-assisted design for integrated circuits.

    The die structure from patent 3539876 is almost identical to the chip.

    The die structure from patent 3539876 is almost identical to the chip.

     

  11. ECL gates obtained much of their speed advantage because the transistors were not completely turned on (i.e. saturated). This allowed the transistors to switch the current path rapidly. Additionally, the difference between a "0" voltage and a "1" voltage was small (about 0.8) volts, so signals could switch between the two voltages quickly. In comparison, TTL gates typically had a difference of about 3.2 volts between a "0" and a "1", requiring more time to switch. (Signals could typically switch at about 1 volt per nanosecond, so a larger voltage swing caused nanoseconds of delay.) On the other hand, the small voltage swings of ECL made the circuits more sensitive to electrical noise. 

  12. For more details on ECL logic and how IBM used it, see Design of Logic Circuit Technology for IBM System/370 Models 145 and 155

  13. I'm not completely sure of the role of this chip. I searched extensively, but couldn't find any documentation on it. IBM's MST modules are described in detail in MST-2 Module Data (1974). Inconveniently, the chip's part number (2551667) doesn't appear in this document (although nearby part numbers such as 2551665 are described). Thus, I had to study the circuit to determine its function. At first I expected it to be a standard logic gate. However, the two amplification stages didn't make sense, or the complementary inputs. Another possibility was that it converted differential signals (such as from the Differential Current Switch logic family) into ECL signals. That would explain the differential inputs, but not the two stages of amplification.

    I think it's most likely that the chip is acting as a sense amplifier for memory chip, amplifying the memory chip's output and turning it into a logic level. The 370 Model 45 hardware manual (page 3-9) describes a sense latch module used with its memory, so external sense amplifiers were used in System/370. The chip pin that I've labeled "latch" may be used to feed back the output to latch it, or it may be used as an enable pin or to reset the latch; without seeing the surrounding circuitry, I'm not sure.

    Intel also produced memory chips that required external sense amplifiers; see the Intel 1103 and Intel 2105. Intel produced sense amplifier chips, the 3208 and 3408 Hex Sense Amplifiers specifically to provide external sense amplifies for memory. One motivation for external sense amplifiers was that memory chips were built with MOSFET transistors, but bipolar transistors produced better amplifiers. Later memory chips, though, included the sense amplifiers on the chip. 

  14. I'm guessing that the module is from 1974. Based on the technology, the paperweight is from the early 1970s. The module is labeled with the code "1 425C404". My theory is that the second digit "4" indicates the year, dating the module to 1974. IBM's modules are usually labeled with three lines of text, but there's no solid information on the meanings. The first line is the part number. The second line is believed to indicate the manufacturing location. (So "IBM 52" would indicate Essones, France. Although a reader tells me that IBM 52 was Poughkeepsie or Fishkill NY, while IBM 29 was Essex.) The third line is believed to be a date/lot code. Studying an extensive collection of cards, the digit after the 1 appears to be the year. For instance, some codes start with "1712" for 1977, "1 949" and "1925" for 1979, "1-005" and "1 031" for 1980, "1-106" for 1981, "1 205" for 1982, "1 444" for 1984, "1 865" for 1988, "1912" for 1989. But other modules have codes starting with "1 E52", "1 F09", and "1 H27" so it's not quite that simple. There also are a few codes like "1 8450" for 1984, suggesting they also used 2-digit year codes. It's possible that different sites used different codes.