Showing posts with label power supply. Show all posts
Showing posts with label power supply. Show all posts

Repairing a 1960s mainframe: Fixing the IBM 1401's core memory and power supply

A few weeks ago, I wanted to use one of the vintage IBM 1401 mainframe computers at the Computer History Museum, but the computer wasn't working.1 This article describes the multi-week repair process to get the computer working again.

The problem started when the machine was powered up at the same time someone shut down the main power, apparently causing some sort of destructive power transient. The computer's core memory completely stopped working, making the computer unusable. To fix this we had to delve into the depths of the computer's core memory circuitry and the power supplies.

The IBM 1401 computer. The card reader/punch is in the foreground. The 12K memory expansion box is partially visible to the right behind the 1401.

The IBM 1401 computer. The card reader/punch is in the foreground. The 12K memory expansion box is partially visible to the right behind the 1401.

Debugging the core memory

The IBM 1401 was a popular business computer of the early 1960s. It had 4000 characters of internal core memory with additional 12000 characters in an external expansion box.2 Core memory was a popular form of storage in this era as it was relatively fast and inexpensive. Each bit is stored in a tiny magnetized ferrite ring called a core. (If you've ever heard of a "core dump", this is what the term originally referred to.) The photo below is a magnified view of the cores, along with the red wires used to select, read and write the cores.4 The cores are wired in an X-Y grid; to access a particular address, one of the X lines is pulsed and one of the Y lines is pulsed, selecting the core where they intersect.3

Detail of the core memory in the IBM 1401. Each toroidal ferrite core stores one bit.

Detail of the core memory in the IBM 1401. Each toroidal ferrite core stores one bit.

In the 1401, there are 4000 cores in each grid, forming a core plane that stores 4000 bits. Planes are then stacked up, one for each bit in the word, to form the complete core module, as shown below.

The 4000 character core memory module from an IBM 1401 computer. Tiny ferrite cores are strung on the red wires.

The 4000 character core memory module from an IBM 1401 computer. Tiny ferrite cores are strung on the red wires.

To diagnose the memory problem, the team started probing the 1401 with an oscilloscope. They checked the signals that select the core module, the memory control signals, the incoming addresses, the clock signals and so forth, but everything looked okay.

The next step was to see if the X and Y select signals were being generated properly. These pulses are generated by two boards called "matrix switches", one for the X pulse and one for the Y pulse.5 Some address lines are decoded and fed into the X matrix switch, while the other address lines are decoded and fed into the Y matrix switch. The matrix switches then create pulses on the appropriate X and Y select lines to access the desired address in the core planes.

The photo below shows the core memory module and its supporting circuitry inside the 1401. The core memory module itself is at the bottom, with the two matrix switch boards mounted on it. Above it, three rows of circuit boards (each the size of a playing card) provide the electronics. The top row consists of inhibit drivers (used for writing memory) and the current source and current driver boards (providing current to the matrix switches). The middle row has 17 boards to decode the memory addresses. At the bottom 19 sense amplifier boards read the data signals from the cores. As you can see, core memory requires a lot of supporting electronics and wiring. Also note the heat sinks on most of these boards due to the high currents required by core memory.

Inside the IBM 1401 computer, showing the key components of the core memory system.

Inside the IBM 1401 computer, showing the key components of the core memory system.

After some oscilloscope measurements, we found that one of the matrix switches wasn't generating pulses, which explained why the memory wasn't working. We started checking the signals going into the matrix switch and found one matrix switch input line showed some ringing, apparently enough to keep the matrix switch from functioning.

Since the CHM has two 1401 computers, we decided to swap cards with the good machine to track down the fault. First we tried swapping the thermal switch board (below). One problem with core memory is that the properties of ferrite cores change with temperature. Some computers avoid this problem by heating the core memory to a constant temperature in air (as in the IBM 1620 computer) or an oil bath (as in the IBM 7090). The 1401 on the other hand uses temperature-controlled switches to adjust the current based on the ambient temperature. We swapped the "AKB" thermal switch board (below) and the associated "AKC" resistor board, with no effect.

The core memory uses a thermal switch board to adjust the current through core memory as temperature changes.  The switches open at 35°C, 29°C and 22°C.  The type of the board (AKB) is stamped into the lower left of the board.

The core memory uses a thermal switch board to adjust the current through core memory as temperature changes. The switches open at 35°C, 29°C and 22°C. The type of the board (AKB) is stamped into the lower left of the board.

Next we tried swapping the "AQW" current source boards that control current through the matrix switches.6 We swapped these board and the 1401's memory started working. Replacing the original boards one at a time, we found the bad board, shown below.

The IBM 1401 has four "AQW" cards that generate currents for the core memory switches. This card had a faulty inductor (the upper green cylinder), preventing core memory from working.

The IBM 1401 has four "AQW" cards that generate currents for the core memory switches. This card had a faulty inductor (the upper green cylinder), preventing core memory from working.

I examined the bad board and tested its components with an multimeter. There were two 1.2mH inductors on the board (the large green cylinders). I measured 3 ohms across one and 3 megaohms across the other, indicating that the second inductor had failed. With an open inductor, the board would only provide half the current. This explained why the matrix switch wasn't generating pulses, and thus why the core memory didn't work.

I gave the bad inductor to Robert Baruch of Project 5474 for analysis. He found that the connection between the lead and the inductor wire was intermittent. He dissolved the inductor's package in acid and took photographs of the winding inside the inductor.7

The faulty inductor from the IBM 1401 showing the failed connection.

The faulty inductor from the IBM 1401 showing the failed connection.

We looked in the spare board cabinet for an AQW board to replace the bad one and found several. However, the replacement boards were different from the original—they had one power transistor instead of two. (Compare the photo below with the photo of the failed card from the computer.)

The replacement AQW card had one transistor instead of two, but was supposedly compatible with the old board.

The replacement AQW card had one transistor instead of two, but was supposedly compatible with the old board.

Despite misgivings from some team members, the bad AQW card was replaced with a one-transistor AQW card and we attempted to power the system back up. Relays clicked and fans spun, but the computer refused to power up. We put the old card back (after replacing the inductor), and the computer still wouldn't start. So now we had a bigger problem. Apparently something had gone wrong with the computer's power supplies so the debugging effort switched focus.

Diagnosing the power supply problem

The power supply system for the IBM 1401 is more complex than you might expect. Curiously, the main power supplies for the system are inside the card reader; a 1250W ferro-resonant transformer in the card reader regulates the line input AC to 130V AC, which is fed to the 1401 computer itself through a thick cable under the floor. Smaller power supplies inside the 1401 then produce the necessary voltages.

Since it was built before switching power supplies became popular, the IBM 1401 uses bulky linear power supplies. The photo below shows (left to right) the +30V, -6V, +6V and -12V supplies.8 In the lower left, under the +30V supply, you can see eight relays for power sequencing. The circuit board to the right of the relays is one of the "sense cards" that checks for proper voltages. Under the +6V supply is a small "+18V differential" supply for the core memory. Foreshadowing: these components will all be important later.9

Power supplies in the IBM 1401.

Power supplies in the IBM 1401.

After measuring voltages on the multiple power supplies, the team concluded that the -6V power supply wasn't working right. This was a bit puzzling because the AQW card (the one we replaced) only uses +12 and +30 volts. Since it doesn't use -6 volts at all, I didn't see how it could mess up the -6 volt supply.

Inside the IBM 1401's -6V power supply.

Inside the IBM 1401's -6V power supply.

The team removed the -6V supply and took it to the lab. In the photo above, you can see the heavy AC transformer and large electrolytic capacitors inside the power supply. Measuring the output transistors, they found one bad transistor and some weak transistors and decided to replace all six transistors. In the photo below, you can see the new transistors, mounted on the power supply's large heat sink. These are germanium power transistors; the whole computer is pre-silicon.

The -6V power supply from the IBM 1401 uses six power transistors on a large heat sink.

The -6V power supply from the IBM 1401 uses six power transistors on a large heat sink.

The -6V power supply tested okay in the lab with the new transistors, so it was installed back in the 1401. We hit the "Power On" button on the console and... it still didn't work. We still weren't getting -6V and the computer wouldn't power up.

In the next repair session, we tried to determine why the computer wasn't powering up. Recall the eight relays mentioned earlier; these relays provide AC power to the power supplies in sequence to ensure that the supplies start up in the right order. If there is a problem with a voltage, the next relay in the sequence won't close and the power-up process will be blocked. We looked at which relays were closing and which weren't, and measured the voltages from the various power supplies. Eventually we determined that about halfway through the power-up process, relay #1 was not closing when it should, stopping the power-up sequence.

Relay #1 was driven by the +30V supply and was activated by a "sense card" that checked the +6V supply. But the +30V and +6V supplies were powering up fine and the sense card was switching on properly. Thus, the problem seemed to be a failure with the relay itself. Just before we pulled out the relay for testing, someone found an updated schematic showing the relay didn't use the regular +30V supply but instead obtained its 30 volts through the "18V differential supply".11 And the schematic for the 18V differential supply had a pencilled-in fuse.10

Could the power problem be as simple as a burnt-out fuse? We opened up the 18V differential supply, and sure enough, there was a fuse and it was burnt out. After replacing the fuse, the system powered up fine and we were back in business.

The 18V differential power supply in the IBM 1401 provides 12 volts to the core memory. The fuse is under the large electrolytic filter capacitors.

The 18V differential power supply in the IBM 1401 provides 12 volts to the core memory. The fuse is under the large electrolytic filter capacitors.

With the computer operational, I could finally run my program. After a few bug fixes, my program used the computers's reader/punch to punch a card with a special hole pattern:

A punch card with "Merry Xmas" and a tree punched into it.

A punch card with "Merry Xmas" and a tree punched into it.

Happy holidays everyone!12

Conclusion

After all this debugging, what was the root cause of the problems? As far as we can tell, the original problem was the inductor failure and it's just a coincidence that the problem occurred after the power loss during system startup. The new AQW card must have caused the fuse to blow, although we don't have a smoking gun.13 The reason the -6V power supply wasn't showing any voltage is because it was sequenced by relay #1, which didn't close because of the fuse. The bad transistors in the -6V power supply problem were apparently a pre-existing and non-critical problem; the good transistors handled enough load to keep the power supply working. The moral from all this is that keeping an old computer running is challenging and takes a talented team.

Thanks to Robert Baruch for the inductor photos. Thanks to Carl Claunch for providing analysis. The Computer History Museum in Mountain View runs demonstrations of the IBM 1401 on Wednesdays and Saturdays so check it out if you're in the area; the demo schedule is here.

Follow me on Twitter or RSS to find out about my latest blog posts.

Notes and references

  1. Although there are two IBM 1401 computers at the CHM, only one of them has the "column binary punch" feature that I needed. "Column binary" lets you punch arbitrary patterns on a punch card (to store binary) rather than being limited to the standard punch card character set of 64 characters. 

  2. Note that the 1401 has 4000 characters of memory and not 4096 because it is a decimal machine. Also, the memory stores 6-bit characters plus a (metadata) word mark and not bytes. 

  3. If you want to know more about the 1401's core memory, I've written in detail about core memory and described a core memory fix

  4. The trick that makes core memory work is that the cores have extremely nonlinear magnetic characteristics. If you pass a current (call it I) through a wire through a core, the core will become magnetized in that direction. But if you pass a smaller current (I/2) through a wire, the core doesn't change magnetization at all. The result is that you can put cores on a grid of X and Y wires. If you put current I/2 through an X wire and current I/2 through a Y wire, the core at their intersection will get enough current to change state, while the rest of the cores will remain unchanged. Thus, individual cores can be selected. 

  5. The matrix switch is another set of cores in a grid, but used to generate pulses rather than store data. The 1401's memory has 50 X lines and 80 Y lines (yielding 4000 addresses), so generating the X and Y pulses with transistors would require 50 + 80 expensive, high-current transistors. The X matrix switch has 5 row inputs and 10 column inputs, and 50 outputs—one from each core. The address is decoded to generate the current pulses for these 15 inputs. Thus, instead of using transistor circuits to decode and drive 50 lines, just 15 lines need to be decoded and driven, and the matrix switch generates the final 50 lines from these. The Y lines are similar, using a second matrix switch to drive the 80 Y lines. 

  6. Each matrix switch has two current inputs (for the row select and the column select), so there are four current source boards and four current driver boards in total. 

  7. Strangely, half the inductor is nicely wound while the winding in the other half is kind of a mess.

    The faulty inductor from the IBM 1401.

    The faulty inductor from the IBM 1401.

  8. The 1401 has more power supplies that aren't visible in the picture. They are behind the power supplies in the photo and slide out from the side for maintenance. 

  9. If you want to see the original schematics and diagrams of the 1401's power supplies, you can find them here. Core memory schematics are here

  10. The pencilled-in fused on the schematic also had a note about an IBM "engineering change". In IBM lingo, an engineering change is a modification to the design to fix a problem. Thus, it appears the the 1401 originally didn't have the fuse, but it was added later. Perhaps we weren't the first installation to have this problem, and the fuse was added to prevent more serious damage. 

  11. The 18V differential supply provides 12 volts. This seemed contradictory, but there's an explanation. The core memory circuitry is referenced to +30 volts. It needs a supply 18 volts lower, which is provided by the 18V differential supply. Thus, the voltage is +12V above ground. Unlike the regular +12V power supply, however, the differential power supply's output will move with any changes to the +30V supply, ensuring the difference is a steady 18 volts. 

  12. The "Merry Xmas" card was inspired by a tweet from @rrragan. (I had also designed a card with a menorah, but unfortunately encountered keypunch problems and couldn't get it completed in time. Maybe next year.) Punch cards normally encode characters by punching up to three holes per column. Since this decorative card required many holes per column, I needed to use the 1401's column binary feature, which allows arbitrary binary data to be punched. I ended up punching the card upside down to simplify the program:

    Front of my "Merry Xmas" punch card.

    Front of my "Merry Xmas" punch card.

  13. After carefully examining the AQW boards, we determined that one- and two-transistor cards should be compatible. The two-transistor board had the two transistors in parallel, probably using earlier transistors that couldn't handle as much current. It's possible that the filter capacitor between +30V and ground was shorted in the replacement AQW board, blowing the fuse. 

Inside the vintage Xerox Alto's display, a tiny lightbulb keeps it working

In this Alto restoration episode, we repaired a second CRT display, exercising our TV repair skills and discovering a tiny mysterious lightbulb that caused the display to fail in a strange way. For those just tuning in, the Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, high-resolution bitmapped displays, WYSIWYG editors, Ethernet and laser printers to the world, among other things.

The YCombinator Xerox Alto, running a Mandelbrot set program I wrote in BCPL.

The YCombinator Xerox Alto, running a Mandelbrot set program I wrote in BCPL.

I've been restoring an Alto from YCombinator, along with Marc Verdiell, Carl Claunch and Luca Severini. Since we have the YCombinator Alto working (above), we've been trying trying to get a second Xerox Alto system running; this one is from DigiBarn. My full set of Alto posts is here and Marc's videos are here.

Inside the Alto's display

When we tried to connect the DigiBarn display to the Alto, we ran into a problem—it had an incompatible connector. Looking inside the display (below), we were surprised to find this connector led to a circuit board with a 6502 microprocessor; since the Alto came out in 1973 and the 6502 in 1975, this didn't make sense. After some investigation, I determined that although the display looked like an Alto display, it was actually for a Dorado, a Xerox minicomputer from 1979 that followed the Alto.

Inside the Xerox Alto's display. With the cover removed, the CRT and monitor circuitry are visible. The 7-wire interface board is at bottom-left. The Alto itself is in the cabinet under the display.

Inside the Xerox Alto's display. With the cover removed, the CRT and monitor circuitry are visible. The 7-wire interface board is at bottom-left. The Alto itself is in the cabinet under the display.

The Dorado was much faster than the Alto because the Dorado used ECL chips, the same technology used in the Cray-1 supercomputer. Unfortunately, since ECL chips used a lot of power and needed powerful cooling fans, the Dorado was too hot and noisy to use in an office. Putting a soundproof enclosure around the Dorado didn't work, so Xerox ended up putting Dorados in machine rooms. The user's keyboard, mouse and display was connected to the remote Dorado with a special cable using the "7-wire protocol" that Xerox invented for this purpose. The 6502 board was the interface for this protocol. 5

To connect the Alto to this display, I built an adapter cable that bypassed the 7-wire board. We hooked up the monitor, powered it on, and were greeted with a empty black screen. Given the age of the monitor, we weren't surprised that it didn't work. However, when we powered off the monitor, we saw a perfect image for a fraction of a second before the image collapsed to a dot and disappeared. This was unexpected—the monitor didn't work at all when powered on, but worked fine (but very briefly) when turned off! What could be going on?

How a CRT monitor works

Since many readers may not be familiar with how CRTs work, I'll take a brief detour and explain how CRTs work. The cathode ray tube (CRT) ruled television displays from the 1930s until LCD displays took over about 10 years ago.2 In a CRT, an electron beam is shot at a phosphor-coated screen, causing a spot on the screen to light up. The beam scans across the screen, left to right and top to bottom in a raster pattern. The beam is turned on and off, generating a pattern of dots on the screen to form the image. The horizontal scan will turn out to be very important to this repair.

The Xerox Alto's display uses an 875-line raster scan. (For simplicity, I'm ignoring interlacing of the raster scan.)

The Xerox Alto's display uses an 875-line raster scan. (For simplicity, I'm ignoring interlacing of the raster scan.)

The cathode ray tube is a large vacuum tube containing multiple components, as shown below. A small electrical heater, similar to a light bulb filament, heats the cathode to about 1000°C. The cathode is an electrode that emits electrons when hot. A control grid surrounds the cathode; putting a negative voltage on the grid repels the negatively-charged electrons, reducing the beam strength and thus the brightness. The next grid has about 800 volts on it, attracting and accelerating the electrons. The focus grid, at about 600 volts, squeezes the electron beam to form a sharp spot. The anode is positively charged to a high voltage (17,000V), accelerating the electrons to hit the screen with high energy. The screen is coated with a phosphor, causing it to glow where hit by the electron beam. Finally, two electromagnets are arranged on the neck of the tube to deflect the beam horizontally and vertically in the raster scan pattern shown earlier; these are the deflection coils.

Diagram of a Cathode Ray Tube (CRT). Based on drawings by Interiot and Theresa Knott (CC BY-SA 3.0)

Diagram of a Cathode Ray Tube (CRT). Based on drawings by Interiot and Theresa Knott (CC BY-SA 3.0)

The photo below shows the Cathode Ray Tube (CRT) inside the Alto's monitor, with the screen at the right. In the center, the red deflection coils are mounted on the neck of the tube. The thick red wire provides 17,000 volts to the anode. This high voltage is generated by the flyback transformer, the UFO-like gray disk in the lower left.

Inside the Xerox Alto's display, the CRT picture tube is visible. The thick red wire provides 17,000 volts from the flyback transformer to the tube's anode. The other red wire is connected to the 500MΩ 6W bleeder resistor, the large white cylinder at the left. Note the dirt and debris on the flyback transformer from the system's storage in a barn.

Inside the Xerox Alto's display, the CRT picture tube is visible. The thick red wire provides 17,000 volts from the flyback transformer to the tube's anode. The other red wire is connected to the 500MΩ 6W bleeder resistor, the large white cylinder at the left. Note the dirt and debris on the flyback transformer from the system's storage in a barn.

Back to the broken display

Why would the display work for a moment, just as it is powered off? It must have something to do with voltage levels dropping as the power supplies shut down—something that wasn't working at full voltage, but worked at a lower voltage. One theory was that one of the CRT grids might have the wrong voltage. Since electrons are negative, they are attracted to positive voltages (such as the 17,000 volt anode) and repelled by negative voltages. If a grid was too negative, the electron beam could be blocked. Perhaps as the power supplies shut down, the negative grid problem briefly resolved itself.3

We opened up the display and measured some voltages, taking extreme care around the high voltages. Verifying the 17,000 V supply was easy; with a voltage this high, waving an oscilloscope probe a few inches away is sufficient to pick up a signal (below). The main 55V supply was also good. But when we checked the grid voltages, we didn't get anything.

The service manual shows the waveform you can pick up two inches away from the flyback transformer.

The service manual shows the waveform you can pick up two inches away from the flyback transformer.

The grid voltages and 17KV supply are generated by the flyback transformer. Since we saw the 17KV signal, we knew the horizontal deflection circuit and the flyback transformer were working. Perhaps a capacitor had failed, but we didn't find any bad ones. On the schematic we noticed a tiny lightbulb in the high-voltage circuit, an unexpected circuit element. We measured the bulb's resistance on the board (below) and found it was open. We figured the bulb must have burned out, but after removing it we discovered that instead one of the bulb's leads had broken off right at the glass case.

The bulb is visible in front of the right side of the transformer.

The bulb is visible in front of the right side of the transformer.

The service manual for the monitor called the bulb a "No. 1764." I was afraid that this was an internal part number and we wouldn't be able to determine the correct replacement bulb. However, Google revealed that this was a 28V 0.04A miniature bulb, sold by many vendors. Unfortunately we couldn't find any local stores that sold this bulb and we wanted to test out a fix immediately. So Marc performed some precision microscope soldering to reattach the broken wire. Since the wire had broken off right at the glass, reattaching it was very difficult but he succeeded. We re-installed the bulb and the display worked fine!

Why is there a bulb inside the power supply? I assume that it is used as a current limiter. Bulbs have very low resistance when cold, but increase resistance as they warm up. It seems crazy to subject a 28 V bulb to pulses of 600 volts, but since the pulses are only a few microseconds, the bulb survives them just fine.

The tiny bulb inside the display's power supply.

The tiny bulb inside the display's power supply.

Details on the power supply

The high-voltage power supply is described in the monitor service manual, but I'll give a brief summary here.4

The primary purpose of the horizontal sync circuit is to create a sawtooth current through the horizontal deflection coil to scan left-to-right across each row of the screen. A common trick in TVs is to use the high-frequency (26 kHz) horizontal sync to generate high voltages. To do this, the horizontal sync circuit sends high-current pulses (2-3 amps) through the flyback transformer. This step-up transformer produces the 17 kilovolts required by the CRT's anode. A second transformer winding produces -100 volts, while the third winding is used to generate 600V and 1000V. (Interestingly, cell phone chargers also use flyback transformers, but obviously at much lower voltages.)

The photo below shows the flyback transformer (left). The thick black wire at the bottom of the photo connects the 17KV from the transformer to the picture tube, while the colorful wires at the top provide the lower voltages.

Flyback transformer inside the monitor. The large white cylinder is a 500 megaohm, 6W resistor. You don't usually see such a high resistance combined with a high wattage, but the resistor bleeds off the high voltage from the CRT.

Flyback transformer inside the monitor. The large white cylinder is a 500 megaohm, 6W resistor. You don't usually see such a high resistance combined with a high wattage, but the resistor bleeds off the high voltage from the CRT.

The schematic below indicates the key components of the power supply. The switching transistor is driven by the horizontal sync input. When it switches on, current (red line) builds up in the flyback transformer, storing energy in the transformer's magnetic field. When the transistor switches off, this stored energy is released into the secondary windings, producing 17KV for the anode and -100V for the brightness grid. In addition, a 600V pulse is created across the primary. The pulse (yellow line) flows through the bulb and a diode, generating 600V for the focus grid. The voltage doubler circuit (circled in pink) generates 1000V for the second (accelerator) grid.

The high-voltage power supply is driven by the horizontal deflection circuit.
The switching transistor puts 55 volts across the flyback transformer. When it switches off, the flyback transformer produces 17 kilovolts for the CRT anode, as well as powering the 600V, 1000V and -100V supplies.

The high-voltage power supply is driven by the horizontal deflection circuit. The switching transistor puts 55 volts across the flyback transformer. When it switches off, the flyback transformer produces 17 kilovolts for the CRT anode, as well as powering the 600V, 1000V and -100V supplies.

Why did the display originally show a picture for a moment as it was powered off? With the bulb not working, the 800V acceleration grid and the focus grid didn't receive any voltage, but the brightness grid was still powered (since -100V comes from a different winding). My theory is that without the attraction from the acceleration grid, electrons couldn't get past the negative brightness grid. But when the brightness grid lost power, the electron beam was no longer blocked and could reach the screen, until everything else shut down moments later.

Conclusion

Cathode ray tubes were the dominant display technology until LCD displays took over about 10 years ago. Now, CRT TV repair is a retro activity, involving circuits such as horizontal deflection, video amplifiers, and high-voltage flyback transformers that were formerly well-known but are now more obscure.

We tracked down the display's problem to a tiny light bulb, an unusual component to find in a critical role in a high-voltage power supply. Surprisingly, despite being exposed to 600 volt pulses, the problem with this 28 volt bulb wasn't that it had burnt out, but that one of its tiny leads had broken. After repairing the bulb, the display worked fine. Unlike our previous display which had a very faint CRT, this one produced a crisp, bright image. Since we got the display working and didn't get any high-voltage shocks, I consider this a successful restoration session.

The repaired display shows a test pattern, generated by the crttest program. The screen is bright and sharp, but the horizontal centering still needed adjustment.

The repaired display shows a test pattern, generated by the crttest program. The screen is bright and sharp, but the horizontal centering still needed adjustment.

Thanks to the Living Computer Museums + Lab for providing the display test board. Thanks to Al Kossow and Bitsavers for the scanned service manual.

Notes and references

  1. For more on the Alto's monitor, see my article from last year about restoring our first display

  2. A computer monitor is essentially a television set, but without the tuner to select the desired channel from the antenna. In addition, televisions have circuitry to extract the horizontal sync, vertical sync and video signals from the combined broadcast signal. These three signals are supplied to the Alto monitor separately, simplifying the circuitry. 

  3. I should admit that Marc and Carl had the right theory about the problem. My theory that the video input voltage might be too high didn't pan out. 

  4. I didn't discuss the 55V supply that powers the monitor. It is a straightforward regulated linear power supply driven from the 120V AC input. I also also didn't explain how the horizontal deflection coil operates. it is driven by the same transistor as the flyback transformer, but uses a fairly complex inductor-capacitor resonance circuit to generate the scan across the screen. (The scan current is a sawtooth; a smooth scan left-to-right followed by a rapid retrace back to the left.) For a thorough discussion of how the display's power supply works, see page 3-4 of the service manual

  5. The 7-wire interface used a 15-wire cable with standard DB-15 (serial port) connectors. It sent seven ECL signals as differential pairs, and used the remaining wire for ground. Calling it "7-wire" seems a bit misleading, since it used 15 wires in total. The board schematic is in the Dorado schematics page 159. The video signal was multiplexed across four of the signals; this reduced the bandwidth requirement by a factor of four. One signal was serial data; this transmitted the keyboard and mouse information. The remaining two signals were (apparently redundant) clocks. The protocol supported daisy-chaining, so multiple peripherals (such as a printer) could be connected.

    This 7-wire Terminal Interface board was used by Xerox PARC to connect a keyboard/display/mouse to a remote Dorado minicomputer

    This 7-wire Terminal Interface board was used by Xerox PARC to connect a keyboard/display/mouse to a remote Dorado minicomputer

    The photo above shows the 7-wire terminal interface board inside the display. The large chips in the upper right are the 6532 "RIOT" I/O chip, the 6502 microprocessor, and a 2716 EPROM holding the code. The remaining chips are a mixture of TTL and ECL. At the bottom are connectors for 7-wire in, 7-wire out, and the keyboard. The connector to the monitor itself is in the upper center. 

Lacking safety features, cheap MacBook chargers create big sparks

You might wonder if it's worth spending $79 for a genuine MacBook charger when you can get a charger on eBay for under $15. You shouldn't get a cheap charger because they are often dangerous and lack safety features. In addition, they produce poor-quality power that isn't good for your laptop and may charge more slowly. I've written before about the safety problems with cheap chargers, but they say a picture is worth a thousand words, so here is why you shouldn't buy a cheap knockoff charger:

A knockoff MacBook charger emits large sparks if short-circuited. Genuine Apple chargers have safety features to protect against this.

A knockoff MacBook charger emits large sparks if short-circuited. Genuine Apple chargers have safety features to protect against this.

If the connector comes in contact with something metal (a paperclip in this instance), it shorts out, creating a big spark. (Don't try this at home.) The genuine Apple charger (below) has safety features that protect against a short circuit. Shorting the connector on a genuine charger has no effect.

A genuine Apple MacBook charger has safety features that protect it from short circuits.

A genuine Apple MacBook charger has safety features that protect it from short circuits.

It's really hard to tell a genuine charger from a knockoff from the outside, since the knockoffs look just like the real thing. If you carefully read the text on this charger, you'll notice that "Apple" is missing. However, many knockoff chargers duplicate the text from a real charger, so often you can't tell if it is genuine or not just by looking. Big sparks, however, are a clear sign.

A cheap MacBook charger from eBay. Unlike most cheap chargers, this one doesn't pretend to be an Apple charger in the text.

A cheap MacBook charger from eBay. Unlike most cheap chargers, this one doesn't claim to be an Apple charger, but just a "Replacement AC Adapter".

Why does a fake charger produce sparks, while a genuine one doesn't? The fake charger constantly outputs 20 volts, so if any metal shorts the connector, it produces a big spark with all its 85 watts of power. On the other hand, the genuine charger doesn't power up until it has been securely connected to the laptop for a full second. Until it is properly connected (details), it outputs a tiny amount of power (0.6 volts at 100µA) that can't produce a spark. To manage this, the genuine charger includes a powerful microcontroller (more powerful than the microprocessor in the original Macintosh by some measures). Since this processor increases the cost of the charger, knockoff chargers omit it, even though this makes the charger more dangerous.

As the photos below show, the cheap charger (left) omits as much as possible. On the other hand, the genuine Apple charger (right) is crammed full of components. Many of these components filter the power to provide higher-quality power to your laptop. The Apple charger also includes power factor correction, making the charger more efficient.

The cheap MacBook charger (left) omits most of the components found in a genuine Apple charger (right). The genuine charger includes more filtering, power factor correction (left), and a powerful microcontroller (board in upper right).

The cheap MacBook charger (left) omits most of the components found in a genuine Apple charger (right). The genuine charger includes more filtering, power factor correction (left), and a powerful microcontroller (board in upper right).

I've written in detail before about how chargers work, but I'll give a quick explanation here. The AC power comes in the red wires at the top and is converted to high-voltage DC (170V or 340V, depending on if you're in the US or Europe). A transistor (black component on left) chops the power into high-frequency pulses. The pulses create a changing magnetic field in the flyback transformer (large blue box), generating a high-current, low-voltage output. The output is converted to DC by diodes (black component, upper right), and filtered by capacitors (cylinders), to produce the 20 volt output (wires at bottom). A control IC (see photo below) controls the system to regulate the voltage. This may seem like an excessively complicated way to generate 20 volts, but switching power supplies like this are very compact, lightweight and efficient compared to simpler power supplies.

Shorting a cheap charger with a paperclip creates impressive sparks.

Shorting a cheap charger with a paperclip creates impressive sparks.

Looking at the underside of the cheap charger board shows it has very few components, while the genuine Apple charger's board is covered with tiny components. The two chargers are worlds apart as far as complexity, and this complexity is what provides more efficiency, more safety, and better quality power in the Apple charger.

The cheap MacBook charger (left) uses very simple circuits compared to the genuine Apple charger (right), which is crammed full of components.

The cheap MacBook charger (left) uses very simple circuits compared to the genuine Apple charger (right), which is crammed full of components.

Conclusion

While buying a cheap charger saves a lot of money, these chargers omit many safety features and can be hazardous to you and your computer. Don't buy a cheap knockoff charger; if you don't want to pay for a genuine Apple charger, at least buy a charger from a name-brand manufacturer.

Maybe you think these safety issues don't matter because you don't poke your charger with a paperclip. But if you have any metal objects on your desk, a random contact could yield a surprisingly large spark.

I've written a bunch of articles before about chargers, so if this article seems familiar, you're probably thinking of an earlier article, such as: Counterfeit MacBook charger teardown, Magsafe charger teardown, iPhone charger teardown or iPad charger teardown.

Follow me on Twitter to find out about my new articles.

Notes

If you're interested in the components inside the cheap charger, I have some details. The PWM control IC is a SiFirst 1560, a basic control IC for a flyback converter. The IC datasheet has the approximate schematic for the charger. The switching transistor is a 2N601 2 amp, 600 volt N-channel MOSFET. The voltage reference is an AZ431, similar to the ubiquitous TL431. The optoisolator is an 817C. The output diode is a MBRF20100C 10 amp Schottky diode pair. The electrolytic capacitors are from HKLCON.

A cheap charger emits large sparks if you short the connector with a paperclip. Safety features in a genuine charger protect against shorts.

A cheap charger emits large sparks if you short the connector with a paperclip. Safety features in a genuine charger protect against shorts.

Restoring Y Combinator's Xerox Alto, day 1: Power supplies and disk interface

A few days ago, I wrote about how I'm helping restore a Xerox Alto for Y Combinator. This new post describes the first day of restoration: how we disassembled the computer and disk drive and fixed a power supply problem, but ran into a showstopper problem with the disk interface.

The Xerox Alto was a revolutionary computer from 1973, designed by computer pioneer Chuck Thacker at Xerox PARC to investigate ideas for personal computing. The Alto was the first computer built around a mouse and GUI, as well as introducing Ethernet and laser printers to the world. The Alto famously inspired Steve Jobs, who used many of its ideas in the Lisa and Macintosh computer.

Alan Kay, whose vision for a personal computer guided the Alto, recently gave an Alto computer to Y Combinator. Getting this system running again is a big effort but fortunately I'm working with a strong team, largely from the IBM 1401 restoration team. Marc Verdiell, Luca Severini, Ron, Carl Claunch, and I started on restoration a few days ago, as shown in Marc's video below.

Disassembling the Alto

We started by disassembling the computer. The Xerox Alto has a metal cabinet about the size of a dorm mini-fridge, with a Diablo hard disk drive on top, and a chassis with power supplies and the circuit boards below. With some tugging, the chassis slides out of the cabinet on rails as you can see in the photo below. At the front are the four cooling fans, normally protected by a decorative panel. Note the unusual portrait layout of the display.

The Xerox Alto II XM 'personal computer'. The card cage below the disk drive has been partially removed. Four cooling fans are visible at the front of it.

The Xerox Alto II XM 'personal computer'. The card cage below the disk drive has been partially removed. Four cooling fans are visible at the front of it.

With the chassis fully removed, you can see the four switching power supplies on the left, the blue metal boxes. The computer's circuit boards are on the right, not visible in this picture. The wiring for the backplane is visible at right front, with pins connected by wire-wrapped wire connections. This wiring connects the circuit boards together.[1]

The Alto's chassis has been removed. On the left are the four switching power supplies (blue boxes). On the right, the connections for the wire-wrapped backplane are visible. The circuit boards plug into this backplane.

The Alto's chassis has been removed. On the left are the four switching power supplies (blue boxes). On the right, the connections for the wire-wrapped backplane are visible. The circuit boards plug into this backplane.

The power supplies

Our first goal was to make sure the power supplies worked after decades of sitting idle. The Alto uses high-efficiency switching power supplies.[2] To explain the power supplies in brief, input power is chopped up thousands of times a second to produce a regulated voltage. Unlike modern computer power supplies, there's a second switching stage (the inverter), which drops the voltage to the desired 15 volts. This was more complexity than I expected, but fortunately the detailed power supply manual was available online, thanks to Al Kossow's bitsavers.[3] We tested each power supply with a resistor as a dummy load and checked that the output voltage was correct. We also used an oscilloscope to make sure the output was stable. All the power supplies worked fine, except for the +15V supply (top center), which had trouble getting up to 15 volts and staying there.

We disassembled the faulty power supply to track down the problem. The photo of the power supply below shows how densely components are crammed into the power supply. Two of the circuit boards have been removed and are at the back. Note the three large filter capacitors at the front.

Switching power supply from the Xerox Alto computer. Two of the control boards have been removed and are visible at back.

Switching power supply from the Xerox Alto computer. Two of the control boards have been removed and are visible at back.

We noted signs of overheating on the AC connector, as well as a somewhat sketchy looking repair (a trace replaced by a wire) and some signs of corrosion. Apparently the power supply had problems in the past and had been serviced. We cleaned up the corrosion and it appeared to be superficial.

The power supply disassembled easily for repair, as you can see below. The main board is at the right. The tower of three inductors on the main board is an unusual way of mounting inductors. Three circuit boards (top) plug into the main board. Because the power supply uses discrete components instead of a modern SMPS control IC, it needs a lot of control circuitry. The switching transistors (lower center) are mounted onto metal heat sinks for cooling.

The Alto's switching power supply, disassembled. The main board is in the lower right. The three circuit boards are at top, below the large input capacitors.

The Alto's switching power supply, disassembled. The main board is in the lower right. The three circuit boards are at top, below the large input capacitors.

The large capacitors were attached with screws, making it easy to remove them for testing. A capacitance meter showed that the three large capacitors had failed, explaining why the power supply had trouble outputting the desired voltage. Ron went off and found replacement capacitors, although they weren't an exact match for the originals. With the new capacitors mounted in place, the power supply worked properly.

Inside the Diablo disk drive

We also looked at the Diablo disk drive, which provides 2.5MB of storage for the Xerox Alto. The first step was removing the disk pack. In normal operation, the front of the drive is locked shut to keep the disk from being removed during use. To remove the disk without powering up the drive, we had to open the drive and manually trip the latch that locks it shut (see Diablo drive manual).

This picture shows the disk pack being reinserted into the drive. Unlike modern hard disk drives, the Alto's disk can be removed from the drive. Users typically used different disks for different tasks — a programming disk, a word processing disk, and so forth. The disk pack is a fairly large white package, resembling a cross between an overgrown Frisbee and a poorly-detailed Star Wars spaceship. The drive's multiple circuit boards are also visible in the photo.[4]

Inserting a 2.5 MB hard disk pack into the Diablo drive used by the Xerox Alto computer.

Inserting a 2.5 MB hard disk pack into the Diablo drive used by the Xerox Alto computer.

As the disk pack enters the drive, it opens up to provide access to the disk surface. The photo below shows the exposed surface of the disk, brownish from the magnetizable iron oxide layer over the aluminum platter. The read/write head is visible above the disk's surface, with another head below the disk. The disk stores data in 203 concentric pairs of tracks, with the heads moving in and out together to access each pair of tracks.

Closeup of the hard disk inside the Diablo drive. The read/write head (metal/yellow) is visible above the disk surface (brown).

Closeup of the hard disk inside the Diablo drive. The read/write head (metal/yellow) is visible above the disk surface (brown).

Although the heads are widely separated during disk pack insertion, they move very close to the disk surface during operation, floating about one thousandth of a millimeter above the surface. The diagram below from the manual helps visualize this minute distance, and illustrates the danger of particles on the disk's surface.

The Diablo disk and why contaminants are bad, from the Alto disk manual.

The Diablo disk and why contaminants are bad, from the Alto disk manual.

The disk interface cliffhanger

The final activity of the day was making sure all the Alto's circuit boards were in the right slots and the cables were all hooked up properly.[5] Everything went smoothly until I tried to hook up the Diablo disk drive to the disk interface card: the disk drive cable didn't fit on the card's connector!

The cable to the Alto disk didn't fit onto the disk interface card!

The cable to the Alto disk didn't fit onto the disk interface card!

After trying various combinations of cables and edge connectors, we discovered that the rainbow-colored ribbon cable you can see in the lower right above did fit the disk interface card. But instead of going to the Diablo disk drive, this cable went to a connector on the back of the Alto labeled "Tricon". Tricon is the controller for the Trident Disk, a high-capacity disk drive that could be used with the Alto, providing 80 MB instead of the just 2.5 MB that the standard Diablo drive provides. Looking at the disk interface card more closely, we saw it was labeled "Alto II Trident Disk Interface" (upper left corner of the photo below), confirming that it was for the Trident.

Trident Disk Interface card for Xerox Alto computer. (See label in upper left.)

Trident Disk Interface card for Xerox Alto computer. (See label in upper left.)

It was a shock to discover the disk interface card was for the Trident drive, since our Alto has the standard Diablo drive, which is completely incompatible with the Trident.[6] We checked all the boards and verified that the system was missing the Diablo interface board. This was a showstopper problem; with the wrong board, the disk drive would be unusable and we wouldn't be able to boot up the system. What could we do? Network boot the Alto? Build a disk simulator? Find a Trident drive on eBay? (We actually found a Trident disk platter on eBay for $129, but no drive.)

Tune in next episode to find out what we did about the disk interface problem. (Spoiler: we found a solution thanks to Al Kossow.)

Notes and references

[1] The physical layout of the power supplies is specified on page 11 of the Alto documentation introduction. On the top are three Raytheon/Sorensen power supplies, +12V (15A), +15V (12A), and -15V (12A). At the bottom is a large LH Research Mighty Mite power supply providing +5V (60A) and -5V (12A).

Why the variety of voltages? Most of the circuitry in the Alto uses 5V, which is standard for TTL chips. The MOS memory chips use +5V, -5V and +12V. The Ethernet card uses +15V and -15V, with +15V powering the transceiver. The disk drive uses +/- 15V.

[2] Steve Jobs claimed that the Apple II's use of a switching power supply was a revolutionary idea ripped off by other computer manufacturers. However, the Alto is just one of many computers that used switching power supplies before Apple (details).

[3] For full details on the power supply operation, see the block diagram. First, the 115V AC line input is converted to 300V DC by a rectifier and voltage doubler. (The voltage doubler is a clever way of supporting both 115V and 230V inputs; using the doubler with 115V. This is why older PCs have a switch on the power supply to select 115V or 230V. Modern power supplies handle a wide input range, and don't require a switch.) Next, the power supply has a chopper, a PWM transistor circuit that chops up the 300V DC, producing a regulated 120V-200V DC, depending on the output load. This goes to the inverter, which drives a step-down/isolation transformer that produces the desired 15V output. A regulation circuit sends feedback to the chopper based on the output voltage. Meanwhile, an entirely separate switching power supply circuit generates voltages (including +150V) used by the power supply internally.

Modern power supplies use a single switching stage in place of the separate chopper and inverter. I believe the two stages were used to reduce the load on the bipolar switching transistors, which don't have the performance of modern MOSFET switching transistors. (As Ron pointed out, modern power supplies often have a PFC (power factor correction) stage for improved efficiency. Thus, the two-stage design has returned, although the stages are entirely different now.)

Modern power supplies use a power supply control IC. The Alto's power supply instead has control circuits built from simple components: transistors, op amps, 555 timers. This is one reason the power supply requires three circuit boards.

[4] The following table from the Alto disk manual gives the stats for the drive.

Statistics on the Diablo 31 disk used with the Xerox Alto computer.

Statistics on the Diablo 31 disk used with the Xerox Alto computer.

[5] The Alto backplane has 21 slots, not all of which are used in our system. The list of which board goes into which slot is on page 8 of the Alto documentation.

[6] I suspect that the Y Combinator Alto originally had both a Trident drive and a Diablo drive (as well as four Orbit boards to drive a laser printer), and when it was taken out of service, the Trident drive, the Diablo interface board, and the Orbit boards went off somewhere else. This left the Alto with a drive that didn't match the interface card.

For reference, schematics and documentation on the Trident interface board are here. Despite all the chips on the disk interface board, it doesn't do very much, since each TTL chip is fairly simple. The interface board has some counters, one word data buffers, parallel/serial conversion, and a bit of control logic. The Alto was designed to offload many hardware tasks to microcode, so the hard work of disk I/O is performed in microcode (software).

Fixing the core memory in a vintage IBM 1401 mainframe

I recently had the chance to help fix one of the vintage IBM 1401 computer systems at the Computer History Museum when its core memory started acting up. As you might imagine, keeping old mainframes running is a difficult task. Most of the IBM 1401 restoration and repairs are done by a team of retired IBM engineers. But after I studied the 1401's core memory system in detail, they asked if I wanted to take a look at a puzzling memory problem: some addresses ending in 2, 4 or 6 had started failing.

An IBM 1401 mainframe computer at the Computer History Museum. Behind it is the IBM 1406 Storage Unit, providing an additional 12,000 characters of storage. IBM 729 tape drives are at the right and an IBM 1402 Card Read Punch is at the far left.

An IBM 1401 mainframe computer at the Computer History Museum. Behind it to the left is the IBM 1406 Storage Unit, providing an additional 12,000 characters of storage. IBM 729 tape drives are at the right and an IBM 1402 Card Read Punch is at the far left.

The IBM 1401 was low-end business computer that became the most popular computer of the early 1960s due to its low cost: $2500 a month, Like most computers of its era it uses ferrite core memory, which stores bits on tiny magnetized rings. The photo below shows a closeup of the ferrite cores, strung on red wires.

Closeup of the core memory in the IBM 1401 mainframe, showing the tiny ferrite cores.

Closeup of the core memory in the IBM 1401 mainframe, showing the tiny ferrite cores.

The 1401 had only 4,000 characters of storage internally, but could hold 16,000 characters with the addition of the IBM 1406 Storage Unit. This core memory expansion unit was about the size of a dishwasher and was connected to the 1401 computer by two thick cables.[1] This 12,000 character expansion box could be leased for $1575 a month or purchased for $55,100. (In comparison, a new house in San Francisco was about $27,000 at the time.) The failing memory locations were all in the same 4K block in the IBM 1406, which helped narrow down the problem.

A view inside the IBM 1406 Storage Unit, which provides 12,000 characters of storage for the IBM 1401 mainframe. At the left is the 8,000 character core module below the cards that control it.

A view inside the IBM 1406 Storage Unit, which provides 12,000 characters of storage for the IBM 1401 mainframe. At the left is the 8,000 character core module below the cards that control it.
The 1406 contains two separate core memory modules: one with 8,000 characters and one with 4,000 characters. In the picture above, the 8K core module is visible on the left, while the 4K core module is out of sight at the back right. Associated with each core module is circuitry to decode addresses, drive the core module, and amplify signals from the module; these circuits are in three rows of cards above each module. The 1406 also provided an additional machine opcode (Modify Address) for handling extended addresses. Surprisingly, the logic for this new opcode is implemented in the external 1406 box (the cards on the right), not in the 1401 computer itself. The 1406 box also contains hardware to dump the entire contents of memory to the line printer, performing a core dump.

The circuits in the 1406 (and the 1401) are made up of Standardized Module System (SMS) cards. A typical card has a few transistors and implements a logic gate or two. Unlike modern transistors, these transistors are made from germanium, not silicon. The photo below shows rows of SMS cards inside the 1406. Note the metal heat sinks on the high-current transistors driving the core module.

A closeup of SMS cards inside an IBM 1406 Storage Unit. The top cards have heat sinks on high-current driver transistors.

A closeup of SMS cards inside an IBM 1406 Storage Unit. The top cards have heat sinks on high-current driver transistors.
The core memory is made from planes of 4,000 cores, as seen below. Each plane is built from a grid of 50 by 80 wires, with cores where the wires cross. By simultaneously energizing one of the 50 horizontal (X) wires and one of the 80 vertical (Y) wires, the core at the intersection of the two wires is selected. Each plane holds one bit of each character, so 8 planes are stacked to hold a full character.

Core memory in the IBM 1401 mainframe. Each layer (plane) has 4,000 tiny cores in an 80x50 grid. Multiple planes are stacked to form the memory.

Core memory in the IBM 1401 mainframe. Each layer (plane) has 4,000 tiny cores in an 80x50 grid. Multiple planes are stacked to form the memory.
The photo below shows the 8K memory module inside the 1406, built from a stack of 16 core planes. (Since a stack of 8 planes makes 4K, 16 planes make 8K.) Mounted on the right of the core module are the "matrix switches", which drive the X and Y select lines; my previous core memory article explains them.

The 8,000 character core memory in the IBM 1406 Storage Unit consists of 16 layers (planes) of cores wired together. The matrix switches at the right (behind plastic) drive the control lines.

The 8,000 character core memory in the IBM 1406 Storage Unit consists of 16 layers (planes) of cores wired together. The matrix switches at the right (behind plastic) drive the control lines.
The IBM 1401 is a decimal machine and it uses 3-digit decimal addresses to access memory. The obvious question is how can it access 16,000 locations with a 3-digit address. To understand that requires a look at the characters used by the IBM 1401.

The IBM 1401 predates 8-bit bytes, and it used 6-bit characters. Each character consisted of a 4-bit BCD (binary-coded decimal) digit along with two extra "zone" bits. By setting zone bits, letters and a few symbols could be stored. For instance, with both zone bits set, the BCD digit values 1 through 9 corresponded to the characters "A" through "I". Other zone bit combinations provided the rest of the alphabet. While this encoding may seem strange, it maps directly onto IBM punched cards, which have 10 rows for the digit and two rows for zone punches. This encoding was called BCDIC (Binary-Coded Decimal Interchange Code), and later became the much-derided EBCDIC (Extended BCDIC) encoding. (You may have noticed that 8 planes are used for 6-bit characters. One extra plane holds special "word mark" bits, and the other holds parity.)

The point of this digression into IBM character encoding is that a three-digit address also included 6 zone bits. Four of these bits were used as part of the address, allowing 16,000 addresses in total.[2] For example, the address 14,578 would be represented as the digits 578 along with the appropriate zone bits, so the resulting address would be represented as the three characters "N7H".[3]

Getting back to the problem with the memory unit, the 4K bank was failing with addresses ending in 2, 4 and 6. Looking at 2, 4 and 6, I immediately concluded that what these all had in common was the 2 bit was set. Except 4 doesn't have the 2 bit set. So maybe the problem was with even addresses. Except 0 and 8 worked. After staring at bit patterns a while, I became puzzled because 2, 4 and 6 didn't really have anything in common.

Looking at the logic diagrams reveals the hardware optimization that makes 2, 4, and 6 have something in common. Since the problem happened with specific unit digits, the problem was most likely in the address decoding circuitry that translates the unit digit to a particular select line.[4] The normal way of decoding a digit is to look at the 4 bits of the digit to determine the value. Unexpectedly, the decoder only looks at 3 bits; this reduces the hardware required, saving money. For instance, the digit 2 is detected below if the 4 bit is clear, the 1 bit is clear, and the 8 bit is clear. The digit 4 is detected if the parity (CD) bit is clear, the 4 bit is set, and the 1 bit is clear. The digit 6 is detected if the 1 bit is clear, the 2 bit is set, and the 4 bit is set.[5] Looking at the decode logic, decoding of the digits 2, 4, and 6 (and only these digits) tests that the 1 bit is clear. Now the failure starts to make sense. If something is wrong with the units 1-bit-clear signal, these digits would not be decoded properly and memory would fail in the way observed.

The Instructional Logic Diagrams (ILD) for the IBM 1401 explain the circuitry of the computer. The above diagram shows the address decode logic used for the core memory.

The Instructional Logic Diagrams (ILD) for the IBM 1401 explain the circuitry of the computer. The above diagram shows part of the address decode logic used for the core memory.
The next step was to figure out how the units 1-bit-clear signal could be wrong. You'd expect a failure of one address bit to be catastrophic, not just limited to one memory bank. Looking at the specifics of the decoder circuitry revealed the problem.

Every connection and circuit of the IBM 1401 is documented in an Automated Logic Diagram (ALD). These diagrams were generated by computer and put in a set of binders for use by service engineers. The code number 42.73.11.2 on the previous diagram provides the page number of the related ALD. While these diagrams are extremely detailed, they are nearly incomprehensible. Since I'm using copies of reduced 50-year-old line printer output, the ALDs are also barely readable.

The Automated Logic Diagrams (ALD) for the IBM 1401 mainframe computer consist of hundreds of pages that show every card and connection in the computer. The above diagram shows part of the address decode circuitry in the IBM 1406 Storage Unit.

The Automated Logic Diagrams (ALD) for the IBM 1401 mainframe computer consist of hundreds of pages that show every card and connection in the computer. The above diagram shows part of the address decode circuitry in the IBM 1406 Storage Unit.
The diagram above shows part of the ALD for the units memory address decoding. Each box corresponds to a logic component on an SMS card and the lines show the wiring between cards. At the bottom of each box, "AEM-" indicates the type of SMS card. The reference information for an AEM/AQU card reveals that it is a Switch Decode card with two circuits. Each circuit combines an inverter, a three-input AND gate, and a high-current driver.[6]

Now we can see the root cause of the problem. The unit address bit 1 (highlighted in red on the ALD) goes into pin A of the Units 4 card and is inverted. The inverted value (pin D, yellow) then goes to the Units 2 and Units 6 cards, generating the decode outputs (green). If something is wrong with this signal, addresses 2, 4, and 6 won't decode, which is exactly the problem encountered. Thus, the Units 4 card seemed like the problem.

This IBM Standard Modular System (SMS) card is used by the core memory to decode addresses. It has two high-current outputs, driven by the germanium transistors at the top with red heat sinks.

This IBM Standard Modular System (SMS) card is used by the core memory to decode addresses. It has two high-current outputs, driven by the germanium transistors at the top with red heat sinks. The card type "AEM" is stamped into the bottom left of the card.
The diagram above indicates that the Units 4 card is card E15 in rack 06B5, which is in the right rear of the 1406 unit.[7] Once I'd located the right rack, I needed to find card E15. The three rows of cards are D through F (top). I counted to position 15 of 26 in row E. The photo below shows the position of the card (red arrow).

Circuitry inside the IBM 1406 Storage Unit. The green arrow indicates the 4,000 character core memory. The cards above it control the memory. The top row of cards has high-current drivers for the memory. The cards in the middle row decode addresses. The bottom row contains amplifiers to read the signals from the memory. The red arrow indicates the position of the faulty card. The fan above the cards provides cooling airflow. At the right, colorful wire bundles connect the circuitry.

Circuitry inside the IBM 1406 Storage Unit. The green arrow indicates the 4,000 character core memory. The cards above it control the memory. The top row of cards has high-current drivers for the memory. The cards in the middle row decode addresses. The bottom row contains amplifiers to read the signals from the memory. The red arrow indicates the position of the faulty card. The fan above the cards provides cooling airflow. At the right, colorful wire bundles connect the circuitry.

One convenient thing about the IBM 1401 and its peripherals is they are designed for easy maintenance. In many cases, you don't even need any tools. To get inside the IBM 1406, you just pop the front or side panels off (as shown below). The SMS cards have a metal cover to guide the cooling airflow, but that just pops off too. It's easy to attach an oscilloscope to see what's happening, although I didn't need to do that. The SMS cards themselves are easily pulled from their sockets. I'm told you don't even need to power down the system to replace cards, but of course I turned off the power.

Inside the IBM 1406 Storage Unit. At the left are the power supplies, including a 450W ferro-resonant regulator. The 8K core memory is at the right, connected by yellow wire bundles to the control circuitry above.

Inside the IBM 1406 Storage Unit. At the left are the power supplies, including a 450W ferro-resonant regulator. The 8K core memory is at the right, connected by yellow wire bundles to the control circuitry above.

I pulled out the card in slot E15, plugged in a replacement card from the 1401 lab's collection, and powered up the system. Much to my surprise, the memory worked perfectly after replacing the card. Some of the engineers (Stan, Marc, and Dave) tested the transistors on the bad card but didn't find any problems. After cleaning the bad card and swapping it back, the memory still worked, so there must have been some dirt or corrosion making a bad connection. They say this is the first problem they've seen due to bad connections, so the thick gold plating on the SMS card contacts must work well.

Conclusion

It's not every day one gets the chance to help fix a 50 year old mainframe, so it was an interesting experience. I was lucky that this problem turned out to be easy to resolve. The guys repairing the tape drives and card reader have much harder problems, since those devices are full of mechanical parts that haven't aged well.

Thanks to the members of the 1401 restoration team and the Computer History Museum for their assistance. Special thanks to Stan Paddock, Marc Verdiell and Dave Lion for inviting me to investigate this problem.

The IBM 1401 is demonstrated at the Computer History Museum on Wednesdays and Saturdays (unless there is a hardware problem) so check it out if you're in Silicon Valley (schedule).

Notes and references

[1] The 1406 expansion unit was 29" wide, 30 5/8" deep and 39 5/8" high and weighed 350 lbs. The 10 foot cables between the 1401 computer and the 1406 storage unit are each 1 1/4" thick; one provides power and the other has signals. The 1406 generates 250 watts of heat, which is less than I would have expected. Details are in the installation manual.

[2] The three-digit address has six zone bits in total. Four are used as part of the address. The other two zone bits to indicate an indexed address using one of three index registers (which are actually part of core, not separate registers). Indexed addressing was part of the "Advanced Programming" option which cost an extra $105 per month.

[3] For full information on converting addresses to characters, see the IBM 1401 Pocket Reference Manual, page 3.

[4] Scans of the Instructional Logic Diagrams (ILDs) are available online. The memory decode circuits are on page 56. Scans of the Automated Logic Diagrams (ALDs) are also available online; the core memory is in section 42.

[5] The IBM 1401 predates standardized logic symbols, so the logic diagram symbols may be confusing: the triangular symbol is an AND gate. The SWD (Switch Decode) card inverts its inputs, but that isn't shown on the logic diagram.

There are few subtleties in the decoding logic. You might think that the circuit described would decode a 0 digit as a 2 digit since the 1, 4, and 8 bits are clear. However, the IBM 1401 stores the digit 0 as the value 10 (8 bit and 2 bit set), since a blank is stored with all bits clear.

For the decoding using the parity bit, note that the IBM 1401 uses odd parity. For instance, the digit 4 (binary 0100) already has odd parity, so the CD (check digit) parity bit is clear. The digit 5 (binary 0101) has the CD parity bit set so three bits are set in total.

[6] The original idea of SMS cards was to build computers from a small set of standardized cards, but as you can guess from the complexity of the AEM card, engineers created highly-specialized SMS cards for specific purposes. IBM ended up with thousands of different SMS card types, defeating the advantages of standardization. I've created an SMS card database that describes a thousand different SMS cards.

[7] The 06B5 designation indicates which gate holds the card. (Each rack of cards is called a "gate" in IBM terminology. Confusingly, this has nothing to do with a logic gate.) The 06 indicates the 1406 frame. The B indicates a lower frame. Position 5 is in the back right. The same numbering system is used in the IBM 1401 itself. The 1401 is built around the same frame structure as the 1406, except with four frames, stacked 2x2. The left frames are numbered 01, and the right frames are 02. The frames on top are A, and the frames on the bottom are B. Gates 1 through 4 are in the front, and 5 through 8 continue around the back. A typical 1401 gate identifier is "01B2", which indicates the rack on the front of the 1401 below the console. (The use of frames to build computers and peripherals led to the term "main frame" to describe the processing unit itself.)