Restoring a vintage Xerox Alto day 8: it boots!

We've been restoring a Xerox Alto from the 1970s for several months, and we finally got it to boot and run some programs! There's still some hardware debugging ahead of us, since the Alto drops into the debugger for many programs, but we're quite happy to see the system running. In this post, I describe our latest debugging session and show some programs running on the Alto.

The Xerox Alto, successfully booted and listing the files on the disk. The diagonal strips are an artifact of photographing the CRT and do not appear on the display.

For background, the Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore the system, along with Marc Verdiell, Luca Severini, Ron Crane, Carl Claunch and Ed Thelen. For posts on previous restoration days see parts 1, 2, 3, 4, 5, 6, 6 update and 7.

The new boot disk

In an earlier session, we discovered that our boot disk had been used for drive testing decades earlier and was filled with random garbage, making it impossible to boot from the disk. Fortunately, the Living Computer Museum in Seattle sent us a new boot disk, loaded with diagnostic software. I received a vintage Digital RK05K-11 disk cartridge box:

Box for a vintage Digital RK05K-11 disk cartridge

Inside the box was the 14" disk. Despite its size, the disk cartridge only holds 2.5 megabytes, a tangible indication of the exponential improvements in disk density since the 1970s. We loaded the disk into the Alto's Diablo drive, waited a minute for the disk to spin up to speed and the heads to load, and Ed eagerly pressed the reset button. Would we be lucky and successfully boot the Alto? After all the anticipation, nothing happened.

An Alto diagnostic boot disk, sent to us by the Living Computer Museum in Seattle.

Why won't the system boot?

Since we had successfully loaded a disk sector (of random data) earlier, we knew that the system was working end-to-end, from the drive through the disk interface card and into the processor boards and memory. One possibility was that the alignment was different between our drive and the Living Computer Museum's drive, corrupting the data. Hand-aligning our drive would be very difficult, so we hoped that wasn't the problem.

To see the words as they came off the disk, we added more logic analyzer probes to the Alto's backplane to trace the processor bus. At this point, the backplane is liberally decorated with probes, allowing us to monitor the buses and microcode execution in detail.

We added more probes to the Alto's backplane to monitor the processor bus. The probes are connected to a vintage Agilent logic analyzer.

Using the logic analyzer, we could step through the microcode to see each disk word getting loaded into memory, but the data didn't match the boot sector we expected. The Alto stores each sector on disk as a 2-word header (holding the disk address), an 8-word label (holding a next block pointer), and the 256-word data block. Although the data seemed wrong, more interesting was the octal value 000100 in the header coming from disk. (The Alto uses octal, causing us no end of confusion.) This header value corresponds to a disk address of cylinder 8, not the boot sector 0. Could we be reading the wrong sector?
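
The header decoding above can be sketched in a few lines of Python. This is only an illustration: the field layout (sector in Alto bits 0-3, cylinder in bits 4-12, then head, disk and restore bits) is my assumption about the Alto's disk address word, and `decode_disk_address` is a made-up name. The one thing the session confirms is that octal 000100 comes out as cylinder 8.

```python
def decode_disk_address(word):
    """Split a 16-bit disk address word into fields.

    Assumed layout (Alto numbering, bit 0 = most significant bit):
    bits 0-3 sector, bits 4-12 cylinder, bit 13 head, bit 14 disk, bit 15 restore.
    """
    return {
        "sector": (word >> 12) & 0o17,    # top 4 bits
        "cylinder": (word >> 3) & 0o777,  # next 9 bits
        "head": (word >> 2) & 1,
        "disk": (word >> 1) & 1,
        "restore": word & 1,
    }


header = 0o000100  # value seen on the logic analyzer
print(decode_disk_address(header))
```

Under that assumed layout, the header word decodes to cylinder 8 with every other field zero, which matches what the drive was doing.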

By removing the cover from the Diablo drive, you can watch it seek. Unlike modern hard drives, the Alto's disk isn't sealed so you can see the disk surface and head when the disk is loaded in the drive.

Looking inside the Diablo disk drive, you can see the head moving over the disk's surface as disk seeks take place. The green dial on the right rotates to indicate the current track. These seeks are from an earlier test, not from boot.

Watching the drive as the Alto attempted to boot, we saw the disk arm seek, which it shouldn't have done to read from boot sector 0. The seek dial rotated to cylinder 8—as the logic analyzer suggested, the Alto was trying to boot from the wrong disk cylinder, which clearly wouldn't work.

Inside the Diablo disk drive, the turquoise indicator dial shows the drive has seeked to cylinder 8.

Since the drive seeked correctly last week, why was it trying to read from the wrong cylinder today? Were we suffering another chip failure on the disk interface card? Had something malfunctioned in the drive? We pored over the disk interface schematics and suspected a problem with the nine cylinder select lines between the Alto and the drive. In particular, a malfunction in the CYL(5) line could set the cylinder to 8, causing the seek we saw. (Bits on the Alto are inconveniently numbered backwards, so cylinder bit 5 corresponds to the value 8.)
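
To make the backwards bit numbering concrete, here is a small sketch that computes the numeric weight of each cylinder select line, assuming CYL(0) is the most significant of the nine lines. The helper name `cyl_line_value` is hypothetical.

```python
CYLINDER_LINES = 9  # nine cylinder select lines, per the schematics discussed above


def cyl_line_value(n, width=CYLINDER_LINES):
    """Numeric weight of line CYL(n) when line 0 is the most significant bit."""
    if not 0 <= n < width:
        raise ValueError("line number out of range")
    return 1 << (width - 1 - n)


for n in range(CYLINDER_LINES):
    print(f"CYL({n}) -> {cyl_line_value(n)}")
```

With this numbering, CYL(5) has weight 8, so a single stuck line is enough to turn a request for cylinder 0 into a seek to cylinder 8.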

We noticed a scratch in the 40-conductor ribbon cable between the Alto and the disk drive, exposing a wire. Could this be the cause of our problems? We carefully checked continuity and found no problems with the cable despite the scratch. We hooked the cable back up, along with an oscilloscope to monitor the offending signal so we could debug the problem.

Running the Alto

We tried booting the Alto again, watching for the seek problem. This time the disk unexpectedly performed multiple seeks. And then the boot screen appeared on the Alto. We had a running system!

The Xerox Alto screen after booting, waiting for a command.

A few months ago, I had used the Salto simulator to see how the Alto worked. But now, facing a working system, I couldn't remember the commands. To see the files, was it LIST, or DIR? No. How about HELP? No good. After a minute or two, I remembered that a simple question mark was the command to list the disk, and I got a list of files. The system was working well enough to read a directory.

I tried running the WYSIWYG text editor Bravo and the mouse-based drawing program Draw, but they crashed, dropping the system into the debugger, Swat. Clearly some hardware problems remain and our debugging adventure is not over yet.

The Alto's debugger is called Swat, and runs if there is an error.

Some programs ran successfully. The CRT test program drew grids on the bitmapped screen. The CRT is a bit fuzzy in the upper left, but the quality is surprisingly good considering that this tube was almost too dim to see a few months ago. Apparently running the tube a while restored it by burning contaminants off the cathode (or some mysterious tube-era phenomenon like that).

The Xerox Alto running a CRT test program. Antique mechanical calculators are in the background.

The Ethernet diagnostic program ran and showed off the mouse-based GUI. I'm developing a BeagleBone-based Ethernet simulator for the Alto, so this program will be very helpful. We don't have a gridded optical mouse pad, so the mouse didn't work and we couldn't click anything.

The Alto's Ethernet Diagnostic Program uses a mouse-based GUI.

The keyboard test program graphically displays the keyboard and shows each key as it is pressed. We used this to verify the keys all work.

The Alto running the keyboard test program. Antique calculators are in the background.

A closeup of the Alto's keyboard test program. It highlights keys when they are pressed.

Conclusion

It was an exciting day, with the Alto finally booting successfully. A disk seek problem blocked us for a while, but then the problem mysteriously disappeared. We ran a bunch of test programs from the disk. About half of them ran successfully, and half crashed into the debugger. There may be a malfunction in the processor that we need to track down. Or perhaps we're getting memory errors; the parity errors we saw earlier could have returned. In any case, we have some more debugging ahead of us, but it's exciting to see the system finally running. Hopefully we will soon be playing Alto Trek and Maze War.

For updates on the restoration, follow me on Twitter at kenshirriff.

Thanks to Josh Dersch and the Living Computer Museum for their debugging help and sending out the boot disk.

Sonicare toothbrush teardown: microcontroller, H bridge, and inductive charging

My Sonicare electric toothbrush recently quit working, so I took it apart and examined the interesting circuitry inside. There's much more complexity than I expected inside a toothbrush, especially in the mechanism that drives the brush head at 31,000 strokes per minute. Internally, the brush appears to be designed for quality rather than ease of manufacturing. Unfortunately, moisture can get in, causing reliability problems.

The toothbrush is a Sonicare Flexcare Platinum with more features than you'd expect in a toothbrush: three brushing modes, three intensities and a couple of timers, along with 10 LEDs to indicate its status. A pressure sensor in the toothbrush changes the vibration if you apply too much pressure while brushing. The toothbrush uses wireless inductive charging so it charges when set on the base. (This toothbrush may seem overly complicated, but it's nothing compared to the new model that includes Bluetooth.)[1]

Disassembling the Sonicare toothbrush. At the left is the induction coil used for charging.

The first step was to remove the toothbrush base, allowing the toothbrush mechanism to be removed from the case. The toothbrush head mounts on the right; it needed to be removed to disassemble the toothbrush. At the left is the charging coil used to wirelessly charge the toothbrush.

The photos below show the top and bottom of the toothbrush internals. I expected to find a simple, low-cost mechanism, so I was surprised at how much complexity there was inside. The vibration mechanism (right) is built from multiple metal and plastic parts screwed together, requiring more expensive assembly than I expected. The circuit board is literally gold-plated and has a lot of components, even if it doesn't quite reach Apple's level of complexity. Overall, the toothbrush's internal design is high quality (except, of course, for the fact that it quit working, as did an earlier one).

Inside the Sonicare toothbrush, top and bottom composite view. The charging coil is at the left. The battery (red) is in the lower left. The coil that vibrates the brush is in the center and the brushing mechanism is at the right.

The brush contains several key components, as can be seen above. In the center is the large red coil that causes the toothbrush to vibrate. On the right is the vibration mechanism, which has a powerful magnet that is moved by the coil. The brush head snaps on at the right. The battery (red, left) takes up about a third of the toothbrush. The long, thin circuit board (green) has the circuitry to operate the toothbrush. A white spacer sits on top of the circuit board, with holes for the LEDs and buttons.

The photo below shows the brush mechanism partially disassembled and separated from the electronics. The toothbrush still powers on in this state, as you can see from the illuminated LEDs. Note the flexible brown ribbon cable between the center of the brush mechanism and the electronics board. This connects the pressure sensor on the brush mechanism to the electronics board.

The brush mechanism (left) separated from the electronics (right). Note the illuminated LEDs. Also note the flexible brown ribbon connecting the pressure sensor to the electronics board.

The diagram below shows the main components on the circuit board. The buttons are the most visible feature. The gold circles at the left are used to program the microcontroller. The MOSFET transistors switch the coil on and off to produce vibrations. Ten LEDs are scattered across the board. At the right, the diode bridge is part of the charging circuit.

The circuit board for the Sonicare toothbrush is crammed with tiny parts. The gold circles on the left are used to program the microcontroller chip. The tiny gold circles scattered across the board are test points for testing the board during manufacturing.

The circuit board is covered with tiny gold circles. These are test points, allowing test connections to most parts of the board. For instance, each LED and each button has a test point that can be used to test the component. During testing, spring-loaded pogo pins on the test circuit make contact with these test points on the toothbrush board. The number of test points (about 56) looks like overkill to me.

The diagram below shows the components on the back of the circuit board. The toothbrush is controlled by a mid-range 8-bit microcontroller, the PIC16F1516.[2] This chip contains the code for all the toothbrush functions: reading the buttons, lighting the LEDs, controlling the coil, and managing charging. There are too many LEDs (10) for the chip to control individually, so eight of the LEDs are controlled by a separate LED driver chip.[3]

The back of the Sonicare circuit board contains the PIC16F1516 microcontroller chip. The sensor is probably a Hall-effect magnetic field sensor.
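
As the notes explain, the LED driver is a serial-in, 8-bit parallel-out shift register, so the microcontroller needs only three pins (data, shift clock, latch) to set eight LEDs. The toy model below shows the idea. It is a sketch, not the toothbrush's firmware: the class and function names are made up, and the bit order and the LED-to-output mapping are assumptions.

```python
class ShiftRegister595:
    """Toy model of an 8-bit serial-in, parallel-out shift register with an output latch."""

    def __init__(self):
        self.shift = 0    # bits clocked in so far
        self.outputs = 0  # bits currently driving the LEDs

    def clock_in(self, bit):
        """One shift-clock pulse: shift left and take the data pin as the new low bit."""
        self.shift = ((self.shift << 1) | (bit & 1)) & 0xFF

    def latch(self):
        """One latch pulse: copy the shift register to the output pins."""
        self.outputs = self.shift


def set_leds(reg, pattern):
    """Send an 8-bit LED pattern, most significant bit first, then latch it."""
    for i in range(7, -1, -1):
        reg.clock_in((pattern >> i) & 1)
    reg.latch()


reg = ShiftRegister595()
set_leds(reg, 0b10100101)
print(f"{reg.outputs:08b}")
```

The latch step matters: the LEDs only change once all eight bits have arrived, so they don't flicker through intermediate patterns while the bits are being shifted in.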

The microcontroller is an off-the-shelf part, not a custom chip, so it needs to be programmed with the right software. This is done during manufacturing through the large gold circles and triangle near the end of the toothbrush.[4] The resonator provides the clock signal for the microcontroller's timing.[5]

The driver mechanism and the H bridge circuit

The toothbrush head is driven by an electromagnetic coil that moves a magnet. The coil has two halves, wired in opposite directions, so the sides will have opposite magnetic fields. The coil is pulsed one way to rotate the magnet one direction, and then pulsed the opposite way to rotate the magnet the other direction. The result is the high-speed brushing vibration.

The diagram below shows the driver mechanism disassembled. The coil constantly switches polarity so the north pole will switch from the top to bottom (the yellow and blue poles of the coil). The magnet has poles on the front and back edges (perpendicular to the coils), so it will attempt to rotate back and forth to line up with the coil, along the long axis of the toothbrush. The mechanism limits the rotation to a few degrees, resulting in a rotational vibration back and forth rather than spinning like a motor. This rotational vibration is transmitted to the toothbrush head by the torsion bar causing the head and bristles to vibrate. More details on the driver mechanism are here.

Sonicare toothbrush driver mechanism. As the polarity of the coil switches, the magnet rotates back and forth slightly. The torsion bar transmits the rotation to the shaft, which causes the toothbrush head to vibrate around its axis.

The figure below shows the voltage across the coil. Every 2 milliseconds, there is a 4 volt pulse across the coil, followed by a negative 4 volt pulse. The pulses generate the reversing magnetic field that drives the magnet and causes the toothbrush to vibrate. If you count the positive and negative pulses as separate brush strokes, you get the advertised 31,000 brush strokes per minute. (Although counting an up-down cycle as a single stroke rather than two would make more sense to me.)

Voltage across the actuator coil in a Sonicare toothbrush. An H bridge drives the coil with +/- 4 volt pulses every 2 milliseconds.
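
The stroke arithmetic can be checked quickly. This sketch assumes that "every 2 milliseconds" is the approximate spacing between successive pulses (positive, then negative), which is my reading of the trace. It works backwards from the advertised 31,000 strokes per minute; the function name is made up.

```python
STROKES_PER_MINUTE = 31_000  # advertised figure


def stroke_timing(strokes_per_minute):
    """Return (ms between pulses, ms per full +/- cycle, full cycles per second)."""
    pulse_spacing_ms = 60_000 / strokes_per_minute
    cycle_ms = 2 * pulse_spacing_ms  # one positive pulse plus one negative pulse
    cycle_hz = 1000 / cycle_ms
    return pulse_spacing_ms, cycle_ms, cycle_hz


spacing, cycle, hz = stroke_timing(STROKES_PER_MINUTE)
print(f"{spacing:.2f} ms between pulses, {cycle:.2f} ms per cycle, {hz:.0f} Hz")
```

That works out to about 1.94 ms between pulses, consistent with the roughly 2 ms spacing on the oscilloscope, and a full back-and-forth cycle at about 258 Hz.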

You might think that driving a coil in two directions would use two switches, but instead it uses four, in a common circuit called an H bridge, as shown below. If switches 1 and 4 are closed, current flows in the forward direction. If switches 2 and 3 are closed, current flows in the reverse direction. In the toothbrush, transistors are used for the switches, and are turned on and off by the microcontroller.[6] An H bridge is often used to control motors that need to go forwards and reverse, for example in a hoverboard.

An H bridge circuit is used to drive the vibration coil. This allows the coil to be off or energized in either direction. Four switches (MOSFET transistors) are used in the H bridge.
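
The switch logic can be written out as a tiny lookup. This is a sketch of the H bridge behavior as described above, not of the toothbrush's firmware. The switch numbering follows the text, and the function name is hypothetical. Combinations the text doesn't cover are deliberately left unclassified, since which switches share a side of the bridge isn't stated.

```python
def coil_state(closed):
    """Return the coil drive state for a collection of closed switch numbers (1-4)."""
    closed = frozenset(closed)
    if not closed <= {1, 2, 3, 4}:
        raise ValueError("switch numbers must be 1 through 4")
    if closed == {1, 4}:
        return "forward"
    if closed == {2, 3}:
        return "reverse"
    if not closed:
        return "off"
    return "not a drive state"  # anything else is outside what the text describes


# One vibration cycle: a forward pulse, a pause, a reverse pulse, a pause.
for switches in ({1, 4}, set(), {2, 3}, set()):
    print(sorted(switches), "->", coil_state(switches))
```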

Pressure sensor

One of the features of this toothbrush is a pressure sensor. If you press too hard while brushing, the vibrations start pulsing and the LEDs flash. The sensor itself is a tiny mystery chip (below) mounted on the drive assembly, and connected to the electronics board with a thin flexible cable. The cable is labeled with Vdd (1), Data (2), Clock (3), and Ground (4), so the sensor is probably sending a stream of bits using an I2C protocol. My suspicion is the sensor is a Hall effect magnetic field sensor that detects a change in the magnetic field if pressure is preventing the magnet from vibrating. The chip doesn't seem to be in a position to measure actual pressure, which is why I suspect it's measuring the magnetic field instead.

The pressure sensor on the toothbrush is connected to the electronics via a flexible cable. The sensor is probably a Hall effect magnetic sensor using the I2C protocol.

Charging

To charge the toothbrush, you set it on a stand, where it charges inductively without being physically plugged in. A coil in the stand is magnetically coupled to a coil in the toothbrush, transmitting the power wirelessly. You can see the coil at the bottom of the toothbrush. When set on the stand, the coil picks up about 12 volts, which is used to charge the battery. The power is transmitted at high frequency (80 kHz) for efficiency.

The coil is connected to a diode bridge that converts the power to DC. It then goes through a transistor circuit that regulates the charging, as directed by the microcontroller. The battery in the toothbrush is a Sanyo Li-ion rechargeable battery, which is said to be 3.7V but I measured 4.0V.[7]

Voltage across the charging coil in a Sonicare toothbrush oscillates at about 80 kHz.

The toothbrush is designed to conserve battery by using very little power when not in use. The microcontroller has a low power standby mode when it is waiting for a button press. When the toothbrush is activated, a transistor energizes the LEDs and the LED driver chip, while another circuit powers up the pressure sensor. This prevents these components from draining the battery while the toothbrush is not in use.

Conclusion

Overall, I was surprised by how much electronics was inside the toothbrush, as well as the complexity of the drive mechanism. It was designed with quality in mind, not low-cost production. Unfortunately, the brush has reliability issues—this was the second one to fail on me. The problem appears to be water seeping in around the shaft, eventually damaging the internals.

Some other Sonicare teardowns are here, here and here. I would have expected different models to be based on similar electronics that just changed the LEDs, buttons and software. Surprisingly the different teardowns show a variety of microcontrollers, circuitry, and drive coils. Some models even move the magnets from the toothbrush unit to the brush head.

Unfortunately, after disassembling my toothbrush I was unable to fix its problem. But at least I got an interesting teardown out of it!

To find out about my latest teardowns, follow kenshirriff on Twitter.

Notes and references

[1] It's ironic for a toothbrush to include Bluetooth technology because Bluetooth is named after Harald Bluetooth, a tenth century Danish king who was called Bluetooth because he had a bad, discolored tooth. The Bluetooth logo itself is formed by combining two runes from the king's name.

[2] The PIC microcontroller runs at 16 megahertz. It has 8K of flash memory for the program, as well as 512 bytes of RAM (the RAM on microcontrollers is usually very small) and 128 bytes of flash memory for data. It includes analog-to-digital conversion, which I think is used to monitor the charging voltage. The toothbrush's 8-bit microcontroller is less powerful than the 16-bit microcontroller inside a Macbook power supply.

[3] The LEDs are controlled by a 74HC595A serial to 8-bit output chip. Without this chip, the microcontroller would need 8 pins to control 8 LEDs; with it, the microcontroller uses only 3 pins to communicate with the serial chip, freeing up 5 pins for other tasks.

[4] Programming of the chip is done using the ICSP (In-Circuit Serial Programming) protocol. This uses the programming contacts labeled Vdd, Vpp, Tx, and Ground, as well as the triangle contact, which provides the ICSP data. For some reason, the Tx and Rx circles are also connected to the chip's UART serial pins, allowing serial communication with the microcontroller. I'm not sure why one would want to communicate with the chip outside programming. Maybe there's serial communication with the microcontroller as part of testing. Or maybe the NSA can download information on your brushing habits :-)

[5] The resonator is a 3-pin unit with built-in load capacitors, similar to a quartz crystal oscillator. I suspect it's a CERALOCK®, or something similar.

[6] The H bridge uses a 6866S 20V dual N-channel MOSFET on the low side and a 6963SD 20V dual P-channel MOSFET on the high side.

[7] The charger circuit is puzzlingly simple. The voltage from the diode bridge goes through a microcontroller-controlled transistor (Q5) and then to the battery (through a tiny fuse), without the filtering, voltage regulator or battery voltage monitoring I'd expect. The microcontroller is connected to the AC side of the diode bridge, and presumably is monitoring the input voltage waveform.

Restoring a Xerox Alto day 7: experiments with disk and Ethernet emulators

In this Alto restoration session we controlled the Alto's disk drive with an FPGA disk emulator and attempted booting the Alto with a BeagleBone-based Ethernet emulator. The GIF below shows the drive performing seeks as commanded by the emulator. (With the cover off the Diablo drive, you can see the disk head floating above the spinning disk surface and moving back and forth for seeks.) However, both emulators encountered some bugs, which we will need to fix.

Looking inside the Diablo disk drive, you can see the head moving over the disk's surface as disk seeks take place. The green dial on the right rotates to indicate the current track.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore the system, along with Marc Verdiell, Luca Severini, Ron Crane, Carl Claunch and Ed Thelen. For posts on previous restoration days see 1, 2, 3, 4, 5, 6 and 6 update. Marc's YouTube video on Day 7 is below:

In our previous session, we discovered a faulty 7414 inverter chip on the disk interface card was preventing the disk from working: one of the six inverters on the chip had failed, preventing the disk sector task from running. Since we didn't have a 7414 lying around the house, we used a "dead bug" hack (below) to replace the bad inverter on the chip with an unused one, allowing us to access the disk. This session, we replaced the bad 7414 with a new one since we didn't want our hack to be permanent.

We re-wired a 7414 inverter chip. An unused inverter replaced the failed inverter.

Last week, I discovered that our boot disk had been overwritten with random data decades ago to test the drive (details). This made it impossible to boot off our disk, blocking our progress. Tim Curley from Xerox PARC offered me some disks from PARC's collection of dozens of old Alto disks (below). Some people were concerned, though, that the disks could get damaged in a boot attempt, losing their historical data. To avoid damage, we decided not to boot these disks until we're sure the Alto is working properly and we have them archived. Instead, Josh Dersch at the Living Computer Museum in Seattle is sending us a fresh boot disk with no historical significance. Unfortunately we didn't get the disk in time for today's session, but we'll try it out next session.

Some old Xerox Alto hard disks at PARC. I borrowed a couple of them and we'll try reading them later.

The disk emulator

Our test setup to exercise the Diablo disk drive (center) with the FPGA board (front). The oscilloscope shows the sector pulses (top, blue), clock (middle, green), and data (bottom, yellow). Four sectors are visible on the bottom trace. The Xerox Alto is behind the oscilloscope. On the right are the power supply and the laptop controlling the FPGA board.

Carl built a Diablo disk emulator / exerciser from an FPGA board. The idea is we can hook this up to the Diablo drive to read and archive disks. Then we can connect the emulator to the Alto and simulate multiple disk packs without physically handling disks. Building a disk emulator is complex because the drive itself implements very little functionality. It provides the raw bit stream as it is read off the disk, and the emulator needs to process this into bytes. In the photo above, the bottom oscilloscope trace shows several sectors as they are read from disk.

If you're not familiar with an FPGA (field-programmable gate array), it is a chip that can be programmed to generate custom hardware. The FPGA chip contains numerous logic blocks along with a switch matrix that allows them to be interconnected as desired. You describe the hardware configuration (gates, latches, and so forth) using a hardware description language such as Verilog and the chip is programmed to implement the desired circuitry.

The FPGA board for the emulator (below) is a Digilent Nexys 2 with a Xilinx Spartan-3E FPGA chip in the center of the board. This chip contains over ten thousand logic cells, allowing it to implement complex circuits. The FPGA board is connected to a prototyping board (right) with chips that shift the voltage levels to TTL as required by the Diablo drive. Carl's FPGA code generates the numerous signals required by the Diablo drive; in the photo below you can see the thick black cable going to the drive.

A Digilent FPGA board configured to control a Diablo disk drive.

We hooked up the FPGA board to the Diablo drive and tested it out. It communicated with the drive just fine and could read from different tracks. Unfortunately, the read data was zeros, which was surprising since the Alto successfully read from the disk last week. After some investigation, Carl found the problem was in the FPGA code that stored the data in RAM, not his code. (See his blog for details.) You'd think writing to RAM would be the easy part, but apparently not. The disk logic appears to work fine so hopefully next session we will be able to read and archive disks.

The Ethernet emulator

The Xerox Alto was the first system with Ethernet, introducing a lot of networking innovations. Unfortunately, it uses 3 Mb/second Ethernet over coaxial cable, which is incompatible with anything modern. I built an Ethernet emulator using a BeagleBone Black, allowing me to send Ethernet boot packets to the Alto. The photo below shows the BeagleBone, along with a chip (74AHCT125) to convert the BeagleBone's 3.3V signals to 5V TTL signals. (The Ethernet signals to and from the Alto are 5V TTL. These signals normally go to a transceiver, which converts these signals to signals over the network cable.) I'm using the BeagleBone's PRU microcontrollers to implement this code; I wrote a blog post with more about the PRUs.

A BeagleBone Black configured to emulate the 3Mb/s Ethernet on the Xerox Alto.

The emulator operates by converting a data block into the low-level signal required by Ethernet. A 0 bit is high-then-low and a 1 bit is low-then-high, with 170 nanosecond pulses. (Note that each data bit includes a transition (high-to-low or vice versa), which allows the receiver to detect bits and extract a clock signal.) My emulator almost worked; by using the logic analyzer, I saw the Ethernet microcode was running and the Alto was receiving data from my board. Unfortunately, there was about one bit error per word, making it unusable. The problem is probably interference due to the sketchy wiring I used; I'll try shielded wire next session.
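
The bit encoding described above (a form of Manchester encoding) can be sketched as follows. This is not the PRU code from the emulator, just an illustration in Python: the function names are made up, and sending the most significant bit first is an assumption. The 0 and 1 conventions and the 170 ns pulse width come from the description above.

```python
HALF_BIT_NS = 170  # pulse width from the description above


def manchester_encode(data):
    """Encode bytes as signal levels, two half-bit levels per data bit.

    Convention: 0 -> high then low, 1 -> low then high.
    Bits go out most significant first (an assumption for this sketch).
    """
    levels = []
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover bits from level pairs, rejecting pairs with no mid-bit transition."""
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits


levels = manchester_encode(b"\xa0")
print(levels)
print(len(levels) * HALF_BIT_NS, "ns on the wire")
```

Two 170 ns half-bits make a 340 ns bit time, or about 2.94 Mb/s, in line with the "3 Mb/second" figure. The decoder's transition check also shows why a noisy line hurts: a glitch that flattens one mid-bit transition corrupts that bit.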

Conclusion

This week we tried a Diablo disk emulator and an Ethernet emulator. They both partially worked, but still have some bugs. Next week we'll try booting the system with a new disk. I'm moderately optimistic that the system will come up successfully, but there could be more hardware problems waiting for us. For updates on the restoration, follow kenshirriff on Twitter.

Thanks to Josh Dersch and the Living Computer Museum for their debugging help and sending out a boot disk. Thanks to Tim Curley and Xerox PARC for supplying additional disks.

The discussion of this post on Hacker News is here.

How to run C programs on the BeagleBone's PRU microcontrollers

This article describes how to write C programs for the BeagleBone's microcontrollers. The BeagleBone Black is an inexpensive, credit-card sized computer that has two built-in microcontrollers called PRUs. By using the PRUs, you can implement real-time functionality that isn't possible in Linux. The PRU microcontrollers can be programmed in C using an IDE, which is much easier than low-level assembler programming. I recently wrote an article about the PRU microcontrollers, explaining how to program them in assembler and describing how they interact with the main ARM processor; read it for more background.

A "blink" program in C

To motivate the discussion, I'll use a simple program that uses the PRU to flash an LED ten times. This example is based on the PRU GPIO example, but uses C instead of assembly code.

Blinking an LED using the BeagleBone's PRU microcontroller.

The C code, below, flashes the LED ten times. The LED is controlled by setting or clearing a bit in register R30, which controls the GPIO pins. The code demonstrates two ways of performing delays. The first delay uses a for loop, leaving the LED on for 400 ms. The second delay uses the special compiler function __delay_cycles(), which delays for the specified number of cycles. Since the PRUs run at 200 MHz, each cycle is 5 nanoseconds, yielding an off time of 300 ms. At the end, the code sends an interrupt to the host code via register R31 to let it know the PRU has finished.[1]
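
As a quick check of the cycle arithmetic above (this is not the blink program itself), here is a minimal sketch that converts a delay in milliseconds into PRU cycles. The names PRU_CLOCK_HZ, ms_to_cycles and cycle_ns are hypothetical helpers introduced for illustration; the code is plain C and can be compiled on any host.

```c
#include <stdint.h>

#define PRU_CLOCK_HZ 200000000u /* 200 MHz PRU clock, as stated above */

/* Length of one PRU cycle in nanoseconds (5 ns at 200 MHz). */
static inline uint32_t cycle_ns(void)
{
    return 1000000000u / PRU_CLOCK_HZ;
}

/* Number of PRU cycles in a delay of ms milliseconds.
 * The 32-bit result overflows for delays above about 21 seconds. */
static inline uint32_t ms_to_cycles(uint32_t ms)
{
    return ms * (PRU_CLOCK_HZ / 1000u);
}
```

With these numbers, a 300 ms off time works out to 60,000,000 cycles, which is the kind of constant you would pass to __delay_cycles().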

How to compile C programs with Code Composer Studio

Although you can compile C programs directly on the BeagleBone,[2] it's more convenient to use an IDE. Texas Instruments provides Code Composer Studio (CCS), an integrated development environment on Windows and Linux that you can use to compile C programs for the PRU.[3] To install CCS, use the following steps:
  • Download CCS here. (You'll need to create a TI account and then fill out an export approval form before downloading, which seems so 1990s but isn't too difficult.)
  • Follow the instructions here to make sure you have the necessary dependencies or CCS installation will mysteriously fail.
  • In the installer, select Sitara 32-bit ARM Processors: GCC ARM Compiler and TI ARM Compiler.
  • In the add-ons dialog, select PRU Compiler.
  • After installation, run CCS, select App Center, and install the additional add-ons (i.e. the PRU compiler).

To create a C program in CCS, use the following steps. The image highlights the fields to update in the dialog.

  • Start CCS.
  • Click New Project.
  • Change target to AM3358.
  • Change tab to PRU.
  • Enter a project name, e.g. "test".
  • Open "Project templates and examples" and select "Basic PRU Project".
  • Click Finish.
  • Enter the code.

How to set up Code Composer Studio to compile a PRU program for the BeagleBone.

To set up the BeagleBone for the example:

  • Download the device tree file: /lib/firmware/PRU-GPIO-EXAMPLE-00A0.dts.
  • Compile and install the device tree file to enable the PRU:
    # dtc -O dtb -I dts -o /lib/firmware/PRU-GPIO-EXAMPLE-00A0.dtbo -b 0 -@ PRU-GPIO-EXAMPLE-00A0.dts
    # echo PRU-GPIO-EXAMPLE > /sys/devices/bone_capemgr.?/slots
    # cat /sys/devices/bone_capemgr.?/slots
    
  • Download the linker command file bin.cmd.
  • Download the host file that loads and runs the PRU code (loader.c) and compile it:
    # gcc -o loader loader.c -lprussdrv
    
To compile and run the C program:
  • In CCS, select Project -> Build All (control-B) to compile the program.[4]
  • Copy the binary (test/Debug/test.out) to the BeagleBone (e.g. with scp).
  • On the BeagleBone, link and run the program:[5]
    # hexpru bin.cmd test.out
    # ./loader text.bin data.bin
    

If everything went correctly, the LED should flash. (See my previous article for debugging help.)

In this example, loader simply loads and runs the executable on the PRU.[6] In a more advanced application, it would communicate with the PRU. For example, it could get commands from a web page, send them to the PRU, get results, and display them on the web. The point is that you can use the Linux-side code to do complex network or computational tasks, in combination with the PRU doing low-level, real-time hardware operations. It's kind of like having an Arduino together with a "real computer", in a tiny package.

The BeagleBone Black is a tiny computer that fits inside an Altoids mint tin. It is powered by the TI Sitara™ AM3358 processor, the large square chip in the center.

Documentation

The PRUs are very complex and don't have nice APIs, so you'll probably end up reading a lot of documentation to use them. The most important document that describes the Sitara chip is the 5041-page Technical Reference Manual (TRM for short). This article references the TRM where appropriate, if you want more information. Information on the PRU is inconveniently split between the TRM and the AM335x PRU-ICSS Reference Guide. For specifics on the AM3358 chip used in the BeagleBone, see the 253-page datasheet. Texas Instruments has a PRU wiki with more information. More information on using CCS is here.

If you're looking to use the BeagleBone and/or PRU I highly recommend the detailed and informative book Exploring BeagleBone. Helpful web pages on the PRU include BeagleBone Black PRU: Hello World, Working with the PRU and BeagleBone PRU GPIO example. Some PRU example code is in the TI PRU training course.

The BeagleBone Black, with the AM3358 processor in the center. The 512MB DRAM chip is below, with the HDMI framer chip to the right of it. The 4GB flash chip is in the upper right.

Using a timer and interrupts

For a more complex example, I'll show how to use the PRU with a timer and interrupts.[7] The basic idea is the timer will trigger an interrupt at a set frequency. The PRU code in this example will toggle the GPIO pin when an interrupt occurs, generating a sequence of 5 pulses.[8]

It is important to understand that PRU interrupts are not "real" interrupts that interrupt execution, but are signaled through polling.[9] A PRU interrupt sets bit 30 or bit 31 in register R31.[10] The PRU code can busy-wait on this bit to determine if an interrupt has happened. This is fast and very low latency, compared to a context-switching interrupt, but it puts more demands on the program structure.

The first step is to add the plumbing for the timer's interrupt, so the PRU will receive the interrupt. The PRUs can handle 64 different interrupt types from various subcomponents of the system. The timer interrupt is assigned system event number 15 and has the cryptic name pr1_ecap_intr_req. (See TRM table 4-22.) Interrupts are configured in the host side code (loader.c) using the PRUSSDRV library API call prussdrv_pruintc_init. To support the timer interrupt, the default interrupt configuration must be extended. The diagram below shows the complex PRU interrupt configuration on the BeagleBone (details). The new interrupt path, highlighted in red, connects the timer interrupt (15) to CHANNEL0 and in turn to register R31, the register for polling.

Interrupt handling on the BeagleBone for the PRU microcontrollers. The timer interrupt (15) is shown in red. The default interrupt configuration is extended so the timer interrupt will trigger bit 30 of R31.

To add interrupt 15 to the configuration as shown above, the configuration struct in loader.c must be modified. The following structure is passed to prussdrv_pruintc_init to set up the interrupt handling. The changes from the default configuration are the 15 added to the list of enabled events and the {15, CHANNEL0} entry added to the event-to-channel mapping. Without this change, timer interrupts will be ignored and the example code will not work.

#define PRUSS_INTC_CUSTOM {   \
 { PRU0_PRU1_INTERRUPT, PRU1_PRU0_INTERRUPT, PRU0_ARM_INTERRUPT, PRU1_ARM_INTERRUPT, \
   ARM_PRU0_INTERRUPT, ARM_PRU1_INTERRUPT,  15, (char)-1  },  \
 { {PRU0_PRU1_INTERRUPT,CHANNEL1}, {PRU1_PRU0_INTERRUPT, CHANNEL0}, {PRU0_ARM_INTERRUPT,CHANNEL2}, {PRU1_ARM_INTERRUPT, CHANNEL3}, \
   {ARM_PRU0_INTERRUPT, CHANNEL0}, {ARM_PRU1_INTERRUPT, CHANNEL1}, {15, CHANNEL0}, {-1,-1}},  \
 {  {CHANNEL0,PRU0}, {CHANNEL1, PRU1}, {CHANNEL2, PRU_EVTOUT0}, {CHANNEL3, PRU_EVTOUT1}, {-1,-1} },  \
 (PRU0_HOSTEN_MASK | PRU1_HOSTEN_MASK | PRU_EVTOUT0_HOSTEN_MASK | PRU_EVTOUT1_HOSTEN_MASK) \
}

The second step to using the timer is to initialize the timer to create interrupts at the desired frequency, as shown in the following code. Using PRU features is fairly difficult since you are controlling them through low-level registers, not a convenient API, so you'll probably need to study TRM section 15.3 to fully understand this. The basic idea is the timer counts up by 1 every cycle (PWM mode is enabled in ECCTL2). When the counter reaches the value in the APRD (period) register, it resets and triggers a "compare equal" interrupt (as controlled by ECEINT). Thus, interrupts will be generated with the period specified by DELAY_NS.

inline void init_pwm() {
  *PRU_INTC_GER = 1; // Enable global interrupts
  *ECAP_APRD = DELAY_NS / 5 - 1; // Set the period in cycles of 5 ns
  *ECAP_ECCTL2 = (1<<9) /* APWM */ | (1<<4) /* counting */;
  *ECAP_TSCTR = 0; // Clear counter
  *ECAP_ECEINT = 0x80; // Enable compare equal interrupt
  *ECAP_ECCLR = 0xff; // Clear interrupt flags
}
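
The "DELAY_NS / 5 - 1" expression above is easy to get wrong by one, so here is a minimal host-side sketch of the arithmetic. It assumes, consistent with the formula in the code, that the counter runs from 0 up to APRD inclusive, so a period of N cycles needs a register value of N - 1. The helper names are hypothetical, and the code does not touch any hardware registers.

```c
#include <stdint.h>

#define PRU_CYCLE_NS 5u /* one cycle at 200 MHz */

/* Value to write to the period register for a requested period in ns.
 * Assumes delay_ns >= 5; smaller values would wrap around. */
static inline uint32_t aprd_for_period_ns(uint32_t delay_ns)
{
    return delay_ns / PRU_CYCLE_NS - 1u;
}

/* Period actually produced by a given register value, in ns.
 * Requests that are not a multiple of 5 ns are rounded down. */
static inline uint32_t actual_period_ns(uint32_t aprd)
{
    return (aprd + 1u) * PRU_CYCLE_NS;
}
```

For example, a 100 ns period gives a register value of 19, and a request of 103 ns silently becomes 100 ns because the timer only counts whole cycles.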

The final step is to wait for the interrupt to happen with a busy-wait. The while loop polls register R31 until the timer interrupt fires and sets bit 30. Then the interrupt is cleared in the PRU interrupt subsystem and in the timer subsystem.

inline void wait_for_pwm_timer() {
  while (!(__R31 & (1 << 30))) {} // Wait for timer compare interrupt (bit 30 of R31)
  *PRU_INTC_SICR = 15; // Clear interrupt
  *ECAP_ECCLR = 0xff; // Clear interrupt flags
}

The oscilloscope trace below shows the result of the timer example program: five precision pulses with a width of 100 nanoseconds on and 100 nanoseconds off. The important advantage of using the PRU microcontroller rather than the regular ARM processor is the output is stable and free of jitter. You don't need to worry about nondeterminism such as context switches or cache misses. If your application won't be affected by milliseconds of random delay, the regular processor is much easier to program, but if you require precision timing, you should use the PRU.

Using the BeagleBone Black's PRU microcontroller to generate pulses with a width of 100 nanoseconds.

The full source code for the timer example is here.[11] To run the timer example, you'll also need to use the updated loader.c that enables interrupt 15 (or else nothing will happen).

Conclusion

The PRU microcontrollers give the BeagleBone real-time, deterministic processing, but with a substantial learning curve. Programming the PRUs in C using the IDE is much easier than programming in assembler. (And you can embed assembler code in C if necessary.)

Combining the BeagleBone's full Linux environment with the PRU microcontrollers yields a very powerful system since the microcontrollers provide low-level real-time control, while the main processor gives you network connectivity, web serving, and all the other power of a "real" computer. (My current project using the PRU is a 3 megabit/second Ethernet emulator/gateway to connect to a Xerox Alto.)

Notes and references

[1] Delivering the interrupt to the host code is more complex than you'd expect. I wrote a longer description here, explaining details such as how event 3 on the PRU turns into event 0 on the host.

[2] To compile a C program on the BeagleBone, use the clpru command. See this article for details on clpru.

[3] Code Composer Studio isn't available for Mac, but CCS works well if you run Linux on your Mac using Parallels. I also tried running Linux in VirtualBox, but ran into too many problems.

[4] If you want to see the assembly code generated by the C compiler, use the following steps:

  • Go to Project -> Properties
  • Select the configuration you're building (Debug or Release)
  • Check Advanced Options -> Assembler Options: Keep the generated assembly language file. This adds the --keep_asm flag to the compile.

The resulting assembly file will be in Debug/main.asm. Although the file is hundreds of lines long, the actual generated code is much shorter, starting a few dozen lines into the file. Comments indicate which source lines correspond to the assembly lines.

[5] The hexpru utility converts the ELF-format file generated by the compiler into a raw image file that can be loaded onto the PRU. The bin.cmd file holds the command-line options for hexpru. See the PRU Assembly Language Tools manual for details.

You can configure Code Composer Studio to run hexpru automatically as part of compilation, by doing a bit of configuration. Follow the steps here to enable and configure PRU Hex Utility.

[6] The loader.c code uses the PRU Linux Application Loader API (PRUSSDRV) to interact with the PRU. I'm told that the cool new framework is remoteproc, but I'll stick with PRUSSDRV for now. (There seems to be a great deal of churn in the BeagleBone world, with huge API changes in every kernel.)

[7] For a timer, I'll use the PRU's ECAP module, which can be configured for PWM and then used as a 32-bit timer. (Yes, this is confusing; see TRM section 15.3 for details.)

[8] This code is intended to demonstrate the timer, not show the best way to generate pulses. If you just want to generate pulses, use the PWM or even a simple delay loop.

[9] You might wonder why you'd use the PRU polling interrupts rather than just polling a device register directly. The reason is you can test the R31 register in one cycle, but reading a device register takes about 3 or 4 cycles (read latency details).

[10] The library uses the convention that PRU0 polls on bit 30 and PRU1 polls on bit 31, but this is arbitrary. You could use both bits to signal one PRU, for instance.

[11] One complexity in the timer source code is the need to define all the register addresses. To figure out a register address, find the address of the register block in the PRU Local Data Memory Map (TRM 4.3.1.2). Then add the offset of the register (TRM 4.5). Note that you can also access these registers from the Linux host side, but the addresses are different. (The PRU is mapped into the host's address space starting at 0x4a300000, TRM table 2.4.)

Restoring YC's Xerox Alto: how our boot disk was trashed with random data

In the previous Xerox Alto restoration session, we got the disk working, but the system didn't boot. After much investigation, I discovered the explanation for the boot failure: the disk had been overwritten with random data! This article describes my journey through the Alto microcode to determine what happened.

Inserting a disk into the Xerox Alto's disk drive. The Alto's video display is visible at the back.

For background, the Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore it, along with Marc Verdiell, Luca Severini, Ron Crane, Carl Claunch and Ed Thelen (from the IBM 1401 restoration team). For posts on previous restoration days see 1, 2, 3, 4, 5 and 6.

Debugging the boot failure

Last session, after fixing a broken 7414 TTL chip on the disk interface board, we could fetch a block from disk but the Alto failed to boot. We used a logic analyzer to trace the microcode instructions and the ALU bus contents. Josh Dersch from the Living Computer Museum studied the traces and found that the boot program was executing a few instructions (jump, add, load), and then seemed to go off the rails. But it turns out things were more messed up than that.

I made a microcode trace browser to help figure out what was going on. With this program, I can step through an execution trace one micro-instruction at a time and see the corresponding source code line. (Click the image below for the live trace browser.) First, I examined the KWD (disk word task), which executes for each word from disk, and copies that word to memory. I verified that the disk read was working as expected. The second task of interest is the NOVEM (Nova emulator task), which runs a program. In our case, it runs the boot program as soon as it is loaded from disk. By examining this task, we can figure out what is going wrong with the boot process.

Xerox Alto microcode trace viewer. With the viewer, you can step through the execution trace collected by the logic analyzer and see each source code line as it is executed. The buttons on the right indicate which microcode task is running at each step.

By studying the disk read microcode (KWD) closely, I was able to extract each word in the disk sector from the logic analyzer trace. This was very difficult for many reasons. For example, we logged the ALU bus which doesn't have the words from disk. I had to figure out the disk contents by reversing the checksum computation, which was on the ALU bus. Another problem was the Alto stores sectors on disk backwards. But eventually I extracted the contents of the boot sector, as read into the Alto:

16a5 2d4a 5a94 b528 14db 29b6 536c a6d8
333b 6676 ccec e753 b02d 1ed1 3da2 7b44
...

I hand-disassembled these words into Data General Nova assembler code and discovered a few things. First, the first few instructions matched Josh's interpretation, so the CPU and the emulator task seemed to be working correctly. Second, the instructions didn't make any sense as code, and some words weren't even instructions, which explained why the boot rapidly fell apart. Third, and most puzzling, the instructions were nothing like what the Alto boot code was supposed to be.

Backplane of the Xerox Alto wired with logic analyzer probes. These probes monitor the executing micro-instructions and the contents of the ALU bus.

The boot block seemed to contain random junk. The problem wasn't flaky hardware generating bad data, because the block checksum validated correctly. This wasn't the drive returning the wrong sector, because the sector header was correct. The sector didn't contain instructions, it wasn't ASCII, and it didn't look like a sensible file format. As I studied the sector contents more, I wondered if the data was literally random. I made a histogram of how many times each byte value occurred, and it was pretty much uniform. (In comparison, archived Alto disk sectors showed very non-uniform distributions.) But why would the boot block have been overwritten with (pseudo-) random data?
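
The byte histogram mentioned above is simple to compute. The following is a minimal sketch in C rather than the tool actually used; the function names are hypothetical, and the sector is assumed to be available as an array of 16-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Count how often each byte value occurs in a block of 16-bit words.
 * Each word contributes its high byte and its low byte. */
void byte_histogram(const uint16_t *words, size_t nwords, unsigned counts[256])
{
    for (int i = 0; i < 256; i++)
        counts[i] = 0;
    for (size_t i = 0; i < nwords; i++) {
        counts[words[i] >> 8]++;
        counts[words[i] & 0xff]++;
    }
}

/* Largest count in the histogram. For uniformly distributed data this stays
 * close to the average; text or code usually shows a few tall spikes. */
unsigned max_count(const unsigned counts[256])
{
    unsigned max = 0;
    for (int i = 0; i < 256; i++)
        if (counts[i] > max)
            max = counts[i];
    return max;
}
```

Run over a full sector, a roughly flat histogram suggests random data, while files, text and programs tend to favor a small set of byte values.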

Josh mentioned DiEx (Diablo disk exerciser), a utility program to diagnose problems with the Alto's Diablo disk drive, and suggested that it could have wiped the disk. I found the DiEx source code in the Computer History Museum's Alto archive, and sure enough, it has a feature to write random data to the disk (and then verify it).

Screenshot of the Diablo Disk Exerciser (DiEx) running on a Xerox Alto simulator. Courtesy of Nathan Lineback, toastytech.com.

Screenshot of the Diablo Disk Exerciser (DiEx) running on a Xerox Alto simulator. Note the early mouse-based GUI; clicking on an entry changes the value. Image courtesy of Nathan Lineback.

I could believe someone had inconveniently wiped our disk with the DiEx utility, but I still had nagging doubts that maybe we were seeing a hardware issue. Could I prove that DiEx was responsible? All I had to do was show that the disk data wasn't arbitrary, but came from DiEx.

Generating random numbers on the Alto

I found the source code for RANDOM.ASM, the Alto's random number code, in the Computer History Museum's Alto archive. This algorithm generates 16-bit random numbers with the recurrence formula: "x[n] = (x[n-33] + x[n-13]) mod 2^16". (Note that these are very bad random numbers cryptographically since once you have 33 numbers in the sequence you can generate them all.) I wanted to see if the data we read from disk was generated from this function, so I coded up the algorithm. This was somewhat difficult as the original was written in Nova assembler code. The results didn't match the disk data, no matter what I tried. Finally, I realized that I could just use a brute force solution and ignore the details of the algorithm. I picked random pairs of values in the data and checked if their sum appeared in the data. If the data came from any sort of recurrence, I would get a bunch of matches, but I didn't. I concluded that the disk data wasn't generated from this random number algorithm.
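
For illustration, the recurrence quoted above can be written out in C. This is a sketch of the formula only, not a port of RANDOM.ASM: the seeding scheme and function names are assumptions, and the checker tests the fixed 33/13 lags directly rather than the random-pair search described above.

```c
#include <stddef.h>
#include <stdint.h>

#define LAG_LONG 33
#define LAG_SHORT 13

/* Fill out[0..n-1]: the first 33 values are copied from seed, and every
 * later value is x[n] = (x[n-33] + x[n-13]) mod 2^16. */
void lagged_sequence(const uint16_t seed[LAG_LONG], uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i < LAG_LONG)
            out[i] = seed[i];
        else
            out[i] = (uint16_t)(out[i - LAG_LONG] + out[i - LAG_SHORT]);
    }
}

/* Count the positions in data that satisfy the recurrence. Data produced by
 * this generator matches at every position from index 33 onward; unrelated
 * data matches only by coincidence. */
size_t count_recurrence_matches(const uint16_t *data, size_t n)
{
    size_t matches = 0;
    for (size_t i = LAG_LONG; i < n; i++)
        if (data[i] == (uint16_t)(data[i - LAG_LONG] + data[i - LAG_SHORT]))
            matches++;
    return matches;
}
```

The same loop shows why the generator is weak: any 33 consecutive outputs are enough to reproduce everything that follows.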

However, on closer examination I noticed that the RANDOM.ASM function signature didn't match the DiEx code, so it probably wasn't the right function. After more searching I found TriexML.asm, another Alto random number function. To generate a random 16-bit word, this algorithm simply shifts the previous value one bit to the left. If there is an overflow, the result is xor'd with the number 077213. (It would be hard to come up with a cryptographically worse random number generator—from one number you can generate the whole sequence—but the algorithm is very fast.)

To check the disk contents against this algorithm, I skipped the careful implementation and went straight to brute force. To see if any shift-and-xor algorithm would explain our data, I shifted each word from the disk sector and xor'd it with the next one. In each case, I got either 0 or octal 077213, matching the algorithm. Starting the algorithm with 012345 (the seed value in the code) eventually generates the exact sector of data we read, proving this algorithm generated the random data we saw on the disk.
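
The shift-and-xor check described above is short enough to sketch in C. The function names are hypothetical, and the sample array holds only the 16 words of the boot sector listed earlier in this article, not the full sector.

```c
#include <stddef.h>
#include <stdint.h>

#define XOR_CONST 077213u /* octal constant from the generator, 0x7E8B */

/* One generator step: shift left by one bit; if a 1 was shifted out of the
 * 16-bit word, xor the result with the constant. */
uint16_t shift_xor_next(uint16_t x)
{
    uint16_t shifted = (uint16_t)(x << 1);
    return (x & 0x8000u) ? (uint16_t)(shifted ^ XOR_CONST) : shifted;
}

/* Brute-force check: shift each word, xor it with the next word, and expect
 * either 0 or the constant. Returns the number of pairs that fail. */
size_t count_shift_xor_failures(const uint16_t *words, size_t n)
{
    size_t failures = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        uint16_t diff = (uint16_t)((uint16_t)(words[i] << 1) ^ words[i + 1]);
        if (diff != 0 && diff != XOR_CONST)
            failures++;
    }
    return failures;
}

/* The first 16 words of the boot sector, as listed earlier. */
const uint16_t boot_sector_sample[16] = {
    0x16a5, 0x2d4a, 0x5a94, 0xb528, 0x14db, 0x29b6, 0x536c, 0xa6d8,
    0x333b, 0x6676, 0xccec, 0xe753, 0xb02d, 0x1ed1, 0x3da2, 0x7b44
};
```

Calling count_shift_xor_failures(boot_sector_sample, 16) finds no failing pairs, and stepping shift_xor_next from 0x16a5 reproduces the listed words in order.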

A few of the old Xerox Alto disks in Xerox PARC's collection. Hopefully they haven't been overwritten with junk.

Thus, someone had clobbered our disk (probably decades ago) while testing the drive with DiEx. Since we couldn't boot off this disk, we'd need a new boot disk. Xerox PARC has dozens of old Alto disks lying around and they offered some of them to us. But the Living Computer Museum offered to send us a working Alto disk, rather than risk damage to the potentially-interesting contents of an old PARC disk, so we'll use the LCM disk instead.

Conclusion

Last repair session, we fixed a failed 7414 inverter chip on the disk interface board. With that fixed, we could read the disk but boot still failed. After careful investigation of the microcode and traces, I discovered that our disk had been overwritten with random data, making it impossible to boot from it. In one way this is a good result, since it means our boot wasn't failing because of a hardware problem.

When we get a new Alto disk, we'll try booting again. I'm moderately optimistic that the system will come up successfully, but there could be more hardware problems waiting for us. For updates on the restoration, follow kenshirriff on Twitter.

Thanks to Josh Dersch and the Living Computer Museum for their debugging help. Thanks to Tim Curley and Xerox PARC for supplying additional disks.

Restoring YCombinator's Xerox Alto day 6: Fixed a chip, data read from disk

In today's Xerox Alto restoration session we investigated why the disk drive isn't working and found a failed chip. With this chip repaired, we were able to read a block from disk, although the system still doesn't boot. (In previous episodes, we fixed the power supply, got the CRT display working, cleaned up the disk drive and hooked up a logic analyzer: days 1, 2, 3, 4 and 5.)

Our test setup for the Xerox Alto. The Alto computer itself is the metal cabinet in the center with the visible circuit boards. On the left is a vintage HP line printer, with the logic analyzer behind it. The video display for the Alto is visible on the right, behind the oscilloscope.

The Alto was a revolutionary computer, designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore the system, along with Marc Verdiell, Luca Severini, Ron Crane, Carl Claunch and Ed Thelen (from the IBM 1401 restoration team). Marc's video of this restoration session is below.

The missing disk sector task

In the Alto, like most modern computers, each machine instruction is implemented in an even more primitive form of code called microcode. But unlike most computers, the Alto also implements some of its low-level software in microcode. Part of the Alto's design philosophy was to use software (i.e. microcode) instead of hardware where possible. For instance, a microcode sector task processes each disk sector and a word task stores each word of data as it arrives from the disk drive; most computers do this with DMA hardware.

Last week we hooked a logic analyzer to the Alto to trace the executing microcode and found the disk sector task was failing to run. Each track on the Alto's hard disk is divided into 12 sectors, with 12 slots in the hub to indicate the sectors. We verified that the disk drive was detecting these slots and sending the sector pulses every 3.33 milliseconds. The disk sector task is supposed to run for each sector and perform any disk command, but the logic analyzer showed that this task was not running.

The hard disk pack for the Xerox Alto has 12 sectors. Slots cut into the disk hub trigger a signal for each sector. Four of the sector slots are labeled in the photo.

Why was the sector task not running? The disk interface board provides a signal to indicate when the sector task should run (WAKEST), but we found it was not being activated even though the disk drive was providing sector pulses to the disk interface board. Looking at the disk interface board schematic, the sector pulse circuit is fairly simple: just a few flip flops. (You don't need to understand the schematic below. The key point is the sector pulse comes in on the left, goes through a few chips, and the wakeup signal comes out on the right.) I've heard that old TTL flip flops fail regularly, so I figured one of the flip flop chips had failed. We decided to hook up an oscilloscope and see where things were going wrong, but one problem stood in our way.

Schematic from the Xerox Alto's disk controller card. This circuit processes sector pulses from the disk drive and generates signals to wake up the microcode sector task.

The extender card

The Alto consists of 13 circuit cards plugged into a wire-wrapped backplane, making them inaccessible to probing. Fortunately, the Living Computer Museum gave us an extender card, a board that goes between an Alto board and the backplane, physically extending the Alto board out of the cabinet where it can be diagnosed. Last week, we used the extender card to probe signals on the CPU control board. But no matter how hard I tried, I couldn't get the extender board to plug into the disk interface board's slot. Marc noticed that the board was hitting something, and we realized that the disk interface board had a notch on the right, allowing the board to clear a bar that was in the way. The extender board, like most of the Alto boards, lacked this notch. A bit more investigation revealed that memory boards had a notch, but on the left.

Why did some boards have notches? Most of the boards are powered with 5 volts. The memory boards also require -5 volts and +12 volts for the 4116 DRAM chips. The I/O boards (Ethernet and disk) have +/- 15 volts as well as 5 volts. The Alto backplane was apparently designed so you couldn't plug a board into a slot with the wrong voltages (which would have been catastrophic). Boards with unusual power requirements had a notch that allowed them to fit into slots wired with unusual voltages. The consequence was that we couldn't use the extender board with the disk interface without cutting a notch in it, which we did (see photo below).

Milling a notch into the extender board.

We were worried that by cutting a notch in the extender board and using it in a slot where it wasn't intended we might destroy the computer in a spectacular show of sparks and smoke. The concern was that the extender board doesn't simply pass the 162 lines through, but wires all the ground lines to a ground plane and wires the +5 lines together. If the disk interface card had +15 volts where the extender board expected, say, +5 volts, the extender card would run +15 volts to all the chips and destroy them. We verified the wiring five times to make sure nothing would get shorted, plugged in the extender board, and turned the Alto on with some trepidation. Fortunately our calculations were correct and nothing blew up.

Debugging the disk interface

The photo below shows the disk interface card extended out of the Alto cabinet, with some oscilloscope probes attached to the flip flop chips. (The ribbon cable attached to the board connects to the disk drive, while the ribbon cable hanging above the board allows us to probe microcode signals with the logic analyzer.) Strangely, we didn't see any signals either going into the flip flops or coming out. We checked that the sector pulses were showing up in the logic analyzer, and on the connector from the disk drive, but the flip flops were getting nothing. Eventually we turned our attention to the inverter chip (see earlier schematic). We saw the sector signal going into the inverter, but not coming out. Could this simple chip be causing the problems?

Debugging the disk interface card in the Xerox Alto.

The 7414 TTL chip contains 6 inverters, which turn a 1 input into a 0 output and vice versa. We pulled the chip out of the disk interface board and tested it with a simple LED circuit (see photo below). Five of the six inverters worked fine, but one of the inverters had entirely failed. The chip is a bit unusual since it uses a Schmitt trigger—a circuit that cleans up noisy signals (such as the sector pulses that traveled over a long cable from the disk drive)—so we couldn't get a replacement at Fry's or Radio Shack. Were we stuck for the day?

Testing the 7414 inverter chip from the Xerox Alto's disk interface card. One inverter was burnt out, preventing the disk from working.
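To illustrate why a Schmitt trigger matters for a signal arriving over a long cable, here is a minimal Python sketch comparing a plain inverter with a hysteresis inverter on a noisy pulse. It is only a toy model: the 1.7 V and 0.9 V thresholds are typical 7414 datasheet figures quoted from memory, while the 1.3 V single threshold and the sample voltages are made up for the example.

```python
# Toy model of an inverter with and without Schmitt-trigger hysteresis.
# Thresholds are typical 7414 values quoted from memory; treat them as
# illustrative. The 1.3 V single threshold and the samples are hypothetical.
V_RISING = 1.7   # input must rise above this before it counts as high
V_FALLING = 0.9  # input must fall below this before it counts as low


def plain_inverter(samples, threshold=1.3):
    """Invert each sample against one fixed threshold (no hysteresis)."""
    return [0 if v > threshold else 1 for v in samples]


def schmitt_inverter(samples, v_rising=V_RISING, v_falling=V_FALLING):
    """Invert with hysteresis: the state flips only at the far threshold."""
    out = []
    input_high = False  # assume the line starts out low
    for v in samples:
        if not input_high and v > v_rising:
            input_high = True
        elif input_high and v < v_falling:
            input_high = False
        out.append(0 if input_high else 1)
    return out


def count_edges(bits):
    """Count output transitions; a clean pulse produces exactly two."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


# One pulse with noise riding on its rising and falling edges (volts).
noisy_pulse = [0.2, 0.4, 1.2, 1.4, 1.2, 1.5, 1.9, 3.4,
               3.1, 3.5, 1.4, 1.2, 1.4, 0.7, 0.3]

plain_edges = count_edges(plain_inverter(noisy_pulse))
schmitt_edges = count_edges(schmitt_inverter(noisy_pulse))
print("plain inverter edges:  ", plain_edges)
print("schmitt inverter edges:", schmitt_edges)
```

In this model the plain inverter chatters as the noise crosses its single threshold, while the hysteresis version produces one clean pulse. A downstream flip flop clocked by the sector pulse would see spurious edges in the first case.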

Fortunately we could work around the faulty chip. Carl studied the schematics and discovered that one of the good inverters on the chip was unused. We rewired the chip to replace the bad inverter with the unused good inverter by using an ugly but effective "dead bug" hack. We bent out the pins from the good inverter and attached wires. We cut off the pins from the bad inverter. Finally, we stuck the wires into the socket along with the IC, so the good inverter was wired in place of the bad inverter.

We re-wired a 7414 inverter chip. An unused inverter replaced the failed inverter.

We booted the Alto and found that our chip hack actually worked, and the system got much further than before: the sector pulses got through the inverter, were processed by the flip flops, and triggered the sector task as we hoped. The sector task read the disk command from memory and sent it to the disk drive. The disk drive read the desired sector and started sending bits back. For each word, the disk word task read the word from the disk interface and stored it in memory. In summary, we were now reading data from disk!

Reading data from disk was a big milestone, since most of the system needed to be working properly for this to happen. Unfortunately the Alto didn't boot up, and we'll need to figure out where things went wrong. Is the boot block not running correctly? Is the read data corrupted? Is the disk returning an error at some point? Is our disk not a boot disk? Strangely, there was no sign of the parity errors we kept seeing last week.

The timeline diagram below shows task switching in the Alto over an interval of 700 microseconds. You can see that the microcode is constantly switching between tasks. Today's accomplishment can be seen in the periodic execution of the disk word task (KWT) at the bottom of the image; this task runs about every 9 microseconds, as each word arrives from the disk drive. The disk sector task (KSEC) runs at the start of the next sector (at which time the word task stops). Other tasks are the memory refresh task (MRT) and cursor task (CURT), which run periodically. (You can see where the higher-priority MRT task interrupted the KSEC task.) The lowest-priority task is the Nova emulator (NOVEM), which runs program code when nothing else is happening. The numbers at the bottom show the micro-instruction count since boot; at this point we are 14.8 milliseconds into the boot process. I generated the diagram below by processing the logic analyzer output to show each running task. An interactive version is here, allowing zoom and pan with the mouse.

Timeline showing task switching on the Xerox Alto. These are microcode tasks switched by hardware, not operating system level processes or threads.
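The basic idea of turning a logic analyzer trace into a task timeline can be sketched in a few lines of Python. This is not the script used for the diagram, just a minimal sketch. It assumes the trace has already been decoded into one task name per micro-instruction cycle and that each cycle takes 170 nanoseconds (the Alto's nominal micro-instruction time). The sample trace and the function names are hypothetical.

```python
from itertools import groupby

CYCLE_NS = 170  # assumed micro-instruction cycle time in nanoseconds


def task_runs(samples):
    """Collapse per-cycle task names into (task, start_cycle, length) runs."""
    runs = []
    cycle = 0
    for task, group in groupby(samples):
        length = sum(1 for _ in group)
        runs.append((task, cycle, length))
        cycle += length
    return runs


def format_timeline(runs):
    """Render each run as a text row: start time, task name, bar of cycles."""
    rows = []
    for task, start, length in runs:
        start_us = start * CYCLE_NS / 1000
        rows.append(f"{start_us:8.2f} us  {task:<6} {'#' * length}")
    return rows


# Hypothetical decoded trace: one task name per micro-instruction cycle.
trace = (["NOVEM"] * 5 + ["KWT"] * 3 + ["NOVEM"] * 4 +
         ["KSEC"] * 2 + ["MRT"] * 2 + ["KSEC"] * 3)

runs = task_runs(trace)
for row in format_timeline(runs):
    print(row)
```

Grouping consecutive cycles by task gives one bar per run, so an interruption such as MRT preempting KSEC shows up as KSEC split into two runs with MRT in between.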

Conclusion

In today's repair session, we found a failed 7414 inverter chip that was preventing disk operation. By working around that issue, we could finally read from disk, but boot is still failing for unknown reasons. Nonetheless, today's session got us much closer to a working system. We'll need to dig through the logic analyzer output to figure out where the boot process is breaking down.