Showing posts with label alto. Show all posts
Showing posts with label alto. Show all posts

The Xerox Alto, Smalltalk, and rewriting a running GUI

Be sure to read the comment from Alan Kay at the bottom of the article!

We succeeded in running the Smalltalk-76 language on our vintage Xerox Alto; this blog post gives a quick overview of the Smalltalk environment. One unusual feature of Smalltalk is you can view and modify the system's code while the system is running. I demonstrate this by modifying the scrollbar code on a running system.

Smalltalk is a highly-influential programming language and environment that introduced the term "object-oriented programming" and was the ancestor of modern object-oriented languages.1 The Alto's Smalltalk environment is also notable for its creation of the graphical user interface with the desktop metaphor, icons, scrollbars, overlapping windows, popup menus and so forth. When Steve Jobs famously visited Xerox PARC, the Smalltalk GUI inspired him on how the Lisa and Macintosh should work.2

Our Xerox Alto running Smalltalk-76.

Our Xerox Alto running Smalltalk-76.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, high-resolution bitmapped displays, WYSIWYG editors, Ethernet, the optical mouse and laser printers to the world, among other things. I've been restoring an Alto from YCombinator, along with Marc Verdiell, Carl Claunch My full set of Alto posts is here and Marc's extensive videos of the restoration are here.

The desktop

Smalltalk introduced the desktop metaphor used in modern computing.3 It included overlapping windows4, multiple desktops and pop-up menus. These windows could be moved and resized with the mouse. (The biggest missing desktop feature was desktop icons, which Xerox later introduced in the Star computer.) To understand how revolutionary this was, consider that the Apple 1 microcomputer came out in 1976, displaying 24 lines of 40 uppercase characters. The mainframe and minicomputer worlds were focused around punched cards, line printers, Teletypes, and dumb CRT terminals. Alan Kay changed all this with his vision of a computer desktop with windows that could be directly manipulated, windows containing fancy typography or images.

Smalltalk introduced the desktop environment, with overlapping windows for multiple applications.

Smalltalk introduced the desktop environment, with overlapping windows for multiple applications.

The screenshot above shows the Smalltalk-76 desktop. At the bottom, a drawing program displays the Smalltalk elf image.5 Icons allow selection of drawing mode, line style, brush color (grayscale), and so forth. The Smalltalk class browser is in the middle. In the upper right is a file viewer. Finally, in the upper left is a window for entering Smalltalk statements, essentially a shell or REPL.

Dynamically changing the running system

One of the most interesting things about the Smalltalk environment is that all the Smalltalk code can be examined and modified, even while the system is running. The class browser below lets you select (using the mouse) a functional area such as "Basic Data Structures" or "Files". You can then select a class in that area, functionality within the class, and then a particular method. The browser then displays the code running on the system. All the Smalltalk code can be examined in this way, making the system's implementation very transparent.

Using the Smalltalk class browser, we can view the code to show a ScrollBar.

Using the Smalltalk class browser, we can view the code to show a ScrollBar.

In the screenshot above, I use the class browser to access "Panes and Menus", then "ScrollBar", then "Image" and finally "show". This displays the code for the scrollbar's "show" method, which draws the scrollbar. This code draws a black rectangle, and then insets a white rectangle, resulting in a black-bordered rectangle for the scrollbar. (Note the unusual dotted-circle character ☉ that Smalltalk-76 uses to create a Point from two Numbers.)

The unusual feature of the class browser is that you can use it to change the system's code, while the system is running, and everything will pick up the new code. For example, I can change how scrollbars are drawn. In the screenshot below, I changed clear: white to clear: black. Pressing the middle mouse button pops up a menu, and I can select "compile". This causes the scrollbar code to be recompiled—while the system is still running. (Note the modern appearance of the contextual pop-up menu.)

After changing the code, I selected "compile" from the pop-up menu.

After changing the code, I selected "compile" from the pop-up menu.

The result of this change is that all scrollbars (for existing or new windows) will now have a black background, as you can see below. The key point is this change was made while the system was running; there is no need to restart anything. Even existing windows get the new scrollbars.

Scrollbars now appear with a black background, even for existing windows.

Scrollbars now appear with a black background, even for existing windows.

Although this scrollbar change was rather trivial, major changes can be made to the running Smalltalk system. One well-known story of changing Smalltalk's behavior on the fly is from Steve Jobs' visit to Xerox PARC. Steve Jobs didn't like the way the window scrolled line-by-line, so Smalltalk implementer Dan Ingalls rewrote the scrolling code in a minute and implemented smooth scrolling while the system was running, much to Jobs' surprise.6

A closer look at some Smalltalk code

In Smalltalk, even most of the math functions are written in Smalltalk. For instance, suppose we wonder how square roots are computed. We can look at the square root function in the class browser by going to "Numbers", "Float", "Math functions", "sqrt". This brings up the seven lines of code below for the square root function. We can see that the code uses five iterations of Newton's method to approximate the square root.

Looking at the square root code in the Smalltalk-76 class browser.

Looking at the square root code in the Smalltalk-76 class browser.

If you're used to Java or C++, this object-oriented code may look strange, especially since Smalltalk uses some strange characters. The first line of code above defines local variables guess and i. In the next line, the right arrow ⇒ implements an "if" statement. If the number receiving the square root message (self) is 0, then 0 is returned (via the up arrow ⇑ return symbol) and if it is negative an exception is raised. The square brackets define blocks, similar to curly braces in C. The instfield line is a bit tricky; it pulls the exponent out of the floating point number and divides it by 2, yielding a reasonable starting point for the square root. Finally, the for loop applies Newton's method 5 times and returns the result. Note the unusual double colon symbol ⦂; this delays evaluation of the argument, so it can be evaluated multiple times in the loop.7

You might think that executing Smalltalk code for math operations would be very slow, and that is the case. The good news is that basic operations such as addition are implemented with short cuts, rather than full message passing. But other operations are slow; the team described performance as between "majestic" and "glacial". Xerox PARC ended up creating the Dorado, a much faster minicomputer than the Alto, to improve performance.

Conclusion

Although Smalltalk wasn't the first object-oriented programming language, Smalltalk introduced the term object-oriented programming and was very influential in later object-oriented programming languages. Most modern object-oriented languages, from Objective-C and Go to Java and Python, show the influence of Smalltalk. Smalltalk was also responsible for design patterns. The famous "Gang of Four" Design Patterns book describes design patterns in Smalltalk and C++.8

Smalltalk systems are still in use. Smalltalk-76 (and the earlier 71 and 72 versions) were intended for research, but Xerox released the Smalltalk-80 version; it was licensed by Xerox to HP, Apple, Tektronix and DEC for royalty-free distribution. Smalltalk-80 in turn led to modern Smalltalk systems such as Pharo, GNU Smalltalk and Squeak (which led to the Scratch language for children).

If you want to try Alto Smalltalk out for yourself, you can use the Contralto emulator, built by the Living Computers Museum. I explain how to run it here. (Most of the screenshots above are from Contralto rather than the live Alto, for clearer images.) Be warned, however, that Smalltalk on the Alto (live or emulated) is painfully slow.

Follow me on Twitter: @kenshirriff to find out about my latest blog posts. I also have an RSS feed.

Notes and references

  1. Smalltalk was developed on the Xerox Alto by Alan Kay, Dan Ingalls, Adele Goldberg and others. 

  2. The details of Steve Jobs' visits to Xerox PARC are highly controversial but the description in Dealers of Lightning seems most accurate. 

  3. Englebart's Mother of All Demos was fundamental to the development of the GUI, introducing the mouse and windows, among other things. This demo had a large influence on Xerox PARC. 

  4. For performance reasons, only the foreground window was active and background windows were not redrawn. 

  5. Screenshots of Smalltalk-76 very often include the lute-playing elf image, so I tracked down the image and figured out how to display it. The command is BitRect new fromuser; filin: 'elf'; edit. This creates a new BitRect object and gets the dimensions from the user. Then it sends a filin: 'elf' message to the BitRect, which performs file input from the elf.pic file. Finally, an edit message is sent to the BitRect, causing it to be opened in the image editor. 

  6. For details on Dan Ingalls' live implementation of smooth scrolling in Smalltalk-78, see this page. The story is also in Dealers of Lightning, page 341. (Dealers of Lightning is the best book I've come across describing Xerox PARC and the development of the Alto.) 

  7. If you want more information on Smalltalk-76, The Smalltalk-76 Programming System Design and Implementation provides a good overview. The document How to use the Smalltalk-76 system gives details on running Smalltalk on the Alto. Unfortunately, the special characters don't appear, making the document slightly cryptic. See Alan Kay's The Early History of Smalltalk for a detailed discussion of Smalltalk's background and history. 

  8. People often think of design patterns as a Java thing, but the Design Patterns book was published in 1994, prior to Java's release in 1995. The original design patterns referred to C++ and Smalltalk. 

Inside the vintage Xerox Alto's display, a tiny lightbulb keeps it working

In this Alto restoration episode, we repaired a second CRT display, exercising our TV repair skills and discovering a tiny mysterious lightbulb that caused the display to fail in a strange way. For those just tuning in, the Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, high-resolution bitmapped displays, WYSIWYG editors, Ethernet and laser printers to the world, among other things.

The YCombinator Xerox Alto, running a Mandelbrot set program I wrote in BCPL.

The YCombinator Xerox Alto, running a Mandelbrot set program I wrote in BCPL.

I've been restoring an Alto from YCombinator, along with Marc Verdiell, Carl Claunch and Luca Severini. Since we have the YCombinator Alto working (above), we've been trying trying to get a second Xerox Alto system running; this one is from DigiBarn. My full set of Alto posts is here and Marc's videos are here.

Inside the Alto's display

When we tried to connect the DigiBarn display to the Alto, we ran into a problem—it had an incompatible connector. Looking inside the display (below), we were surprised to find this connector led to a circuit board with a 6502 microprocessor; since the Alto came out in 1973 and the 6502 in 1975, this didn't make sense. After some investigation, I determined that although the display looked like an Alto display, it was actually for a Dorado, a Xerox minicomputer from 1979 that followed the Alto.

Inside the Xerox Alto's display. With the cover removed, the CRT and monitor circuitry are visible. The 7-wire interface board is at bottom-left. The Alto itself is in the cabinet under the display.

Inside the Xerox Alto's display. With the cover removed, the CRT and monitor circuitry are visible. The 7-wire interface board is at bottom-left. The Alto itself is in the cabinet under the display.

The Dorado was much faster than the Alto because the Dorado used ECL chips, the same technology used in the Cray-1 supercomputer. Unfortunately, since ECL chips used a lot of power and needed powerful cooling fans, the Dorado was too hot and noisy to use in an office. Putting a soundproof enclosure around the Dorado didn't work, so Xerox ended up putting Dorados in machine rooms. The user's keyboard, mouse and display was connected to the remote Dorado with a special cable using the "7-wire protocol" that Xerox invented for this purpose. The 6502 board was the interface for this protocol. 5

To connect the Alto to this display, I built an adapter cable that bypassed the 7-wire board. We hooked up the monitor, powered it on, and were greeted with a empty black screen. Given the age of the monitor, we weren't surprised that it didn't work. However, when we powered off the monitor, we saw a perfect image for a fraction of a second before the image collapsed to a dot and disappeared. This was unexpected—the monitor didn't work at all when powered on, but worked fine (but very briefly) when turned off! What could be going on?

How a CRT monitor works

Since many readers may not be familiar with how CRTs work, I'll take a brief detour and explain how CRTs work. The cathode ray tube (CRT) ruled television displays from the 1930s until LCD displays took over about 10 years ago.2 In a CRT, an electron beam is shot at a phosphor-coated screen, causing a spot on the screen to light up. The beam scans across the screen, left to right and top to bottom in a raster pattern. The beam is turned on and off, generating a pattern of dots on the screen to form the image. The horizontal scan will turn out to be very important to this repair.

The Xerox Alto's display uses an 875-line raster scan. (For simplicity, I'm ignoring interlacing of the raster scan.)

The Xerox Alto's display uses an 875-line raster scan. (For simplicity, I'm ignoring interlacing of the raster scan.)

The cathode ray tube is a large vacuum tube containing multiple components, as shown below. A small electrical heater, similar to a light bulb filament, heats the cathode to about 1000°C. The cathode is an electrode that emits electrons when hot. A control grid surrounds the cathode; putting a negative voltage on the grid repels the negatively-charged electrons, reducing the beam strength and thus the brightness. The next grid has about 800 volts on it, attracting and accelerating the electrons. The focus grid, at about 600 volts, squeezes the electron beam to form a sharp spot. The anode is positively charged to a high voltage (17,000V), accelerating the electrons to hit the screen with high energy. The screen is coated with a phosphor, causing it to glow where hit by the electron beam. Finally, two electromagnets are arranged on the neck of the tube to deflect the beam horizontally and vertically in the raster scan pattern shown earlier; these are the deflection coils.

Diagram of a Cathode Ray Tube (CRT). Based on drawings by Interiot and Theresa Knott (CC BY-SA 3.0)

Diagram of a Cathode Ray Tube (CRT). Based on drawings by Interiot and Theresa Knott (CC BY-SA 3.0)

The photo below shows the Cathode Ray Tube (CRT) inside the Alto's monitor, with the screen at the right. In the center, the red deflection coils are mounted on the neck of the tube. The thick red wire provides 17,000 volts to the anode. This high voltage is generated by the flyback transformer, the UFO-like gray disk in the lower left.

Inside the Xerox Alto's display, the CRT picture tube is visible. The thick red wire provides 17,000 volts from the flyback transformer to the tube's anode. The other red wire is connected to the 500MΩ 6W bleeder resistor, the large white cylinder at the left. Note the dirt and debris on the flyback transformer from the system's storage in a barn.

Inside the Xerox Alto's display, the CRT picture tube is visible. The thick red wire provides 17,000 volts from the flyback transformer to the tube's anode. The other red wire is connected to the 500MΩ 6W bleeder resistor, the large white cylinder at the left. Note the dirt and debris on the flyback transformer from the system's storage in a barn.

Back to the broken display

Why would the display work for a moment, just as it is powered off? It must have something to do with voltage levels dropping as the power supplies shut down—something that wasn't working at full voltage, but worked at a lower voltage. One theory was that one of the CRT grids might have the wrong voltage. Since electrons are negative, they are attracted to positive voltages (such as the 17,000 volt anode) and repelled by negative voltages. If a grid was too negative, the electron beam could be blocked. Perhaps as the power supplies shut down, the negative grid problem briefly resolved itself.3

We opened up the display and measured some voltages, taking extreme care around the high voltages. Verifying the 17,000 V supply was easy; with a voltage this high, waving an oscilloscope probe a few inches away is sufficient to pick up a signal (below). The main 55V supply was also good. But when we checked the grid voltages, we didn't get anything.

The service manual shows the waveform you can pick up two inches away from the flyback transformer.

The service manual shows the waveform you can pick up two inches away from the flyback transformer.

The grid voltages and 17KV supply are generated by the flyback transformer. Since we saw the 17KV signal, we knew the horizontal deflection circuit and the flyback transformer were working. Perhaps a capacitor had failed, but we didn't find any bad ones. On the schematic we noticed a tiny lightbulb in the high-voltage circuit, an unexpected circuit element. We measured the bulb's resistance on the board (below) and found it was open. We figured the bulb must have burned out, but after removing it we discovered that instead one of the bulb's leads had broken off right at the glass case.

The bulb is visible in front of the right side of the transformer.

The bulb is visible in front of the right side of the transformer.

The service manual for the monitor called the bulb a "No. 1764." I was afraid that this was an internal part number and we wouldn't be able to determine the correct replacement bulb. However, Google revealed that this was a 28V 0.04A miniature bulb, sold by many vendors. Unfortunately we couldn't find any local stores that sold this bulb and we wanted to test out a fix immediately. So Marc performed some precision microscope soldering to reattach the broken wire. Since the wire had broken off right at the glass, reattaching it was very difficult but he succeeded. We re-installed the bulb and the display worked fine!

Why is there a bulb inside the power supply? I assume that it is used as a current limiter. Bulbs have very low resistance when cold, but increase resistance as they warm up. It seems crazy to subject a 28 V bulb to pulses of 600 volts, but since the pulses are only a few microseconds, the bulb survives them just fine.

The tiny bulb inside the display's power supply.

The tiny bulb inside the display's power supply.

Details on the power supply

The high-voltage power supply is described in the monitor service manual, but I'll give a brief summary here.4

The primary purpose of the horizontal sync circuit is to create a sawtooth current through the horizontal deflection coil to scan left-to-right across each row of the screen. A common trick in TVs is to use the high-frequency (26 kHz) horizontal sync to generate high voltages. To do this, the horizontal sync circuit sends high-current pulses (2-3 amps) through the flyback transformer. This step-up transformer produces the 17 kilovolts required by the CRT's anode. A second transformer winding produces -100 volts, while the third winding is used to generate 600V and 1000V. (Interestingly, cell phone chargers also use flyback transformers, but obviously at much lower voltages.)

The photo below shows the flyback transformer (left). The thick black wire at the bottom of the photo connects the 17KV from the transformer to the picture tube, while the colorful wires at the top provide the lower voltages.

Flyback transformer inside the monitor. The large white cylinder is a 500 megaohm, 6W resistor. You don't usually see such a high resistance combined with a high wattage, but the resistor bleeds off the high voltage from the CRT.

Flyback transformer inside the monitor. The large white cylinder is a 500 megaohm, 6W resistor. You don't usually see such a high resistance combined with a high wattage, but the resistor bleeds off the high voltage from the CRT.

The schematic below indicates the key components of the power supply. The switching transistor is driven by the horizontal sync input. When it switches on, current (red line) builds up in the flyback transformer, storing energy in the transformer's magnetic field. When the transistor switches off, this stored energy is released into the secondary windings, producing 17KV for the anode and -100V for the brightness grid. In addition, a 600V pulse is created across the primary. The pulse (yellow line) flows through the bulb and a diode, generating 600V for the focus grid. The voltage doubler circuit (circled in pink) generates 1000V for the second (accelerator) grid.

The high-voltage power supply is driven by the horizontal deflection circuit.
The switching transistor puts 55 volts across the flyback transformer. When it switches off, the flyback transformer produces 17 kilovolts for the CRT anode, as well as powering the 600V, 1000V and -100V supplies.

The high-voltage power supply is driven by the horizontal deflection circuit. The switching transistor puts 55 volts across the flyback transformer. When it switches off, the flyback transformer produces 17 kilovolts for the CRT anode, as well as powering the 600V, 1000V and -100V supplies.

Why did the display originally show a picture for a moment as it was powered off? With the bulb not working, the 800V acceleration grid and the focus grid didn't receive any voltage, but the brightness grid was still powered (since -100V comes from a different winding). My theory is that without the attraction from the acceleration grid, electrons couldn't get past the negative brightness grid. But when the brightness grid lost power, the electron beam was no longer blocked and could reach the screen, until everything else shut down moments later.

Conclusion

Cathode ray tubes were the dominant display technology until LCD displays took over about 10 years ago. Now, CRT TV repair is a retro activity, involving circuits such as horizontal deflection, video amplifiers, and high-voltage flyback transformers that were formerly well-known but are now more obscure.

We tracked down the display's problem to a tiny light bulb, an unusual component to find in a critical role in a high-voltage power supply. Surprisingly, despite being exposed to 600 volt pulses, the problem with this 28 volt bulb wasn't that it had burnt out, but that one of its tiny leads had broken. After repairing the bulb, the display worked fine. Unlike our previous display which had a very faint CRT, this one produced a crisp, bright image. Since we got the display working and didn't get any high-voltage shocks, I consider this a successful restoration session.

The repaired display shows a test pattern, generated by the crttest program. The screen is bright and sharp, but the horizontal centering still needed adjustment.

The repaired display shows a test pattern, generated by the crttest program. The screen is bright and sharp, but the horizontal centering still needed adjustment.

Thanks to the Living Computer Museums + Lab for providing the display test board. Thanks to Al Kossow and Bitsavers for the scanned service manual.

Notes and references

  1. For more on the Alto's monitor, see my article from last year about restoring our first display

  2. A computer monitor is essentially a television set, but without the tuner to select the desired channel from the antenna. In addition, televisions have circuitry to extract the horizontal sync, vertical sync and video signals from the combined broadcast signal. These three signals are supplied to the Alto monitor separately, simplifying the circuitry. 

  3. I should admit that Marc and Carl had the right theory about the problem. My theory that the video input voltage might be too high didn't pan out. 

  4. I didn't discuss the 55V supply that powers the monitor. It is a straightforward regulated linear power supply driven from the 120V AC input. I also also didn't explain how the horizontal deflection coil operates. it is driven by the same transistor as the flyback transformer, but uses a fairly complex inductor-capacitor resonance circuit to generate the scan across the screen. (The scan current is a sawtooth; a smooth scan left-to-right followed by a rapid retrace back to the left.) For a thorough discussion of how the display's power supply works, see page 3-4 of the service manual

  5. The 7-wire interface used a 15-wire cable with standard DB-15 (serial port) connectors. It sent seven ECL signals as differential pairs, and used the remaining wire for ground. Calling it "7-wire" seems a bit misleading, since it used 15 wires in total. The board schematic is in the Dorado schematics page 159. The video signal was multiplexed across four of the signals; this reduced the bandwidth requirement by a factor of four. One signal was serial data; this transmitted the keyboard and mouse information. The remaining two signals were (apparently redundant) clocks. The protocol supported daisy-chaining, so multiple peripherals (such as a printer) could be connected.

    This 7-wire Terminal Interface board was used by Xerox PARC to connect a keyboard/display/mouse to a remote Dorado minicomputer

    This 7-wire Terminal Interface board was used by Xerox PARC to connect a keyboard/display/mouse to a remote Dorado minicomputer

    The photo above shows the 7-wire terminal interface board inside the display. The large chips in the upper right are the 6532 "RIOT" I/O chip, the 6502 microprocessor, and a 2716 EPROM holding the code. The remaining chips are a mixture of TTL and ECL. At the bottom are connectors for 7-wire in, 7-wire out, and the keyboard. The connector to the monitor itself is in the upper center. 

Bitcoin mining on a vintage Xerox Alto: very slow at 1.5 hashes/second

I've been restoring a Xerox Alto minicomputer from the 1970s and figured it would be interesting to see if it could mine bitcoins. I coded up the necessary hash algorithm in BCPL (the old programming language used by the Alto) and found that although the mining algorithm ran, the Alto was so slow that it would take many times the lifetime of the universe to successfully mine bitcoins.

Bitcoin mining on a vintage Xerox Alto computer.

Bitcoin mining on a vintage Xerox Alto computer.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced high-resolution bitmapped displays, the GUI, Ethernet and laser printers to the world, among other things. In the photo above, the Alto computer is in the lower cabinet. The black box is the 2.5 megabyte disk drive. The Alto's unusual portrait display and an early optical mouse are on top.

How Bitcoin mining works

Bitcoin, a digital currency that can be transmitted across the Internet, has attracted a lot of attention lately. The Bitcoin system can be thought of as a ledger that keeps track of who owns which bitcoins, and allows them to be transferred from one person to another. The revolutionary feature of Bitcoin is there's no central machine or authority keeping track of things. Instead, the "blockchain" is stored across thousands of machines on the Internet, and the system works with nobody in charge.

To ensure everyone agrees on which transactions are valid, Bitcoin uses a process called mining—about every 10 minutes a block of outstanding transactions is mined, which makes the block "official". Bitcoin mining is designed to take an insanely huge amount of computational effort to mine a block, so nobody can take over the mining process. Miners compete against each other, generating trillions and trillions of "hashes" until someone finds a lucky one that succeeds in mining a block. It's hard to visualize just how difficult the hashing process is: finding a valid hash is less likely than finding a single grain of sand out of all the sand on Earth.

Bitcoin mining is based on cryptography, with a "hash function" that converts a block into an essentially random hash value. If the hash starts with 17 zeros,1 the block is successfully mined and is sent into the Bitcoin network. Most of the time the hash isn't successful, so the miner will modify the block slightly and try again, over and over billions of times. About every 10 minutes someone will successfully mine a block, and the process starts over. It's kind of like a lottery, where miners keep trying until someone "wins".

As a side-effect, mining adds new bitcoins to the system. For each block mined, miners currently get 12.5 new bitcoins (currently worth about $30,000) as well as fees, which encourages miners to do the hard work of mining blocks. With the possibility of receiving $30,000 every 10 minutes, miners invest in datacenters full of specialized mining hardware using huge amounts of electricity2.

Structure of a Bitcoin block

Structure of a Bitcoin block. The data in yellow is hashed to yield the block hash, which becomes the identifier for the block. The block is linked to the previous block by including the previous block's hash, forming the blockchain. The Merkle root is a hash of all the transactions in the block.

The diagram above shows what actually goes into a block that is mined. The yellow part is the block header (which gets hashed), and it is followed by the transactions that go into the block. Each block contains the hash of the previous block, causing all the blocks to be linked together forming the blockchain. You can see that for the block above, the hash is successful because it starts with lots of zeros: 0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50. The "Merkle root" is a hash of all the transactions that go into the block; this ensures that none of the mined transactions can be changed. The nonce is an arbitrary number; each attempt at mining the block changes the nonce.

To summarize the mining process: you collect new Bitcoin transactions and create a header (as in the diagram above). You do the cryptographic hash of the block. If by some incredible chance the result starts with 17 zeros you send the block into the Bitcoin network and "win" $30,000 in bitcoin. Otherwise, you change the nonce and try again. Probably none of the nonce values will work, so you change something else in the header and start over. If someone else succeeds in mining the block, you start over with the new previous block hash and new transactions.

I've simplified a lot of details above. For in-depth information on Bitcoin and mining, see my articles Bitcoins the hard way and Bitcoin mining the hard way.

The SHA-256 hash algorithm used by Bitcoin

Next, I'll discuss the hash function used in Bitcoin, which is based on a standard cryptographic hash function called SHA-256.3 The SHA-256 algorithm is so simple you can literally do it by hand, but it manages to scramble the data entirely unpredictably. The SHA-256 hash algorithm takes input blocks of 512 bits (i.e. 64 bytes), combines the data cryptographically, and generates a 256-bit (32 byte) output.

The SHA-256 algorithm consists of a simple round repeated 64 times. The diagram below shows one round, which takes eight 4-byte inputs, A through H, performs a few operations, and generates new values for A through H. As can be seen from the diagram above, only A and E are changed in a round, while the others are just shifted over. Even so, after 64 rounds the input data will be completely scrambled, generating the unpredictable hash output.

The SHA-256 algorithm is pretty simple, about a page of pseudocode and can be easily implemented on a computer, even one as old as the Alto, using simple arithmetic and logic operations.5

SHA-256 round, from Wikipedia

SHA-256 round, from Wikipedia created by kockmeyer, CC BY-SA 3.0.

The dark blue boxes mix up the values in non-linear ways that are hard to analyze cryptographically. (If you could figure out a mathematical shortcut to generate successful hashes, you could take over Bitcoin mining.) The Ch "choose" box chooses bits from F or G, based on the value of input E. The Σ "sum" boxes rotate the bits of A (or E) to form three rotated versions, and then sum them together. The Ma "majority" box looks at the bits in each position of A, B, and C, and selects 0 or 1, whichever value is in the majority. The red boxes perform 32-bit addition, generating new values for A and E. The input data enters the algorithm through the Wt values. The Kt values are constants defined for each round.4

Implementing SHA-256 in BCPL

I implemented SHA-256 in BCPL, a programming language that was a precursor to C. It's a lot like C with syntax changes, except the only type is 16-bit words. My SHA-256 code is in sha256.bcpl. The snippet below (the choose function) will give you an idea of what BCPL looks like. Each value is two words; BCPL does array access with !1 instead of [1]. Like C++, comments are indicated with a double slash. Unlike C, BCPL uses words for xor and not.

   // ch := (e and f) xor ((not e) and g)
   ch!0 = (e!0 & f!0) xor ((not e!0) & g!0)
   ch!1 = (e!1 & f!1) xor ((not e!1) & g!1)

The mining is done in bitcoin.bcpl: it creates a Bitcoin header (from hardcoded values), substitutes the nonce, and calls the SHA-256 code to hash the header twice. One interesting feature of the code is the structure definition for the Bitcoin header in BCPL (below), similar to a C struct. It defines a two word field for version, 16 words for prevHash, and so forth; compare with the Bitcoin structure diagram earlier. Interestingly, ^1,16 indicates an array with indices from 1 to 16 inclusive. BCPL is not 0-indexed or 1-indexed, but lets you start array indices at arbitrary values.7

structure HEADER:
[
version^1,2 word
prevHash^1,16 word
merkleRoot^1,16 word
timestamp^1,2 word
bits^1,2 word
nonce^1,2 word
]

The line shows how structures are accessed in BCPL; it initializes one word of the header, using the slightly strange BCPL syntax. >>HEADER sort of casts the header variable to the HEADER structure described earlier. Then .prevhash^1 accesses the first word of the prevhash field. Also note that #13627 is an octal value; BCPL inconveniently doesn't include hex constants.6

header>>HEADER.prevHash^1 = #13627

The screenshot below shows the output as the program runs. The number on the left is each nonce in sequence as it is tested. The long hex number on the right is the resulting hash value. Each nonce results in a totally different hash value, due to the cryptographic hash algorithm.

On the Alto screen, each line shows a nonce value and the resulting hash.

On the Alto screen, each line shows a nonce value and the resulting hash.

Performance

The Alto can hash about 1.5 blocks per second, which is exceedingly slow by Bitcoin standards. At this speed, mining a single block on the Alto would take about 5000 times the age of the universe The electricity would cost about 2x10^16 dollars. And you'd get 12.5 bitcoins (₿12.5) worth about $30,000. Obviously, mining Bitcoin on a Xerox Alto isn't a profitable venture.

In comparison, a USB stick miner performs 3.6 billion hashes per second. The Alto cost about $12,000 to manufacture in 1973 (about $60,000 in current dollars), while the stick miner costs $200. The Alto used hundreds of watts, while the stick minter uses about 4 watts. The enormous difference in performance is due to huge increase in computer speed since the 1970s described by Moore's law as well as the giant speed gain from custom Bitcoin mining hardware.

The Alto wasn't a particularly fast machine, performing about 400,000 instructions per second. The Alto's instruction set lacks many of the operations you'd find on a modern processor. For instance, the SHA-256 algorithm makes heavy use of Boolean operations including exclusive-OR and OR. These are pretty basic instructions that you'd find on even something as primitive as the 6502, but the Alto doesn't have them. Instead, these operations are implemented with an inefficient subroutine call that does a sequence of operations with the same effect.

In addition, SHA-256 heavily uses bit shift and rotate operations. Modern processors typically have a "barrel shifter" that lets you shift by as many bits as you want in one step. The Alto's shift instructions, on the other hand, only shift a single bit. Thus, to shift by, say 10 bits, the Alto code calls a subroutine that performs 10 separate shift instructions. The result is a shift operation is much slower than you might expect.

You can see the Alto's arithmetic-logic board below. The Alto didn't use a microprocessor but instead built a CPU from simple TTL chips. You can see that even providing single-bit shifts required 8 separate chips—it's not surprising that the Alto doesn't do more complex shift operations.

The Alto doesn't have a microprocessor, but a CPU built from individual TTL chips. The ALU board has chips for arithmetic, chips for shifting, and chips for registers.

The Alto doesn't have a microprocessor, but a CPU built from individual TTL chips. The ALU board has chips for arithmetic, chips for shifting, and chips for registers.

I should point out that I'm not trying to write the best possible mining code for the Alto, and there are plenty of optimizations that one could do.8 For instance, writing the code in microcode would speed it up considerably, but Alto microcode is very hard to understand, let along write. My blog post on generating the Mandelbrot set on the Alto discussed Alto performance optimizations in detail, so I won't say more about optimization here.

Conclusion

The screenshot below shows a successful hash, ending in a bunch of zeros9. (I also display an image to show off the Alto's high-resolution bitmapped display.) Since the Alto would take well beyond the lifetime of the universe to find a successful hash, you might wonder how I found this. For this demonstration I simply used as input a block that had been successfully mined in the past, specifically block #286819. Thus, the algorithm succeeded quickly, but since it was an old block, I didn't make any money off it.

The algorithm found a successful hash, indicated by all the zeros at the end. Bitcoin graphic source probably MoneyWeek.

The algorithm found a successful hash, indicated by all the zeros at the end. Bitcoin graphic source probably MoneyWeek.

My code is on github if you want to look at BCPL code or try it out.

Notes and references

  1. At current difficulty, about 1 in 3x10^21 hashes will be successful at mining a block; a valid hash must start with approximately 17 zeros. The mining difficulty changes periodically to keep the time between blocks at approximately 10 minutes. As mining hardware gets faster, the difficulty factor is automatically updated to make mining more difficult so miners can never really catch up. 

  2. A while back I estimated that Bitcoin mining uses about as much electricity as the entire country of Cambodia. One paper puts mining's energy consumption comparable to Ireland's electricity usage. 

  3. Bitcoin uses "double SHA-256" which simply consists of applying the SHA-256 function twice.  

  4. The K constants used in the SHA-256 algorithm are provided by the NSA. You might worry that the NSA cleverly designed these constants to provide a backdoor. However, to prove that these are just arbitrary random constants, the NSA simply used the cube roots of the first 64 primes. 

  5. While SHA-256 is easy to implement, that's not the case for all the cryptography used by Bitcoin. To create a Bitcoin transaction, the transaction must be signed with elliptic curve cryptography. This requires 256-bit modular arithmetic, which is about as hard as it sounds. Even a simple implementation is 1000 lines of C. I decided that porting this to BCPL was too much effort for me. 

  6. I wrote a simple Python script to convert the many 32-bit hexadecimal constants used by SHA-256 to 16-bit octal constants. It's a good thing that hex has almost entirely replaced octal, as it is much better. 

  7. Some people claim that BCPL arrays are 0-based. However, arrays in BCPL structures can start at an arbitrary value. I start with 1 because that's what the Alto code typically does. (This caused me no end of confusion until I realized the indices weren't zero-based.) 

  8. The code could be made 33% faster by taking advantage of an interaction between SHA-256 and the Bitcoin header structure. Bitcoin performs a SHA-256 hash twice on the 80-byte header. Since SHA-256 only handles 64 bytes at a time, the first hash requires two SHA-256 cycles. The second hash takes a third SHA-256 cycle. However, when mining, the only thing that changes from one attempt to the next is the nonce value in the header. It happens to be in the second half of the header, which means the SHA-256 cycle performed on the first half of the header can be done once and then reused. Thus, the double SHA-256 hash can be done with two SHA-256 cycles instead of three. Bitcoin mining usually performs this optimization, but I left it out of the code to make the code less confusing. 

  9. You might wonder why Bitcoin successful hashes start with a bunch of zeros, while the displayed hash ends with a bunch of zeros. The reason is that Bitcoin reverses the byte order of the SHA-256 output. If you look closely, you'll see that the displayed hash matches the hash in the Bitcoin block diagram if you reverse bytes (i.e. pairs of hex digits). 

Improvements to the Xerox Alto Mandelbrot drop runtime from 1 hour to 9 minutes

Last week I wrote a Mandelbrot set program for the Xerox Alto, which took an hour to generate the fractal. The point of this project was to learn how to use the Alto's bitmapped display, not make the fastest Mandelbrot set, so I wasn't concerned that this 1970s computer took so long to run. Even so, readers had detailed suggestions form performance improvements, so I figured I should test out these ideas. The results were much better than I expected, dropping the execution time from 1 hour to 9 minutes.

The Mandelbrot set, generated by a Xerox Alto computer.

The Mandelbrot set, generated by a Xerox Alto computer.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced high-resolution bitmapped displays, the GUI, Ethernet and laser printers to the world, among other things. For my program I used BCPL, the primary Alto programming language and a precursor to C. In the photo above, the Alto computer is in the lower cabinet. The black box is the 2.5 megabyte disk drive. The Alto's unusual portrait display and an early optical mouse are on top.

Easy optimization: mirror the Mandelbrot set

The Mandelbrot set is a famous fractal, generated by a simple algorithm. Each point on the plane represents a complex number c. You repeatedly iterate the complex function f(z) = z^2 + c. If the value diverges to infinity, the point is outside the set. Otherwise, the point is inside the set and the pixel is set to black.

Since the Mandelbrot set is symmetrical around the X axis, a simple optimization is to draw both halves at the same time, cutting the time in half. (This only helps if you're drawing the whole set; this doesn't help if you're zoomed in.) I implemented this, cutting the time down to about 30 minutes. The image below shows the mirrored appearance midway through computation.

Drawing both halves of the Mandelbrot set simultaneously doubles the performance.

Drawing both halves of the Mandelbrot set simultaneously doubles the performance.

Improving the code

Embedded programmer Julian Skidmore had detailed comments on how to speed up my original code. I tried his changes and the time dropped from 30 to 24 minutes. Some of his changes were straightforward - calculating the pixel address in memory incrementally instead of using multiplication, and simplifying the loop counting. But one of his changes illustrates how primitive the Alto's instruction set is.

Quick background: my Mandelbrot program is implemented in BCPL, a programming language that was a precursor to C. The program is compiled to Nova machine code, the instruction set used by the popular Data General Nova 16-bit minicomputer. The Alto implements the Nova instruction set through microcode.

Since the Xerox Alto doesn't support floating point, I used 16-bit fixed point arithmetic: 4 bits to the left of the decimal point and 12 bits to the right. After multiplying two fixed point numbers with integer multiplication, the 32-bit result must be divided by 2^12 to restore the decimal point location. Usually if you're dividing by a power of 2, it's faster to do a bit shift. That's what I originally did, in the code below. (In BCPL, % is the OR operation, not modulo. ! is array indexing.)

let x2sp = (x2!0 lshift 4) % (x2!1 rshift 12)

Unfortunately this turns out to be inefficient for a couple reasons. Modern processors usually have a barrel shifter, so you can efficiently shift a word by as many bits as you want. The Alto's instruction set, however, only shifts by one bit. So to right shift by 12 bits, the compiled code calls an assembly subroutine (RSH) that does 12 separate shift instructions, much slower than the single instruction I expected. The second problem is the instruction set (surprisingly) doesn't have a bitwise-OR instruction, so the OR operation is implemented in another subroutine (IOR).1 I took Julz's suggestion and used the MulDiv assembly-language function to multiply two numbers and divide by 4096 instead of shifting. It's still not fast (since the Alto doesn't have hardware multiply and divide), but at least it reduces the number of instructions executed.

Shutting off the display

One way to speed up the Alto is to shut off the display.2 I tried this and improved the time from 24 minutes to 9 minutes, a remarkable improvement. Why does turning off the display make such a big difference?

One of the unusual design features of the Alto is that it performed many tasks in software that are usually performed in hardware, giving the Alto more flexibility. (As Alan Kay put it, "Hardware is just software crystallized early.") For instance, the CPU is responsible for copying data between memory and the disk or Ethernet interface. The CPU also periodically scans memory to refresh the dynamic RAM. These tasks are implemented in microcode, and the hardware switches between tasks as necessary, preempting low priority tasks to perform higher-priority tasks. Executing a user program has the lowest priority and runs only when there's nothing more important to be done.

All this task management was done in hardware, not by the operating system. The Xerox Alto doesn't use a microprocessor chip, but instead has a CPU built out of three boards of simple TTL chips. The board below shows one of the CPU boards, the control board that implements the microcode tasks and controls what the CPU is doing. It has PROMs to hold the microcode, 64-bit RAM chips (yes, just 64 bits) to remember what each task is doing, and more chips to determine which task has the highest priority.

Control board for the Xerox Alto. Part of the CPU, this board holds microcode and handles microcode tasks.

Control board for the Xerox Alto. Part of the CPU, this board holds microcode and handles microcode tasks.

The task that affects the Mandelbrot program is the display task: to display pixels on the screen, the CPU must move the pixels for each scan line from RAM to the display board, 30 times a second, over and over. During this time, the CPU can't run any program instructions, so there's a large performance impact just from displaying pixels on the screen. Thus, not using the display causes the program to run much, much faster. I still set the Mandelbrot pixels in memory though, so when the program is done, I update the display pointer causing the set to immediately appear on the display. Thus, the Mandelbrot set still appears on screen; you just don't see it as it gets drawn.

Microcode: the final frontier

The hardest way to optimize performance on the Alto is to write your own microcode. The Alto includes special microcode RAM, letting you extend the instruction set with new instructions. This feature was used by programs that required optimized graphics such as the Bravo text editor and the Draw program. Games such as Missile Command and Pinball also used microcode for better performance. Writing the Mandelbrot set code in microcode would undoubtedly improve performance. However, Alto microcode is pretty crazy, so I'm not going to try a microcode Mandelbrot.

Conclusion

After writing a Mandelbrot program for the Xerox Alto, I received many suggestions for performance improvements. By implementing these suggestions, the time to generate the Mandelbrot set dropped from one hour to 9 minutes, a remarkable speedup. The biggest speedup came from turning off the display during computation; just putting static pixels on the screen uses up a huge amount of the processing power. My improved Mandelbrot code is on github.

My goal with the Mandelbrot was to learn how to use the Alto's high-resolution display from a BCPL program. Using what I learned with the Mandelbrot, I wrote a program to display images; an example is below.3

The Xerox Alto displaying an image of the Xerox Alto displaying...

The Xerox Alto displaying an image of the Xerox Alto displaying...

Notes and references

  1. The Alto has an AND machine instruction but not an OR instruction, so the OR operation is performed by complementing an argument, ANDing, and complement-adding the complement. I.e. ab plus b. 

  2. Strictly speaking, I left the display on; it just wasn't displaying anything. The Alto uses a complex display system with a display list pointing to arbitrarily-sized blocks of pixels. (For instance, when displaying text, each block is a separate text line. Scrolling the screen just involves updating the list pointers, not moving actual data.) Thus, I could set the display list to NULL while rendering the Mandelbrot into memory. Then when the Mandelbrot was done, I simply updated the display list to make the image appear. 

  3. The recursive picture-in-picture effect is known as the Droste effect. After making this picture, I learned that generating the Droste effect on old computers is apparently a thing

One-hour Mandelbrot: Creating a fractal on the vintage Xerox Alto

I wrote a short program to generate the Mandelbrot set on the Xerox Alto, a groundbreaking minicomputer from the 1970s. The program, in the obsolete BCPL language, ran very slowly—taking almost exactly an hour—but the result below shows off the Alto's monochrome bitmapped display. (Bitmapped displays were a rarity at the time because memory was so expensive.)

The Xerox Alto took an hour to generate the Mandelbrot set.

The Xerox Alto took an hour to generate the Mandelbrot set.

The Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. In the photo above, the Alto computer itself is in the lower cabinet. The Diablo disk drive (with the 1970s orange stripe) uses a removable 14 inch disk pack that stores 2.5 megabytes of data. (A bunch of disk packs are visible behind the Alto.) The Alto's display is bitmapped with 606x808 pixels in an unusual portrait orientation, and the optical mouse is next to the display.

Last year Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore the system, along with Marc Verdiell, Luca Severini and Carl Claunch. My full set of Alto posts is here and Marc's videos are here. I haven't posted an update for a while, but now I can write new programs and download them to the Alto using the Living Computer Museum's Alto file system implementation and gateway to the Alto's 3Mb Ethernet. I decided to start with the Mandelbrot set to take advantage of the Alto's high resolution display.

Marc's latest video shows the Mandelbrot programming running on the Alto.

The Mandelbrot program

The Mandelbrot set algorithm is fairly simple. Each point on the plane represents a complex number c. You repeatedly iterate the complex function f(z) = z^2 + c. If the value diverges to infinity, the point is outside the set. Otherwise, the point is inside the set and the pixel is set to black. Setting the pixel is tricky because the Alto doesn't have a graphics API; you need to determine which bit in memory to set.4

Since the Xerox Alto doesn't support floating point1, I needed a way to represent the numbers with its 16-bit word. I use fixed point arithmetic: 4 bits to the left of the decimal point and 12 bits to the right.2 For instance, the number 1.25 is represented in 16 bits as 1.25*2^12 = 0x1400. These fixed point numbers can be added with standard integer addition. After multiplying two fixed point numbers with integer multiplication, the 32-bit result must be divided by 2^12 (i.e. shifted right by 12) to restore the decimal point location.3

The code (above) is written in BCPL, the main language used on the Alto. BCPL is a precursor to C and many features of C are clearly visible in BCPL: everything from lvalues and rvalues to the ternary operator. You can think of BCPL as C without types; the only BCPL types are 16-bit words along with C-like structs, unions and bitfields. BCPL may look unfamiliar at first, but the code above should be clear if you consider the following syntax differences with C:

  • Blocks are indicated with [ and ] instead of { and }.
  • Indexing is with a!1 instead of a[1].
  • And, Or, and Shift bit operations are &, %, and lshift/rshift.
  • Variable definitions use let.
  • Arrays are defined with vec.

More information on BCPL is in the BCPL Reference Manual and my earlier article on using BCPL with the Alto.

The Xerox Alto, a few minutes into generation of the Mandelbrot set.

The Xerox Alto, a few minutes into generation of the Mandelbrot set.

Why is the Alto so slow?

Running the Mandelbrot set illustrates the amazing improvement in computer speed since the Alto was created in 1973 and the huge changes in computer architecture. On a modern computer, a Javascript program can generate the Mandelbrot set in a fraction of a second, while the Alto took an hour. The first factor is the Alto's slow clock speed of 5.88 MHz, hundreds of times slower than a modern processor. In addition, the Alto doesn't execute machine instructions directly, but uses a relatively inefficient microcode emulator that takes many cycles to perform one machine instruction.

The ALU board from the Xerox Alto. The Alto doesn't use a microprocessor chip, but a CPU built from three boards of integrated circuits.

The ALU board from the Xerox Alto. The Alto doesn't use a microprocessor chip, but a CPU built from three boards of integrated circuits.

Unlike modern computers, the Alto doesn't use a microprocessor chip, but instead has a CPU built from three boards full of simple TTL chips. The photo above shows the arithmetic-logic unit (ALU) board, which uses four 4-bit 74181 ALU chips to perform addition, subtraction and logic operations. You can also see the CPU's registers on this board. The Alto doesn't include a hardware multiplier, but must perform multiplication by repeated shifts and adds. Thus, the Alto performs especially poorly on the Mandelbrot set, which is essentially repeated multiplications.

Conclusion

The Mandelbrot set was a quick program to try out the Alto's graphics. Next I'll try some more complex projects on the Alto. If you want to run my code, it's on Github; you can run it on the ContrAlto simulator if you don't have an Alto available.

If you're interested in retrocomputing fractals, I also generated a Mandelbrot on a 50 year old IBM 1401 mainframe The 1401 generated the Mandelbrot set in 12 minutes—not because it's a faster machine than the Alto, but because the resolution on the line printer was very very low.

Mandelbrot generated on the IBM 1401 mainframe.

Mandelbrot generated on the IBM 1401 mainframe.

Notes and references

  1. There is a floating point library (source) for the Alto. I decided not use use it since the integer Mandelbrot was already very slow. But using floating point would make sense if you wanted to zoom in on the Mandelbrot. 

  2. Fixed-point arithmetic is a common trick for fast Mandelbrot calculation. 

  3. To multiply two 16-bit numbers efficiently, I use the double precision MulFull function (written in Nova assembler) in PressML.asm, part of the Computer History Museum's archived Alto software. 

  4. The hardest part of generating the Alto Mandelbrot was figuring out how to configure the display memory and update it correctly. The details on how the display works are in chapter 4 of the Xerox Alto Hardware Manual. To summarize, the display contents are defined by a linked list of display control blocks (DCBs), which define a rectangular region of pixels on the display. A microcode task reads 16 words of pixels from memory at a time and writes them to the display board, which shifts the pixels out to the monitor. Thus, as each scanline is being written to the CRT, the CPU is busily reading the pixels for that line from memory and feeding them to the display, another reason why the Alto is slow.

    The Alto's Smalltalk environment has a simple graphics API, but we don't have Smalltalk running yet. 

Restoring YC's Xerox Alto day 10: New boards, running programs, mouse problems

Last week our vintage Alto was crashing; we traced the problem to an incompatibility between two of the processor boards. Today we replaced these boards and now we can run all the programs on the disk without crashes. Unfortunately our mouse still doesn't work, which limits what we can do on this GUI-based computer. We discovered that the mouse has the wrong connector; even though it plugs in fine, it doesn't make any electrical connection.

If you're just joining, the Alto was a revolutionary computer designed at Xerox PARC in 1973 to investigate personal computing. It introduced the GUI, Ethernet and laser printers to the world, among other things. Y Combinator received an Alto from computer visionary Alan Kay and I'm helping restore the system, along with Marc Verdiell, Carl Claunch Luca Severini, Ed Thelen, and Ron Crane,

Incompatibility in the Alto's circuit boards

The Xerox Alto was designed before microprocessor chips, so its processor is built from three boards full of TTL chips: the ALU (Arithmetic-Logic Unit) board, the Control board, and the CRAM (Control RAM) board. The Control board and the CRAM board are the ones that we dealt with today. Last week we traced through software execution, microcode, and hardware circuitry to figure out why some programs on the Alto crashed. We discovered that the problem happened when a program attempted to execute microcode stored in the Control RAM. The microcode RAM select circuit malfunctioned due to some wiring added to the back of the Control board (the white wires in the photo below). Why had this wiring been added to the board? And why did it break things? At the end of last episode, we briefly considered ripping out this wiring, but figured we should understand what was going on first.

This is the Xerox Alto control board, one of three boards that make up the CPU. The board has been modified with several white wires.

This is the Xerox Alto control board, one of three boards that make up the CPU. The board has been modified with several white wires which trigger our crashes.

A bit of explanation will help understand what's going on. Like most computers, the Xerox Alto's instruction set is implemented in microcode. The Alto is more flexible than most computers, though, and allows user programs to actually change the instruction set by modifying the microcode, actually changing the instruction set. To achieve this, the Alto stores microcode in a combination of RAM and ROM.[1] The default microcode (used for booting and for standard programs) is stored in 1K of ROM on the Control board. Some programs use custom microcode, which allows them to modify the computer's instruction set for better performance. This microcode is stored in high-speed RAM on the Control RAM (CRAM) board. Our Alto came with a 1K CRAM board, but some programs (such as Smalltalk) required the larger 3K CRAM board.[2] (This microcode RAM is entirely different from the 512 Kbyte RAM used by Alto programs; you didn't need to fit an Alto program into 3K.)

Inconveniently, the 1K and 3K RAM boards have incompatible connections, and the Control board needs to be wired to work with one or the other. We determined that the white-wire modifications on our Control board converted it from working with a 1K RAM board to working with a 3K RAM board.[3] Unfortunately, our Alto had a 1K RAM board so the two boards were incompatible and programs that attempted to use the microcode RAM crashed, as we saw last week. It's a mystery why our Alto had two incompatible board types, but at least we knew why the modifications are there. (Since our Alto also came with the wrong disk interface card and an unbootable hard disk, we wonder what happened to the Alto since it was last used. It clearly wasn't left in a usable state.)

Fortunately Al Kossow of bitsavers came to our rescue and supplied us with some 3K Control RAM boards and the Control boards to go with them. This saved us from needing to rewire the board we had. Al also provided the strange but necessary connector (visible below on the left) that goes between the Control board and the CRAM board. We swapped these boards with our boards and everything worked without crashing! We could now run all the programs on the disk without crashes.

The Alto's Control board is part of the CPU. This board contains 2K words of microcode ROM, as well as control circuitry.

The Alto's Control board is part of the CPU. This board contains 2K words of microcode ROM, as well as control circuitry. Our original board had 1K of ROM (and 8 empty sockets), while this board has the full 2K of ROM. The ROM chips are in the lower left, with labels. The chip labeled SW3K (upside down) is the PROM that selects the hardware configuration. The spare PROM (labeled SW1) is in the upper left. The edge connector on the right plugs into the backplane, while the two connectors on the left are cabled to the CRAM board.

Some Alto software

With the Alto running reliably, we could try out the various programs on the hard disk that had crashed before. Draw, the Alto's mouse-based drawing program, apparently uses microcode for optimizing performance, so last week it immediately dropped into the Swat debugger. With the compatible boards, Draw ran successfully. Unfortunately, since our mouse isn't working, we couldn't actually draw anything, but you can still see the icon-based GUI below. I've tried Draw on the Alto simulator, and despite the icons, it's not exactly intuitive to use.

'Draw' is the Alto's mouse-based drawing program. Clicking an icon on the left selects an operation.

'Draw' is the Alto's mouse-based drawing program. Clicking an icon on the left selects an operation.

We also tried Bravo, the first WYSIWYG (What you see is what you get) text editor. Again, functionality is limited without the mouse. But I could enter text and change the font to a larger, bold font with proportional spacing. Xerox PARC also invented the laser printer and Ethernet, so one could create documents in Bravo and then print them on a networked laser printer. Charles Simonyi, one of the co-authors of Bravo, later created Microsoft Word, so there's a direct connection between the Alto's editor and Word. I've written more about how to use Bravo here.

'Bravo' is the Alto's WYSIWYG text editor. It supports multiple fonts, among other features.

'Bravo' is the Alto's WYSIWYG text editor. It supports multiple fonts, among other features.

The Alto included a GUI file manager called Neptune, allowing files and directories to be manipulated with the mouse. Neptune has an invisible scroll bar to the left of the file list that appears when you mouse-over it; apparently the Alto also invented the scroll bar.

The Alto includes a graphical file system browser.

The Alto includes a graphical file system browser.

A rather complex GUI is in PrePress, a program that converted a spline-based font to a bitmapped font for display or printing. (You can think of this as a forerunner of TrueType fonts.) High-quality fonts were created for the Alto using the FRED font editor. As you would expect from a document company, these fonts included proportional spacing, ligatures such as "ffl" and "fl", and multiple styles and sizes.

'PrePress' is used to convert a spline-based font into a bitmap suitable for display or printing.

'PrePress' is used to convert a spline-based font into a bitmap suitable for display or printing.

Most importantly, we were able to run MADTEST, the Microcoded Alto Diagnostic test, which runs a suite of low-level diagnostics using microcode. This test ran successfully, increasing our confidence that there are no obscure problems in the processor.

If you want to try these Alto programs for yourself, you can use the ContrAlto simulator, built by the Living Computer Museum. This simulator has been very useful to use for learning how the software is supposed to work.

The mouse

The biggest problem remaining at this point is the mouse doesn't work, so we investigated that next. Although the mouse was invented prior to the Alto, the Alto was the first computer to include the mouse as a standard input device. The Alto uses a three-button mouse that plugs into the back of the keyboard.

The three-button optical mouse.

The three-button optical mouse.

Some Altos had a mechanical mouse, while others had an optical mouse. Our Alto came with an optical mouse, but we couldn't get it to work at all. The mouse uses a special mousepad with a specific dotted pattern (which we didn't have), so at first we suspected that was the problem. However, the mouse didn't respond at all when we used a printed copy of the pattern. We also didn't see any light (visible or IR) from the three illumination LEDs on the mouse, so we suspected bigger problems than a missing mousepad.

Underside of the mouse. The sensor (right) consists of three illumination LEDs surrounding the optical sensor.

Underside of the mouse. The sensor (right) consists of three illumination LEDs surrounding the optical sensor.

Opening up the mouse shows the simple circuitry inside, with a single chip controlling it. The chip is rather unusual since it includes a 16-pixel optical sensor, with a light guide that goes from the bottom of the chip to the bottom of the mouse. The pixel-based optical mouse was invented at Xerox PARC in 1980 (and later patented), so this mouse is somewhat more modern than the original Alto (1973). The handwritten markings on the chip suggest this may have been a prototype of some sort.

Inside the optical mouse. In the middle are the three buttons. At top is the IC that contains the optical sensor.

Inside the optical mouse. In the middle are the three buttons. At top is the IC that contains the optical sensor.

When we closely looked at how the mouse was wired up, we discovered the problem. The mouse plugs into a 19-pin socket (DE-19), while the mouse used a 9-pin plug (usually called DB-9, but technically DE-9), so the connectors are entirely incompatible. The DE-19 has three rows of pins: 6/7/6 (with the middle row empty on the Alto), and the DB-9 has two rows of pins (7/6). The bizarre thing is that the mouse plugged into the socket just fine: the connector shells are physically the same size, and the mouse plug's pins went between the Alto socket's connections. So the mouse was plugged in, but not actually connected to anything! It's surprising the connectors could go together without bending any pins. (Several people have suggested sources for a DB-19 connector. Unfortunately the DE-19 (three rows) has a totally different shape from the DB-19 (two rows).)

The Alto's mouse plugs into the 19-pin connector on the keyboard housing (above). Unfortunately our mouse has a 9-pin connector (below).

The Alto's mouse plugs into the 19-pin connector on the keyboard housing (above). Unfortunately our mouse has a 9-pin connector (below).

After some investigation, we learned that the Alto was missing the mouse when it came from Alan Kay. YCombinator picked up a replacement mouse on eBay, but it wasn't compatible with our Alto. We're still trying to figure out if the mouse is an Alto mouse with a nonstandard connector or if it is for a different machine. The Xerox Star used a 2-button mouse, so the mouse isn't from a Star. Tim Curley at Xerox PARC loaned us a compatible Alto mouse, so we'll give that one a try next episode. We're also looking into making an adapter cable, but DE-19 connectors appear to be obsolete and difficult to find.

Conclusion

Last week we discovered that the control board in our Alto was incompatible with the microcode RAM board. Al Kossow loaned us some compatible boards, and with those boards our Alto has been functioning without any crashes or malfunctions. We discovered that our mouse wasn't working because it had the wrong connector—although it plugged into the Alto, it didn't make any electrical connection. Since the mouse is necessary for many Alto programs, we hope to get the mouse working soon.

For updates on the restoration, follow me on Twitter at kenshirriff. Thanks to Al Kossow for helping us out again.

Notes and references

[1] The table below shows the three microcode configurations the Alto supported. Details are in section 8.4 of the Hardware Manual. The desired configuration is selected by inserting a particular PROM in the Control board: SW1, SW2, or SW3K. Each control board has one PROM in use and an unused PROM in the upper left corner; switching PROMs switches the configuration. The Control board has sockets for 2K of ROM; these sockets are left empty for systems with 1K of ROM.

Configuration NameROMRAM
1K1K1K
2K2K1K
3K1K3K

[2] The Alto introduced the 3K RAM board to take advantage of the new 4 kilobit RAM chips, replacing the 1 kilobit chips on the 1K board. Both boards required 32 RAM chips for the 32-bit micro-instructions, showing that back then you needed a lot of chips for not much memory. The microcode required high-speed static memory, so density was worse with microcode than with regular RAM.

[3] The 3K RAM board requires a few additional signals from the Control board, such as the task id. The 1K RAM board grounds one of the lines used for these signals, so using the 3K Control board with the 1K RAM board (as our Alto did) shorts out one of the bank select lines. This causes bank switching to fail and explains the crashes we saw last week. Schematics for the boards are available at bitsavers.