Showing posts with label Apollo. Show all posts
Showing posts with label Apollo. Show all posts

The digital ranging system that measured the distance to the Apollo spacecraft

During the Apollo missions to the Moon, a critical task for NASA was determining the spacecraft's position. To accomplish this, they developed a digital ranging system that could determine the distance to the spacecraft, hundreds of thousands of kilometers away, with an accuracy of about 1 meter.

The basic idea was to send a radio signal to the spacecraft and determine how long it takes to return. Since the signal traveled at the speed of light, the time delay gives the distance. The main problem is that due to the extreme distance to the spacecraft, a radar-like return pulse would be too weak.2 The ranging system solved this in two ways. First, a complex transponder on the spacecraft sent back an amplified signal. Second, instead of sending a pulse, the system transmitted a long pseudorandom bit sequence. By correlating this sequence over multiple seconds, a weak signal could be extracted from the noise.1

In this blog post I explain this surprisingly-complex ranging system. Generating and correlating pseudorandom sequences was difficult with the transistor circuitry of the 1960s. The ranging codes had to be integrated with Apollo's "Unified S-Band" communication system, which used high-frequency microwave signals. Onboard the spacecraft, a special frequency-multiplying transponder supported Doppler speed measurements. Finally, communicating with the spacecraft required a complex network of ground stations spanning the globe.

The Apollo Service Module's transponder with the lid removed, showing the circuitry inside. The transponder returned the ranging signal to Earth, multiplying the frequency by exactly 240/221.

The Apollo Service Module's transponder with the lid removed, showing the circuitry inside. The transponder returned the ranging signal to Earth, multiplying the frequency by exactly 240/221.

The ranging system

The ranging system measured how long a signal takes to travel to the spacecraft and back.3 As shown below, the ranging system transmitted a pseudorandom bit sequence to the spacecraft. The spacecraft's transponder returned the signal to the ground receiver. The returned signal was correlated against the original signal to determine the time shift, and thus the distance. Meanwhile, a Doppler measurement (described later) provided the speed of the spacecraft.4

Block diagram of the Apollo ranging subsystem. Redrawn from A study of the JPL Mark I ranging subsystem

Block diagram of the Apollo ranging subsystem. Redrawn from A study of the JPL Mark I ranging subsystem

The key step was to correlate the transmitted and received signals to determine the time delay. When the two signals were aligned with the proper delay, the bits matched. Because the sequence is pseudorandom, if the two signals were misaligned, the bits didn't match (except randomly).5 Thus, by testing how many bits match (i.e. the correlation) for different delays, the proper delay could be determined. Since the signals could be correlated over long time intervals (multiple seconds), the range could be determined even if the signal was extremely weak and hidden by noise.

The code sequence needed to be very long in order to unambiguously determine the range; the system uses a pseudorandom sequence that is 5,456,682 bits long. Since the code is sent at 1 megahertz, it repeats every 5.46 seconds. A radio signal can travel 800,000 kilometers and back in this time, while the Moon is about 384,000 kilometers away, so the system can unambiguously measure over twice the distance to the Moon. Note that at one megahertz, one bit of the signal corresponds to 150 meters of range. The system achieved more accuracy by comparing the phase of the signals.

Conceptually, the transmitted and received sequences were shifted until they match, and the amount of shift gave the delay, and thus the distance. However, it's impractical to try 5 million different shifts to match the sequence, especially with the technology of the 1960s. To solve the matching problem, the sequence was constructed from several shorter codes, from 2 to 127 bits long. These "sub-codes" were short enough that they could be matched by brute force. The overall delay could then be determined by the delays of the sub-codes. The system used four sub-codes—sub-code A had a length of 31 bits, B was 63 bits, C was 127 bits, and X was 11 bits—along with a two-valued clock CL. Since these lengths are relatively prime, the combined sequence has an overall length equal to the product: 5,456,682. The important concept is that a very long code could be generated and matched using hardware, since the sub-codes were short.

A 30-foot antenna. From Apollo Unified S-Band Technical Conference.

I made an interactive page that demonstrates how the sub-code sequences were created. The sub-code sequences of lengths 31, 63, and 127 were generated with a well-known technique called the linear-feedback shift register (LFSR). In this technique, a shift register of length N holds bits. In each step, a new bit is generated from the exclusive-or of two bits in the shift register. The new bit is shifted in, and an old bit is shifted out, providing a code bit. This technique can generate a pseudorandom sequence of length 2N-1 with good statistical properties. In order to generate a sub-code of length 11, the X codes were generated from a Legendre sequence.7

The C sub-code is generated from a 7-bit delay line (black). The last two bits are fed into an XOR gate to produce a new bit (red). Part of the pseudorandom sequence is shown below the gate.

The C sub-code is generated from a 7-bit delay line (black). The last two bits are fed into an XOR gate to produce a new bit (red). Part of the pseudorandom sequence is shown below the gate.

Combining the sub-codes to form the overall code isn't as straightforward as you might think, since each sub-code must be individually recognizable in the result. The A, B, and C sub-codes were combined with the majority function maj(A,B,C), which returns the most common bit out of the three inputs.6 The complete formula for the bit sequence was (X·maj(A,B,C))⊕CL.

Construction of the ranging system

The ranging process was implemented by the "Mark I ranging subsystem" below. Although this was called a "special-purpose binary digital computer", it's not really a computer in the modern sense, but more of a state machine that moved through the necessary steps. First, the ground station sent the code sequence to the spacecraft and synchronized to the received signal. Next, the ranging system tested the different offsets for the X sub-code and found the offset with the best correlation. It repeated the process for the A, B, and C sub-codes. The digital circuitry performed some tricky modulo arithmetic8 to compute the range from the sub-code offsets. Finally, the Doppler subsystem determined the spacecraft's speed and constantly added this to the range to keep the range up to date as the spacecraft moved. I've made an interactive page that demonstrates these steps.

The Mark I ranging subsystem. From Apollo Unified S-Band System.

The Mark I ranging subsystem. From Apollo Unified S-Band System.

The ranging system was built from "T-Pac modules", boards that used transistors and other discrete components to build simple digital components such as logic gates and flip flops.9 The T-Pac modules were introduced by Computer Control Company in 1958 and were designed for quick and efficient implementation of a digital system, running at 1 MHz. Groups of 32 T-Pac cards were mounted in a "T-BLOC" rack-mounted chassis, and 10 T-BLOCs were installed in two racks. The cards were connected by plugging wires and taper pins into a large grid.

One module was the T-Pac LE-10 "logic element" module, below, implementing four AND gates. It cost $98 in 1961 (about $700 in current dollars), showing how expensive digital logic was at the time. The ranging system used about 300 digital modules, so the digital circuitry for ranging would have cost hundreds of thousands of (current) dollars, a cost repeated at each ground station.

A T-Pac module. This is the LE-10 "logic element".

A T-Pac module. This is the LE-10 "logic element".

Tasks that we nowadays consider trivial were difficult with the technology of the time. For instance, the range was stored as a 31-bit binary value. But instead of a register, the value was stored in a magnetorestrictive delay line, as torsion pulses in a long nickel wire. To add a value to the range, the circuitry used a single-bit serial adder, adding bits one at a time as they exited the nickel wire, and then cycling the bits back into the wire.

One interesting circuit is the correlation level detector that tests for correlation between a particular sub-code and the received signal. The correlation started as an analog voltage, which was converted to a digital value by a "Voldicon. The correlation was integrated over time by summing the binary values using another 31-bit storage/adder circuit. The number of summed samples could range from 1 to 219 samples, user-settable through "Digiswitch" thumbwheel switches. By integrating the correlation over a long time interval, an extremely weak signal could be detected in the presence of noise. The system kept track of the best offset, storing the value in another delay line. The total ranging time varied from 1.6 seconds for a strong signal to 30 seconds at lunar distance.

Determining the speed with Doppler

The ranging system could also measure the spacecraft's speed by measuring the Doppler shift of the returned signal. If the spacecraft was moving away from Earth, the waves would be stretched out, lowering the frequency. Conversely, the frequency would increase if the spacecraft was moving towards Earth.10 By measuring the frequency shift, the spacecraft's speed could be accurately determined.12

A moving source causes the wavelength to be decreased if the source is moving closer or increased if the source is moving away. Diagram by Tkarcher.

A moving source causes the wavelength to be decreased if the source is moving closer or increased if the source is moving away. Diagram by Tkarcher.

The Doppler measurement impacted the radio system's design in two ways. First, the spacecraft couldn't simply receive and retransmit the signal, because the Doppler shift from the upwards journey would be lost. Instead, a complex frequency-multiplying transponder system was used, so the downlink signal's frequency was exactly 240/221 times the received uplink frequency.11 Second, the spacecraft used phase modulation (PM) instead of the common frequency modulation (FM) for most communication: since frequency modulation changes the signal's frequency, it would have interfered with the Doppler measurements.

The Apollo radio system

Apollo used a complex radio system called the Unified S-Band System that included voice, telemetry, scientific data, television, and the ranging data. These signals were combined and transmitted over a single carrier frequency in the S-band frequency range. The diagrams below show how the spectrum was allocated for the signal up to the spacecraft (transmitted at 2.10640625 GHz), and the down-link spectrum at 2.2875 GHz. These two frequencies are in the exact ratio of 240/221, which turns out to be important. Notice that the voice and data are on fairly narrow subcarriers, while the pseudorandom ranging data has a lower, but very wide spectrum. (The ranging signal looks a lot like white noise due to its randomness.) The wide spectrum makes the ranging signal easier to detect at low levels with noise. Even though the ranging spectrum overlaps with the voice and data subcarriers, they don't interfere too much because the ranging spectrum is spread out.

The spectrum used by the Apollo radio system.

The spectrum used by the Apollo radio system.

The transponder

The S-band transponder onboard the spacecraft had a critical role in both ranging and for communications in general. From the outside, the transponder (built by Motorola) is a plain blue-gray box, weighing about 32 pounds. The connectors at the right linked the transponder to other parts of the radio system. Internally, the transponder was crammed full of radio circuitry: phase modulators for voice data, a detector for received uplink data, and an FM transmitter for video.14

The S-band transponder.

The S-band transponder.

The block diagram below shows the role of the transponder (red) in the communications system. Once the spacecraft was outside of VHF range (about 1500 miles), all communication used the S-band and went through the transponder. The transponder was connected to a traveling-wave tube amplifier and then the spacecraft's antennas.

Diagram of the communications system used in Apollo. The transponder is highlighted in red. Click this (or any image) for a larger version.
From "Apollo Logistics Training", courtesy of Spaceaholic.

Diagram of the communications system used in Apollo. The transponder is highlighted in red. Click this (or any image) for a larger version. From "Apollo Logistics Training", courtesy of Spaceaholic.

The photo below shows a closeup of the circuitry inside the transponder. The transponder is constructed of multiple modules, connected by tiny coaxial cables. The frequency multiplier, mixer, IF amplifier, and wide band detector modules are visible. Most of the modules are duplicated on the other side of the transponder to provide redundancy, since a failure of the transponder would jeopardize the mission.

Closeup of the transponder.

Closeup of the transponder.

The most complex part of the transponder is how it received the ranging data from Earth and echos it back. To avoid interference, the retransmitted data is at a different frequency from the received data. But in order to preserve the Doppler shift information, the transmitted frequency had to vary with the received frequency, so it couldn't be fixed. Instead, the transponder multiplied the received frequency by exactly 240/221 to generate the retransmission frequency, using a complex phase-locked loop circuit.

A phase-locked loop is a widely-used circuit that allows an oscillator to be locked to another frequency source, even in the presence of noise. It uses a voltage-controlled oscillator whose output is compared to the input. If the output is falling behind, the oscillator is sped up. If the input is falling behind, the oscillator is slowed down. Eventually, the oscillator will lock onto the input signal and will track it. The transponder uses a more complex circuit, so the output frequency tracks a multiple of the input frequency, specifically the output frequency is 240/221× the input (uplink) frequency.15

The transponder was one of many electronics boxes crammed into the Command Module. The diagram below shows its position in the equipment racks.

This diagram of the Apollo Command Module shows the position of the S-band transponder, along with the traveling-wave-tube amplifier and the Apollo Guidance Computer.

This diagram of the Apollo Command Module shows the position of the S-band transponder, along with the traveling-wave-tube amplifier and the Apollo Guidance Computer.

The console of the Command Module (below) was extremely complicated, with many controls and switches. The astronauts used the highlighted switches to control the transponder and to turn ranging on and off.

Astronauts controlled ranging through a switch on the console. Diagram from Command/Service Module Systems Handbook p208.

Astronauts controlled ranging through a switch on the console. Diagram from Command/Service Module Systems Handbook p208.

Ground stations

Communication with Apollo required powerful ground stations with large antennas.17 These stations needed to be situated around the world to maintain constant communication with Apollo as the Earth rotated. The three main stations had 85-foot (26-meter) parabolic antennas and were located in Goldstone, California; Canberra, Australia; and Madrid, Spain, part of the Manned Space Flight Network (MSFN). Other stations had smaller 30-foot antennas. Coverage gaps were filled by special ships with 30-foot antennas as well as special Apollo Range Instrumentation Aircraft (ARIA), based on C-135 cargo aircraft.

NASA's 26-meter antenna at Honeysuckle Creek, Australia. Photo from NASA.

NASA's 26-meter antenna at Honeysuckle Creek, Australia. Photo from NASA.

The block diagram below gives a hint of the complexity of a ground station, with signal processing equipment as well as computers16 and networking to Mission Control and other sites. The "Ranging Circuitry" block has the most relevance to this discussion, but I'd also like to point out the "angle channel receivers". The antennas were servo-positioned to lock onto the spacecraft's signal. This provided an angle measurement of the spacecraft's position, which was combined with the range to yield the spacecraft's 3-D position in space.

Block diagram of a Unified S-Band ground station. From Apollo Unified S-Band System.

Block diagram of a Unified S-Band ground station. From Apollo Unified S-Band System.

The photo below shows the ground station's transmitter/receiver cabinets that handled the unified S-Band signals. These cabinets were the connection between the microwave equipment (amplifiers and antennas) and the radio-frequency subsystems for voice, data, instrumentation, ranging, and so forth. The photo zooms in on the ranging receiver control panel. This ranging circuitry performed the analog tasks: it demodulated the received ranging signal, tracked the frequency, extracted the clock and the Doppler signal, and measured the correlation. These signals were provided to the digital ranging subsystem described earlier.

The receiver-exciter subsystem of the ground station, with a closeup of the ranging receiver. From Apollo Unified S-Band System.

The receiver-exciter subsystem of the ground station, with a closeup of the ranging receiver. From Apollo Unified S-Band System.

Conclusion

The ranging system provided the distance to the spacecraft, the Doppler provided the spacecraft's (radial) velocity, and the position of the receiving antenna provided the spacecraft's angular position. The system provides a great deal of accuracy for distance and speed: 1.5-meter range resolution and 0.1 meter/sec speed resolution. Because the angular measurement depended on the physical positioning of the antenna, angular resolution was much worse: 0.025°, which corresponds to over 150 kilometers at the distance of the Moon.

The ranging system illustrates the complexity of Apollo. Even though ranging was a small part of Apollo's navigation, it required complex racks of hardware distributed to sites around the world, specialized algorithms, and an advanced transponder onboard the spacecraft. Almost every part of Apollo has this sort of fractal complexity, where a seemingly-simple requirement such as finding the distance to the spacecraft required numerous innovations. The S-band communication system alone required a 1965 conference with 317 pages of proceedings.

The ranging system is hard to understand from a text description, so I've made two interactive pages that demonstrate it. The first page shows how linear-feedback shift registers (LFSR) and XOR gates generate the subcodes, and how the subcodes create the transmitted signal. The second page shows how the correlations are determined for the various subcodes and how the range is computed from these values.

I've implemented the ranging codes on a Teensy, so the next step is to transmit the codes through an actual Apollo transponder. I announce my latest blog posts on Twitter, so follow me @kenshirriff for updates. I also have an RSS feed. Thanks to Steve Jurvetson for providing the S-band transponder and the traveling-wave tube amplifier. Thanks to Mike Stewart for extensive information on the Apollo hardware and CuriousMarc for driving the transponder restoration.

Notes and references

  1. The ranging system has a lot in common with GPS. GPS also works by sending a pseudorandom bit sequence and using correlation to determine the signal's delay. There are several important differences though. First, ranging only determined distance, while GPS determines position in three dimensions and also the exact time. (This is why GPS requires visibility of at least four satellites.)

    Second, the GPS signal is transmitted from a satellite and received on the ground, unlike the Apollo ranging signal which was both transmitted and received on the ground. As a result, the GPS receiver doesn't have the transmitted signal available, but must generate its own copy for comparison.

    Third, GPS satellites are about 20,000 kilometers from Earth, while the Moon is 384,000 kilometers from Earth, making the Apollo signal much weaker. (Although Apollo did have the advantage of huge 26-meter receiving antennas, rather than a GPS antenna that is a few centimeters long.)

    Finally, GPS has the advantage of complex integrated circuits to determine the correlation, rather than racks full of transistor logic.

    As far as I can tell, there isn't any direct connection between the Apollo ranging system and GPS. GPS grew out of the Transit (Naval Navigation Satellite System), the Timation satellite program, and USAF Project 621B (history). 

  2. A radar signal was first bounced off the Moon in 1946 (details, p 25), but the distance accuracy was only ±1000 miles. The radar return from the spacecraft would be much smaller due to its size. 

  3. There's a complicated history leading up to the Apollo ranging system. Development of Radar (RAdio Detection And Ranging) started in the 1920s. Radar became critically important in World War II, and to counteract jamming, spread-spectrum ideas were applied. The WHYN missile ranging system (1946) used correlation detection to determine the phase difference and thus range. The Federal Telecommunications Laboratories investigated noise-like signals for communication in 1948. This unusual system stored noise values optically on a rotating disk; curiously, the original source of random numbers was the Manhattan telephone directory. The NOMAC (NOise Modulation And Correlation) system (1952) explored thermal noise as a carrier for communication. The CODORAC (either COded DOppler Ranging And Command or COded Doppler RAdar Command) system at JPL (1952) used similar electronics and became the basis for the Deep Space Instrumentation Facility. It introduced a phase-locked loop (PLL) to cancel out Doppler variations, as well as linear-feedback shift register sequences for "pseudonoise". JPL's continuing research led to the ranging system used for Apollo, as well as the Space Ground Link Subsystem (SGLS) (1966), which is still used by the US Air Force. See The Origins of Spread-Spectrum Communications for more. 

  4. One complication is that the spacecraft is moving very rapidly during the ranging process, up to 10,000 meters per second. This creates two problems. First, the distance measurement will be out of date by the time it is completed. Second, the bit alignment is shifting many times per second, which makes it hard to find the correlation. The solution was to use separate clocks for the sending and receiving circuitry. The receiver clock was locked onto the signal received from the spacecraft. (Due to Doppler shift, this frequency would be slightly different.) By generating the codes from this clock, the codes stayed exactly aligned even as the spacecraft moved. Integrating the difference between the two clocks provided the change in distance since the start of the measurement. Thus, the computed range remained locked to the distance at the start of the measurement, and the Doppler data gave the incremental distance change. Combining these provided the correct range, continuously updated. 

  5. The pseudorandom sequence must be carefully designed so non-matches can be clearly distinguished from matches, even in the presence of noise. Specifically, misalignments should have as many mismatched bits as possible. The LFSR and Legendre sequences provided this property. In contrast, the sequence 0000011111 would be a bad choice, since if it is shifted by 1, most of the bits still match. Moreover, the number of mismatches should be constant, regardless of the shift. Otherwise, the correlation may match against a local peak rather than the correct shift.

    For more about the mathematics of correlating codes, see Sequences with Small Correlation 

  6. The majority function maj(A,B,C) matches the A value 75% of the time (and likewise for B or C), allowing the correlation to be detected. (In contrast, A⊕B⊕C would be a bad choice since it only matches A 50% of the time, so the correlation is essentially random.) For a quick analysis of the majority function, consider maj(A,B,C) where A is 1 and the other inputs have four possibilities: maj(1,0,0)=0, maj(1,0,1)=1, maj(1,1,0)=1, and maj(1,1,1)=1. The result matches a for all except the first case, i.e. 75% of the time. (If a is 0, the analysis is similar.) The majority function can be expressed in Boolean logic as A·B+B·C+A·C, i.e. true if any two inputs are set. 

  7. The idea of the Legendre sequence is that some numbers have an integer "square root" modulo 11 (technically a quadratic residue), and some do not. For instance, 52 = 25 ≡ 3 modulo 11, so in a sense 5 is the square root of 3. If you take the numbers 0 through 10 and assign 1 if the number has a "square root" and 0 otherwise, you get the X codes. (The ranging sequence is slightly different from the mathematical Legendre sequence, which uses -1 for a "quadratic nonresidue" instead of 0.) 

  8. The modulo equations were solved using the Chinese remainder theorem, developed by the Chinese mathematician Sunzi Suanjing in the third century. The ranging system used pre-computed numbers that canceled out for all except one sub-code. For instance, 992124 is congruent to 1 modulo 11, but congruent to 0 modulo 31, 63, and 127. Multiplying this number by the X offset provides the X sub-code's contribution to the range.

    The Chinese remainder theorem constants are
    992124 ≡ 1 (mod 11) for the X subcode
    1408176 ≡ 1 (mod 31) for the A subcode
    736219 ≡ 1 (mod 63) for the B subcode
    2320164 ≡ 1 (mod 127) for the C subcode
    These numbers are ≡ 0 modulo the other cases. Thus, the total offset T = 992124×X + 1408176×A + 736219×B + 2320164×C (mod 5,456,682).

    The numbers can be obtained by a straightforward algorithm, described in Appendix E of the NASA document. The numbers were hard-coded into the ranging hardware by a "Chinese Number Generator", logic gates that provided the numbers in serial form for serial addition. 

  9. The Honeywell DDP-116 minicomputer (1965) also used T-Pac modules. Honeywell acquired Computer Control Co., the manufacturer of T-Pacs, in 1966 to expand their digital capability. The earlier Honeywell DDP-19 was a curious computer since it used 19-bit words; it was built with S-Pac modules. The later µ-Pac modules used integrated circuits in place of transistors. 

  10. Note that the Doppler system can only measure the spacecraft's velocity towards or away from Earth. Perpendicular motion would not show up. 

  11. I haven't been able to find any explanation for the specific 240/221 frequency ratio between the transmitted and received frequencies. The two frequencies needed to be different so the transmitted signal doesn't overpower the received frequency. The ratio should be fairly close to 1, though, so the system can be optimized for a particular frequency band. The ratio should be reasonably small integers so frequency multipliers can be used. But the mystery to me is why 240/221 instead of, say, 12/11, which is almost the same but much easier to generate. My current theory is that the larger ratio avoided collisions between harmonics that would otherwise occur. (e.g. 12×f1 = 11×f2, which might distort the received signal?) 

  12. The code delay and the Doppler shift aren't independent, but are really two aspects of the same thing. For example, suppose the spacecraft is moving away at 15 meters/second. This will stretch the radio waves, decreasing the frequency. The 1 MHz pseudorandom signal transmitted to the spacecraft will return with each 1-microsecond interval stretched by 0.1 picoseconds.13 In 10 seconds, the ranging system will transmit 10 million pulses, so the Doppler stretching will cause the pulse train to be 1 microsecond longer, one pulse. The other way of looking at this is that after 10 seconds, the spacecraft will be 150 meters further away, increasing the round-trip signal delay by 1 microsecond. The signal delay from the Doppler shift is the same as the signal delay from the change in range because they are both caused by the spacecraft's motion. 

  13. Note that the signal transmitted from the ground will be Doppler-shifted when received by the spacecraft, and then the spacecraft's signal will be Doppler-shifted again when received on Earth, so the total Doppler shift is doubled. 

  14. The block diagram below shows the main components of the transponder. It consists of two phase-modulation transmitter/receivers (for redundancy). At the bottom is the FM transmitter for television signals (not redundant). The VCO (voltage-controlled oscillator) is phase-locked to the input signal, but multiplies the frequency by 240/221. This multiplication is done through several frequency multipliers and mixers. The ranging signal is extracted, phase-modulated at the new frequency, and sent to the antenna for transmission back to Earth.

    Block diagram of the transponder. (Click for a larger version.)

    Block diagram of the transponder. (Click for a larger version.)

     

  15. The transponder multiplies the received carrier frequency by 240/221. The Command Module used an uplink frequency of 2106.4 MHz, multiplied by 240/221 for the downlink frequency of 2287.5 MHz, although this can vary by ±120 kHz due to Doppler shift. The Lunar Module and Saturn rocket used an uplink carrier of 2101.8 MHz, multiplied by 240/221 for the downlink carrier of 2282.5 MHz.

    The 240/221 ratio was obtained by a complex process of mixers and frequency multipliers, as shown in the block diagram above. A voltage-controlled oscillator (VCO) in the transponder, ran at a frequency of approximately 19.0625 MHz and formed part of the phase-locked loop. The VCO signal was multiplied by 108 (12×9 in two steps) and mixed with the uplink signal received from Earth. The mixer yielded a signal whose frequency was the difference between the ground signal and the multiplied VCO signal, approximately 47.65625 MHz. A second mixer mixed this signal with two times the VCO frequency, yielding a signal at the difference frequency 9.53125 MHz. The VCO was then phase-locked to this signal to produce an output at twice its frequency, yielding the 19.0625 MHz VCO frequency. Putting this all into an equation:
    (Uplink - 108×VCO - 2×VCO)×2 = VCO
    which simplifies to VCO frequency = Uplink frequency * 2 / 221.

    Meanwhile, the transponder's transmitter multiplied the VCO frequency by 4 to yield a 76.25 MHz carrier. This was multiplied by 30 to yield the 2287.5 MHz downlink frequency. In other words, the VCO frequency was multiplied by 120 to transmit, so the transmitting frequency is 240/221 × Uplink frequency. Thus, the 240/221 frequency ratio is established by multiple frequency multipliers and mixers.

    But how was frequency multiplication implemented? Based on other Apollo circuits, I think the signal was fed into a step recovery diode, a special diode with very fast switching. This turned each input cycle into a sharp pulse, full of harmonics at multiples of the input frequency. A resonant network concentrated the energy at the desired harmonic, and then a filter removed unwanted frequency sidebands. The frequency of the resulting signal was at the desired multiple of the input signal. This process is described in Hewlett-Packard application note 920 (1968).  

  16. The ground stations used Univac 642B computers, a 30-bit computer with 32-kilowords of magnetic core storage. The computer was designed for military real-time applications. This computer was a key component of the Naval Tactical Data System, a groundbreaking 1960s system to manage combat information on US Navy ships.

    The Univac 624B computer. From Apollo Unified S-Band System.

    The Univac 624B computer. From Apollo Unified S-Band System.

     

  17. For details on the S-band system and ground stations, see Apollo Unified S-Band System. See A study of the JPL Mark I ranging subsystem for a detailed discussion of the ranging hardware. 

The core memory inside a Saturn V rocket's computer

The Launch Vehicle Digital Computer (LVDC) had a key role in the Apollo Moon mission, guiding and controlling the Saturn V rocket. Like most computers of the era, it used core memory, storing data in tiny magnetic cores. In this article, I take a close look at an LVDC core memory module from Steve Jurvetson's collection. This memory module was technologically advanced for the mid-1960s, using surface-mount components, hybrid modules, and flexible connectors that made it an order of magnitude smaller and lighter than mainframe core memories.2 Even so, this memory stored just 4096 words of 26 bits.1

A core memory module from the LVDC. This module stored 4K words of 26 data bits and 2 parity bits. It weighs 2.3 kg (5.1 pounds) and measures about 14 cm×14 cm×16 cm (5½"×5½"×6"). Click on any photo for a larger version.

A core memory module from the LVDC. This module stored 4K words of 26 data bits and 2 parity bits. It weighs 2.3 kg (5.1 pounds) and measures about 14 cm×14 cm×16 cm (5½"×5½"×6"). Click on any photo for a larger version.

The race to the Moon started on May 25, 1961 when President Kennedy stated that America would land a man on the Moon before the end of the decade. This mission required the three-stage Saturn V rocket, the most powerful rocket ever built. The Saturn V was guided and controlled by the Launch Vehicle Digital Computer3 (below), from liftoff into Earth orbit, and then on a trajectory towards the Moon. (The Apollo spacecraft separated from the Saturn V rocket at that point, ending the LVDC's role.)

The LVDC mounted in a support frame. The round connectors are visible on the front side of the computer. There are 8 electrical connectors and two connectors for liquid cooling. Photo courtesy of IBM.

The LVDC mounted in a support frame. The round connectors are visible on the front side of the computer. There are 8 electrical connectors and two connectors for liquid cooling. Photo courtesy of IBM.

The LVDC was just one of several computers onboard the Apollo mission. The LVDC was connected to the Flight Control Computer, a 100-pound analog computer. The Apollo Guidance Computer (AGC) guided the spacecraft to the Moon's surface. The Command Module contained one AGC while the Lunar Module contained a second AGC7 along with the Abort Guidance System, an emergency backup computer.

Multiple computers were onboard an Apollo mission. The Launch Vehicle Digital Computer (LVDC) is the one discussed in this blog post.

Multiple computers were onboard an Apollo mission. The Launch Vehicle Digital Computer (LVDC) is the one discussed in this blog post.

Unit Logic Devices (ULD)

The LVDC was built with an interesting hybrid technology called ULD (Unit Logic Devices). Although they superficially resembled integrated circuits, ULD modules contained multiple components. They used simple silicon dies, each implementing just one transistor or two diodes. These dies, along with thick-film printed resistors, were mounted on a half-inch-square ceramic wafer to implement a circuit such as a logic gate. These modules were a variant of the SLT (Solid Logic Technology) modules developed for IBM's popular S/360 series of computers. IBM started developing SLT modules in 1961, before integrated circuits were commercially viable, and by 1966 IBM produced over 100 million SLT modules a year.

ULD modules were considerably smaller than SLT modules, as shown in the photo below, making them more suitable for a compact space computer.4 ULD modules used ceramic packages instead of SLT's metal cans, and had metal contacts on the upper surface instead of pins. Clips on the circuit board held the ULD module in place and connected with these contacts.5 The LVDC and associated hardware used more than 50 different types of ULDs.

SLT modules (left) are considerably larger than ULD modules (right). A ULD module is 7.6 mm × 8 mm.

SLT modules (left) are considerably larger than ULD modules (right). A ULD module is 7.6 mm × 8 mm.

The photo below shows the internal components of a ULD module. On the left, the circuit traces are visible on the ceramic wafer, connected to four tiny square silicon dies. While this looks like a printed circuit board, keep in mind that it is much smaller than a fingernail. On the right, the black rectangles are thick-film resistors printed onto the underside of the wafer.

Top and underside of a ULD showing the silicon dies and resistors. While SLT modules had resistors on the upper surface, ULD modules had resistors underneath, increasing the density but also the cost. From IBM Study Report Figure III-11.

Top and underside of a ULD showing the silicon dies and resistors. While SLT modules had resistors on the upper surface, ULD modules had resistors underneath, increasing the density but also the cost. From IBM Study Report Figure III-11.

The microscope photo below shows a silicon die from a ULD module that implements two diodes.6 The die is very small; for comparison, grains of sugar are displayed next to the die. The die had three external connections through copper balls soldered to the three circles. The two lower circles were doped (darker regions) to form the anodes of the two diodes, while the upper-right circle was the cathode, connected to the substrate. Note that this die is much less complex than even a basic integrated circuit.

Photo of a two-diode silicon die next to sugar crystals. This photo is a composite of top-lighting to show the die details, with back-lighting to show the sugar.

Photo of a two-diode silicon die next to sugar crystals. This photo is a composite of top-lighting to show the die details, with back-lighting to show the sugar.

How core memory works

Core memory was the dominant form of computer storage from the 1950s until it was replaced by semiconductor memory chips in the 1970s. Core memory was built from tiny ferrite rings called cores, storing one bit in each core by magnetizing the core either clockwise or counterclockwise. A core was magnetized by sending a pulse of current through wires threaded through the core. The magnetization could be reversed by sending a pulse in the opposite direction.

To read the value of a core, a current pulse flipped the core to the 0 state. If the core was in the 1 state previously, the changing magnetic field created a voltage in a sense wire threaded through the cores. But if the core was already in the 0 state, the magnetic field wouldn't change and the sense wire wouldn't pick up a voltage. Thus, the value of the bit in the core was read by resetting the core to 0 and testing the sense wire. An important characteristic of core memory was that the process of reading a core destroyed its value, so it needed to be re-written.

Using a separate wire to flip each core would be impractical, but in the 1950s a technique called "coincident-current" was developed that used a grid of wires to select a core. This depended on a special property of cores called hysteresis: a small current has no effect on a core, but a current above a threshold would magnetize the core. This allowed a grid of X and Y lines to select one core from the grid. By energizing one X line and one Y line each with half the necessary current, only the core where both lines crossed would get enough current to flip leaving the other cores unaffected.

Closeup of an IBM 360 Model 50 core plane. The LVDC and Model 50 used the same type of cores, known as 19-32 because their
inner diameter was 19 mils and their outer diameter was 32 mils (0.8 mm).
While this photo shows three wires through each core, the LVDC used four wires.

Closeup of an IBM 360 Model 50 core plane. The LVDC and Model 50 used the same type of cores, known as 19-32 because their inner diameter was 19 mils and their outer diameter was 32 mils (0.8 mm). While this photo shows three wires through each core, the LVDC used four wires.

The photo below shows one core plane from the LVDC's memory.8 This plane has 128 X wires running vertically and 64 Y wires running horizontally, with a core at each intersection. For reading, a single sense wire runs through all the cores parallel to the Y wires. For writing, a single inhibit wire (explained below) runs through all the cores parallel to the X wires. The sense wires cross over in the middle of the plane; this reduces induced noise because noise from one half of the plane cancels out noise from the other half.

One core plane for the LVDC's memory, holding 8192 bits. Connections to the core plane are made through the pins around the outside. From Smithsonian National Air and Space Museum.

One core plane for the LVDC's memory, holding 8192 bits. Connections to the core plane are made through the pins around the outside. From Smithsonian National Air and Space Museum.

The plane above had 8192 locations, each storing a single bit. To store a word of memory, multiple core planes were stacked together, one plane for each bit in the word. The X and Y select lines were wired to zig-zag through all the core planes, in order to select a bit of the word from each plane. Each plane had a separate sense line for reading, and a separate inhibit line for writing. The LVDC memory used a stack of 14 core planes (below), storing a 13-bit "syllable" along with a parity bit.10

The LVDC core stack consists of 14 core planes. This stack is at the US Space & Rocket Center. Photo from NCAR EOL. I retouched the photo to reduce distortion from the plastic case.

The LVDC core stack consists of 14 core planes. This stack is at the US Space & Rocket Center. Photo from NCAR EOL. I retouched the photo to reduce distortion from the plastic case.

Writing to core memory required additional wires called the inhibit lines. Each plane had one inhibit line threaded through all the cores in the plane. In the write process, a current passed through the X and Y lines, flipping the selected cores (one per plane) to the 1 state, storing all 1's in the word. To write a 0 in a bit position, the plane's inhibit line was energized with half current, opposite to the X line. The currents canceled out, so the core in that plane would not flip to 1 but would remain 0. Thus, the inhibit line inhibited the core from flipping to 1. By activating the appropriate inhibit lines, any desired word could be written to the memory.

To summarize, a core memory plane had four wires through each core: X and Y drive lines, a sense line, and an inhibit line. These planes were stacked to form an array, one plane for each bit in the word. By energizing an X line and a Y line, one core in each plane was selected. The sense line was used to read the contents of the bit, while the inhibit line was used to write a 0 (by inhibiting the writing of a 1).9

The LVDC core memory module

In this section, I'll explain how the LVDC core memory module was physically constructed. At its center, the core memory module contains the stack of 14 core planes shown earlier. This is surrounded by multiple boards with the circuitry to drive the X and Y select lines and the inhibit lines, read the bits from the sense lines, detect errors, and generate necessary clock signals.11

An exploded view of the memory module showing the key components.
An MIB (Multilayer Interconnection Board) is a 12-layer printed circuit board.
From Saturn V Guidance Computer Progress Report Fig 2-43.

An exploded view of the memory module showing the key components. An MIB (Multilayer Interconnection Board) is a 12-layer printed circuit board. From Saturn V Guidance Computer Progress Report Fig 2-43.

Memory Y driver panel

A word in core memory is selected by driving the appropriate X and Y lines through the core stack. I'll start by describing the Y driver circuitry and how it generates a signal through one of the 64 Y lines. Instead of having 64 separate driver circuits, the module reduces the amount of circuitry by using 8 "high" drivers and 8 "low" drivers. These are wired up in a "matrix" configuration so each combination of a high driver and a low driver selects a different line. Thus, the 8 high drivers and 8 low drivers select one of the 64 (8×8) Y lines. The footnote12 has more information on the matrix technique.

The Y driver board (front) drives the Y select lines in the core stack.

The Y driver board (front) drives the Y select lines in the core stack.

The closeup view below shows some of the ULD modules (white) and transistor pairs (golden) that drive the Y select lines. The "EI" module is the heart of the driver; it supplies a constant voltage pulse (E) or sinks a constant current pulse (I) through a select line.14 A select line is driven by activating an EI module in voltage mode at one end of the line and an EI module in current mode at the other end. The result is a pulse with the correct voltage and current to flip the core. It takes a hefty pulse to flip a core; the voltage pulse is fixed at 17 volts, while the current is adjusted from 180 mA to 260 mA depending on the temperature.13

Closeup of the Y driver board showing six ULD modules and six transistor pairs. Each ULD module is labeled with an IBM part number, the module type (e.g. "EI"), and an unknown code.

Closeup of the Y driver board showing six ULD modules and six transistor pairs. Each ULD module is labeled with an IBM part number, the module type (e.g. "EI"), and an unknown code.

The board also has error-detector (ED) modules that detect if more than one Y select line is driven at the same time. Implementing this with digital logic would require a complicated set of gates to detect if two or more of the 8 inputs are high. Instead, the ED module uses a simple semi-analog design: it sums the input voltages using a resistor network. If the resulting voltage is above a threshold, the output is triggered.

A diode matrix is underneath the driver board, containing 256 diodes and 64 resistors. This matrix converts the 8 high and 8 low pairs of signals from the driver board into connections to the 64 Y lines that pass through the core stack. Flex cables on the top and bottom of the board connect the board to the diode matrix. Two flex cables on the left (not visible in the photo) and two flex cables on the right (one visible) connect the diode matrix to the core stack.15 The flex cable visible on the left connects the Y board to the rest of the computer via the I/O board (described later) while a small flex cable on the lower right connects to the clock board.

Memory X driver panel

The circuitry to drive the X lines is similar to the Y circuitry, except there are 128 X lines compared to 64 Y lines. Because there are twice as many X lines, the module has a second X driver board underneath the one visible below. Although the X and Y boards have the same components, the wiring is different.

This board and the similar one underneath drive the X select lines in the core stack.

This board and the similar one underneath drive the X select lines in the core stack.

The closeup below shows that the board has suffered some component damage. One of the transistors has been dislodged, a ULD module has been broken in half, and the other ULD module is cracked. The wiring is visible inside the broken module as well as one of the tiny silicon dies (on the right). This photo also shows vertical and horizontal circuit board traces on several of the board's 12 layers.

A closeup of the X driver board showing some damaged circuitry.

A closeup of the X driver board showing some damaged circuitry.

Underneath the X driver boards is the X diode matrix, containing 288 diodes and 128 resistors. The X diode matrix uses a different topology than the Y diode board to avoid doubling the number of components.16 Like the Y diode board, this board contains components mounted vertically between two printed circuit boards. This technique is called "cordwood" and allows the components to be packed together closely.

Closeup of X diode matrix showing diodes mounted vertically using cordwood construction between two printed circuit boards. The two X driver boards are above the diode board, separated from it by foam. Note how the circuit boards are packed very closely together.

Closeup of X diode matrix showing diodes mounted vertically using cordwood construction between two printed circuit boards. The two X driver boards are above the diode board, separated from it by foam. Note how the circuit boards are packed very closely together.

Memory sense amplifiers

The photo below shows the sense amplifier board on top of the module.17 It has 7 channels to read 7 bits from the memory stack; an identical board below processes another 7 bits, for 14 bits in total. The job of the sense amplifier is to detect the small signal (20 millivolts) generated by a flipping core, and turn it into a 1-bit output. Each channel consists of a differential amplifier and buffer, followed by a differential transformer and an output latch. At the left, the 28-conductor flex cable connects to the memory stack, feeding the two ends of each sense wire into the amplifier circuitry, starting with an MSA-1 (Memory Sense Amplifier) module. The discrete components are resistors (brown cylinders), capacitors (red), transformers (black), and transistors (golden). The data bits exit the sense amplifier boards through the flex cable on the right.

The sense amplifier board on top of the memory module. This board amplifies the signals from the sense wires to produce the output bits.

The sense amplifier board on top of the memory module. This board amplifies the signals from the sense wires to produce the output bits.

Memory inhibit drivers

The inhibit board is on the underside of the core module and holds the inhibit drivers that are used for writing to memory. There are 14 inhibit lines, one for each plane in the core stack. To write a 0 bit, the corresponding inhibit driver is activated and the current through the inhibit line prevents the core from flipping to a 1. Each line is driven by an ID-1 and ID-2 (Inhibit Driver) module and a pair of transistors. The high-precision 20.8Ω resistors at the top and bottom of the board regulate the inhibit current. The 14-wire flex cable on the right connects the drivers to the 14 inhibit wires in the core stack.

The inhibit board on the bottom of the memory module. This board generates the 14 inhibit signals used during writing.

The inhibit board on the bottom of the memory module. This board generates the 14 inhibit signals used during writing.

Memory clock driver

The clock driver is a pair of boards that generate the timing signals for the memory module. Once the computer starts a memory operation, the various timing signals used by the memory module are generated asynchronously by the module's clock driver. The clock driver boards are on the bottom of the module, between the core stack and the inhibit board so it is hard to see the boards.

The clock driver boards are below the core memory stack but above the inhibit board.

The clock driver boards are below the core memory stack but above the inhibit board.

The photo above looks between the clock driver boards; the inhibit board is on the bottom. The blue components are multi-turn potentiometers, presumably to adjust timings or voltages. Resistors and capacitors are also visible on the boards. The schematic shows several MCD (Memory Clock Driver) modules, but I can't see any modules on the boards. I don't know if that is due to the limited visibility, a change in the circuitry, or another board with these modules.

Memory input-output panel

The final board of the memory module is the input-output panel (below), which distributes signals between the boards of the memory module and the remainder of the LVDC computer. At the bottom, the green 98-pin connector plugs into the LVDC's memory chassis, providing signals and power from the computer. (Much of the connector's plastic is broken, exposing the pins.) The distribution board is linked to this connector by two 49-pin flex cables at the bottom (only the front cable is visible). Other flex cables distribute signals to the X-driver board (left), the Y-driver board (right), the sense amplifier board (top), and inhibit board (underneath). The 20 capacitors on the board filter the power supplied to the memory module.

The input-output board is the interface between the memory module and the rest of the computer. The green connector at the bottom plugs into the computer, and these signals are routed through flat cables to other parts of the memory module. This board also has filter capacitors.

The input-output board is the interface between the memory module and the rest of the computer. The green connector at the bottom plugs into the computer, and these signals are routed through flat cables to other parts of the memory module. This board also has filter capacitors.

Conclusion

The LVDC's core memory module provided compact, reliable storage for the computer. The lower half of the computer (below) was filled by up to 8 core memory modules. This allowed the computer to hold a total of 32 kilowords of 26-bit words, or 16 kilowords in redundant high-reliability "duplex" mode.18

The LVDC held up to eight core memory modules. Photo at US Space & Rocket Center, courtesy of Mark Wells.

The LVDC held up to eight core memory modules. Photo at US Space & Rocket Center, courtesy of Mark Wells.

The core memory module provides an interesting view of a time when 8K of storage required a 5-pound module. While this core memory was technologically advanced for its time, the hybrid ULD modules were rapidly obsoleted by integrated circuits. Core memory as a whole died out in the 1970s with the advent of semiconductor DRAMs.

The contents of core memory are retained when the power is disconnected, so it's likely that the module still holds the software from when the computer was last used, even decades later. It would be interesting to try to recover this data, but the damaged circuitry poses a problem so the contents will probably remain locked inside the memory module for decades more.

I announce my latest blog posts on Twitter, so follow me @kenshirriff for future articles. I also have an RSS feed. For an explanation of core memory, see CuriousMarc's video where we wire up a core plane and demonstrate how it works. I've written before about core memory in the IBM 1401, core memory in the Apollo Guidance Computer, and core memory in the IBM S/360. Thanks to Steve Jurvetson for supplying the core array.

Notes and references

  1. A word size of 26 bits may seem bizarre, but in the 1960s computers hadn't yet standardized on bytes and word sizes that were a power of two. Business computers often used 6-bit characters, while aerospace computers typically used whatever word size provided the necessary accuracy. 

  2. It's interesting to compare the size of the LVDC's core memory to IBM's commercial core memories, which I wrote about here. The 128-kilobyte expansion for the IBM S/360 Model 40 computer required an additional cabinet weighing 610 pounds and measuring 62.5"×26"×60". An LVDC core memory module holds 4K words of 26 bits, equivalent to 13 kilobytes. Doing the math, the LVDC has 1/12 the weight and 1/40 the volume per byte. The core stack itself was very similar between the LVDC and the S/360 machines; the difference in weight and volume comes from the surrounding electronics and packaging. 

  3. For more information on the LVDC, see the Virtual AGC project's LVDC page. Also see the interesting SmarterEveryDay video on the LVDC. Fran Blanche did an extensive investigation into an LVDC circuit board. 

  4. The SLT modules in my photograph are mounted on an SMS card, rather than the expected SLT card. SMS cards were IBM's previous generation of circuit cards and normally used discrete germanium transistors. However, even after the introduction of SLT in 1964, IBM needed to support older computers with SMS cards. To reduce costs, they started building old-style SMS cards that used the more modern SLT modules. The point is that SLT modules were usually packed densely on multiple-layer circuit boards, rather than the low-density SMS card in the photo. 

  5. One question is why did IBM use SLT modules instead of integrated circuits? The main reason was that integrated circuits were still in their infancy, having been invented in 1959. In 1963, SLT modules had cost and performance advantages over integrated circuits. However, SLT modules were viewed outside IBM as backward compared to integrated circuits. One advantage of SLT modules over integrated circuits was that the resistors in SLT were much more accurate than those in integrated circuits. During manufacturing, the thick-film resistors in SLT modules were carefully sand-blasted to remove resistive film until they had the desired resistance. SLT modules were also cheaper than comparable integrated circuits in the 1960s. By 1969, IBM started using integrated circuits, which they called MST (Monolithic Systems Technology). IBM packaged their integrated circuits in SLT-style metal packages, rather than the industry-standard DIP epoxy packages. Chapter 2 of IBM's 360 and Early 370 Systems discusses the history of SLT modules in great detail. 

  6. Curiously, the ULD modules in the core memory did not contain any sealant inside. In contrast, the ULD modules examined by Fran Blanche were filled with pink silicone inside. 

  7. It's interesting to compare the AGC to the LVDC since they took two very different approaches to computer design and manufacture. Both computers had rectangular metal boxes, magnesium-lithium for the LVDC and magnesium for the AGC. Physically, the LVDC was about twice the size (2.2 cubic feet vs 1.1 cubic feet) even though they were both about 70 pounds. The LVDC used 138 Watts and was liquid-cooled, while the AGC used 55 watts and was cooled by conduction. The LVDC used 26-bit words compared to 15 bits in the AGC. One big architectural difference was that the LVDC was a serial computer, operating on one bit at a time, while the AGC operated on all bits in parallel (like most computers). Another important difference was that the LVDC used triple redundancy for reliability, while the AGC had no hardware fault handling. Both computers used a 2.048 MHz clock, but the LVDC was considerably slower because it was serial: 82 µs for an add operation compared to 23.4 µs for the AGC. The LVDC had up to 8 core memory modules, holding 4K words each. The AGC's core memory was only 2K words. However, the AGC also had 36K words of read-only storage in its hardwired core rope modules. (The LVDC did not use core rope.)

    The two computers were constructed in very different ways. The AGC was built from integrated circuits, while the LVDC used hybrid ULD modules. The AGC's logic gates were RTL (resistor-transistor logic) NOR gates, while the LVDC's were slightly more advanced DTL (diode-transistor logic) AND-OR-INVERT gates. While the AGC used two types of ICs (a dual NOR gate and a sense amplifier), the LVDC used many different types of modules.

    The AGC's circuit boards were encapsulated into rectangular modules, while the LVDC's circuit boards plugged into a backplane in a more standard way. The AGC's backplane was wire-wrapped by machine, while the LVDC's backplane was a 14-layer printed circuit board.

    IBM engaged in political battles, attempting to replace MIT's AGC with the LVDC. IBM argued that the AGC wasn't reliable enough compared to the triple-redundant LVDC. According to MIT, however, the AGC could run a guidance program 10 to 20 times faster than the LVDC, use half the memory, and provide more accuracy (by using double precision). MIT argued that the LVDC wasn't powerful enough to replace the AGC. In the end, the AGC survived the "naysayers" and was used on the Apollo spacecraft, while the LVDC had its role in the Saturn V rocket. The "showdown" is described in more detail here.  

  8. The Smithsonian website states that the core plane is approximately 4"×7"×1", but that can't be right since the entire memory module is less than 7" wide. The Study Report page 3-43 says each plane is 5.5"×3.5×0.15", which seems accurate. 

  9. The book Memories That Shaped an Industry discusses the history of core memory at IBM. 

  10. The LVDC has 26-bit words, each word consisting of two 13-bit syllables. Its core memory is described as holding 4K words, where each word is 26 data bits and 2 parity bits. However, the core memory is physically constructed to store 8K syllables (13 data bits and 1 parity bit). Thus, two memory accesses are required to read a complete word. An instruction is one 13-bit syllable so an instruction can be read in a single memory cycle. Thus, executing a typical instruction requires three memory accesses: one for the instruction and two for the data. (Keep in mind that reading from core memory erases the data, so a memory access consists of a read followed by a write to restore the data.) 

  11. Much of the memory-related circuitry is in the LVDC's computer logic, not the memory module itself. In particular, the computer's logic contains registers to hold the address and data word and convert between serial and parallel. It also contains circuitry to decode the address into drive lines, as well as to generate and check parity. 

  12. Core memories typically used a "matrix" approach to reduce the number of circuits required to drive the X and Y select lines. The diagram below demonstrates this technique for the vertical lines in a hypothetical 9×5 core array. There are three "high" drivers (A, B and C), and three "low" drivers (1, 2 and 3). If driver B is energized positive and driver 1 is energized negative, current flows through the core line highlighted in red. By selecting a different pair of drivers, a different line is energized. In a large array, this approach significantly reduces the number of line drivers required.

    The "matrix" approach reduces the number of line drivers required.

    The "matrix" approach reduces the number of line drivers required.

    When using a matrix approach, each line must have diodes to prevent "sneak paths" through the cores. To see the need for diodes, note that in the example above current could flow from B to 2, up to A and finally down to 1, for instance, incorrectly energizing multiple lines and flipping the wrong cores. By putting diodes on each line, reverse current paths such as 2 to A can be blocked. Also note that writing core memory requires current pulses in the opposite direction from reading. Supporting this requires additional diodes in the opposite direction. 

  13. Because the characteristics of ferrite cores change with temperature, the memory module adjusts the current based on temperature, from 260 mA at 10 °C to 180 mA at 70 °C. A sensor in the stack detects the temperature, causing a TCV regulator (Temperature Controlled Voltage) to generate a voltage ranging from 6 V at 10 °C to 4 V at 70 °C. The TCV control voltage is fed into each EI module, causing the current to drop 1.33 mA per °C.  

  14. It's unclear why the driver boards use EI modules as well as ID-2 (Inhibit Driver) modules, since a separate board implements the inhibit drivers. The earlier schematics show just the EI modules. (See Laboratory Maintenance Instructions for LVDC Vol. II (1965) page 10-164 for the schematics.) The inhibit driver is similar to the current sink in the EI driver, so I suspect the ID-2 module is being used to boost the current.  

  15. For reference, this footnote provides details of the Y driver signal routing. There are 8 high drive signals and 8 low drive signals generating the 64 Y select lines through the core stack. However, the current through the select line needs to go both ways, so cores can be flipped both directions. Thus, the drive signals are in pairs, one from the "E" side (voltage source) of the EI chip and one from the "I" side (current sink). These 32 signals go from the driver board to the diode matrix through two 16-wire flat cables. The diode board is connected to 64 Y select lines, but each line has two ends. These 128 connections are through four 32-wire flat cables, two on the left and two on the right. The two cables connected to the front side of the diode matrix wrap around to the far side of the stack, while the two cables connected to the back side of the diode matrix go to the near side of the stack. Thus, alternating select lines go through the stack in opposite directions. 

  16. The X and Y diode matrices use a different wiring topology. There are 64 Y lines through the core stack. They are matrixed with 8 drivers at one end and 8 at the other end. The Y board has a diode pair (electrically) at each end of the 64 Y lines, so it has 256 diodes and 128 wires to the Y lines. (Because a line needs to be driven in either direction, one diode is required in each direction, making a pair at each end.)

    On the other hand, there are 128 X lines through the core stack, matrixed with 16 drivers at one end and 8 at the other end. To avoid doubling the number of diodes used, the X board only has a diode pair at one end of each of the 128 X lines. At the other end, groups of 8 X lines are tied together directly, forming 16 groups with one diode pair is used for each group. Thus, there are 256 diodes in the matrix, as well as 32 diodes associated with the 16 groups. As far as wires between the diode matrix and the core stack, there are 128 wires for the diode-connected end, and 32 wires corresponding to the grouped end. See Figures 10-42 and 10-43 in the Laboratory Maintenance Instructions for LVDC Vol. II (1965) for schematics.

    The X driver board is connected to other boards and the core stack through multiple flex cables. The cable on the right links the driver board to the rest of the computer via the I/O board. The top edge of the board has a 24-wire flex cable to the diode matrix, with a second 24-wire cable at the bottom. At the bottom, another smaller flex cable receives signals from the timing board underneath the core stack. The flex cables between the diode matrices and the core stack are not visible: there is a 16-wire cable and a 64-wire cable to the stack at the top and similar cables at the bottom.

    There is an important difference between the X and Y wiring. The four flat cables between the X diode matrix and the core planes went vertically, from the top and bottom of the matrix. The flat cables from the Y diode matrix went horizontally, from the sides of the matrix. In this way, the X and Y cables were attached to orthogonal sides of the core planes, connecting to the orthogonal X and Y wires. 

  17. A special handle was produced to insert, remove, or carry the memory module. Because the memory modules were delicate and mounted with little clearance, this tool was developed to manipulate the module safely. This handle slides over the four shoulder screws on top of the module and latches into place.

    The special carrying handle for the memory module. From Laboratory Maintenance Instructions for LVDC Vol. II page 4-5.

    The special carrying handle for the memory module. From Laboratory Maintenance Instructions for LVDC Vol. II page 4-5.

     

  18. One interesting feature of the LVDC was that memory modules could be mirrored for reliability. In "duplex" mode, each word was stored in two memory modules. If one module had an error, the correct word could be retrieved from the other module. While this provided reliability, it cut the memory capacity in half. Alternatively, memory modules could be used in "simplex" mode, with each word stored once.

    Note that the LVDC's circuitry was triply-redundant to detect and correct errors. However, memory only needed to be doubly redundant because parity indicated which value was incorrect. The LVDC used odd parity. Odd parity had the advantage that parity would catch a word that was stuck all 0's or all 1's. One interesting feature of the simplex and duplex memory modes is that the software could switch between them while running, even setting separate modes for instructions and data. This allowed some words to be stored in simplex mode while more important words were stored in duplex mode. However, it appears that in actual use, the entire memory would be duplexed rather than specific parts. 

A computer built from NOR gates: inside the Apollo Guidance Computer

We recently restored an Apollo Guidance Computer1, the computer that provided guidance, navigation, and control onboard the Apollo flights to the Moon. This historic computer was one of the first to use integrated circuits and its CPU was built entirely from NOR gates.2 In this blog post, I describe the architecture and circuitry of the CPU.

Architecture of the Apollo Guidance Computer

The Apollo Guidance Computer with the two trays separated. The tray on the left holds the logic circuitry built from NOR gates. The tray on the right holds memory and supporting circuitry.

The Apollo Guidance Computer with the two trays separated. The tray on the left holds the logic circuitry built from NOR gates. The tray on the right holds memory and supporting circuitry.

The Apollo Guidance Computer was developed in the 1960s for the Apollo missions to the Moon. In an era when most computers ranged from refrigerator-sized to room-sized, the Apollo Guidance Computer was unusual—small enough to fit onboard the Apollo spacecraft, weighing 70 pounds and under a cubic foot in size.

The AGC is a 15-bit computer. It may seem bizarre to have a word size that isn't a power of two, but in the 1960s before bytes became popular, computers used a wide variety of word sizes. In the case of the AGC, 15 bits provided sufficient accuracy to land on the moon (using double- and triple-precision values as needed), so 16 bits would have increased the size and weight of the computer unnecessarily.4

The Apollo Guidance Computer has a fairly basic architecture, even by 1960s standards. Although it was built in the era of complex, powerful mainframes, the Apollo Guidance Computer had limited performance; it is more similar to an early microprocessor in power and architecture.3 The AGC's strengths were its compact size and extensive real-time I/O capability. (I'll discuss I/O in another article.)5

The architecture diagram below shows the main components of the AGC. The parts I'll focus on are highlighted. The AGC has a small set of registers, along with a simple arithmetic unit that only does addition. It has just 36K words of ROM (fixed memory) and 2K words of RAM (erasable memory). The "write bus" was the main communication path between the components. Instruction decoding and the sequence generator produced the control pulses that directed the AGC.

Block diagram of the Apollo Guidance Computer. From Space Navigation Guidance and Control, R-500, VI-14.

Block diagram of the Apollo Guidance Computer. From Space Navigation Guidance and Control, R-500, VI-14.

About half of the architecture diagram is taken up by memory, reflecting that in many ways the architecture of the Apollo Guidance Computer was designed around its memory. Like most computers of the 1960s, the AGC used core memory, storing each bit in a tiny ferrite ring (core) threaded onto a grid of wires. (Because a separate physical core was required for every bit, core memory capacity was drastically smaller than modern semiconductor memory.) A property of core memory was that reading a word from memory erased that word, so a value had to be written back to memory after each access. The AGC also had fixed (ROM), the famous core ropes used for program storage where bits were physically woven into the wiring pattern (below). (I've written about the AGC's core memory and core rope memory in detail.)

Detail of core rope memory wiring from an early (Block I) Apollo Guidance Computer. Photo from Raytheon.

Detail of core rope memory wiring from an early (Block I) Apollo Guidance Computer. Photo from Raytheon.

NOR gates

The Apollo Guidance Computer was one of the very first computers to use integrated circuits. These early ICs were very limited; the AGC's chips (below)2 contained just six transistors and eight resistors, implementing two 3-input NOR gates.

Die photo of the dual 3-input NOR gate used in the AGC. The ten bond wires around the outside of the die connect to the IC's external pins. Photo by Lisa Young, Smithsonian.

Die photo of the dual 3-input NOR gate used in the AGC. The ten bond wires around the outside of the die connect to the IC's external pins. Photo by Lisa Young, Smithsonian.

The symbol for a NOR gate is shown below. It is a very simple logic gate: if all inputs are low, the output is high. It might be surprising that NOR gates are sufficient to build a computer, but NOR is a universal gate: you can make any other logic gate out of NOR gates. For instance, wiring the inputs of a NOR gate together forms an inverter. Putting an inverter on the output of a NOR gate produces an OR gate. Putting inverters on the inputs of a NOR gate produces an AND gate.6 More complex circuits, such as flip flops, adders, and counters can be built from these gates.

The NOR gate generates a 1 output if all inputs are 0. If any input is a 1 (or multiple inputs), the NOR gate generates a 0 output.

The NOR gate generates a 1 output if all inputs are 0. If any input is a 1 (or multiple inputs), the NOR gate generates a 0 output.

One building block that appears frequently in the AGC is the set-reset latch. This simple circuit is built from two NOR gates and stores one bit of data: the set input stores a 1 bit and the reset input stores a 0 bit. In more detail, a 1 pulse on the set input turns the top NOR gate off and the bottom one on, so the output is a 1. A 1 pulse on the reset input does the opposite so the output is a 0. If both inputs are 0, the latch remembers its previous state, providing storage. The next section will show how the latch circuit is used to build registers.

A set-reset latch built from two NOR gates. If one NOR gate is on, it forces the other one off. The overbar on the top output indicates that it is the complement of the lower output.

A set-reset latch built from two NOR gates. If one NOR gate is on, it forces the other one off. The overbar on the top output indicates that it is the complement of the lower output.

The registers

The Apollo Guidance Computer has a small set of registers to store values temporarily outside of core memory. The main register is the accumulator (A), which is used in many arithmetic operations. The AGC also has a program counter register (Z), arithmetic unit registers (X and Y), a buffer register (B), return address register (Q)7, and a few others. For memory accesses, the AGC has a memory address register (S) and a memory buffer register (G) for data. The AGC also has some registers that reside in core memory, such as I/O counters.

The following diagram outlines the register circuitry for the AGC, simplified to a single bit and two registers (Q and Z). Each register bit has a latch (flip-flop), using the circuit described earlier (blue and purple). Data is transmitted both to and from the registers on the write bus (red). To write to a register, the latch is first reset by a clear signal (CQG or CZG, green). A "write service" gate signal (WQG or WZG, orange) then allows the data on the write bus to set the corresponding register latch. To read a register, a "read service" gate signal (RQG or RZG, cyan) passes the latch's output through the write amplifier to the write bus, for use by other parts of the AGC. The complete register circuitry is more complex, with multiple 16-bit registers, but follows this basic structure.

Simplified diagram of AGC register structure, showing one bit of the Q and Z registers. (Source)

Simplified diagram of AGC register structure, showing one bit of the Q and Z registers. (Source)

The register diagram illustrates three key points. First, the register circuitry is built from NOR gates. Second, data movement through the AGC centers on the write bus. Finally, the register actions (like other AGC actions) depend on specific control signals at the right time; the "control" section of this post will discuss how these signals are generated.

The arithmetic unit

Most computers have an arithmetic logic unit (ALU) that performs arithmetic and Boolean logic operations. Compared to most computers, the AGC's arithmetic unit is very limited: the only operation it performs is addition of 16-bit values, so it's called an arithmetic unit, not an arithmetic logic unit. (Despite its limited arithmetic unit, the AGC can perform a variety of arithmetic and logic operations including multiplication and division, as explained in the footnote.9)

The schematic below shows one bit of the AGC's arithmetic unit. The full adder (red) computes the sum of two bits and a carry. In particular, the adder sums the X bit, Y bit, and carry-in, generating the sum bit (sent to the write bus) and carry bit. The carry is passed to the next adder, allowing adders to be combined to add longer words.8)

Schematic of one bit in the AGC's arithmetic unit. (Based on AGC handbook p214.)

Schematic of one bit in the AGC's arithmetic unit. (Based on AGC handbook p214.)

The X register and Y register (purple and green) provide the two inputs to the adder. These are implemented with the NOR-gate latch circuits described earlier. The circuitry in blue writes a value to the X or Y register as specified by the control signals. This circuitry is fairly complex since it allows constants and shifted values to be stored in the registers, but I won't go into the details. Note the "A2X" control signal that gates the A register value into the X register; it will be important in the following discussion.

The photo below shows the physical implementation of the AGC's circuitry. This module implements four bits of the registers and arithmetic unit. The flat-pack ICs are the black rectangles; each module has two boards with 60 chips each, for a total of 240 NOR gates. The arithmetic unit and registers are built from four identical modules, each handling four bits; this is similar to a bit-slice processor.

The arithmetic unit and registers are implemented in four identical modules. Each module implements 4 bits. The modules are installed in slots A8 through A11 of the AGC.

The arithmetic unit and registers are implemented in four identical modules. Each module implements 4 bits. The modules are installed in slots A8 through A11 of the AGC.

Executing an instruction

This section illustrates the sequence of operations that the AGC performs to execute an instruction. In particular, I'll show how an addition instruction, ADS (add to storage), takes place. This instruction reads a value from memory, adds it to the accumulator (A register), and stores the sum in both the accumulator and memory. This is a single machine instruction, but the AGC performs many steps and many values move back and forth to accomplish it.

Instruction timing is driven by the core memory subsystem. In particular, reading a value from core memory erases the stored value, so a value must be written back after each read. Also, when accessing core memory there is a delay between when the address is set up and when the data is available. The result is that each memory cycle takes 12 time steps to perform first a read and then a write. Each time interval (T1 to T12) takes just under one microsecond, and the full memory cycle takes 11.7µs, called a Memory Cycle Time (MCT).

The erasable core memory module from the Apollo Guidance Computer. This module holds 2 kilowords of memory, with a tiny ferrite core storing each bit. To read memory, high-current pulses flip the magnetization of the cores, erasing the word.

The erasable core memory module from the Apollo Guidance Computer. This module holds 2 kilowords of memory, with a tiny ferrite core storing each bit. To read memory, high-current pulses flip the magnetization of the cores, erasing the word.

The MCT is the basic time unit for instruction execution. A typical instruction requires two memory cycles: one memory access to fetch the instruction from memory, and one memory access to perform the operation.13 Thus, a typical instruction requires two MCTs (23.4µs), yielding about 43,000 instructions per second. (This is extremely slow compared to modern processors performing billions of instructions per second.)

Internally, the Apollo Guidance Computer processes instructions by breaking an instruction into subinstructions, where each subinstruction takes one memory cycle For example, the ADS instruction consists of two subinstructions: the ADS0 subinstruction (which does the addition) and the STD2 subinstruction (which fetches the next instruction, and is common to most instructions). The diagram below shows the data movement inside the AGC to execute the ADS0 subinstruction. The 12 times steps are indicated left to right.

Operations during the ADS0 (add to storage) subinstruction. Arrows show important data movement. Based on the manual.

Operations during the ADS0 (add to storage) subinstruction. Arrows show important data movement. Based on the manual.

The important steps are:
T1: The operand address is copied from the instruction register (B) to the memory address register (S) to start a memory read.
T4: The operand is read from core memory to the memory data register (G).
T5: The operand is copied from (G) to the adder (Y). The accumulator value (A) is copied to the adder (X).
T6: The adder computes the sum (U), which is copied to the memory data register (G).
T8: The program counter (Z) is copied to the memory address register (S) to prepare for fetching the next instruction from core memory.
T10: The sum in the memory data register (G) is written back to core memory.
T11: The sum (U) is copied to the accumulator (A).

Even though this is a simple add instruction, many values are moved around during the 12 time intervals. Each of these actions has a control signal associated with it; for instance, the signal A2X at time T5 causes the accumulator (A) value to be copied to the X register. Copying the G register to the Y register takes two control pulses: RG (read G) and WY (write Y). The next section will explain how the AGC's control unit generates the appropriate control signals for each instruction, focusing on these A2X, RG, and WY control pulses needed by ADS0 at time T5.

The control unit

As in most computers, the AGC's control unit decodes each instruction and generates the control signals that tell the rest of the processor (the datapath) what to do. The AGC is designed with microcoded approach, but the control signals are generated from a hardwired control unit built from NOR gates. Specifically, there are no microinstructions and the AGC does not have a control store holding microcode (which would have taken too much physical space).12You can think of the AGC as implementing the microcode ROM with highly-optimized logic gates.

The heart of the AGC's control unit is called the crosspoint generator. Conceptually, the crosspoint generator takes the subinstruction and the time step, and generates the control signals for that combination of subinstruction and time step. (You can think of the crosspoint generator as a grid with subinstructions in one direction and time steps in the other, with control signals assigned to each point where the lines cross.) For instance, going back to the ADS0 subinstruction, at time T5 the crosspoint generator would generate the A2X, RG, and WY control pulses, causing the desired data movement.

The crosspoint generator required a lot of circuitry and was split across three modules; this is module A6. Note the added wires to modify the circuitry. This is an earlier module used for ground testing; modules in flight did not have these wires.

The crosspoint generator required a lot of circuitry and was split across three modules; this is module A6. Note the added wires to modify the circuitry. This is an earlier module used for ground testing; modules in flight did not have these wires.

For efficiency, the implementation of the control unit is highly optimized. Instructions with similar behavior are combined and processed together by the crosspoint generator to reduce circuitry. For instance, the AGC has a "Double-precision Add to Storage" instruction (DAS). Since this is roughly similar to performing two single-word adds, the DAS1 subinstruction and ADS0 subinstruction share logic in the crosspoint generator. The schematic below shows the crosspoint generator circuitry for time T5, highlighting the logic for subinstruction ADS0 (using the DAS1 signal). For instance, the 5K signal is generated from the combination of DAS1 and T5.

Crosspoint circuit for signals generated at time T5. With negative inputs, these NOR gates act as AND gates, detecting a particular subinstruction AND T05. From Apollo Lunar Excursion Manual.

Crosspoint circuit for signals generated at time T5. With negative inputs, these NOR gates act as AND gates, detecting a particular subinstruction AND T05. From Apollo Lunar Excursion Manual.

But what are the 5K and 5L signals? These are another optimization. Many control pulses often occur together, so instead of generating all the control pulses directly, the crosspoint generates intermediate crosspoint signals. For instance, 5K generates both the A2X and RG control pulses, while 5L generates the WY control pulse. The diagram below shows how the A2X signal is generated: any of 8 different signals (including 5K) generate A2X.15 Similar circuits generate the other control pulses. These optimizations reduced the size of the crosspoint generator, but it was still large, split across three modules in the AGC.

The A2X control signal is generated from multiple "crosspoint pulses" from the crosspoint generator. The different possibilities are ORed together. From manual, page 4-351.

The A2X control signal is generated from multiple "crosspoint pulses" from the crosspoint generator. The different possibilities are ORed together. From manual, page 4-351.

To summarize, the control unit is responsible for telling the rest of the CPU what to do in order to execute an instruction. Instructions are first decoded into subinstructions. The crosspoint generator creates the proper control pulses for each time interval and subinstruction, telling the AGC's registers, arithmetic unit, and memory what to do.14

Conclusion

This has been a whirlwind tour of the Apollo Guidance Computer's CPU. To keep it manageable, I've focused on the ADS addition instruction and a few of the control pulses (A2X, RG, and WY) that make it operate. Hopefully, this gives you an idea of how a computer can be built from components as primitive as NOR gates.

The most visible part of the architecture is the datapath: arithmetic unit, registers, and the data bus. The AGC's registers are built from simple NOR-gate latches. Even though the AGC's arithmetic unit can only do addition, the computer still manages to perform a full set of operations including multiplication and division and Boolean operations.9

However, the datapath is just part of the computer. The other critical component is the control unit, which tells the data path components what to do. The AGC uses an approach centered around a crosspoint generator, which uses highly-optimized hardwired logic to generate the right control pulses for a particular subinstruction and time interval.

Using these pieces, the Apollo Guidance Computer provided guidance, navigation, and control onboard the Apollo missions, making the Moon landings possible. The AGC also provided a huge boost to the early integrated circuit industry, using 60% of the United States' IC production in 1963. Thus, modern computers owe a lot to the AGC and its simple NOR gate components.

The Apollo Guidance Computer running in Marc's lab, hooked up to a vintage Tektronix scope.

The Apollo Guidance Computer running in Marc's lab, hooked up to a vintage Tektronix scope.

CuriousMarc has a series of AGC videos which you should watch for more information on the restoration project. I announce my latest blog posts on Twitter, so follow me @kenshirriff for future articles. I also have an RSS feed. Thanks to Mike Stewart for supplying images and extensive information.

Notes and references

  1. The AGC restoration team consists of Mike Stewart (creator of FPGA AGC), Carl Claunch, Marc Verdiell (CuriousMarc on YouTube) and myself. The AGC that we're restoring belongs to a private owner who picked it up at a scrapyard in the 1970s after NASA scrapped it. 

  2. In addition to the NOR-gate logic chips, the AGC used a second type of integrated circuit for its memory circuitry, a sense amplifier. (The earlier Block I Apollo Guidance Computer used NOR gate ICs that contained a single NOR gate.) 

  3. How does the AGC stack up to early microprocessors? Architecturally, I'd say it was more advanced than early 8-bit processors like the 6502 (1975) or Z-80 (1976), since the AGC had 15 bits instead of 8, as well as more advanced instructions such as multiplication and division. But I consider the AGC less advanced than the 16-bit Intel 8086 (1978) which has a larger register set, advanced indexing, and instruction queue. Note, though, that the AGC was in a class of its own as far as I/O, with 227 interface circuits connected to the rest of the spacecraft.

    Looking at transistor counts, the Apollo Guidance Computer had about 17,000 transistors in total in its ICs, which puts it between the Z80 microprocessor (8,500 transistors) and the Intel 8086 (29,000 transistors).

    As far as performance, the AGC did a 15-bit add in 23.4μs and a multiply in 46.8μs. The 6502 took about 3.9μs for an 8-bit add (much faster, but a smaller word). Implementing an 8-bit multiply loop on the 6502 might take over 100μs, considerably worse than the AGC. The AGC's processor cycle speed of 1.024 MHz was almost exactly the same as the Apple II's 1.023 MHz clock, but the AGC took 24 cycles for a typical instruction, compared to 4 on the 6502. The big limitation on AGC performance was the 11.7μs memory cycle time, compared to 300 ns for the Apple II's 4116 DRAM chips.  

  4. An AGC instruction fit into a 15-bit word and consisted of a 3-bit opcode and a 12-bit memory address. Unfortunately, both the opcode and memory address were too small, resulting in multiple workarounds that make the architecture kind of clunky.

    The AGC's 15-bit instructions included a 12-bit memory address which could only address 4K words. This was inconvenient since the AGC had 2K words of core RAM and 36K words of core rope ROM. To access this memory with a 12-bit address, the AGC used a complex bank-switching scheme with multiple bank registers. In other words, you could only access RAM in 256-word chunks and ROM in somewhat larger chunks.

    The AGC's instructions had a 3-bit opcode field, which was too small to directly specify the AGC's 34 instructions. The AGC used several tricks to specify more opcodes. First, an EXTEND instruction changed the meaning of the following instruction, allowing twice as many opcodes but wasting a word. Also, some AGC opcodes didn't make sense if performed on a ROM address (such as incrementing), so four different instructions ("quartercode instructions") could share an opcode field. Instructions that act on peripherals only use 9 address bits, freeing up 3 additional bits for opcode use. This allows, for instance, Boolean operations (AND, OR, XOR) to fit into the opcode space, but they can only access peripheral addresses, not main memory addresses.

    The AGC also used some techniques to keep the opcode count small. For example, it had some "magic" memory locations such as the "shift right register". Writing to this address performed a shift; this avoided a separate opcode for "shift right".

    The AGC also had some instructions that wedged multiple functions into a single instruction. For instance, the "Transfer to Storage" instruction not only transferred a value to storage, but also checked the overflow flag and updated the accumulator and skipped an instruction if there had been an arithmetic overflow. Another complex instruction was "Count, Compare, and Skip", which loaded a value from memory, decremented it, and did a four-way branch depending on its value. See AGC instruction set for details. 

  5. For more on the AGC's architecture, see the Virtual AGC and the Ultimate Apollo Guidance Computer Talk

  6. The NAND gate also has the same property of being a universal gate. (In modern circuits, NAND gates are usually more popular than NOR gates for technical reasons.) The popular NAND to Tetris course describes how to build up a computer from NAND gates, ending with an implementation of Tetris. This approach starts by building a set of logic gates (NOT, AND, OR, XOR, multiplexer, demultiplexer) from NAND gates. Then larger building blocks (flip flop, adder, incrementer, ALU, register) are built from these gates, and finally a computer is built from these building blocks. 

  7. Modern computers usually have a stack that is used for subroutine calling and returning. However, the AGC (like many other computers of its era) didn't have a stack, but stored the return address in a link register (the AGC's Q register). To use recursion, a programmer would need to implement their own stack. 

  8. A carry-skip circuit improves the performance of the adder. The problem with binary addition is that propagating a carry through all the bits is slow. For example, if you add 111111111111111 + 1, the carry from the low-order bit gets added to the next bit. This generates a carry which propagates to the next bit, and so forth. This "ripple carry" causes the addition to be essentially one bit at a time. To avoid this problem, the AGC uses a carry-skip circuit that looks at groups of four bits. If there is a carry in, and each position has at least one bit set, there is certain to be a carry, so a carry-out is generated immediately. Thus, propagating a carry is approximately three times as fast. (With groups of four bits, you'd expect four times as fast, but the carry-skip circuit has its own overhead.) 

  9. You might wonder how the AGC performs a variety of arithmetic and logic operations if the arithmetic unit only supports addition. Subtraction is performed by complementing one value (i.e. flipping the bits) and then adding. Most computers have a complement circuit built into the ALU, but the AGC is different: when the B register is read, it can provide either the value or the complement of the stored value.10 So to subtract a value, the value is stored in the B register and then the complement is read out and added.

    What about Boolean functions? While most computers implement Boolean functions with logic circuitry in the ALU, the Apollo Guidance Computer manages to implement them without extra hardware. The OR operation is implemented through a trick of the register circuitry. By gating two registers onto the write bus at the same time, a 1 from either register will set the bus high, yielding the OR of the two values. AND is performed using the formula A ∧ H = ~(~A ∨ ~H); complementing both arguments, doing an OR, and then complementing the result yields the AND operation. XOR is computed using the formula A ⊕ H = ~(A ∨ ~H) ∨ ~(H ∨ ~A), which uses only complements and ORs. It may seem inefficient to perform so many complement and OR operations, but since the instruction has to take 12 time intervals in any case (due to memory timing), the multiple steps don't slow down the instruction.

    Multiplication is performed by repeated additions, subtractions, and shifts using a Radix-4 Booth algorithm that operates two bits at a time. Division is performed by repeated subtractions and shifts.11 Since multiply and divide require multiple steps internally, they are slower than other arithmetic instructions. 

  10. Since a latch has outputs for both a bit and the complement of the bit, it is straightforward to get the complemented value out of a latch. Look near the bottom of the schematic to see the B register's circuitry that provides the complemented value. 

  11. The AGC's division algorithm is a bit unusual. Instead of subtracting the divisor at each step, a negative dividend / remainder is used through the division and the divisor is added. (This is essentially the same as subtracting the divisor, except everything is complemented.) See Block II Machine Instructions section 32-158 for details. 

  12. The AGC doesn't use microcode but confusingly some sources say it was microprogrammed. The book "Journey to the Moon" by Eldon Hall (creator of the AGC) says:

    The instruction selection logic and control matrix was a microprogrammed instruction sequence generator, equivalent to a read-only memory implemented in logic. Outputs of the microprogrammed memory were a sequence of control pulses that were logic products of timing pulses, tests of priority activity, instruction code, and memory address.

    This doesn't make sense, since the whole point of microprogramming is to use read-only memory instead of hardwired control logic. (See A brief history of microprogramming, Computer architecture: A quantitative approach section 5.4, or Microprogramming: principles and practices.) Perhaps Hall means that the AGC's control was "inspired" by microprogramming, using a clearly-stated set of sequenced control signals with control hardware separated from the data path (like most modern computers, hardwired or microcoded). (In contrast, in many 1950s computers (like the IBM 1401) each instruction's circuitry generated its own ad hoc control signals.)

    By the way, implementing the AGC in microcode would have required about 8 kilobytes of microcode (79 control pulses for about 70 subinstructions with 12 time periods. This would have been impractical for the AGC, especially when you consider that microcode storage needs to be faster than regular storage.  

  13. While instructions typically used two subinstructions, there were exceptions. Some instructions, such as multiply and divide, required multiple subinstructions because they took many steps. On the other hand, the jump instruction (TC) used a single subinstruction since fetching the next instruction was the only task to do. 

  14. Other processors use different approaches to generate control signals. The 6502 and many other early microprocessors decoded instructions with a Programmable Logic Array (PLA), a ROM-like way of implementing AND-OR logic. The Z-80 used a PLA, followed by logic very similar to the crosspoint generator to generate the right signals for each time step. Many computers use microcode, storing the sequence of control steps explicitly in ROM. Since minimizing the number of chips in the AGC was critical, optimizing the circuitry was more important than using a clean, structured approach.

    Die photo of the 6502 microprocessor. The 6502 used a PLA and random logic for the control logic, which occupies over half the chip. Note the regular, grid-like structure of the PLA. Die photo courtesy of Visual 6502.

    Die photo of the 6502 microprocessor. The 6502 used a PLA and random logic for the control logic, which occupies over half the chip. Note the regular, grid-like structure of the PLA. Die photo courtesy of Visual 6502.

     

  15. Each subinstruction's actions at each time interval are described in the manual. The control pulses are described in detail in the manual. (The full set of control pulses for ADS0 are listed here.)