Ken Shirriff's blog: June 2021

Deep dive into how the Teensy microcontroller interacts with the Arduino library

The Arduino language lets you program microcontrollers at a high level, controlling I/O pins without worrying about exactly how the microcontroller works. But what's really going on behind the scenes? For my current project, I'm using a Teensy 3.6,1 a development board packaged in a breadboard-compatible 48-pin module that is considerably smaller than a classic Arduino.2 The Teensy uses a fairly powerful microcontroller, a 32-bit ARM processor running at 180 megahertz, and it is (mostly) compatible with the Arduino programming environment. I wanted to understand the low-level hardware better, so I investigated the implementation of one of the Arduino functions. Specifically, this post explains exactly how the analogWrite() function works in the Teensy 3.6. Disclaimer: this blog post goes into excessive detail on an obscure subject, so feel free to stop reading now :-)

An Arduino (top) and Teensy 3.6 (bottom).

analogWrite(): creating a PWM output

The Arduino IDE lets you quickly create an application using functions that abstract away the microcontroller's implementation details. In comparison, if you program a microcontroller directly, its hardware functions are activated by accessing special memory locations that act as control registers. There may be thousands of registers, different for each microcontroller, and described in thousand-page manuals, so programming a microcontroller directly can be daunting.

Using the Arduino library, you can put a voltage on an output pin with the analogWrite(pin, value) function. You specify a value between 0 and 256, where 0 is completely off and 256 is completely on, and the library takes care of the details. For instance, analogWrite(pin, 64) produces an output value of 25% (i.e. 64/256). You might expect this would produce an analog voltage at 25% of the maximum, but despite the function's name, the output is not analog. Instead it is a digital pulse-width modulated (PWM) signal, which averages out to the desired value.3 As the oscilloscope trace below shows, the output switches between full-on and full-off, remaining on 25% of the time.4 Even though it doesn't produce a true analog output, the analogWrite function is useful for many tasks, such as controlling LED brightness.

Oscilloscope output showing the output from analogWrite().

The diagram below shows how the output changes with different analogWrite values, from 0 (completely off) to 256 (completely on). The main point is that the output is really digital, with a larger input parameter causing the output to be on for a larger fraction of the time. This technique is called Pulse Width Modulation (PWM), since the width of the pulse changes with the input.

Examples of different analogWrite values, from 0 to 256.

The diagram below illustrates how the microcontroller produces the PWM output. Internally, a timer repeatedly counts from 0 to 255, generating a counter value. Each time the timer starts at 0, the output is set high. When the timer matches the specified value (64 in this case), the output goes low. Thus, the match value controls how long the output remains high in each cycle; the larger the value, the longer the output remains high. The timer increments every 8 microseconds, so the total cycle length is 2048 microseconds, yielding a frequency of about 490 Hz.

A PWM output is implemented by a timer and a match value.
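The timer-and-match scheme can be modeled in a few lines of C. This is only a sketch of the conceptual 8-bit timer described above, not the Teensy's hardware or library code; the function names and the PWM_TICKS and TICK_US constants are my own, with the 8 µs tick taken from the text.

```c
#include <stdint.h>

#define PWM_TICKS 256u  /* conceptual timer counts 0..255 */
#define TICK_US   8u    /* microseconds per count, from the text */

/* Output level for one counter value: high until the counter
   reaches the match value, low for the rest of the cycle. */
int pwm_level(uint32_t counter, uint32_t match)
{
    return counter < match ? 1 : 0;
}

/* Walk through one full cycle and total the time spent high. */
uint32_t pwm_high_time_us(uint32_t match)
{
    uint32_t high_ticks = 0;
    for (uint32_t counter = 0; counter < PWM_TICKS; counter++) {
        high_ticks += (uint32_t)pwm_level(counter, match);
    }
    return high_ticks * TICK_US;
}
```

With a match value of 64, this model keeps the output high for 512 µs out of each 2048 µs cycle, i.e. 25% of the time.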

The analogWrite function is sufficient for most purposes, but how does it work at the microcontroller register level? The manual for the Teensy's MK66FX1M0 processor explains how the chip's registers work, but is 2237 pages long. (I've extracted the relevant bits and give references to manual sections if you want to know more.) The code for the Teensy implementation of analogWrite is in a file called pins_teensy.c. Because the code supports multiple processors, it is full of #ifdefs; the Teensy 3.6 code is selected by the __MK66FX1M0__ and KINETISK defines,5 specifying the processor type and family. The code contains a bunch of case statements to handle all the different types of PWM pins. I'm using pin 30 in my example, which is defined in that file as FTM2_CH1_PIN (FlexTimer 2 Channel 1 pin). (I'll explain below why this timer is pin 30.)

The code to handle that pin is:

cval = ((uint32_t)val * (uint32_t)(FTM2_MOD + 1)) >> analog_write_res;
FTM2_C1V = cval;
FTM_PINCFG(FTM2_CH1_PIN) = PORT_PCR_MUX(3) | PORT_PCR_DSE | PORT_PCR_SRE;

As you can see, this code is much more complex than the analogWrite() call. In brief, the first line computes the counter value (match value) at which the output should go to 0. The second line stores this value into the timer's channel value register. The third line configures pin 30 for the timer output. Next, I'll explain each of these lines in more detail.

The first line handles the difference between the conceptual timer (counting from 0 to 255) and the physical implementation of the timer, which is 16 bits and counts at a much higher rate. To match the Arduino's PWM frequency (490 Hz), the Teensy timer counts to 61439. This line scales the input value (0 to 256) to the desired range (0 to 61440). Specifically, the hardware register FTM2_MOD (timer 2 modulo) holds 61439, the value that this timer counts to.6 Multiplying the input value by 61440 and dividing by 256 scales the input value to the new range. (The value 8 for analog_write_res indicates 8 bits of count resolution, i.e. 256.)7
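To make the scaling concrete, here is the same arithmetic as a standalone function that can be run on a PC. It is a sketch: the function name and parameters are mine, with mod standing in for the contents of FTM2_MOD and res_bits for analog_write_res.

```c
#include <stdint.h>

/* Scale an analogWrite-style value to the timer's range.
   mod is the timer's maximum count (61439 in the text) and
   res_bits is the resolution in bits (8 by default). */
uint32_t scale_to_timer(uint32_t val, uint32_t mod, uint32_t res_bits)
{
    /* Multiply first, then shift, so no precision is lost. */
    return (val * (mod + 1u)) >> res_bits;
}
```

For example, scale_to_timer(64, 61439, 8) gives 15360, a quarter of 61440, and an input of 256 gives the full 61440.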

The next line of code stores this value into timer 2's Channel 1 Value register FTM2_C1V,8 which controls the pulse width. This register holds the "match value"; when the timer counter reaches this value, the output drops to 0.

The third line configures pin 30 for the output from the timer. The FTM_PINCFG macro handles pin configuration, which in this case updates the configuration for pin 30 (CORE_PIN30_CONFIG).11 The PORT_PCR_MUX(3) macro selects the pin's function from the pin multiplexer, which I'll explain in the next section.10 The PORT_PCR_DSE option sets Drive Strength Enable, enabling high-current output. The PORT_PCR_SRE option sets Slew Rate Enable, slowing the pin's slew rate (how fast it changes value).9 These values are combined and stored in the appropriate bit fields of the Pin Control Register, shown below. (The macros ensure that each value goes into the right position.)

This diagram shows how multiple fields are packed into a Pin Control Register. (From Section 12.5.1 of the manual.)
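As a rough illustration of how the fields pack together, the sketch below builds the register value from three stand-in macros. The EX_ names are hypothetical, and the bit positions (MUX in bits 10-8, DSE in bit 6, SRE in bit 2) are my reading of the register layout; the real definitions are the PORT_PCR_ macros in kinetis.h.

```c
#include <stdint.h>

/* Stand-ins for the PORT_PCR_* macros (assumed bit positions). */
#define EX_PCR_MUX(n) ((uint32_t)(((n) & 7u) << 8)) /* pin function select */
#define EX_PCR_DSE    ((uint32_t)0x40)              /* drive strength enable */
#define EX_PCR_SRE    ((uint32_t)0x04)              /* slew rate enable */

/* Combine the three fields into one Pin Control Register value. */
uint32_t pcr_for_pwm_pin(void)
{
    return EX_PCR_MUX(3) | EX_PCR_DSE | EX_PCR_SRE;
}
```

Under these assumptions the three fields OR together to 0x344: 0x300 for the multiplexer, 0x40 for drive strength, and 0x04 for slew rate.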

Filling in the macros, the original analogWrite(30, 64) call becomes:

*(uint32_t *)0x400B8018 = 15360;
*(uint32_t *)0x4004A04C = 0x344;

Thus, in the end, the analogWrite call turns into two stores to microcontroller registers.

Determining the pin and its function

Pin configuration is more complex than you might expect. The problem is that the processor chip has 144 pins (in a 12×12 grid), but the microcontroller provides a much larger number of functions. The solution is that each pin has up to 8 different multiplexed functions, and you can select one of these functions for each pin. Thus, you can't use all the features of the chip at the same time, but hopefully you can use the features you need.

The chip has a 12×12 grid of solder balls on the bottom. (Photo from Digi-Key.)

In the example I'm using GPIO pin 30, but this pin number is part of the Arduino API: the microcontroller has no pin 30. So how does pin 30 get a meaning? In this section, I explain how pin 30 maps onto a physical pin of the microcontroller (pin D11 in this case) associated with a PWM timer (FlexTimer 2 channel 1 in this case).

The function of each Teensy pin is documented, but I wanted to figure out "from scratch" what GPIO pin 30 means. Looking at the schematic shows the Teensy's pin 30 is connected to pin D11 of the processor, which is labeled "PTB19". (Processor pins are labeled with a letter and number corresponding to the pin's grid position.)

Detail of the Teensy 3.6 schematic showing microcontroller pin D11 is connected to Teensy GPIO pin 30.

Chapter 11 of the manual lists the names and functions for each pin (excerpted below). As mentioned earlier, each physical pin supports multiple functions. Pin D11 has the official name "PTB19" and has seven different functions assigned to it: Touch Screen, GPIO PorT B, CAN bus, FlexTiMer FTM2_CH1 (that we're using), I2S audio, FlexBus, and FlexTiMer 2 Quadrature Decoder.

This excerpt from the manual shows the functions that can be assigned to pin D11.

Each pin has a multiplexer (MUX) that selects which function is assigned to the pin. In order to use the timer with pin D11, the pin configuration register (PCR) for D11 must be configured to assign function 3 to this pin. This was done with the macro discussed earlier, PORT_PCR_MUX(3). Thus, when an analogWrite is performed, the pin is configured to use the appropriate timer.
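The multiplexer selection is just a small bit field inside the pin control register, so it can be read or changed with masks and shifts. The helpers below are a sketch with names of my own choosing; they assume a 3-bit field (for up to 8 functions) located at bits 10-8 of the register.

```c
#include <stdint.h>

#define EX_MUX_SHIFT 8u  /* assumed position of the MUX field */
#define EX_MUX_MASK  7u  /* 3 bits: up to 8 functions per pin */

/* Read which function number a register value selects. */
uint32_t pcr_mux_select(uint32_t pcr)
{
    return (pcr >> EX_MUX_SHIFT) & EX_MUX_MASK;
}

/* Return the register value with a different function selected,
   leaving the other configuration bits unchanged. */
uint32_t pcr_set_mux(uint32_t pcr, uint32_t function)
{
    pcr &= ~(EX_MUX_MASK << EX_MUX_SHIFT);
    return pcr | ((function & EX_MUX_MASK) << EX_MUX_SHIFT);
}
```

Applied to the value 0x344 from the analogWrite example, pcr_mux_select() returns 3, the function number assigned to FTM2_CH1 on this pin.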

Initialization

Another piece necessary to make this work is the Teensy's initialization code. The main routine in main.cpp calls _init_Teensyduino_internal_(), which performs the necessary register initialization. The timer 2 initialization code is

FTM2_CNT = 0;
FTM2_MOD = DEFAULT_FTM_MOD;
FTM2_C0SC = 0x28;
FTM2_C1SC = 0x28;
FTM2_SC = FTM_SC_CLKS(1) | FTM_SC_PS(DEFAULT_FTM_PRESCALE);

This sets the initial counter value to 0 and sets the modulo value (maximum count) to 61439 as discussed earlier. The FTM2_C0SC and FTM2_C1SC lines enable PWM mode. The FTM2_SC line sets up the timer clock.12
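The timing that results from these settings can be checked with a little arithmetic. The sketch below uses the figures given in the notes (a 60 MHz timer input clock, a prescale setting of 1, and a modulo of 61439); the function names are mine, and I'm assuming the prescaler divides the clock by 2 raised to the prescale setting.

```c
#include <stdint.h>

/* Timer tick rate after the prescaler (divide by 2^prescale). */
uint32_t timer_hz(uint32_t bus_hz, uint32_t prescale)
{
    return bus_hz >> prescale;
}

/* PWM period in microseconds for a timer that counts 0..mod. */
uint32_t pwm_period_us(uint32_t bus_hz, uint32_t prescale, uint32_t mod)
{
    uint64_t ticks = (uint64_t)mod + 1u;  /* 61440 counts per cycle */
    return (uint32_t)(ticks * 1000000u / timer_hz(bus_hz, prescale));
}
```

With these numbers the timer ticks at 30 MHz and one cycle takes 2048 µs, which is about 488 cycles per second (the "490 Hz" mentioned above, rounded).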

The last piece is how the code knows the processor type. To support multiple processor types, the files are full of #ifdefs, but where do these get defined? The answer is that the board type and CPU speed are set in the Arduino IDE. The IDE uses these settings to generate flags that are passed to the compiler when compiling the code. The relevant lines for the Teensy 3.6 are in the file hardware/teensy/avr/boards.txt:

teensy36.build.flags.defs=-D__MK66FX1M0__ -DTEENSYDUINO=153
teensy36.menu.speed.180.build.fcpu=180000000
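To show how such a flag steers the code, here is a toy example of #ifdef selection. It is not taken from the Teensy sources: the #define at the top stands in for the compiler's -D__MK66FX1M0__ flag so the fragment can be compiled on its own, and the EX_PROCESSOR macro and processor_name() function are made up for illustration.

```c
/* Stand-in for the -D__MK66FX1M0__ compiler flag. */
#ifndef __MK66FX1M0__
#define __MK66FX1M0__
#endif

/* Processor-specific code is chosen at compile time. */
#if defined(__MK66FX1M0__)
#define EX_PROCESSOR "MK66FX1M0 (Teensy 3.6)"
#else
#define EX_PROCESSOR "unknown processor"
#endif

/* Returns the name chosen by the preprocessor. */
const char *processor_name(void)
{
    return EX_PROCESSOR;
}
```

Only the branch matching the flag is compiled; the other branch never reaches the compiler proper.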

Conclusion

At this point we've reached the foundation. To summarize, the board that you select in the Arduino IDE causes various flags to be passed to the C++ compiler. These flags, in turn, select numerous definitions of registers for that processor, along with the appropriate code. The result is that a function call such as analogWrite(30, 64), acting on an abstract pin 30, gets converted to operations on special microcontroller registers, causing the microcontroller's circuitry to output the desired signal.

It may seem like magic that high-level operations end up doing the right thing across a wide range of microcontrollers, but this is one of the key accomplishments of the Arduino ecosystem. If you really need to know what's going on, I've shown how these abstractions can be unwrapped. But for the most part, the complexity underneath can fortunately be ignored.

I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed. I wrote about Arduino PWM and its registers in detail here if you want to know more about PWM. Thanks to Paul Stoffregen for answering my questions about Teensy.

Notes and references

Why am I using a Teensy 3.6 instead of a newer model? Because the more recent Teensy 4.1 was out of stock. ↩
There are also Arduino models in the DIP form factor, such as the Arduino Nano and Arduino Micro. Arduino also has high-power models such as the 32-bit ARM-based Arduino Portenta. ↩
The Teensy 3.6 has two digital-to-analog converter (DAC) outputs. For those two pins, the analogWrite() function produces a genuine analog voltage, not a PWM output. ↩
The PWM output has a period of 2048 µs, yielding a frequency of about 490 Hertz. The output is controlled in units of 8 µs, so an input value of 1 yields a pulse width of 8 µs, an input of 64 yields a pulse width of 512 µs and so forth. ↩
I tried to sort out what "Kinetis" means. NXP has many different microcontrollers and Kinetis is their family of 32-bit mixed-signal ARM Cortex microcontrollers, introduced in 2010. The Kinetis family includes the high-performance K series and the low-power L series. The Teensy 3.x boards use the Kinetis K series and have the preprocessor variable KINETISK defined, while the Teensy LC board uses a Kinetis L processor and has KINETISL defined. ↩
The variable FTM2_MOD is defined as the address (400B8008) of the FTM2 modulo register in kinetis.h. Why is the modulo set to 61439? The goal is to make the PWM period match the Arduino's 2048 µs period (approximately 490 Hertz). To see how this happens, start with the Teensy's clock frequency (F_CPU) of 180 MHz. kinetis.h sets the bus frequency F_BUS to 60 MHz based on this. Then pins_teensy.c uses this for the timer frequency F_TIMER. For a frequency of 60 MHz, pins_teensy.c sets DEFAULT_FTM_MOD to 61439 and DEFAULT_FTM_PRESCALE to 1. This prescale value causes the timer to divide its input frequency by 2, so the timer runs at 30 megahertz. At this frequency, 61440 ticks will take 2048 µs as desired.

Figuring out the address for FTM2_MOD is more confusing than I expected. If you look at the memory map in the manual (Section 45.4.2), the address for FTM2_MOD is 4003A008 (Peripheral bridge 0), but the Teensy uses address 400B8008 (Peripheral bridge 1, Table 5-3), see kinetis.h. It turns out that the chip has two paths for accessing peripherals: AIPS0 and AIPS1. The timer can be accessed through both paths, but with different register addresses.

Another confusing thing is that if you try to access FTM2_MOD through the first address, the Teensy will crash. The reason is that the microcontroller lets you conserve power by turning off the clock to each module, a function called "clock gating". If you try to access a peripheral when the clock is disabled, the system terminates with an error. The two different paths to the timer are controlled by separate clocks. Specifically, access through AIPS0 is enabled through System Clock Gating Control Register 6 (SIM_SCGC6, section 13.2.16), while access through AIPS1 is enabled through SIM_SCGC3 (section 13.2.13). The Teensy startup code enables timer FTM2 through clock gating register SIM_SCGC3 (for AIPS1) but not SIM_SCGC6 (for AIPS0). Thus, accessing the timer through AIPS1 works, but accessing it through AIPS0 crashes. This thread has more information. ↩
By default, the value to analogWrite() can range from 0 to 256, i.e. 8 bits of resolution. However, the resolution can be changed by calling analogWriteResolution. Higher resolution gives finer-grain control over the PWM width.

The Teensy extensions to Arduino include a function analogWriteFrequency(), which provides a more convenient way of modifying the PWM frequency. ↩
The Register Descriptions section (45.4.2) describes the memory address for each register. FTM2_C1V is the "Channel Value" at address 4003A018. Section 45.4.7 explains that this register holds the 16-bit counter value that the timer matches against. ↩
On my breadboard, a signal has a rise time of 7.5 nanoseconds with slew rate disabled and 15 nanoseconds with slew rate enabled. The fast signal has a bunch of ringing, while the slower signal rises smoothly. ↩
The Pin Control Register is described in section 12.5.1 with details in chapter 11, Signal Multiplexing and Signal Descriptions. ↩
The macro FTM_PINCFG(FTM2_CH1_PIN) turns into CORE_PIN30_CONFIG, the appropriate configuration register. This is defined in core_pins.h as PORTB_PCR19. The manual (section 12.5) specifies that PORTB_PCR19 (Port B Pin Control Register 19) has address 4004A04C. ↩
Register constants FTM2_C0SC and FTM2_C1SC are set to 0x400B800C and 0x400B8014 respectively in kinetis.h. The manual defines these addresses (section 45.4.2) as 4003_A00C and 4003_A014. (The differences are because the timer can be accessed through a different path (Peripheral Bridge 1) at address 400B_8xxx.) These registers are Channel 0/1 Status and Control, discussed in manual section 45.4.6. Each register has 7 bit fields that control the timer function. The initialization value 0x28 selects Edge-Aligned PWM with high-true pulses.

Register constant FTM2_SC (timer 2 Status and Control) has address 400B8000 in the code and 4003A000 in the manual. Its fields are described in manual section 45.4.3. FTM_SC_CLKS(1) sets the CLKS field to use the system clock as the timer input. FTM_SC_PS sets the prescale to divide the clock by 2, as discussed earlier. ↩

Inside a transistorized shift register box, built in 1965 for Apollo testing

One of the under-appreciated aspects of the Apollo launches to the Moon is how much testing was required. I recently came across an item that was part of this testing: the Computer Buffer Unit. It is essentially a 16-bit shift register that interfaced test equipment to the Apollo Guidance Computer. While a shift register is a trivial circuit nowadays, back then it took a box full of transistors that weighed about 5 pounds. In this blog post, I look inside this unit, describe its unusual packaging and circuitry, and explain how it works.

The Computer Buffer Unit is a 4"×6"×6" box. The three electrical connectors on the left are covered by protective covers. It has a humidity indicator and pressurization valve at the bottom.

Testing for the Apollo missions

The Apollo spacecraft required extensive testing even while it was sitting on the launch pad. Thousands of different spacecraft components needed to be activated and analyzed for various tests. Since the control room was miles away from the launch pad, it wasn't practical to run separate wires to each component. Instead, NASA invented (and patented) a complex digital test system that communicated efficiently between the control room and the rocket. This test system sent digital commands to the launch site, where racks of control and interface units were wired to the spacecraft components. These units decoded the commands and performed the specified operation. Massive quantities of measurement data from the spacecraft were encoded digitally and serialized for communication back to the control room.

The complexity of testing is illustrated by the control room below.2 This is not Mission Control, but a separate control room specifically for testing, called ACE-S/C (Acceptance Checkout Equipment-Spacecraft). These consoles were crammed with control switches, tape readers, CRT displays, chart recorders, and status panels for conducting tests and recording results. The ACE-S/C system supported manual, semiautomatic, and automatic testing, driven by two minicomputers.1

ACE control room. From Applicability of Apollo Checkout Equipment.

All parts of the spacecraft were tested, including the fuel cells, cryogenic fuel storage, communications, and environmental control. For this blog post, the relevant subsystem is "Guidance and Navigation", responsible for determining the Apollo spacecraft's position in space using inertial navigation and guiding it on the proper trajectory, including the landing on the Moon's surface. The key to Guidance and Navigation was the Apollo Guidance Computer, a 70-pound computer carried onboard both the Lunar Module and the Command Module.

The Apollo Guidance Computer that we restored, next to a replica DSKY.

In space, astronauts operated the Apollo Guidance Computer through the Display/Keyboard (DSKY), a box (above) with keys, indicator lights, and numeric displays. But for ground testing, there needed to be a way to feed commands into the Apollo Guidance Computer from the testing system. The solution was the Computer Buffer Unit, the box that I'm examining. To operate the Apollo Guidance Computer remotely, the ACE test system encoded each DSKY keypress as a 16-bit command3 and sent it to the Buffer Unit. The Buffer Unit converted the message to serial, transferring one bit at a time to the Apollo Guidance Computer, which then processed the desired keypress.4 Thus, the Apollo Guidance Computer could be controlled remotely for testing, providing control over the Guidance and Navigation system, and the Computer Buffer Unit was the interface with the Apollo Guidance Computer.

Inside the Computer Buffer Unit

Next, I'll discuss the physical construction of the Computer Buffer Unit. Removing the lid reveals the components inside.5 The main circuitry consists of six horizontal circuit boards wired into a vertical backplane board; the top board is visible below. One unusual feature is the bag of desiccant inside the unit, zip-tied to the right side of the case. The designers of the unit were worried about Florida humidity and the risk of corrosion.6 To guard against damp air, the unit has a valve on the front so it can be pressurized with dry nitrogen. On the front of the unit, you can see a humidity sensor that changes color to indicate 10%, 20%, and 30% humidity. If the internal humidity exceeded 30%, the desiccant needed to be replaced, as described by the warning label.

The Buffer Unit with the lid removed.

I removed the circuit boards with some difficulty, as they fit tightly. The photo below shows the stack of six printed circuit boards wired into the vertical backplane. The wires from the connectors are soldered directly to the backplane.

With the circuit boards pulled out of the unit, the wiring to the backplane is visible.

The circuit boards can be opened up like a book to provide access to the inner boards. The boards are not soldered directly to the backplane, but are connected by short, flexible wires, allowing them to swing apart. To prevent short circuits between the boards, they are separated by white sheets of (probably) silicone.

After removing six screws, the boards can be unfolded like a book.

The circuitry is constructed in a very unusual way that I haven't seen before. Instead of mounting components directly on the circuit boards, components are mounted on small boards, each forming a module with a logic gate or two. These smaller modules are then soldered on pins above the main circuit boards, forming two-layer boards. Essentially they built pseudo-integrated-circuits on small boards, and then constructed the circuitry from these modules.

Closeup of logic modules mounted on the circuit board. A blue resistor is visible on the underside of the module.

It is difficult to see the components sandwiched between the main board and the smaller modules, but the side view below shows some of the components. The two boards are connected by the vertical pins. A tiny glass diode is visible towards the left. The longer components are resistors. The shiny metal-can transistors are in the middle of the module and harder to see.

This side view shows a latch module (bottom) attached to the circuit board (top). The diodes, resistors, and transistors of the latch module are visible.

One question is why the circuitry is implemented with small circuit boards attached to the larger circuit board, instead of mounting the components directly on the circuit board. This approach seems overly complex and makes the boards twice as thick. One advantage, though, is that the separate logic modules could be manufactured, tested, and repaired separately, important in an era when semiconductors were less reliable. Second, the main boards and the logic modules are different types of printed circuit boards: four-layer circuit boards with widely-spaced traces versus single-sided but dense boards.

Logic gates

The circuitry is implemented with a logic family called Diode-Transistor Logic (DTL). This type of logic was used in the early 1960s as it only required one (expensive) transistor per gate, using cheaper diodes where possible. As transistor prices dropped, Transistor-Transistor Logic (TTL) became more popular because of its better performance. Nowadays fast, low-power CMOS logic is used in most integrated circuits.

I reverse-engineered the schematic below, which shows a NOR gate from this unit. This gate has two inputs, as well as two outputs (for reasons that will be explained below). If both inputs are low (0), the transistor will be turned off. As a result, the resistors pull the outputs high, producing 1 outputs.

The NOR gate with both inputs low, outputs high.

If an input is high, the circuit behaves as shown below. Current flows from the input pull-up resistor through the diodes and the transistor's base, turning the transistor on. As a result, current flows from the outputs, through the transistor to ground, pulling the outputs low. Thus, the circuit implements a NOR gate: the output is 1 if all inputs are low, and 0 otherwise.

The NOR gate with a high input, outputs low.

The reason for multiple outputs is clever. If you connect the outputs from multiple gates together, this combined output will be pulled low if any output is low (i.e. the transistor is turned on), and otherwise will be pulled high by the resistor.7 This logic is equivalent to an AND gate. Note that the AND gate is implemented "for free" by wiring outputs together, without requiring additional logic; this is called wired-AND. However, you can't use a gate's output in two different wired-AND gates, since everything will be shorted together. Instead, a gate provides multiple outputs that can be wired independently; the diodes keep the outputs isolated from each other.
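The logic of the gate and the wired-AND connection can be written out as a small model. This is purely a logic-level sketch with function names of my own; it ignores the diodes, resistors, and voltages and treats each signal as true (high) or false (low).

```c
#include <stdbool.h>

/* Two-input NOR: any high input turns the transistor on,
   which pulls the output low. */
bool nor2(bool a, bool b)
{
    return !(a || b);
}

/* Two gate outputs tied together: the shared wire is high
   only if neither gate is pulling it low. */
bool wired_and2(bool out_x, bool out_y)
{
    return out_x && out_y;
}
```

For instance, wired_and2(nor2(a, b), nor2(c, d)) is high only when all four inputs are low, with no extra gate needed for the AND.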

The board I examined has 5 different types of logic module8, from an inverter with 1 input and 8 outputs to a module with two 2-input, 5-output NOR gates. These modules follow the circuit above, but with different numbers of inputs and outputs.

Implementation of the shift register

The idea behind a shift register is to store multiple bits in a row. Each time a clock signal is activated, the bits are shifted by one position. Shift registers can be used to store data, convert parallel data to serial, or convert serial data to parallel. In this Buffer Unit, the shift register converted a 16-bit parallel value from the test equipment into a serial stream of bits for the Apollo Guidance Computer.9
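Parallel-to-serial conversion is easy to model in software. In the sketch below, a 16-bit word stands for the loaded register and each clock shifts it by one position. The function names are mine, and shifting the most significant bit out first is an assumption made for illustration; I haven't established the actual bit order used by the unit.

```c
#include <stdint.h>

/* The bit currently presented at the serial output
   (assumed here to be the most significant bit). */
int serial_out(uint16_t reg)
{
    return (reg >> 15) & 1;
}

/* One clock pulse: every bit moves over by one position. */
uint16_t shift_once(uint16_t reg)
{
    return (uint16_t)(reg << 1);
}

/* Load a word in parallel, apply some clock pulses, and
   report the bit that is then at the serial output. */
int bit_after_clocks(uint16_t word, unsigned clocks)
{
    uint16_t reg = word;
    for (unsigned i = 0; i < clocks; i++) {
        reg = shift_once(reg);
    }
    return serial_out(reg);
}
```

Calling bit_after_clocks(word, n) for n from 0 to 15 yields the 16 bits of the word one at a time, which is the serial stream.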

The board implements four bits of the 16-bit shift register. The schematic below shows the circuitry for a one-bit stage of the shift register. There's a lot going on, but I'll try to explain it. The heart of the stage consists of two latches, which store one bit. A bit is stored by first updating the primary latch, and then the secondary latch. (Each latch consists of two cross-coupled NOR gates, and can hold either a 0 or a 1.) The shift out lines are the outputs from the shift register stage, a regular output and an inverted output.

One stage of the shift register. It can read the bits in parallel, or shift a bit from one stage to the next.

Each shift out line is fed to the shift in lines of the next stage, allowing the bits to be transferred from stage to stage through the shift register. The shift and load control lines, along with the AND gates, select the input to each stage. With shift high, the input will be the shift out from the previous stage. With load high, the input reads the external bit in lines. This allows a 16-bit data word to be read into the shift register in parallel. (I'm not sure what the clear bit function is used for.) After a bit has been loaded into the primary latch, the clock line is activated to load the bit into the secondary latch, completing the shift or load cycle.

An interesting function of the unit is that after loading, the value in the latch is compared to the input value, to make sure that the circuit is operating correctly. If there is a mismatch, a compare AND gate will activate, clearing the match line. (A compare AND gate will activate if the input bit is 1 and the latch bit is 0, or vice versa.) This circuit also detects a fault in the bit input wires. Each bit is provided over two wires: one with the bit value and one with the inverted bit value. If a wire is broken or affected by noise, the comparison will fail.10
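The comparison can be expressed as a simplified logic model. I'm assuming here that one compare gate sees the true input wire together with the inverted latch value, and the other sees the inverted input wire together with the latch value; the function name is hypothetical and the real wiring may differ in detail.

```c
#include <stdbool.h>

/* Returns true if either compare AND gate activates.
   bit_in and bit_in_inv are the two wires for one bit;
   latch is the value stored in the shift register stage. */
bool stage_mismatch(bool bit_in, bool bit_in_inv, bool latch)
{
    bool one_stored_as_zero = bit_in && !latch;
    bool zero_stored_as_one = bit_in_inv && latch;
    return one_stored_as_zero || zero_stored_as_one;
}
```

In this model a correctly stored bit produces no mismatch, while a fault that drives both wires high is flagged regardless of what the latch holds.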

This diagram shows the functions of the gates. Note that the circular golden transistors are faintly visible through the circuit boards.

The board above11 contains four of these shift-register stages. The photo above shows how these stages map onto the hardware. The external signals (4 pairs of bit lines) enter at the bottom of the board, and pass through the input inverters. The 8 primary latch NOR gates implement four primary latches. Four secondary latch modules implement the four secondary latches, since each module contains two NOR gates. The clock driver, load driver, and shift driver provide 8 copies of the clock, load, and shift signals for the circuitry. Finally, the two match NOR gates combine the 8 match signals. (Note that since the AND gates are implemented with wired-AND, they don't use additional circuitry and do not appear in this diagram.)

I/O and power

I'll wrap up with a few comments on the I/O and power supply for the Buffer Unit. The unit has three military-style connectors on the front. At the top is a 61-pin connector for receiving the parallel data and control signals from ground equipment. (The pin count is larger than you might expect because each bit uses two wires as discussed earlier. Also, many of the 61 pins are unused.)

The unit has three connectors. The unit receives parallel data from ground equipment through the 61-pin connector at the top. The middle connector communicates the serial data to the Apollo Guidance Computer. The unit is powered with 28 volts through the bottom connector, which has larger pins for the high-current supply.

The middle connector has four pins that provide the serial data stream to the Apollo Guidance Computer. The wiring is a bit unusual. Instead of transmitting data over one serial line, the unit uses two pairs of lines: one to transmit "0" bits and one to transmit "1" bits. To provide electrical isolation between the unit and the Apollo Guidance Computer, these signals are transmitted via two small pulse transformers, shown below. When a pulse is fed into a pulse transformer, a similar pulse is produced on the output. (In modern equipment, an optoisolator provides similar functionality.)

Two pulse transformers on the top circuit board. Each small transformer is about 1 cm in diameter.

The bottom connector on the unit has two thick pins to provide 28 volts to the unit. This view inside the unit shows the power converter, a sealed black box. I believe this is a switching power supply module that converted the 28-volt input into the lower voltage required by the logic circuitry. It also provided electrical isolation from the power supply. The smaller black box on the right is an EMI filter on the power input; the Apollo ground test equipment encountered faults from voltage transients and electrical noise, so they added filtering.

The power supply components are sealed in black plastic.

Conclusion

This Computer Buffer Unit was built in 1965, a time when the industry was shifting from transistors to integrated circuits. This may explain the Unit's unusual construction technique, small circuit-board modules that are like integrated circuits built from discrete components.12 Interestingly, Motorola built a similar Buffer Unit for NASA that used integrated circuits (but was just as large),13 illustrating that transistors and integrated circuits were both viable approaches in 1965.

This box also illustrates the rapid pace of integrated circuit technology since the 1960s. The first commercial MOS integrated circuit was a 20-bit shift register introduced in 1964 and by 1970, Intel was producing a 512-bit shift register. In 1971, Western Digital was selling a UART chip, putting a complete parallel-to-serial and serial-to-parallel communication system onto a chip. Thus, it took 6 years to shrink the complex shift-register box down to a single chip (more or less). Nowadays, this functionality forms a tiny part of a complex chip. Coincidentally, Moore's Law, describing the exponential growth of integrated circuits, was published in 1965, the same year this box was manufactured.

I announce my latest blog posts on Twitter, so follow me @kenshirriff. (The Twitter thread corresponding to this blog post is here.) I also have an RSS feed. Thanks to Steve Jurvetson for letting me examine this artifact. A video tour of his space museum is here. Thanks to Mike Stewart for providing documents and extensive information on this box.

Notes and references

The photo below shows the computer room that supported ACE testing. The system was controlled by two CDC 160-G minicomputers. Strangely, these were 13-bit computers, with 13-bit addresses, 13-bit registers, and 13-bit arithmetic. The earlier CDC 160 computer was 12 bits, and CDC improved the 160-G model by adding one more bit. The CDC 160 was designed by Seymour Cray, reportedly over a weekend.

"An ACE Station with twin Control Data computers." From Computers in Spaceflight.

↩
There were about 10 ACE installations for testing at various sites. ACE testing was performed at contractor sites, as well as at the launch pad. ↩
To send a DSKY keypress through the testing system, each keypress was encoded as 5 bits as shown below. The 16-bit message consisted of a 1 bit followed by three copies of the 5-bit keypress, with the middle copy inverted. (Sending the keypress in triplicate detected communication errors.)

The encoding of keys when communicating with the Apollo Guidance Computer. From ACE-S/C Operator's Manual.
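As a rough illustration of the message format described above, here is a minimal Python sketch. It assumes the 16 bits are written most-significant-bit first (the leading 1, then the three 5-bit copies); the function names and the example keycode are hypothetical and are not taken from the ACE documentation or the key table.

```python
KEY_BITS = 5
KEY_MASK = (1 << KEY_BITS) - 1  # 0b11111


def encode_keypress(keycode: int) -> int:
    """Build the 16-bit message: a 1 bit, the keycode, the inverted keycode, the keycode."""
    if not 0 <= keycode <= KEY_MASK:
        raise ValueError("keycode must fit in 5 bits")
    inverted = ~keycode & KEY_MASK
    return (1 << 15) | (keycode << 10) | (inverted << 5) | keycode


def decode_keypress(message: int) -> int:
    """Return the keycode, or raise ValueError if the message is malformed."""
    if message >> 15 != 1:
        raise ValueError("missing leading 1 bit")
    first = (message >> 10) & KEY_MASK
    middle = ~(message >> 5) & KEY_MASK  # undo the inversion of the middle copy
    last = message & KEY_MASK
    if not (first == middle == last):
        raise ValueError("keycode copies disagree")
    return first


EXAMPLE_KEYCODE = 0b10001  # arbitrary 5-bit value, for illustration only
message = encode_keypress(EXAMPLE_KEYCODE)
print(f"{message:016b}")  # prints 1100010111010001
```

A single flipped bit makes the three copies disagree, which is how the triplicate encoding catches communication errors.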

↩
The serial protocol used by the Apollo Guidance Computer is a bit unusual compared to modern serial protocols. Instead of a single serial line, it used two pairs of wires: one to receive a 1 bit and one to receive a 0 bit. This worked well with the Apollo Guidance Computer hardware, which included a feature for incrementing and decrementing counters in response to interrupts. In particular, a serial input 0 triggers a SHINC instruction (shift left), while a serial input 1 triggers a SHANC (shift and increment by 1) instruction.
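The way a word is assembled from the two pulse lines can be sketched in a few lines of Python. This is only a model of the idea: it uses an ordinary unbounded integer as the register and ignores the Apollo Guidance Computer's actual word length and arithmetic. The function names simply mirror the instruction names; `receive` is a hypothetical helper.

```python
def shinc(register: int) -> int:
    """Model of SHINC: shift left one place (a pulse arrived on the 0 line)."""
    return register << 1


def shanc(register: int) -> int:
    """Model of SHANC: shift left and add 1 (a pulse arrived on the 1 line)."""
    return (register << 1) + 1


def receive(pulses: str) -> int:
    """Assemble a word from a sequence of pulses, given as a string of '0' and '1'."""
    register = 0
    for pulse in pulses:
        if pulse == "1":
            register = shanc(register)
        elif pulse == "0":
            register = shinc(register)
        else:
            raise ValueError("each pulse must be '0' or '1'")
    return register


print(bin(receive("1011")))  # prints 0b1011
```

Each pulse costs one counter operation, so the first bit received ends up in the most significant position.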

(The interrupt-triggered counter mechanism worked well except during the Apollo 11 landing, when the power supply for the Apollo Guidance Computer and the power supply for the rendezvous radar had a phase difference. For complex reasons, this resulted in a high rate of interrupts, overloading the Apollo Guidance Computer and causing restarts. This was indicated by the famous 1201 and 1202 program alarms during the landing.)

The K-START (Keyboard - Selections To Actuate Random Testing) panel is used to send commands to the Apollo Guidance Computer. From ACE-S/C Operator's Manual.

In the ACE testing control room, DSKY keypresses were entered on a panel called K-START (Keyboard - Selections To Actuate Random Testing), shown above. The panel's keyboard corresponds to the keyboard on the DSKY, and the panel also has other switches specific to testing. These key entries could also be recorded on perforated tape and played back at high speed. ↩
Another interesting feature of the unit is how it is mounted on a rack. The back of the unit has two Teflon-lined holes. Two "dagger pins" from the rack fit into these holes. On the front, the unit has two small hold-down hooks; a knob on the rack engages with the hook to hold the unit in place. The mounting hooks are type NAS 622, an aerospace standard. The hold-down mechanism is described here.

Back of the Buffer Unit with identifying label and two holes for dagger pins. The labels say "Unit, Computer Buffer Guidance & Navigation. NAA/S & ID Control No. ME901-0271-0002. Stock No. Contract No. M5H3XA-450001. NAA/S & ID Inspection Serial No. Control Data Corporation MFGR Part No. 106068-0002. Mfgr Serial No. 10136SA08185. US Nov 19 1965."

↩
The document Acceptance Checkout Equipment for the Apollo Spacecraft discusses the corrosion problems encountered by the test equipment due to humidity and insufficient air conditioning. The specifications don't discuss pressurization of the unit, but I'm assuming they used nitrogen based on other items I've studied. ↩
One subtlety with the wired-AND gate is that connecting multiple outputs together will result in multiple pull-up resistors in parallel, which may provide too much pull-up current. The solution is that some gates have outputs without pull-up resistors, so each wired-AND output has a single pull-up. The wired-AND isn't entirely free, since the multiple outputs require multiple diodes, but diodes are inexpensive compared to transistors. I should admit that I'm not 100% sure of the circuitry. Since the components are all hidden underneath the module, I had to deduce the circuitry by probing it from above. There were a few inputs that didn't seem to have connectivity; perhaps there are capacitors to make these inputs pulse-based. ↩
The board I examined uses the following types of modules:
2304: 1-in, 8-out inverter
2309: 3-in, 4-out NOR
2311: 4-in, 2-out NOR
2319: 1-in, 4-out inverter
2314: dual 2-in, 5-out NOR (larger than the other modules) ↩
The specifications for the Buffer Unit describe its purpose: "This specification covers the requirements for a Guidance and Navigation Computer Buffer Unit, hereinafter referred to as the G&N buffer. The G&N buffer shall form a part of the Digital Test Command System (DTCS) which is the up-link portion of the Automatic Checkout Equipment (ACE). The ACE will be used as ground support equipment for the Apollo space craft. The G&N buffer shall receive remotely generated digital test commands from the control room via the DTCS and shall store, verify, and shift out G&N data in appropriate format to the G&N on-board computer."

Functional diagram of the Buffer Unit. Image from Specification MC 901-0666 courtesy of Mike Stewart.

The specifications for the Computer Buffer Unit can be viewed online: MC901-0666, ME901-0666, ME901-0271, ME476-0070.

The unit includes more functionality than just a shift register (but not much more), as shown in the functional diagram above. First, the unit includes the clock oscillator that controls the timing of the serial pulses. Second, it contains a control circuit to handle loading the bits in parallel and then shifting them out serially. Third, for reliability reasons, it has a comparator circuit to check that the bits loaded into the shift register match the input bits. ↩
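The load, compare, and shift-out behavior described in the note above can be sketched as a small Python model. This is a simplification under stated assumptions, not the unit's actual logic: the 16-bit width, the shift direction (first list element out first), and the class and method names are all hypothetical, and the clock oscillator is reduced to one loop iteration per tick. Each tick produces a pulse on either the "0" line or the "1" line.

```python
from typing import Iterator, List, Tuple


class BufferUnitModel:
    """Toy model: parallel load, comparator check, serial shift-out."""

    def __init__(self, width: int = 16) -> None:  # width is an assumption
        self.width = width
        self.register: List[int] = [0] * width

    def load(self, input_bits: List[int]) -> None:
        """Parallel load: latch all input bits into the register at once."""
        if len(input_bits) != self.width:
            raise ValueError("wrong number of input bits")
        self.register = list(input_bits)

    def verify(self, input_bits: List[int]) -> bool:
        """Comparator: True if the register contents match the input bits."""
        return self.register == list(input_bits)

    def shift_out(self) -> Iterator[Tuple[int, int]]:
        """Yield one (zero_line, one_line) pulse pair per clock tick."""
        for _ in range(self.width):
            bit = self.register.pop(0)  # bit leaving the register
            self.register.append(0)     # a 0 shifts in behind it
            yield (0, 1) if bit else (1, 0)


unit = BufferUnitModel()
word = [1, 0, 1, 1] + [0] * 12
unit.load(word)
if unit.verify(word):
    print(list(unit.shift_out())[:4])  # prints [(0, 1), (1, 0), (0, 1), (0, 1)]
```

In software the comparison always succeeds, of course; in hardware it guards against a flip-flop that failed to latch its input.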
Modern systems often use differential signaling, using two complementary signals for a bit. Looking at the difference between the two signals provides noise immunity, since electrical noise will often affect both signals equally, and thus will be canceled out. Although the Buffer Unit uses two complementary signals, it doesn't provide this noise immunity, since the two signals are processed independently rather than differentially. ↩
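The difference between the two approaches can be illustrated with a small numeric sketch. All the voltages and the detection threshold here are made-up values in millivolts, chosen only to show the effect; they are not measurements from the Buffer Unit.

```python
THRESHOLD_MV = 500  # hypothetical detection threshold


def differential_receive(line_a_mv: int, line_b_mv: int) -> int:
    """Decide the bit from the difference, so noise common to both lines cancels."""
    return 1 if line_a_mv - line_b_mv > 0 else 0


def independent_receive(line_mv: int) -> bool:
    """Detect a pulse by looking at one line on its own."""
    return line_mv > THRESHOLD_MV


noise_mv = 800  # the same noise spike appears on both lines

# Differential: line A driven to 1000 mV, line B at 0 mV, both plus noise.
print(differential_receive(1000 + noise_mv, 0 + noise_mv))  # prints 1

# Independent: an idle line at 0 mV plus noise looks like a pulse.
print(independent_receive(0 + noise_mv))  # prints True
```

With independent processing, the noise spike alone crosses the threshold on an idle line, which is the failure that differential signaling avoids.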
I only reverse-engineered one of the boards, since I didn't want to risk more disassembly, and one board is enough to understand the basic logic. I studied board 6 of the unit, which implements bits 15 through 18 of the shift register. Board 3 implements bits 3-6; board 4 implements bits 7-10, as well as mode bits 1 and 2; board 5 implements bits 11 through 14; and board 6 (the one I examined) implements bits 15 through 18. Boards 2 and 4 implement control logic, while board 1 has the output driver transformers.

With board 6 folded down, board 5 is visible.

The photo above shows board 5. Note that the circuit layout is entirely different from board 6. I thought that the unit might consist of four identical 4-bit shift register boards, but it turns out that the boards are optimized for particular roles. ↩
In the context of "not-quite-integrated circuits", I should mention IBM's use of hybrid modules (called SLT) for the System/360 mainframes. These small aluminum-cased modules contained a few transistors or diodes as silicon dies, mounted on a ceramic substrate along with thick-film resistors. These modules were not quite integrated circuits, since they were built from discrete (but unpackaged) components. But they were closer to integrated circuits than the modules in the Buffer Unit, which used packaged transistors, resistors, and diodes on a printed circuit board. ↩
Motorola made a similar Buffer Unit, but they used integrated circuits, specifically Motorola's line of high-speed ECL chips, introduced in 1962. Since each chip holds only a few gates, it still took multiple boards to build the unit. Apollo Guidance Computer expert Mike Stewart has photos of the Motorola box here, as well as reverse-engineered schematics. The functionality of the Motorola box is nearly identical, except it has separate inputs for the 16-bit compare value. It is built with chips such as the MC308 flip-flop and MC309 dual NOR gate, described here.

A board from the Motorola version of the Buffer Unit. Each metal can is an integrated circuit. Photo courtesy of Mike Stewart.

↩