
Reverse-engineering the Yamaha DX7 synthesizer's sound chip from die photos

The Yamaha DX7 digital synthesizer was released in 1983 and became "one of the most important advances in the history of modern popular music"[1]. It defined the sound of 1980s pop music and was used by artists from A-ha and Michael Jackson to Dolly Parton and Whitney Houston. The DX7's electric piano sound can be heard in over 40% of 1986's top hits.[2] Compared to earlier synthesizers, the DX7 was compact, inexpensive, easy to use, and provided a new soundscape.[3]

While digital synthesis is straightforward nowadays, microprocessors[4] weren't fast enough to do this in the early 1980s. Instead, the DX7 used two custom chips: the YM21290 EGS "envelope" chip generated frequency and envelope data, which it fed to the YM21280[5] OPS "operator" chip that generated the sound waveforms. In this blog post, I investigate the operator chip and how it digitally produced sounds using a technique called FM synthesis.[6][21]

I created the high-resolution die photo below by compositing over a hundred microscope photos.[6] Around the edges, you can see the 64 bond wires attached to pads; these connect the silicon die to the chip's 64 pins. The chip has one layer of metal, visible as the whitish lines on top. (Power and ground are the thick metal lines.) Underneath the metal, the polysilicon wiring layer appears reddish or greenish. Finally, the underlying silicon is grayish. The overall layout of the chip is dense rectangles of circuitry with the space between them used for signal routing. I will discuss these circuitry blocks in detail below.

Die photo of the DX7's YM21280 Operator chip. Click this photo (or any other) for a magnified version.

The photo below shows the integrated circuit with the metal lid removed, showing the silicon die inside. The pins have been flattened in the photo; they are normally bent downwards, but in a staggered pattern.[7] The four rows of pins make this a quad in-line package, with twice the pin density of a regular DIP chip. As a result, this 64-pin chip has a smaller package than a standard 40-pin DIP chip.

The integrated circuit package with the metal lid removed, revealing the silicon die. Pin numbers are printed on the package, which is unusual.

Analog and digital

In the 1960s and 1970s, synthesizers were mostly analog.[8] An oscillator was controlled by the keyboard, generating a wave at the appropriate frequency. This signal was fed through a filter, which shaped the frequency spectrum to produce the desired tone quality (timbre). Finally, the signal had its volume shaped by an envelope generator that made the volume ramp up when the key was pressed, and die off gradually when the key was released.[9]

An analog synthesizer was built from components such as resistors, capacitors, and op-amps, with analog voltages as the signals. One problem was that analog synthesizers needed to be tuned, since these component values could drift over time. Another problem was that the complex circuitry generated only one note, so analog synthesizers were typically monophonic, producing a single note at a time. The functions of an analog synthesizer were typically controlled by patch cords, potentiometer knobs, and switches. These allowed a wide variety of sounds to be produced, but they also made it difficult to select a desired sound, since all the parameters needed to be set manually.

Digital synthesis provided a completely different way of generating sounds. The sound values were produced digitally by an algorithm that generated numeric values. These values were converted to the output signal voltages by a digital-to-analog converter (DAC). Digital synthesizers solved many of the problems of analog synthesis: they could easily play multiple notes at once (i.e. polyphony), configurations could be stored as digital files, they could be controlled digitally[10], they replaced precision analog components with cheaper digital circuits, and they produced new classes of sounds. The DX7 wasn't the first digital synthesizer, but it was the first to achieve commercial success. It became one of the best-selling synthesizers ever, with over 150,000 sold.

The Yamaha DX7 synthesizer with its 61-key keyboard and digital controls. Photo by rockheim (CC BY-NC-SA 2.0).

FM synthesis

The DX7 uses FM synthesis to generate its sounds.[11] The idea is that you start with a sine wave (the carrier signal) and perturb it with another signal (the modulating signal). The modulating signal changes the phase (and thus the frequency) of the carrier, creating complex harmonic structures.

The digital implementation of frequency modulation starts with a lookup table that holds a digitized sine wave. By stepping an index through the table at a specific rate, you can produce a sine wave of a fixed frequency. To make this concrete, suppose the table is 4096 entries long and the index is updated at 40960 Hertz. If you increment the index by 100 each time, you'll cycle through the table 1000 times every second, so a sine wave at 1 kHz will be produced. The index represents the phase of the signal: as the index moves through the table, this corresponds to a phase of 0 to 2π and an output of sin(0) through sin(2π). Changing the increment value controls the frequency. For instance, an increment of 44 would produce 440 Hz.[12]
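
To make the arithmetic concrete, here is a minimal Python sketch of the table-lookup oscillator described above, using the 4096-entry table and 40960 Hz update rate from the example. The function names are illustrative, and the table holds floating-point values rather than whatever fixed-point format the chip actually uses.

```python
import math

TABLE_SIZE = 4096      # entries in the sine table (from the example above)
UPDATE_RATE = 40960    # index updates per second (from the example above)

# One full cycle of a sine wave, digitized into the lookup table.
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def output_frequency(increment):
    """Frequency in Hz produced by stepping the index by `increment` each update."""
    return increment * UPDATE_RATE / TABLE_SIZE

def oscillator(increment, num_samples):
    """Generate samples by stepping an index (the phase) through the table."""
    index = 0
    samples = []
    for _ in range(num_samples):
        samples.append(SINE_TABLE[index])
        index = (index + increment) % TABLE_SIZE  # wrap at the end of the table
    return samples

print(output_frequency(100))  # 1000.0
print(output_frequency(44))   # 440.0
```

With integer increments, the frequency can only change in 10 Hz steps here, which is why a practical design would carry extra fractional bits in the index and increment.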

The next step is to modulate the output by adding a modulation signal to the index. When the modulation signal increases, the index will move through the table faster, increasing the output frequency. When the modulation signal decreases, the index will step through more slowly, decreasing the output frequency.

Digital synthesis can be implemented with straightforward hardware: a sine-wave table, an increment value that controls the frequency, and an adder that adds the increment to the table index (phase angle) each time step. Frequency modulation can be implemented by another adder to add the modulation value to the table index (phase angle).
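
The two-adder structure can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the modulator is itself a table-lookup sine oscillator, and `mod_depth` (the peak modulation offset, in table entries) is a hypothetical parameter rather than a value taken from the DX7.

```python
import math

TABLE_SIZE = 4096
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def fm_samples(carrier_inc, mod_inc, mod_depth, num_samples):
    """Carrier phase accumulator plus a second adder that applies modulation.

    mod_depth is a hypothetical parameter: the peak offset (in table entries)
    that the modulating signal adds to the carrier's table index.
    """
    carrier_phase = 0
    mod_phase = 0
    out = []
    for _ in range(num_samples):
        # Modulating signal, scaled to a table-index offset.
        modulation = int(mod_depth * SINE_TABLE[mod_phase])
        # Second adder: modulation is added to the phase before the lookup.
        index = (carrier_phase + modulation) % TABLE_SIZE
        out.append(SINE_TABLE[index])
        # First adder: each phase accumulator advances by its increment.
        carrier_phase = (carrier_phase + carrier_inc) % TABLE_SIZE
        mod_phase = (mod_phase + mod_inc) % TABLE_SIZE
    return out

# Carrier at increment 44 with a modulator at twice the carrier frequency.
samples = fm_samples(carrier_inc=44, mod_inc=88, mod_depth=1000, num_samples=200)
```

Setting `mod_depth` to 0 reduces this to the plain sine oscillator, since the second adder then adds nothing.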

The interactive tool below illustrates FM synthesis and the effects of changing the modulation frequency and amount of modulation.[13] The modulation signal is shown in yellow and the output is shown in red. (The carrier is fixed at 440 Hz.) Low levels of modulation distort the output waveform, while high levels create very complex waveforms. If the modulation and carrier frequencies have integer ratios, the output is periodic. But a detuned modulation frequency results in a complex, more bell-like sound.


[Interactive FM synthesis widget not reproduced here. Default settings: modulation level 1, modulation frequency ratio 2.]

As you can see, a single modulator produces a variety of timbres and complex, unpredictable waveforms. However, the DX7 provides multiple modulators combined in various ways, making the sounds vastly more varied. For each note, the DX7 provides six oscillators (called operators) that can be combined in 32 different ways (called algorithms), shown below. For example, in algorithm 1, operator 6 modulates operator 5 which modulates operator 4 which modulates operator 3, which produces a sound. Meanwhile, operator 2 modulates operator 1, producing a second sound. Other algorithms combine the six operators in different ways. The level of each operator is controlled by a different envelope, so the note's timbre can evolve in complex ways over time.[14]

A chart of the DX7's algorithms, from the patent.

Inside the DX7

The DX7 can play 16 notes at once and each note has 6 operators, so there are 96 oscillators/operators in total. However, the circuitry operates sequentially, updating one oscillator and computing one operator at a time. The DX7 stores the current index (phase) values for each of the 96 oscillators but shares the circuitry that uses these values. Instead of RAM, the DX7 uses shift registers to hold data, in particular 96-stage shift registers to hold the 96 phase values. This approach drastically reduces the hardware requirements compared to using 96 separate oscillator circuits.
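
The sharing of one adder among 96 oscillators can be modeled roughly as below. This is a sketch, not the chip's logic: a `deque` stands in for the 96-stage shift register, the increments are kept in a second loop purely for illustration (on the real hardware they come from the envelope chip), and the 23-bit phase width is an assumption based on the 23-bit adder described later.

```python
from collections import deque

NUM_SLOTS = 96                     # 16 notes x 6 operators
PHASE_BITS = 23                    # assumed width, matching the 23-bit adder
PHASE_MASK = (1 << PHASE_BITS) - 1

def step(phases, increments):
    """Process one time slot with the single shared adder.

    The oldest phase value exits the shift register, is updated, and
    re-enters at the input, so it comes around again 96 slots later.
    """
    phase = phases.popleft()
    inc = increments.popleft()
    new_phase = (phase + inc) & PHASE_MASK   # phase wraps around on overflow
    phases.append(new_phase)
    increments.append(inc)
    return new_phase

phases = deque([0] * NUM_SLOTS)
increments = deque(range(NUM_SLOTS))  # arbitrary per-oscillator increments

# Three full passes: every oscillator has been updated exactly three times.
for _ in range(3 * NUM_SLOTS):
    step(phases, increments)
```

No addressing is needed: the fixed processing order means the right phase value is always the one emerging from the register.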

The diagram below shows the main architectural components of the DX7, with the components implemented in the operator chip highlighted. (The diagram, from the patent, is complicated but it shows the important features.) In the upper left, the keyboard circuitry detects when a key is played, generating a key code (KC), and a key-on signal (KON). The key code determines the frequency number, the increment used to compute the phase. The phase generator (blue) adds the increment to compute the phase, and the tone generator (yellow) produces the output sound value. The setting section in the lower left provides the user interface to configure the synthesizer. In the lower right (green), the sequence control generator sends control signals to the tone generator to implement the selected algorithm.

Architecture diagram of the DX7, from the patent.

In more detail, the phase generator (blue) implements the phase counters for the 96 digital oscillators. The "frequency number generator" in the envelope chip provides the increment values to the adder. The phase values are stored in the 96-stage shift register. The tone generator (yellow) is where the modulation happens. It takes the phase values, modulates them, and converts them to sine waves, producing the output sound value. It also modifies the level of the signals, as specified by the envelope generator. The sequence code generator (green) generates control signals (A, B, C, D, E, S) that select how modulation takes place at each step. The implementation of these components will be described in more detail below.

Logarithms and exponentials

The chip uses logarithms and exponentials for many of the internal values. The underlying problem is that multiplication is much harder to perform with hardware than addition, especially with 1980s-era technology. The solution is that the chip uses base-2 logarithms in many places because adding logarithms is equivalent to multiplying the values. (The chip uses lookup ROMs in combination with bit shifting to obtain the logarithms and exponentials.)

The first role for logarithms is in the frequency input to the chip: instead of a phase increment value, it receives the base-2 logarithm of the increment. The motivation is that note frequencies are related exponentially: for instance, going up one octave doubles the frequency. Thus, shifting a note requires multiplying the frequency. Since the envelope chip represents frequencies as logarithms, the multiplication becomes a quick addition. The envelope chip then passes the corresponding phase increment to the operator chip as a logarithmic value. The operator chip uses an exponential look-up ROM to convert this value back to a linear value.

The second role for logarithms is to apply the envelope that shapes the signal's amplitude. The envelope is a time-varying multiplicative scale factor, scaling the amplitude to, say, 70% or 30%. To avoid multiplication, the logarithm of the scale factor and the logarithm of the signal are added. A second exponential look-up ROM converts the result back to a linear value. The envelope is provided to the operator chip by the envelope chip in logarithmic form. The logarithm of the sine-wave signal is conveniently obtained by storing log2(sin(x)) in the waveform ROM instead of sin(x), so the logarithm is obtained "for free".[15]
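
The identity behind this trick is simply 2^(log2(a) + log2(b)) = a × b. A minimal floating-point sketch, assuming a positive sine value and a positive scale factor (the chip uses fixed-point ROM lookups instead of `math.log2`, and the function name is illustrative):

```python
import math

def apply_envelope_log_domain(phase, scale):
    """Scale sin(phase) by `scale` using an addition instead of a multiplication.

    Assumes 0 < phase < pi (so the sine is positive) and scale > 0.
    """
    log_sine = math.log2(math.sin(phase))  # what a log-sine ROM would supply
    log_env = math.log2(scale)             # envelope arrives in log form
    return 2.0 ** (log_sine + log_env)     # exponential lookup back to linear

# Scaling sin(pi/6) = 0.5 to 70% gives about 0.35.
print(apply_envelope_log_domain(math.pi / 6, 0.7))
```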

A look at the die

The diagram below labels the pins and the main functional blocks of the chip. The shift registers are the largest blocks of the chip, especially the phase shift registers in the upper left. ROMs are the second-largest blocks, especially the sine ROM and the two identical exponential ROMs. Adders provide most of the logic circuitry; there isn't much "random" logic compared to a processor chip, for instance. The chip has several bit shifters that shift a numeric value, multiplying or dividing it by a power of two.[16] In this section, I look at the low-level circuitry of the die and how the functions are implemented.

Die with the pins and major functional blocks labeled. (Click for a larger version.)

Shift registers

The main component of the chip is storage: the parameters for each operator, the phase counters for each oscillator, the output values for each note, and so forth. The storage is not implemented as RAM or fixed registers as you might expect, but as loops of shift registers with bits constantly moving in a cycle. The idea of a shift register is that it consists of a number of stages, say 16. During each clock cycle, the bits are shifted, with each bit moving to the next stage. One bit exits the shift register. This bit (or a new bit) can be fed into the shift register input, and it will appear at the output 16 clock cycles later.

Since the circuitry works on one oscillator/operator at a time in fixed order, shift registers are an efficient way of storing data and providing it at the right time, without the need for addressing logic. In other words, during each time interval, the appropriate data pops out of the shift registers for processing. The data (unmodified or modified as appropriate) is then fed back into the inputs of the shift register to pass through another cycle.

For example, each of the 16 notes requires 8 bits of configuration storage: 5 to specify the algorithm and 3 to specify the feedback level. This storage is implemented with 8 shift registers, each 16 bits long, as shown below. To select an algorithm, the external CPU writes the appropriate value into the shift register. Note that unlike RAM, entries in the shift register cannot be read and written arbitrarily. The system can only use values when they appear on the shift register output.

The configuration data shift registers are organized as eight 16-bit shift registers.
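
The recirculating behavior described in this section can be modeled in a few lines of Python. The class below is a hypothetical illustration: it stores whole values per stage rather than single bits, and it ignores the two-phase clocking.

```python
from collections import deque

class ShiftRegisterLoop:
    """A loop of shift-register stages whose output feeds back to its input."""

    def __init__(self, stages):
        self.stages = deque([0] * stages)

    def clock(self, new_value=None):
        """Advance one clock cycle and return the value leaving the register.

        By default the exiting value is recirculated; passing new_value
        replaces it, which is how a new setting would be written.
        """
        out = self.stages.popleft()
        self.stages.append(out if new_value is None else new_value)
        return out

reg = ShiftRegisterLoop(16)
reg.clock(new_value=5)                     # write 5 into the current slot
outputs = [reg.clock() for _ in range(16)]
print(outputs[-1])                         # 5: the value reappears 16 clocks later
```

Unlike RAM, there is no way to reach a particular entry early; a value can only be used or replaced at the moment it emerges from the output.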

The schematic below shows how one stage of the shift register is implemented. The chip uses a two-phase clock. In the first phase, clock ϕ1 goes high, turning on the first transistor. The input signal goes through the inverter, through the transistor, and the voltage is stored in the capacitor. In the second phase, clock ϕ2 goes high, turning on the second transistor. The value stored in the capacitor goes through the second inverter, through the second transistor, and to the output, where it enters the next shift register stage. Thus, in one clock cycle (ϕ1 and then ϕ2), the input bit is transferred to the output. (The circuit is similar to dynamic RAM in the sense that bits are stored in capacitors. The clock needs to cycle before the charge on the capacitor drains away and data is lost. The inverters amplify and regenerate the bit at each stage.)

Schematic of one stage of the shift register.

The diagram below shows the physical implementation of one shift register stage. It's a bit confusing because there are three layers: the whitish metal on top, doped silicon regions on the bottom (which appear outlined in black), and polysilicon lines in the middle (which appear reddish or greenish). Transistors are formed when a polysilicon line crosses doped silicon. A capacitor is created similarly, with a polysilicon line and doped silicon forming the two plates of the capacitor. An inverter is created from a transistor that pulls the output to ground, along with a pull-up resistor. (The pull-up resistor is actually another transistor, specially doped to make it a depletion transistor.)

Implementation of one bit of the shift register. This matches the earlier schematic, but shows the components of the inverters.

ROMs

The next building block of the chip is ROM storage, used for the numeric look-up tables and other purposes. One ROM computes the log2 sine for the waveform. The chip has two identical exponential ROMs computing 2^x. One converts the log-frequency increment value into a linear increment value. The second converts the log waveform value into a linear waveform value. An algorithm ROM defines the 32 algorithms, specifying the behavior of each of the 6 operators in each algorithm. Another ROM changes the behavior of different notes and operators in a way that is still a mystery to me.

A ROM is arranged in a grid. At each position, silicon is doped to either create a transistor or no transistor, representing a 0 or 1. In a typical ROM, five address bits energize one of 32 vertical select lines to select one column of the ROM. The rows are organized in groups of 8 and three more address bits select one row from each group to yield output bits.
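
The column-and-row selection can be sketched as follows. The split of the address (low 5 bits for the column, next 3 bits for the row within each group) is an assumption made for illustration; the text above does not say which address bits play which role, and `grid` is a stand-in for the transistor pattern.

```python
def rom_read(grid, address, num_outputs):
    """Read one word from a ROM laid out as rows x 32 columns.

    grid[row][column] is the stored bit. Rows are organized in groups of 8,
    and each group contributes one output bit.
    Assumed address layout (hypothetical): bits 0-4 select the column,
    bits 5-7 select the row within every group.
    """
    column = address & 0b11111
    row_in_group = (address >> 5) & 0b111
    return [grid[group * 8 + row_in_group][column] for group in range(num_outputs)]

# Toy ROM with 2 output bits (16 rows); each cell records its own position
# so the selection is easy to check.
toy = [[(row, col) for col in range(32)] for row in range(16)]
print(rom_read(toy, (3 << 5) | 7, 2))  # [(3, 7), (11, 7)]
```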

The diagram below shows part of the ROM circuitry. The magnified portion has been colored to show the bits. The vertical column select lines of polysilicon are colored yellow. The ROM is programmed by the pattern of doped silicon (blue). A transistor is formed wherever a polysilicon line crosses a doped silicon region; these transistors are marked in red, and their pattern encodes the stored bits.

Closeup of the log-sine ROM showing individual bits.

The ROMs use several tricks to reduce space. Duplicate rows are folded together, such as high-order bits that are zero for a range of values. The sine ROM apparently uses delta encoding for alternating values; since the delta values are small, they have a lot of zero bits that can be folded. As a result, the values stored in the ROM are not obvious from the bit patterns. I'm still investigating the ROM representations and will discuss them later.

Adder

Another key building block of the chip is the adder, which sums two binary numbers. The chip has multiple adders: for the phase accumulators, inside the operators, and to apply the envelope.

A multi-bit adder is built from full adders, a circuit that adds two bits (along with a carry-in bit), and produces a sum bit (along with a carry-out bit). The diagram below shows how a one-bit full adder is implemented, adding bits A and B along with a carry-in, producing an output sum bit and a carry bit.[17] Note that the outputs are inverted; other parts of the circuitry deal with that.

Structure of the full-adder circuit used in the chip.
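
The gate structure follows the case analysis given in the notes: a carry-out occurs when both inputs are set or when a carry-in coincides with at least one input, and the sum reuses the inverted carry-out. A small Python sketch of that logic (producing non-inverted outputs for readability, unlike the chip):

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built the way the complex gates are described.

    All arguments and results are 0 or 1.
    """
    # Carry-out: both inputs set, or a carry-in plus at least one input.
    carry_out = (a & b) | (carry_in & (a | b))
    # Sum: all three set, or at least one set while carry-out is not.
    total = (a & b & carry_in) | ((a | b | carry_in) & (1 - carry_out))
    return total, carry_out

# Check every input combination against ordinary arithmetic.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            total, carry = full_adder(a, b, c)
            assert total + 2 * carry == a + b + c
```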

By combining multiple one-bit adders, multi-bit binary numbers can be added as shown in the 23-bit adder below. Note that the adder is at an angle relative to the shift registers. This is a clever trick for performance. One problem with adders is dealing with carries, which may need to propagate through all the bits. (The binary equivalent of needing to repeatedly carry the 1 when computing 999999+1.) The solution is to break the sum into 6 parts. Only 4 bits of each sum are added in each clock phase, so the carry only needs to propagate through 4 bits rather than all 23. The next chunk is added in the next clock phase, and so on.[18]

The phase adder is at the left of the shift registers that hold the 96 phase values.
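
The chunked addition can be illustrated with a short Python sketch. It processes the six 4-bit chunks of a single sum one after another in a loop, carrying one bit between chunks; on the chip, the stages work simultaneously on different sums, which this sequential model does not capture. The names are illustrative.

```python
CHUNK_BITS = 4   # bits added per clock phase
NUM_CHUNKS = 6   # 6 chunks of 4 bits cover the 23-bit value
WIDTH = 23

def chunked_add(x, y):
    """Add two 23-bit values 4 bits at a time, passing a carry between chunks."""
    mask = (1 << CHUNK_BITS) - 1
    carry = 0
    result = 0
    for chunk in range(NUM_CHUNKS):
        shift = chunk * CHUNK_BITS
        # The carry only has to ripple through these 4 bits.
        partial = ((x >> shift) & mask) + ((y >> shift) & mask) + carry
        result |= (partial & mask) << shift
        carry = partial >> CHUNK_BITS      # saved for the next chunk
    return result & ((1 << WIDTH) - 1)     # discard overflow beyond 23 bits

print(chunked_add(0x0FFFFF, 1) == 0x100000)  # True
```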

Bit shifter

The final building block that I'll discuss is the bit shifter, which shifts a binary value left or right; numerically, this is equivalent to multiplying or dividing by a power of 2. A typical shifter is built in two layers: the first layer shifts by 0, 1, 2, or 3 positions. The second layer shifts by 0, 4, 8, or 12 positions. The combination of the two layers permits any shift between 0 and 15 bit positions.
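
A quick sketch of why two layers suffice: the low two bits of the shift amount drive the first layer and the high two bits drive the second. The function below is illustrative and only models a right shift.

```python
def two_layer_shift_right(value, amount):
    """Shift `value` right by 0-15 positions using two cascaded layers."""
    if not 0 <= amount <= 15:
        raise ValueError("shift amount must be between 0 and 15")
    fine = amount & 0b0011     # first layer: shift by 0, 1, 2, or 3
    coarse = amount & 0b1100   # second layer: shift by 0, 4, 8, or 12
    return (value >> fine) >> coarse

# Every shift amount matches a single direct shift.
for amount in range(16):
    assert two_layer_shift_right(0xBEEF, amount) == 0xBEEF >> amount
```

Each layer needs only four control lines, compared with sixteen for a single-layer shifter covering the same range.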

The diagram below shows part of the shifter that shifts by 0, 1, 2, or 3 positions, controlled by the horizontal lines. I've highlighted one of the bits in green. If the "shift 0" line is activated, the leftmost green transistor (circled) will turn on and the green input bit will exit unshifted at the first output position. Likewise, if the "shift 1" line is activated, the second green transistor will turn on and the green bit will exit at the second position, shifted one position to the right. The "shift 2" and "shift 3" lines will cause the green bit to exit two or three positions to the right. The remaining transistors (circled in black) act in the same manner to shift the other bits. The result is that all the bits will pass straight through (shift 0), or be shifted 1, 2, or 3 positions to the right.

Detail of a shifter circuit.

Shifters are used in combination with the exponential ROMs to compute 2^x. The ROM is applied to the fractional part of x, while the shifter is controlled by the integer part. This is much more efficient than using a large ROM to look up the complete value. Another shifter provides a shift of 0 to 6 bits to scale the operator feedback value. A shifter also scales the output value to increase the dynamic range.
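
The split works because 2^(n + f) = 2^n × 2^f: a small table covers the fraction f, and the integer part n becomes a shift. The sketch below assumes a hypothetical fixed-point format with 10 fractional bits and a table built from floating-point math; the chip's actual ROM contents and word sizes are not described here.

```python
FRAC_BITS = 10  # hypothetical number of fractional bits in x

# Table of 2**f for each fraction f in [0, 1), scaled by 2**FRAC_BITS.
EXP_ROM = [round(2 ** (i / 2**FRAC_BITS) * 2**FRAC_BITS) for i in range(2**FRAC_BITS)]

def exp2_fixed(x_fixed):
    """Compute 2**x for a non-negative fixed-point x with FRAC_BITS fraction bits.

    The result is also scaled by 2**FRAC_BITS.
    """
    integer_part = x_fixed >> FRAC_BITS            # controls the shifter
    fraction = x_fixed & ((1 << FRAC_BITS) - 1)    # addresses the ROM
    return EXP_ROM[fraction] << integer_part

print(exp2_fixed(3 << FRAC_BITS) / 2**FRAC_BITS)   # 8.0
```

The table needs only 1024 entries regardless of how large the integer part can get, whereas a direct lookup would need an entry for every possible value of x.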

Combining and modulating operators with an algorithm

The DX7 generates each note by combining and modulating six operators (oscillators) according to a particular algorithm. This happens sequentially: the chip processes operator 6 for channels 1 through 16, then operator 5 for all the channels, and so forth, ending with operator 1. This cycle of 96 operations repeats, providing new sound values 49096 times a second.[19]
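
The processing order can be written out explicitly. This short sketch just enumerates the 96 time slots in the order described above; the function name is illustrative.

```python
def processing_order():
    """List the (operator, channel) pairs in the order the chip handles them."""
    return [(op, ch) for op in range(6, 0, -1) for ch in range(1, 17)]

order = processing_order()
print(len(order))            # 96
print(order[0], order[16])   # (6, 1) (5, 1)
print(order[-1])             # (1, 16)
```

Because a channel's higher-numbered operators are always handled before its lower-numbered ones, a modulator's output is ready by the time the operator it modulates comes up.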

The diagram below shows a typical algorithm. Operator 6 modulates operators 4 and 5, while operator 3 modulates operators 1 and 2, as well as itself. Operators 1, 2, 4, and 5 produce outputs, which are combined to create the final sound value. This section discusses the circuitry that performs the modulations for the specified algorithm.

Algorithm #21 combines the 6 operators in a specific way.

The diagram below shows the implementation of the circuitry to process operators. The lower "operator" box is the circuitry previously discussed: the first adder adds the modulation value f(ω_m t) to the current phase value kωt and looks up the value in the sine table. The second and third adders apply the envelope. Finally, the log/linear converter is implemented by the exponential ROM and shifter described earlier.

Diagram showing the construction of an operator, from the patent.

The upper half of the diagram determines the appropriate modulation value f(ω_m t) for the selected algorithm and operator. This circuitry is complicated, since there are 5 different cases that the circuitry must handle, chosen by the selector.[20] The top circuit (selector input 5) implements the feedback of an operator to itself. To provide feedback, the previous two values are stored in 16-stage shift registers, scaled by the feedback level parameter (FBL), and output as the modulation value. (Two previous values are averaged to stabilize the feedback.) Since the 16 channels are processed in sequence, the 16-stage shift registers store the feedback values until the next cycle. The next circuit (selector 4) uses the value of the self-feedback operator to modulate another operator. Selector 3 provides a shift register and adder to sum or delay values. (It is where multiple values are summed to produce the final output.) Selector 2 allows a sum to be used for modulation. Selector 1 is the simple case where the previous operator provides the modulation (e.g. 6 modulating 5). Finally, if no value is selected, the signal remains unmodulated. Control signals A, B, C, D, and E select the specific signal paths.
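
The self-feedback case can be sketched for a single operator as below. This is a floating-point illustration only: `feedback_scale` is a hypothetical stand-in for the effect of the feedback level shifter, and the two variables play the role that the 16-stage shift registers play for each channel.

```python
import math

def feedback_operator(phase_inc, feedback_scale, num_samples):
    """Sine operator modulated by the average of its two previous outputs."""
    prev1 = prev2 = 0.0   # the two most recent output values
    phase = 0.0
    out = []
    for _ in range(num_samples):
        # Average the previous two outputs, then scale by the feedback level.
        modulation = feedback_scale * (prev1 + prev2) / 2
        sample = math.sin(phase + modulation)
        out.append(sample)
        prev2, prev1 = prev1, sample
        phase += phase_inc
    return out

plain = feedback_operator(0.1, 0.0, 50)     # no feedback: an ordinary sine wave
with_fb = feedback_operator(0.1, 1.0, 50)   # feedback reshapes the waveform
```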

The diagram below shows the implementation of the modulation circuitry on the die. This circuitry corresponds to the upper part of the patent diagram above; the component numbers match the patent numbers. This circuitry occupies the middle portion of the die, with the shift registers taking up the bulk of the space. The adders and feedback level shifter are also visible.

Implementation of the modulation circuitry on the die.

The algorithms are specified by the algorithm ROM (below). This 192×9 ROM produces 9 control signals for the 6 operators in the 32 algorithms. The 16-stage shift register described earlier holds the selected algorithm numbers and provides the input to the ROM. Curiously, it appears that the chip permits each of the 16 notes to use a different algorithm, even though the DX7 does not support this feature.

The algorithm ROM. The circuitry at the top decodes the address (algorithm and operator number), selecting a column from the body of the ROM below. The 9 outputs (A, B, C, D, E, and S) are at the left.

Conclusion

The DX7 was a groundbreaking synthesizer and this chip was at the heart of it, so in a sense this chip was responsible for the 80's sound. Studying the chip's die reveals some interesting circuits. Uncovering the secrets of how the chip operates may help build more accurate DX7 emulators. The chip is complex and this article just scratches the surface so I plan to study the chip in more detail. In particular, I intend to extract the data from the ROMs to find out exactly how the waveforms are represented. In any case, I hope you've found this deep dive into a sound chip interesting.

I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed. Thanks to Jacques Mattheij and Anthony Richardson for providing the chip and discussion.

Notes and references

  1. The Economist published an article on how the DX7 changed modern music. The article called the DX7 "one of the most important advances in the history of modern popular music," altering the soundscape more than any instrument since the electric guitar. 

  2. The 40% number is from Prof. Megan Lavengood's detailed research on the DX7, in particular What Makes It Sound '80s: The Yamaha DX7 Electric Piano Sound. One interesting factor from Lavengood's research is the importance of preset sounds in the DX7, a feature that most earlier synthesizers didn't have. As a result, most users didn't program the DX7 but just pressed a button to use a preset sound. Programming the DX7 was much more difficult than analog synthesizers both because of the non-intuitive nature of FM synthesis and the DX7's arcane user interface: buttons and menus rather than knobs and sliders that provided immediate feedback. The DX7 also "democratized" the use of synthesizers through its low price: under $2000 (at the time), much cheaper than competing synthesizers. (The Fairlight CMI was $25,000 in comparison.) 

  3. To hear the DX7's 32 classic factory patches, check out this video. Some good examples of 80s songs using these patches are in this video.

  4. The DX7 contains two CPUs: a Hitachi 63B03 and a Hitachi 6805S, both related to the 8-bit Motorola 6800. These processors manage the keyboard, user interface controls, MIDI communication, low-frequency oscillator, and so forth. These processors were not powerful enough to do the sound synthesis; they sent data to the envelope and synthesis chips, which generated the sounds. 

  5. It's unclear if the official part numbers of the chips are YM2128/YM2129 or YM21280/YM21290. The chip package and die are labeled YM2128, but the circuit board, schematic, and documentation are labeled YM21280. The chip is also known as the FM Operator Type S chip or OPS chip. 

  6. I estimate that the chip has about 45,000 transistors, a bit less than the 80186 processor (1982). I measure the feature size as 3 µm, a step behind the 1.5 µm process introduced in 1981. My conclusion is that the chip was advanced, but not quite cutting-edge. The die is approximately 7.6×6.6mm. 

  7. The photo below shows the YM21280 chip, showing the staggered pins.

    The Yamaha YM21280 chip. Photo courtesy of Jacques Mattheij.

  8. I'm going over synthesizer history extremely briefly, so I'm oversimplifying things. For instance, there are different architectures for analog synthesizers, multiphonic analog synthesizers, digitally-controlled analog synthesizers, and so forth. Wikipedia provides a detailed history. 

  9. Typically, an envelope generator used an ADSR (attack, decay, sustain, release) model. The attack is the spike in amplitude when the key is pressed, followed by a decay to a lower level. The note remained at the sustain level as long as the key was pressed, and then fell off during the release phase. The times and levels could be adjusted as desired. For example, a piano-like sound has a rapid attack and decay for the initial sound, while a trumpet-like sound would have a slower attack as the note builds.

  10. The Musical Instrument Digital Interface (MIDI) standard was announced in 1982, allowing synthesizers to be controlled over a digital link. MIDI could be used for remote keyboards, playing notes via a sequencer, computer composition, and other applications. Although MIDI is a digital protocol, the first synthesizers to use it were analog, such as the Roland Jupiter-6, converting the digital messages to analog control voltages. 

  11. Technically, the DX7 uses phase modulation (PM) instead of frequency modulation (FM), but the two techniques are related. In phase modulation, the basic frequency stays constant but the phase of the signal is increased or decreased. But if the phase increases, the oscillations happen faster so the frequency is increased. Likewise, a decrease in phase stretches out the waveform, reducing the frequency. It turns out that phase modulation is the same as frequency modulation using the derivative of the modulation signal. (Note that if the phase shift is constant, the PM output has the original frequency, just shifted in time. But a constant modulation signal for FM results in a constant frequency shift.)

    Since the derivative of a sinusoid is another sinusoid, an FM signal and a PM signal look the same with sinusoidal modulation. However, the derivative is scaled by the frequency, with the result that PM signals are more sensitive to modulation by high frequencies than low frequencies. (An FM signal will have the same frequency sweep with slow modulation and fast modulation, while a PM signal will have little frequency change if the modulation is slow.) The results of frequency modulation and phase modulation will also be different for non-sinusoidal modulation, since the derivative will be different from the modulation signal. 

  12. Note that the frequency resolution in this example isn't very good if you use integers for the increment size. For example, an increment of 44 gives 440 Hz and an increment of 45 gives 450 Hz and you can't get a frequency in between. The solution is to include a fractional part in the increment and index to provide more control. 

  13. My synthesis widget illustrates FM synthesis (actually PM synthesis) in general. It doesn't simulate the DX7 specifically. 

  14. The DX7's envelopes are complex. A typical synthesizer's attack-decay-sustain-release envelope is defined by four parameters: the attack speed, decay speed, sustain level, and release speed. The DX7's envelope has eight parameters: L1-L4 and R1-R4, defining both the level and rate for the four phases, providing more control. Each of a sound's 6 operators has its own envelope, adding even more complexity. 

  15. I don't know yet how the negative half of the sine wave is represented logarithmically. My guess is that the sign is represented separately so the waveform remains positive. 

  16. Note that the bit shifters are unrelated to the shift registers, both in design and function. The shift registers are used for storage, shifting numbers through time. The bit shifters operate numerically, scaling a number. 

  17. The adder's complex gates make more sense if you think through the cases. You'll have a carry-out if both inputs A and B are set. You'll also have a carry-out if you have a carry-in and at least one of A or B. The sum bit will be set if you have A, B, and carry-in set, which is handled by the lowest AND gate. The sum bit will also be set if you have at least one of A, B, and carry-in, but you need to exclude the case where two of them are set, which is handled by ANDing in the inverted carry-out.

    The underlying reason for the complex OR-AND-NOR logic instead of multiple, simpler gates is that each NMOS gate requires a pull-up resistor. Thus, one complex gate may be smaller than several simple gates because you reduce the number of pull-up resistors. 

  18. The adder can be viewed as a six-stage pipeline, with each stage adding a few of the bits. A sum needs to pass through all the stages to be completely added. Note that the stages are all active at the same time, but they are acting on different sums. 

  19. Note that the algorithms are carefully designed so operators are modulated only by operators with a higher number. Thus, starting at #6 and ending at #1 ensures that values are calculated in the right order. The 32 algorithms make it look like almost anything is possible, but the hardware creates several constraints that limit the possibilities. For instance, there is only one sum/delay register so you can't sum modulators and the output at the same time. You can't delay a non-feedback operator after an output takes place; for instance, algorithm 11 has 6 delayed to modulate 3, but only because there haven't been any outputs at that point. You can only have one self-feedback loop. 

  20. The operator circuit is a bit tricky to understand. One factor to keep in mind is that the computation is spread out over time, computing one operator at a time. Moreover, the computations are interleaved across the 16 voices, so data needs to be stored in a shift register until the next operator is processed. Although the algorithms look straightforward in the diagrams ("operator 6 feeds into operator 5"), the implementation becomes complicated when this is split into time slices. 

  21. Patent 4554857 "Electronic musical instrument capable of varying a tone synthesis operation algorithm" provides detailed information on the architecture of the DX7 synthesizer. The DX7 Schematics provide circuit-level information, including the chip pinout (below). The DX7 Technical Analysis page summarizes what is known about the DX7's internals.

    The DX7 schematic provides the chip's pinout.
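
The carry and sum logic described in footnote 17 can be sketched in C as a quick sanity check. This is only a behavioral model of the Boolean expressions, not the chip's NMOS gate layout; the names `adder_result` and `full_adder` are made up for this illustration.

```c
#include <stdbool.h>

/* Result of adding three one-bit inputs. */
typedef struct {
    bool sum;
    bool carry_out;
} adder_result;

/* Behavioral model of the one-bit adder logic from footnote 17.
   (Illustrative only; the chip builds this from complex NMOS gates.) */
static adder_result full_adder(bool a, bool b, bool carry_in) {
    adder_result r;
    /* Carry-out: both inputs set, or a carry-in plus at least one input. */
    r.carry_out = (a && b) || (carry_in && (a || b));
    /* Sum: all three set, or at least one set while carry-out is clear
       (the inverted carry-out excludes the "two of them set" case). */
    r.sum = (a && b && carry_in) || ((a || b || carry_in) && !r.carry_out);
    return r;
}
```

Calling `full_adder` on all eight input combinations gives the same results as ordinary binary addition, which is the point of the footnote's case analysis.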

     

"Space age electronics": Inside a GE thin-film paperweight from the 1960s

In the early 1960s, General Electric developed a technology called thin-film electronics.1 These circuits were built from thin films of material, much more compact than individual components. For weight-sensitive applications such as satellites and military equipment, thin-film electronics could potentially be revolutionary.

The GE paperweight consists of circuitry and a satellite model encased in thick clear plastic. It is labeled "Light Military Electronics Department, Defense Electronics Division, General Electric. Space Age Electronics, thin film circuits."

GE's Light Military Electronics department1 built the paperweight above to showcase their "Space Age Electronics". In the center is a thin-film circuit, next to a model of an early satellite. However, the paperweight contained a surprise: when picked up, the paperweight emitted a beep-beep-beep noise, sounding just like a satellite.2 In this blog post, I reverse-engineer the "Space Age Electronics" inside this paperweight and explain how it works. In brief, the visible thin-film circuit implements a flip flop. The remaining circuitry is hidden in the compartment on the left: two oscillators that produce the beeps. These oscillators are implemented in another unusual 1960s technique called "cordwood".

The thin-film module

The most visible part of the paperweight is the thin-film module. The idea behind thin film is to build resistors and capacitors as thin layers on a substrate, rather than using individual components. Resistors are formed from thin strips of resistive material, the vertical reddish-brown lines on the module's surface. For higher resistance, these lines zig-zag back and forth.3 Capacitors are formed from two thin layers of metal (the plates), separated by an insulating dielectric material.

This angle view shows how the semiconductor components are mounted above the thin film circuitry.

Thin-film transistors were not commercially practical in the 1960s, so the module has tiny discrete transistors and diodes mounted on top, connected by golden wires. (This must have been expensive to manufacture.) In the photo above, the shadows show that the semiconductor components (black blobs) are slightly above the surface. You can distinguish the diodes by their green dots. At the left, five metallic strips provide power and signal connections to the module, with golden contacts connecting these strips to the thin-film circuitry.

A closeup of the thin-film module.

Interest in thin-film technology declined in the mid-1960s as integrated circuits became commercially available. Integrated circuits were cheaper, could fit more components into a chip, and could be mass-produced. For these reasons, integrated circuits took over the electronics market. Thin-film circuits are still used, but only for specialized applications.

I traced out the paperweight's thin-film circuit and found that it implements a toggle flip flop, a standard electronic circuit. The flip flop stores either a 1 state or a 0 state, like a single bit of memory. When it gets a negative pulse on the trigger input, it flips to the opposite state. Thus, as it receives input pulses, it goes "on", "off", "on", "off", etc. In the paperweight, the flip flop creates the separate beeps. The paperweight generates a beep while the flip flop is on, and is silent when the flip flop is off.

Schematic of the circuit in the thin-film module.

You can match up the components in the schematic with the components in the photo: two transistors, two diodes, four capacitors, and multiple resistors. Note that the two sides of the circuit are symmetrical, both in the schematic and in the photo. One side of the circuit is on and one side is off. Depending on which side is on, the circuit holds a 0 or a 1. See the footnote4 for more details.

Inside the paperweight

The left side of the paperweight has a compartment with some interesting circuitry inside. The paperweight was powered by a 22½ V battery, which was relatively common back then but is now obsolete. It looks a bit like a 9-volt battery, except it has one contact at each end. Next to the battery is a vintage earphone, the round pink component. It acts as the speaker in this device.

Looking inside the paperweight's compartment reveals more circuitry.

Another unusual component is the tilt switch in the lower right, which turns the paperweight on and off. (I don't know if this tilt switch contains mercury or has a rolling ball inside.) When the paperweight is horizontal, the tilt switch is open. But if the paperweight is picked up, the tilt switch closes. This probably added to the "drama" of the paperweight, since someone will think it is just a decoration until they pick it up and it starts beeping.

The tilt switch turns the paperweight on and off.

In the upper right of the compartment, a block of plastic encases the oscillator circuitry. The module is built with "cordwood" construction, a way of building high-density circuits that was popular in the 1960s. Instead of mounting components flat on a circuit board, cordwood puts components between two boards. (They are stacked together like logs, giving cordwood its name.) The photo below shows the components; it isn't as clear as I'd like because the components are embedded in yellowing plastic.

This view of the module shows three resistors (striped) and two capacitors (silver).

On each side of the module, components are wired with point-to-point wiring, as shown below. This photo also shows how the insulated connection wires are embedded in the module. The large dark circles are the two transistors.

Closeup of the cordwood module, showing the wiring. The transistors and the ends of the resistors and capacitors are visible.

The oscillators use unijunction transistors, a somewhat unusual type of transistor, different from standard NPN and PNP transistors. Oscillators could be easily created from unijunction transistors due to their nonlinear characteristics. The unijunction transistor was invented by General Electric in 1953, so it's not surprising that General Electric made use of them in this paperweight. The GE logo is visible on top of the transistors.

In this view of the module, the script "GE" logo is visible on top of the transistors. These transistors are part number 2N491.

The cordwood block holds two oscillators, to control the duration of each beep and to generate the beep sound itself. The first oscillator generates five pulses per second. These pulses go to the thin-film flip-flop circuit, which will change its state between off and on with each pulse. That is, the flip flop is off for 200 milliseconds, on for 200 milliseconds, and so forth. The output from the flip flop powers the second transistor oscillator, which generates a 3.5-kilohertz tone. The result is the repeating beep-beep-beep output from the paperweight.
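
The timing described above can be sketched as a small C model. This is only an illustration of the arithmetic, not a circuit simulation: it assumes the flip flop starts "off" and that the first trigger pulse arrives one period after power-on, neither of which the paperweight guarantees. The function names are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PULSE_RATE_HZ 5u   /* first oscillator: five pulses per second */
#define TONE_HZ 3500u      /* second oscillator: roughly 3.5 kHz tone */

/* Time between trigger pulses: 1000 ms / 5 = 200 ms. */
static uint32_t pulse_period_ms(void) {
    return 1000u / PULSE_RATE_HZ;
}

/* Toggle flip flop: every pulse flips the state.
   Assumption: state is "off" at t = 0, first pulse at t = 200 ms. */
static bool beep_is_on(uint32_t t_ms) {
    uint32_t pulses_so_far = t_ms / pulse_period_ms();
    return (pulses_so_far % 2u) == 1u;
}

/* Number of tone cycles that fit in one 200 ms beep. */
static uint32_t tone_cycles_per_beep(void) {
    return TONE_HZ * pulse_period_ms() / 1000u;
}
```

With these numbers, each beep holds about 700 cycles of the tone, followed by an equally long silence.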

Schematic of the unijunction transistor oscillators.

The schematic above shows the two oscillators. The idea behind a unijunction transistor oscillator is that the capacitor slowly charges through the resistor. As the capacitor charges, the voltage on the emitter (symbolized by the arrow) increases. When it reaches the trigger voltage, the transistor turns on and the capacitor discharges to ground. The cycle repeats, generating a sequence of pulses on the output.
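
To make the charge-and-discharge cycle concrete, here is a rough numerical sketch in C. It steps a capacitor charging through a resistor and counts how often the voltage reaches the trigger level. The component values in the usage note, the trigger ratio `eta`, and the simplification that the capacitor discharges instantly to 0 V are all assumptions for illustration; I did not measure the paperweight's actual components.

```c
#include <stdint.h>

/* Count output pulses from an idealized unijunction relaxation oscillator.
   r_ohms, c_farads: hypothetical timing resistor and capacitor.
   eta: fraction of the supply voltage at which the transistor triggers.
   Simplification: the capacitor discharges instantly and fully to 0 V. */
static uint32_t count_ujt_pulses(double r_ohms, double c_farads, double eta,
                                 double supply_v, double seconds) {
    const double dt = 1e-5;                   /* simulation time step (s) */
    const double trigger_v = eta * supply_v;  /* emitter trigger voltage */
    double cap_v = 0.0;
    uint32_t pulses = 0;
    uint32_t steps = (uint32_t)(seconds / dt);

    for (uint32_t i = 0; i < steps; i++) {
        /* Capacitor charges toward the supply through the resistor. */
        cap_v += (supply_v - cap_v) * dt / (r_ohms * c_farads);
        if (cap_v >= trigger_v) {
            cap_v = 0.0;   /* transistor fires, dumping the charge */
            pulses++;
        }
    }
    return pulses;
}
```

For example, `count_ujt_pulses(100e3, 2.2e-6, 0.6, 22.5, 10.0)` uses made-up values of 100 kΩ and 2.2 µF with the 22½ V supply and yields roughly five pulses per second, the same ballpark as the first oscillator.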

Conclusion

I think the paperweight is from approximately 1962, based on GE's thin-film research at the time and the appearance of the paperweight's model satellite.6 The paperweight was produced in the midst of the space race; John Glenn became the first American in orbit in 1962. Satellites were still a new "space-age" thing at the time, so the paperweight was a symbol of General Electric's advanced technology.5 The beeps from the paperweight are similar to those produced by Sputnik (1957). At the time, the paperweight must have been an impressive object, a vision of the future.

Thanks to Peter B. Newman, technology collector and educator, for sending me the paperweight for analysis. Thanks to Mikes Electric Stuff for identifying the tilt switch for me.

I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed.

Notes and references

  1. The paperweight was built by GE's Light Military Electronics department. In the early 1960s, this department produced aerospace electronics such as digital guidance computers, flight-control systems, and satellite sensors. These were used in weapons including the F-105 fighter-bomber, Sidewinder missile, and Polaris and Atlas ICBMs. 

  2. Sputnik's beeps were approximately 150-300 ms long at 1.5 kilohertz. (The frequency isn't well-defined because the transmission was just a carrier switched on and off, but this is the frequency in typical recordings.) The paperweight's beeps were approximately 200 ms long at 3.47 kilohertz. The point is that the paperweight's beeps were designed to resemble the beeps from a satellite such as Sputnik, and people would have recognized this at the time. You can hear the beeps of the paperweight here; I had to edit the audio a bit because I discovered too late that the doorbell rang in the middle of the recording. 

  3. In the module, some of the resistors are connected to the metal layer through structures that have teeth kind of like a comb. I'm not sure what the purpose of these structures is. My hypothesis is that by changing the number of "teeth", the active length of the resistor can be changed, adjusting the resistor. (Modifying the metal layer is easier than modifying the thin-film layers.) 

  4. The two transistors are cross-connected, so when one transistor is on, it forces the other one off. The trigger capacitors are pre-charged through the corresponding output. The result is that the transistor that is currently on (output low) will be pulled lower than the transistor that is currently off (output high). This turns off the first transistor, flipping the state of the circuit. It's a fairly standard flip-flop circuit; more details are here.

  5. In 1960, GE hoped to build a commercial communications satellite network and formed a subsidiary, "Communication Satellites Inc", for that purpose. However, GE abandoned that goal in 1961 (probably due to antitrust issues) to focus on manufacturing equipment for space vehicles. 

  6. The satellite in the paperweight resembles the Ariel 1 (1962) and Ariel 2 (1964) satellites, with its paddle-like solar cells. It's not an exact match, so I don't know if the satellite is an artist's conception, or is a different satellite. If you recognize the satellite, please let me know. 

Deep dive into how the Teensy microcontroller interacts with the Arduino library

The Arduino language lets you program microcontrollers at a high level, controlling I/O pins without worrying about exactly how the microcontroller works. But what's really going on behind the scenes? For my current project, I'm using a Teensy 3.6,1 a development board packaged in a breadboard-compatible 48-pin module that is considerably smaller than a classic Arduino.2 The Teensy uses a fairly powerful microcontroller, a 32-bit ARM processor running at 180 megahertz, and it is (mostly) compatible with the Arduino programming environment. I wanted to understand the low-level hardware better, so I investigated the implementation of one of the Arduino functions. Specifically, this post explains exactly how the analogWrite() function works in the Teensy 3.6. Disclaimer: this blog post goes into excessive detail on an obscure subject, so feel free to stop reading now :-)

An Arduino (top) and Teensy 3.6 (bottom).

analogWrite(): creating a PWM output

The Arduino IDE lets you quickly create an application using functions that abstract away the microcontroller's implementation details. In comparison, if you program a microcontroller directly, its hardware functions are activated by accessing special memory locations that act as control registers. There may be thousands of registers, different for each microcontroller, and described in thousand-page manuals, so programming a microcontroller directly can be daunting.

Using the Arduino library, you can put a voltage on an output pin with the analogWrite(pin, value) function. You specify a value between 0 and 256, where 0 is completely off and 256 is completely on and the library takes care of the details. For instance, analogWrite(pin, 64) produces an output value of 25% (i.e. 64/256). You might expect this would produce an analog voltage at 25% of the maximum, but despite the function's name, the output is not analog. Instead it is a digital pulse-width modulated (PWM) signal, which averages out to the desired value.3 As the oscilloscope trace below shows, the output switches between full-on and full-off, remaining on 25% of the time.4 Even though it doesn't produce a true analog output, the analogWrite function is useful for many tasks, such as controlling LED brightness.

Oscilloscope output showing the output from analogWrite().

The diagram below shows how the output changes with different analogWrite values, from 0 (completely off) to 256 (completely on). The main point is that the output is really digital, with a larger input parameter causing the output to be on for a larger fraction of the time. This technique is called Pulse Width Modulation (PWM), since the width of the pulse changes with the input.

Examples of different analogWrite values, from 0 to 256.

The diagram below illustrates how the microcontroller produces the PWM output. Internally, a timer repeatedly counts from 0 to 255, generating a counter value. Each time the timer starts at 0, the output is set high. When the timer matches the specified value (64 in this case), the output goes low. Thus, the match value controls how long the output remains high in each cycle; the larger the value, the longer the output remains high. The timer increments every 8 microseconds, so the total cycle length is 2048 microseconds, yielding a frequency of approximately 490 Hz (1/2048 µs ≈ 488 Hz).

A PWM output is implemented by a timer and a match value.
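
The counter-and-match idea can be modeled in a few lines of C. This is a sketch of the conceptual 0-to-255 timer described above, not the Teensy's actual hardware or library code; the function names are invented for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMER_STEPS 256u   /* conceptual counter runs 0..255 */
#define TICK_US 8u         /* each count lasts 8 microseconds */

/* Output level for a given counter value: high from 0 until the match. */
static bool pwm_output(uint32_t counter, uint32_t match) {
    return counter < match;
}

/* Microseconds the output stays high during one full timer cycle. */
static uint32_t high_time_us(uint32_t match) {
    uint32_t high_ticks = 0;
    for (uint32_t counter = 0; counter < TIMER_STEPS; counter++) {
        if (pwm_output(counter, match)) {
            high_ticks++;
        }
    }
    return high_ticks * TICK_US;
}

/* Length of one full cycle: 256 counts of 8 microseconds each. */
static uint32_t cycle_time_us(void) {
    return TIMER_STEPS * TICK_US;
}
```

A match value of 64 keeps the output high for 512 µs out of each 2048 µs cycle, which is the 25% duty cycle in the oscilloscope trace.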

The analogWrite function is sufficient for most purposes, but how does it work at the microcontroller register level? The manual for the Teensy's MK66FX1M0 processor explains how the chip's registers work, but is 2237 pages long. (I've extracted the relevant bits and give references to manual sections if you want to know more.) The code for the Teensy implementation of analogWrite is in a file called pins_teensy.c. Because the code supports multiple processors, it is full of #ifdefs; the Teensy 3.6 code is selected by the __MK66FX1M0__ and KINETISK5 defines, specifying the processor type and family. The code contains a bunch of case statements to handle all the different types of PWM pins. I'm using pin 30 in my example, which is defined in that file as FTM2_CH1_PIN (FlexTimer 2 Channel 1 pin). (I'll explain below why this timer is pin 30.)

The code to handle that pin is:

cval = ((uint32_t)val * (uint32_t)(FTM2_MOD + 1)) >> analog_write_res;
FTM2_C1V = cval;
FTM_PINCFG(FTM2_CH1_PIN) = PORT_PCR_MUX(3) | PORT_PCR_DSE | PORT_PCR_SRE;

As you can see, this code is much more complex than the analogWrite() call. In brief, the first line computes the counter value (match value) at which the output should go to 0. The second line stores this value into the timer control register. The third line configures pin 30 for the timer output. Next, I'll explain each of these lines in more detail.

The first line handles the difference between the conceptual timer (counting from 0 to 255) and the physical implementation of the timer, which is 16 bits and counts at a much higher rate. To match the Arduino's PWM frequency (490 Hz), the Teensy timer counts to 61439. This line scales the input value (0 to 256) to the desired range (0 to 61440). Specifically, the hardware register FTM2_MOD (timer 2 modulo) holds 61439, the value that this timer counts to.6 Multiplying the input value by 61440 and dividing by 256 scales the input value to the new range. (The value 8 for analog_write_res indicates 8 bits of count resolution, i.e. 256.)7
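
As a worked example of this scaling, the sketch below repeats the arithmetic of the first line as a standalone C function. The function name and parameters are mine, standing in for the `val`, `FTM2_MOD` and `analog_write_res` values in the library code.

```c
#include <stdint.h>

/* Scale an analogWrite-style value to the timer's count range.
   val: requested value (0..256 at the default 8-bit resolution)
   mod: the timer's modulo value (61439 in this example)
   resolution_bits: bits of resolution (8 by default) */
static uint32_t scale_pwm_value(uint32_t val, uint32_t mod,
                                uint32_t resolution_bits) {
    /* Multiply by the number of timer counts, then divide by 2^bits. */
    return (val * (mod + 1u)) >> resolution_bits;
}
```

For the running example, `scale_pwm_value(64, 61439, 8)` is 64 × 61440 / 256 = 15360, one quarter of the full count of 61440.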

The next line of code stores this value into timer 2's Channel 1 Value register FTM2_C1V,8 which controls the pulse width. This register holds the "match value"; when the timer counter reaches this value, the output drops to 0.

The third line configures pin 30 for the output from the timer. The FTM_PINCFG macro handles pin configuration, which in this case updates the configuration for pin 30 (CORE_PIN30_CONFIG).11 The PORT_PCR_MUX(3) macro selects the pin's function from the pin multiplexer, which I'll explain in the next section.10 The PORT_PCR_DSE option sets Drive Strength Enable, enabling high-current output. The PORT_PCR_SRE option sets Slew Rate Enable, slowing the pin's slew rate (how fast it changes value).9 These values are combined and stored in the appropriate bit fields of the Pin Control Register, shown below. (The macros ensure that each value goes into the right position.)

This diagram shows how multiple fields are packed into a Pin Control Register. (From Section 12.5.1 of the manual.)
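
To show how the three options pack into one register value, here is a small C sketch. The `PCR_*` macros are my own stand-ins for the library's `PORT_PCR_*` macros, and the bit positions (MUX in bits 8-10, DSE in bit 6, SRE in bit 2) are my reading of the register diagram; treat them as assumptions and check the manual before relying on them.

```c
#include <stdint.h>

/* Illustrative stand-ins for the PORT_PCR_* macros (assumed bit positions). */
#define PCR_MUX(n) ((uint32_t)(((n) & 7u) << 8))  /* pin function select */
#define PCR_DSE    ((uint32_t)0x40)               /* drive strength enable */
#define PCR_SRE    ((uint32_t)0x04)               /* slew rate enable */

/* Combine the fields the way the third line of the pin code does. */
static uint32_t pcr_value_for_timer_pin(void) {
    return PCR_MUX(3) | PCR_DSE | PCR_SRE;
}

/* Extract the multiplexer field back out of a register value. */
static uint32_t pcr_mux_field(uint32_t pcr) {
    return (pcr >> 8) & 7u;
}
```

With these positions, the combined value works out to 0x300 | 0x40 | 0x04 = 0x344.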

Filling in the macros, the original analogWrite(30, 64) call becomes:

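/* FTM2_C1V: match value, 64 * 61440 / 256 = 15360 */
*(volatile uint32_t *)0x400B8018 = 15360;
/* PORTB_PCR19: PORT_PCR_MUX(3) | PORT_PCR_DSE | PORT_PCR_SRE */
*(volatile uint32_t *)0x4004A04C = 0x344;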

Thus, in the end, the analogWrite call turns into two stores to microcontroller registers.

Determining the pin and its function

Pin configuration is more complex than you might expect. The problem is that the processor chip has 144 pins (in a 12×12 grid), but the microcontroller provides a much larger number of functions. The solution is that each pin has up to 8 different multiplexed functions, and you can select one of these functions for each pin. Thus, you can't use all the features of the chip at the same time, but hopefully you can use the features you need.

The chip has a 12×12 grid of solder balls on the bottom. (Photo from Digi-Key.)

In the example I'm using GPIO pin 30, but this pin number is part of the Arduino API: the microcontroller has no pin 30. So how does pin 30 get a meaning? In this section, I explain how pin 30 maps onto a physical pin of the microcontroller (pin D11 in this case) associated with a PWM timer (FlexTimer 2 channel 1 in this case).

The function of each Teensy pin is documented, but I wanted to figure out "from scratch" what GPIO pin 30 means. Looking at the schematic shows the Teensy's pin 30 is connected to pin D11 of the processor, which is labeled "PTB19". (Processor pins are labeled with a letter and number corresponding to the pin's grid position.)

Detail of the Teensy 3.6 schematic showing microcontroller pin D11 is connected to Teensy GPIO pin 30.

Chapter 11 of the manual lists the names and functions for each pin (excerpted below). As mentioned earlier, each physical pin supports multiple functions. Pin D11 has the official name "PTB19" and has seven different functions assigned to it: Touch Screen, GPIO PorT B, CAN bus, FlexTiMer FTM2_CH1 (that we're using), I2S audio, FlexBus, and FlexTiMer 2 Quadrature Decoder.

This excerpt from the manual shows the functions that can be assigned to pin D11.

Each pin has a multiplexer (MUX) that selects which function is assigned to the pin. In order to use the timer with pin D11, the pin configuration register (PCR) for D11 must be configured to assign function 3 to this pin. This was done with the macro discussed earlier, PORT_PCR_MUX(3). Thus, when an analogWrite is performed, the pin is configured to use the appropriate timer.

Initialization

Another piece necessary to make this work is the Teensy's initialization code. The main routine in main.cpp calls _init_Teensyduino_internal_(), which performs the necessary register initialization. The timer 2 initialization code is:

FTM2_CNT = 0;
FTM2_MOD = DEFAULT_FTM_MOD;
FTM2_C0SC = 0x28;
FTM2_C1SC = 0x28;
FTM2_SC = FTM_SC_CLKS(1) | FTM_SC_PS(DEFAULT_FTM_PRESCALE);

This sets the initial counter value to 0 and sets the modulo value (maximum count) to 61439 as discussed earlier. The FTM2_C0SC and FTM2_C1SC lines enable PWM mode. The FTM2_SC line sets up the timer clock.12
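
The relationship between the modulo value, the prescaler, and the PWM period can be checked with a little C arithmetic. This sketch assumes the 60 MHz bus clock and prescale value of 1 mentioned in the footnotes, and that a prescale of n divides the clock by 2^n; the function names are invented for the illustration.

```c
#include <stdint.h>

/* Timer tick rate: the input clock divided by 2^prescale (assumed). */
static uint32_t timer_hz(uint32_t bus_hz, uint32_t prescale) {
    return bus_hz >> prescale;
}

/* PWM period in microseconds: the timer counts (mod + 1) ticks per cycle. */
static uint32_t pwm_period_us(uint32_t bus_hz, uint32_t prescale,
                              uint32_t mod) {
    uint64_t ticks = (uint64_t)mod + 1u;
    /* 64-bit math avoids overflow when scaling to microseconds. */
    return (uint32_t)(ticks * 1000000u / timer_hz(bus_hz, prescale));
}
```

With a 60 MHz bus clock, a prescale of 1, and a modulo of 61439, the timer ticks at 30 MHz and one cycle takes 2048 µs, matching the Arduino's PWM period.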

The last piece is how the code knows the processor type. To support multiple processor types, the files are full of #ifdefs, but where do these get defined? The answer is that the board type and CPU speed are set in the Arduino IDE. The IDE uses these settings to generate flags that are passed to the compiler when compiling the code. The relevant lines for the Teensy 3.6 are in the file hardware/teensy/avr/boards.txt:

teensy36.build.flags.defs=-D__MK66FX1M0__ -DTEENSYDUINO=153
teensy36.menu.speed.180.build.fcpu=180000000
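
As a rough illustration of how such a flag selects code, the sketch below uses the same preprocessor pattern. The `#define` at the top only stands in for the `-D__MK66FX1M0__` compiler flag so the example compiles by itself; the `DEMO_` names and `demo_board_name()` are hypothetical and do not come from the Teensy source.

```c
/* Stand-in for the -D__MK66FX1M0__ flag that the IDE passes to the compiler. */
#ifndef __MK66FX1M0__
#define __MK66FX1M0__ 1
#endif

/* Processor-specific selection, in the style of the Teensy's #ifdefs.
   (Hypothetical names; the real files define registers and families here.) */
#if defined(__MK66FX1M0__)
#define DEMO_FAMILY_KINETIS_K 1
#define DEMO_BOARD_NAME "Teensy 3.6"
#else
#define DEMO_BOARD_NAME "unknown board"
#endif

/* Code elsewhere just uses the selected definitions. */
static const char *demo_board_name(void) {
    return DEMO_BOARD_NAME;
}
```

The rest of the code never tests the board type at run time; the choice is baked in when the preprocessor runs.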

Conclusion

At this point we've reached the foundation. To summarize, the board that you select in the Arduino IDE causes various flags to be passed to the C++ compiler. These flags, in turn, select numerous definitions of registers for that processor, along with the appropriate code. The result is that a function call such as analogWrite(30, 64), acting on an abstract pin 30, gets converted to operations on special microcontroller registers, causing the microcontroller's circuitry to output the desired signal.

It may seem like magic that high-level operations end up doing the right thing across a wide range of microcontrollers, but this is one of the key accomplishments of the Arduino ecosystem. If you really need to know what's going on, I've shown how these abstractions can be unwrapped. But for the most part, the complexity underneath can fortunately be ignored.

I announce my latest blog posts on Twitter, so follow me @kenshirriff. I also have an RSS feed. I wrote about Arduino PWM and its registers in detail here if you want to know more about PWM. Thanks to Paul Stoffregen for answering my questions about Teensy.

Notes and references

  1. Why am I using a Teensy 3.6 instead of a newer model? Because the more recent Teensy 4.1 was out of stock. 

  2. There are also Arduino models in the DIP form factor, such as the Arduino Nano and Arduino Micro. Arduino also has high-power models such as the 32-bit ARM-based Arduino Portenta.

  3. The Teensy 3.6 has two digital-to-analog converter (DAC) outputs. For those two pins, the analogWrite() function produces a genuine analog voltage, not a PWM output. 

  4. The PWM output has a period of 2048 µs, yielding a frequency of about 490 Hertz. The output is controlled in units of 8 µs, so an input value of 1 yields a pulse width of 8 µs, an input of 64 yields a pulse width of 512 µs and so forth. 

  5. I tried to sort out what "Kinetis" means. NXP has many different microcontrollers and Kinetis is their family of 32-bit mixed-signal ARM Cortex microcontrollers, introduced in 2010. The Kinetis family includes the high-performance K series and the low-power L series. The Teensy 3.x boards use the Kinetis K series and have the preprocessor variable KINETISK defined, while the Teensy LC board uses a Kinetis L processor and has KINETISL defined. 

  6. The variable FTM2_MOD is defined as the address (400B8008) of the FTM2 modulo register in kinetis.h. Why is the modulo set to 61439? The goal is to make the PWM period match the Arduino's 2048 µs period (approximately 490 Hertz). To see how this happens, start with the Teensy's clock frequency (F_CPU) of 180 MHz. kinetis.h sets the bus frequency F_BUS to 60 MHz based on this. Then pins_teensy.c uses this for the timer frequency F_TIMER. For a frequency of 60 MHz, pins_teensy.c sets DEFAULT_FTM_MOD to 61439 and DEFAULT_FTM_PRESCALE to 1. This prescale value causes the timer to divide its input frequency by 2, so the timer runs at 30 megahertz. At this frequency, 61440 ticks will take 2048 µs as desired.

    Figuring out the address for FTM2_MOD is more confusing than I expected. If you look at the memory map in the manual (Section 45.4.2), the address for FTM2_MOD is 4003A008 (Peripheral bridge 0), but the Teensy uses address 400B8008 (Peripheral bridge 1, Table 5-3), see kinetis.h. It turns out that the chip has two paths for accessing peripherals: AIPS0 and AIPS1. The timer can be accessed through both paths, but with different register addresses.

    Another confusing thing is that if you try to access FTM2_MOD through the first address, the Teensy will crash. The reason is that the microcontroller lets you conserve power by turning off the clock to each module, a function called "clock gating". If you try to access a peripheral when the clock is disabled, the system terminates with an error. The two different paths to the timer are controlled by separate clocks. Specifically, access through AIPS0 is enabled through System Clock Gating Control Register 6 (SIM_SCGC6, section 13.2.16), while access through AIPS1 is enabled through SIM_SCGC3 (section 13.2.13). The Teensy startup code enables timer FTM2 through clock gating register SIM_SCGC3 (for AIPS1) but not SIM_SCGC6 (for AIPS0). Thus, accessing the timer through AIPS1 works, but accessing it through AIPS0 crashes. This thread has more information. 

  7. By default, the value to analogWrite() can range from 0 to 256, i.e. 8 bits of resolution. However, the resolution can be changed by calling analogWriteResolution. Higher resolution gives finer-grain control over the PWM width.

    The Teensy extensions to Arduino include a function analogWriteFrequency(), which provides a more convenient way of modifying the PWM frequency. 

  8. The Register Descriptions section (45.4.2) describes the memory address for each register. FTM2_C1V is the "Channel Value" at address 4003A018. Section 45.4.7 explains that this register holds the 16-bit counter value that the timer matches against. 

  9. On my breadboard, a signal has a rise time of 7.5 nanoseconds with slew rate disabled and 15 nanoseconds with slew rate enabled. The fast signal has a bunch of ringing, while the slower signal rises smoothly. 

  10. The Pin Control Register is described in section 12.5.1 with details in chapter 11, Signal Multiplexing and Signal Descriptions. 

  11. The macro FTM_PINCFG(FTM2_CH1_PIN) turns into CORE_PIN30_CONFIG, the appropriate configuration register. This is defined in core_pins.h as PORTB_PCR19. The manual (section 12.5) specifies that PORTB_PCR19 (Port B Pin Control Register 19) has address 4004A04C. 

  12. Register constants FTM2_C0SC and FTM2_C1SC are set to 0x400B800C and 0x400B8014 respectively in kinetis.h. The manual defines these addresses (section 45.4.2) as 4003_A00C and 4003_A014. (The differences are because the timer can be accessed through a different path (Peripheral Bridge 1) at address 400B_8xxx.) These registers are Channel 0/1 Status and Control, discussed in manual section 45.4.6. Each register has 7 bit fields that control the timer function. The initialization value 0x28 selects Edge-Aligned PWM with high-true pulses.

    Register constant FTM2_SC (timer 2 Status and Control) has address 400B8000 in the code and 4003A000 in the manual. Its fields are described in manual section 45.4.3. FTM_SC_CLKS(1) sets the CLKS field to use the system clock as the timer input. FTM_SC_PS sets the prescale to divide the clock by 2, as discussed earlier. 

Reverse-engineering a vintage OR/NOR chip

Recently, I received a die photo of a mystery integrated circuit, the OQ100,1 from EvilMonkeyDesignz. I analyzed the die photo and found that it is a logic chip implemented with fast ECL (Emitter-Coupled Logic) circuitry, probably from the early 1970s. The chip contains three logic gates, two with 2 inputs and one with 4 inputs. Each gate has non-inverted and inverted outputs, acting as both an OR gate and a NOR gate. This blog post summarizes my investigation. (I also recently analyzed the OQ104, a different chip in this series.)

Die photo of the Philips OQ100 chip. Click this photo (or any other) for a larger version. Photo courtesy of EvilMonkeyDesignz.

The die photo above shows the chip under the microscope. Most of the silicon appears bright pink in this image. Regions of silicon with different doping appear green or yellowish and form the transistors and resistors of the chip. The speckled regions are the metal layer on top of the silicon, wiring the circuitry together. Around the edges, the black bond wires connect the chip to the external pins.

The chip's components

Transistors are the key components in a chip. This chip uses a type of transistor called an NPN transistor. The photo below shows a transistor as it appears on the chip. Underneath the photo is a cross-section drawing showing approximately how the transistor is constructed. The transistor is more complicated than the N-P-N sandwich you see in books, but if you look carefully at the vertical cross-section below the 'E', you can find the N-P-N layers that form the transistor. The emitter (E) wire is connected to N+ silicon. Below that is a P layer connected to the base contact (B). And below that is an N layer connected to the collector (C).

Structure of an NPN transistor. Top: transistor as it appears on the die. Bottom: cross-section diagram.

The chip also uses a few PNP transistors. Although you might expect a PNP transistor to simply be the reverse of an NPN transistor, it has a different structure, with the regions arranged laterally instead of vertically. The collector and base form concentric square rings around the emitter. The base wire is not connected to the base region directly. Instead, the wire is at a distance, and the base signal travels underneath through the N layer.

Structure of a PNP transistor. Top: transistor as it appears on the die. Bottom: cross-section diagram.

The PNP transistors in this chip have another complication. The collector is split, so the transistor has two collectors. Moreover, one of the collectors is wired directly to the base. In the photo above, you can see how the collector region is split vertically, so there is one collector on the left and one on the right, with the collector on the right connected to the base. This construction may seem bizarre, but it is common in integrated circuits. The motivation is to build a current mirror, where both collectors pass the same current.

The other key components of this chip are the resistors. The photo below shows two resistors as they appear on the die. The resistors are formed from strips of higher-resistance P silicon, which appears pink in the die photos. Each end of a resistor is connected to the metal layer; the metal in the middle connects the two resistors together in series. (A metal wire also passes over the resistor but is not connected.) A resistor has higher resistance if it is longer and narrower, so these resistors have relatively high resistance.2 Resistors are relatively large on an integrated circuit and fairly inaccurate.

Two resistors as they appear on the die.
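
The relationship between a resistor's shape and its resistance can be put into numbers with a minimal sketch. The sheet resistance (ohms per square) and the strip dimensions are hypothetical values chosen for illustration, not measurements from this die.

```python
def strip_resistance(length_um, width_um, sheet_ohms_per_square):
    """Resistance of a uniform rectangular strip: R = R_sheet * L / W."""
    return sheet_ohms_per_square * length_um / width_um

# Hypothetical values for illustration only.
SHEET = 200.0  # ohms per square (assumed)

base = strip_resistance(100, 10, SHEET)
longer = strip_resistance(200, 10, SHEET)  # twice as long: like two in series
wider = strip_resistance(100, 20, SHEET)   # twice as wide: like two in parallel

print(base, longer, wider)
```

Doubling the length doubles the resistance and doubling the width halves it, which is why a long, narrow strip is the natural shape for a high-value resistor.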

The circuitry

Once the components can be recognized on the die, the circuit can be traced out and reverse-engineered. But before I describe the complete circuit, I'll explain how ECL (Emitter-Coupled Logic) works.3 The schematic below shows a differential pair, or long-tailed pair, which amplifies the difference between its two inputs. (This circuit is also common in analog circuits, forming the heart of an op-amp.) The basic idea is that a current sink (the circle at the bottom) generates a fixed current I. This current gets split between the left path (I1) and the right path (I2). If the transistor on the left has a higher input voltage than the transistor on the right, most of the current will go to the left. But if the transistor on the right has a higher input, most of the current will go to the right. This circuit amplifies the voltage difference: even a small difference between the two inputs will switch most of the current from one side to the other.

Schematic of a simple differential pair circuit. The current sink sends a fixed current I through the differential pair. If the two inputs are equal, the current is split equally between the two branches. Otherwise, the branch with the higher input voltage gets most of the current.
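
To see how sharply the current switches, here is a rough sketch using the textbook model of an ideal differential pair with matched transistors. The exponential model, the 26 mV thermal voltage, and the 1 mA tail current are assumptions for illustration; none of them is measured from this chip.

```python
import math

V_T = 0.026  # thermal voltage in volts at room temperature (approximate)

def split_current(v_left, v_right, i_tail):
    """Ideal matched-transistor model: return (I1, I2) for the two branches."""
    i_left = i_tail / (1.0 + math.exp(-(v_left - v_right) / V_T))
    return i_left, i_tail - i_left

I_TAIL = 1e-3  # hypothetical 1 mA tail current

for diff_mv in (0, 26, 100, -100):
    i1, i2 = split_current(diff_mv / 1000.0, 0.0, I_TAIL)
    print(f"{diff_mv:+5d} mV -> left {i1 / I_TAIL:6.1%}, right {i2 / I_TAIL:6.1%}")
```

In this idealized model, equal inputs split the current evenly, while a difference of just 100 mV steers roughly 98% of the current to one side.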

To make this into an OR gate, we can put multiple transistors on the left. If any input is high, the current will be switched to the left; otherwise, the current will be switched to the right. Since the current pulls the side it flows through low, the left branch will be the NOR output while the right branch will be the OR output. (With ECL, you get both the complemented and uncomplemented outputs "for free".) The schematic below shows how one logic gate is implemented; this is a two-input gate. The gate uses a second differential pair to buffer and amplify the outputs. The current sink circuit is discussed in a footnote.4

Schematic of one logic gate.
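
The logical behavior of the gate can be summarized in a short sketch that ignores voltages entirely and treats each input as a Boolean. The function name is hypothetical; it models only the current steering, not the chip's electrical levels.

```python
from itertools import product

def ecl_gate(*inputs):
    """Return (or_out, nor_out) for an idealized ECL OR/NOR gate."""
    current_on_left = any(inputs)   # any high input steers current to the input side
    nor_out = not current_on_left   # the side carrying the current is pulled low
    or_out = current_on_left        # the other side is high exactly when the current is on the left
    return or_out, nor_out

# Truth table for a two-input gate.
for a, b in product((False, True), repeat=2):
    or_out, nor_out = ecl_gate(a, b)
    print(int(a), int(b), "OR =", int(or_out), "NOR =", int(nor_out))
```

Calling `ecl_gate` with four arguments models the four-input gate in the same way, since the extra inputs are just more transistors on the same branch.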

The diagram below shows how the four-input gate is implemented on the die. The majority of the area is occupied by the current sink and the associated resistors. The NPN and PNP transistors are relatively compact, but the resistors occupy a lot of space. At the bottom, the four input transistors implement the OR function, along with the reference transistor on the other branch. The output transistors are larger so they can provide more current.

One gate with functional blocks labeled.

The diagram below shows how the three gates are arranged on the die. (The gate described above is at the right.) The voltage divider resistors provide a voltage reference for the current sinks.

The die with major functional blocks labeled.

Putting everything together, the diagram below shows how the circuitry of the chip maps onto its 16 pins. The three OR gates are represented by the OR symbols; the gate on the right has four inputs. Each gate has a non-inverted output and an inverted output, which is indicated by a bubble. I don't know what voltages the chip takes, so I've indicated the power pins with + and -.

Reverse-engineered pinout of the chip.

I announce my latest blog posts on Twitter, so follow me @kenshirriff. Many thanks to EvilMonkeyDesignz for providing the photos; follow on Instagram or Twitter for more interesting die photos.

Notes

  1. A reader said that Philips used the OQ designation for their custom integrated circuits. That would explain why I was unable to find these chips in a databook. 

  2. The resistance of a resistor is proportional to the length divided by the width. To understand this, note that a region twice as long is the same as two resistors in series, so it has twice the resistance. A region twice as wide is the same as two resistors in parallel, so it has half the resistance. 

  3. The logic circuit on this chip has a couple of differences from standard ECL gates. A typical ECL gate has inputs on one branch and a reference voltage connected to the transistor on the other branch. Thus, an input higher than the reference voltage is a logic 1, and an input lower than the reference voltage is a logic 0. This gate, however, uses the output of the first branch as the input to the second branch. If an input is high, it pulls this output low, shutting off the other branch. Conversely, if the input is low, the output goes high, turning on the other branch.

    I'm not sure what the motivation is for this design. It looks a bit like NTL (Non-Threshold Logic), since there isn't a threshold set by a reference voltage. One possibility is that the circuit implements a Schmitt trigger, a circuit with hysteresis, where once it turns on, the input must drop significantly lower to turn it off.

    The second difference between this circuit and a typical ECL gate is the output buffer. ECL gates typically use an emitter follower, rather than a second differential pair. 

  4. I'll just briefly describe the current sink circuit, shown below. Two large resistors form a voltage divider that produces a reference voltage midway between the two supply voltages (maybe 0 volts). Due to the behavior of transistors, VBE will be one diode drop (~0.7 V). The rest of the circuit generates the "correct" current through the lower-right resistor to achieve this voltage drop. On the chip, the two PNP transistors at the top are one transistor with two collectors. They implement a current mirror, where the current through the right transistor matches the current through the left transistor.

    The current sink circuit used in the chip. The divider is shared by all the current sinks on the chip.
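
To give a feel for the numbers in note 4, the sketch below computes the divider midpoint and the resulting sink current. Since the chip's supply voltages and resistor values are unknown, the ±5 V supplies, the equal divider resistors, and the 700 Ω resistor are hypothetical placeholders.

```python
V_BE = 0.7  # approximate diode drop in volts, as in note 4

def divider_midpoint(v_pos, v_neg):
    """Voltage midway between the supplies, assuming two equal resistors."""
    return (v_pos + v_neg) / 2.0

def sink_current(r_ohms, v_be=V_BE):
    """Current that puts one diode drop across the lower-right resistor."""
    return v_be / r_ohms

# Hypothetical supplies and resistor value, for illustration only.
print(divider_midpoint(5.0, -5.0))  # reference voltage in volts
print(sink_current(700.0))          # sink current in amps
```

The point of the design is that the sink current depends only on the diode drop and one resistor, so it stays roughly constant regardless of the gate's inputs.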
