Pulse Code Modulation
The Objective of this Technology Primer
The objective of this technology primer is to provide you with a basic understanding of the principles of Pulse Code Modulation (PCM). It is intended as a preamble to the Android development tutorials that look at producing synthesized sounds, so the focus here is on PCM as applied to high fidelity audio recording. However, the terminology explained in this primer applies to all forms of PCM.
An Introduction to Pulse Code Modulation
Pulse Code Modulation, or PCM, is quite simply a method by which analogue signals are digitized prior to transmission or storage. It has its origins in the early part of the 20th century and was first implemented in some specialist military systems during and just after the Second World War. Through the 1950s and 1960s, PCM was developed for use in telecommunication systems, and the results of this work can still be found in the basic building blocks of PDH and SDH transmission systems.
The diagram below illustrates the basic principles of PCM.
Figure 1. A digitized sine wave
Here a sine wave is represented by a series of digital values measured periodically over time. In this example the analogue value of the wave is represented by a 4-bit binary code that provides 16 levels. The scheme in this example therefore has a bit depth of 4.
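To make the idea concrete, the following is a minimal sketch of how one cycle of a sine wave could be turned into 4-bit codes. The number of samples per cycle and the helper name quantize are assumptions chosen for illustration; they are not taken from Figure 1.

```python
import math

BIT_DEPTH = 4                  # bits per sample, as in the example above
LEVELS = 2 ** BIT_DEPTH        # 4 bits give 16 quantization levels
SAMPLES_PER_CYCLE = 16         # assumed value, chosen only for illustration


def quantize(value, levels=LEVELS):
    """Map an analogue value in [-1.0, 1.0] to an integer code 0..levels-1."""
    scaled = (value + 1.0) / 2.0 * (levels - 1)   # shift and scale to 0..levels-1
    return int(round(scaled))                     # round to the nearest level


codes = []
for n in range(SAMPLES_PER_CYCLE):
    analogue = math.sin(2 * math.pi * n / SAMPLES_PER_CYCLE)
    codes.append(quantize(analogue))

# Print each sample as a 4-bit binary code
for code in codes:
    print(format(code, "04b"))
```

Each printed line is one sample. The codes rise towards 1111 at the positive peak of the wave and fall towards 0000 at the negative peak.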
A sample is taken every time period ’t’, and this time period must be uniform. In the above example, the analogue level of the signal may fall between two digital levels when the sample is taken and must be rounded or truncated. This leads to what is called a quantization error. Another factor that affects the accuracy of the resultant digital signal is the number of samples that are taken in any time period. The analogue signal may change in amplitude between samples, and this information is simply lost in the digitized signal. We use the term fidelity when comparing the digital signal to the original. The higher the fidelity, the closer the digital signal is to the original. We can increase the fidelity of the digitized signal by increasing the number of quantization levels and the number of samples we take in any time period. The number of samples taken per unit of time is called the sample rate and is 1/(sample period).
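The effect of bit depth on quantization error, and the relationship between sample period and sample rate, can be sketched as follows. This is an illustration under assumed values: the function name max_quantization_error, the 1000 test points and the 125 microsecond sample period are all chosen here for the example.

```python
import math


def max_quantization_error(bit_depth, samples=1000):
    """Largest rounding error seen when quantizing one cycle of a sine wave."""
    levels = 2 ** bit_depth
    step = 2.0 / (levels - 1)          # spacing between levels over [-1.0, 1.0]
    worst = 0.0
    for n in range(samples):
        analogue = math.sin(2 * math.pi * n / samples)
        code = round((analogue + 1.0) / step)    # nearest quantization level
        recovered = code * step - 1.0            # value the code stands for
        worst = max(worst, abs(analogue - recovered))
    return worst


error_4_bit = max_quantization_error(4)
error_16_bit = max_quantization_error(16)
print(error_4_bit, error_16_bit)

# Sample rate is the reciprocal of the sample period
sample_period_s = 0.000125             # assumed period of 125 microseconds
sample_rate_hz = 1 / sample_period_s
print(sample_rate_hz)
```

With rounding, the error can never exceed half the spacing between two levels, so adding bits shrinks the worst-case error.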
So far we have only discussed linear PCM schemes, where the spacing between quantization levels is even. For PCM transmission in telecommunication this may not be the case. Non-
The bit depth and sample rate used for any particular scheme are chosen based on the following factors:
The Nyquist Rate
We set the sample rate to at least twice the highest frequency we need to sample. This is referred to as the Nyquist rate. The electronic engineer Harry Nyquist proposed that the maximum frequency that can be captured by sampling is half the sampling rate; this limit is referred to as the Nyquist frequency. Frequencies above the Nyquist frequency are lost and will not be present within the recovered signal. This helps to explain why specific sample rates are chosen. For voice transmission in telecommunication systems, 8 kHz is used as the sampling rate. The frequency range for human speech is approximately 300 Hz to 3400 Hz. If we set the sample rate to at least twice the highest frequency, then 8 kHz is a good round value to choose. If we turn our attention to the digitization of music, then we must allow for the full range of human hearing, which is generally understood to be from 20 Hz to 20 kHz. The sampling rate for CD audio is 44.1 kHz.
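The arithmetic above can be checked with a short sketch. The function names are hypothetical, and the 3 kHz and 5 kHz test tones are assumptions chosen to show what happens on either side of the 4 kHz Nyquist frequency of an 8 kHz scheme.

```python
import math


def min_sample_rate(highest_frequency_hz):
    """Nyquist rate: twice the highest frequency we need to sample."""
    return 2 * highest_frequency_hz


def nyquist_frequency(sample_rate_hz):
    """Highest frequency that a given sample rate can capture."""
    return sample_rate_hz / 2


def sample_cosine(frequency_hz, sample_rate_hz, count):
    """Return `count` samples of a cosine tone taken at `sample_rate_hz`."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]


print(min_sample_rate(3400))     # 6800, so 8 kHz is sufficient for speech
print(min_sample_rate(20000))    # 40000, so 44.1 kHz is sufficient for music
print(nyquist_frequency(8000))   # 4000.0
print(nyquist_frequency(44100))  # 22050.0

# A 5 kHz tone sampled at 8 kHz yields the same samples as a 3 kHz tone,
# so the 5 kHz tone cannot be recovered from the digitized signal.
above = sample_cosine(5000, 8000, 16)
below = sample_cosine(3000, 8000, 16)
indistinguishable = all(abs(a - b) < 1e-9 for a, b in zip(above, below))
print(indistinguishable)
```

The last check illustrates why a frequency above the Nyquist frequency does not survive sampling: its samples are identical to those of a lower frequency.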
So why 44.1 kHz and not 40 kHz? I do not propose to go through the details here, but in the early days of CD recording, existing video equipment was adapted to store the audio data. 44.1 kHz happened to be a value that allowed the audio data to be stored on both PAL and NTSC video equipment. There are other high fidelity audio PCM schemes that use different sampling rates, but 44.1 kHz is the rate adopted for CD audio.
What we have not discussed as yet is the bit depth for audio recording. The PCM scheme for audio CD has a bit depth of 16, giving us a total of 65536 quantization values. Audio CD is an uncompressed linear PCM scheme.
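These figures lead directly to the amount of data an uncompressed scheme produces. The sketch below works through the arithmetic; the channel count of 2 reflects the fact that CD audio is stereo, which is not stated in the text above.

```python
BIT_DEPTH = 16            # bits per sample for CD audio
SAMPLE_RATE_HZ = 44100    # samples per second for CD audio
CHANNELS = 2              # CD audio is stereo (not stated in the text above)

levels = 2 ** BIT_DEPTH
bits_per_second = SAMPLE_RATE_HZ * BIT_DEPTH * CHANNELS
bytes_per_minute = bits_per_second // 8 * 60

print(levels)             # 65536
print(bits_per_second)    # 1411200
print(bytes_per_minute)   # 10584000
```

In other words, one minute of uncompressed stereo audio at these settings occupies roughly 10.6 million bytes.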
Waveform Audio File Format (.WAV) and Audio Interchange File Format (.AIFF) are two file formats that commonly use 16-bit linear quantization with a sampling rate of 44.1 kHz to store digital audio data. Therefore, CD audio and sound file data in a .WAV or .AIFF file requires no conversion or re-
If we turn our focus to Android application development, then it is easy to understand why we use a bit depth of 16 and a sample rate of 44.1 kHz when programming.
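As an illustration of these two parameters in practice, the sketch below writes a short tone as 16-bit, 44.1 kHz linear PCM using the wave module from the Python standard library, then reads the header back. Python is used here only for brevity and is not the language of the Android tutorials. The 440 Hz tone, the single channel and the in-memory buffer are assumptions made for the example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 44100     # CD audio sample rate
SAMPLE_WIDTH_BYTES = 2     # 16 bits per sample
CHANNELS = 1               # mono, assumed to keep the example short
TONE_HZ = 440              # assumed test tone

# Build 0.1 seconds of signed 16-bit little-endian samples
frames = bytearray()
for n in range(SAMPLE_RATE_HZ // 10):
    value = int(32767 * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ))
    frames += struct.pack("<h", value)

# Write the samples into an in-memory WAV file
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_out:
    wav_out.setnchannels(CHANNELS)
    wav_out.setsampwidth(SAMPLE_WIDTH_BYTES)
    wav_out.setframerate(SAMPLE_RATE_HZ)
    wav_out.writeframes(bytes(frames))

# Read the header back to confirm the PCM parameters
buffer.seek(0)
with wave.open(buffer, "rb") as wav_in:
    params = wav_in.getparams()

print(params.framerate, params.sampwidth * 8, params.nframes)
```

Replacing the in-memory buffer with a file name would write the same data to disk.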
On one final point, if you look at the diagram in figure 1, the sine wave starts at the middle of the quantization range. Conceptually, we perceive a sine wave to be an alternating current signal with an average value of zero, particularly when the signal is converted back to analogue form to produce sound via the audio output. Therefore, we often represent the quantization level using a signed integer with values from -
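A signed representation can be sketched as follows for a bit depth of 16, where a signed integer spans -32768 to 32767. The number of samples per cycle is an assumption chosen for illustration.

```python
import math

BIT_DEPTH = 16
MIN_VALUE = -(2 ** (BIT_DEPTH - 1))      # -32768
MAX_VALUE = 2 ** (BIT_DEPTH - 1) - 1     # 32767
SAMPLES_PER_CYCLE = 100                  # assumed value for illustration

# One full cycle of a sine wave as signed 16-bit sample values
samples = [
    int(round(MAX_VALUE * math.sin(2 * math.pi * n / SAMPLES_PER_CYCLE)))
    for n in range(SAMPLES_PER_CYCLE)
]

average = sum(samples) / len(samples)

print(samples[0])     # the wave starts at zero, the middle of the range
print(max(samples))   # positive peak
print(min(samples))   # negative peak
print(average)        # close to zero over a full cycle
```

The samples swing equally above and below zero, which matches the view of the signal as having an average value of zero.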
For those interested in Android application development, there is a set of tutorials in which you can build a project for synthesizing a sine wave using PCM.