Get Those Digits

by @MikeTofet

When was the last time you heard a dial tone?  I mean, really think about it.  The vast majority of us (even those who read this magazine) simply tap a spot on our touchscreen cellular phones and wait for the connection to be made in silence.  Dial tones aren't really even a thing in cellular communication.  If you happen to hear one, it's purely a simulation for your ears.

So when I heard one just this past week, and then actually heard the sound of the digits being dialed, I had some momentary nostalgia.  Then I got excited; I was going to get those digits!

Background

Very briefly - because I am extremely unqualified to go into any real depth here - the tones you hear when you dial an old-school landline are Dual-Tone Multi-Frequency (DTMF) sounds.  This means each sound is made up of a combination of two distinct tones: a low-tone and a high-tone.  Standard U.S. telephones can use four distinct low-group tones and three distinct high-group tones for a total of 12 possible sounds the phone can generate.  These tones are specified and assigned to the keys on your phone.

There are actually four distinct high-group tones available, making for 16 possible combinations.  Standard phones do not use the fourth high-group tone (1633 Hz), so we can ignore it for the purposes of this discussion.  However, I highly recommend learning more about DTMF.  Use your favorite search engine to look for "DTMF" - or go support your local library and open an encyclopedia.
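To make the key assignments concrete, here is a small Python sketch of the standard keypad grid.  The names LOW_GROUP, HIGH_GROUP, KEYPAD, and tone_pair are just illustrative choices of mine; the frequencies are the standard DTMF values, with the fourth column (A through D) included only for completeness.

```python
# DTMF keypad layout: each key pairs one low-group (row) tone with one
# high-group (column) tone.  All frequencies are in Hz.
LOW_GROUP = (697, 770, 852, 941)
HIGH_GROUP = (1209, 1336, 1477, 1633)  # 1633 Hz is the fourth column, unused on standard phones
KEYPAD = ("123A", "456B", "789C", "*0#D")


def tone_pair(key):
    """Return the (low, high) frequency pair for a single keypad character."""
    if len(key) != 1:
        raise ValueError("expected a single key, got %r" % (key,))
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return LOW_GROUP[row], HIGH_GROUP[col]
    raise ValueError("not a DTMF key: %r" % (key,))


print(tone_pair("1"))  # (697, 1209)
print(tone_pair("0"))  # (941, 1336)
```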

Since these tones are so distinct, we can easily decode them back to the dialed numbers.  You can actually train yourself to do this simply by ear.  But you don't need to do that.  It is a relatively straightforward task to change sounds to waveforms and to represent waveforms as a list of the contributing frequencies and power levels that make up those waveforms.  The math that performs this conversion is called a Fourier transform, and an algorithm for computing it, the Fast Fourier Transform (FFT), is well known, well studied, and implemented in many languages.  Again, I recommend further research on FFT.

If you take an FFT of the sound of a phone number being pressed, you will get two distinct "peaks" of power at two separate frequencies.  For example, if you happen to record the sound of someone pressing a "1" on the phone, you will hear a sound made by combining a 697 Hz low-tone and a 1209 Hz high-tone.  If you run this sound through an FFT, you will see those two frequencies returned to you very clearly and you will know it was a "1" being pressed.
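As a rough illustration of that idea, the sketch below synthesizes a "1" tone, runs it through a small pure-Python radix-2 FFT, and snaps the strongest peak in each band to the nearest DTMF frequency.  The 8000 Hz sample rate, the 1024-sample window, the band edges, and every function name are assumptions made for this example; a real recording would be noisier and would want a library FFT.

```python
import cmath
import math

RATE = 8000   # samples per second (assumed for this sketch)
N = 1024      # window length; must be a power of two for this FFT
LOW = (697, 770, 852, 941)
HIGH = (1209, 1336, 1477)
KEYS = ("123", "456", "789", "*0#")


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def make_tone(f_low, f_high):
    """Synthesize N samples of two summed sine waves."""
    return [math.sin(2 * math.pi * f_low * i / RATE) +
            math.sin(2 * math.pi * f_high * i / RATE) for i in range(N)]


def peak_hz(spectrum, lo_hz, hi_hz):
    """Frequency of the strongest FFT bin between lo_hz and hi_hz."""
    lo_bin = int(lo_hz * N / RATE)
    hi_bin = int(hi_hz * N / RATE)
    best = max(range(lo_bin, hi_bin + 1), key=lambda k: abs(spectrum[k]))
    return best * RATE / N


def decode_tone(samples):
    """Map the two spectral peaks of one key press back to its key."""
    spectrum = fft(samples)
    low_peak = peak_hz(spectrum, 600, 1050)    # band covering the low group
    high_peak = peak_hz(spectrum, 1100, 1600)  # band covering the high group
    row = min(range(4), key=lambda r: abs(LOW[r] - low_peak))
    col = min(range(3), key=lambda c: abs(HIGH[c] - high_peak))
    return KEYS[row][col]


print(decode_tone(make_tone(697, 1209)))  # 1
```

The peaks land on the nearest FFT bin rather than exactly on 697 Hz and 1209 Hz, which is why the last step snaps each peak to the closest known DTMF frequency.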

So, all we need to do is record the sound of the number being dialed and run each number tone through an FFT to back out the constituent frequency pair and we will know the number pressed.  You need to get a good clean recording with a high signal-to-noise ratio but, overall, this is very simple today.
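Here is one way the whole-number version might look.  This is a sketch, not how any particular decoder works.  Instead of a full FFT it uses the Goertzel algorithm, which measures the power at just the seven frequencies we care about.  It chops the audio into short frames, decodes each one, and collapses runs of the same key into a single digit.  The 8000 Hz rate, the 205-sample frame, the 0.15 power ratio, and the synthesize helper (which stands in for a real recording) are all assumptions for illustration.

```python
import math

RATE = 8000    # samples per second (assumed; match this to your file)
FRAME = 205    # samples per analysis frame (assumed)
LOW = (697, 770, 852, 941)
HIGH = (1209, 1336, 1477)
KEYS = ("123", "456", "789", "*0#")


def goertzel_power(frame, freq):
    """Squared magnitude of the frame's spectrum at a single frequency."""
    coeff = 2 * math.cos(2 * math.pi * freq / RATE)
    s1 = s2 = 0.0
    for x in frame:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode_frame(frame):
    """Return the key heard in this frame, or None for silence or noise."""
    energy = sum(x * x for x in frame)
    if energy == 0:
        return None
    low = [goertzel_power(frame, f) for f in LOW]
    high = [goertzel_power(frame, f) for f in HIGH]
    # Require each tone to carry a fair share of the frame's energy.
    floor = 0.15 * energy * len(frame)
    if max(low) < floor or max(high) < floor:
        return None
    return KEYS[low.index(max(low))][high.index(max(high))]


def decode_sequence(samples):
    """Decode frame by frame, emitting a digit each time a new key starts."""
    digits = []
    previous = None
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        key = decode_frame(samples[start:start + FRAME])
        if key is not None and key != previous:
            digits.append(key)
        previous = key
    return "".join(digits)


def synthesize(digits, tone_len=800, gap_len=800):
    """Hypothetical test signal: 100 ms tones separated by 100 ms of silence."""
    samples = []
    for d in digits:
        row = next(r for r in range(4) if d in KEYS[r])
        col = KEYS[row].index(d)
        for i in range(tone_len):
            t = i / RATE
            samples.append(math.sin(2 * math.pi * LOW[row] * t) +
                           math.sin(2 * math.pi * HIGH[col] * t))
        samples.extend([0.0] * gap_len)
    return samples


print(decode_sequence(synthesize("5551212")))  # 5551212
```

Repeated digits are only told apart by the silence between them, so a recording with very short gaps would need a shorter frame.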

The First Experiment

The dial tone and number I heard were coming out of a Doorking 1835 series telephone entry control box at an apartment building.  In this scenario, you can look up a tenant's name and get a four-digit code.  You enter the four-digit code and the box will audibly get a dial tone and audibly dial the tenant's phone number.  This is the audio I recorded simply using the "Voice Memo" app on my iPhone.  I played it back and it was very clear.  Then I emailed the audio file to myself using the "Share" function built into the app.

Next, I knew there had to be an existing DTMF decoder out there already.  Turns out there is one on the Apple App Store, but you have to pay for it.  It isn't much, but I didn't want to pay for this little experiment.  A simple web search led me to this site: dialabc.com/sound/detect/

You simply upload your audio file and it will list the tones it detects!  Perfect!

Except Voice Memo emails M4A files and the site won't accept them.  Luckily, converting from M4A to WAV is pretty straightforward.  I used Adobe Audition to convert to WAV, uploaded the resulting file, and got a number back.  I put this number in Google and got a perfect response for the owner of this particular landline.  Perfect!

But I felt like I cheated a little.  I used paid software to do the conversion and someone else's server and code to do the decoding.  Using the code didn't bother me, but leaving who-knows-what logs behind on their server did.  So I set about doing everything using Linux and any open-source software I could find.

Step 1: Convert the sound file.

The best way I found to convert the sound file from M4A to WAV was to use avconv.  On Ubuntu, this is not a standard package, so install it first with:

$ sudo apt-get install libav-tools

Then you convert the sound file with:

$ avconv -i originalfile.m4a newfile.wav

I tested the converted file on the dialabc.com site, and it worked.

Step 2: DTMF Decoding.

After a little bit of web-searching, I found two Python-based libraries hosted on GitHub (I'm a Python kind of guy), one by "nickrobinson" and one by "hfeeki".

Immediately, both libraries failed to process the WAV file I had created.  It seems like the wave package in Python 2 itself was the issue.  After some trial-and-error, I found some avconv settings that would work:

$ avconv -i originalfile.m4a -ar 16000 -sample_fmt s16 newfile.wav

This down-sampled the original audio to 16000 samples per second and set the bit depth to 16 bits.  Be sure to research FFmpeg or avconv to learn more about these options.
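If you want to confirm what a conversion actually produced before handing the file to a decoder, Python's standard wave module can report the header values.  This is a minimal Python 3 sketch; describe_wav is a name I made up, and it assumes an uncompressed PCM WAV file, which is all the wave module reads.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, bits_per_sample, channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()
```

If the conversion above did what it says, calling describe_wav("newfile.wav") should give 16000 and 16 as the first two values.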

Even with this audio file, the first library by "nickrobinson" did not work.  For some reason it would only decode the last few digits of the number.

The library by "hfeeki" worked perfectly, but you had to edit the code each time to change the target file.  I made some slight changes to the code to allow a command-line argument.  I have offered the changes up as a pull request, but at the time of this writing the code has not been merged.  If you want to make the same change, simply do this to the dtmf-decoder.py file:

  1. Insert a new line 10: import sys
  2. Edit line 96 to say: wav = wave.open(sys.argv[1], 'r')
  3. Then call the file with: python2 dtmf-decoder.py filename

Now I can convert the audio file using free tools and get those digits to my heart's content.
