In this post we will manipulate wav files using Python. The goal is to mix some noise, or another sound, into a recorded wav file.

Record your voice

Let’s record a sound sample of your voice using the rec command. Press CTRL+C to stop the recording:

userk@dopamine:~$ mkdir temp && cd temp
userk@dopamine:~/temp$ rec voice.wav
Input File     : 'default' (alsa)
Channels       : 2
Sample Rate    : 48000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

In:0.00% 00:00:03.16 [00:00:00.00] Out:147k  [      |      ] Hd:0.0 Clip:0    
PRESS CTRL+C
Aborted.

To play your recorded voice, use the play command:

userk@dopamine:~/temp$ play voice.wav
voice.wav:

 File Size: 590k      Bit Rate: 1.54M
  Encoding: Signed PCM    
  Channels: 2 @ 16-bit   
Samplerate: 48000Hz      
Replaygain: off         
  Duration: 00:00:03.07  

In:100%  00:00:03.07 [00:00:00.00] Out:147k  [      |      ] Hd:0.0 Clip:0    
Done.

By default, the rec command uses 2 channels and a sampling rate of 48kHz. This information is important since we are going to manipulate this file. For simplicity’s sake we will use sox to convert the sample to a single-channel signal with a 16kHz sampling rate.

userk@dopamine:~/temp$ sox -t wav voice.wav -t wav -r 16000 -b 16 -e signed-integer -c 1 voice1.wav

Then listen to the newly created sample:

userk@dopamine:~/temp$ play voice1.wav

voice1.wav:

 File Size: 98.3k     Bit Rate: 256k
  Encoding: Signed PCM    
  Channels: 1 @ 16-bit   
Samplerate: 16000Hz    
Replaygain: off         
  Duration: 00:00:03.07  

In:100%  00:00:03.07 [00:00:00.00] Out:49.2k [      |      ]        Clip:0    
Done.
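
The same properties can also be checked from Python before touching the samples. The snippet below is a minimal sketch that uses the standard library wave module rather than the tools shown above; the function name describe_wav is made up for illustration, and it assumes an uncompressed PCM wav file such as the ones sox produces here.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, bits_per_sample, duration_s).

    Hypothetical helper: only handles uncompressed PCM wav files,
    which is what the standard library wave module can read.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8          # bytes per sample -> bits
        duration = wav.getnframes() / float(rate)
    return channels, rate, bits, duration


# Usage (assuming voice1.wav exists in the current directory):
#   channels, rate, bits, duration = describe_wav("voice1.wav")
```

For the converted sample, the channel count and sampling rate should agree with what play reports.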

Download the sound sample

Feel free to download any sound you want but keep in mind that we are looking for a .wav file.
You can download a file directly from this link and use it with this tutorial.

So, for example, let’s say you are interested in overlaying a car sound on your voice, then:

userk@dopamine:~/temp$ wget http://www.userk.co.uk/download/sound/machine.wav
userk@dopamine:~/temp$ ls 
machine.wav
voice.wav
voice1.wav

We can analyze the characteristics of the sound file using avprobe from the package libav-tools.

userk@dopamine:~/temp$ sudo apt-get install libav-tools
userk@dopamine:~/temp$ avprobe machine.wav
Input #0, wav, from 'machine.wav':
  Metadata:
    encoded_by      : ZOOM Handy Recorder H4n
    date            : 2013-06-29
    creation_time   : 01:17:38
    time_reference  : 223584000
    coding_history  : A=PCM,F=48000,W=16,M=stereo,T=ZOOM Handy Recorder H4n
  Duration: 00:01:08.10, bitrate: 1536 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 2 channels, s16, 1536 kb/s

As you can see, the wav file has a sampling rate of 48kHz and 2 channels. We need to convert it using sox to match the characteristics of the converted voice sample (voice1.wav).

userk@dopamine:~/temp$ sox -t wav machine.wav -t wav -r 16000 -b 16 -e signed-integer -c 1 machine1.wav remix 2

Please note that, in the above command, the trailing remix 2 selects the second input channel as the source of the mono signal.
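
To make the channel selection concrete, here is a small numpy sketch of the same idea on an array of samples. It is only an illustration: the function name select_channel and the (frames, channels) array layout are assumptions, and unlike the sox command it does not resample the signal.

```python
import numpy as np


def select_channel(frames, channel):
    """Pick one channel from a (n_frames, n_channels) sample array.

    `channel` is 1-based, mirroring the numbering used by sox's remix.
    Hypothetical helper for illustration only; no resampling is done.
    """
    frames = np.asarray(frames)
    if frames.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_frames, n_channels)")
    if not 1 <= channel <= frames.shape[1]:
        raise ValueError("channel out of range")
    return frames[:, channel - 1]


# Three stereo frames: left channel in column 0, right channel in column 1.
stereo = np.array([[1, 10], [2, 20], [3, 30]], dtype=np.int16)
right = select_channel(stereo, 2)
```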

Adding one file to the other using Python

Let’s get our hands dirty! For our purpose we are going to use two libraries: numpy to manipulate the sample arrays, and audiolab to perform input/output operations such as opening and saving sound files.

Dependencies
We need the scikits learn, scikits.audiolab and numpy Python packages installed. audiolab is built on top of libsndfile, so we install its development files first.

userk@dopamine:~/temp$ sudo apt-get install libsndfile1-dev
userk@dopamine:~/temp$ pip install scikits.audiolab --user

Here is the script we will be using

The first two lines import the required libraries, then lines 4 and 5 load the previously saved sound samples and extract their encoding and sampling frequency.
Lines 7 and 8 check whether the two samples have different encodings or sampling frequencies.
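
As an illustration of this kind of check, here is a minimal sketch that compares two wav files before mixing them. It uses the standard library wave module instead of audiolab, so it is not the script described above; the function name assert_same_format is made up, and the sample width in bytes stands in for the encoding.

```python
import wave


def assert_same_format(path_a, path_b):
    """Raise ValueError unless both wav files share rate and sample width.

    Hypothetical helper: returns (sample_rate_hz, bytes_per_sample)
    when the two files match.
    """
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        fmt_a = (a.getframerate(), a.getsampwidth())
        fmt_b = (b.getframerate(), b.getsampwidth())
    if fmt_a != fmt_b:
        raise ValueError(
            "format mismatch: %r has %r, %r has %r" % (path_a, fmt_a, path_b, fmt_b)
        )
    return fmt_a


# Usage (assuming both converted files exist):
#   rate, width = assert_same_format("voice1.wav", "machine1.wav")
```

Failing early here avoids adding two arrays whose samples do not line up in time.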

Line 10 picks a subarray, of the same length as the voice signal, from the wave file containing the noise. Note that we have used the numpy.split() method to cut the data2 array into three pieces and the // operator to perform a floor division. You can find more information about the division operator in Python here.
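
The idea can be sketched as follows. This is an assumption about how such a cut might be written, not the exact line from the script: the function name middle_segment and the choice of taking the middle of the noise array are illustrative.

```python
import numpy as np


def middle_segment(noise, length):
    """Return `length` consecutive samples taken from the middle of `noise`.

    np.split with two indices cuts the array into three pieces
    (before, segment, after); we keep the middle one.
    """
    noise = np.asarray(noise)
    if length > len(noise):
        raise ValueError("noise is shorter than the requested length")
    # Floor division keeps the index an integer even for odd differences.
    start = (len(noise) - length) // 2
    before, segment, after = np.split(noise, [start, start + length])
    return segment


# Ten noise samples, four voice samples: keep samples 3..6 of the noise.
segment = middle_segment(np.arange(10), 4)
```

With true division (/) the start index would be a float in Python 3, which numpy.split cannot use as a split point, hence the floor division.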

Finally, the for loop creates several noise-affected samples by adding the noise signal to the voice with different weights.
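
A minimal sketch of such a loop is shown below. It assumes both signals are float arrays of equal length with samples in the range [-1, 1]; the function name mix_with_noise, the weight values, and the clipping step are illustrative additions rather than part of the script described above.

```python
import numpy as np


def mix_with_noise(voice, noise, weights=(0.1, 0.3, 0.5)):
    """Return {weight: voice + weight * noise} for each weight.

    Assumes float samples in [-1, 1]; the result is clipped to that
    range so that loud mixes do not overflow when saved.
    """
    voice = np.asarray(voice, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if voice.shape != noise.shape:
        raise ValueError("voice and noise must have the same shape")
    mixes = {}
    for weight in weights:
        mixed = voice + weight * noise       # weighted sum of the two signals
        mixes[weight] = np.clip(mixed, -1.0, 1.0)
    return mixes


voice = np.array([0.0, 0.5, -0.5])
noise = np.array([1.0, 1.0, -1.0])
mixes = mix_with_noise(voice, noise, weights=(0.5, 2.0))
```

Each entry of the returned dictionary can then be written to its own wav file, one per noise level.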