Audio Processing

Octave provides a few functions for dealing with audio data. An audio `sample' is a single output value from an A/D converter, i.e., a small integer number (usually 8 or 16 bits), and audio data is just a series of such samples. It can be characterized by three parameters: the sampling rate (measured in samples per second or Hz, e.g., 8000 or 44100), the number of bits per sample (e.g., 8 or 16), and the number of channels (1 for mono, 2 for stereo, etc.).
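
As a rough illustration of how these three parameters determine the amount of data, the following sketch computes the data rate and total size of an uncompressed recording. The variable names and the 10-second duration are made up for the example and are not part of Octave.

```octave
# Illustrative values; none of these names are Octave built-ins.
sampling_rate   = 44100;  # samples per second (Hz)
bits_per_sample = 16;     # bits in each sample
channels        = 2;      # 1 = mono, 2 = stereo
duration        = 10;     # length of the recording in seconds

# Bytes needed for one second of uncompressed audio.
bytes_per_second = sampling_rate * (bits_per_sample / 8) * channels;

# Total size of the recording in bytes.
total_bytes = bytes_per_second * duration;

printf ("%d bytes/s, %d bytes total\n", bytes_per_second, total_bytes);
```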

There are many different formats for representing such data. Currently, only the two most popular, linear encoding and mu-law encoding, are supported by Octave. There is an excellent FAQ on audio formats by Guido van Rossum <guido@cwi.nl>, which can be found at any FAQ ftp site, in particular in the directory /pub/usenet/news.answers/audio-fmts of the archive site rtfm.mit.edu.

Octave simply treats audio data as vectors of samples (non-mono data are not supported yet). It is assumed that audio files using linear encoding have one of the extensions lin or raw, and that files holding data in mu-law encoding end in au, mu, or snd.

lin2mu (x, n) Function File
Converts audio data from linear to mu-law. Mu-law values use 8-bit unsigned integers. Linear values use n-bit signed integers, or floating point values in the range -1<=x<=1 if n is 0. If n is not specified, it defaults to 0, 8, or 16 depending on the range of values in x.

mu2lin (x, bps) Function File
Converts audio data from mu-law to linear. Mu-law values are 8-bit unsigned integers. Linear values use bps-bit signed integers, or floating point values in the range -1<=y<=1 if bps is 0. If bps is not specified, it defaults to 8.
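
To give a feel for what mu-law encoding does, here is a minimal sketch of the textbook continuous mu-law companding curve with mu = 255, together with its inverse, applied to floating point samples in the range -1<=x<=1. It is an illustration only: lin2mu and mu2lin work with 8-bit codes, and their exact quantization may differ from this formula. The names mu, compress, and expand are chosen for the example.

```octave
mu = 255;  # companding constant commonly used for 8-bit mu-law

# Compress: small amplitudes get proportionally more of the output range.
compress = @(x) sign (x) .* log (1 + mu * abs (x)) / log (1 + mu);

# Expand: the mathematical inverse of compress.
expand = @(y) sign (y) .* ((1 + mu) .^ abs (y) - 1) / mu;

x = [-1, -0.5, -0.01, 0, 0.01, 0.5, 1];  # linear samples in [-1, 1]
y = compress (x);                        # companded values, also in [-1, 1]
x_back = expand (y);                     # recovers x up to rounding error

disp (max (abs (x - x_back)));           # largest round-trip error
```

Note how a quiet sample such as 0.01 is mapped to a much larger companded value; this is what lets 8 bits cover a wide dynamic range.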

loadaudio (name, ext, bps) Function File
Loads audio data from the file name.ext into the vector x.

The extension ext determines how the data in the audio file is interpreted; the extensions lin (default) and raw correspond to linear encoding, and the extensions au, mu, or snd to mu-law encoding.

The argument bps can be either 8 (default) or 16 and specifies the number of bits per sample used in the audio file.

saveaudio (name, x, ext, bps) Function File
Saves a vector x of audio data to the file name.ext. The optional parameters ext and bps determine the encoding and the number of bits per sample used in the audio file (see loadaudio); the defaults are lin and 8, respectively.
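
The sketch below illustrates the idea of a file that holds nothing but samples written one after another, using Octave's general file I/O functions rather than saveaudio and loadaudio themselves. The byte layout those two functions use (signedness, byte order, presence of a header) is not stated here, so the headerless 16-bit signed layout in native byte order is purely an assumption of this example, as are the variable names.

```octave
# Assumption: 16-bit signed samples, native byte order, no header.
fname = [tempname(), ".raw"];                 # temporary file for the demo
x = int16 ([0; 1000; -1000; 32767; -32768]);  # a few sample values

fid = fopen (fname, "wb");
fwrite (fid, x, "int16");        # write the samples back to back
fclose (fid);

fid = fopen (fname, "rb");
y = fread (fid, Inf, "int16");   # read every sample into a column vector
fclose (fid);
delete (fname);                  # remove the temporary file

disp (isequal (double (x), y));  # check that the round trip kept every sample
```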

The following functions for audio I/O require special A/D hardware and operating system support. It is assumed that audio data in linear encoding can be played and recorded by reading from and writing to /dev/dsp, and that similarly /dev/audio is used for mu-law encoding. These file names are system-dependent. Improvements so that these functions will work without modification on a wide variety of hardware are welcome.

playaudio (name, ext) Function File
playaudio (x) Function File
Plays the audio file name.ext or the audio data stored in the vector x.

record (sec, sampling_rate) Function File
Records sec seconds of audio input into the vector x. The default value for sampling_rate is 8000 samples per second, or 8 kHz. The program waits until the user types <RET> and then immediately starts to record.
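
Assuming record returns one sample per sampling interval, the length of x follows directly from its two arguments. The sketch below does not call record (which needs audio hardware); it only works out the expected number of samples and a matching time axis, using made-up variable names.

```octave
sec = 2;                # recording length in seconds
sampling_rate = 8000;   # documented default, in samples per second

n = sec * sampling_rate;            # expected number of samples in x
t = (0 : n - 1)' / sampling_rate;   # time of each sample in seconds

printf ("%d samples, last sample at %.6f s\n", n, t(end));
```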

setaudio ([w_type [, value]]) Function File
Executes the shell command mixer [w_type [, value]].