...making Linux just a little more fun!

Linux Sound System

By Mohan Lal Jangir

In two decades, Linux has grown from a nascent project to maturity. In the field of multimedia, for example, Linux today supports almost all audio/video formats and sound cards - something that was lacking in the early days. Sound support in Linux has had an interesting journey, passing through several distinct phases. This article aims to summarize the history of the Linux sound system, compare the different architectures, and conclude with two real (albeit small) audio applications.

In 1992, Hannu Savolainen wrote the first driver for the Sound Blaster card, which was the only sound card available at that time. He called it the Linux Sound Driver. As more sound cards appeared, he went on to develop the Open Sound System (OSS) - one neat little package providing an API for audio applications.

The OSS API was based on standard UNIX devices and system calls (i.e. POSIX open, read, write, ioctl), and therefore developing audio applications was much like developing any other Linux application. Moreover, audio applications were portable across all UNIX-variant operating systems (UNIX, Linux, BSD, etc.).
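
For instance, here is a minimal sketch (not from the original article) of the OSS idiom: the sound device is just a file, configured with ioctl and fed with write. Error checking is omitted for brevity, and it assumes /dev/dsp exists (provided by OSS itself or by ALSA's OSS emulation, discussed below).

#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/soundcard.h>

int main()
{
    int fd = open("/dev/dsp", O_WRONLY);     /* the OSS PCM device */
    int rate = 8000;
    ioctl(fd, SOUND_PCM_WRITE_RATE, &rate);  /* configure via ioctl */

    unsigned char silence[8000];
    memset(silence, 0x80, sizeof(silence));  /* mid-scale = silence in unsigned 8-bit PCM */
    write(fd, silence, sizeof(silence));     /* one second of playback via plain write() */

    close(fd);
    return 0;
}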

The OSS source code was released under the GPL, and Linux included OSS in the mainline kernel. While OSS development was still in progress, Hannu Savolainen was contracted by 4Front Technologies, which decided to sell a commercial version of OSS. Subsequently, Hannu stopped working on the GPLed version of OSS and continued to develop the proprietary OSS for 4Front Technologies. The result was that the kernel sound drivers were frozen at OSS v3.8.

Later, Red Hat Software sponsored Alan Cox, a noted Linux developer, to enhance and modularize the kernel sound drivers. Alan Cox and others made many bug fixes and added drivers for new sound cards. The modified drivers were released for the first time with Red Hat 5.0. Being under the GPL, those modifications were later included in the mainline Linux kernel. However, progress hit a roadblock after Red Hat stopped sponsoring Alan Cox, leaving GPL OSS without a dedicated maintainer.

The pro-GPL community did not like OSS. In 1998, Jaroslav Kysela wrote a driver for the Gravis Ultrasound sound card, which he later developed into a new sound architecture called ALSA (Advanced Linux Sound Architecture). ALSA development went on independently until 2002, when ALSA was included in the 2.5 kernel along with OSS.

The ALSA architecture distinctly moved away from the POSIX API and introduced a much larger and more complex API set. While the pro-GPL community endorsed ALSA, it did not find much support among audio application developers, who would have had to rewrite their applications using the complex ALSA API. The other factor was the risk of losing application portability, since ALSA is available only on Linux.

To overcome this problem, ALSA introduced an OSS emulation layer. This made it possible to run OSS audio applications on the ALSA architecture without modifying the application.
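
As a practical note (assuming a system with the standard ALSA modules and the alsa-oss package - details not covered in the original article), the kernel-level emulation is provided by the snd-pcm-oss module, which recreates the legacy /dev/dsp device on top of ALSA; alternatively, the aoss wrapper from alsa-oss intercepts OSS calls in user space. Here, oss_app stands for any unmodified OSS binary:

root@localhost:/root# modprobe snd-pcm-oss
root@localhost:/root# ls /dev/dsp
/dev/dsp
root@localhost:/root# aoss ./oss_app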

However, as anticipated, a heated discussion about the merits and demerits of OSS and ALSA started. ALSA was designed with some additional features that were not present in OSS at the time; however, OSS 4.0 (the proprietary one) claimed to have closed that gap. While ALSA was criticized for its complex API set, OSS had the advantage of POSIX compliance. On the other hand, OSS was criticized for its non-GPL status, something that ALSA had in its favor.

Finally, ALSA got a big thumbs up from Linux: in the 2.6 kernel, ALSA replaced OSS as the default sound architecture, and OSS was marked as deprecated.

In 2007, in a surprising move that raised many eyebrows, 4Front Technologies released OSS v4.0 under the GPL. While some experts termed it "too little and too late", others predicted a possible OSS re-entry into the kernel.

Before we conclude the article, let's take a look at a small audio application written using both APIs, for a real comparison.

This sample application plays an uncompressed PCM, 2-channel (stereo) file. Following are the properties of the audio file, as shown by the file command and mplayer.

root@localhost:/root# file sample.wav
sample.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz

root@localhost:/root# mplayer sample.wav
==========================================================================
Forced audio codec: mad
Opening audio decoder: [pcm] Uncompressed PCM audio decoder
AUDIO: 44100 Hz, 2 ch, s16le, 1411.2 kbit/100.00% (ratio: 176400->176400)
Selected audio codec: [pcm] afm: pcm (Uncompressed PCM)
==========================================================================

You can use any .wav audio file with similar properties. We used a .wav-format audio file because these are not encoded (for the same reason, you should not use mp3 or other encoded audio files with this application).
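
For background, here is an illustrative sketch (not from the original article) of why this works: a canonical PCM .wav file is just a 44-byte RIFF/WAVE header followed directly by raw samples, so the programs below can stream the file's bytes straight to the sound device. (The header bytes get played too, but at worst as a brief click; real files may also carry extra chunks.)

#include <stdint.h>

/* Canonical 44-byte header of a plain PCM .wav file */
struct wav_header {
    char     riff[4];      /* "RIFF" */
    uint32_t chunk_size;   /* total file size minus 8 */
    char     wave[4];      /* "WAVE" */
    char     fmt[4];       /* "fmt " */
    uint32_t fmt_size;     /* 16 for plain PCM */
    uint16_t format;       /* 1 = uncompressed PCM */
    uint16_t channels;     /* 2 = stereo */
    uint32_t sample_rate;  /* e.g. 44100 */
    uint32_t byte_rate;    /* sample_rate * channels * bits/8 */
    uint16_t block_align;  /* channels * bits/8 */
    uint16_t bits;         /* bits per sample, 16 here */
    char     data[4];      /* "data" */
    uint32_t data_size;    /* number of PCM sample bytes that follow */
};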

The following listings show the OSS version and the ALSA version of the application.

OSS version:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/soundcard.h>

int main()
{
    int afd, ret, fd, val;
    unsigned char buf[2048];

    /* open audio device file */
    fd = open("/dev/dsp", O_WRONLY);

    /* set sample size (8 or 16 bits) */
    val = 16;
    ioctl(fd, SOUND_PCM_WRITE_BITS, &val);

    /* set the number of channels */
    val = 2;
    ioctl(fd, SOUND_PCM_WRITE_CHANNELS, &val);

    /* set the PCM sampling rate for the device */
    val = 44100;
    ioctl(fd, SOUND_PCM_WRITE_RATE, &val);

    /* open audio file */
    afd = open("sample.wav", O_RDONLY);

    /* play audio file: copy its bytes to the device */
    while((ret = read(afd, buf, sizeof(buf))) > 0)
         write(fd, buf, ret);

    close(afd);
    close(fd);
    return 0;
}

ALSA version:

#include <fcntl.h>
#include <unistd.h>
#include <alsa/asoundlib.h>

int main()
{
    int fd, ret;
    snd_pcm_t *handle;
    snd_pcm_sframes_t frames;
    static char *device = "default"; /* playback device */
    unsigned char buf[2*1024];

    /* open playback device */
    snd_pcm_open(&handle, device, SND_PCM_STREAM_PLAYBACK, 0);

    /* configure playback device as per input audio file */
    snd_pcm_set_params(handle,
         SND_PCM_FORMAT_S16_LE,         /* sample format */
         SND_PCM_ACCESS_RW_INTERLEAVED, /* interleaved channel data */
         2,                             /* channels */
         44100,                         /* sample rate */
         1,                             /* allow software resampling */
         500000);                       /* latency: 0.5 sec */

    /* open audio file */
    fd = open("sample.wav", O_RDONLY);

    /* play audio file: convert byte counts to frames and write them */
    while((ret = read(fd, buf, sizeof(buf))) > 0)
    {
         frames = snd_pcm_bytes_to_frames(handle, ret);
         snd_pcm_writei(handle, buf, frames);
    }

    /* wait until all queued samples have been played */
    snd_pcm_drain(handle);

    close(fd);
    snd_pcm_close(handle);
    return 0;
}

The first difference you will observe is in the APIs: as mentioned before, the OSS application uses POSIX system calls, while the ALSA application uses ALSA's own API. Also, when you compile the applications, note that the OSS application compiles directly, but the ALSA one must be linked against libasound (which means you must have the ALSA library installed).
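
For example, assuming the listings are saved as oss_play.c and alsa_play.c (file names chosen here for illustration), the build commands would be:

root@localhost:/root# gcc -o oss_play oss_play.c
root@localhost:/root# gcc -o alsa_play alsa_play.c -lasound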

The debate over the API differences and their relative capabilities is endless. As of the 2.6.36 kernel, Linux continues to use ALSA. However, many audio application developers are keenly waiting to see OSS alive again!





[BIO]

Mohan Lal Jangir is working as a Development Lead at Samsung India Software Operations, Bangalore, India. He has a Masters in Computer Technology from IIT Delhi, and is keenly interested in Linux, networking, and network security.


Copyright © 2010, Mohan Lal Jangir. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 181 of Linux Gazette, December 2010
