WebAudio Deep Note, part 2: play a sound

September 28th, 2019. Tagged: JavaScript, WebAudio

(Part 1 (intro) is here.)

Now that we know what to do, let's go for it! First order of business: load an audio file and play it.

UI

Let's build a simple HTML page (demo) to test things:

<button onclick="play()">▶ play</button>
<button onclick="stop()">STOP!!!!</button>

Now let's implement this play() function.

Fetch...

Loading means fetching from the server, and what could be better for that than a newish addition to the Web Platform, the appropriately named fetch() function? Let's go with the promise syntax (alternatively, you can use the await operator too).

Roland-something-or-other.wav is the C3 cello sample that will be the basis of all sounds in the Deep Note.

function play() {
  fetch('Roland-SC-88-Cello-C3-glued-01.wav')
    .then(/* MAGIC HERE */)
    .catch(e => console.log('uff, error!', e));
}

What happens after the file is loaded from the server into our test page? Its contents need to be represented as an ArrayBuffer, which is then decoded and turned into an AudioBuffer. Sounds scary, but once you do it, you can put all this into a utility function and forget about it.

function play() {
  fetch('Roland-SC-88-Cello-C3-glued-01.wav')
    .then(response => response.arrayBuffer())
    .then(arrayBuffer => audioContext.decodeAudioData(arrayBuffer))
    .then(audioBuffer => {
      // and now we play!
    })
    .catch(e => console.error('uff', e));
}
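
The "put it in a utility function and forget about it" idea might look like the sketch below. The name loadAudioBuffer and the optional fetchFn parameter (there only so the helper can be tried without a network) are my assumptions, not part of the demo. The response.ok check is an extra guard that the code above doesn't have.

```javascript
// Hypothetical helper: fetch a URL and decode it into an AudioBuffer.
// `fetchFn` defaults to the global fetch(); pass your own for testing.
function loadAudioBuffer(audioContext, url, fetchFn = fetch) {
  return fetchFn(url)
    .then(response => {
      // fetch() resolves even on 404/500, so check the status ourselves
      if (!response.ok) {
        throw new Error('HTTP ' + response.status + ' while loading ' + url);
      }
      return response.arrayBuffer();
    })
    .then(arrayBuffer => audioContext.decodeAudioData(arrayBuffer));
}
```

Usage would be something like loadAudioBuffer(audioContext, 'Roland-SC-88-Cello-C3-glued-01.wav').then(audioBuffer => { /* play it */ }).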

... and release

All things web audio start with the AudioContext() constructor. You create one per page load and use it all over the place:

const audioContext = new AudioContext();

This audio context has a destination, which is your speakers or headphones. And what does each destination need? A source!

One way to begin making noise is to start with an oscillator. You know, something generated in code that goes beeeeeeeep... Very pure and unlistenable, because nothing in nature is this pure. (We need overtones to perceive timbre, but that's a discussion for another time). You can create an oscillator source in WebAudio with audioContext.createOscillator(), but we're not going to.

Another way to make noise is to start not with an oscillator source, but with a buffer source. As luck would have it, we have a pre-recorded sound (our cello sample) that we've already turned into an audio buffer. Let's create our buffer source then:

const sample = audioContext.createBufferSource();
sample.buffer = audioBuffer;

Next comes connecting the buffer source to the audio context destination.

sample.connect(audioContext.destination);

We can create sources and not plug them in and they will not make a sound. Similarly, we can disconnect (unplug) things to prevent them from playing. A main concept in web audio is the audio graph made of nodes (e.g. sources and processors) that you plug into each other any way you see fit. We'll talk about it soon enough.

OK, one last thing to do, once everything is plugged in, is to start the source, meaning hit the ▶ button on the old cassette player/CD player/iPod.

sample.start();

And this is it, you should hear the cello sample now. Try it here: demo.

Stop it!

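Whenever you're ready to stop playing, you call sample.stop(). BTW, you don't have to play from the beginning. The first argument of start() is when to start (on the audio context's clock, 0 meaning right away) and the second is an offset into the buffer, so sample.start(0, 2) starts immediately, 2 seconds into the sample. Note that sample.start(2) does something else: it schedules playback for when the context's clock reaches 2 seconds.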

One important thing: once you've started a sample, you cannot start it again. You can loop it (we'll see how in the next installment), you can stop it, but you cannot reuse it. If you want to play the same sound again, you need to create another buffer source with createBufferSource(). You don't need to fetch the actual file or decode it again though.
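
To make the "one buffer source per playback" rule concrete, here's a minimal sketch. playFromBuffer is a made-up name; it assumes you kept the decoded audioBuffer around, and it creates a fresh buffer source on every call.

```javascript
// Hypothetical helper: each call creates a brand new buffer source,
// because a source that has been started cannot be started again.
function playFromBuffer(audioContext, audioBuffer, offsetSeconds = 0) {
  const source = audioContext.createBufferSource();
  source.buffer = audioBuffer; // the decoded data is reused, not re-fetched
  source.connect(audioContext.destination);
  // start(when, offset): when = 0 means right now,
  // offset = how many seconds into the buffer to begin
  source.start(0, offsetSeconds);
  return source; // keep the reference so you can call .stop() later
}
```

Calling playFromBuffer(audioContext, audioBuffer) twice gives you two independent sources that share the same decoded data.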

Complete code

So here's everything together:

const audioContext = new AudioContext();
let sample;

function play() {
  fetch('Roland-SC-88-Cello-C3-glued-01.wav')
    .then(response => response.arrayBuffer())
    .then(arrayBuffer => audioContext.decodeAudioData(arrayBuffer))
    .then(audioBuffer => {
      sample = audioContext.createBufferSource();
      sample.buffer = audioBuffer;
      sample.connect(audioContext.destination);
      sample.start();
    })
    .catch(e => console.error('uff', e));
}

function stop() {
  sample.stop();
}

Safari

If you tried the demo in iOS or desktop Safari, chances are you didn't hear anything. There are 3 things to take care of to make this happen, ranging from trivial, to PITA, to a hack.

Trivial: browser prefix

AudioContext is still behind a prefix in Safari, which is actually understandable given that the spec is still a "Working Draft". Easy to fix. At the very top of the code we just go:

if (!window.AudioContext && window.webkitAudioContext) {
  window.AudioContext = window.webkitAudioContext;
}

... and then proceed as usual.
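
If you'd rather not overwrite window.AudioContext, a variation is to look up the constructor and keep it in a variable. This is just a sketch: getAudioContextClass is a name I made up, and it takes the global object as a parameter so it's easy to try outside of a browser.

```javascript
// Hypothetical helper: return the standard constructor if present,
// the webkit-prefixed one otherwise, or null if there's no Web Audio at all.
function getAudioContextClass(globalObject) {
  return globalObject.AudioContext || globalObject.webkitAudioContext || null;
}
```

In the page you'd do const AudioContextClass = getAudioContextClass(window); and, if that's not null, new AudioContextClass().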

A bit of a pain: callback API

One of the methods we used - decodeAudioData() - doesn't return a promise in Safari. It uses an older callback API instead, so you're supposed to call it like decodeAudioData(arrayBuffer, callbackFunction). This is unfortunate because it messes up the nice then().then() chain. But I think I have a solution that is not half bad, imho. It may look a little confusing, but the point was to make it polyfill-style so it doesn't break the chain.

The first thing is to branch based on Safari/not-Safari. To do this, we check the signature of the decodeAudioData method. If it takes two arguments, it's the old callback API. If not, we proceed as usual.

.then(arrayBuffer => {
  if (audioContext.decodeAudioData.length === 2) { // Safari
    // hack, hack!
  } else { // not Safari
    return audioContext.decodeAudioData(arrayBuffer);  
  }
})

And what to do about the old method that doesn't return a promise? Well, create the promise ourselves and return it:

return new Promise(resolve => {
  audioContext.decodeAudioData(arrayBuffer, buffer => { 
    resolve(buffer);
  });
});

The whole fetch-and-play is now:

fetch('Roland-SC-88-Cello-C3-glued-01.wav')
  .then(response => response.arrayBuffer())
  .then(arrayBuffer => {
    if (audioContext.decodeAudioData.length === 2) { // Safari
      return new Promise(resolve => {
        audioContext.decodeAudioData(arrayBuffer, buffer => { 
          resolve(buffer);
        });
      });
    } else {
      return audioContext.decodeAudioData(arrayBuffer);  
    }
  })
  .then(audioBuffer => {
    sample = audioContext.createBufferSource();
    sample.buffer = audioBuffer;
    sample.connect(audioContext.destination);
    sample.start();
  })
  .catch(e => console.error('uff', e));
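
An alternative to checking decodeAudioData.length (a swapped-in approach, not what the demo does) is to always pass callbacks and wrap the call in a promise. The Web Audio spec keeps the success and error callbacks as optional arguments of the promise-returning method, so one code path should cover both cases. decodeAudio is a hypothetical name, and unlike the version above this sketch also rejects when decoding fails.

```javascript
// Hypothetical helper: promise wrapper that works with both the old
// callback-only decodeAudioData() and the newer promise-returning one.
function decodeAudio(audioContext, arrayBuffer) {
  return new Promise((resolve, reject) => {
    const result = audioContext.decodeAudioData(arrayBuffer, resolve, reject);
    // Newer implementations also return a promise; listen to it as well.
    // Settling the outer promise twice is harmless, only the first call counts.
    if (result && typeof result.then === 'function') {
      result.then(resolve, reject);
    }
  });
}
```

In the chain this would replace the whole branching step: .then(arrayBuffer => decodeAudio(audioContext, arrayBuffer)).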

Safari problem #3: the Hack

Safari wisely decides that auto-playing sounds is the root of all evil. A user interaction is needed. In our case we're playing nicely and require a click on the Play button. However because the actual playing happens in a callback/promise after the file has been fetched, Safari forgets the user interaction ever happened and refuses to play. One solution, a good one at that, is to prefetch the file you'll need to play. However sometimes there may be too many options of things to play and prefetching them all is prohibitive.
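
The prefetch approach could be sketched like this: start loading and decoding as soon as the page loads, and keep the click handler synchronous so the call to start() happens inside the user interaction. createPrefetchedPlayer and its fetchFn parameter are hypothetical names, and I haven't verified this against every Safari version, so treat it as an illustration of the idea only.

```javascript
// Hypothetical helper: load + decode up front, play synchronously on click.
function createPrefetchedPlayer(audioContext, url, fetchFn = fetch) {
  let audioBuffer = null;
  const ready = fetchFn(url)
    .then(response => response.arrayBuffer())
    .then(arrayBuffer => new Promise((resolve, reject) => {
      // callback form, so it also works where no promise is returned
      audioContext.decodeAudioData(arrayBuffer, resolve, reject);
    }))
    .then(decoded => { audioBuffer = decoded; });

  return {
    ready, // resolves once the sample has been decoded
    play() {
      if (!audioBuffer) return null; // not loaded yet, nothing to play
      const source = audioContext.createBufferSource();
      source.buffer = audioBuffer;
      source.connect(audioContext.destination);
      source.start();
      return source;
    }
  };
}
```

You'd create the player once at page load and call player.play() directly from the button's click handler.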

A hack is in order.

The hack is to play something on user interaction and this way unlock the playing capabilities. Later, when what we actually meant to play is downloaded, we can play it.

What is the least obtrusive something to play? Well, just one sample of nothing! Huh?

OK, so by now you know of two ways to make noise - create an oscillator or a buffer from a source file. There's another one - create the buffer yourself, in code, not from a file. Like so:

const buffer = audioContext.createBuffer(1, 1, audioContext.sampleRate);

(Note createBuffer() as opposed to createBufferSource().)

What's going on here with the three arguments?

  1. First is the number of channels. 1 for mono. No need for stereo here, we're trying to be minimal.
  2. The third one is the sample rate. In this case we're going with whatever sample rate is the default in this system/computer/sound card. Back to the basics: sound is periodic change in air pressure. When you think periodic in its simplest, you imagine a sine wave. To represent sound on the computer we need to sample that wave every once in a while. How often? How many samples per second? That's the sample rate. For CD quality it's 44.1kHz (44100 times per second!). It's the default on many systems. Here we could define a lower rate to be economical, and technically browsers should support rates between 8000 and 96000. Well, with Safari I only had success going as low as half the CD quality, so we could make this line audioContext.createBuffer(1, 1, 22050). But why bother? Keep it simple, use the default. Besides, the browser will resample 22050 to its working rate of, probably, 44.1kHz anyway. So let's not overthink this one.
  3. The second argument is the length of the buffer, in samples. If you want one second at 44100 samples per second, the argument should be 44100. But we don't need a whole second. We just want to trick Safari into playing something, remember? So a single sample is enough. Which means our playing time will be 1/44100 or 0.00002267573696 seconds. No one can hear this.
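
The arithmetic from the list above, as a tiny sketch. bufferDuration is a made-up helper; a real AudioBuffer exposes the same number as its duration property.

```javascript
// Duration in seconds = number of samples / samples per second
function bufferDuration(lengthInSamples, sampleRate) {
  return lengthInSamples / sampleRate;
}

bufferDuration(44100, 44100); // one full second
bufferDuration(1, 44100);     // a single sample: roughly 0.0000227 seconds
```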

Next we continue as before. Create a buffer source, connect() to the destination and start() it.

const buffer = audioContext.createBuffer(1, 1, audioContext.sampleRate);
const sample = audioContext.createBufferSource();
sample.buffer = buffer;
sample.connect(audioContext.destination);
sample.start();

It's essentially the same as playing a file, except that instead of loading and decoding to get a buffer, we created the buffer manually. Neat. You can actually see for yourself the contents of the buffer when using the cello sample by doing console.log(audioBuffer.getChannelData(0)); once you have the audio buffer decoded. You'll see a whole lot of values between -1 and 1 (sine wave, remember?)

A buffer full o' samples
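
Since console.log() on thousands of samples is a lot to look at, a small sketch like the one below can summarize the data instead. sampleRange is a hypothetical helper; it takes the Float32Array you get back from getChannelData(0).

```javascript
// Hypothetical helper: find the smallest and largest sample in a channel
function sampleRange(channelData) {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < channelData.length; i++) {
    if (channelData[i] < min) min = channelData[i];
    if (channelData[i] > max) max = channelData[i];
  }
  return { min, max };
}
```

For decoded audio you'd expect sampleRange(audioBuffer.getChannelData(0)) to report values within -1 and 1. For the silent one-sample buffer created with createBuffer(), both come out as 0.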

And that concludes the hack. We don't actually need to put anything in the buffer. To put it all together, and make sure we do the hack only once, here goes:

let faked = true;
if (!window.AudioContext && window.webkitAudioContext) {
  window.AudioContext = window.webkitAudioContext;
  faked = false;
}
const audioContext = new AudioContext();
let sample;

function play() {
  if (!faked) {
    faked = true;
    const buffer = audioContext.createBuffer(1, 1, audioContext.sampleRate);
    sample = audioContext.createBufferSource();
    sample.buffer = buffer;
    sample.connect(audioContext.destination);
    sample.start();
  }
  
  fetch('Roland-SC-88-Cello-C3-glued-01.wav')
    .then(response => response.arrayBuffer())
    .then(arrayBuffer => {
      if (audioContext.decodeAudioData.length === 2) { // Safari
        return new Promise(resolve => {
          audioContext.decodeAudioData(arrayBuffer, buffer => {
            resolve(buffer);
          });
        });
      } else {
        return audioContext.decodeAudioData(arrayBuffer);
      }
    })
    .then(audioBuffer => {
      console.log(audioBuffer.getChannelData(0));
      sample = audioContext.createBufferSource();
      sample.buffer = audioBuffer;
      sample.connect(audioContext.destination);
      sample.start();
    })
    .catch(e => console.error('uff', e));
}

function stop() {
  sample.stop();
}

The demo that works in Safari is right here.

End of part 2

In the next part let's loop this sound so it keeps on playin'!
