This is the last in a series of four tutorials on visualizing audio using the Web Audio API. Take a look at the first one, Visualizing Audio #1 Time Domain, for more background and a detailed walkthrough of the basic code used here.
The first two tutorials visualized Time Domain and Frequency Domain data in 'real time': rapidly changing displays that convey the 'energy' of the audio. These are impressive and can enhance any web page that outputs audio, but they are frenetic and do not really tell you much about the 'structure' of the audio. Indeed, they can be quite tiring to look at for any length of time.
If you want to show how the audio stream varies over time, then a better solution is to condense each batch of audio data into some summary measure and display that in a 1 pixel wide column. Each batch of audio samples produces another column in the display and so the display builds up, left to right, as the audio is played.
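As a sketch of that idea (this is a generic illustration, not the exact code from Tutorial 32), one common summary measure is the root-mean-square of a batch of time-domain samples, which condenses the whole batch into a single value between 0 and 1 — one value per 1-pixel column:

```javascript
// Condense one batch of time-domain samples into a single summary value.
// The samples come from the Web Audio API as a Float32Array in the
// range -1..1; the RMS result is a 0..1 value suitable for one column.
function summarize(samples) {
  var sumSquares = 0;
  for (var i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}
```

Silence gives 0, a full-scale square wave gives 1, and everything else falls in between, so the column heights track the loudness of the audio over time.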
Tutorial 32 (Visualizing Audio #3 Time Domain Summary) showed how to display Time Domain data in this manner.
Frequency Domain data can be treated in a similar way, although the graphics code is more complex.
Each column represents the frequency spectrum of one batch of samples, and each pixel within the column represents an individual frequency. The amplitude of each frequency is mapped from its numeric value to a specific color, using a color map.
This type of display is called a Spectrogram and the resulting display looks like this:
As of December 2013, the Web Audio features used here have only been implemented in Mozilla Firefox and Google Chrome browsers.
Here is a quick recap of the network of nodes that I showed in Tutorial 30 which forms the basis of the code for this tutorial.
Each component in the Web Audio API is called a Node, and we connect these together to create a Network that implements a complex function. You can think of the Web Audio AudioContext as a container for the Nodes that make up our network.
The SourceNode holds the audio clip that we will play and analyse. The code loads this from an encoded file that it fetches via an XMLHttpRequest (Ajax call). The DestinationNode is a Web Audio built-in node that connects the AudioContext to the audio output subsystem on your computer or device.
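A minimal sketch of that wiring might look like this. The function name and structure are my own, not the tutorial's exact code; in the real code the decoded buffer comes from the AudioContext's decodeAudioData call on the XMLHttpRequest response, and the fftSize of 1024 is my assumption to match the batch size used later:

```javascript
// Wire up the node network: SourceNode -> AnalyserNode -> DestinationNode.
// audioContext is a Web Audio AudioContext; decodedBuffer is the AudioBuffer
// produced by audioContext.decodeAudioData() from the Ajax-fetched file.
function buildNetwork(audioContext, decodedBuffer) {
  var analyserNode = audioContext.createAnalyser();
  analyserNode.fftSize = 1024;                    // size of each sample batch

  var sourceNode = audioContext.createBufferSource();
  sourceNode.buffer = decodedBuffer;              // the clip to play

  sourceNode.connect(analyserNode);               // source feeds the analyser
  analyserNode.connect(audioContext.destination); // analyser feeds the output

  return { source: sourceNode, analyser: analyserNode };
}
```

After building the network you would call `source.start(0)` to begin playback, and poll the analyser on each animation frame for a fresh batch of data.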
Understanding the Code
The starting point for this series of tutorials was the excellent tutorial Exploring the HTML5 Web Audio: visualizing sound by Jos Dirksen; he updated his tutorial in November 2013 to reflect some of the changes in browser implementations.
Most of the code for this demo is the same as Tutorial 31, and the Audio nodes are identical, but the canvas drawing function is substantially different.
The Frequency Spectrum computed from each batch of 1024 samples is a typed array of the amplitudes at each frequency. We need a way to convert these into a range of colors that we can use to color the pixels in the display.
I use the chroma.js library for this. Download a local copy and include it in your code. I define the color map with this statement:
var colorScale = new chroma.scale(['black', 'red', 'yellow', 'white']).out('hex');
This creates a scale that goes from black at value 0.0, through red and yellow, to white at value 1.0.
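chroma.js does this mapping for you, but if you want to avoid the extra library, a piecewise-linear interpolation through the same four colors looks roughly like this. This is my own stand-in, not the tutorial's code, and the evenly spaced color stops are an assumption:

```javascript
// A hand-rolled stand-in for the chroma.js scale: interpolate linearly
// through black -> red -> yellow -> white, returning a hex color string.
var stops = [                 // [position, r, g, b]
  [0.0,     0,   0,   0],     // black
  [1 / 3, 255,   0,   0],     // red
  [2 / 3, 255, 255,   0],     // yellow
  [1.0,   255, 255, 255]      // white
];

function mapColor(value) {
  value = Math.min(1, Math.max(0, value));       // clamp to 0..1
  for (var i = 1; i < stops.length; i++) {
    if (value <= stops[i][0]) {
      var a = stops[i - 1], b = stops[i];
      var t = (value - a[0]) / (b[0] - a[0]);    // position between stops
      var hex = function (from, to) {            // interpolate one channel
        var v = Math.round(from + (to - from) * t);
        return ('0' + v.toString(16)).slice(-2);
      };
      return '#' + hex(a[1], b[1]) + hex(a[2], b[2]) + hex(a[3], b[3]);
    }
  }
}
```

So `mapColor(0)` gives `'#000000'` and `mapColor(1)` gives `'#ffffff'`, with reds and yellows in between, just like the chroma scale above.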
The canvas drawing function here is drawSpectrogram. Each frequency spectrum is drawn into a single 1 pixel wide column, with the lowest frequency at the bottom of the canvas, which is Y value canvasHeight-1. The for loop steps through the frequency array and defines a fillStyle based on the color-mapped value. A 1x1 pixel rectangle is drawn at the correct Y value. While this is not the most efficient way to draw graphics like this, it is easy to understand and plenty fast enough for our needs.
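The core of that loop can be sketched like this. The function name and parameters are my own, not the tutorial's exact code: `freqData` would be the typed array of 0..255 amplitudes from the analyser, `column` the x position of the current 1-pixel-wide column, and `colorScale` the chroma.js scale defined earlier:

```javascript
// Draw one frequency spectrum as a 1-pixel-wide column on the canvas.
// Lowest frequency goes at the bottom of the canvas (y = canvasHeight - 1).
function drawColumn(ctx, freqData, column, canvasHeight, colorScale) {
  for (var i = 0; i < freqData.length; i++) {
    var value = freqData[i] / 255;           // normalize amplitude to 0..1
    ctx.fillStyle = colorScale(value);       // look up the mapped color
    ctx.fillRect(column, canvasHeight - 1 - i, 1, 1);  // 1x1 pixel rectangle
  }
}
```

Each call to this function paints one column; incrementing `column` on each batch builds the spectrogram up left to right as the audio plays.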
The resulting canvas looks like this: (Black pixels indicate no amplitude at that frequency, red indicates low amplitude, and yellow indicates high.)
You can see clear patterns in the display and it is relatively easy to relate these to the audio being played. For example, in some areas you can see evenly spaced horizontal lines - these represent harmonics.
I think spectrograms are very impressive. I have used them to visualize spoken text to help with the pronunciation of words in foreign languages. As you can see, the code to create them is not that complicated once you understand the basics of the Web Audio API. I hope that you will experiment with the code in these four tutorials and build your own visualizations.
Here are some other useful guides, references, etc. on Web Audio:
- Web Audio API specification
- Web Audio API - book from O'Reilly by Boris Smus
- Exploring the HTML5 Web Audio: visualizing sound - a tutorial by Jos Dirksen that I used as the starting point for my series of four tutorials.
Code for this Tutorial
1: tutorial_33_example_1.html (Live Demo 1)
30 : Visualizing Audio #1 Time Domain (Advanced)
31 : Visualizing Audio #2 Frequency Domain (Advanced)
32 : Visualizing Audio #3 Time Domain Summary (Advanced)