Music visualization in Python, part 1
This post is the first in a series, in which I plan to document the process of creating a music visualizer in Python using NumPy, SciPy and other libraries that may be needed. The text assumes basic knowledge of Python, as well as the basics of NumPy and SciPy. Some signal processing and image processing knowledge will also be helpful, but not necessary, for understanding the text.
Music visualization
YouTube is a major channel for distributing and discovering new music. Since it is a video streaming service, uploaded music needs to be accompanied by a video. Although it’s perfectly acceptable to create the video from a still image, there is just something hypnotic about a picture that reacts to music.
Now, there is plenty of software that can be used to make beautiful visualizations. This tutorial is all about reinventing the wheel, while creating some visualizations for my own music.
In these posts, I will distinguish two types of visualizations: post-processing effects and drawing-based visualizations. An example of a post-processing effect would be a blur that varies in intensity based on the music signal, and a simple drawing-based visualization would be a bar graph that shows the amplitude spectrum of the audio. In this first post, we will set up the input and output.
Input and output
The first thing we need to set up is IO – we need audio input, image input and video output. The first two are handled by scipy.io.wavfile and scipy.ndimage.
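A minimal sketch of the input side, with placeholder file names (note that scipy.ndimage.imread has been removed from recent SciPy releases, where imageio.imread works as a drop-in replacement):

```python
from scipy.io import wavfile
from scipy import ndimage

# Read the audio file: returns the sample rate in Hz and the samples as a
# NumPy array, shaped (n_samples, 2) for a stereo 16-bit WAV.
sample_rate, audio = wavfile.read("music.wav")

# Read the background image as an array of shape (height, width, channels).
image = ndimage.imread("background.png")
```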
For the purposes of this project, we don’t need stereo audio, so let’s convert the stereo signal to mono.
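One simple way to do that is to average the two channels; a sketch, assuming the (n_samples, 2) layout that wavfile.read returns for stereo files:

```python
# Averaging the two channels gives a mono signal; the mean of an integer
# array is floating point, which also avoids int16 overflow later on.
mono = audio.mean(axis=1)
```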
For video IO, the most mature library I could find was imageio, which uses ffmpeg and libav under the hood. We can instantiate a video writer for the output file.
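Something along these lines, with the output path and frame rate as placeholder values:

```python
import imageio

fps = 30  # output frame rate, a placeholder value

# imageio hands .mp4 output to its ffmpeg plugin.
writer = imageio.get_writer("output.mp4", fps=fps)
```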
Now we can write images to the output.
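Writing a frame is a single call; in this sketch, frame stands for a uint8 array of shape (height, width, 3):

```python
# `frame` is a uint8 array of shape (height, width, 3).
writer.append_data(frame)
```

Once every frame has been written, writer.close() finalizes the video file.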
The processing loop
We will generate the frames of the video in a for loop. For that, we need the total number of frames.
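It can be calculated from the length of the audio and the frame rate; a sketch, reusing the names from the snippets above:

```python
# Length of the audio in seconds, and the matching number of video frames.
duration = len(mono) / sample_rate
n_frames = int(duration * fps)

# How many audio samples fall into a single video frame.
samples_per_frame = sample_rate // fps
```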
And here’s a sketch of what our processing loop will look like.
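It reuses the names from the snippets above, and the actual per-frame work is left as a placeholder to be filled in over the next posts:

```python
for i in range(n_frames):
    # The slice of the mono signal that corresponds to this frame.
    chunk = mono[i * samples_per_frame:(i + 1) * samples_per_frame]

    # Placeholder: extract features from `chunk` and use them to draw on,
    # or post-process, a copy of the background image (next posts).
    frame = image.copy()

    writer.append_data(frame)

# Finalize the video file.
writer.close()
```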
To be continued…
So in this post we only covered the boring stuff, but we now have a pipeline set up for creating more interesting things. In the next post, I will discuss what features to extract from the audio, and create a simple visualization. The code for this project, which may not be entirely in sync with the blog, is available on my GitHub.