Blog-to-podcast with ffmpeg

February 16th, 2009. Tagged: ffmpeg, Music, tools

ffmpeg is such an amazing tool. It looks like it's for video what ImageMagick is for images: an all-powerful, all-formats, wicked cool command-line tool.

This blog post is an introduction to some of the MP3 capabilities of ffmpeg. I'll use ffmpeg to transform a blog post into a podcast-ready mp3 file. If you continue to read this longish post, here's what you can expect to see:

  • using PHP DOM
  • ffmpeg to convert to MP3
  • ffmpeg to crop (slice) an MP3
  • glue together several MP3s
  • Mac's say command for TTS (text-to-speech)

PHP DOM to get some blog content

This blog's feed is at http://phpied.com/feed. Let's create a small PHP script to extract the title, date and content of the latest blog post. We'll feed these to Mac's say command to read them aloud.

Location of the feed:

$file = 'http://phpied.com/feed/';

Load the feed into a DOM instance:

$dom = new DOMDocument;
$dom->loadXML(file_get_contents($file));

Access the node that contains the first item, i.e. the latest blog post:

$post = $dom->getElementsByTagName('item')->item(0);

The title and a friendly-formatted date:

$title = $post->getElementsByTagName('title')->item(0)->textContent;
$date = $post->getElementsByTagName('pubDate')->item(0)->textContent;
$date = strtotime($date);
$date = date('F jS, Y', $date);

Get the content:

$ns = 'http://purl.org/rss/1.0/modules/content/';
$content = $post->getElementsByTagNameNS($ns,'encoded')->item(0)->textContent;

Strip out HTML tags and entities:

$content = strip_tags($content);
$content = html_entity_decode($content);

(Since this content will be read aloud, HTML tags and entities would make no sense. Here we could've done a better job by doing something special for lists, using alt attributes to replace images and so on...)

echo $title, "\n\n", $date, "\n\n", $content;

OK, so now let's call the script from the command line and write the output to a file:

$ php feed.php > thepost.txt

Here's the result - thepost.txt

Using Mac's say for text-to-speech

You can make your Mac talk on the command line, like:

$ say test

and it will say the word "test".

The say command can also read text from text files and write to AIFF audio files. Let's read thepost.txt into an audio file.

$ say -f thepost.txt -o thepost.aiff

Since I'll make this look like part of an ongoing series of podcasts, I'll add some music, and I also need a spoken greeting and goodbye. So:

$ say -o welcome.aiff Welcome to phpied.com podcast
$ say -o thatsallfolks.aiff That was all for today, join us next time on... phpied.com
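
Unquoted text works here, but an apostrophe or some other shell metacharacter in the sentence would trip the shell. Here's a minimal sketch of a wrapper that takes the text as a quoted argument. The name say_to_aiff is made up for this illustration, and it relies only on the -o option already used above:

```shell
# say_to_aiff OUTFILE TEXT...
# Hypothetical helper (not part of macOS): speaks TEXT into OUTFILE.
say_to_aiff() {
    out=$1
    shift
    # "$*" joins the remaining arguments into one string for say
    say -o "$out" "$*"
}
```

With this in place, something like say_to_aiff welcome.aiff "Welcome to phpied.com podcast" should produce the same file as the first command above.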

OK, so now I have three AIFF files: thepost.aiff, welcome.aiff and thatsallfolks.aiff.

Now I want to add some music before/after the podcast. I took four loops from Garage Band's library and exported them as breaking-news.mp3, opener.mp3, closer.mp3 and squeeze-toy.mp3.

Next?

Now I have a bunch of audio files. All I need to do is glue them together into one MP3. Gluing MP3s is as easy as concatenating the files, using cat for example, and then making a final pass through ffmpeg to correct dates and other metadata, so that the result looks like one single file and not like a Frankenstein 🙂

For the concatenation to work, you only need to make sure all files have the same format, sample rate, number of channels, bitrate, etc.

Let's choose 22050 Hz, mono for the result. This means always adding the options:

-ar 22050 -ac 1

to all calls to ffmpeg.
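
Since it's easy to forget these two options on one of the calls, here's a small sketch of a wrapper that always supplies them. The function name to_podcast_mp3 is an invention for this post, not something ffmpeg provides:

```shell
# to_podcast_mp3 INPUT OUTPUT [EXTRA_FFMPEG_OPTIONS...]
# Hypothetical wrapper: always passes the chosen sample rate and channel count.
to_podcast_mp3() {
    src=$1
    dst=$2
    shift 2
    # Extra options (for example -ss 0 -t 8) are placed before the output file
    ffmpeg -i "$src" -ar 22050 -ac 1 "$@" "$dst"
}
```

A call like to_podcast_mp3 welcome.aiff welcome.mp3 then expands to ffmpeg -i welcome.aiff -ar 22050 -ac 1 welcome.mp3.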

Let's get cracking.

ffmpeg to convert just about anything

The simplest use of ffmpeg is to convert from one file format to another: AVI to MPEG, WMV to FLV and whatnot. For example:

$ ffmpeg -i input.avi output.flv

ffmpeg to get file information

It's useful to know what type of file we're dealing with. You can find out simply by omitting the output file:

$ ffmpeg -i input.avi
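
ffmpeg writes this report to standard error, together with its version banner, so a plain pipe into grep filters nothing. Here's a minimal sketch that keeps only the interesting lines. The name media_summary is made up, and the patterns assume the report contains lines labelled "Duration:" and "Stream #", like the one for opener.mp3 further down; the exact layout may differ between ffmpeg versions:

```shell
# media_summary FILE
# Hypothetical helper: prints only the Duration and Stream lines of the report.
media_summary() {
    # 2>&1 merges stderr into the pipe so grep can see the report
    ffmpeg -i "$1" 2>&1 | grep -E 'Duration:|Stream #'
}
```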

Let's check out one of the Garage Band loops:

$ ffmpeg -i opener.mp3 
FFmpeg version..... (more ffmpeg information)
Input #0, mp3, from 'opener.mp3':
  Duration: 00:00:12.6, start: 0.000000, bitrate: 191 kb/s
  Stream #0.0: Audio: mp3, 44100 Hz, stereo, 192 kb/s
Must supply at least one output file

Pretty good quality, more than I need. Plus, for some reason Garage Band added silence at the end of the files I exported, so let's cut it off.

ffmpeg to crop files

I want to remove the trailing 5 seconds or so of each Garage Band loop. Here goes:

$ ffmpeg -i breaking-news.mp3 -ac 1 -ar 22050 -ss 0 -t 6 breaking-news-ok.mp3
$ ffmpeg -i opener.mp3 -ac 1 -ar 22050 -ss 0 -t 8 opener-ok.mp3
$ ffmpeg -i closer.mp3 -ac 1 -ar 22050 -ss 0 -t 33 closer-ok.mp3
$ ffmpeg -i squeeze-toy.mp3 -ac 1 -ar 22050 -ss 0 -t 2 squeeze-toy-ok.mp3

-i is the input file, -ac is the number of channels (1 for mono), -ar is the sample rate, -ss is the start position and -t is the length.
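
The four commands differ only in the file name and the length to keep, so they could be folded into one helper. This is just a sketch: trim_loops is a hypothetical name, and it assumes every input is called NAME.mp3 and every result NAME-ok.mp3, as above:

```shell
# trim_loops NAME SECONDS [NAME SECONDS ...]
# Hypothetical helper: keeps the first SECONDS seconds of each NAME.mp3.
trim_loops() {
    while [ "$#" -ge 2 ]; do
        # Same options as the commands above; stop at the first failure
        ffmpeg -i "$1.mp3" -ac 1 -ar 22050 -ss 0 -t "$2" "$1-ok.mp3" || return 1
        shift 2
    done
}
```

The whole batch then becomes trim_loops breaking-news 6 opener 8 closer 33 squeeze-toy 2.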

Now you can see how the meta information for the opener.mp3 has changed:

$ ffmpeg -i opener-ok.mp3 
Input #0, mp3, from 'opener-ok.mp3':
  Duration: 00:00:08.1, start: 0.000000, bitrate: 63 kb/s
  Stream #0.0: Audio: mp3, 22050 Hz, mono, 64 kb/s

Convert AIFF to MP3

Now let's convert the AIFF files from our TTS say command to MP3, keeping the same 22050 Hz mono format:

$ ffmpeg -i thepost.aiff -ac 1 -ar 22050 thepost.mp3
$ ffmpeg -i welcome.aiff -ac 1 -ar 22050 welcome.mp3
$ ffmpeg -i thatsallfolks.aiff -ac 1 -ar 22050 thatsallfolks.mp3

Here are the new MP3s: thepost.mp3, welcome.mp3 and thatsallfolks.mp3.
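
Before gluing, it may be worth checking that all the pieces really ended up with the same audio parameters. Here's a rough sketch that compares the "Audio:" description ffmpeg reports for each file. The name same_audio_format is made up, and the sed pattern assumes a report line of the form "Stream ...: Audio: mp3, 22050 Hz, mono, 64 kb/s", which may vary between ffmpeg versions:

```shell
# same_audio_format FILE...
# Hypothetical check: succeeds only if every file reports the same
# "Audio:" description in ffmpeg's report.
same_audio_format() {
    reference=
    for f in "$@"; do
        # The report goes to stderr; keep only the text after "Audio: "
        fmt=$(ffmpeg -i "$f" 2>&1 | sed -n 's/.*Audio: //p')
        if [ -z "$fmt" ]; then
            echo "$f: no audio stream found" >&2
            return 1
        fi
        if [ -z "$reference" ]; then
            reference=$fmt
        elif [ "$fmt" != "$reference" ]; then
            echo "$f differs: $fmt (expected $reference)" >&2
            return 1
        fi
    done
}
```

For example, same_audio_format welcome.mp3 thepost.mp3 opener-ok.mp3 would complain about the first file whose description differs from welcome.mp3.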

Glue the pieces with cat and ffmpeg

Now, the last stage: let's glue all the pieces together with cat, which simply appends each file to the end of the previous one.

$ cat breaking-news-ok.mp3 welcome.mp3 opener-ok.mp3 thepost.mp3 \
            squeeze-toy-ok.mp3 thatsallfolks.mp3 closer-ok.mp3 > pieces.together

Then make these pieces a proper MP3 file:

$ ffmpeg -i pieces.together final.mp3
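
The two steps can be wrapped into one helper that also cleans up the intermediate file. A sketch only: glue_mp3 is a hypothetical name, and, like the pieces.together file above, the scratch file has no .mp3 extension, so this relies on ffmpeg detecting the input format from the content:

```shell
# glue_mp3 OUTPUT PIECE...
# Hypothetical helper: concatenates the pieces, then lets ffmpeg rewrite
# the result as a single MP3.
glue_mp3() {
    out=$1
    shift
    # Scratch file for the raw concatenation; removed afterwards
    tmp=$(mktemp "${TMPDIR:-/tmp}/pieces.XXXXXX") || return 1
    cat "$@" > "$tmp" && ffmpeg -i "$tmp" "$out"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

The podcast above would then be glue_mp3 final.mp3 breaking-news-ok.mp3 welcome.mp3 opener-ok.mp3 thepost.mp3 squeeze-toy-ok.mp3 thatsallfolks.mp3 closer-ok.mp3.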

Well, that's all folks, here's the final result:
final.mp3
