Tuesday, January 26, 2016

Audio publishing tools: circles within circles

If you think writing a book is hard, wait until you decide to make an audio-book out of it.  Now I understand why it costs $30-60 to buy the audio version of a $15 book.

If you've been wondering where the blog posts have been lately, or why it's taken so long between the publication of my Kindle book and the release of the print version, here's the story.  Along the way I discovered that the best way to catch little niggling edits was to read the pages aloud - and it occurred to me that if I was going to do that anyway, why not create an audio version of the book at the same time?

Because it is not as easy as you might think.

Start with the assumption that you want your book to conform to Amazon's standards - can you afford to do otherwise? (I'm going to refer to the platform as ACX, the Audiobook Creation Exchange; that's how you get your book on Amazon.)


  • The upside is that ACX has extremely precise specifications for what you have to do to make your book acceptable to them.  They have nothing to do with content.  They address the file length, the maximum volume, the average volume, and the audio quality (no background hiss or extraneous noises).
  • The downside is that those tolerances aren't easily achieved.  You could pay someone to narrate your book for you: prices tend to run around $200-$400 per final narrated hour, or basically per 30-40 pages.  For my current book (Let It Simmer: Making Project, Portfolio and Program Management Stick in a Skeptical Organization), which runs almost 300 pages, that would be $1200-$2500.
  • The odds are that investment would take years to break even.  So you could do it yourself, taking it on faith that the audience will forgive your rasp, your heavy accent, or whatever because they are thrilled to be hearing these words from the author in person (actually, yes, there is a pretty high tolerance for that). If that's your option, this post is for you.
Your biggest hassles are not equipment and software.  They will be your HVAC unit, and your spouse, kids and pets thundering around the house.  It's amazing how loud some sounds are that you've never noticed before: your dainty spouse's footfalls sound like a herd of heffalumps and your HVAC unit sounds like a jet engine revving up.  Since you probably can't or don't want to set up a sound-proofed studio in your house, you're just going to have to manage those problems.

But, following the advice of my mentors, who include Alun Hill (@AlunHill), Mark Timberlake (@Mark1Timberlake) and KC Carlson (www.video4results.com), I did make some minor investments in decent equipment, and it really is minor.  You need a directional mike for your video camera, and you need a USB directional mike and pop screen for your PC / home studio. That's it.  Total hardware investment is about $100, certainly under $150 even without going to eBay or waiting for sales (let's save the specifics of the equipment for another post).

And then ...

The software.  You can get free software from various places.  You've got to be a bit of an audio nut to appreciate most of it.   I had to use Camtasia (or something similar) anyway for related work that uses video, or to integrate slides with audio recording.  It's also quite complex but since it handles everything I needed, what the heck.  I decided to do as much as I could with what I had to use anyway.  It's not free: list is $299, but you can usually find coupons for half-off or $100 off.  You won't be sorry; it's a beautiful tool and pretty easy to use. But for audio purposes, it's just the foundation.

I'm not an audio freak.  I couldn't care less about woofers and tweeters and I really don't have much interest in a big console with all those knob things. I mean, how hard can it be to narrate a book in a reasonable tone of voice, maybe clean up some verbal miscues and package the files up for production?  It's not music or anything.

And in fact, that's the problem. With music, you don't always know whether it is right or not.  With human speech, you can pick up a lot of things in a tenth of a second, and a distortion or interfering noise even that short is perfectly audible and registers.  "Attention interrupt" is the whole principle behind internet and TV marketing -- it works because it is annoying.  People don't want that in their books, which is why ACX has pretty clear specs for an audiobook.  So here's what you have to do.

Record your text.

That should be the easy part (we'll talk about tools in a minute).  The first thing you'll find out is that the text has a lot of errors your brain sort of skipped over during the editing process; when you read aloud, they all come out.  You'll end up fixing the text as you go, and starting that segment over again.

Fix errors via cut-and-splice.


This is pretty powerful; it's amazing once you get into the swing of it how you can start rambling on, or have a coughing fit, and you just have to remove those minutes of tape. (OK, it's not really tape). Most of the tools I looked into at first, including Audacity which is almost universally recommended, looked way too complicated and audio-geeky.  So I started using Camtasia to record and edit. One thing it does have is a very cool filter for background noise which makes things a lot easier later. That worked pretty well as an editing tool.

Setting the correct bitrate

For broad audience compatibility, ACX insists on MP3 files at a 192 kbps bit rate. That's not one of the Camtasia settings for MP3 exports.  So you have to use a little open-source app called fre:ac.  It's free and very easy to use. TechSmith's forum also provides instructions on how to install it.
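To get a feel for what a 192 kbps file means in practice, here is a minimal Python sketch of the arithmetic. The function name is my own, and the figure ignores tags and header overhead, so treat it as a rough estimate rather than anything ACX publishes.

```python
def mp3_size_mb(duration_seconds, bitrate_kbps=192):
    """Approximate size in megabytes of a constant-bit-rate MP3.

    Ignores ID3 tags and frame overhead, so real files run slightly larger.
    """
    bits = bitrate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000


# One finished hour of narration at 192 kbps.
print(round(mp3_size_mb(3600), 1))
```

At that rate a ten-hour book comes to somewhat under a gigabyte of MP3 files, which is worth knowing before you start uploading.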

Meeting the decibel thresholds.

Once you get your file at the right bit rate, via fre:ac, it's not over by a long shot.  Now you need a tool that Camtasia doesn't have.  Camtasia users have long been asking for a volume meter with actual decibel values on it, not just a red-yellow-green needle. You can increase the total volume from a nominal 100% to some other percent -- but where do you adjust it to in order to be ACX-compliant?  Actually, you need a lot more than just the meter: you need some analytic tools.  There was no way in Camtasia to even know whether you are in the range Amazon requires, as "green" simply isn't fine-grained enough for that.  In any case, I had all-green recordings, but I wasn't about to take some 100 hours to record maybe 10 hours of book (that accounts for the learning curve; once you get the hang of it, it's probably a 4:1 ratio) only to be rejected by ACX and have to do it over.  (And once I had a tool to look at my files, they were all well outside the ACX thresholds even though they sounded just fine.)

So, time to take another look at Audacity, which has much better instrumentation.  This highly capable tool is open-source (i.e. free).  Once you've gotten this far, surely that's all that's required? Well, no.  It's designed to work in detail with WAV output and there are limits to the MP3 output it can produce. One of those limits is that it can't generate the 192 kbps rate you need.

So now that you've been forced to use Audacity for at least part of the effort, it would be possible (and is quite easy) to just start by recording in Audacity instead of Camtasia.  But when you're done recording you'll certainly have to edit, just as above, and since you'll have to pass through Camtasia to convert the MP3 file bit rate anyway ... is your head hurting yet?  So it's up to you. For editing, I found Camtasia a lot easier to work with, but it could just be that I started there.  There is a great deal more help online from the vendor and others.  Audacity's forum and other resources are not as rich, and a lot of the documentation has that open-source look and feel which is aimed a bit more at the geek community. But you need that techno-info from the Audacity forum, because Audacity has the tools you need next.

Either way, you'll end up loading two more plug-ins into the standard Audacity set-up: "ACX Check" and "RMS Normalize" (different from the Normalize that is already on board). The Audacity forum is pretty good about telling you how to do this.  Load in your MP3 file and run your ACX plug-in (under the Analyze menu) to see how the clip compares to ACX requirements.
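If you're curious what a check like this is measuring, here is a minimal Python sketch of the two volume numbers: peak level and RMS (average) level, both in decibels relative to full scale. This is my own illustration, not the ACX Check plug-in's code; the function names are made up, the samples are assumed to be floats between -1.0 and 1.0, and the thresholds are the ACX figures quoted in this post.

```python
import math

ACX_PEAK_MAX_DB = -3.0             # loudest single sample allowed
ACX_RMS_RANGE_DB = (-23.0, -18.0)  # allowed average (RMS) level


def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def measure(samples):
    """Return (peak_db, rms_db) for samples scaled to -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)


def acx_report(samples):
    """Compare a clip's peak and RMS levels with the ACX thresholds."""
    peak_db, rms_db = measure(samples)
    low, high = ACX_RMS_RANGE_DB
    return {
        "peak_db": round(peak_db, 1),
        "rms_db": round(rms_db, 1),
        "peak_ok": peak_db <= ACX_PEAK_MAX_DB,
        "rms_ok": low <= rms_db <= high,
    }


# A one-second 440 Hz test tone at half of full scale, sampled at 44.1 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
print(acx_report(tone))
```

The test tone passes the peak check but fails the RMS check because its average level is far too hot. That is the same kind of surprise I got from recordings that sounded just fine.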

Now the following sentence is pretty short but the doing of it is tedious.  You'll have to iterate between the on-board Normalize, setting the peak level to a recommended -3.3 dB to provide a little cushion for the ACX maximum of -3.0 dB, and the RMS Normalize and Compressor plug-ins (located under the Effects menu), which attempt to modulate the overall sound levels so that the whole clip nets out within ACX's very narrow range of -18 to -23 dB.  It usually took me 3 or 4 tries to find a combination that works.
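Why all the iterating? A plain volume change moves the peak and the RMS level by exactly the same number of decibels, so the two targets fight each other. Here is a small Python sketch of that reasoning; the function is hypothetical (it is not what the Audacity plug-ins do internally) and the default limits are the ones mentioned above.

```python
def gain_window_db(peak_db, rms_db, peak_max=-3.3, rms_range=(-23.0, -18.0)):
    """Return the (lowest, highest) plain gain in dB that meets both limits.

    Returns None when no single gain can satisfy the peak and RMS limits
    together, which is the case where compression has to narrow the gap.
    """
    lowest = rms_range[0] - rms_db            # enough to lift RMS into range
    highest = min(rms_range[1] - rms_db,      # not so much that RMS overshoots
                  peak_max - peak_db)         # or that the peak passes its cap
    return (lowest, highest) if lowest <= highest else None


# A fairly even recording: peak at -8 dB, RMS at -26 dB.
print(gain_window_db(peak_db=-8.0, rms_db=-26.0))

# A spiky recording: peak at -1 dB, RMS at -28 dB.
print(gain_window_db(peak_db=-1.0, rms_db=-28.0))
```

The first clip has a usable window of roughly 3 to 4.7 dB of gain. The second has none: its peaks sit too far above its average, so the peaks have to be squeezed with the Compressor before any normalize setting can work, and then you measure again.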

The last check is for background noise, which must be held under -60 dB.  I found that with the Camtasia filters applied, this wasn't a problem, but if you need to, Audacity has another plug-in (Noise Level) to squelch it further (there is also a setting in the Compressor plug-in).  Be careful, though: Audacity's forum notes that this plug-in starts to create the kind of signal that triggers ACX to consider that the file contains too much post-processing to be reliable.
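For the curious, one simple way to estimate a noise floor is to chop the clip into short windows and take the RMS level of the quietest one, on the theory that the quietest stretch is a pause with nothing but room noise in it. The Python sketch below does that. It is my own simplification with made-up function names and a made-up half-second window, not the method ACX or the Audacity plug-ins use.

```python
import math


def window_rms_db(samples, window):
    """RMS level in dB of each consecutive, non-overlapping window."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in chunk) / window)
        levels.append(20 * math.log10(rms) if rms > 0 else float("-inf"))
    return levels


def noise_floor_db(samples, sample_rate=44100, window_seconds=0.5):
    """Estimate the noise floor as the quietest window in the clip."""
    return min(window_rms_db(samples, int(sample_rate * window_seconds)))


# One second of "voice" (a 440 Hz tone) followed by one second of faint
# 60 Hz hum standing in for room noise. Samples are floats in -1.0..1.0.
rate = 44100
voice = [0.1 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
hum = [0.0005 * math.sin(2 * math.pi * 60 * n / rate) for n in range(rate)]
clip = voice + hum

floor = noise_floor_db(clip)
print(round(floor, 1), "passes" if floor <= -60.0 else "too noisy")
```

A clip with no pause at least as long as the window would fool this estimate, which is one reason to leave a little room tone at the start and end of each file.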

Once you get it all sorted out, Audacity will let you create an MP3 which you can then include with the rest of the files you need to upload your entire opus to ACX as an audio book.

So there it is, more than you ever wanted to know about conditioning a simple audio file sufficiently to enable it to become an audio book. However, as several of the names I noted above mentioned, the video component of our brains is used to filling in parts it doesn't truly see.  The audio part is what causes bad reviews, even of a video.

Next time you're thinking how expensive an audio book is ... try making one out of just a simple white paper you've written.  All of a sudden $35 seems cheap.  Although I still borrow most of mine from the library ...


