

Updated 03-01-2013: to correct some details noticed by Georg from the Auphonic team
 
I recently came across what may prove to be the single most useful podcasting tool I’ve seen in years. It is a free online service called Auphonic which automates the tasks of normalizing audio as well as noise reduction, encoding, distribution, and a whole lot more.
 
Many podcasters regularly use The Levelator by The Conversations Network to do some of this. Levelator is great and can save you hours of manual processing. I’ve used it a lot when I have recordings of multiple speakers spread across a room, or an uneven Skype conversation where I don’t have the raw audio from each side. Levelator does have a few shortcomings though: you have absolutely no control over any of the processing; it mangles music; it is rarely updated; and it only works on Windows or Mac.*
 
Auphonic not only addresses these issues but goes well beyond. Working backwards: Auphonic is a web service, so the operating system is irrelevant; development is fast and furious and the system includes a machine learning component; it identifies music and processes it separately from voices; and you have control over what processing is done as well as the target “loudness” of the completed file.
 
Further, Auphonic will process audio and video files from/to many different formats; offers integration with Dropbox, ftp/sftp, Libsyn, and other services; will handle metadata (as well as chapter marks); and provides an API for those inclined to automate their workflow.
 
What’s It All About?
The Auphonic team’s goal is to provide end-to-end services for podcast production from recording to feed. Meaning, a system to capture a recording, edit and polish, create blog post w/show notes, and post for listener consumption. The first part of that goal is the web service to improve your audio files.
 
The service is built on open source tools, and they are planning to release the algorithms as plugins for Audacity. I had hoped these would come as VSTs for use in other DAWs as well, but there are no plans for VST at the moment; they’re working with the LV2 plugin format, which Audacity supports. They have also released an iOS app to record and process files, with an Android version coming any day now.
 
As stated, the service is free. Beyond voluntary donations, they will try to establish a freemium model based on the amount of data people are processing, so heavy users pay a little bit for it and small podcasters can still use it for free.
Much of the work is being funded by the Graz University of Music and Performing Arts and the Austrian government.
 
Feature Breakdown
When a file is processed, it is first analyzed to classify speech, music, and background segments so that each component can be optimally processed to give the best sounding output file. Current features include:
  • Intelligent Leveler – Each person speaking has their level automatically raised or lowered to give a consistent presentation.
  • Loudness Normalization – Voices and music are adjusted for momentary, short term, and overall loudness through limiting and compression. You can specify the overall loudness level based on established European broadcasting loudness standards, or the US ATSC A/85 recommendation to be compliant with the CALM Act. (I had no idea the US had any loudness standards. Commercials sure do seem to still jump out at you!)
  • Filtering – a high pass filter that removes unnecessary low frequencies
  • Noise Reduction – removes consistent background noises from computer fans, air conditioning, or line noise (buzz or hum).
  • Encoding – the processed file can be encoded to a variety of formats including lossy (mp3, AAC, Opus, Ogg) and lossless (WAV, FLAC, ALAC). The service will create multiple output formats at the same time, so click the button once and fill all of your feeds if you offer multiple formats to listeners. Also, if the input is a video file, it can be output to the same format leaving the video untouched.
  • Metadata Management – fill in desired metadata fields once (artist, album, title, artwork, etc) and all of the output files will include the properly formatted tags, including chapter marks for enhanced podcasts. Even your MP3 and OGG files can have chapters!
  • Content Deployment – the service can read and/or write files to a host of services, automating and easing the process of getting files in and out. These currently include: FTP, SFTP, Dropbox, Amazon S3, YouTube, Archive.org, SoundCloud, and Libsyn.
  • Presets – Create presets, or templates to easily process all of your files the same way. This could be as simple as predefining the bitrate for your mp3 files, all the way to what external services to copy the files to, pre-filled metadata, and what processing to do.
  • API – a complete programming interface that allows you to write scripts or full applications that will import, process, and export your files in any way you like.
  • Machine Learning – the system includes machine learning components to constantly improve all of the algorithms. Similar to email spam filters or search engines – the more people use the service, the better it gets.
  • Batch Processing – Specifying a preset, you can batch groups of files together to all be processed at the same time
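
To make the loudness normalization idea in the list above a little more concrete, here is a minimal Python sketch. It is not Auphonic's algorithm: it uses plain RMS level as a rough stand-in for the loudness measures defined in the broadcast standards, and the function names, the -16 dBFS default target, and the hard clip at full scale are all my own assumptions for illustration.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (-1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no measurable level
    return 10.0 * math.log10(mean_square)


def normalize_to_target(samples, target_dbfs=-16.0):
    """Apply one fixed gain so the RMS level lands on target_dbfs.

    target_dbfs is a hypothetical default, not a value taken from any standard.
    Peaks pushed past full scale are hard clipped, which is a crude substitute
    for the limiting a real tool would use.
    """
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # nothing to scale
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


if __name__ == "__main__":
    # A quiet test tone: 10 full cycles of a sine wave at 10% of full scale.
    tone = [0.1 * math.sin(2 * math.pi * i / 100) for i in range(1000)]
    louder = normalize_to_target(tone)
    print(round(rms_dbfs(tone), 2), round(rms_dbfs(louder), 2))
```

A single fixed gain like this only handles the "overall" part. Evening out individual speakers, as the leveler does, would need gain that changes over time.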
 
Control Freak
You have control of many aspects of the processing and resultant files. This includes:
  • Target bitrate for audio formats
  • Stereo to mono conversion
  • Chapter splits to multiple output files
  • Which processing to perform
    • Adaptive leveling
    • Filtering
    • Global loudness normalization (on/off as well as how loud it should be)
    • Noise reduction (including the amount to reduce by)
  • Email notifications can be selected on processing completion, errors, warnings, or all of the above
 
How Does It Sound?
I’ve gone back through my archives and pulled the audio from some “challenging” recordings to put the service through its paces. These included live recordings from conventions with several speakers at varying distances from microphones; listener feedback recorded over phones; and a recording with significant electrical ground noise that seemed to permeate every band on the EQ.
 
Some of those took me hours to fix BEFORE getting to editing. The last was deemed unusable after I and another audio engineer took swings at it. The Auphonic exports were on par with all of the manual work I did, and the results were returned to me within minutes! The last file still had some audible hum here and there, but was totally usable in a podcast as long as you gave a little warning/caveat at the top of the show.
 
I was going to include some samples here, but seeing as the service is free and so fast – you need to just grab some raw audio and see for yourself. I’m confident that you won’t be disappointed and will likely make Auphonic the last stop for all of your future recordings.
 
Conclusion
The breadth of options and flexibility are already astounding and I can’t wait to see what features they add in the future. One in particular that was mentioned on a FLOSS Weekly interview is removing natural room reverb from a recording (presumably using downward expansion).
 
Being a completely free service, I see no reason beginner and expert podcasters alike won’t find this to be a huge time saver and go-to tool for all of their productions.
 
 
* Yes, there is technically a Linux version of The Levelator available, but the required libraries have far outpaced it, so it won’t run on modern systems. There are plenty of guides on how to use it on Linux with Wine or some such, but again, due to not getting updated, I haven’t been able to get it to output a file on Linux for a few years.

As happens just about every time we leave a Con, I’m inspired to try something new. This time I’ve decided to embark on a new music project. A lot of people do “365” projects be it taking photos, writing blog posts, or even short stories. I’m not ready to commit to a song a day, but I’m going to write a song (or at least part of a song) during a lunch hour at least once a week and post it.

I won’t write a blog post about each one, but at least upload them to a set on SoundCloud. Anybody can listen or download them for free from SoundCloud (you don’t have to sign up or anything) and it will auto post to Facebook and Twitter for me. I don’t know how long I’ll go, but the tradition for this kind of thing is a year.

Here are my rules so far. Yes, I’m making this up as I go along and they may change:

  1. Each song will be written on a single day, in a single lunch hour
  2. There’s no schedule and no deadline. I intend to write and release one a week, but have no clue what day that might be
  3. No mixing, mastering, adding, or tweaking after the fact — I write it and post it in one go
  4. I’m releasing each track with a Creative Commons

If you’ve even casually glanced at my site or social stream then you’ll know that I’m an Android fan. I am an Open Source enthusiast, in favor of Copyright and Patent reform, so on and so forth. So what am I doing with an iPhone?
It was actually issued to me for work. My Blackberry Bold 9000 was dying, and people in our enterprise have been gravitating away from Blackberry to iPhone, so it was the logical choice. Work doesn’t allow Android devices yet, but our management solution can handle it (as well as Windows phone), so it is not far off.
For my personal phone I use a DroidX on Verizon (running stock Android 2.3.4, aka Gingerbread). I love the Droid and it has served me well.

The Blackberry also served me well for that matter. I disliked moving to a non-touch screen (previous work device was a Treo 650 and personal was the original Android G1), but I really did enjoy the hardware keyboard. I only used the Blackberry for mail and calendar, so the lack of a touch screen wasn’t too big of an issue. I could have done with a larger screen, though.
I have played with a few iPhone iterations, but never “got” it. I didn’t see what all the fuss was about. Once ordered, my expectation was that after I lived with an iPhone for a few weeks I’d fall in love with it and “join the fold” so to speak. Well, it’s been a couple weeks and that hasn’t happened yet.
I’ve gone from mild amusement and admiration for the hardware design to mild contempt. Below I’ll hit the major points of the device compared to the other devices I’ve extensively used.

Screen
The screen truly is a beautiful display. Nice and crisp, vivid colors, and good lighting, but it is so tiny! Next to my DroidX it looks as small as the Blackberry’s screen looks next to the iPhone. The super duper resolution of the retina display causes text to be on the small side, which isn’t too bad for web pages and such since you can zoom. The accessibility options leave much to be desired, though. Rather than increase overall font sizes it allows you to “triple finger tap” to zoom in, then pan around the zoomed screen.
My eyesight is good, so I’m OK with this, but I think it’s something for those with low vision to keep in mind. No surprise, I prefer the accessibility options in Android’s current 4.x incarnation (aka Ice Cream Sandwich), which acts more like traditional operating systems by adjusting font and icon sizes. Gingerbread doesn’t do this, so if you need it make sure the device you choose has ICS or better. ICS doesn’t quite go far enough since it will not globally change fonts in applications, but I think it’s better than Apple’s choice.

Keyboard
I think this is my biggest gripe with the device. Spacing is OK, and the clicky sound feels natural, but text predictions are limited and the implementation sucks. Text predictions help immensely when typing on the go, and trying to get thoughts down on a small mobile screen. I hardly notice when iOS is predicting for me, so I blow past the words. When I do pause and look, its single guess as to what I want to type is usually wrong.
Another issue is that special characters and numbers require hitting an extra button. Even worse, really common special characters (#, +, *) require hitting a second special button to get to. This is particularly frustrating when you use secure passwords for your lock screen and web sites. Between shift and symbol keys it takes 12 key presses to type in an 8 character password. The Android method of “long pressing” to get numbers and symbols is far more elegant.
I do prefer iOS’s method for zooming through already written text, as well as selecting text to copy. Copy/paste came late to iOS, but they implemented it well.
The clincher for me is that you can’t download alternative keyboards. This is the first place that the locked down Apple experience really started annoying me. Other areas include the App store, the browser, and a few other app areas.
I even preferred the Blackberry hardware keyboard. It too had annoyances with regard to special characters, but doing selections copy/paste with the shift key and trackball worked great.
An unforeseen side effect is that the iPhone keyboard is actually causing me to make more mistakes on my Android phone! Luckily, my existing prediction history and excellent auto correction are helping me past this bump. Hopefully my brain will auto compensate for this once I get used to the iOS keyboard.

Siri
One of the break out features of the iPhone 4S, and in fact what the “S” stands for, is Siri. When this feature was first announced I predicted that it would largely be unused. I haven’t seen definitive proof one way or the other on this, but it seems to be the case, and will likely hold true for me.
The voice recognition is excellent – better than Android’s, but that’s not my issue. The two primary drawbacks to me are that it takes several seconds for Siri to analyze what I’m saying and more often than not she has no idea what I want.
It works well for doing the things in the Apple commercial, but when I try something that I think falls right in line with what Siri should do (for example, “how far away is Biloxi, Mississippi”) she returns brain dead search results that do not infer what I wanted. This is far from a “magical” experience.
Siri is pretty responsive when at home on wifi, but when out and about I can type a search for what I want faster than watching Siri’s blank screen, waiting for her to get it wrong.
I will continue to try queries, and I’m sure I will love the ability to quickly schedule appointments or place calls while I’m driving. I’ll come back and update the post if I do use Siri regularly.

Performance
This is another area where the iPhone really shines. It is responsive. It loads into apps and swipes across screens fluidly. Hands down, the best experience I’ve had with a phone in this regard. In my experience, most complex devices start this way, but things get gummed up once you’ve been using them for a few months, so I’ll check back in. My friends with iPhones never complain about them slowing down, so I think performance will continue to be great.

Camera
Another excellent mark here. The camera is wonderful. It launches instantly and the pictures are great. Overall a much better experience and better pictures than my DroidX.  I plan on moving to a Samsung Galaxy S III when I am eligible for upgrade and I can’t wait to take the cameras head to head. I’m pretty sure that the iPhone will still come out on top. 
The photo gallery is much more responsive than my Droid’s as well. The Android photo gallery can take forever to open up once you tap it. Once inside it is fluid, and if you re-launch it is fine the rest of the day, but then (presumably) the process gets kicked out of memory and it takes forever to launch again.

Battery
Battery life is good, but not much better than my DroidX, so this is pretty much a wash. I easily go all day while using the phone pretty heavily and just charge each night as I sleep. I would prefer to have a removable battery with the option to buy something with a longer charge, but I think I’m OK with it. We’ll see in a couple years if the battery starts degrading.
The Blackberry battery is legendary. Well, not the battery itself (though it is removable), but the device’s power performance. You can use it heavily and still not have to charge it for days. This isn’t a fair comparison though, given the limited processing power, limited data, and limited apps.

Apps
Yes, there are a bazillion apps in the App store. But you know what, just about any good app costs money and there are few demo or “light” versions. This is very frustrating coming from Android where it is more common to have ad supported apps with the option to buy an ad free version. I’m not even going to touch the fact that your only option is to use Apple’s store. I may throw the phone against a wall it frustrates me so.
The truth is, I’m rarely disappointed that an app is available for iPhone and not my Android. There are usually analogs that work just as well, or almost as well (and sometimes better) but are free. Yes, there are plenty of stinkers, but the power of user reviews and ratings makes steering clear of those easy.
I hear an area where the iPhone shines is music creation apps. Unfortunately, I can’t really take advantage of playing with these as it is a work phone. Blackberry app choices are abysmal. I’m sure there are lots more out there that I just never found, but I never experienced the same discoverability or ease of installing inherent in the Apple and Android stores.

Other Issues
I haven’t touched on a number of issues, but they’ve all been well documented by other people. Some are device specific, and some are more philosophical. I may write a blog post dedicated to these in the future. Among the other gripes are: the proprietary connector, reliance on paid Apple services for what I consider to be basic functionality, lacking configuration options in many areas (i.e., Apple making choices for me instead of giving options), no removable storage, the perils of hacking and jailbreaking, poor task switching and app concurrency, and minimal hardware buttons. That’s most of them. Other small annoyances have cropped up as I’ve used it more.

The Scoreboard
Here’s where I give scores to these three devices. I will rate each aspect from 1 to 10 (10 being best). Best possible score is 100. Let’s see how they did.

Feature             DroidX  iPhone 4S  Blackberry Bold 9000
Overall Experience       9          7                     4
Screen                   9          7                     4
Responsiveness           8         10                     7
Keyboard                 9          4                     8
Battery Life             8          9                    10
Apps                     9          8                     3
Camera                   8         10                     4
Configurability         10          6                     2
Accessibility            7          6                     3
Accessories              9          9                     5
Final Score             86         76                    50
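
If you want to check my math, the final scores are just column sums over the ten categories. Here is a small Python sketch that recomputes them. The ratings are copied from the scoreboard above; the names SCORES, DEVICES, and final_scores are made up for this example.

```python
# Ratings per category, in the order (DroidX, iPhone 4S, Blackberry Bold 9000).
DEVICES = ("DroidX", "iPhone 4S", "Blackberry Bold 9000")

SCORES = {
    "Overall Experience": (9, 7, 4),
    "Screen": (9, 7, 4),
    "Responsiveness": (8, 10, 7),
    "Keyboard": (9, 4, 8),
    "Battery Life": (8, 9, 10),
    "Apps": (9, 8, 3),
    "Camera": (8, 10, 4),
    "Configurability": (10, 6, 2),
    "Accessibility": (7, 6, 3),
    "Accessories": (9, 9, 5),
}


def final_scores(scores, devices=DEVICES):
    """Sum each device's ratings across all categories (maximum is 10 per category)."""
    return {
        device: sum(row[column] for row in scores.values())
        for column, device in enumerate(devices)
    }


if __name__ == "__main__":
    for device, total in final_scores(SCORES).items():
        print(f"{device}: {total} / {10 * len(SCORES)}")
```

Every category carries equal weight here, which is how the table is scored. Weighting the keyboard more heavily, for instance, would widen the gap.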

OK Apple fans – now it’s your turn to tell me why I’m completely wrong and how I still don’t get it. It has only been a couple weeks, so I’ll come back and update the post in the next month to see if I still feel the same way.

I love open source. I love the technology, the philosophy, and the people in the trenches making things better.

But I give up. Between proprietary hardware drivers and slow software release cycles for the things that matter most to me, I’m done.

I just spent two hours to accomplish the following: Download a new album that we paid for and play it back on my television.