Saturday, September 24, 2011

I wrote a DTS encoder

Let me announce a piece of software that I published a week ago: dcaenc, an open-source DTS encoder. The package contains sources for a shared library, a command-line tool, and an ALSA plugin.

DTS is one of the compressed formats that allow transfer of multichannel (e.g., 5.1) audio over SPDIF connections. The other common format is AC3. The SPDIF standard does not define a method for passing more than two channels of uncompressed PCM audio, so compression has to be used. Both AC3 and DTS are also used in DVD sound tracks.

Open-source decoders for both AC3 and DTS already exist: liba52 and libdca (side note: please don't use libdca. It is a security risk, because some files crash it or are decoded improperly). FFmpeg can also decode these formats. However, useful open-source encoders existed only for AC3: one in FFmpeg, and another one (aften) based on it. The DTS "encoder" in FFmpeg was ported by someone else from my old proof-of-concept code that served as a tool to understand the DTS subband transform. It could only encode stereo PCM files into a valid DTS bitstream of the same bitrate, which is useless for any practical purpose. Now dcaenc provides a useful encoder that accepts multichannel sound and encodes it at the bitrate specified on the command line.

My encoder has two main use cases:
  • On-the-fly encoding of multichannel PCM audio produced by arbitrary ALSA applications (e.g. games) for transmission via SPDIF.
  • Creation of DVD soundtracks and DTS CDs.

Some people ask me why I didn't integrate my encoder into FFmpeg instead of releasing it as a standalone package. FFmpeg does have faster implementations of the basic DSP building blocks, and the criticism that I reinvented a lot of wheels is valid. Integration with FFmpeg is a desired long-term goal.

There are still several reasons why I decided not to integrate right from the beginning. First, I don't think that my work is in the necessary shape for integration yet. For example, FFmpeg prefers floating-point codecs, while my library currently uses fixed-point arithmetic (I thought it would be beneficial for porting to the ARM architecture). Second, and most important: when the encoder is standalone, users can get it immediately and use it, without the hassle of replacing the distribution-provided FFmpeg package and potentially breaking distribution-provided software such as VLC that depends on it. Third, if I know that I wrote all the code myself, I can sell LGPL exceptions.

While dcaenc already produces "transparent" output at 1411 or 1536 kilobits per second, there is still room for quality improvement at lower bitrates. This is because the library does not yet use all compression possibilities offered by the DTS standard. I am going to implement at least linear prediction (incorrectly called ADPCM in the specification) and channel coupling in future versions. Stay tuned!
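
For reference, the two bitrates above match the data rate of 16-bit stereo PCM at 44.1 kHz and 48 kHz, which is the payload a two-channel SPDIF link carries. The short Python sketch below just does that arithmetic. The function name is made up for illustration, and the 16-bit sample size for the 5.1 source is an assumption.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Stereo 16-bit PCM, i.e. what fits into an SPDIF link without compression.
stereo_44k = pcm_bitrate_kbps(44100, 16, 2)    # 1411.2
stereo_48k = pcm_bitrate_kbps(48000, 16, 2)    # 1536.0

# 5.1 audio (6 channels) at 48 kHz, assuming 16-bit samples.
surround_48k = pcm_bitrate_kbps(48000, 16, 6)  # 4608.0

print(stereo_44k, stereo_48k, surround_48k)
# Compression ratio needed to fit 5.1 into the stereo PCM payload.
print(surround_48k / stereo_48k)               # 3.0
```

In other words, at these bitrates the encoder only has to squeeze six channels into the space of two, which is a modest 3:1 ratio.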


Maxwell said...

Looking forward to trying it out!

Dutchy said...

Thanks, your encoder worked great for me!
I compiled it on Fedora and the convert instructions work quite well.

The only problem I stumbled upon was when I tried to mux it with video into an mkv (mkvtoolnix nagged about missing DTS headers).
The solution was to mux the DTS file with tsMuxer prior to muxing it into an mkv.

Fabrice on the Blog said...

Amazing and inspiring stuff.
Can you advise me based on your experience with the AC3 libraries?
I am trying to port an AC3 decoder to an STM32 ARM processor.
I have found a decent and recent liba52 that accepts fixed-point and not only float. On the other hand, we also have the AC3 decoder from the ffmpeg library, which looks a bit more complex to me, as some parts are also written for encoding. Would you give me some advice on the best starting point to compile such a decoder? Also, liba52 doesn't seem to be immediately compatible with DTS. Any tricks for that?
Thank you in advance / fabriceo

Fabrice on the Blog said...

Amazing stuff.
Based on your experience, could you give me a small piece of advice: I am trying to port an AC3 decoder to an STM32F3, which has 256 KB of program memory and 32 KB of RAM. liba52 seems OK. I found a recent/decent version also written with fixed-point support. On the other hand, ffmpeg now comes with an ac3dec.c, which is interesting. Could you give me some thoughts on that? Also, I'd like to be compatible with DTS... Thanks

Alexander Patrakov said...

First, DTS and AC3 are completely different codecs, so it is pointless to speak about compatibility. You need two different decoders and two different encoders.

As for the available AC3 decoder implementations, the one in FFmpeg is the only one currently maintained. Also, liba52 does not support E-AC3 (relevant for soundtracks ripped from Blu-rays).

Anatolio said...

Thanks for your encoder! Could you please explain how to use it for correct DTS-CD burning? Unfortunately, the .dts files that I get are not accepted by my CD burning software (Plextools).

Alexander Patrakov said...

You have to find a way to add a WAV header to them then. Under Linux, you can try this:

ffmpeg -ar 44100 -f s16le -ac 2 -i file.dts file_dts.wav
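
The command above tells ffmpeg to read the .dts file as if it were raw 16-bit little-endian stereo PCM at 44.1 kHz and to write the same bytes out with a WAV header. Where ffmpeg is not available, roughly the same wrapping can be sketched with Python's standard wave module. The function name and file names here are made up for illustration, and whether a given burning program accepts the result has not been tested.

```python
import wave


def wrap_dts_as_wav(dts_path, wav_path, sample_rate=44100):
    """Copy the raw bytes of a .dts file into a WAV container, unmodified.

    The header declares 16-bit stereo PCM, mirroring the
    "-f s16le -ac 2 -ar 44100" options of the ffmpeg command.
    Returns the number of frames written.
    """
    channels = 2
    sample_width = 2  # bytes per sample (16 bits)
    frame_size = channels * sample_width

    with open(dts_path, "rb") as src:
        payload = src.read()

    # Pad with zero bytes so the payload is a whole number of frames.
    payload += b"\x00" * (-len(payload) % frame_size)

    with wave.open(wav_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(sample_width)
        out.setframerate(sample_rate)
        out.writeframes(payload)

    return len(payload) // frame_size


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        # Example: python wrap_dts.py file.dts file_dts.wav
        wrap_dts_as_wav(sys.argv[1], sys.argv[2])
```

The payload is not decoded or re-encoded in any way. Only the container changes, so the DTS stream itself must already be suitable for a CD (44.1 kHz source, 1411200 bits per second).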

Anatolio said...

Thank you. I hope some Windows applications would help too. A quick search revealed the following: spdifer.exe, dts wav.exe and dts2wav.exe. I'll let you know the results. Currently I'm using commercial applications, but I am looking for a single-stage solution for 5.1 44100 wav -> dts wav conversion in Foobar.
Is there a chance to modify dcaenc to produce dtswav with a correct wav header?

Alexander Patrakov said...

I have no time to work on this project, but will apply a good patch if someone (maybe you?) creates it.

Human+ said...

Why is floating point preferred in FFmpeg? It requires specialized hardware that's only common to x86, and it's not even bit exact...

Matt said...

Hey, thanks for writing this. I just want to report that your ALSA plugin works great with my Onkyo TX-SR304 receiver (no IEC61937-5 wrapping or AES changes required).

I had one issue, which I was able to work around. VLC tries to open the output device first with 2 channels and apparently then switches to the correct number of channels for the media being played. Your ALSA plugin returns an error if the number of channels is 2, and this causes VLC to give up and not output any audio at all. My workaround was to define a new ALSA output of type "route", using "dca" as its slave and forcing 6 channels. I just mapped the 6 channels one-to-one. Now when I tell VLC to output to that device, it is able to initialize ALSA with 2 channels and then switch to 6 channels.

Now I can finally play movies that have AAC-encoded 5.1-channel audio tracks and enjoy the surround mix as intended!

Alexander Patrakov said...

Thanks for the bug report. The workaround with the route plugin is indeed valid. The two-channel mode can be added if we figure out what it should mean: a pointless encode or a passthrough.
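
For readers who want to try the same workaround, the route definition described above could look roughly like the following fragment in ~/.asoundrc. This is only a sketch: the name "dca6" is made up for illustration, and "dca" is assumed to be the name under which the plugin's PCM is defined on your system.

```
pcm.dca6 {
    type route
    slave {
        pcm "dca"
        channels 6
    }
    # Map each client channel to the same slave channel at full volume.
    ttable {
        0.0 1
        1.1 1
        2.2 1
        3.3 1
        4.4 1
        5.5 1
    }
}
```

The player would then be pointed at the "dca6" device instead of "dca".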