Saturday, May 29, 2010

When linear search is faster than binary

It is often stated that a binary search algorithm performs better than linear search. Indeed, let's assume that the goal is to find the index of the largest integer in the sorted zero-based array A that is less than or equal to the given integer X. For simplicity, let's also assume that the size of array is N = 2K, that for all possible inputs A[0] <= X and that one can also put a sentinel value after the end, so that X < A[N] for all possible inputs X. E.g., if it is known that X always fits in 16 bits, one can put 65536 as the sentinel.

The assumptions above are true for the entropy decoder found in the Monkey's Audio codec. In that case, N = 64, X is unsigned and always fits in 16 bits, A[0] = 0, A[64] = 65536.

This is linear search:

A[N] = sentinel;
index = 0;
while (A[index + 1] <= X)

This is binary search:

int bit = N >> 1;  /* N is a power of two */
index = 0;
while (bit != 0) {
    if (A[index + bit] <= X)
        index += bit;
    bit >>= 1;

If the size of array to be searched is N, then the average case complexity is commonly stated to be O(log(N)) for binary search and O(N) for linear search. So, for the Monkey's Audio case presented above, one would expect, on average, 32 loop iterations for the linear search algorithm and 6 iterations for the binary search.

However, there is one key fact that makes the estimate above invalid. Such simple averaging is valid only if each result appears with the same probability. For Monkey's Audio entropy decoder, this is not the case.

In fact, out of the 64 possible results, "0" appears with 31% probability, and the total probability of the first 6 results is 87%. So, it is quite natural that linear search is faster than binary in this case, because it often stops very early.

Monday, May 10, 2010

Converting audio samples

Different audio applications process audio in different sample formats. Some use 16-bit samples internally, others use 32-bit or floating point samples. In either case, there is a need to convert the sample format for the purposes of export and playback. Sounds like a simple task? No, a rich field for disagreements and bugs, even without taking dithering into account.

E.g., when converting from floating-point samples to 16 bits, there is no universal agreement on the scale. Some (e.g., JACK) think that 1.0 should convert to 32767 and -1.0 to -32767. Within that model, there are two opinions how to get -32768 (the most negative integer representable in 16 bits): "-1.00003" or "you can't". The other viewpoint, implemented, e.g., in ALSA-lib, is that -1.0 should map to -32768, 0.99997 should give 32767 and 1.0 should clip. There is also a third possibility, namely, to use different scale factors for negative and positive sample values, so that 0.0 maps to 0, 1.0 maps to 32767 and -1.0 maps to -32768.

And now some code. If one adopts the 32767.0 scale factor (as in JACK) and declares the -32768 sample value as being unreachable, one can use the following C function for conversion:

int16_t to_int16(float f)
if (f > 1.0) f = 1.0;
if (f < -1.0) f = -1.0;
return (int16_t)(f * 0x7fff);

As expected, to_int16(1.0) == 32767 and to_int16(-1.0) == -32767. However, the following quite similar function that is supposed to convert floating-point samples to 32-bit signed ones doesn't actually work properly:

int32_t to_int32(float f)
if (f > 1.0) f = 1.0;
if (f < -1.0) f = -1.0;
return (int32_t)(f * 0x7fffffff);

Indeed, to_int32(1.0) returns the minimum, not the maximum possible 32-bit integer. That happens because, unlike 0x7fff, one cannot represent 0x7fffffff exactly as a floating-point number. So the nearest representable one is substituted, and that's 0x80000000. So the overflow happens, and the function returns a wrong result.

One fix would be to cast 0x7fffffff to a double-precision floating point number. Another (but not 100% equivalent) option is to change the scale factor so that it is close enough, but doesn't cause this overflow, i.e. to 0x7fffff00.

New job

This blog didn't receive any updates for quite a while. That's both because I was too busy at Yandex and because there were no interesting bugs to blog about. I no longer work at Yandex. There was no single big reason for this decision, just a lot of small issues that accumulated and that don't apply to anyone else.

Now I work for a different company (won't name it here, it is nether SEO nor Google) as a senior developer of a web service. The most pleasant result of the job change is that now I feel more like a developer: write mostly C code, not bash scripts, and can even make architectural decisions.