Saturday, October 25, 2008

Unexpected warning

Today I refactored a function in my C program in order to make it more readable and remove special-casing. What came unexpected is that I got a new warning in a function that I didn't touch. While this sounds strange, here is a very simple example that shows how it could happen.

This is the function that I didn't touch, but that will, at the end, get a warning:

extern const int db_to_add[32];

static int add_db(int a, int b)
{
if (a < b) {
int tmp = a;
a = b;
b = tmp;
}
if (a - b >= 32)
return a;

return a + db_to_add[a - b];
}


Here is a caller, just to make it sure that a static function is used:

int peak_db[32];
static int noise(int band1, int band2, int spectrum1, int spectrum2)
{
/* actually a lot more complex, but the idea is that
it takes two bands, except for special cases */
int noise1 = peak_db[band1] + spectrum1;
int noise2 = peak_db[band2] + spectrum2;
return add_db(noise1, noise2);
}

int valid_use(int b, int s)
{
/* a dummy warning-free function
only for the purpose of this blog post */
return noise(b, b + 1, 0, s);
}


With all of the above, there is no warning, even when "-O3 -Wall" flags are passed to gcc. Now add this (in the real program, this is the result of value propagation over various branches):


int trigger_warning()
{
return noise(0, 0, 0, -200);
}


The intention is to attenuate the unneeded term enough so that it doesn't matter, and use the general two-band case. With gcc-4.3.1 -O3 -Wall, the result, referring to the line in add_db() where a is compared with b, is:

warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false


So it looks like gcc inlined everything and attempted to test whether peak_db[0] < peak_db[0] - 200. Because C compilers are allowed to think that signed overflow never occurs, this is always false (as intended). However, tests of this form are often incorrectly used as security checks, that's why the warning.

Thursday, October 23, 2008

Don't believe rrdtool

rrdtool is the industry standard for plotting time-dependent data (t.g., for monitoring). However, Debian's rrdtool 1.3.1-4 can create misleading plots. Here is how to reproduce.

First, create a round-robin database that will hold our test data.

rrdtool create testdata.rrd --start 1224453000 --step 1800 \
DS:testdata:GAUGE:28000:0:U RRA:LAST:0.5:1:1800

Then, populate it with numbers:

rrdtool update testdata.rrd 1224453300:1350535
rrdtool update testdata.rrd 1224467700:1350545
rrdtool update testdata.rrd 1224482100:1350554
rrdtool update testdata.rrd 1224496500:1350560
rrdtool update testdata.rrd 1224514800:1350562
rrdtool update testdata.rrd 1224539700:1350562
rrdtool update testdata.rrd 1224557700:1350562
rrdtool update testdata.rrd 1224576000:1350562
rrdtool update testdata.rrd 1224590100:1350562
rrdtool update testdata.rrd 1224604800:1350562
rrdtool update testdata.rrd 1224622500:1350562
rrdtool update testdata.rrd 1224636900:1350562
rrdtool update testdata.rrd 1224651300:1350562
rrdtool update testdata.rrd 1224669300:1350562
rrdtool update testdata.rrd 1224683700:1350562
rrdtool update testdata.rrd 1224698100:1350562
rrdtool update testdata.rrd 1224712500:1350562
rrdtool update testdata.rrd 1224730500:1350562
rrdtool update testdata.rrd 1224744900:1350562

The number before the semicolon is a UNIX timestamp, and the number after the semicolon is the corresponding value. As you see from the numbers, the value is slightly above 1.35 million and slowly grows in the beginning of the period.

Let's plot it:

rrdtool graph testdata.png -t testdata --start 1224453000 --end 1224757634 \
DEF:testdata=testdata.rrd:testdata:LAST 'LINE2:testdata#ff0000'

Good result

Indeed, there is not much change. But let's suppose that we are really interested in the small change that happened over the week.

Let's use the alternative autoscaling option that, according to the manual page, is designed specifically for such cases:

rrdtool graph testdata-bug.png -t testdata --start 1224453000 --end 1224757634 \
-A -Y DEF:testdata=testdata.rrd:testdata:LAST 'LINE2:testdata#ff0000'

bug?!
What? The plot says that the value is 35000M, which is waaaay wrong.

I don't know yet whether this bug is Debian-spedific, but that's enough for discouraging me from using -A and -Y options.

Update: not a bug. The label is just cut off from the left. Here is the correct plot:

rrdtool graph testdata-bug.png -t testdata --start 1224453000 --end 1224757634 \
-A -Y -L 10 DEF:testdata=testdata.rrd:testdata:LAST 'LINE2:testdata#ff0000'

non-bug

Although I must say that MS Excel handles this case better: it replaces numbers that don't fit with "###". Such placeholder (no number) is better than a wrong (chopped) number.

Sunday, October 12, 2008

Porting C/C++ code from Unix to Win32

At Yandex, some employees use Microsoft Visual Studio on Windows for compiling and debugging their code, some use gcc and gdb on Linux or FreeBSD (via ssh). And thus, portability problems arise. Here is one example.

A chunk of new code was written on FreeBSD that, essentially, opens a file (or maybe stdin), determines which plugin (i.e., shared library) to load, and passes the file descriptor to a function in the plugin. A natural design for Unix, however, it doesn't work for Windows: the file descriptors opened with open() are valid only in the same module (i.e., DLL or application) that opened them. It appears that one has to use _get_osfhandle() and _open_osfhandle() Win32 functions to overcome this limitation. That's just ugly.

Just for fun, I also tried to rebuild my own C code (not related to my work at Yandex), on Windows with Microsoft Visual Studio. This, obviously, resulted in problems:

  • The Microsoft compiler doesn't have such standard C99 headers as <stdint.h>. I had to download this header from the msinttypes project.

  • When compiling C files, the compiler doesn't recognize the "inline" keyword and the definition for int64_t from the <stdint.h> header doesn't work. Both issues were fixed with the /TP switch that tells the compiler to treat the code as C++.

  • The compiler warns about the POSIX function open() and suggests to use the "standard" _open() function. What standard does it talk about?

  • Well, the standard C function for opening files is actually fopen(), so I converted my program to use that. Guess what? A warning saying that fopen() is deprecated and suggesting to use fopen_s() that actually exists only in Microsoft compilers.
The intentions of Microsoft are obvious: to teach people to write programs that will require significant amount of work when porting to other platforms. Thus, if you are a newbie programmer working on Win32, please, switch to a non-Microsoft compiler now, before you acquire any bad habits.