DRAFT: C compiler weirdness

Date: 2018-mm-dd
Git: https://gitlab.com/mort96/blog/blob/published/content/00000-home/draft-c-compiler-stuff.md

In a previous blog post, I wrote about some weird features of C, the C preprocessor, and GNU extensions to C that I used in my testing library, Snow.

This post will be about some of the weird compiler and language quirks, limitations, and annoyances I've come across. I don't mean to bash compilers or the specification; most of these quirks have good technical or practical reasons.

Compilers lie about what version of the standard they support

There's a handy macro called __STDC_VERSION__, which describes the version of the C standard your C implementation conforms to. We can use #if (__STDC_VERSION__ >= 201112L) to check whether our C implementation conforms to C11 or higher (C11 was published in December 2011, hence 2011 12). That's really useful if, say, you're a library author and have a macro which uses _Generic, but also have alternative ways of doing the same thing and want to warn people when they use the C11-only macro in an older compiler.

In theory, this should always work; any implementation of C which conforms to all of C11 will define __STDC_VERSION__ as 201112L, while any implementation which doesn't conform to C11, but conforms to some earlier version, will define __STDC_VERSION__ to be less than 201112L. Therefore, unless the _Generic feature gets removed in a future version of the standard, __STDC_VERSION__ >= 201112L means that we can safely use _Generic.

Sadly, the real world is not that clean. As early as GCC 4.7, you could enable C11 by passing -std=c11, which would set __STDC_VERSION__ to 201112L, but the first release to actually implement all non-optional features of C11 was GCC 4.9. That means that if we just check the value of __STDC_VERSION__, users on GCC 4.7 and GCC 4.8 who use -std=c11 will see really confusing error messages instead of our nice error message. Annoyingly, GCC 4.7 and 4.8 happen to still be extremely widespread versions of GCC. (Relevant: GCC Wiki's C11Status page)

The solution still seems relatively simple; just don't use -std=c11. More recent compilers default to C11 anyways, and there's no widely used compiler that I know of which will default to setting __STDC_VERSION__ to C11 without actually supporting all of C11. That works well enough, but there's one problem: GCC 4.9 supports all of C11 just fine, but only if we give it -std=c11. GCC 4.9 also seems to be one of those annoyingly widespread versions of GCC, so we'd prefer to encourage users to set -std=c11 and make the macros which rely on _Generic work in GCC 4.9.

Again, the solution seems obvious enough, if a bit ugly: if the compiler is GCC, we only use _Generic if the GCC version is 4.9 or greater and __STDC_VERSION__ is C11. If the compiler is not GCC, we just trust it if it says it supports C11. This should in theory work perfectly:

#if (__STDC_VERSION__ >= 201112L)
# ifdef __GNUC__
#  if (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#   define IS_C11
#  endif
# else
#  define IS_C11
# endif
#endif

Our new IS_C11 macro should now always be defined if we can use _Generic and always not be defined when we can't use _Generic, right?

Wrong. It turns out that in its quest to support code written for GCC, Clang also defines the __GNUC__, __GNUC_MINOR__, and __GNUC_PATCHLEVEL__ macros, specifically to fool code which checks for GCC into thinking Clang is GCC. However, the version it reports says nothing about what Clang actually supports: by default, Clang claims to be GCC 4.2.1, no matter how recent the Clang release is. Clang gained support for C11 in 3.6, but using our code, we would conclude that it doesn't support C11, because __GNUC__ is 4 and __GNUC_MINOR__ is 2. We can solve this by adding a special case for when __clang__ is defined:

#if (__STDC_VERSION__ >= 201112L)
# if defined(__GNUC__) && !defined(__clang__)
#  if (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#   define IS_C11
#  endif
# else
#  define IS_C11
# endif
#endif

Now our code works with both Clang and with GCC, and should work with all other compilers which don't try to imitate GCC - but for every compiler which does imitate GCC, we would have to add a new special case. This is starting to smell a lot like user agent strings.

The Intel compiler is at least nice enough to define __GNUC__ and __GNUC_MINOR__ according to the version of GCC installed on the system; so even though our version check is completely irrelevant in the Intel compiler, at least it will only prevent an otherwise C11-compliant Intel compiler from using _Generic if the user has an older version of GCC installed.

User: Hi, I'm using the Intel compiler, and your library claims my compiler doesn't support C11, even though it does.

You: Upgrading GCC should solve the issue. What version of GCC do you have installed?

User: ...but I'm using the Intel compiler, not GCC.

You: Still, what version of GCC do you have?

User: 4.8, but I really don't see how that's relevant...

You: Try upgrading GCC to at least version 4.9.

(Relevant: Intel's Additional Predefined Macros page)

_Pragma in macro arguments

C has had pragma directives for a long time. It's a useful way to tell our compiler something implementation-specific; something which there's no way to say using only standard C. For example, using GCC, we could use a pragma directive to tell our compiler to ignore a warning for a couple of lines, without changing warning settings globally:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
// my_float being 0 indicates a horrible failure case.
if (my_float == 0)
	abort();
#pragma GCC diagnostic pop

We might also want to define a macro which outputs the above code, so C99 introduced the _Pragma operator, which works like #pragma, but can be used in macros. Once this code goes through the preprocessor, it will do exactly the same as the above code:

#define abort_if_zero(x) \
	_Pragma("GCC diagnostic push") \
	_Pragma("GCC diagnostic ignored \"-Wfloat-equal\"") \
	if (x == 0) \
		abort(); \
	_Pragma("GCC diagnostic pop")

abort_if_zero(my_float);

Now, imagine that we want a macro to trace certain lines; a macro which takes a line of code, and prints that line of code while executing the line. This code looks completely reasonable, right?

#define trace(x) \
	fprintf(stderr, "TRACE: %s\n", #x); \
	x

trace(abort_if_zero(my_float));

However, if we run that code through GCC's preprocessor, we see this mess:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
#pragma GCC diagnostic pop
fprintf(stderr, "TRACE: %s\n", "abort_if_zero(my_float)"); if (my_float == 0) abort();

The pragmas all got bunched up at the top! From what I've heard, this isn't against the C standard, because the standard is not entirely clear on what happens when you send in _Pragma operators as macro arguments, but it sure surprised me when I encountered it nonetheless.

For the Snow library, this means that there are certain warnings which I would have loved to only disable for a few lines, but which I have to disable for all code following the #include <snow/snow.h> line.

Side note: Clang's preprocessor does exactly what one would expect, and produces this output:

fprintf(stderr, "TRACE: %s\n", "abort_if_zero(my_float)");
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
 if (my_float == 0) abort();
#pragma GCC diagnostic pop

Line numbers in macro arguments

Until now, the quirks I've shown have been issues you could potentially encounter in decent, real-world code. If this quirk has caused issues for you, however, it might be a sign that you're slightly over-using macros.

All testing code in Snow happens within macro arguments. This allows for what I think is a really nice looking API, and allows all testing code to be disabled just by changing one macro definition. This is a small example of a Snow test suite:

#include <stdio.h>
#include <snow/snow.h>

describe(files, {
	it("writes to files", {
		FILE *f = fopen("testfile", "w");
		assertneq(f, NULL);
		defer(remove("testfile"));
		defer(fclose(f));

		char str[] = "hello there";
		asserteq(fwrite(str, 1, sizeof(str), f), sizeof(str));
	});
});

snow_main();

If that assertneq or asserteq fails, we would like and expect to see a line number. Unfortunately, after the code goes through the preprocessor, the entire nested macro expansion ends up on a single line. All line number information is lost. __LINE__ just returns the number of the last line of the macro expansion, which is 14 in this case. All __LINE__ expressions inside the block we pass to describe will return the same number. I have googled around a bunch for a solution to this issue, but none of the solutions I've looked at actually solve the issue. The only actual solution I can think of is to write my own preprocessor.

Some warnings can't be disabled with pragma

Like the above example, this is probably an issue you shouldn't have come across in production code.

First, some background. In Snow, both the code which is being tested and the test cases can be in the same file. This is to make it possible to test static functions and other functionality which isn't part of the component's public API. The idea is that at the bottom of the file, after all non-testing code, one should include <snow/snow.h> and write the test cases. In a non-testing build, all the testing code will be removed by the preprocessor, because the describe(...) macro expands to nothing unless SNOW_ENABLED is defined.

My personal philosophy is that your regular builds should not have -Werror, and that your testing builds should have as strict warnings as possible and be compiled with -Werror. Your users may be using a different compiler version from you, and that compiler might produce some warnings which you haven't fixed yet. Being a user of a rolling release distro, with a very recent version of GCC, I have way too often had to edit someone else's Makefile and remove -Werror just to make their code compile. Compiling the test suite with -Werror and regular builds without -Werror has none of the drawbacks of using -Werror for regular builds, and most or all of the advantages (at least if you don't accept contributions which break your test suite).

This all means that I want to be able to compile all files with at least -Wall -Wextra -Wpedantic -Werror, even if the code includes <snow/snow.h>. However, Snow contains code which produces warnings (and therefore errors) with those settings; among other things, it uses some GNU extensions which aren't actually part of the C standard.

I would like to let users of Snow compile their code with at least -Wall -Wextra -Wpedantic -Werror, but Snow has to disable at least -Wpedantic for all code after the inclusion of the library. In theory, that shouldn't be an issue, right? We just include #pragma GCC diagnostic ignored "-Wpedantic" somewhere.

Well, as it turns out, disabling -Wpedantic with a pragma doesn't disable all the warnings enabled by -Wpedantic; there are some warnings which are impossible to disable once they're enabled. One such warning is about using directives (like #ifdef) inside macro arguments. As I explained earlier, everything in Snow happens inside of macro arguments. That means that when compiling with -Wpedantic, this code produces a warning which it's impossible to disable without removing -Wpedantic from the compiler's arguments:

describe(some_component, {
#ifndef __MINGW32__
	it("does something which can't be tested on mingw", {
		/* ... */
	});
#endif
});

That's annoying, because it's perfectly legal in GNU's dialect of C. The only reason we can't do it is that it just so happens to be impossible to disable that particular warning with a pragma.

To be completely honest, this issue makes complete sense. I imagine the preprocessor stage, which is where macros are expanded, doesn't care much about pragmas. It feels unnecessary to implement pragma parsing for the preprocessor just in order to let people compile files with -Wpedantic but still selectively disable this particular warning. That doesn't make it less annoying though.

Funnily enough, I encountered this issue while writing Snow's test suite. My solution was to just define a macro called NO_MINGW which is empty if __MINGW32__ is defined, and expands to the contents of its arguments otherwise.
