C compiler quirks I have encountered

Date: 2018-07-26
Git: https://gitlab.com/mort96/blog/blob/published/content/00000-home/00011-c-compiler-quirks.md

In a previous blog post, I wrote about some weird features of C, the C preprocessor, and GNU extensions to C that I used in my testing library, Snow.

This post will be about some of the weird compiler and language quirks, limitations, and annoyances I've come across. I don't mean to bash compilers or the specification; most of these quirks have good technical or practical reasons.

Compilers lie about what version of the standard they support

There's a handy macro, called __STDC_VERSION__, which describes the version of the C standard your C implementation conforms to. We can use #if (__STDC_VERSION__ >= 201112L) to check if our C implementation conforms to C11 or higher (C11 was published in December 2011, hence 2011 12). That's really useful if, say, you're a library author and have a macro which uses _Generic, but also have alternative ways of doing the same and want to warn people when they use the C11-only macro in an older compiler.

In theory, this should always work; any implementation of C which conforms to all of C11 will define __STDC_VERSION__ as 201112L, while any implementation which doesn't conform to C11, but conforms to some earlier version, will define __STDC_VERSION__ to be less than 201112L. Therefore, unless the _Generic feature gets removed in a future version of the standard, __STDC_VERSION__ >= 201112L means that we can safely use _Generic.
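As a minimal sketch of that naive check (type_name, HAVE_GENERIC and report are names I made up for illustration, they are not part of Snow), a header could choose between a _Generic-based macro and a fallback purely from the value of __STDC_VERSION__:

```c
#include <stdio.h>

/* Naive check: trust whatever version the compiler reports. */
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)
# define HAVE_GENERIC 1
# define type_name(x) _Generic((x), \
	int: "int", \
	double: "double", \
	default: "other")
#else
# define HAVE_GENERIC 0
# define type_name(x) "unknown"
#endif

/* Prints which branch was taken and what type_name reports for an int. */
static void report(void) {
	printf("_Generic available: %d, 42 is: %s\n",
		HAVE_GENERIC, type_name(42));
}
```

On a compiler which honestly reports C11 or newer, type_name(42) picks the int branch; everywhere else it quietly falls back to the placeholder string.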

Sadly, the real world is not that clean. GCC 4.7 already let you enable C11 by passing in -std=c11, which would set __STDC_VERSION__ to 201112L, but the first release to actually implement all non-optional features of C11 was GCC 4.9. That means, if we just check the value of __STDC_VERSION__, users on GCC 4.7 and GCC 4.8 who use -std=c11 will see really confusing error messages instead of our nice error message. Annoyingly, GCC 4.7 and 4.8 happen to still be extremely widespread versions of GCC. (Relevant: GCC Wiki's C11Status page)

The solution still seems relatively simple; just don't use -std=c11. More recent compilers default to C11 anyways, and there's no widely used compiler that I know of which will default to setting __STDC_VERSION__ to C11 without actually supporting all of C11. That works well enough, but there's one problem: GCC 4.9 supports all of C11 just fine, but only if we give it -std=c11. GCC 4.9 also seems to be one of those annoyingly widespread versions of GCC, so we'd prefer to encourage users to set -std=c11 and make the macros which rely on _Generic work in GCC 4.9.

Again, the solution seems obvious enough, if a bit ugly: if the compiler is GCC, we only use _Generic if the GCC version is 4.9 or greater and __STDC_VERSION__ is C11. If the compiler is not GCC, we just trust it if it says it supports C11. This should in theory work perfectly:

#if (__STDC_VERSION__ >= 201112L)
# ifdef __GNUC__
#  if (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#   define IS_C11
#  endif
# else
#  define IS_C11
# endif
#endif

Our new IS_C11 macro should now always be defined if we can use _Generic and always not be defined when we can't use _Generic, right?

Wrong. It turns out that in their quest to support code written for GCC, Clang also defines the __GNUC__, __GNUC_MINOR__, and __GNUC_PATCHLEVEL__ macros, specifically to fool code which checks for GCC into thinking Clang is GCC. However, it doesn't really go far enough; it defines the __GNUC_* macros to correspond to the version of Clang, not the version of GCC which Clang claims to imitate. Clang gained support for C11 in 3.6, but using our code, we would conclude that it doesn't support C11 because __GNUC__ is 3 and __GNUC_MINOR__ is 6. Update: it turns out that Clang always pretends to be GCC 4.2, but the same issue still applies; __GNUC__ is 4, and __GNUC_MINOR__ is 2, so it fails our version check. We can solve this by adding a special case for when __clang__ is defined:

#if (__STDC_VERSION__ >= 201112L)
# if defined(__GNUC__) && !defined(__clang__)
#  if (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#   define IS_C11
#  endif
# else
#  define IS_C11
# endif
#endif

Now our code works with both Clang and with GCC, and should work with all other compilers which don't try to imitate GCC - but for every compiler which does imitate GCC, we would have to add a new special case. This is starting to smell a lot like user agent strings.
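To make the user agent comparison concrete, here is a small sketch of what compiler sniffing tends to look like (compiler_name is a hypothetical helper, and I'm assuming the classic Intel compiler's __INTEL_COMPILER macro). The order matters: the imitators have to be checked before __GNUC__, because they define it too.

```c
/* Returns a best-guess name for the compiler building this file. */
static const char *compiler_name(void) {
#if defined(__clang__)
	/* Clang defines __GNUC__ as well, so it must be tested first. */
	return "clang";
#elif defined(__INTEL_COMPILER)
	/* So does the classic Intel compiler. */
	return "intel";
#elif defined(__GNUC__)
	/* Only now is it reasonably safe to assume real GCC. */
	return "gcc";
#else
	return "unknown";
#endif
}
```

Every new compiler which imitates GCC would need yet another branch above the __GNUC__ one.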

The Intel compiler is at least nice enough to define __GNUC__ and __GNUC_MINOR__ according to the version of GCC installed on the system; so even though our version check is completely irrelevant in the Intel compiler, at least it will only prevent an otherwise C11-compliant Intel compiler from using _Generic if the user has an older version of GCC installed.

User: Hi, I'm using the Intel compiler, and your library claims my compiler doesn't support C11, even though it does.

You: Upgrading GCC should solve the issue. What version of GCC do you have installed?

User: ...but I'm using the Intel compiler, not GCC.

You: Still, what version of GCC do you have?

User: 4.8, but I really don't see how that's relevant...

You: Try upgrading GCC to at least version 4.9.

(Relevant: Intel's Additional Predefined Macros page)

_Pragma in macro arguments

C has had pragma directives for a long time. It's a useful way to tell our compiler something implementation-specific; something which there's no way to say using only standard C. For example, using GCC, we could use a pragma directive to tell our compiler to ignore a warning for a couple of lines, without changing warning settings globally:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
// my_float being 0 indicates a horrible failure case.
if (my_float == 0)
	abort();
#pragma GCC diagnostic pop

We might also want to define a macro which outputs the above code, so C99 introduced the _Pragma operator, which works like #pragma, but can be used in macros. Once this code goes through the preprocessor, it will do exactly the same as the above code:

#define abort_if_zero(x) \
	_Pragma("GCC diagnostic push") \
	_Pragma("GCC diagnostic ignored \"-Wfloat-equal\"") \
	if (x == 0) \
		abort(); \
	_Pragma("GCC diagnostic pop")

abort_if_zero(my_float);

Now, imagine that we want a macro to trace certain lines; a macro which takes a line of code, and prints that line of code while executing the line. This code looks completely reasonable, right?

#define trace(x) \
	fprintf(stderr, "TRACE: %s\n", #x); \
	x

trace(abort_if_zero(my_float));

However, if we run that code through GCC's preprocessor, we see this mess:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
#pragma GCC diagnostic pop
fprintf(stderr, "TRACE: %s\n", "abort_if_zero(my_float)"); if (my_float == 0) abort();

The pragmas all got bunched up at the top! From what I've heard, this isn't against the C standard, because the standard is not entirely clear on what happens when you send in _Pragma operators as macro arguments, but it sure surprised me when I encountered it nonetheless.

For the Snow library, this means that there are certain warnings which I would have loved to only disable for a few lines, but which I have to disable for all code following the #include <snow/snow.h> line.

Side note: Clang's preprocessor does exactly what one would expect, and produces this output:

fprintf(stderr, "TRACE: %s\n", "abort_if_zero(my_float)");
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
 if (my_float == 0) abort();
#pragma GCC diagnostic pop

Line numbers in macro arguments

Until now, the quirks I've shown have been issues you could potentially encounter in decent, real-world code. If this quirk has caused issues for you however, it might be a sign that you're slightly over-using macros.

All testing code in Snow happens within macro arguments. This allows for what I think is a really nice looking API, and allows all testing code to be disabled just by changing one macro definition. This is a small example of a Snow test suite:

#include <stdio.h>
#include <snow/snow.h>

describe(files, {
	it("writes to files", {
		FILE *f = fopen("testfile", "w");
		assertneq(f, NULL);
		defer(remove("testfile"));
		defer(fclose(f));

		char str[] = "hello there";
		asserteq(fwrite(str, 1, sizeof(str), f), sizeof(str));
	});
});

snow_main();

If that assertneq or asserteq fails, we would like and expect to see a line number. Unfortunately, after the code goes through the preprocessor, the entire nested macro expansion ends up on a single line. All line number information is lost. __LINE__ just returns the number of the last line of the macro expansion, which is 14 in this case. All __LINE__ expressions inside the block we pass to describe will return the same number. I have googled around a bunch for a solution to this issue, but none of the solutions I've looked at actually solve the issue. The only actual solution I can think of is to write my own preprocessor.

Some warnings can't be disabled with pragma

Like the above example, this is probably an issue you shouldn't have come across in production code.

First, some background. In Snow, both the code which is being tested and the test cases can be in the same file. This is to make it possible to test static functions and other functionality which isn't part of the component's public API. The idea is that at the bottom of the file, after all non-testing code, one should include <snow/snow.h> and write the test cases. In a non-testing build, all the testing code will be removed by the preprocessor, because the describe(...) macro expands to nothing unless SNOW_ENABLED is defined.

My personal philosophy is that your regular builds should not have -Werror, and that your testing builds should have as strict warnings as possible and be compiled with -Werror. Your users may be using a different compiler version from you, and that compiler might produce some warnings which you haven't fixed yet. Being a user of a rolling release distro, with a very recent version of GCC, I have way too often had to edit someone else's Makefile and remove -Werror just to make their code compile. Compiling the test suite with -Werror and regular builds without -Werror has none of the drawbacks of using -Werror for regular builds, and most or all of the advantages (at least if you don't accept contributions which break your test suite).

This all means that I want to be able to compile all files with at least -Wall -Wextra -Wpedantic -Werror, even if the code includes <snow/snow.h>. However, Snow contains code which produces warnings (and therefore errors) with those settings; among other things, it uses some GNU extensions which aren't actually part of the C standard.

I would like to let users of Snow compile their code with at least -Wall -Wextra -Wpedantic -Werror, but Snow has to disable at least -Wpedantic for all code after the inclusion of the library. In theory, that shouldn't be an issue, right? We just include #pragma GCC diagnostic ignored "-Wpedantic" somewhere.

Well, as it turns out, disabling -Wpedantic with a pragma doesn't disable all the warnings enabled by -Wpedantic; there are some warnings which are impossible to disable once they're enabled. One such warning is about using directives (like #ifdef) inside macro arguments. As I explained earlier, everything in Snow happens inside of macro arguments. That means that when compiling with -Wpedantic, this code produces a warning which it's impossible to disable without removing -Wpedantic from the compiler's arguments:

describe(some_component, {
#ifndef __MINGW32__
	it("does something which can't be tested on mingw", {
		/* ... */
	});
#endif
});

That's annoying, because it's perfectly legal in GNU's dialect of C. The only reason we can't do it is that it just so happens to be impossible to disable that particular warning with a pragma.

To be completely honest, this issue makes complete sense. I imagine the preprocessor stage, which is where macros are expanded, doesn't care much about pragmas. It feels unnecessary to implement pragma parsing for the preprocessor just in order to let people compile files with -Wpedantic but still selectively disable this particular warning. That doesn't make it less annoying though.

Funnily enough, I encountered this issue while writing Snow's test suite. My solution was to just define a macro called NO_MINGW which is empty if __MINGW32__ is defined, and expands to the contents of its arguments otherwise.
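A minimal sketch of that workaround could look like the following; the variadic signature and the count_steps helper are my assumptions for the sake of the example, not necessarily what Snow's test suite actually contains:

```c
/* NO_MINGW(...) disappears on MinGW and passes its arguments through
 * unchanged everywhere else, so no #ifdef is needed inside the block. */
#ifdef __MINGW32__
# define NO_MINGW(...)
#else
# define NO_MINGW(...) __VA_ARGS__
#endif

/* Hypothetical helper: counts how many steps actually got compiled in. */
static int count_steps(void) {
	int steps = 0;
	steps += 1;             /* always compiled */
	NO_MINGW(steps += 1;)   /* compiled everywhere except MinGW */
	return steps;
}
```

Because the conditional lives in the macro definition rather than in the macro argument, the preprocessor never sees a directive inside an argument list, and -Wpedantic has nothing to complain about.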


Some obscure C features you might not know about

Date: 2018-01-25
Git: https://gitlab.com/mort96/blog/blob/published/content/00000-home/00010-obscure-c-features.md

I have been working on Snow, a unit testing library for C. I wanted to see how close I could come to making a DSL (domain specific language) with its own syntax and features, using only the C preprocessor and more obscure C features and GNU extensions. I will not go into detail about how Snow works unless it's directly relevant, so I recommend taking a quick look at the readme on the GitHub page.

Sending blocks as arguments to macros

Let's start with the trick that's probably both the most useful in everyday code, and the least technically complicated.

Originally, I defined macros like describe, subdesc, and it similar to this:

#define describe(name, block) \
	void test_##name() { \
		/* some code, omitted for brevity */ \
		block \
		/* more code */ \
	}

The intended use would then be like this:

describe(something, {
	/* code */
});

The C preprocessor doesn't really understand the code; it only copies and pastes strings around. It splits the string between the opening ( and the closing ) by comma; that means, in this case, something would be sent in as the first argument, and { /* code */ } as the second argument (pretend /* code */ is actual code; the preprocessor actually strips out comments). The C preprocessor is smart enough to know that you might want to pass function calls to macros, and function calls contain commas, so parentheses will "guard" the commas they contain. describe(something, foo(10, 20)) would therefore pass something as the first argument, and foo(10, 20) as the second argument.
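A quick way to see where the preprocessor draws the line between arguments is to stringify one of them. This is just an illustrative sketch; second_arg and guarded are hypothetical names:

```c
/* Stringifies its second argument, so we can see exactly what the
 * preprocessor considers that argument to be. */
#define second_arg(a, b) #b

/* The comma inside foo(...) is guarded by the parentheses, so the whole
 * call arrives as a single argument. foo is never called or even
 * declared; it only ever exists as text. */
static const char *guarded = second_arg(something, foo(10, 20));
```

Here guarded ends up pointing at the string "foo(10, 20)".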

Now, we're not passing in function calls, but blocks. The preprocessor only considers parentheses; braces { } or brackets [ ] don't guard their contents. That means this call will fail:

describe(something, {
	int a, b;
	/* code */
});

The preprocessor will interpret something as the first argument, { int a as the second argument, and b; /* code */ } as the third argument, but describe only takes two arguments! The preprocessor will halt and show an error message.

So, how do we fix this? Not being able to write commas outside of parentheses in our blocks is quite the limitation. Not only does it prevent us from declaring multiple variables in one statement, it also messes with array declarations like int foo[] = { 10, 20, 30 };.

Well, the preprocessor supports variadic macros; macros which can take an unlimited amount of arguments. The way they are implemented is that any extra arguments (indicated by ... in the macro definition) are made available through the __VA_ARGS__ identifier; __VA_ARGS__ is replaced with all the extra arguments separated by commas. So, what happens if we define the macro like this?

#define describe(name, ...) \
	void test_##name() { \
		/* some code, omitted for brevity */ \
		__VA_ARGS__ \
		/* more code */ \
	}

Let's call describe like we did above:

describe(something, {
	int a, b;
	/* code */
});

Now, the arguments will be interpreted the same way as before; something will be the first argument, { int a will be the second argument, and b; /* code */ } will be the third. However, __VA_ARGS__ will be replaced by the second and third argument with a comma in between, and together they produce { int a, b; /* code */ }, just as we intended. The entire describe call will be expanded into this (with added newlines and indentation for clarity; the actual preprocessor would put it all on one line):

void test_something() {
	/* some code, omitted for brevity */
	{
		int a, b;
		/* code */
	}
	/* more code */
}

And just like that, we successfully passed a block of code, with unguarded commas, to a macro.

Credit for this solution goes to this stackoverflow answer.
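Here's a small compilable sketch of the same trick outside of Snow. The define_function macro, the result variable, and sum_of_array are made up for this example:

```c
/* Hypothetical macro: defines a function called `name` whose body is the
 * block passed as the remaining arguments. Whatever the block stores in
 * `result` is what the function returns. */
#define define_function(name, ...) \
	static int name(void) { \
		int result = 0; \
		__VA_ARGS__ \
		return result; \
	}

/* The block contains unguarded commas, both in the array initializer
 * and in the declaration of i and n. */
define_function(sum_of_array, {
	int values[] = { 10, 20, 30 };
	int i, n = 3;
	for (i = 0; i < n; ++i)
		result += values[i];
})
```

The preprocessor splits the block into several arguments at each unguarded comma, and __VA_ARGS__ glues them back together, so sum_of_array() returns 60.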

Generic macros with _Generic

I wanted to be able to use one set of macros, asserteq and assertneq, to be able to do most simple equality checks, instead of having to write asserteq_str for strings, asserteq_int for integers, etc. The C11 standard added the _Generic keyword, which sounds like it's perfect for that; given a list of types and expressions, _Generic will choose the expression whose associated type is compatible with a controlling expression. For example, this code will print "I am an int":

_Generic(10,
	int: printf("I am an int\n"),
	char *: printf("I am a string\n")
);

By itself, _Generic isn't terribly useful, but it can be used to make faux-generic function-like macros. The cppreference.com page uses the example of a generic cbrt (cube root) macro:

#define cbrt(x) _Generic((x), \
	long double: cbrtl, \
	float: cbrtf, \
	default: cbrt)(x)

Calling cbrt on a long double will now call cbrtl, while calling cbrt on a double will call the regular cbrt function, etc. Note that _Generic is not part of the preprocessor; the preprocessor will just spit out the _Generic syntax with x replaced with the macro's argument, and it's the actual compiler's job to figure out what type the controlling expression is and choose the appropriate expression.

I have a bunch of asserteq functions for the various types; asserteq_ptr(void *a, void *b), asserteq_int(intmax_t a, intmax_t b), asserteq_str(const char *a, const char *b), etc. (In reality, the function signatures are a lot uglier, and they're prefixed with _snow_, but for the sake of this article, I'll pretend they look like void asserteq_<suffix>(<type> a, <type> b)).

At first glance, _Generic looks perfect for this use case; just define an asserteq macro like this:

#define asserteq(a, b) _Generic((b), \
	const char *: asserteq_str, \
	char *: asserteq_str, \
	void *: asserteq_ptr, \
	int: asserteq_int)(a, b)

It's sadly not that simple. _Generic will match only specific types; int matches only int, not long. void * matches void pointers, not any other form of pointer. There's no way to say "match every pointer type", for example.
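To illustrate how strict the matching is, here's a sketch with a hypothetical kind_of macro which reports the association that got picked:

```c
#include <stdio.h>

/* Reports which association _Generic selects for an expression. */
#define kind_of(x) _Generic((x), \
	int: "int", \
	long: "long", \
	void *: "void pointer", \
	default: "something else")

/* Prints the selected association for a few expressions. */
static void print_kinds(void) {
	int n = 0;
	int *p = &n;

	printf("%s\n", kind_of(n));         /* int */
	printf("%s\n", kind_of(1L));        /* long, not int */
	printf("%s\n", kind_of((void *)p)); /* void pointer */
	printf("%s\n", kind_of(p));         /* something else: int * is not void * */
}
```

An int * falls through to default even though it would implicitly convert to void * in a normal function call; _Generic only looks at the exact type.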

However, there is a default clause, just like in switch statements. My first solution was to just pass anything not otherwise specified to asserteq_int, and use _Pragma (like #pragma, but can be used inside macros) to ignore the warnings:

#define asserteq(a, b) \
	do { \
		_Pragma("GCC diagnostic push") \
		_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
		_Generic((b), \
			const char *: asserteq_str, \
			char *: asserteq_str, \
			default: asserteq_int)(a, b) \
		_Pragma("GCC diagnostic pop") \
	} while (0)

That solution worked but it's not exactly nice. I assume it would eventually break, either due to compiler optimizations or due to weird systems where an intmax_t is smaller than a pointer or whatever. Luckily, the good people over in ##C@freenode had an answer: subtracting a pointer from a pointer results in a ptrdiff_t! That means we can nest _Generics, and appropriately choose asserteq_int for any integer types, or asserteq_ptr for any pointer types:

#define asserteq(a, b) _Generic((b), \
	const char *: asserteq_str, \
	char *: asserteq_str, \
	default: _Generic((b) - (b), \
		ptrdiff_t: asserteq_ptr, \
		default: asserteq_int))(a, b)
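As a self-contained sketch of the nested _Generic idea, here's a version built on three tiny comparison functions. The names is_equal, eq_str, eq_ptr and eq_int are stand-ins I made up; Snow's real functions look different. One caveat I'm aware of: on platforms where ptrdiff_t is the same type as long (typical for 64-bit Linux), passing a long will select the pointer branch, so this sketch is only meant for int-sized integers and object pointers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in comparison functions; each returns 1 if a and b are equal. */
static int eq_str(const char *a, const char *b) { return strcmp(a, b) == 0; }
static int eq_ptr(const void *a, const void *b) { return a == b; }
static int eq_int(intmax_t a, intmax_t b) { return a == b; }

/* Strings are matched by exact type. For everything else, (b) - (b) has
 * type ptrdiff_t when b is a pointer, which the inner _Generic uses to
 * pick eq_ptr; any other type falls back to eq_int. Both _Generic
 * expressions select a function, which is then called once with (a, b). */
#define is_equal(a, b) _Generic((b), \
	const char *: eq_str, \
	char *: eq_str, \
	default: _Generic((b) - (b), \
		ptrdiff_t: eq_ptr, \
		default: eq_int))(a, b)
```

With this, is_equal(5, 5) goes through eq_int, is_equal(p, p) for some int *p goes through eq_ptr, and comparing two strings goes through eq_str.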

Defer, label pointers, and goto *(void *)

I once saw a demonstration of Golang's defer statement, and fell in love. It immediately struck me as a much better way to clean up than relying solely on the try/catch stuff we've been used to ever since 1985. Naturally, I wanted to use that for tearing down test cases in Snow, but there's not exactly any obvious way to implement it in C.

For those unfamiliar with it, in Go, defer is basically a way to say, "run this expression once the function returns". It works like a stack; when the function returns, the most recently deferred expression will be executed first, and the first deferred expression will be executed last. The beautiful part is that even if the function returns early, either because some steps can be skipped, or because something failed, all the appropriate deferred expressions, and only the appropriate deferred expressions, will be executed. Replace "function" with "test case", and it sounds perfect for tearing down tests.

So, how would you implement that in C? Well, it turns out that GCC has two useful non-standard extensions (which are also supported by Clang by the way): local labels, and labels as values.

Local labels are basically regular labels which you can jump to with goto, but instead of being global to the entire function, they're only available in the block they're declared in. That's fairly straightforward. You declare that a label should be block scoped by just putting __label__ label_name; at the top of the block, and then you can use label_name: anywhere within the block to actually create the label. A goto label_name from anywhere within the block will then go to the label, as expected.
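Local labels matter most inside macros, because a macro containing an ordinary label can only be used once per function. Here's a sketch using a hypothetical find_zero macro (this needs GCC or Clang, since __label__ is a GNU extension):

```c
/* Stores the index of the first zero in arr (of length len) in *out,
 * or -1 if there is none. The `done` label is local to the macro's
 * block, so the macro can be expanded several times in one function. */
#define find_zero(arr, len, out) \
	do { \
		__label__ done; \
		int i_; \
		*(out) = -1; \
		for (i_ = 0; i_ < (len); ++i_) { \
			if ((arr)[i_] == 0) { \
				*(out) = i_; \
				goto done; \
			} \
		} \
		done:; \
	} while (0)

/* Uses the macro twice in the same function; without __label__, the
 * second expansion would be a duplicate definition of `done`. */
static int demo_find_zero(void) {
	int a[] = { 3, 0, 5 };
	int b[] = { 1, 2, 0 };
	int ia, ib;

	find_zero(a, 3, &ia);
	find_zero(b, 3, &ib);
	return ia * 10 + ib; /* 1 * 10 + 2 */
}
```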

Labels as values is weirder. GCC adds a new unary && operator, which gets a pointer to a label as a void *. Moreover, if you save that pointer in a variable which is accessible outside the block, you can jump back in to that block from outside of it, even though it's a local label. This will print "hello" in an infinite loop:

{
	void *somelabel;

	{
		__label__ lbl;
		lbl:
		somelabel = &&lbl;
		printf("hello\n");
	}

	goto *somelabel;
}

Yes, the somelabel is a void *. Yes, we dereference somelabel to go to it. I don't know how that works, but the important part is that it does. Other than being dereferenceable, the void * we get from the unary && works exactly like any other void *, and can even be in an array. Knowing this, implementing defer isn't too hard; here's a simplified implementation of the it(description, block) macro (using the __VA_ARGS__ trick from before) which describes one test case, and the defer(expr) macro which can be used inside the it block:

#define it(description, ...) \
	do { \
		__label__ done_label; \
		void *defer_labels[32]; \
		int defer_count = 0; \
		int run_defer = 0; \
		__VA_ARGS__ \
		done_label: \
		run_defer = 1; \
		if (defer_count > 0) { \
			defer_count -= 1; \
			goto *defer_labels[defer_count]; \
		} \
	} while (0)

#define defer(expr) \
	do { \
		__label__ lbl; \
		lbl: \
		if (run_defer) { \
			expr; \
			/* Go to the previous defer, or the end of the `it` block */ \
			if (defer_count > 0) { \
				defer_count -= 1; \
				goto *defer_labels[defer_count]; \
			} else { \
				goto done_label; \
			} \
		} else { \
			defer_labels[defer_count] = &&lbl; \
			defer_count += 1; \
		} \
	} while (0)

That might not be the most understandable code you've ever seen, but let's break it down with an example.

it("whatever", {
	printf("Hello World\n");
	defer(printf("world\n"));
	defer(printf("hello "));
});

Running that through the preprocessor, we get this code:

do {
	__label__ done_label;
	void *defer_labels[32];
	int defer_count = 0;
	int run_defer = 0;

	{
		printf("Hello World\n");

		do {
			__label__ lbl;
			lbl:
			if (run_defer) {
				printf("world\n");

				/* Go to the previous defer, or the end of the `it` block */
				if (defer_count > 0) {
					defer_count -= 1;
					goto *defer_labels[defer_count];
				} else {
					goto done_label;
				}
			} else {
				defer_labels[defer_count] = &&lbl;
				defer_count += 1;
			}
		} while (0);

		do {
			__label__ lbl;
			lbl:
			if (run_defer) {
				printf("hello ");

				/* Go to the previous defer, or the end of the `it` block */
				if (defer_count > 0) {
					defer_count -= 1;
					goto *defer_labels[defer_count];
				} else {
					goto done_label;
				}
			} else {
				defer_labels[defer_count] = &&lbl;
				defer_count += 1;
			}
		} while (0);
	}

	done_label:
	run_defer = 1;
	if (defer_count > 0) {
		defer_count -= 1;
		goto *defer_labels[defer_count];
	}
} while (0);

That's still not extremely obvious at first sight, but it's at least more obvious than staring at the macro definitions. The first time through, run_defer is false, so both the defer blocks will just add their labels to the defer_labels array and increment defer_count. Then, just through normal execution (without any goto), we end up at the label called done_label, where we set run_defer to true. Because defer_count is 2, we decrement defer_count and jump to defer_labels[1], which is the last defer.

This time, because run_defer is true, we run the deferred expression printf("hello "), decrement defer_count again, and jump to defer_labels[0], which is the first defer.

The first defer runs its expression, printf("world\n"), but because defer_count is now 0, we jump back to done_label. defer_count is of course still 0, so we just exit the block.

The really nice thing about this system is that a failing assert can at any time just say goto done_label, and only the expressions which were deferred before the goto will be executed.

(Note: in the actual implementation in Snow, defer_labels is of course a dynamically allocated array which is realloc'd when necessary. It's also global to avoid an allocation and free for every single test case. I omitted that part because it's not that relevant, and would've made the example code unnecessarily complicated.)
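To show the early exit in action, here's a sketch which reuses the simplified it and defer macros from above (GCC or Clang only) and adds a hypothetical fail() macro standing in for a failing assert. The trace_log buffer, note() and run_case() are also made up for this example; they just record the order in which things run.

```c
#include <string.h>

/* Records the order of events as a string, e.g. "1A". */
static char trace_log[64];
static void note(const char *s) { strcat(trace_log, s); }

#define it(description, ...) \
	do { \
		__label__ done_label; \
		void *defer_labels[32]; \
		int defer_count = 0; \
		int run_defer = 0; \
		__VA_ARGS__ \
		done_label: \
		run_defer = 1; \
		if (defer_count > 0) { \
			defer_count -= 1; \
			goto *defer_labels[defer_count]; \
		} \
	} while (0)

#define defer(expr) \
	do { \
		__label__ lbl; \
		lbl: \
		if (run_defer) { \
			expr; \
			if (defer_count > 0) { \
				defer_count -= 1; \
				goto *defer_labels[defer_count]; \
			} else { \
				goto done_label; \
			} \
		} else { \
			defer_labels[defer_count] = &&lbl; \
			defer_count += 1; \
		} \
	} while (0)

/* Hypothetical stand-in for a failing assert: bail out of the test case. */
#define fail() goto done_label

/* Registers one defer, fails, and never reaches the second defer. */
static const char *run_case(void) {
	trace_log[0] = '\0';
	it("stops early", {
		defer(note("A"));
		note("1");
		fail();
		defer(note("B"));
		note("2");
	});
	return trace_log;
}
```

run_case() returns "1A": the code before the failure ran, the one defer which had been registered by then ran, and neither the second defer nor the rest of the block was ever touched.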

Update: A bunch of people on Reddit and Hacker News have suggested ways to accomplish what the next section attempts: automatically calling all functions created by describe. I ended up using the __attribute__((constructor)) function attribute, which makes a given function execute before the main function. Basically, each describe creates a function called test_##name, and a constructor function called _snow_constructor_##name whose only job is to add test_##name to a global list of functions. Here's the code: https://github.com/mortie/snow/blob/7ee25ebbf0edee519c6eb6d36b82d784b0fdcbfb/snow/snow.h#L393-L421
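A stripped-down sketch of that constructor approach might look like this. It is not Snow's actual code: the fixed-size registered array, MAX_TESTS, the constructor_##name naming, and run_all are simplifications I'm assuming for the example, and __attribute__((constructor)) is a GCC/Clang extension.

```c
#define MAX_TESTS 64

typedef void (*test_fn)(void);

/* Global list of test functions, filled in before main() runs. */
static test_fn registered[MAX_TESTS];
static int registered_count = 0;
static int tests_run = 0;

/* Each describe defines the test function itself, plus a constructor
 * whose only job is to append that function to the global list. */
#define describe(name, ...) \
	static void test_##name(void) { \
		__VA_ARGS__ \
	} \
	__attribute__((constructor)) \
	static void constructor_##name(void) { \
		if (registered_count < MAX_TESTS) \
			registered[registered_count++] = test_##name; \
	}

describe(foo, { tests_run += 1; })
describe(bar, { tests_run += 10; })

/* Calls every registered test and returns the accumulated counter. */
static int run_all(void) {
	int i;
	for (i = 0; i < registered_count; ++i)
		registered[i]();
	return tests_run;
}
```

By the time main starts, both constructors have already run, so run_all() finds two entries without anyone having listed test_foo and test_bar by hand.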

Automatically call all functions created by describe

The describe macro is meant to be used at the top level, outside of functions, because it creates functions. It's basically just this:

#define describe(name, ...) \
	void test_##name() { \
		__VA_ARGS__ \
	}

Calling describe(something, {}) will create a function called test_something. Currently, that function has to be called manually, because no other part of Snow knows what the function is named. If you have used the describe macro to define the functions test_foo, test_bar, and test_baz, the main function will look like this:

snow_main({
	test_foo();
	test_bar();
	test_baz();
})

I would have loved it if snow_main could just know what functions are declared by describe, and automatically call them. I will go over a couple of ways I tried, which eventually turned out to not be possible, and then one way which would definitely work, but which is a little too crazy, even for me.

Static array of function pointers

What if, instead of just declaring functions with describe, we also appended them to an array of function pointers? What if snow.h contained code like this:

void (*described_functions[512])();

#define describe(name, ...) \
	void test_##name() { \
		__VA_ARGS__ \
	} \
	described_functions[__COUNTER__] = &test_##name

__COUNTER__ is a special macro which starts at 0, and is incremented by one every time it's referenced. That means that assuming nothing else uses __COUNTER__, this solution would have worked, and would have been relatively clean, if only it was valid syntax. Sadly, you can't set the value of an index in an array like that in the top level in C, only inside functions.

Appending to a macro

What if we had a macro which we appended test_##name(); to every time a function is declared by describe? It turns out that this is almost possible using some obscure GCC extensions. I found this solution on StackOverflow:

#define described_functions test_foo();

#pragma push_macro("described_functions")
#undef described_functions
#define described_functions _Pragma("pop_macro(\"described_functions\")") described_functions test_bar();

#pragma push_macro("described_functions")
#undef described_functions
#define described_functions _Pragma("pop_macro(\"described_functions\")") described_functions test_baz();

described_functions // expands to test_foo(); test_bar(); test_baz();

This is actually a way to append a string to a macro which works, at least in GCC. Snow could have used that... except for one problem: you of course can't use #define from within a macro, and we would have needed to do this from within the describe macro. I have searched far and wide for a way, even a weird GCC-specific possibly pragma-related way, to redefine a macro from within another macro, but I haven't found anything. Close, but no cigar.

The way which actually works

I mentioned that there is actually one way to do it. Before I show you, I need to cover dlopen and dlsym.

void *dlopen(const char *filename, int flags) opens a binary (usually a shared object... usually), and returns a handle. Giving dlopen NULL as the file name gives us a handle to the main program.

void *dlsym(void *handle, const char *symbol) returns a pointer to a symbol (for example a function) in the binary which handle refers to.

We can use dlopen and dlsym like this:

#include <stdio.h>
#include <dlfcn.h>

void foo() {
	printf("hello world\n");
}

int main() {
	void *h = dlopen(NULL, RTLD_LAZY);

	void *fptr = dlsym(h, "foo");
	void (*f)() = fptr;
	f();

	dlclose(h);
}

Compile that code with gcc -Wl,--export-dynamic -o something something.c -ldl, and run ./something, and you'll see it print hello world to the terminal. That means we can actually call functions dynamically based on an arbitrary string at runtime. (The -Wl,--export-dynamic is necessary to tell the linker to export the symbols, such that they're available to us through dlsym, and -ldl goes after the source file because some linkers only pick up libraries which are listed after the objects that need them.)

Being able to run functions based on a runtime C string, combined with our friend __COUNTER__, opens up some interesting possibilities. We could write a program like this:

#include <stdio.h>
#include <dlfcn.h>

/* Annoyingly, the concat_ and concat macros are necessary to
 * be able to use __COUNTER__ in an identifier name */
#define concat_(a, b) a ## b
#define concat(a, b) concat_(a, b)

#define describe(...) \
	void concat(test_, __COUNTER__)() { \
		__VA_ARGS__ \
	}

describe({
	printf("Hello from function 0\n");
})

describe({
	printf("Hi from function 1\n");
})

int main() {
	void *h = dlopen(NULL, RTLD_LAZY);
	char symbol[32] = { '\0' };

	for (int i = 0; i < __COUNTER__; ++i) {
		snprintf(symbol, 31, "test_%i", i);
		void *fptr = dlsym(h, symbol);
		void (*f)() = fptr;
		f();
	}

	dlclose(h);
}

Run that through the preprocessor, and we get:

void test_0() {
	{ printf("Hello from function 0\n"); }
}
void test_1() {
	{ printf("Hi from function 1\n"); }
}

int main() {
	void *h = dlopen(NULL, RTLD_LAZY);
	char symbol[32] = { '\0' };

	for (int i = 0; i < 2; ++i) {
		snprintf(symbol, 31, "test_%i", i);
		void *fptr = dlsym(h, symbol);
		void (*f)() = fptr;
		f();
	}

	dlclose(h);
}

That for loop in our main function will first call test_0(), then test_1().
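As for why the comment in that program calls the concat_ and concat pair necessary: the ## operator pastes its operands before they get macro-expanded, so an extra level of indirection is needed to expand them first. Here's a sketch which makes the difference visible by stringifying the result; it uses an ordinary macro called NUMBER instead of __COUNTER__ so the output is predictable, and all the names are made up for the example:

```c
/* One-level paste: the operands of ## are pasted as written. */
#define paste_direct(a, b) a ## b

/* Two-level paste: paste() expands its arguments, then paste_() joins them. */
#define paste_(a, b) a ## b
#define paste(a, b) paste_(a, b)

/* Two-level stringify, so we see the result after expansion. */
#define str_(x) #x
#define str(x) str(x)
#undef str
#define str(x) str_(x)

#define NUMBER 7

/* NUMBER is pasted before it can expand, giving item_NUMBER. */
static const char *direct = str(paste_direct(item_, NUMBER));

/* NUMBER expands to 7 first, giving item_7. */
static const char *indirect = str(paste(item_, NUMBER));
```

With __COUNTER__ in place of NUMBER, the one-level version would produce a function literally named test___COUNTER__ every time, which is why the program above goes through concat.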

I hope you understand why even though this technically works, it's not exactly something I want to include in Snow ;)


Replacing Apple TV

Date: 2015-12-18
Git: https://gitlab.com/mort96/blog/blob/published/content/00000-home/00009-replacing-apple-tv.md

For a long time now, I and my family have used an Apple TV as our media PC. Not those newfangled ones with third-party games and apps, but the older generation, those with a set of pre-installed "apps" which let you access certain quarantines of content, such as Netflix, YouTube, iTunes, etc.

The Apple TV worked well enough when accessing content from those sources. The Netflix client was good, the YouTube client kind of lackluster, and the iTunes client decent enough. There were various other "apps", but those went mostly unused. The main problem, however, affecting basically everything on the platform, is that I live in Norway; as a result, most of the time, the somewhat new TV shows or movies we want to watch simply aren't available through those sources. Often, we needed to play video files obtained through other means. This left us two options:

  1. Find a Mac, play the video file in VLC there, mirror the screen to the Apple TV. This gives us various degrees of choppy frame rate, but lets us play the video instantly after it's downloaded, and lets us use subtitles if we so desire.
  2. Spend around half an hour converting the video to mp4, and stream it to the TV with this tool. This gives smooth frame rate, but takes a while due to converting media. It also doesn't support subtitles.

One day, I decided I'd had enough. I found an old laptop, threw Linux on it, connected it to the TV, and started writing code.

Introducing MMPC

MMPC, Mort's Media PC, is the fruit of my endeavours. It's designed to be controlled from afar, with a web interface. It's also written in a modular fashion, and I'll go through what each module does.

Media Streaming

https://github.com/mortie/mmpc-media-streamer

The media streamer module is the most important module. When playing a movie or episode from a TV show, we generally have a torrent containing the media file. In the past, we would download the movie from the torrent file, and then find a way to play it on the TV when it's done. This module however lets us instead either paste in a torrent link, or upload a torrent file, and it'll stream that to a VLC window which opens on the big screen. VLC also comes with a web interface, so once you start playing a video, the browser loads the VLC web interface, and you can control video playback from there.

The control panel, letting you paste a magnet link, YouTube link, etc:

Control Panel

VLC playback controls:

Playback Controls

Remote Desktop

https://github.com/mortie/mmpc-remote-desktop

Sometimes, you need more than streaming torrent files. Netflix, for example, is very useful, whenever it has the content we want to watch, and the same applies to various other websites. As the computer is running a full Linux distro instead of some locked down version of iOS, it's useful to have a way to control it directly. However, it's also annoying to have a wireless keyboard and mouse connected to it constantly and use that. Therefore, I decided it would be nice to be able to remote control it from the browser.

Implementing remote desktop in the browser sounded like it would be an interesting challenge. However, it turned out to be surprisingly easy. There's already a library out there called jsmpg which basically does everything I need. It has a client to stream an MPEG stream to a canvas element, and a server to stream to the client using websockets. The server also has an HTTP server, which you can stream video to and have it appear in all connected clients. FFmpeg can both record the screen and output to an HTTP server.

Once I had streaming video to the client working, the rest was just listening for various events on the client (mousemove, mousedown, etc.), and sending HTTP requests to an HTTP server, which then promptly runs an xdotool command, and voila, remote desktop.
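
To make that last step a bit more concrete, here's a rough sketch of the mapping from an input event to an xdotool command line. The actual module is written in JavaScript; this is C only to keep it in line with the other code on this blog, and the enum, the function name and the error handling are all assumptions made for illustration, not the module's real protocol. The xdotool subcommands themselves (mousemove, mousedown, mouseup) are real:

```c
#include <stdio.h>

/* Hypothetical event kinds; the real module's protocol may differ */
enum pointer_event { EVENT_MOUSEMOVE, EVENT_MOUSEDOWN, EVENT_MOUSEUP };

/* Write the xdotool command line for an event into buf.
 * Only integers are formatted into the command, so nothing
 * the client sends ends up as free-form text in it.
 * Returns 0 on success, -1 on bad input or if buf is too small. */
static int xdotool_command(char *buf, size_t size,
		enum pointer_event ev, int x, int y, int button) {
	int n;

	switch (ev) {
	case EVENT_MOUSEMOVE:
		if (x < 0 || y < 0)
			return -1;
		n = snprintf(buf, size, "xdotool mousemove %d %d", x, y);
		break;
	case EVENT_MOUSEDOWN:
		n = snprintf(buf, size, "xdotool mousedown %d", button);
		break;
	case EVENT_MOUSEUP:
		n = snprintf(buf, size, "xdotool mouseup %d", button);
		break;
	default:
		return -1;
	}

	/* snprintf returns the length it wanted to write */
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A mousemove to (640, 360) would produce "xdotool mousemove 640 360", and pressing the left button (button 1 in xdotool) would produce "xdotool mousedown 1"; the server would then run the resulting command.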

Remote Desktop

Live Wallpapers

https://github.com/mortie/mmpc-wallpaper

One nice thing about the Apple TV is that it can be set to display random pictures from your photo gallery. However, those pictures have to be in your iCloud photo library, which is sort of problematic, considering I don't use Apple devices, and dislike that kind of platform lock-in for something as important as photos. I therefore moved everything from what we used to use for photos, a shared iCloud photo stream, over to a NAS, mounted that NAS on the media PC as a WebDAV volume with davfs, and wrote this module to pick a random picture every 5 seconds and set it as the wallpaper. If the picture it picks is in portrait orientation, it finds another portrait picture and puts the two side by side using ImageMagick before setting the result as the wallpaper.

Why not Plex/Kodi/whatever?

Media PC software already exists. However, both Plex and Kodi, to my knowledge, sort of expect you to have a media collection stored on hard drives somewhere. They excel at letting you browse and play that media, but we rarely find ourselves in a situation where that would be beneficial. Most of the time, we just want to watch a movie we haven't seen before and have had no reason to already have in a media library, and we generally decide what movie to watch shortly before watching it. Writing the software myself lets me tailor it specifically to our use case.

UPDATE: It has come to my attention that Kodi has some addons which let you stream torrent files. However, even with that, there are some things Kodi doesn't do:

  • Streaming from arbitrary video services - some services, like Netflix, have kodi addons, but many streaming services don't have such plugins. There's no plugin for daisuki for example. Just having a regular desktop with Google Chrome, and some links to websites on the desktop for easier access, solves this.
  • Having those unobstructed dynamic live wallpapers of pictures we've taken whenever video isn't playing is rather nice.
  • Being able to control from a laptop instead of a remote control is useful; remote controls get lost, laptops don't. Typing on a laptop keyboard is also a lot easier than with a remote control.

You could get many of the features I want from Kodi by installing, and maybe writing, lots of plugins, but I'm not convinced that would've been much easier than just writing the thousand lines of JavaScript this project required.


Housecat, my new static site generator

Date: 2015-10-08
Git: https://gitlab.com/mort96/blog/blob/published/content/00000-home/00008-housecat.md

This website has gone through several content management systems over the years. Years ago, it was WordPress, before I switched to a basic homegrown one written in PHP. A while after that, I switched to a static site generator written in JavaScript, which I called Nouwell. Now, the time has come to yet again move, as I just completed Housecat, my new static site generator.

Nouwell, like the PHP blogging system which came before it, was designed to be one complete solution to write and manage blog posts. It was an admin interface and a site builder in a complete package. With Housecat, I decided to take a different route. It's around 1500 lines of C code, compared to Nouwell's roughly 5000 lines of javascript for node.js, PHP, and HTML/CSS/JS. That's because its scope is so much more limited and well defined; take a bunch of source files in a given directory structure, and create a bunch of output files in another directory structure.

Housecat is designed to be a tool in a bigger system. It does its one thing, and, in my opinion, it does it pretty well. Currently, I'm editing this blog post with vim. I'm navigating and administrating articles with regular unix directory and file utilities. I have a tiny bash script which converts my articles from markdown to HTML before Housecat processes them. Eventually, I even plan to make a web interface for administrating things and writing blog posts and such, which will use Housecat as a back-end. To my understanding, this is what the UNIX philosophy is all about.

Housecat features a rather powerful theming system, a plugin system, pagination, and drafts (start a post with the string "DRAFT:", and it'll only be accessible through the canonical URL, not listed anywhere). It should be compatible with any sane web server, and is, of course, open source.

Now, some of you might be wondering why anyone would ever use C to write a static site generator. To be honest, the main reason I chose C was that I wanted to learn it better. However, after using it for a while, it doesn't seem like a bad choice at all. Coming mostly from JavaScript, it's refreshing to have a compiler actually tell me when something's wrong, instead of just randomly blowing up in certain situations. C certainly isn't perfect when it comes to compiler warnings, as anyone who has ever seen the phrase segmentation fault (core dumped) will tell you; however, having a compiler tell you you're wrong at all is a very nice change, and valgrind helps a lot with those segfaults. I also think that being forced to have more control over what I'm doing and what goes where helps; with JavaScript, you can generally throw enough hacks at the problem, and it disappears for a while. That strategy generally doesn't work at all in C. That isn't to say that you can't write good code in JavaScript, nor that you can't write bad code in C, but I found it nice nonetheless.


Apple and security: Are we back to using our favorite band as our passwords?

I have relatively recently started switching all my online accounts over to a password system where all my passwords are over 20 characters long, and the password is different for each and every account. Each password also contains numbers, lowercase characters, and uppercase characters. I should be safe, right? Well, not quite.

A while ago, I just randomly decided to try out Apple’s “forgot password” feature. I’m a web developer, and am sometimes curious as to how websites implement that kind of thing, so I headed over to http://iforgot.apple.com/ and typed in my Apple ID. I noticed that it gave me the option to answer security questions.

I was first greeted with this screen, asking me for my date of birth:

apple-dob

The date of birth is obviously not classified information, and is basically available to anyone who knows my name and how to use Google.

Having typed in this, I get to a new page, which looks like this:

apple-secquestions

It asks me what my favourite band is and what my first teacher’s name is. None of that is secret either; anyone who knows me knows that my favorite band is Metallica, and there are traces of that all throughout the Internet, and if it’s not in public records somewhere, anyone could just randomly ask me what my first teacher’s name was, and I’d probably answer honestly.

Anyways, typing in that information, I find something truly terrifying:

apple-terrifying

I was able to change my password. Only knowing my email address, my date of birth, my favourite band, and my first teacher, anyone could take complete control of all my Apple devices, remotely delete everything on them, access all my images, all documents, everything. And there would be nothing I could do to stop it. After some days, I would probably notice that I couldn’t log in to anything, and would call tech support, but at that point, it would already have been way too late. Anyone could by then have remotely deleted all my data, after backing it up on their machine. Only by knowing publicly available information about me, or asking me seemingly innocent questions via chat.

This isn’t even a case of me using terrible security questions either. Apple only allows you to pick from a small set of security questions, and the vast majority of them were completely inapplicable to me. I have no favourite children’s book. I’m not sure what my dream job is. I didn’t have a childhood nickname, unless we count “mort”, which isn’t really a “childhood” nickname, as it’s my current nick to this day. I don’t have a car, so I don’t know the model of my first car. I have no favourite film star or character. Et cetera. Those are all questions I could’ve chosen instead of “Who was your favourite band or singer in school?”, but none are applicable to me, and more importantly, none of them would be more secure than my current security questions.

Is this standard for security really acceptable from anyone, much less the world’s most valuable tech company, in this day and age? Are we really back to the dark ages of using birth dates, favorite bands, and other personal information as our passwords? Didn’t security experts find out that this was a bad idea a long time ago?

There are of course ways to mitigate the effects of Apple's poorly designed system. You could generate a new random password for each security question if you're using a password manager, or you could make up fake answers. I highly suggest going to https://appleid.apple.com/signin and changing your security questions right away. However, Apple's solution is still broken. I expect that the vast majority of people will give their actual personal information as the answers, because after all, that's what the website asks you to do.
