Protothreads: Stackless Concurrency in C
Published: May 18, 2026 | Tags: #c #low level #concurrency
So, you know, when you learn about concurrent software in C, you (or at least I) get that bittersweet taste in your mouth. At first I didn’t know what was wrong, but the weird taste was there. And it was clearly the weight of the threads. Don’t get me wrong, pthreads (or C11 threads) are really useful, but they are also “heavy”. And pthreads are not portable either: the ‘p’ stands for POSIX, and we know what happens with that (and if not, we can always write another blog post). Either way, they are not always the best fit. Imagine using them in embedded programming! Actually, let’s imagine that. In this article we will talk about Protothreads, a really cool library made by Adam Dunkels and Oliver Schmidt that provides stackless, lightweight threading in pure C.

But, Zuhaitz… Why not just use standard threads?
I already talked a bit about this in the introduction, but it is good to fully grasp what’s happening to understand the value of protothreads.
When you create a traditional thread, the OS allocates a dedicated stack for it. Even a small one takes kilobytes of RAM, and default sizes can reach megabytes. Also, switching between threads requires the OS to perform a context switch: saving the CPU registers of the running thread, switching to the other thread’s stack, and restoring that thread’s state. In other words, that’s overhead. And for embedded or event-driven systems this is a no-go.
That’s where protothreads come in, as they take a different approach. First of all, they are stackless, as we said: they don’t have their own call stack and simply run on the stack of whoever calls them. They also have a tiny footprint, as a protothread only needs two bytes of memory to track its state (on platforms where an unsigned short is 16 bits). And finally, no architecture-specific assembly is required. In a sense, it is the peak of portability.
How is this even possible?
If you have been paying a bit of attention, which I am sure you have, you surely feel like there’s something wrong: if there is no stack being saved, how does a thread “pause” execution and resume right where it left off?
The magic behind Protothreads relies on a slightly bizarre property of the C language: a switch statement can be used in… quite unconventional ways, because its case labels are allowed to sit inside nested blocks such as loops. Specifically, Protothreads use a technique related to Duff’s device. Duff’s device deserves its own article, but for now check Wikipedia (or ask your favorite LLM about it).
By wrapping a switch statement inside a macro, the library can jump back to the exact line of code where a thread yielded. No need for a stack! When you call a protothread function, it reads a stored state variable (those 2 bytes we mentioned), hits the switch statement, and jumps to the matching case label inside the function.
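To make the trick concrete before any macros get involved, here is a minimal hand-written sketch of the same idea. The names counter_state and next_value are made up for this illustration and are not part of the Protothreads library.

```c
/* Hypothetical example, not part of Protothreads: a function that
 * "pauses" inside a loop and resumes there on the next call. */
struct counter_state
{
    int resume_point; /* which case label to jump to on the next call */
    int i;            /* loop counter, stored outside the call stack */
};

/* Returns 1, 2, 3 on successive calls, then -1 once it has finished. */
int next_value(struct counter_state *s)
{
    switch (s->resume_point)
    {
    case 0: /* the first call starts here */
        for (s->i = 1; s->i <= 3; s->i++)
        {
            s->resume_point = 1; /* remember where to come back to */
            return s->i;         /* "yield" a value to the caller */
    case 1:;                     /* the next call lands here, inside the loop */
        }
    }
    s->resume_point = 2; /* matches no case, so later calls skip the switch */
    return -1;
}
```

Start with a zero-initialized struct counter_state and call next_value() repeatedly from your own main(). Notice that the loop counter lives in the struct instead of a local variable; that detail comes back later in this post.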
The magic: just a few macros.
So, I am a visual learner. Usually, I wouldn’t share the whole implementation of a library in a single post, but Protothreads is really simple. You can write it in a few lines of code! That’s what I find most beautiful about it.
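struct pt
{
    unsigned short lc; /* "local continuation": the line to resume at */
};

/* The macro arguments are parenthesized so that calls like PT_INIT(&my_pt) expand correctly. */
#define PT_THREAD(name_args) char name_args
#define PT_INIT(pt) (pt)->lc = 0
#define PT_BEGIN(pt) switch((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) (pt)->lc = __LINE__; case __LINE__: \
    if(!(c)) return 0
#define PT_END(pt) } (pt)->lc = 0; return 2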
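Now let’s think about how this expands during preprocessing. PT_BEGIN opens a switch block. PT_WAIT_UNTIL first saves the current source line number into lc and plants a case label with that same number; then it checks the condition c and, if it is false, returns 0 to give up control. The next time the function is called, the switch statement evaluates lc, matches the case __LINE__ label on that same statement, checks the condition again, and carries on from there once it holds. PT_END closes the switch, resets lc so the thread starts from the top next time, and returns 2 to signal that it finished. It is this simple.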
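If you want to watch this happen, here is a small sketch you can compile. It repeats the simplified macros from this post (with the arguments parenthesized, so PT_INIT(&pt) expands correctly) to stay self-contained. The waiter thread, the ticks and wakeups counters, and run_waiter are names I made up for this demo.

```c
/* The simplified macros from this post, repeated so this compiles on its own. */
struct pt { unsigned short lc; };

#define PT_THREAD(name_args) char name_args
#define PT_INIT(pt) (pt)->lc = 0
#define PT_BEGIN(pt) switch ((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) (pt)->lc = __LINE__; case __LINE__: \
    if (!(c)) return 0
#define PT_END(pt) } (pt)->lc = 0; return 2

/* Hypothetical state, for this demo only. */
static int ticks = 0;   /* advanced by the caller between calls */
static int wakeups = 0; /* how many waits the thread has passed */

PT_THREAD(waiter(struct pt *pt))
{
    PT_BEGIN(pt);
    PT_WAIT_UNTIL(pt, ticks >= 2); /* returns 0 until ticks reaches 2 */
    wakeups++;
    PT_WAIT_UNTIL(pt, ticks >= 4); /* different line, so a different case label */
    wakeups++;
    PT_END(pt);
}

/* Calls the thread once per tick; returns how many calls it took to finish. */
int run_waiter(void)
{
    struct pt pt;
    int calls = 0;

    ticks = 0;
    wakeups = 0;
    PT_INIT(&pt);
    for (;;)
    {
        calls++;
        if (waiter(&pt) == 2) /* 2 means the thread reached PT_END */
            return calls;
        ticks++;
    }
}
```

Call run_waiter() from your own main(). Every call to waiter() that returns 0 is one "pause": the function really does return, and the only thing that remembers where it was is pt.lc.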
Time to code!
So, let’s also look at how a simple program made with Protothreads could look:
#include "pt.h"
#include <stdbool.h>

// This struct holds the 2-byte state of our thread.
struct pt my_pt;
// struct timer, its helpers and the I/O functions are assumed to come from your own code.
struct timer my_timer;

// Our protothread function.
PT_THREAD(example_thread(struct pt *pt))
{
    // Don't forget about this!!!
    PT_BEGIN(pt);
    while (true)
    {
        if (initiate_io())
        {
            timer_start(&my_timer);
            // The thread will "block" here without locking up the CPU.
            // It will return control to the caller until the condition is met.
            PT_WAIT_UNTIL(pt, io_completed() || timer_expired(&my_timer));
            read_data();
        }
    }
    // Don't forget about this either!!!
    PT_END(pt);
}
The preprocessed function.
char example_thread(struct pt *pt)
{
    switch((pt)->lc) { case 0:
    while(true)
    {
        if(initiate_io())
        {
            timer_start(&my_timer);
            (pt)->lc = 21; case 21: if(!(io_completed() || timer_expired(&my_timer))) return 0;
            read_data();
        }
    }
    } (pt)->lc = 0; return 2;
}
I would recommend using a formatter, but as you can see this is valid C code! The 21 is nothing special: it is simply the line of my source file where PT_WAIT_UNTIL sat. When the thread returns 0, it drops off the call stack completely. Then, on the next invocation, the function hits the switch on pt->lc, sees lc == 21, jumps directly into the middle of the loop, checks the condition again, and continues executing linearly if it passes.
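Since a waiting protothread simply returns, "scheduling" is nothing more than calling the functions again in a loop. Here is a rough sketch of two threads taking turns. It repeats the simplified macros from this post (arguments parenthesized) so it compiles on its own; thread_a, thread_b, turn, trace and run_ping_pong are names invented for this example.

```c
struct pt { unsigned short lc; };

#define PT_THREAD(name_args) char name_args
#define PT_INIT(pt) (pt)->lc = 0
#define PT_BEGIN(pt) switch ((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) (pt)->lc = __LINE__; case __LINE__: \
    if (!(c)) return 0
#define PT_END(pt) } (pt)->lc = 0; return 2

/* Hypothetical shared state, for this demo only. */
static int turn;       /* 0: thread A may run, 1: thread B may run */
static char trace[32]; /* records the order in which the threads ran */
static int trace_len;

static void record(char who)
{
    if (trace_len < (int)sizeof trace - 1)
        trace[trace_len++] = who;
}

PT_THREAD(thread_a(struct pt *pt))
{
    PT_BEGIN(pt);
    for (;;)
    {
        PT_WAIT_UNTIL(pt, turn == 0);
        record('A');
        turn = 1; /* hand over to B */
    }
    PT_END(pt);
}

PT_THREAD(thread_b(struct pt *pt))
{
    PT_BEGIN(pt);
    for (;;)
    {
        PT_WAIT_UNTIL(pt, turn == 1);
        record('B');
        turn = 0; /* hand over to A */
    }
    PT_END(pt);
}

/* The whole "scheduler": call each thread once per round. */
const char *run_ping_pong(int rounds)
{
    struct pt pt_a, pt_b;
    int i;

    turn = 0;
    trace_len = 0;
    PT_INIT(&pt_a);
    PT_INIT(&pt_b);
    for (i = 0; i < rounds; i++)
    {
        thread_a(&pt_a); /* runs until its next unmet wait, then returns */
        thread_b(&pt_b);
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Each thread keeps its own struct pt, so both resume at their own wait even though they share one stack. In a real system the loop body would typically sit in your main loop or be triggered by events.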
IMPORTANT!
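Protothreads don’t have their own stack, which means a local variable does not keep its value across a blocking call: the function returns at the wait, and the next call jumps past the variable’s initialization. To get around this, any state that must persist across a PT_WAIT_UNTIL must be declared as static or stored in a struct that you pass to the thread. Keep in mind that a static variable is shared by every thread running that function.
Also, DO NOT put two PT_WAIT_UNTIL calls on the same line. If you do, both case labels will have the same number, and the compiler will reject the duplicate case!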
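On the first point, here is a sketch of the "state in a struct" approach, under the same simplified macros as before (repeated, with parenthesized arguments, so it compiles on its own). The struct consumer, the ready flag and run_consumer are hypothetical names for this example only.

```c
struct pt { unsigned short lc; };

#define PT_THREAD(name_args) char name_args
#define PT_INIT(pt) (pt)->lc = 0
#define PT_BEGIN(pt) switch ((pt)->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c) (pt)->lc = __LINE__; case __LINE__: \
    if (!(c)) return 0
#define PT_END(pt) } (pt)->lc = 0; return 2

/* Everything the thread must remember across a wait lives here. */
struct consumer
{
    struct pt pt; /* where to resume */
    int i;        /* loop counter: must NOT be a plain local variable */
    int sum;      /* running result */
};

static int ready; /* hypothetical "data available" flag set by the caller */

PT_THREAD(consume(struct consumer *c))
{
    PT_BEGIN(&c->pt);
    for (c->i = 0; c->i < 3; c->i++)
    {
        PT_WAIT_UNTIL(&c->pt, ready); /* the function returns here while waiting */
        ready = 0;
        c->sum += c->i; /* c->i still holds the right value after resuming */
    }
    PT_END(&c->pt);
}

/* Feeds the thread until it finishes and returns the accumulated sum. */
int run_consumer(void)
{
    struct consumer c = { {0}, 0, 0 };

    ready = 0;
    PT_INIT(&c.pt);
    while (consume(&c) != 2)
        ready = 1; /* pretend new data arrived before the next call */
    return c.sum;
}
```

Compared with static variables, the struct has one extra benefit: you can run several instances of the same protothread function, each with its own struct consumer, without them stepping on each other.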
Conclusion.
Protothreads are beautiful because they are simple. Sure, abusing the preprocessor is not something I will often agree with, but for this use case it does make sense. Also, threads in general are quite interesting implementation-wise. They will take more than just ~10 lines of code, but in a few hundred we could implement them. I might do so in a future blog post. But for now, remember that if you are working with microcontrollers, or writing network protocol stacks (or you are just bored), this is a good tool to keep in mind.