Is computed goto used for anything other than interpreter loops? Because if not, I would rather have a special "it looks like you're trying to implement an interpreter loop" case in the compiler than add new syntax.
I don't know that _efficient_ is the best word. If you use goto to force a particular control flow rather than a more constrained form of control flow (e.g. if), you make it harder for the optimiser to work; it can only make "as-if" changes, i.e. changes after which the code that executes still behaves as if it were exactly what you wrote.
The most efficient control flow is one that describes only what your algorithm needs, coupled with an optimiser that can exploit the particular flow you're describing.
Among the many things the author of https://blog.nelhage.com/post/cpython-tail-call/ discovered: Clang/LLVM was able to optimise the standard switch-based interpreter loop as if it had been written with computed gotos.
> The most efficient control flow is one that describes only what your algorithm needs,
Yes. I’m not saying goto is just faster in general. But some algorithms are difficult to describe with while and if (bunch of examples in Knuth).
> Clang/LLVM was able to optimise
Because it’s implicit, this is the kind of optimization that’s easy to silently regress a few weeks before ship by accidentally violating a rule.
I think unrestricted goto is not good. But an alternative is to make the semantics and scope stricter, something like a Common Lisp prog/tagbody block.
'Computed goto' is used in GCC to mean the same thing as assigned goto in Fortran. The Fortran variant appears to have more restrictions than the GNU C one.
No, it has the same number of branches as a switch would. The only difference is that computed-goto dispatch skips the initial range check. And proper switch implementations by a better compiler would be more optimizable, but we have been waiting more than 20 years for that already. Sparse switch dispatch tables are also compiled horribly.
No, it doesn't: in a switch inside a loop, the end of each case (after the break) compiles to an unconditional jump back to the top of the switch, where the single indirect jump is performed. With computed goto, the indirect jump appears at the end of every single case. You can see this in the assembly the author provides. I'm ignoring cases that fall through to another case, which is sometimes done in interpreters like these. And though I wasn't thinking of it, the lack of a bounds check also reduces the number of branches.
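For illustration, a minimal sketch of the two dispatch styles (hypothetical opcodes, not taken from the article; the computed-goto version relies on GNU C's labels-as-values extension, supported by GCC and Clang):

#include <stdio.h>

enum { OP_INC, OP_DEC, OP_HALT };

// Switch dispatch: one shared indirect jump at the top of the loop;
// each case ends with a plain jump back to it (the break).
static int run_switch(const unsigned char *pc) {
    int acc = 0;
    for (;;) {
        switch (*pc++) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        }
    }
}

// Computed-goto dispatch: the indirect jump is duplicated at the end of
// every handler, and there is no range check on the opcode.
static int run_goto(const unsigned char *pc) {
    static void *handlers[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;
    goto *handlers[*pc++];
op_inc:  acc++; goto *handlers[*pc++];
op_dec:  acc--; goto *handlers[*pc++];
op_halt: return acc;
}

int main(void) {
    const unsigned char program[] = { OP_INC, OP_INC, OP_DEC, OP_HALT };
    printf("%d %d\n", run_switch(program), run_goto(program)); // prints "1 1"
    return 0;
}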
The reduced branching used to bring speed improvements thanks to better branch prediction, though the gains have shrunk further and further on newer CPUs as branch prediction has improved in general.
The fact that they had become increasingly minuscule was acknowledged even ten years ago:
> Nehalem shows a few outstanding speedups (in the 30%–40% range), as well as Sandy Bridge to a lesser extent, but the average speedups (geomean of individual speedups) for Nehalem, Sandy Bridge, and Haswell are respectively 10.1%, 4.2%, and 2.8%, with a few outstanding values for each microarchitecture. The benefit of threaded code decreases with each new generation of microarchitecture.
I never understood this argument. Without RAII you can easily get screwed by resource leaks when returning early, even if you never touch goto; in this regard, using goto for cleanup is expedient. How do C programmers avoid this problem without goto?
bool function_with_cleanup(void) {
    int *buffer1 = NULL;
    int *buffer2 = NULL;
    FILE *file = NULL;
    bool success = false;

    // Allocate first resource
    buffer1 = malloc(sizeof(int) * 100);
    if (!buffer1) {
        goto cleanup; // Error, jump to cleanup
    }

    // Allocate second resource
    buffer2 = malloc(sizeof(int) * 200);
    if (!buffer2) {
        goto cleanup; // Error, jump to cleanup
    }

    // Open a file
    file = fopen("data.txt", "r");
    if (!file) {
        goto cleanup; // Error, jump to cleanup
    }

    // Do work with all resources...

    success = true; // Only set to true if everything succeeded

cleanup:
    // Free resources in reverse order of acquisition
    if (file) fclose(file);
    free(buffer2); // free() is safe on NULL pointers
    free(buffer1);

    return success;
}
bool function_with_cleanup(void) {
    int *buffer1 = NULL;
    int *buffer2 = NULL;
    FILE *file = NULL;
    bool success = false;

    // Allocate first resource
    buffer1 = malloc(sizeof(int) * 100);
    if (buffer1) {
        // Allocate second resource
        buffer2 = malloc(sizeof(int) * 200);
        if (buffer2) {
            // Open a file
            file = fopen("data.txt", "r");
            if (file) {
                // Do work with all resources...
                fclose(file);
                success = true; // Only set to true if everything succeeded
            }
            free(buffer2);
        }
        free(buffer1);
    }

    return success;
}
Much shorter and more straightforward.
One-time loops with break also work if you're not doing the resource allocation in another loop:
bool function_with_cleanup(void) {
    int *buffer1 = NULL;
    int *buffer2 = NULL;
    FILE *file = NULL;
    bool success = false;

    do { // One-time loop to break out of on error
        // Allocate first resource
        buffer1 = malloc(sizeof(int) * 100);
        if (!buffer1) {
            break; // Error, jump to cleanup
        }

        // Allocate second resource
        buffer2 = malloc(sizeof(int) * 200);
        if (!buffer2) {
            break; // Error, jump to cleanup
        }

        // Open a file
        file = fopen("data.txt", "r");
        if (!file) {
            break; // Error, jump to cleanup
        }

        // Do work with all resources...

        success = true; // Only set to true if everything succeeded
    } while (false);

    // Free resources in reverse order of acquisition
    if (file) fclose(file);
    free(buffer2); // free() is safe on NULL pointers
    free(buffer1);

    return success;
}
Still simpler to follow than goto IMHO. Both these patterns work in other languages without goto too, e.g. Python.
Open a new scope when the resource acquisition check passes, rather than the opposite (jumping to the end of the function when it fails).
It can get quite hilly, which doesn't look great. It does have the advantage that each resource is explicitly only valid in a visible scope, and there's a marker at the end to denote the valid region of the resource is ending.
EDIT: you mentioned early return; this style forbids early return (at least, any early return after the first resource acquisition).
Maybe that is exactly the problem: stop using a language designed in the 1970s that deliberately ignored the ecosystem outside Bell Labs, except where it is unavoidable.
And in that case, the C compiler places no limit on how many functions you write, so the implementation can be split up and better modularized.
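A rough sketch of what that splitting into helper functions might look like (do_work is a hypothetical helper standing in for the actual work):

static bool do_work(int *buffer1, int *buffer2, FILE *file) {
    // ... do work with all resources ...
    (void)buffer1; (void)buffer2; (void)file;
    return true;
}

bool function_with_cleanup(void) {
    // Acquire everything up front, check once, delegate the work,
    // then clean up on the single exit path.
    int *buffer1 = malloc(sizeof(int) * 100);
    int *buffer2 = malloc(sizeof(int) * 200);
    FILE *file = fopen("data.txt", "r");

    bool success = (buffer1 && buffer2 && file) && do_work(buffer1, buffer2, file);

    if (file) fclose(file);
    free(buffer2); // free() is safe on NULL pointers
    free(buffer1);
    return success;
}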
bool function_with_cleanup(void) {
    int *buffer1 = NULL;
    int *buffer2 = NULL;
    FILE *file = NULL;

    // Allocate first resource
    buffer1 = malloc(sizeof(int) * 100);
    if (!buffer1) {
        return false;
    }

    // Allocate second resource
    buffer2 = malloc(sizeof(int) * 200);
    if (!buffer2) {
        free(buffer1);
        return false;
    }

    // Open a file
    file = fopen("data.txt", "r");
    if (!file) {
        free(buffer2);
        free(buffer1);
        return false;
    }

    // Do work with all resources...

    fclose(file);
    free(buffer2);
    free(buffer1);
    return true;
}
Ah, but all those free() calls get tedious, can be forgotten, and can be mistyped:
bool function_with_cleanup(void) {
    int *buffer1 = NULL;
    int *buffer2 = NULL;
    FILE *file = NULL;

    // Allocate first resource
    buffer1 = arena_alloc(&current_arena, sizeof(int) * 100);
    if (!buffer1) {
        return false;
    }

    // Allocate second resource
    buffer2 = arena_alloc(&current_arena, sizeof(int) * 200);
    if (!buffer2) {
        arena_reset(&current_arena);
        return false;
    }

    // Open a file
    file = fopen("data.txt", "r");
    if (!file) {
        arena_reset(&current_arena);
        return false;
    }

    // Do work with all resources...

    fclose(file);
    arena_reset(&current_arena);
    return true;
}
This can still be improved with a mix of macros and varargs functions.
Or, if language extensions are acceptable, with the various ways to do defer in C.
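For example, a minimal sketch using GCC/Clang's cleanup attribute, which calls the named function with the variable's address when the variable goes out of scope (same hypothetical resources as above):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

// Each cleanup handler receives a pointer to the annotated variable.
static void free_int_buffer(int **p) { free(*p); }            // free() is safe on NULL
static void close_file(FILE **f)     { if (*f) fclose(*f); }

bool function_with_cleanup(void) {
    __attribute__((cleanup(free_int_buffer))) int  *buffer1 = malloc(sizeof(int) * 100);
    __attribute__((cleanup(free_int_buffer))) int  *buffer2 = malloc(sizeof(int) * 200);
    __attribute__((cleanup(close_file)))      FILE *file    = fopen("data.txt", "r");

    if (!buffer1 || !buffer2 || !file)
        return false; // handlers still run on this early return

    // Do work with all resources...
    return true;      // ...and on this one
}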
There are situations in practical code where goto is the only reasonable choice if you want to avoid spaghetti code. Absolutes have no place in software: you can find scenarios where almost every typically bad idea is actually a good idea.