union { char *cp; int64_t *ip; } u;
u.cp = p;
return *u.ip; // bad
...because the undefined part isn’t casting the pointer, but reading through it. Nor may you read through a pointer to union, if the memory was not already typed as that union type:
union { char c[8]; int64_t i; } *up;
up = (void *) p;
return up->i; // bad
(...well, you probably shouldn’t, anyway. One of the clauses in the C standard seems to condone it, but a WG14 document[1] suggests that the clause is just misworded: it was meant to allow accessing a structure field through a pointer of the field’s type, but as written it allows the opposite instead. The document also suggests that this will be fixed in a future standard. Who knows, though.)
You can do this:
union { char c[8]; int64_t i; } u;
for (int i = 0; i < 8; i++)
    u.c[i] = p[i];
return u.i; // ok
But that’s more verbose than memcpying directly into an int64_t, and the compiler will optimize both versions into a plain load (as long as the target arch supports unaligned loads), so there’s not much point doing it that way.
If your data was declared as a union in the first place, of course, that’s different. But the OP was asking about strings, so I assumed the input in the C case was just a char *.
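For comparison, the memcpy version might look something like this. It's only a minimal sketch: load_i64 is a name I made up for illustration, and it assumes p points at 8 or more readable bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original discussion): read the
   8 bytes starting at p as an int64_t. p does not need to be aligned. */
static int64_t load_i64(const char *p)
{
    int64_t v;
    /* Copy the bytes into a real int64_t object, so no int64_t lvalue
       is ever used to read the char buffer. */
    memcpy(&v, p, sizeof v);
    return v;
}
```

The value comes back in host byte order, so the same bytes give different results on little-endian and big-endian machines.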
[1] http://open-std.org/jtc1/sc22/wg14/www/docs/n1520.htm