> The whole point of UTF-8 is that it's identical to Latin-1 except in cases whe...

dlubarov · on June 9, 2017

Just in case there's any confusion:

- 7-bit ASCII encodes 128 characters; Latin-1 encodes 256 characters; UTF-8 encodes all Unicode characters

- Latin-1 and UTF-8 both extend 7-bit ASCII; the three encodings produce identical bytes for code points 0-127

- UUID characters are part of 7-bit ASCII, so they have the same encoding in ASCII, Latin-1 and UTF-8
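
A quick way to check the points above, as a rough sketch: the sample UUID is made up for illustration, and the variable names are mine.

```python
import uuid

# Hypothetical sample value; any canonical UUID string behaves the same way.
sample = str(uuid.UUID("12345678-1234-5678-1234-567812345678"))

# Every character is below code point 128, so all three encodings agree.
ascii_bytes = sample.encode("ascii")
latin1_bytes = sample.encode("latin-1")
utf8_bytes = sample.encode("utf-8")
same_for_uuid = ascii_bytes == latin1_bytes == utf8_bytes

# Above code point 127 the encodings diverge: U+00E9 is one byte in
# Latin-1 but two bytes in UTF-8, and cannot be encoded in ASCII at all.
e_acute = "\u00e9"
latin1_e = e_acute.encode("latin-1")
utf8_e = e_acute.encode("utf-8")

print(same_for_uuid, len(latin1_e), len(utf8_e))
```

So the equivalence holds for UUID text specifically, not for Latin-1 and UTF-8 in general.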

seorphates · on June 9, 2017

100% correct: UUID strings are ASCII (< 128). But there are imposters for character code 45 ("-"). Conversions or field handling built on that assumption shouldn't be treated as safe just because a UUID is "technically" ASCII. Only the keeper of the field can truly enforce that constraint.
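
To make the imposter case concrete, here is a minimal sketch of one way a field owner might enforce it. The helper name `is_strict_ascii_uuid` and the sample value are hypothetical, and U+2010 HYPHEN is just one example of a lookalike.

```python
import re

# Explicit ASCII character classes, so only code point 45 is accepted as "-".
UUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")

def is_strict_ascii_uuid(value: str) -> bool:
    """Return True only for the canonical 36-character ASCII UUID form."""
    return UUID_RE.fullmatch(value) is not None

good = "12345678-1234-5678-1234-567812345678"   # hypothetical sample
imposter = good.replace("-", "\u2010")          # U+2010 HYPHEN lookalike

print(is_strict_ascii_uuid(good))       # accepted
print(is_strict_ascii_uuid(imposter))   # rejected
print(len(imposter.encode("utf-8")))    # longer than 36 bytes in UTF-8
```

The imposter is still 36 characters long, but it isn't ASCII, so its byte form depends on the encoding and a fixed-width byte field can't be assumed to fit it.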

sillysaurus3 · on June 9, 2017

Aha, thanks for calling me out. I was indeed thinking of ASCII.