> Ehm, that’s exactly user input. If it were machine input, there would be a normalized definition, and you’d use that internally.
That depends on the definition of "user"; most things I'm used to call only the current user of the device "the user", and treat everything else as untrusted input.
The problem with having a formal definition of the strict subset is that you end up with bugs (often security critical) in almost every implementation because of some case where the conversion produces something not in the strict subset. That's something that's happened with way too many formats.
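To make that failure mode concrete, here's a rough sketch. Everything in it is made up for illustration (the "strict subset" is just lowercase hex with an even number of digits, and `normalize` / `is_strict` are hypothetical names), but it shows how a converter can look fine and still emit things outside the subset:

```python
import re

# Hypothetical strict subset: lowercase hex, even number of digits.
STRICT = re.compile(r"(?:[0-9a-f]{2})+")


def is_strict(s: str) -> bool:
    """True only if s is entirely inside the strict subset."""
    return STRICT.fullmatch(s) is not None


def normalize(s: str) -> str:
    """Lenient-to-strict conversion: drops whitespace and lowercases.

    Deliberately buggy: it never checks the length or the character set,
    so some outputs are not in the strict subset.
    """
    return "".join(s.split()).lower()


if __name__ == "__main__":
    for raw in ("DE AD be ef", "DEA", "0xDE"):
        out = normalize(raw)
        print(repr(raw), "->", repr(out), "strict" if is_strict(out) else "NOT strict")
```

Any code downstream that assumes "it went through `normalize`, so it's strict" is now working on unchecked input.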
Usually, in such a situation, you can fail early and log a warning.
The alternative, trying to fix the developer's error with heuristics, almost always ends with worse security-critical bugs.
There’s a reason people advocate for strict typing, and proper errors, and not PHP’s "any unreferenced constant is of type string with its name being its content".
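A minimal sketch of what I mean by failing early, in Python. The format (`#rrggbb`, lowercase only) and the name `parse_color` are just assumptions for the example, not taken from any real spec:

```python
import logging
import re

log = logging.getLogger("parser")

# Assumed strict format for this example: '#' followed by six lowercase hex digits.
HEX_COLOR = re.compile(r"#[0-9a-f]{6}")


def parse_color(value: str) -> tuple[int, int, int]:
    """Parse a strict #rrggbb color, rejecting anything else outright."""
    if HEX_COLOR.fullmatch(value) is None:
        # No guessing at what the caller "probably meant": warn and fail.
        log.warning("rejecting malformed color %r", value)
        raise ValueError(f"not a strict #rrggbb color: {value!r}")
    return int(value[1:3], 16), int(value[3:5], 16), int(value[5:7], 16)
```

Here `parse_color("#ff8000")` gives `(255, 128, 0)`, while `"ff8000"` or `"#FF8000"` raise instead of being silently "repaired". The caller sees the error at the boundary, not three layers further down.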