But this is where "be conservative in what you do" comes into play. The STEP format has formal rules for exporting all ASCII, Unicode, and ISO-8859 characters.
A well-written STEP string exporter should handle them all without difficulty, no matter what goofy things are in the string.
And again, if you're worried that there may be an attack vector, change high-bit-set characters to "[Illegal character value N]". Though it might be more merciful to assume they just wanted ISO-8859-1 characters and substitute the appropriate control code.
The tl;dr of the article is to define handling of invalid input, so that all conforming implementations will handle it in the same way, without having to reverse-engeneer eachother to be interoperable.
So you're saying that every time I find a STEP file written in an invalid fashion, I should convene an ISO 10303 committee and wait for years to find out how everyone should handle it? That's doubly insane, because it would take many bugs that can be fixed in a day and make my customers suffer from them for years, while at the same time requiring me to modify my program to handle every bug found by every STEP software vendor or cease to be conforming.
And again, if you're worried that there may be an attack vector, change high-bit-set characters to "[Illegal character value N]". Though it might be more merciful to assume they just wanted ISO-8859-1 characters and substitute the appropriate control code.