
Well, for example, MySQL/MariaDB using utf8 tables will instantly go down if someone inserts a single multibyte emoji character, and the only way out is to recreate all tables as utf8mb4 and reimport all data.



Surely nobody would use that format and allow a commit message including emojis to cause an effective DoS for a large SonarQube project.


It doesn't block inserts with invalid data? I thought that was the whole point of telling the database what types you're using.


MySQL historically isn't very good about blocking bad data. Sometimes it would silently truncate strings to fit the column type, for example. It's getting better as time goes on, though.


It does, and the poster above is incompetent.


I have had customer production sites go down due to this issue when emojis first arrived. It was a common issue in 2015. I would hope it is fixed by now!


Having dealt with utf8mb4 data being inserted into the utf8mb3 columns many many times in the past, I've never had a table "instantly go down". You either get silent truncation or a refusal to insert the data.


Well, your applications haven’t used a serialized or JSON column. That’s how you go from truncation to downtime.

That said, I do remember this being an issue even with plain text.
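
To make the truncation-to-downtime path concrete, here is a rough Python sketch. It does not talk to MySQL at all: `truncate_like_utf8mb3` is a hypothetical helper that only imitates the "cut the string at the first 4-byte character" behavior described in this thread for non-strict mode, and the payload is made up for illustration.

```python
import json


def truncate_like_utf8mb3(text: str) -> str:
    """Hypothetical stand-in for the truncation described above:
    drop everything from the first character outside the BMP onward."""
    for i, ch in enumerate(text):
        if ord(ch) > 0xFFFF:  # needs 4 bytes in UTF-8
            return text[:i]
    return text


def is_valid_json(s: str) -> bool:
    """Return True if s parses as JSON, False otherwise."""
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False


# ensure_ascii=False keeps the emoji as a raw character in the payload;
# the default (True) would escape it to ASCII and sidestep the problem.
payload = json.dumps({"msg": "fix build 🎉", "author": "alice"}, ensure_ascii=False)
stored = truncate_like_utf8mb3(payload)

print(is_valid_json(payload))  # the original document parses
print(is_valid_json(stored))   # the truncated one does not
```

With plain text you lose the tail of the string; with a serialized blob the whole value becomes unparseable, so every reader of that row fails.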


I need more info about this.


In MySQL, the `utf8` character set was originally an alias for `utf8mb3`. The alias is deprecated as of 8.0 and will eventually be switched to mean `utf8mb4` instead. The `utf8mb3` charset stores UTF-8 encoded data, but only supports up to 3 bytes per character instead of the full 4 bytes UTF-8 needs, so it cannot hold characters outside the Basic Multilingual Plane, such as most emoji.

https://en.wikipedia.org/wiki/UTF-8#MySQL_utf8mb3
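
If it helps, the 3-byte limit is easy to check from the application side. A minimal Python sketch (the function names `max_utf8_bytes` and `fits_utf8mb3` are mine, not anything from MySQL): a character fits in `utf8mb3` exactly when its code point is at most U+FFFF, since anything above that takes 4 bytes in UTF-8.

```python
def max_utf8_bytes(text: str) -> int:
    """Largest number of UTF-8 bytes used by any single character in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_utf8mb3(text: str) -> bool:
    """True if every character is in the Basic Multilingual Plane,
    i.e. encodes to at most 3 bytes in UTF-8."""
    return all(ord(ch) <= 0xFFFF for ch in text)


for sample in ["hello", "héllo €", "commit 😀"]:
    print(repr(sample), max_utf8_bytes(sample), fits_utf8mb3(sample))
```

Running a check like this on input before it reaches a `utf8mb3` column is one way to reject or strip such characters explicitly instead of relying on whatever the server does.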



