Neat! I did something similar as a patch to PostgreSQL proper in May of this year and gave a talk on it at BerlinSides[1]. I like the fact that your solution is implemented as an extension rather than as a patch to the core, but I worry about the fact that the extension API only has access to the post-analysis parse trees. While I'm comfortable saying that all of the types of SQLi attack that I'm aware of will cause a difference in the raw parse tree, I'm not so sure about whether those differences will carry through into the results of the analysis.

Also, how much overhead does this add? My software is linear (albeit with a fairly large constant) in the size of the input query and constant in the number of queries in the training set.

[1] https://github.com/thequux/postgres




> but I worry about the fact that the extension API only has access to the post-analysis parse trees.

I doubt that's a problem. Parse analysis won't remove information from the query; otherwise that information wouldn't be available for the actual planning and execution ;)


> Parse analysis won't remove information from the query

That's right.

PostgreSQL's parse analysis keeps the statement's structure, token by token, in the parse tree, and PostgreSQL's query jumbling calculates a hash value from that parse tree.

So it's possible to detect something strange in the statement(s) if someone tries to tamper with them.
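
To make the idea concrete, here is a rough sketch in Python. It is not PostgreSQL's jumbling code and not how either project in this thread works: it uses a toy regex tokenizer in place of a real parse tree, and the names `fingerprint`, `train` and `is_known` are made up for illustration. It only shows the general shape: replace literals with a placeholder, hash what is left, and compare against the hashes seen during training.

```python
import hashlib
import re

# Toy tokenizer (an assumption for this sketch, not a real SQL lexer):
# quoted strings, numbers, identifiers/keywords, then any other single character.
TOKEN_RE = re.compile(r"""
    '(?:[^']|'')*'             # single-quoted string literal
  | \d+(?:\.\d+)?              # numeric literal
  | [A-Za-z_][A-Za-z_0-9]*     # identifier or keyword
  | \S                         # any other non-space character
""", re.VERBOSE)


def fingerprint(sql):
    """Hash the shape of a statement, ignoring the values of its literals."""
    shape = []
    for tok in TOKEN_RE.findall(sql):
        if tok[0] == "'" and len(tok) > 1:
            shape.append("?")          # string literal -> placeholder
        elif tok[0].isdigit():
            shape.append("?")          # numeric literal -> placeholder
        else:
            shape.append(tok.upper())  # keywords/identifiers, case-folded
    return hashlib.sha256(" ".join(shape).encode("utf-8")).hexdigest()


def train(queries):
    """Collect the fingerprints of known-good statements."""
    return {fingerprint(q) for q in queries}


def is_known(sql, known):
    """True if the statement has the same shape as some training statement."""
    return fingerprint(sql) in known


known = train([
    "SELECT * FROM users WHERE id = 1",
    "SELECT * FROM users WHERE name = 'bob'",
])

# Same shape, different literal: accepted.
ok = is_known("select * from users where id = 42", known)
# Injected OR clause changes the shape: rejected.
bad = is_known("SELECT * FROM users WHERE name = '' OR 'a'='a'", known)
```

Computing the fingerprint takes one pass over the statement, and the set lookup does not depend on how many statements were in the training set, which is the kind of cost profile mentioned upthread. A real implementation would hash the parser's tree rather than a token list.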



