SESHAT: Handwritten math expression parser

AlbertoGP · on Aug 30, 2016

Wow, just skimmed through the PhD thesis that includes this work[1] and it's impressively thorough. The algorithm supports both vector and bitmap input, and uses recurrent neural networks and probabilistic grammars to disambiguate the symbols. Their open source implementation is in C++, under the GPLv3 license.

[1] https://riunet.upv.es/handle/10251/51665

I've tried the web demo and for me it works quite well. In my case, definite integrals work except for the "dx" which gets interpreted as "d_x" for no apparent reason.

Some years ago I did some work in this direction which might be relevant. It only covered formula structure analysis, using an improved version of the DRACULAE algorithm from Zanibbi et al. (also cited in SESHAT's paper), and started from given characters (no symbol recognition except for hand-drawn fraction bars) freely positioned and scaled on the page. It delivers layout/presentation markup (MathML, DRACULAE tree), semantic encoding (OpenMath), natural language (English) and speech output, all in JavaScript: http://matracas.org/tacto/

At the time, speech worked in Firefox, Chrome and even Safari, but nowadays it only works in Firefox. I don't remember whether it worked in Opera, but all other features did.

[Edit to add:] My application does not implement matrices either, just arithmetic including exponents, subscripts, and fractions.

marxidad · on Aug 30, 2016

You have to bring the x up a bit higher and/or make it bigger. I had the same problem, but with "dy" becoming "d_y".

tonmoy · on Aug 30, 2016

I have been using the free app "MathPad" on iPhone for a few years now. It does the same thing (A LOT better). But the free version does not allow export to LaTeX, so this open source alternative is more than welcome.

emddudley · on Aug 30, 2016

This reminds me of work by one of my professors at Rochester Institute of Technology (Dr. Richard Zanibbi). There is an application called Freehand Formula Entry System (FFES) available on his website for download (GPL source as well) [1]. He also published a paper on it titled "Recognition and Retrieval of Mathematical Expressions" (2012) [2].

[1] https://www.cs.rit.edu/~rlaz/ffes/

[2] https://www.cs.rit.edu/~rlaz/files/mathSurvey.pdf

teraflop · on Aug 30, 2016

This is an awesome project! The demo could use some work, though. It's pretty hard to write legibly with a mouse, so I tried to open it on my phone, but the canvas doesn't work properly when zoomed in. (using Chrome on Android)

Someone · on Aug 30, 2016

Works great when it does, fails miserably when it doesn't.

For example, it doesn't seem to know about matrices. That's fine, but when you enter one, it fails spectacularly. I got integrals (probably because an integral sign somewhat matches the 'opening parenthesis' of a matrix), i^i, cases where it almost randomly strung together parts of a matrix, etc.

tantalor · on Aug 30, 2016

On the other hand, matrices (`\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}`), left scripts (nCm) or piecewise functions (`f(n) = \begin{cases} n & n \ge 0 \\ 1 - n & n < 0 \end{cases}`) are not recognized yet.

yorwba · on Aug 30, 2016

The symbol recognition is really good, but it does not handle positioning and relative sizes that well: http://imgur.com/7FCr6TD

goldenkey · on Aug 30, 2016

Your handwriting is pretty bad tbh. Could be the reason. If we were using styluses the problem might not exist.

yorwba · on Aug 30, 2016

My handwriting is pretty bad, but it actually recognized all the symbols correctly (which is really impressive, considering how deformed the pi and 1 are).

My point is that assembling the symbols into a complete expression does not seem to work as reliably yet.

I had a look at the PhD thesis on the approach and it seems that the algorithm essentially relies on relative positioning only, plus some heuristics for symbols with "appendages" like 'd' or 'p' (This probably explains how "dx" ends up as "d_x").

If it also took relative sizes into account, performance might improve a bit.
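
To make the idea concrete, here is a toy sketch in Python. It is not SESHAT's actual model (the thesis describes a probabilistic approach); it only shows how a purely positional rule can turn "dx" into "d_x", and how a size check could change the outcome. The `Box` class, the `relation` function and every threshold are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Bounding box of one symbol; y grows downward, as on a canvas."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2


def relation(base: Box, nxt: Box, use_size: bool = False) -> str:
    """Guess how `nxt` attaches to `base`: 'right', 'subscript' or 'superscript'."""
    # Vertical distance between the two centers, relative to the base height.
    offset = (nxt.center_y - base.center_y) / base.height
    threshold = 0.2  # hypothetical cut-off for "clearly lower/higher"
    if use_size and nxt.height / base.height > 0.5:
        # A symbol more than half as tall as the base is unlikely to be a
        # script, so demand a larger offset (also a hypothetical value).
        threshold = 0.45
    if offset > threshold:
        return "subscript"
    if offset < -threshold:
        return "superscript"
    return "right"


# 'd' has an ascender, so its box is tall; 'x' sits on the same baseline
# but is shorter, which pushes its center below the center of 'd'.
d = Box(left=0, top=0, right=40, bottom=100)
x = Box(left=45, top=45, right=85, bottom=100)

print(relation(d, x))                 # subscript
print(relation(d, x, use_size=True))  # right
```

With a rule like this, drawing the x higher or bigger moves its center up relative to the d, which fits the workaround mentioned elsewhere in the thread.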

JadeNB · on Aug 30, 2016

The handwriting is not great (though let he who can calligraph with a mouse pointer cast the first stone), but, however good or bad it is, the `\pi` clearly is not the main symbol, and the `x^2` and `dx` (which, as has been mentioned (https://news.ycombinator.com/item?id=12388859), inexplicably renders as `d_x`) are clearly at the same height.

oxguy3 · on Aug 30, 2016

Hehe, guess I'm not the only one who likes using ancient gods as names for technology. My backup server is called Seshat, since she's seen as a record keeper.

eric_the_read · on Aug 30, 2016

I can't seem to make it recognize logs with arbitrary bases:

http://imgur.com/a/GkW9E

rahkiin · on Aug 30, 2016

This works rather well! I tried it on my iPhone. Simple expressions work just fine. I can't seem to make definite integrals work though.

misnome · on Aug 30, 2016

Indeed, it keeps trying to interpret it as a sum. Though perhaps the pi symbol is confusing it. I actually have an archive of all my physics degree notes - it'll be interesting to see how well it handles them (i.e. equations not written with it in mind at all).

mkl · on Aug 30, 2016

Unless you wrote your notes digitally, it won't work: "the parser accepts input files in two formats: InkML and SCGINK". These are stroke-based formats - the recogniser needs to know which points are connected, and possibly in which order things were drawn.
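
As a rough illustration of what "stroke-based" means, here is a minimal Python sketch that reads the pen strokes from an InkML document. It assumes the simplest form of the format (each `<trace>` holds comma-separated points with plain X and Y values); real files may declare extra channels or use other encodings. The `read_strokes` helper and the sample data are made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# XML namespace used by W3C InkML documents.
INKML_NS = "{http://www.w3.org/2003/InkML}"

Stroke = List[Tuple[float, float]]


def read_strokes(inkml_text: str) -> List[Stroke]:
    """Return one list of (x, y) points per <trace>, in document order."""
    root = ET.fromstring(inkml_text)
    strokes = []
    for trace in root.iter(INKML_NS + "trace"):
        points = []
        # Points are separated by commas, coordinates by whitespace.
        for pair in (trace.text or "").split(","):
            coords = pair.split()
            if len(coords) >= 2:
                points.append((float(coords[0]), float(coords[1])))
        strokes.append(points)
    return strokes


# A "+" sign drawn as two strokes: first the vertical bar, then the horizontal one.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <trace>10 0, 10 20, 10 40</trace>
  <trace>0 20, 10 20, 20 20</trace>
</ink>"""

for number, stroke in enumerate(read_strokes(SAMPLE), start=1):
    print(number, stroke)
```

A scan of the same "+" would only contain pixels, with no record of the two separate strokes or the order they were drawn in.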

misnome · on Aug 30, 2016

Oh, I was misinterpreting the definition of handwritten. A pity.

Edit: Though, thinking of it, does anyone know of research into decomposing images of text into stroke patterns? It seems like it's a problem that must have had some decent research on it...

kobigurk · on Aug 30, 2016

This is great and worked very well on the few examples I tried. I wonder how it would work with a photo - author, has it been tried?

jdironman · on Aug 30, 2016

Something seems off here: http://i.imgur.com/qBJCIqh.png

tobr · on Aug 30, 2016

The link text is something of a garden path sentence... At first I thought it was a handwritten parser for math expressions, not a parser for handwritten math expressions!

aninhumer · on Aug 30, 2016

Assuming the title hasn't changed... I don't see it?

The confusion of a garden path sentence is caused by a prefix being parsed differently from the final sentence, but "Handwritten math expression" parses as one would expect here.

misnome · on Aug 30, 2016

Except the phrase I parse is "Math expression parser", prefixed by "Handwritten".

tobr · on Aug 30, 2016

You're probably right. I guess in a sense it's almost the opposite of a garden path sentence since the ambiguous part comes at the end.