It's homing in on equations without getting distracted by nearby Hanzi or Cyrillic, or even pictures of dogs. Wow.
I keep going back to dig through your resources and getting more impressed.
EDIT: I guess my only constructive criticism is that you should brag more. I like a simple landing page, but if the whole API is packed into that free app, I think you've earned a short list of examples of corner cases you tackle well, because they're really impressive.
The perfection in those examples makes me suspect that they are cherry-picked or part of the training data. Especially the handwritten text is not always clear and could reasonably be interpreted differently. I'd expect a machine-learning model to get at least some things wrong some of the time.
If I wanted to use this in an application, I'd definitely want to see some accuracy figures on validation data as well as a few failure cases to see whether the output remains reasonable even when it is wrong.
The examples are actually very simple compared to a lot of crazy stuff Mathpix can recognize so it's an honest representation of its capabilities. Mathpix is built for perfection because 99% isn't good enough.
> the handwritten text is not always clear and could reasonably be interpreted differently
Digital pen input contains more info than the resulting bitmap; strokes are lost while rasterizing.
That info is how old devices were able to reliably recognize characters written with a stylus. It worked well even on prehistoric hardware, such as the 16 MHz CPU and 128 kB of RAM in the first Palm PDA.
This is really awesome OP! Thank you for sharing :)
One note I should make: it was not entirely clear (to me) upon a cursory view of the website that the purpose of Mathpix is to convert handwritten text into LaTeX. For some reason (maybe my coffee hasn't kicked in yet) I thought this was strictly intended to take screenshots of equations on an existing PDF document or a website etc. and convert those to LaTeX.
My thought at that point was "I wonder if they could do this for handwritten text" and then I looked at the docs and facepalmed..
Their usage does not match that definition. When recognising writing, they are recognising an image of the page, not an image of what was on the screen at some other point.
Also, your definition describes a noun but claims it's a verb.
You should reread OP's paragraph. Also, if you go straight to the website, it shows you the desktop app first, which uses screenshots of anything and turns them into LaTeX.
Stupid question, how well would this work on a PDF of a latex document?
This would be great for blind people, as PDFed LaTeX is extremely non-accessible, and I have to email authors of papers to get the original LaTeX from them, which is often lost.
> This would be great for blind people, as PDFed LaTeX is extremely non-accessible, and I have to email authors of papers to get the original LaTeX from them, which is often lost.
Many years ago I "translated" course materials into a form which was accessible to a blind grad student. It was a really interesting job and taught me a lot about accessibility.
I was effectively doing latex, but without all the leading \ characters. It made learning latex comparatively easy.
What interface do you use to read equations? A screen reader speaking the straight LaTeX, or do you have some middleware to make it more digestible when listened to?
I'm not blind, but I have a blind PhD student. He seems most happy with just straight LaTeX. I suspect this is because it requires little work for him or others, since the LaTeX source usually already exists.
I may be naive here, but is there anything preventing running regular OCR over a page, and then feeding whatever it couldn't deal with into this? Sure, the plumbing for this is probably missing, but it sounds more like a matter of picking up the shovel rather than inventing something.
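A rough sketch of that plumbing, assuming pytesseract for the regular OCR pass; the math OCR call at the end is a hypothetical placeholder, not Mathpix's actual client API:

from PIL import Image
import pytesseract

def split_page(path, conf_threshold=60):
    # Run regular OCR, then collect the regions the engine was unsure about.
    page = Image.open(path)
    data = pytesseract.image_to_data(page, output_type=pytesseract.Output.DICT)
    text_parts, unsure_regions = [], []
    for i, word in enumerate(data["text"]):
        if not word.strip():
            continue
        if int(float(data["conf"][i])) >= conf_threshold:
            text_parts.append(word)  # plain text the OCR handled fine
        else:
            box = (data["left"][i], data["top"][i],
                   data["left"][i] + data["width"][i],
                   data["top"][i] + data["height"][i])
            unsure_regions.append(page.crop(box))  # candidate math snippet
    return " ".join(text_parts), unsure_regions

# Each unsure region would then be sent to the math OCR service (hypothetical
# helper name) and the returned LaTeX spliced back in: latex = math_ocr(region)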
I've done some work cleaning up documents for use by a blind student. OCR starts to fail really badly when math is involved. Common errors like O versus 0, any accents or additions to characters, small formulas embedded in sentences, graphs with captions, etc., can all throw things off considerably.
Best case I could copy and paste paragraphs at a time from a PDF of the textbook (with copy protection removed). Worst case I was retyping or fixing every few words in a sentence.
I was working on this from about 2011-2013. Advances with image processing and machine learning have been significant since then, so there may be much better software available now.
If anyone has ideas or packages they'd recommend, I'd be interested.
I'm curious whether the developers are fans of the TV series The Big Bang Theory. The characters were using a smartphone app like this... and of course it was less useful due to being fiction...
Suggestion: instead of making me download a pdf to see examples of what the results look like, maybe put them on the page directly. You can have a couple. Then put the details in the pdf.
Bug report: it appears that multiline summation subscripts are not recognized correctly. For example, Eq. 8 of [1]. These are often created using \substack as part of amsmath.
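For reference, a multiline subscript of that kind is usually produced like this (a generic illustration, not the actual equation from the paper):

% amsmath's \substack stacks the conditions under the sum
\sum_{\substack{0 \le i \le m \\ 0 < j < n}} P(i, j)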
I assume you just got a lot of installs from India, because the large publishing houses contract out many re-typesetting jobs that are basically to take scans of technical texts and convert them back into LaTeX.
I strongly suggest you talk to the publishers about integrating your tech into their TeX workflows.
Want a math-ish PDF and some LaTeX source for training on possible edge cases? Think I might get someone (or something) to read my dissertation this way...
Any way you could make this available outside the Mac App store? Apple seems to have decided I did something horrible and unforgivable by moving to a different country after creating an account, thus making it impossible for me to use the store.
We used their API to make a simple screenshot2latex tool (select screen region -> puts the LaTeX formula in the clipboard). In my experience it still fails on a couple of fairly common things like:
I was looking for an API that provides math OCR. Great, going to integrate it into our app soon :-) Let me know if you want to add us to your "trusted by" section.
Mathematicians use operator overloading all the time. It would be nice to have a tool that explains to me what an equation actually means in a given context.
You are talking about the semantics of an equation, while this tool is already satisfying when it correctly understands the syntax (in LaTeX).
There are actually a number of ongoing research projects to establish standards for semantic mathematical representations. Probably one of the best-funded running projects (budget ~10 MEUR) with a work package on this topic is http://opendreamkit.org/ . Work is going on at https://mathhub.info/ to my knowledge. I would like to provide a deep link but the site seems to be in a broken state. Apparently people are working on it right at the moment.
I’ve long had this dream of building a system that would allow for symbolic algebra manipulations, where the representation knows about the semantics. The idea is to replace long and tedious pen-and-paper symbolic calculations, while still having direct control over each step of the calculation. In looking for something like this I’ve found specialized proof verifiers, which are doing something different. At the other end of the spectrum there’s Mathematica, which does symbolic manipulation but doesn’t understand enough semantics.
Are these projects aiming at something like what I’m describing? Or are they more about something else like verifying proofs?
Well, in principle a CAS like Mathematica (or any mature alternative) allows you to implement such a workflow (i.e. defining mappings or an algebra on your objects). I think what Wolfram has been concentrating on for a decade is accumulating "knowledge" in their system, i.e. exactly the kind of semantic information we are talking about. However, as an end user I don't really see much of that; it seems to be mostly tucked away behind Wolfram Alpha.
On the other hand, there are these research projects, which however seem to concentrate on standards rather than actually accumulating semantic knowledge.
I would love to see an adoption of hypertext and semantic mathematical notation in scientific papers. Instead of writing $E=m c^2$ in a (LaTeX) paper, we would instead define the symbols machine-readably with code like
set E = physics/Energy
set m = physics/Mass
set c = physics/constants/speed-of-light
I have never seen actual scientific papers which do this kind of stuff, i.e. which are machine readable.
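As a minimal sketch of what that might look like inside a LaTeX source (the macro names and concept paths are made up for illustration):

% Hypothetical: each symbol is bound to a machine-readable concept,
% and the formula is written in terms of those bindings.
\newcommand{\Energy}{E}        % physics/Energy
\newcommand{\Mass}{m}          % physics/Mass
\newcommand{\lightspeed}{c}    % physics/constants/speed-of-light

\begin{equation}
  \Energy = \Mass \lightspeed^{2}
\end{equation}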
This is insanely impressive. Great work. Wish tools like this existed when I was still in school...almost makes me want to go back and do some more math :)
I tried this on some of my (fairly neat) handwritten physics notes and it was mostly pretty impressive. Failures: a lot of lowercase deltas became 8s and it had no idea what to make of hbar (converting it to n, pi, k, or most often refusing to convert the equation entirely). A fair number of little typos, but you'd want to check all its work anyway.
This is gold, great job. The one clear bug that I've noticed thus far is that it seems to correctly identify \hat{} but not finish it correctly when it also has an index, so that:
\hat { y _ { i } }
erroneously becomes:
\hat { y } _ { i }
This changes everything about my university life. I was always on the verge of doing everything digitally, but it was always cumbersome to type out (La)TeX by hand and convert handwritten notes to digital versions.
This is something I would have definitely used if I were still a student, although I'm not a fan of it being a Chrome extension. I'm still curious enough to test it out.
Very cool! I had this exact idea for a service a few years back, but it was one of those things I never found the time or motivation to actually do. The results seem very nice.
This is great! It is very useful for writing papers. I tried it with some of my equations! It actually improved them by adding more appropriate braces than I had.
I maintain an idea notebook in LaTeX via LyX, and presently, while reading a math PDF, I have to paste a screenshot into Paint, crop out an equation there, and paste the image into LyX. Very clumsy.
I'm sorry if it's off the topic. The testimonials really cracked me up. “ If I had known about Mathpix earlier, perhaps I would have had enough time to work out the Grand Unified Theory. ” - Albert Einstein
So I guess I can expect to see more low quality mathematical typesetting. The technology is great, but I don't see how it will help with high quality typesetting.
I can type an equation into LaTeX more quickly than I can photograph it and then go in and manually correct all the spacing issues. And there are spacing issues. The examples PDF has things that just look horrible: no small spacing or negative spacing to space out things like matrices and integrals, etc. If I'm going to manually tweak it anyway, I might as well do it manually from the start. Typing it up is something that gets quicker with practice, like everything else.
But what I can actually see happening is people not tweaking the output manually. Either train yourself to use TeX properly, or let someone do it for you.
I think the point is that for 50 GBP you will get a much higher quality typesetting than spending 1 GBP on the API.
As usual, it depends on the use case. If you are producing a textbook or something else that requires high-quality typesetting, you will probably pay for it.
If not, you can use a tool like this one, but you will have to accept a certain margin of error.
"Either train yourself to use TeX properly, or let someone do it for you."
The same complaint could be raised about using a biro (vs calligraphy), using a printing press (vs handwriting) or using a wysiwyg editor (vs the old word processing paradigm).
In my view, this sort of accessibility is a great advance.
I disagree. Technology should either make things easier or better. Making things easier but worse is not progress. Your analogies are incorrect.
Calligraphy is not handwriting. It's an art form and very slow to execute.
The printing press isn't just an "easier handwriting". It's actually harder to use. But it's better. It's closer to calligraphy than handwriting, in fact.
I'm not sure what your argument about WYSIWYG text editors is. In my experience they produce absolutely horrible results because, ironically, they require the user to know more about typesetting than TeX does if high quality output is desired.
Thanks for your comments. Can you share what you want your LaTeX to look like? How do you want to space out matrices and integrals? We're working on the LaTeX formatting so we can improve this.
I would have put a small space (\,) between the upright "det" and the matrix. An integral sign is usually followed by a negative small space (\!) and the "dx" is separated by a small space (\,).
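Concretely, the kind of output I'd hope for looks roughly like this (a hand-written illustration, not Mathpix's actual output):

% thin space after the upright det, negative thin space after the
% integral sign, thin space before dx (pmatrix needs amsmath)
\det\,\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\qquad
\int\! f(x) \, dx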
Trouble is I'm not sure if these are absolute rules. I'm sure you can find pathological cases where the spacing will be wrong. Maybe the best thing would be to allow the user to modify the spacing easily after the automatic version is made.
I trust you guys have read the TeXBook? Knuth dedicates two chapters to subtleties of mathematical typesetting.
https://tex.stackexchange.com/questions/1443/what-is-the-sta...
Under API... you're already doing handwriting? This is uh, nontrivial work to say the least. Really impressive.
The endorsements are a nice touch. :)
Made me really curious how far the system goes, what cases break it.
Oh... nevermind. You have a PDF of examples here: https://docs.mathpix.com