
I disagree with the author that PDFs are a terrible format. They guarantee layout, which is very important for complex scientific presentations. Even slight differences in layout can make a complex set of equations difficult to parse. LaTeX also has a much superior word-break/hyphening algorithm to the HTML engines of browsers.

I find PDF math papers easy to browse, unlike the author. They're much easier to navigate and better organized than a website, can be easily searched and have a *proper table of contents* compared to websites. As for poorly browsable on a phone -- well I think that is irrelevant because nobody is going to read a complex technical paper in practise on a phone. They do look decent on tablets, and as for screen readers...well that's a valid point but screen readers don't work well for material with lots of equations anyway.

I applaud the author for the effort but looking at the result, I would not want to read math that way.



> nobody is going to read a complex technical paper in practise on a phone

I do, in fact. Or rather, I often would like to but with PDF? No chance. IEEE Xplore online reading sometimes works, but it would work better if they cleaned up their UI to be compatible with phones.

I have read thousands of pages of fiction on a phone and quite enjoyed it. Phones are great for reading if the content reflows properly.

Now publishers and content creators would need to embrace non-paginated, reflowing output. This would not only facilitate reading on phones, but also on tablets and laptop screens.

O'Reilly's online platform does a good job with their app.

There is zero reason why paginated output should be the default in 2022.


Yes, fiction works because the layout is simple, consisting of text, and maybe images?

Research papers are far more complex, and have established standards that aid quick reading and parsing. I absolutely don't want to deal with reflowing equations, reflowing figures, or whatever when publishing papers. I want precise margins and column widths.


Yet, by far the vast majority of content produced today, technical or prose, is read on screens.

Responsive web design has been around for quite a while. I don’t see a reason, other than lack of effort/investment, why we shouldn’t be able to read technical papers on variable-width screens, in a non-paginated form.

Dealing with the technical challenges should not be the task of the author, but the publisher. And indeed, most publishers are on it.

What's missing is a standardised format that can be downloaded, annotated, and re-shared like a PDF.


I wish there were a convention for sharing whole websites. Even a zip file containing an index.html plus images, css, other pages, etc. would be fine if browsers just supported it.
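Producing such a bundle is the easy part; what's missing is browser support. A minimal sketch in Python, assuming the site is a local directory with index.html at its top level (the function name `bundle_site` and this layout are made up for illustration, not an existing convention):

```python
import zipfile
from pathlib import Path


def bundle_site(site_dir: str, out_zip: str) -> list[str]:
    """Pack a static site directory into one zip, keeping relative paths.

    Hypothetical helper: requires index.html at the top of site_dir.
    Returns the sorted list of archive member names.
    """
    root = Path(site_dir)
    out = Path(out_zip)
    if not (root / "index.html").is_file():
        raise FileNotFoundError("expected index.html at the top of the site directory")

    # Collect files before creating the archive, and skip the archive itself
    # in case it is written inside the site directory.
    files = [
        p for p in root.rglob("*")
        if p.is_file() and p.resolve() != out.resolve()
    ]
    names = sorted(p.relative_to(root).as_posix() for p in files)

    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            zf.write(root / name, arcname=name)
    return names
```

Calling `bundle_site("my_paper_site", "my_paper.zip")` would give one file to download and share; a browser would still have to unpack it before it could open index.html, which is exactly the missing support.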


The SingleFile browser extension can export a webpage into a single HTML file with all the images, fonts, etc. it needs embedded in it.


Yeah but that works through base64 data urls, which are clumsy. Epubs are zips of separate resources and that works great. We should have an equivalent for webpages and other similar documents—a digital-first competitor to pdfs. Or maybe just broader compatibility for epubs, such as first-class browser support.


You might prefer SingleFileZ, see https://github.com/gildas-lormeau/SingleFileZ


O'Reilly doesn't publish math books. All math books in epub/mobi format look like garbage. There isn't a single exception. If you know of one, please tell me. It seems currently too hard to get layout, resolution and inline formulas right in a portable format.


With MathML epubs can look decent. For example take a look at the sample MathML epub "A First Course In Linear Algebra" [0] (in a reader that supports MathML of course). It looks pretty good. The problem is Amazon STILL doesn't support MathML, so publishers just churn out a gross version where all the equations are images and so then it doesn't scale properly with the text and the book becomes 300+ MB because of it. And they can't be bothered to make two versions for readers like Kobo that do support MathML.

[0]: https://github.com/IDPF/epub3-samples/releases/download/2017...


MathML is not good enough. HTML+KaTeX/MathJax would be good enough, but is not properly supported in any epub readers I know of.


I tried the book. There are several places where long equations are cut off. Other minor spacing issues here and there.


O'Reilly's online offering has not only O'Reilly books, but ones from other publishers as well. Some of them have equations. However, they are often rendered as images.

IEEE Xplore does a good job rendering equations on phone screens. Therefore, it is possible.

There is no technical reason why equations couldn’t be rendered on a screen just as well as on a PDF. Sure, canvas size constraints might interfere, but this problem exists in principle also on paginated output. Plus, horizontal scrolling is a thing.

I'm not saying a phone is the ideal platform to read a paper containing free energy-like math, but it can go a long way. Much longer than with the artificial restriction to paginated output like PDF.


Of course it is technically possible, but I haven't seen it done properly. I have never seen a book with math rendered as images that was of satisfactory quality or even close to what PDF can offer. I doubt IEEE Xplore is an exception, but I don't have an account, so cannot check.

I would like to be able to read a book also on a phone, but I am not going to compromise on quality for that, given that I can just read it on a large tablet in PDF format.


It is possible to find Open Access articles with math on IEEE Xplore with little effort. Have a look here: https://ieeexplore.ieee.org/document/6767058

Does this live up to your maths standards?


Thank you for the example! Yes, that definitely looks good, but is still just a webpage. Also, it has pictures with bad resolution, and a LaTeX table that has been rendered as an image for some strange reason. So as usual, it is not consistent in its quality, which is usually the problem.

To compare, open the accompanying PDF, which is also provided along with the webpage. It is of MUCH higher quality, which is partially due to the fact that layout is static.

Furthermore, the webpage doesn't support pagination. The problem is turning it into a book, and there doesn't seem to be a good standard that supports HTML+KaTeX/MathJax properly. In theory there is no reason they shouldn't, as EPUB supports JavaScript, but in practice it just doesn't work properly.


> but is still just a web page

> doesn’t support pagination

Isn’t that exactly the point? Please, no pagination on a screen. A web page will do just fine. In fact, that was what the www was conceived for: publishing science.

> open the accompanying PDF, […] of much higher quality.

The only thing that is higher quality in the PDF is the justified alignment and pagination. The figures have the same (poor) resolution.

The bottom line is:

* It is perfectly possible to publish technical papers in a format that is accessible on a phone.

* Non-paginated, free-flowing output also works better on larger screens (e.g., I can resize the window and have my note-taking app open next to the paper).

* PDFs are still great for annotating and printing, if required.


You are right, the figures have the same poor resolution. I didn't notice because they look higher quality in the PDF, where they are seen in context.

As for your bottom line:

* Of course you can make HTML pages with KaTeX/MathJax that look great. I have done it myself, and this very HN post also points to such a web page.

* While a webpage is often fine, for anything I will spend a significant amount of time reading, I definitely want a separate entity, a book or a paper, that I can have in my library. There is no such format currently that has satisfactory quality, except for PDF.

* Non-paginated, free flowing output is the only real option for a phone, but does not work well on a large screen. Your IEEE example looks bad on a large screen, compared to a PDF. I am looking at it on a 27 inch screen, and on an iPad 12.9 inch, and it shows me one column, which separates the images from each other and from their context, and makes it harder for me to scan back and forth between related content. Pages provide a context, even though an admittedly rather arbitrary one, and non-paginated text is missing that context. You could do two columns, but how would that work for non-paginated text that knows no boundaries?

* For me resizing is not an issue, I take notes on paper or on my computer, but I can see how that might be important for somebody who is willing to trade reading quality. For me, PDFs are clearly superior for digital books and papers to anything else currently out there, at least if they have been created with care. PDFs that have been created by printing an epub or webpage, on the other hand, are useless for me and land immediately in the garbage bin whenever I encounter them.


> and as for screen readers...well that's a valid point but screen readers don't work well for material with lots of equations anyway.

This is something that we’d like to change. There are many visually impaired students who need to learn mathematics the same as you and I.

My “eyes were opened” when I was working with a blind student in my class. The textbook I’d written in PreTeXt (transpiled to PDF and HTML) could be read on his BrailleNote but some of the equations were wonky, so I rewrote them to work for everyone.

It would be better if we developed tools to make them work for everyone straight away, instead of relying on authors. That’s one of my career goals.


I applaud you for this.

I think MathML (which has gotten much better in browsers, thanks to Igalia[1]) is a much better bet for making this possible than LaTeX compiling to PDF.

[1] https://mathml.igalia.com/


You can't have animations with PDFs. Anyone using beamer is familiar with this frustration. But animations are incredibly helpful in explaining many works. 3Blue1Brown became so popular in major part due to his use of (fantastic) animations that more easily explain the material than any static image could.


The animate package from CTAN draws animations in PDFs. It has limitations (most PDF readers won't show them because they rely on some JavaScript), but it does work.


Maybe that's my issue. But I've tried about a dozen pdf readers and failed.


Adobe works. I've heard of at least one other but I couldn't get it to go. People say Adobe Reader is bloated, but if you need it then it is not bloat.


> I find PDF math papers easy to browse

So do I. Still, I wish LaTeX produced easily reflowable PDFs, especially when a document is formatted in two columns.


But it does, doesn't it? You add the "twocolumn" option and recompile. Unless your LaTeX is too fancy this will typically give a very good result (at worst, some figures with hardcoded sizing will be awkwardly placed).


I cannot do that when I'm reading a paper written by somebody else, and I only have the produced PDF.


That's why arXiv is a godsend: the source is available there, if the author has uploaded it.

All of science needs a culture of open sharing, the way physics and math already have it.


twocolumn is terrible for mobile.

what's needed are narrow margins.


What you are asking for is called a "round-trip" by some printers. This was requested the week after PDF was invented! It does work, unless it does not. The company that invented this technology is apparently infested by MBAs and charismatic nobodies, since they announced they are exiting the type "business"? Our house of cards is showing.


Check Zotero. It has that feature.


If your equations are in MathML, the browsers should be able to screen read them at some point.

> Even slight differences in layout can make a complex set of equations difficult to parse.

Such a set of equations should normally be represented by a single block, and I can't imagine a reason why layout should change inside that block.

The layout of PDF is unnecessarily rigid. When I'm reading it on my screen, there's no reason the text should be split into A4 pages with very specific margin values. LaTeX also often moves your figures a few pages ahead because they didn't fit on the specific page. There's absolutely no reason for that when you have access to the big continuous canvas of an HTML page. This works for equations too; if you have a long equation block that happens to be right between two pages, you either have to let one page have a gap, or reorder/rewrite your paragraphs to make the equations fit. None of this has a good excuse when it's read on a screen.

I don't think we need a website, but a js-free webpage with hyperlinks would be a lot better than pdf. Pdfs I find imperfect but ok.


> I don't think we need a website, but a js-free webpage with hyperlinks

Wasn't this precisely the use case for HTML and the WWW as originally conceived by Berners-Lee and his fellow internet pioneers?


> LaTeX also has a much superior word-break/hyphening algorithm to the HTML engines of browsers.

And because the PDF has a fixed layout it's also much easier to prevent "rivers" in paragraphs, which makes justification a no-brainer. To me, many print publications using justified text (including LaTeX documents) are a thing of beauty, and I hate how "left align" breaks the flow of reading. I'll take slightly different spacing between words due to justification any day over lines of different length, which I find fugly and confusing beyond repair.

More hyphenation controls are coming to CSS and, one can dream, it may be possible one day to programmatically detect rivers?
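The detection itself is not hard in a toy setting. A rough sketch in Python, under a big simplifying assumption: the text is monospaced and already broken into lines, and a "river" is a space sitting in exactly the same column on several consecutive lines. Real rivers in proportional fonts would need glyph widths and some tolerance for gaps that only roughly line up, and none of this is an existing CSS or browser API; `find_rivers` and `min_run` are names invented for the example.

```python
def _interior_space(line: str, col: int) -> bool:
    """True if column `col` holds a space between words (not indent or trailing)."""
    stripped = line.rstrip()
    indent = len(stripped) - len(stripped.lstrip())
    return indent <= col < len(stripped) and stripped[col] == " "


def find_rivers(lines: list[str], min_run: int = 3) -> list[tuple[int, int, int]]:
    """Return (column, first_line, length) for each vertical run of word gaps.

    Only runs of at least `min_run` consecutive lines are reported.
    """
    width = max((len(line) for line in lines), default=0)
    rivers = []
    for col in range(width):
        run_start = None
        # Iterate one row past the end so a run touching the last line is closed.
        for row in range(len(lines) + 1):
            gap = row < len(lines) and _interior_space(lines[row], col)
            if gap and run_start is None:
                run_start = row
            elif not gap and run_start is not None:
                length = row - run_start
                if length >= min_run:
                    rivers.append((col, run_start, length))
                run_start = None
    return rivers
```

Given the three lines "the cat sat", "big dog ran" and "old hen fed", the word gaps line up in columns 3 and 7, so both are reported as rivers of length 3. The hard part is doing this after the browser has laid out the text, and then acting on it.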

Meanwhile rivers be damned, I override anyway many sites and add "text-align: justify". The nice thing is: because "text-align: left" is the default many sites and minifiers do not bother with text-align at all, so adding "text-align: justify" works for many, many, many sites.

And I only half-buy anyway the justifications (ah!) for left alignment on the Web.

It's basically saying: "We know better than people who've been working in print for decades (or more), left align is easier to read". I don't buy it. Left align breaks my reading flow. And I cannot be the only one.

To me left align is trading potentially ugly looking paragraphs (due to rivers) for certainly ugly looking paragraphs (due to left justification: just look at the right of each paragraph... Such lack of clarity, such chaos cannot be unseen. It's pure fail).

P.S: I've actually typeset books both in LaTeX and QuarkXPress and they were justified, not left-aligned.


> I override anyway many sites and add "text-align: justify".

I think you're an outlier in your strong preference for justified text but this serves as an example in favor of using HTML to present content. Well made web content is much more malleable by users to make it meet their needs and preferences.


I think you give LaTeX more credit than it deserves. It gives little straightforward control over layout and the only reason documents are manageable is that pages are fixed size and layout changes are mostly local.

Its paragraph breaking was state of the art when it was new but other systems now break paragraphs too, and potentially better. I also think ragged margins aren’t really a problem.

I think if layout mattered as much as you imply, scientists would have to use a tool that offers more control like InDesign.

None of this is to say that getting good layout in HTML is easy, of course.


> I think if layout mattered as much as you imply, scientists would have to use a tool that offers more control like InDesign.

Yes, precisely that. As a scientist I don't even want to have to deal with layout. That's what publishers are paid extremely well for. When I self-publish content I want the process to be as simple as possible. If this means ragged margins, browser-default styles for headings etc., default colors and fonts — so be it.

(but to be fair, optimising the layout is an excellent way to procrastinate on doing hard research)


PDF papers are also much easier to save/archive and use offline, and they are great for printing.


Either we need an app that can compile LaTeX source (plus all included libs, which sounds like a lot, but it's equivalent to a JS-heavy web page) on every client (preferably as a browser plugin or integrated feature!), or authors should distribute their PDFs as bundles that include formatted versions for each of paper, large screen, and small screen.


The standard single-column layout of nearly every webpage is a bad fit for scientific content because the information density is way too low. A PDF, where I can display multiple pages, each with two columns, side-by-side is much better. This is really handy when you need to do something like refer to a figure/equation/table on the last page. I have yet to see any website solve this lack of density problem in any meaningful way. Of course, paper is still better, but I'll take a PDF over a web site any day.

A PDF is also much easier to archive. The job of sci-hub would be a lot more difficult if every paper came with separate HTML, CSS, JavaScript, and images.


Screen readers work perfectly fine with MathML. At worst one can just get the screen reader to read the LaTeX for maths and browse the rest in nice HTML.

On the other hand, PDFs generated from LaTeX are completely useless for screen readers.


Get rid of the 2 column thing and most people would be happy.

What guarantees of layout do you require?

In related news, MathML is back in Chrome v109.


> What guarantees of layout do you require?

Some people write documents that can only be clearly presented on a 15" or larger display. Maybe a comparison table with a bunch of columns, maybe a detailed chart, maybe a PCB schematic, whatever.

These people, being considerate of their readers, want to ensure if someone with a 13" screen comes along, they'll get scrollbars or small text, rather than a badly reflowed table where the word 'Yes' gets split over 3 lines.

Other people want to read those documents on 5" phone displays.



