I think it is easy to produce vague allegations of a lack of integrity, but I would really expect more substance to such a critique.
> ... desires of the state. The dollars come from somewhere and there are always strings attached.
Is this suggesting that public funders influence the outcome of research? It is not something I have ever witnessed. They may sometimes seek to influence the direction of a research programme, especially when performance is below expectations, but I have yet to see an example where a public funder has attempted to influence research results.
> I think we’d quickly learn that academia of today is not at all what it presupposes itself to be
What does it presuppose itself to be? Academia of today is 1) a place of teaching and knowledge dissemination, 2) a place of research and knowledge creation, 3) a multibillion dollar business that sells tickets to successful professional lives. The latter leads to overinflated self-marketing, and unfortunately this affects how research results are communicated.
Those who hold the "purse strings" have little interest in influencing research outcomes. Rather, they care about the reputation of their course (students), their own reputation (alumni), recruiting talent and/or outsourcing research (businesses), or actual research results (public funders).
This is at least the case in most institutions in the US and the UK, and most of western Europe.
“Is this suggesting that the public funders influence the outcome of research?”
The funders decide what gets funded.
One example, from Stanford,
“Stanford University receives hundreds of millions of dollars of funding from the NIH, without which researchers would not have the resources to conduct many worthwhile experiments and studies. NIH funding also confers prestige and status within the scientific community. At Stanford, it is very difficult for a biomedical researcher in her department to earn tenure without landing a major NIH grant. The attack by Collins and Fauci sent a clear signal to other scientists that the GBD was a heretical document.”
The type of "career scientist" that the article portrays is part of the problem. In order to get ahead with a science career, one needs to pursue an extremely narrow specialisation. Otherwise it is impossible to build a stellar reputation. And reputation is the key determinant for being offered a permanent position at a high-ranking institution.
However, many interesting and relevant problems live at the interface of several disciplines. Unfortunately, those working between disciplines will have a hard time getting a permanent job at top universities: Whatever faculty they apply to, there will always be other applicants who are super-specialised and therefore appeal more to the super-specialised faculty members in the hiring committee. That is why true interdisciplinary research still doesn't happen very much, even though it has been praised and encouraged for more than two decades now.
Coming back to the article, in my opinion the solution to overwork is to cut back on elitism. Less famous universities tend to be more relaxed in their recruitment and tenure criteria. Less pressure means more mental flexibility, which can help maintain a wider network of researchers across disciplines. And the wider the network is, the better the chances of being invited to collaborative projects, especially when one has a record of successful interdisciplinary collaboration.
The price to pay is that one will not be able to impress with the name of one's university when doing small talk. But one will be a much more interesting conversation partner — and have free time to meet people outside work with whom to talk.
First thing we did after moving to the UK was to buy a bread baking machine. Not quite bakery-level bread, but good enough for daily consumption, and it leaves a nice smell in the house.
Computer science, and academia in general, has always adapted to technical progress, although slowly.
In the case of ChatGPT and similar LLMs, these should become part of the toolbox that students are being taught. I.e., how does it work, what can it do, what are its limits, how can it be used to help solve a problem or complete a task.
An exam question could then be e.g. to ask ChatGPT for an essay on a topic, critically discuss its shortcomings, and improve the essay e.g. by adding references and deeper discussion, which will be graded.
Alternatively, use ChatGPT to iteratively discuss and improve the essay. The whole chat transcript should be included and will be graded.
Plus one to this. The reason we make students read and write is to teach them what good communication is. ChatGPT currently produces (and likely will continue to produce for a while) mediocre prose. How do I know that? Because I've read good prose and I've attempted to write it and understand the editing process. At some level it represents a huge pedagogical opportunity.
(Grading IS moderately harder, but not that much.)
The slide rule didn't render arithmetic obsolete. Neither did the calculator. Being able to do math on paper is a side effect of understanding the base 10 representation of numbers most of the world has settled on. And the first thing someone in college for EE does is re-learn this arithmetic in base 2. The calculator is just an extra tool to get to numerical results faster.
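To make the base 10 versus base 2 point concrete, here is a minimal sketch of paper-style addition with carries. The function name `add_digits` and the digit-list representation are my own choices for illustration; the only thing that changes between decimal and binary is the `base` argument.

```python
def add_digits(a, b, base=10):
    """Add two numbers given as digit lists, most significant digit first."""
    result = []
    carry = 0
    # Start at the least significant digit of each number, as on paper.
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        da = a[i] if i >= 0 else 0
        db = b[j] if j >= 0 else 0
        total = da + db + carry
        result.append(total % base)   # digit that stays in this column
        carry = total // base         # digit carried to the next column
        i -= 1
        j -= 1
    return result[::-1] or [0]


# 47 + 85 = 132 in base 10
print(add_digits([4, 7], [8, 5], base=10))          # [1, 3, 2]
# 1011 + 110 = 10001 in base 2 (11 + 6 = 17)
print(add_digits([1, 0, 1, 1], [1, 1, 0], base=2))  # [1, 0, 0, 0, 1]
```

The carry rule is identical in both cases, which is why the skill transfers once the positional representation is understood.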
ChatGPT is an extra tool as well. Smart students will realize they have an extra tool to get an edge when competing in the zero-sum game they've been thrown in. They'll have ChatGPT generate the essay from an outline they wrote and will then trim and re-formulate as needed, spending most time fact checking and reading from source material to make sure ChatGPT isn't making stuff up (which is what will get some of their classmates caught).
> I think if layout mattered as much as you imply, scientists would have to use a tool that offers more control like InDesign.
Yes, precisely that. As a scientist I don't even want to have to deal with layout. That's what publishers are paid extremely well for. When I self-publish content I want the process to be as simple as possible. If this means ragged margins, browser-default styles for headings etc., default colors and fonts — so be it.
(but to be fair, optimising the layout is an excellent way to procrastinate on doing hard research)
> nobody is going to read a complex technical paper in practise on a phone
I do, in fact. Or rather, I often would like to, but with PDF? No chance. IEEE Xplore's online reading sometimes works, but it would work better if they cleaned up their UI to be compatible with phones.
I have read thousands of pages of fiction on a phone and quite enjoyed it. Phones are great for reading if the content reflows properly.
Now publishers and content creators would need to embrace non-paginated, reflowing output. This would not only facilitate reading on phones, but also on tablets and laptop screens.
O'Reilly's online platform does a good job with their app.
There is zero reason why paginated output should be the default in 2022.
Yes, fiction works because the layout is simple, consisting of text, and maybe images?
Research papers are far more complex, and have established standards that aid quick reading and parsing. I absolutely don't want to deal with reflowing equations, reflowing figures, or whatever when publishing papers. I want precise margins and column widths.
Yet, by far the vast majority of content produced today, technical or prose, is read on screens.
Responsive webdesign has been around for quite a while. I don’t see a reason, other than lack of effort/investment, why we shouldn’t be able to read technical papers on variable-width screens, in a non-paginated form.
Dealing with the technical challenges should not be the task of the author, but the publisher. And indeed, most publishers are on it.
What's missing is a standardised format that can be downloaded, annotated, re-shared like a PDF.
I wish there were a convention for sharing whole websites. Even a zip file containing an index.html plus images, css, other pages, etc. would be fine if browsers just supported it.
Yeah but that works through base64 data urls, which are clumsy. Epubs are zips of separate resources and that works great. We should have an equivalent for webpages and other similar documents—a digital-first competitor to pdfs. Or maybe just broader compatibility for epubs, such as first-class browser support.
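To illustrate the "zip of separate resources" idea, here is a small sketch using Python's standard `zipfile` module. The helper names `bundle_site` and `read_entry` and the two sample files are made up for the example. This is not a valid EPUB (EPUB additionally requires a `mimetype` entry and a container manifest), and no browser opens such an archive directly today; it only shows how little is needed on the packaging side.

```python
import io
import zipfile


def bundle_site(files):
    """Pack a mapping of relative path -> bytes into one zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in files.items():
            archive.writestr(path, data)
    return buffer.getvalue()


def read_entry(bundle, path="index.html"):
    """Read one resource back out of the archive, as a viewer would."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
        return archive.read(path)


# Hypothetical two-file site: the page refers to the stylesheet by relative path.
site = {
    "index.html": b'<link rel="stylesheet" href="style.css"><h1>Paper</h1>',
    "style.css": b"h1 { font-family: serif; }",
}
bundle = bundle_site(site)
print(read_entry(bundle, "style.css").decode())
```

Because the resources stay separate files with relative links, nothing needs to be inlined as a base64 data URL.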
O'Reilly doesn't publish math books. All math books in epub/mobi format look like garbage. There isn't a single exception. If you know of one, please tell me. It seems currently too hard to get layout, resolution and inline formulas right in a portable format.
With MathML epubs can look decent. For example take a look at the sample MathML epub "A First Course In Linear Algebra" [0] (in a reader that supports MathML of course). It looks pretty good. The problem is Amazon STILL doesn't support MathML, so publishers just churn out a gross version where all the equations are images and so then it doesn't scale properly with the text and the book becomes 300+ MB because of it. And they can't be bothered to make two versions for readers like Kobo that do support MathML.
O'Reilly's online offer has not only O'Reilly books, but ones from other publishers as well. Some of them have equations. However, they are often rendered as images.
IEEE Xplore does a good job rendering equations on phone screens. Therefore, it is possible.
There is no technical reason why equations couldn’t be rendered on a screen just as well as on a PDF. Sure, canvas size constraints might interfere, but this problem exists in principle also on paginated output. Plus, horizontal scrolling is a thing.
I'm not saying a phone is the ideal platform to read a paper containing free energy-like math, but it can go a long way. Much longer than with the artificial restriction to paginated output like PDF.
Of course it is technically possible, but I haven't seen it done properly. I have never seen a book with math rendered as images that was of satisfactory quality or even close to what PDF can offer. I doubt IEEE Xplore is an exception, but I don't have an account, so cannot check.
I would like to be able to read a book also on a phone, but I am not going to compromise on quality for that, given that I can just read it on a large tablet in PDF format.
Thank you for the example! Yes, that definitely looks good, but is still just a webpage. Also, it has pictures with bad resolution, and a latex table that has been rendered as an image for some strange reason. So as usual, it is not consistent in its quality, which is usually the problem.
To compare, open the accompanying PDF, which is also provided along with the webpage. It is of MUCH higher quality, which is partially due to the fact that layout is static.
Furthermore, the webpage doesn't support pagination. The problem is turning it into a book, and there doesn't seem to be a good standard that supports HTML+KaTeX/MathJax properly. In theory there is no reason it shouldn't work, as EPUB supports JavaScript, but in practice it just doesn't work properly.
Isn’t that exactly the point? Please, no pagination on a screen. A web page will do just fine. In fact, that was what the www was conceived for: publishing science.
> open the accompanying PDF, […] of much higher quality.
The only thing that is higher quality in the PDF is the justified alignment and pagination. The figures have the same (poor) resolution.
The bottom line is:
* it is perfectly possible to publish technical papers in a format that is accessible on a phone.
* non-paginated, free flowing output also works better on larger screens (e.g. I can resize the window and have my note-taking app open next to the paper).
* PDFs are still great for annotating and printing, if required.
You are right, the figures have the same poor resolution. I didn't notice that because they just look higher quality in the PDF because they are seen in context.
As for your bottom line:
* Of course you can make HTML pages with KaTeX/MathJax that look great. I have done it myself, and this very HN post also points to such a web page.
* While a webpage is often fine, for anything I will spend a significant amount of time reading, I definitely want a separate entity, a book or a paper, that I can have in my library. There is no such format currently that has satisfactory quality, except for PDF.
* Non-paginated, free flowing output is the only real option for a phone, but does not work well on a large screen. Your IEEE example looks bad on a large screen, compared to a PDF. I am looking at it on a 27 inch screen, and on an iPad 12.9 inch, and it shows me one column, which separates the images from each other and from their context, and makes it harder for me to scan back and forth between related content. Pages provide a context, even though an admittedly rather arbitrary one, and non-paginated text is missing that context. You could do two columns, but how would that work for non-paginated text that knows no boundaries?
* For me resizing is not an issue, I take notes on paper or on my computer, but I can see how that might be important for somebody who is willing to trade reading quality. For me, PDFs are clearly superior for digital books and papers to anything else currently out there, at least if they have been created with care. PDFs that have been created by printing an epub or webpage, on the other hand, are useless for me and land immediately in the garbage bin whenever I encounter them.
The real shocker is that it’s 2022 and LaTeX is still the best writing environment for a PhD thesis. It has so many downsides: the markup syntax is ugly, it really works best only if one uses paginated output such as PDF, there is a zoo of partly incompatible packages, it needs compilation, its figure placement algorithms are obscure and difficult to control, and so on.
It still beats the competition because of rock-solid referencing, both to in-text elements like equations, chapters, etc as well as citing literature with bibtex.
Plus, it’s extremely stable, so someone who learnt LaTeX 20 years ago, like yours truly, can download the newest TeX distribution and feel at home immediately.
Nevertheless, I would prefer a Markdown-based system that can use CSS and MathML, and has a 100% bibtex clone for references.
Yes, pandoc goes quite a long way along this route, but setting up such a pipeline is still too complicated for many.
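For reference, the core of such a pipeline can be one pandoc call. The sketch below wraps it in Python; the helper names and the file names `thesis.md`, `refs.bib` and `thesis.html` are placeholders I made up, and it assumes a pandoc recent enough to ship `--citeproc` (2.11 or later). It covers citations from a BibTeX file and MathML output, not the CSS styling or cross-references to equations and figures, which is where the setup effort usually goes.

```python
import shutil
import subprocess


def pandoc_command(source, bibliography, output):
    """Build a pandoc invocation: Markdown in, HTML with citations and MathML out."""
    return [
        "pandoc", source,
        "--standalone",                    # emit a complete HTML document
        "--citeproc",                      # resolve [@key] citations
        f"--bibliography={bibliography}",  # BibTeX file to cite from
        "--mathml",                        # render TeX math as MathML
        f"--output={output}",
    ]


def build(source, bibliography, output):
    """Run pandoc if it is installed; raise a clear error otherwise."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(pandoc_command(source, bibliography, output), check=True)


# Placeholder file names; this only prints the command, it does not run it.
print(" ".join(pandoc_command("thesis.md", "refs.bib", "thesis.html")))
```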
It must depend on the field. A close relative of mine is a PhD advisor in a science field. He's hands-off about it, but is also aware of what his students are doing. If asked, he recommends MS Word, which is also what he uses for his manuscripts.
My own experience was as a physics student, 30 years ago. Students paid a heavy price for being able to print and submit the entire thesis with no manual intervention. The students who chose LaTeX took the longest at it. I didn't have access to a Unix terminal anyway, and banged out my thesis on an MS-DOS machine. Whatever my word processor couldn't support, I added by hand. The readers were OK with this.
My solution to all typographic problems was "take care of it after defense." I spent a few days after my defense getting my copy to be ready for duplication, including sticking all of the page numbers on with glue because I couldn't make inline figures work.
Sure, one can write a thesis in MS Word. It has come a long way with support for large documents. But I still find its referencing clumsy, opaque and unstable.
For example, automatic updates of figure numbers in captions and references: Countless times it failed on me and I had to manually recreate the fields, bookmarks, cross-references, and whatever else is needed.
Bibliographies are hardly doable without an external tool that comes with its own headaches.
Typography in MS Word is quite decent these days, though. Anyway, the content of a PhD thesis shouldn't be judged by its typography (as long as it maintains a readable standard).
I think things have changed a bit since you were a physics student. Conferences hand out latex templates and expect you to use them (wish they would also hand out an overleaf template. If any conference organizers are reading this...). Universities also do this with their undergrad/masters/thesis templates. Arxiv expects you to upload tex source code (it'll reject a PDF if you wrote that PDF with latex. It also is terrible at error messaging which is a huge pain since submission timing is for some stupid reason important). I'm sure latex is also easier than back then, but there's a lot of momentum in the latex direction that I think would be really difficult to undo. Even paper acceptance is highly influenced by formatting and figure design. I think it is just a different world as we have a lot more researchers now than even 30 years ago.
Amusingly, some things haven't changed. I was the first student to turn in a word processed term paper at my college, I think in 1983. And I estimate that I earned as much as a full letter grade on my GPA because the profs had never thought about how to grade a paper that was 100% mechanically perfect. It didn't hurt that I had become a very fast typist thanks to programming. I selectively chose courses where the grading was primarily based on written work, something that most students feared.
I'm sorry, I'm failing to see how this story is about how things have/n't changed w.r.t. Word/LaTeX usage within the last 30 years in academic writing.
Your comment about formatting and figure design influencing acceptance triggered my droll little reminiscence. I certainly wasn't disagreeing with you.
> If asked, he recommends MS Word, which is also what he uses for his manuscripts.
My university actually required that people use MS Word for their thesis, which seemed to work out okay for many, despite such a top down approach not seeming like the best option.
Personally, I used LibreOffice anyways and while it was certainly as clunky as Word (especially once images, diagrams and formulas got involved), it was also passable.
LaTeX has, like Org Mode, this mythical aura of being super hard. However, replicating the functionality of Word is trivial and takes an hour or two for a savvy computer user to grasp.
There's always Overleaf, Pandoc or LyX to make things even simpler. LyX in particular deserves to be better known.
Complex things, like TikZ, are of course difficult and time consuming. But those are impossible using Word.
IMHO, the biggest advantages of LaTeX are reproducibility and reference management. Big Word documents are quite fragile. And reference management is a mess.
Honestly, it isn't the writing part that annoys me the most. It is tikz and the fact that I can't make animations in beamer. Just resolving these issues would go a long way for me. Tikz could be fixed simply if there was a GUI that could allow for sliders or moving specific objects. Or at least a better way to make a good grid (tip: draw a grid on your canvas, draw whatever you want, remove grid). Things are so difficult to properly line up, even if we have mathematical representations. It shouldn't be that hard...
While it does not directly address the issues you point at, it does alleviate some issues.
* The syntax is somewhat easier to parse.
* It is a lot easier to write functions to redraw the same components over and over again.
* Doing math calculations to systemically place objects in relation to each other is a lot easier because python's arithmetic syntax is a lot more intuitive than TeX's.
Of course, this does mean that you have to fire up python to draw figures.
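As a rough illustration of the last two bullets, here is a sketch that does the placement math in Python and emits ordinary TikZ source as text. It deliberately does not use the library's API (which I am not reproducing here); `ring_of_nodes` is a made-up helper that only shows why a reusable function plus Python arithmetic is convenient.

```python
import math


def ring_of_nodes(labels, radius=2.0):
    """Return TikZ source placing one node per label evenly on a circle."""
    lines = [r"\begin{tikzpicture}"]
    n = len(labels)
    for k, label in enumerate(labels):
        angle = 2 * math.pi * k / n
        # Round to two decimals; adding 0.0 turns -0.0 into 0.0 so the
        # output never contains "-0.00".
        x = round(radius * math.cos(angle), 2) + 0.0
        y = round(radius * math.sin(angle), 2) + 0.0
        lines.append(
            rf"  \node[draw, circle] (n{k}) at ({x:.2f}, {y:.2f}) {{{label}}};"
        )
    # Connect each node to its successor, closing the ring.
    for k in range(n):
        lines.append(rf"  \draw[->] (n{k}) -- (n{(k + 1) % n});")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)


print(ring_of_nodes(["A", "B", "C", "D"]))
```

The printed text can be pasted into a LaTeX document that loads the `tikz` package; changing the number of labels or the radius re-spaces everything without hand-editing coordinates.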
Since it looks like you've contributed to this project, I have one MAJOR suggestion. Show examples. There's countless tikz projects I've seen that promise a lot and show absolutely nothing. I've invested lots of time to fruitless ends that would have been resolved if I just could see some examples that would show me if this is even in the right ballpark of what I'm looking for or not.
Example code is a must for any software (test cases work as examples btw) and example graphics are a must for any graphics software. I know this isn't your fault, but I'm just venting a bit here. I'll check this out the next time I'm writing a paper but I just don't get how people can put software out there without any examples (even toy examples). If you have any examples you're willing to share I'd find that extremely helpful.
Otherwise, this is just a python interface for tikz. Whatever is possible in tikz is possible in this. For every command in tikz, the corresponding python command is here https://allefeld.github.io/pytikz/tikz/
However, I do agree that more examples would help. In its current form, it's mostly useful to people who are already familiar with tikz and need simpler syntax, and can quickly get up to speed with this.
Mathpix Markdown is an attempt at bringing together the best of both worlds (Markdown and LaTeX) while providing excellent interoperability with LaTeX, meaning you can easily export your Mathpix Markdown documents to LaTeX, including equation references, tabular environments, images, etc:
I tried MyST recently. All I see is a markup language that slowly becomes more and more complex over time to support more and more features that LaTeX already supports, while at the same time acquiring the same syntax complexity as LaTeX.
What people don't acknowledge is that there is a base level of syntax complexity needed to produce fully general documents. If you do, the natural conclusion is that to fix latex, you need a full rewrite of latex with minor changes to fix all the inconsistencies that have crept into it.
+1 for Quarto. I wrote my thesis in R Markdown, which flipped easily between LaTeX and HTML output, with a BibTeX referencing system. It also allowed you to inline LaTeX for more complex outputs. And inlining calculated tables and charts meant I could keep my writing and code together. Quarto is the successor.
> Nevertheless, I would prefer a Markdown-based system
My free, cross-platform desktop Markdown editor, KeenWrite[1], integrates with the ConTeXt typesetting software[2]. I'm working on a branch to make integration containerized[3] because its installation is painful. KeenWrite limits math to plain TeX[4] so that the output can be rendered using any TeX-based typesetter (ConTeXt, LaTeX, MathJax, εχTEX, etc.).
Here's a sample document typeset using ConTeXt (skip to page 40 for the math):
Adding CSS mixes presentation logic with content, which is something KeenWrite strives to avoid. Instead, KeenWrite implements Pandoc's annotation syntax to keep presentation logic out of the content. I've written about this extensively in my Typesetting Markdown series[5].
You can produce some pretty amazing documents just with annotations, such as the following that I wrote in Markdown and typeset using ConTeXt:
Markdown fails at references. At some point, I'd like to implement cross-references in KeenWrite. Except there are at least six competing standards for the syntax, which I've also remarked upon[6], making the choice of syntax difficult[7].
> setting up such a pipeline is still too complicated for many
FWIW, my Typesetting Markdown series, which explains how to set up a typesetting pipeline using Pandoc, is one of the reasons I developed KeenWrite: to replace that entire pipeline (R, Markdown, externalized variable interpolation, math, and typesetting) with a single tool.
Parsing ASCII simply takes time. What you describe sounds like a use case for SQLite. Parse once when building the database. When indexed properly, searching should be much faster.
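A minimal sketch of what I mean, using Python's built-in `sqlite3` module. The input format here (one tab-separated key/value pair per line) is purely an assumption for the example, as are the table and function names; adapt the parsing step to whatever the real file looks like.

```python
import sqlite3


def build_index(lines):
    """Parse the text once and load it into an indexed SQLite table."""
    conn = sqlite3.connect(":memory:")  # use a file path to persist the database
    conn.execute("CREATE TABLE records (key TEXT, value TEXT)")
    # Assumed format: "key<TAB>value" per line; lines without a tab are skipped.
    rows = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    # The index is what makes later lookups fast instead of a full scan.
    conn.execute("CREATE INDEX idx_records_key ON records (key)")
    conn.commit()
    return conn


def lookup(conn, key):
    """Return all values stored under a key, in insertion order."""
    cursor = conn.execute(
        "SELECT value FROM records WHERE key = ? ORDER BY rowid", (key,)
    )
    return [value for (value,) in cursor]


sample = ["alpha\t1\n", "beta\t2\n", "malformed line\n", "alpha\t3\n"]
conn = build_index(sample)
print(lookup(conn, "alpha"))  # ['1', '3']
```

The parsing cost is paid once at build time; every search afterwards goes through the index.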