Not sure the language you choose matters as much as making the API usable by a wide audience. Sure, if performance is a real issue then Rust makes more sense than JS, but I’m not sure that’s going to be hugely meaningful in most use cases.
I’ve never been a fan of LaTeX despite writing some mammoth documents over the years. LaTeX always felt like a beast for academics, not for business. Yet there are often things I wanted to do consistently in Word etc. that have never been easy.
Styles can easily become a muddle. Having consistent numbering and bulleting is a pain and errors can easily creep in.
Tracking changes becomes a real problem when you get into many revisions, and that often ends up relying on a level of trust between parties not to override the tracking. I think there’s a killer app in just fixing this issue with a product that guarantees all changes are properly shown from the start of a process to it being fully approved by all parties.
Businesses, lawyers etc would love that stuff. Heck if you sprinkle blockchain in you might even get easy funding but I think it’s more of a basic cryptography thing than a blockchain thing - at least it doesn’t need that level of complexity.
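To make the "basic cryptography" point concrete, here is a minimal sketch of a hash-chained revision log in Python. It is an illustration, not a product design: the names `add_revision` and `verify` are made up, and a real system would also need signatures and a way for the parties to exchange the latest hash.

```python
import hashlib
import json


def add_revision(chain, author, text):
    """Append a revision whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {"author": author, "text": text, "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(record)
    return record


def verify(chain):
    """Return True only if no revision was altered, reordered, or cut from the middle."""
    prev_hash = "0" * 64
    for record in chain:
        body = {k: record[k] for k in ("author", "text", "prev")}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if record["prev"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


history = []
add_revision(history, "alice", "Payment due in 30 days.")
add_revision(history, "bob", "Payment due in 45 days.")
intact = verify(history)

# Quietly rewriting an earlier revision breaks the chain.
history[0]["text"] = "Payment due in 90 days."
tampered_ok = verify(history)
```

Each revision commits to everything before it, so nobody has to trust the other side's editor to keep tracking switched on.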
A moment of imagining how that would influence the market value of certain skillsets should easily cure you from that surprise ;)
On the other hand, legal systems have effectively been doing the equivalent of git since basically forever. There have been very few law books written from the ground up. All other law authoring, be it by kings, priests, dictators or parliaments, was in the form of diffs to an existing codebase.
The Common Paper app[1], though not quite a git workflow, has always struck me as being pretty close to how a software engineer might approach contracts:
1. An immutable set of standard terms, with variable references.
2. A collection of cover page variables, that modify the standard terms by reference.
3. A structured negotiation workflow, where users "propose changes" to the cover page variables with automatic "diff-ing" (redlining).
It's not a product targeted to software engineers, but has always appealed to me as a way to sneak in some engineering best-practices into the world of lawyering :)
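The three pieces above can be sketched roughly like this in Python. To be clear, this is not how Common Paper is implemented; the terms, variable names, and the `propose_changes` helper are all hypothetical and only show the shape of the idea.

```python
from string import Template

# Hypothetical standard terms: fixed text that only refers to variables.
STANDARD_TERMS = Template(
    "This agreement is governed by the laws of $governing_law. "
    "Either party may terminate with $notice_days days' notice."
)


def propose_changes(cover_page, proposal):
    """Return the new cover page plus a redline of the variables that changed."""
    unknown = set(proposal) - set(cover_page)
    if unknown:
        raise KeyError(f"not a cover page variable: {sorted(unknown)}")
    redline = {
        key: (cover_page[key], value)
        for key, value in proposal.items()
        if cover_page[key] != value
    }
    # Build a new dict so the original cover page is left untouched.
    return {**cover_page, **proposal}, redline


cover_page = {"governing_law": "Delaware", "notice_days": "30"}
updated, redline = propose_changes(cover_page, {"notice_days": "60"})

contract = STANDARD_TERMS.substitute(updated)
```

Because the negotiation only ever touches the variables, the "diff" is a small dict of old and new values rather than a comparison of two full documents.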
Nisus Writer Pro [0] has been around for 40 years this year IIRC (IANANWP) and has a user base who can vouch for many of the features that HN readers want something to offer.
One of the interesting things that I discovered while working on some legal papers with a lawyer was that legal documents don't have copyright protection. Lawyers regularly copy and paste from other lawyers' work. I suppose that, since legislators most often have backgrounds as lawyers, they legislated rules for themselves that are not the same as the rules the rest of us have to follow.
I am not a lawyer so I don't know that anything in the previous paragraph is true; it's just based on a recollection of something I was told once a long time ago.
1. That's an overstatement: For "original works of authorship," copyright happens automatically upon "fixation" in a "tangible medium of expression" (e.g., saving to a file, maybe even just typing). [0] And it doesn't take much "original ... authorship" to qualify for copyright protection.
2. Here's a hypothetical example: Alice drafts a contract from scratch as Version 1 and saves it to a file. It's copyrighted; on these facts, Alice owns the copyright. [0] Then Bob takes Alice's Version 1 and modifies it to create Version 1.1: Bob's "original" contributions to Version 1.1 are themselves protected by copyright, which Bob owns, but with two caveats:
(a) Bob has no claim to copyright in Alice's Version 1; and
(b) Bob's own contributions to Version 1.1 won't be protected unless one or both of the following is true: (1) Bob had Alice's permission to base his "derivative work" on Alice's Version 1; [1] and/or (2) Bob's use of Version 1 qualified as "fair use" (a complicated question in itself). [2]
Style and substance separation is easy and should be a requirement. Legal is pure text "programming" and what I mean is that the style of the text has zero bearing on the judicial process.
The benefits of working at the proper level of abstraction compound. It enables tech like diffs and git, which then nicely solves a bunch of other problems as well. Using Word completely side-steps all those benefits. Sure, you get a few nice buttons, but that's literally it. You are trapped forever with no way forward.
This feels like actually programming in Word and manually highlighting comments to be green or something. It's a travesty IMO.
Of course this isn't and hasn't been true for quite some time. I'm the first to blast MS Word for being a total disaster (esp. templates, ie. style/substance separation, are bad) but it is no longer a locked-in platform. Even the docx format is only a zipped XML file. If you want, you can unpack the document file and put it into git. Thank you Open Document Foundation!
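For anyone who wants to try the unpack-and-commit route: a .docx is a zip archive containing several XML parts (the body lives in `word/document.xml`). Below is a rough Python sketch using only the standard library. `unpack_docx` is a made-up helper name, and the archive built at the bottom is a stand-in so the snippet runs on its own, not a valid Word document.

```python
import io
import zipfile
from xml.dom import minidom


def unpack_docx(docx_bytes):
    """Map each XML part inside a .docx archive to pretty-printed text."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith((".xml", ".rels")):
                xml = minidom.parseString(archive.read(name))
                # One element per line gives line-based diffs something to work with.
                parts[name] = xml.toprettyxml(indent="  ")
    return parts


# Stand-in archive so the sketch runs without a real Word file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "word/document.xml",
        "<document><body><p>Clause 1</p><p>Clause 2</p></body></document>",
    )

parts = unpack_docx(buffer.getvalue())
```

Writing each entry of `parts` to a file and committing the directory gives you history, though the diffs will be of Word's markup rather than of the prose.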
On top of that, all contemporary word processors I'm aware of have, of course, versioning with diffs. It is just different than git (or other programmer tools.) Just as you are using your tools of your trade and don't know much about MS Word, lawyers use their tools of their trade and don't know much about git. It's like saying that editing POs is superior to Trados, because for a programmer it is but a professional translator is going to tell you a different story.
(Of course, everybody everywhere should be using LaTeX for fine-looking documents in all circumstances. No argument here ;))
My point is not that. Sure, you can go from Word to OpenOffice. Great, now you manually highlight your code in that..
It’s a deeper thing. You can hack Word and related tools for coding and eventually it is acceptable I guess, but it’s starting from the wrong foundation.
This ladder will never reach the moon.
Word’s diffs are not “just different”. They are objectively inferior in many ways. I personally witness daily the travesty of government staff’s handling of information.
Word is a fancy digital typewriter and IMO it’s the wrong abstraction for this day and age and cultural issues are the only thing keeping us back. As always.
Edit: academic papers looking like they were written on a 19th century typewriter.. I don’t get this fascination with style, from scientists of all people. Lay down the info, provide the data. Kerning your fonts properly.. oh my god, I need to cool down. I am a hot headed type of guy, sorry about that.
Hey, thanks for the reply. From what I read, I think you think that language use is some kind of coding with words where you have a deterministic relation between input and output. You seem to treat the semantic content of a statement as if it is somehow static, objective, and observable. I don't think it is and I'm in good company on that matter.
That being said, that's just my reading of your comment and I could be wrong, which is kind of my argument here. If I'm right lawyers don't care about your notion, they use language for something different than mere information encoding. Therefore they need tools that support their use case. MS Word (or word processors in general) might not be the best tool for that job, but it is good enough. Integrating a well trained ChatGPT into MS Word will help lawyers much more than any structured entry form ever could.
BTW, the LaTeX quip was intended to make light of the idea of separating content and style, which goes way back. Consider TeX's age. Your reaction tells me you think LaTeX is a styling tool, which in a sense it is, and that's what it is about, which it is not. Hordes of scientists (and type-setting professionals) argue in favor of LaTeX (or other type-setting systems) because you just write the content in plain text. LaTeX takes care of the style. TeX files are also just markup and easily git'able. It does make life easier, but it is not as important as some people make it out to be.
Thanks for indulging me. I know I am yelling at the clouds.
I also know people usually misunderstand me because I am a “programmer” and all I see is “code”. I guess that’s fair enough, but I fully understand legal being of a completely different nature from Rust.
What I also understand is that no matter how long everyone argues about it, the only thing that matters about legal is the text. The font, the styling, etc is all secondary. It might be important, but it’ll never be primary. Unless courts start judging differently based on page margins I guess.
The same goes for science. Publishing “attention is all you need” in an 8bit NES font might not be fashionable, but it does not and cannot detract from the discovery within it. LaTeX produces the exact same documents (I know it is configurable but we are going for a certain style) and that’s what this is about. Not how the tools work but that we fundamentally even care about it instead of focusing on the primary issues like correctness, openness, accessibility. I’d like academic papers to be APIs actually.
Again I see the importance of styling and appearance in general. It’s just that we start with that and I think that’s problematic and actively harms our progress.
Also, to conclude, I am a nitwit. This is just my take.
Edit: A man can dream, right? If a paper was plaintext I could typeset it last minute in 8bit NES fonts if I’d be so inclined. I hate y’all deciding how everything looks and works. I know that’s technically challenging, but to me that’s where the progress is. An academic paper like, say, a jupyter notebook would be awesome, no? Would you give up your fancy type setting? I would!
If you are a nitwit, I'm one, too. Don't worry. I think I get your take. You say the important part of legal and scientific texts is their content, not their form. And I agree. But that is not where we started. We started with (paraphrasing) "programmer tools like git are superior to MS Word, therefore lawyers should use git." There, I disagree.
This 100%. It does get interesting when you get into non-plaintext things that have to somehow integrate into plaintext systems (git managed codebases). We've kind of left it up to CMS systems to handle the non-plaintext bits but this leads to many more orthogonal process problems.
IMO, I think it really comes down to finding a universal mechanism for diffing and 3-way merging things that aren't plain text (document diffing). I think distributed version control can be universal (at least on a data level), how an application renders a meaningful diff for a specific task is incredibly subjective to the document type and task at hand. My point being that I completely agree that plaintext makes a whole lot of sense for programmers and pretty much nobody else. However, distributed version control does not have to be confined to plaintext, it's just tricky to see when all the version control systems we're familiar with are plaintext ones.
Git is popular because it's linear, and the linear paradigm usually translates well to serial things such as programs, instructions, document sets, etc.
It's actually bad at non-linear stuff, which you will have noticed if you have ever been working with hierarchical formats, especially e.g. xml or nested JSON.
Word is bad for a whole litany of reasons, but the reason it can't be easily versioned (atop the format being a literal Goldberg machine requiring inane transforms to handle properly) is that it encodes a bunch of non-linear formatting instructions. Sure, we can sort-of reason about this stuff e.g. with a hierarchical css+html+js structure, but without a way to render that I challenge you to be able to simply diff that information. Seeing "bold" or "blue" seems simple enough, as long as you also know to which elements it applies and in what layout. So, suddenly you can't reasonably diff the css file without also diffing the html.
As programmers, we are used to reducing things by their dimensions into fairly linear spaces. This then helps us reason fairly linearly about changes, but doing this from any other context is challenging. Lawyers e.g. perhaps focus on the relations between various clauses, so linearizing their document flow is not very important to them, at least when there exist methods to diff the general textual content without investing much in how they are doing that.
As programmers we see the similarities to editing a code base and that excites us, however we do have a tendency to go off and write frameworks to parse and simplify these things, without ever actually bothering to learn to apply these things. This is not without value, but it's a different focus, which maybe explains why lawyers are not in the habit of using git.
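A tiny Python illustration of the nested-format problem, using made-up data: the two JSON strings below describe the same tree with the keys in a different order, yet a line-oriented diff still reports a change.

```python
import difflib
import json

old = '{"style": {"bold": true, "color": "blue"}, "text": "Clause 1"}'
new = '{"text": "Clause 1", "style": {"color": "blue", "bold": true}}'

# A line-oriented diff sees one changed line even though only key order moved.
line_diff = list(
    difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
)

# Comparing the parsed trees ignores key order and reports no change.
same_tree = json.loads(old) == json.loads(new)
```

The diff is noisy precisely because the tool knows nothing about the structure it is looking at.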
> Sure, we can sort-of reason about this stuff e.g. with a hierarchical css+html+js structure, but without a way to render that I challenge you to be able to simply diff that information. Seeing "bold" or "blue" seems simple enough, as long as you also know to which elements it applies and in what layout. So, suddenly you can't reasonably diff the css file without also diffing the html.
We’re in complete agreement. But you can do this, you just need to provide a “renderer” and a schema that describes how your tree structure should merge or conflict. If you want to test out a weird version control for structured data, my email is in my bio.
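To give a flavour of what merging a tree (rather than lines) looks like, here is a minimal three-way merge over nested dicts in Python. It is only a sketch with one hard-coded rule ("both sides changed the same leaf differently" is a conflict); it is not the tool mentioned above, and a real system would take its merge rules from a schema. `merge3` and `MISSING` are names made up for the example.

```python
MISSING = object()  # marks a key that is absent on one side


def merge3(base, ours, theirs, path=()):
    """Three-way merge of nested dicts; returns (merged, conflict_paths)."""
    if ours == theirs:
        return ours, []
    if ours == base:
        return theirs, []
    if theirs == base:
        return ours, []
    if all(isinstance(side, dict) for side in (base, ours, theirs)):
        merged, conflicts = {}, []
        for key in sorted(set(base) | set(ours) | set(theirs)):
            value, found = merge3(
                base.get(key, MISSING),
                ours.get(key, MISSING),
                theirs.get(key, MISSING),
                path + (key,),
            )
            if value is not MISSING:
                merged[key] = value
            conflicts.extend(found)
        return merged, conflicts
    # Both sides changed the same leaf differently: keep ours, report it.
    return ours, [path]


base = {"clause": {"text": "Pay in 30 days", "style": {"bold": False}}}
ours = {"clause": {"text": "Pay in 45 days", "style": {"bold": False}}}
theirs = {"clause": {"text": "Pay in 30 days", "style": {"bold": True}}}
merged, conflicts = merge3(base, ours, theirs)

# Two different edits to the same leaf are reported by path.
theirs_text = {"clause": {"text": "Pay in 60 days", "style": {"bold": False}}}
_, clash = merge3(base, ours, theirs_text)
```

A wording change and a formatting change to the same clause merge cleanly here because they touch different nodes, which is exactly what a line-based merge of the serialized file cannot promise.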
You are correct. But this is a culture issue. Culturally legal folk don’t see what they do as programming so they use different tools and work their processes their way. This is advantageous to outsiders who see through this! ;)
I think the main benefit would be to be able to represent yourself in court... otherwise there currently exist certain practical and ethical hurdles to capitalizing on this, such as passing a bar exam (non-trivial), providing credentials, operating in the best interest of your client/society...etc.
We used to have word processors that exposed mark up. I wrote immense amounts of documentation in Wordstar on 8 bit machines and it was definitely more efficient than the WYSIWYG word processors that came later and faster even when the newer ones were running on much faster hardware.
Something like Wordstar would be better than MarkDown.
Markdown isn’t detailed enough for legal stuff. Internal references, tables, complex section numbering require extensive post processing or simply don’t work. You quickly wind up with a lot of hidden magic that frustrates people used to Word.
Last time I lost patience with doing legal stuff in Word and evaluated alternatives, I was most optimistic about Asciidoc. Unfortunately the ecosystem was relatively anemic… the strong syntax was limited by the tooling.
Looks like there’s been some improvement, maybe I’ll try again. There’s a nice new homepage at least: https://asciidoc.org/
The IntelliJ AsciiDoc plugin is a little jewel with all the bells and whistles: syntax highlighting, preview, structure view, even refactoring of references. We use it together with Antora.
Lawyers have gotten rid of secretaries and so spend time (and billing) futzing around with document formatting, fonts, margins, bullets, numbering, autonumbering and the like.
To me, that just calls for a sensible UI with attractive styling and interaction over a git backend for the heavy lifting of tracking changes through time.
A lot of successful products have been built in this way. I've seen developers get upset with Apple for making successful products out of just giving a nice UI to a piece of open source tech that does the heavy lifting. Like it's cheating.
You should research what happened with distros and UI systems...open source was building lots of nice UIs and you could even have them on Windows/Mac, but there was constant drama over the "right" way to code a UI framework...which led to a ton of fracturing and leaning back towards minimalism (because the nice stuff was always very heavy).
This even happened with Microsoft, they had so many false starts and changes in messaging that they killed their own portfolios. I suspect at least that is why they "embraced linux" because it was excellent at web, and web wasn't busy changing every month (it has been, but that's a different story).
Apple introduced Swift but besides new Xcode versions I get the general impression their tooling has been far more stable.
Actually the little known "Review" feature of Word lets you visually track, approve, reject and comment on collaborative changes to a document in a really user-friendly way, no need for git here.
The standard for legal docs is to redline changes with an additional tool, because you don't necessarily trust the other contributors. They have decent tools for this, and the system works ok I suppose. Editing tends to be in tick-tock fashion anyways so I guess it works. You could do something like this with git and a markup language, but I don't know that you'd convince many lawyers that the juice was worth the squeeze.
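The point of an independent redline is that you compute the changes yourself from the two versions instead of trusting the other side's tracking. A minimal word-level sketch in Python with the standard library's `difflib` follows; the `redline` helper and the bracket markers are my own choices, not any legal tool's format.

```python
import difflib


def redline(old, new):
    """Mark deletions as [-...-] and insertions as {+...+}, word by word."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
            continue
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


marked = redline(
    "The Supplier shall deliver within 30 days.",
    "The Supplier shall deliver within 45 days.",
)
```

This works on plain text only, which is part of why the markup-language route is a prerequisite for doing it with git.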
I don't know the names. They all seem to have the pro Acrobat stuff, but more often use something also bundled with search tools perhaps? Communication between lawyers on opposite "sides" often seems to be by PDF, not source (although sometimes that too), so I imagined they both have working docs kept separate because they don't want to share some of the markup/comments. I asked one of them about that, and they claimed that using clean pdf output (no metadata or history) was worth the extra hassle as it avoided costly errors.
Anyway that's my limited experience having dealt with a bunch of them - no expert.
It's never pdf. You can't easily make corrections on a pdf, never mind major revisions (such as moving sections around). If someone sends me a pdf I ask for a Word document, or convert the pdf to Word myself. Sending someone a pdf is a little like saying "fuck you."
Confetiur.
Collegial lawyers don't send each other pdf's. They are impossible to mark up. One "innovation" I have seen is with banks. They do not want their employees to be creative; edge cases don't exist, everything is binary.
So the banks issue grids/tables containing a list of questions. The answers are found in the corresponding place on the table. Imagine 7 columns, all containing binary answers: "yes" or "no."
Except, everything is not binary.
So the 8th column contains, I dunno, 500 words, a minitable, etc., running over page after page. The other columns on these runover pages are blank.
And these are all pdf's.
> Not sure the language you choose matters as much as making the API usable by a wide audience.
Fully agree with this, and having typeset my master’s thesis and later my resume using LaTeX, I think that the “authoring experience” is definitely the place to focus on improving. LaTeX just takes too damn long to get something good.
If you’re interested in the “markup to document publishing” space, you might also be interested in the open-source report publishing tool I’m now working on, Evidence.dev (https://github.com/evidence-dev/evidence).
It’s similarly based on markdown, though it uses code fences to execute code, HTML style tags for charts and components, and {…} for JavaScript, e.g.
---
title: Lorem Ipsum
description: dolor sit amet, consectetur adipiscing elit
---
```sql petal_vs_sepal
SELECT
    petal_length,
    sepal_length
FROM iris_dataset_table
ORDER BY 1 DESC
```
<ScatterPlot
    title="Petal vs Sepal Length"
    data={petal_vs_sepal}
    x=petal_length
    y=sepal_length
/>
The longest petal in the dataset is {petal_vs_sepal[0].petal_length}.
Our design philosophy here is that the rendered documents should be beautiful by default, but highly configurable so you can get pixel perfect results.
We’re also aiming for first class output options for desktop, mobile, PDF and image export.
I always disliked that it was so difficult to interact with Word if you wanted to create automated documents. Instead I'd love it if there was a developer-first experience to create standardised documents from nice looking participation certificates, invoices, memos, documentation up to multi-tome histories.