The Future of Markdown

blasdel · on Oct 26, 2012

John Gruber's original Markdown.pl is one of the worst small programs I have ever read, completely riddled with outright bugs and misfeatures that continually bite its users in the ass. It's awful even by the already low standards of hand-written many-pass regex-based spaghetti-parsers.

Nobody should be using the original script, and unfortunately many of the other implementations out there are direct transliterations that replicate all of its absurd errors. For example, if you mention the MD5 hash of another token in the document, the hash will be replaced with the token, because it uses that as an inline escaping mechanism! Reddit got hit with an XSS virus that got through their filters because of it: http://blog.reddit.com/2009/09/we-had-some-bugs-and-it-hurt-...
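To make the hash-escaping problem concrete, here is a minimal Python sketch of the general technique. It is not Markdown.pl's code: `toy_render`, `ESCAPE_TABLE`, and the single emphasis rule are hypothetical stand-ins, and the only assumption carried over from the comment is that escaped characters are hidden behind their MD5 digests and restored in a final pass.

```python
import hashlib
import re


def md5_hex(text: str) -> str:
    """Return the MD5 hex digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical escape table: each protected character maps to its digest.
ESCAPE_TABLE = {ch: md5_hex(ch) for ch in "*_"}


def toy_render(source: str) -> str:
    # Pass 1: hide backslash-escaped characters behind hash placeholders.
    for ch, digest in ESCAPE_TABLE.items():
        source = source.replace("\\" + ch, digest)
    # Pass 2: stand-in for the formatting passes (emphasis only).
    source = re.sub(r"\*(.+?)\*", r"<em>\1</em>", source)
    # Pass 3: swap every placeholder back for its literal character.
    # This cannot tell a placeholder from a digest the author typed.
    for ch, digest in ESCAPE_TABLE.items():
        source = source.replace(digest, ch)
    return source


print(toy_render("a \\* b"))                  # the escape works as intended
print(toy_render("digest: " + md5_hex("*")))  # the typed digest is rewritten
```

The second call prints `digest: *`: the final pass treats any occurrence of the digest as a placeholder, so text the author typed gets silently rewritten after the earlier passes have already run.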

See the changelog for what started as a PHP transliteration and turned into a rewrite that squashed 125 (!) unacknowledged bugs: http://michelf.com/projects/php-markdown/

The worst part is that he outright refuses to either disclaim or fix his implementation, and so far he's repudiated everyone else's attempts to do so. He's a terrible programmer and a worse maintainer, he really still thinks the documentation on his site is comprehensive and canonical. As much as Jeff Atwood leaps at every chance to play the fool, there's no way his directorship can be anything but an improvement.

dandelany · on Oct 26, 2012

I'm so tired of this mentality that says basically, if you release something for free on the internet, you are obligated to maintain and support it for the rest of your life. Gruber created this program, for free. You are under no obligation to use it. Don't like it? Here's your money back. It may be true that the code is shit. If you think so, don't use it.

Like other responders, I worry that this mentality causes fewer coders to release their projects, for fear of backlash like this post. Think about it: Your feelings toward Gruber are incredibly negative and hostile, and in fact, you would have better feelings toward him if he had kept Markdown to himself and never released it at all. Does that seem fair to you? If the ill will generated by people like yourself outweighs the good will generated by those who appreciate the code I release, or if I fear that it might, what motivation do I have to release my code?

alanh · on Oct 26, 2012

The problem is not that Gruber doesn’t want to maintain Markdown. If that were it, perhaps it would be easier to move on without him.

It’s that he thinks the best option is to do nothing. He claims the title of BDFL without playing the role.

See his first reply to the Markdown mailing list in nearly three years: http://six.pairlist.net/pipermail/markdown-discuss/2012-Octo...

blasdel · on Oct 26, 2012

You have it absolutely correct

He enjoys the credit for being the creator of something used by millions every day, but is entirely unwilling to take the responsibility that comes with that creation being a very public mess.

It works for his usage, but all the ambiguities and undefined behaviors affect a huge number of people, and the only response he's made for eight years has been to retain sole moral authority and refuse to use it.

grey-area · on Oct 26, 2012

> It works for his usage, but all the ambiguities and undefined behaviors affect a huge number of people, and the only response he's made for eight years has been to retain sole moral authority and refuse to use it.

You gain moral authority by actually doing something. If you use markdown, and wish to propose a better markdown than markdown, go for it, I'm sure lots of people will be pleased, but bear in mind that lots of people will whine about any bugs as well, and expect you to work for them for free and spend significant resources and time fixing edge cases which don't matter to you personally, just because you released the source.

If you're ready for that though, of course go ahead and create a better markdown, or help with this proposed spec, Gruber certainly can't stop you. It doesn't really matter what Gruber says, what trolls on a mailing list say, or what you say on this forum about his responsibilities, whether he is self-appointed dictator for life etc (though I'd dispute that), what counts is putting the work in, which is often surprisingly difficult - far more difficult than criticism.

matthewowen · on Oct 26, 2012

Of course you can do that. I think people's problem is with Gruber's unwillingness to lend any support to those efforts to create a less buggy implementation.

Right now, the canonical markdown implementation isn't very good. It's hard to change what the canonical implementation is without Gruber's support. It doesn't even require much from him - just his blessing.

He isn't under any obligation to do this. No-one thinks he is. But it would be a good thing for him to do.

grey-area · on Oct 27, 2012

Frankly, I don't think what Gruber says matters here, and I see why he wouldn't want to engage all the trolls who are enraged by his inaction. Markdown works for him and he prefers it simple if somewhat vague/buggy; if you need something different, build it.

If someone wants to write a spec, a better parser or a better markdown full stop, there is absolutely nothing to stop them. The problem is, people would rather bitch about why no one else is doing anything and complain about Gruber than actually do the hard work necessary. Do the work first, then look to Gruber for canonization if you must, though by that point his views on the matter would be irrelevant.

ceol · on Oct 26, 2012

I don't think they're asking for lifetime maintenance and support. I think they're asking that if the author is aware of bugs and exploits, they should at least make the small effort to alert users who are still downloading their code.

If the exploits are as well-known as the grandparent comment asserts, and Gruber is aware of them, there really is no excuse for him to leave the code up without any warning that it contains known exploits. However, if he has no idea and everyone is assuming he knows without someone telling him, that's not exactly fair.

nikatwork · on Oct 26, 2012

It is far easier to destroy than create. Or to put it in the vernacular of the times: haters gonna hate.

tptacek · on Oct 26, 2012

What an embarrassing post to be occupying the top of this thread. Blaming Markdown.pl for security flaws? I suppose the memory corruption bugs in the "optimized" C Markdown parsers are somehow his fault too?

He wrote a text-to-HTML parser with a particularly elegant little language design and got on with his life, which involves writing more than keeping up with bug reports in Perl scripts. Get over yourself; comments like this make us all look bad.

cynicalkane · on Oct 26, 2012

The punchlines were this:

> unfortunately many of the other implementations out there are direct transliterations that replicate all of its absurd errors

> he outright refuses to either disclaim or fix his implementation

This is important to know if you are interested in Markdown.

Personally, I encountered edge cases almost as soon as I started using it.

gordonguthrie · on Oct 26, 2012

Except that the source code specifically tells you to report bugs to him.

We all write code with bugs and flaws and we sometimes release it online.

'Fix or deprecate' is not an unrealistic obligation on a technology journalist with a public persona and a large readership.

adgar2 · on Oct 26, 2012

> What an embarrassing post to be occupying the top of this thread. Blaming Markdown.pl for security flaws?

I believe markdown.pl is being blamed for over 100 bugs. Not just security flaws.

> I suppose the memory corruption bugs in the "optimized" C Markdown parsers are somehow his fault too?

Strawman, you're better than that.

> He wrote a text-to-HTML parser with a particularly elegant little language design and got on with his life

And he did a horrible job of it. Horrible. But he considers himself the BDFL of Markdown. Break that down for me.

> which involves writing more than keeping up with bug reports in Perl scripts

He clearly can't keep up with any bug reports, so it's good his life is more broad than bug reports.

> Get over yourself; comments like this make us all look bad.

No, comments like this make us look like we have higher expectations than "it worked on my machine, suck a dick!"

philwelch · on Oct 26, 2012

> And he did a horrible job of it. Horrible. But he considers himself the BDFL of Markdown. Break that down for me.

Christ, you're being a dick. All John Gruber did to you was design a minimalist markup language and write a quick-and-dirty proof-of-concept Perl script to implement it. Just use a better implementation and get on with your day.

Confusion · on Oct 26, 2012

If that was all he did, it would be fine. But it isn't. His website still encourages people to use his script and his specification, even though they are known to be buggy. If you publish something on the internet and it turns out to be wrong or defective, you have a moral obligation to point that out, especially if better alternatives are available.

jopt · on Oct 26, 2012

> If you publish something on the Internet and it turns out to be wrong or defective, you have a moral obligation to point that out

No you don't, that's insane.

Confusion · on Oct 26, 2012

Is it insane to try and keep the good stuff from disappearing under the garbage? Have you ever searched for something, only to come across a multitude of pages that were just incomplete, wrong or otherwise useless? A search term for which the gems are buried under so much manure you need all your Google-fu to find the gem? How do you think this will play out in the years to come?

People like Gruber, with an audience, a following, should set an example. If his code has bugs and he is informed of those bugs, he should take a few minutes to list those bugs. He doesn't have to solve them. He doesn't even have to point people elsewhere. Just listing them is enough and saves a lot of people a lot of time. If you can't be bothered to do that, please take your code down: it is nothing but pollution, keeping us from finding the better code.

olalonde · on Oct 26, 2012

This submission, which suggests forking or standardizing Markdown, has currently over 400 points. My guess is that "just use a better implementation and get on with your day" is not a good enough answer for many of us, including Jeff Atwood and David Greenspan.

Magenta · on Oct 27, 2012

>> But he considers himself the BDFL of Markdown

Well he needs to be something other than a pathetic apple fanboy! :D

cdmoyer · on Oct 26, 2012

This would be a lot stronger argument without the ad hominem bits. He's obviously not so terrible if he created this thing that has people so up in arms.

I think there's something to your post, but the tone makes me want to dismiss it. I know, stupid emotions.

This internet lynch mob mentality... I wonder how much this discourages people from releasing things. So, Gruber releases markdown.pl. People like it. People love it. People use it, people reuse it, people rewrite it. Next thing you know, he's being insulted on the internet because he released something he wrote to serve his needs and isn't passing on some sort of figurative mantle or blessing.

w1ntermute · on Oct 26, 2012

> He's obviously not so terrible if he created this thing that has people so up in arms.

Oh please, double standards like this disgust me. Microsoft had shit slung at it for years on end by the tech community because IE was terrible and held back innovation on the web, but no one claimed that IE is "not so terrible" just because everyone cared about it.

The difference with Gruber is that he's a darling of the tech community because he's Apple-anointed nobility. But as a programmer, in my eyes (and in the eyes of any other objective observers) he's absolute shit.

erikpukinskis · on Oct 26, 2012

The reason Microsoft is dead to many developers (myself included) is that they used their massive corporate power to shut down good startups making cool stuff.

I didn't care that IE was terrible (until v3 or whenever), I cared that Microsoft went to all the major PC manufacturers and told them that their licensing deals were toast if they preloaded Netscape.

I didn't care that Word was a crappy word processor, I cared that they used their market position on office documents to make minor incompatibilities that prevented WordPerfect from interoperating.

I didn't care that Windows file sharing wasn't half as good as NFS, I care that they continually fucked with the SMB protocol so that no one could sell UNIX machines that could share with Windows networks.

It has been an absolute pleasure watching that Microsoft's power over device makers disappear. The world is better off for it, and Microsoft will always be an asshole in my book.

angersock · on Oct 26, 2012

Well thank god we don't have any platform owners today making changes that screw over developers.

Tloewald · on Oct 26, 2012

There's a difference between trying to maintain app store policies that strike a balance between security, end user interests, developer interests, profitability, etc. and intentionally putting bugs in your OS to break third party products, stealing third party products and building them into your OS, or bundling free products with your platform or office suite to drive third parties out of a market.

angersock · on Oct 26, 2012

Agreed that the difference exists--that says nothing though about who's practicing what these days.

squidsoup · on Oct 26, 2012

No, the difference is that Gruber is a person whereas Microsoft is a corporation. Corporations aren't subject to the same social mores - it isn't hurtful to say something nasty about a corporation.

Abandoning basic etiquette that you should have learned in primary school and calling someone "absolute shit" is not cool.

skeletonjelly · on Oct 26, 2012

While I agree that calling somebody that is not cool, I think you'll find that product owners at corporations do take it to heart when harsh criticisms are called out. They identify with their output.

blasdel · on Oct 26, 2012

The thing is this isn't something new, I and a number of other people have been enraged by this for eight years now.

Over those years it's grown in usage exponentially, and so has his fame as a sportswriter for team Apple. Throughout that period he's continued to brush off all kinds of attempts to clean up bugs, define ambiguous behavior, or fix the security vulnerabilities. It hasn't mattered what approach people have taken, he just does not give a shit. Here's an example from last week: http://six.pairlist.net/pipermail/markdown-discuss/2012-Octo...

He's spent so long burning off any goodwill I would have for him on the matter, being cordial just isn't a priority anymore. NERD RAGE.

tptacek · on Oct 26, 2012

You've been enraged by the handling of a text-to-HTML converting Perl script written by a tech writer?

You're right: he just does not give a shit. I can see why. What possible upside could there be to engaging with someone who handles themselves like you are here?

stickfigure · on Oct 26, 2012

You've never been enraged by poorly written tools or technologies? There have been times I wished I could reach through my monitor and throttle some well-meaning idiot for wasting hours or days of my time.

Imagine that you're a mechanic working on a motor that uses the ProprietaryNew fastener (hex heads? so 20th century!). Unfortunately, the only ProprietaryNew wrenches are made out of cheap metal, and the sockets strip with regularity. You can't imagine saying a few nasty words about the parentage of everyone in ProprietaryNewCo?

jopt · on Oct 26, 2012

Gruber has a long-standing money-back policy

icebraining · on Oct 29, 2012

What about a time-back policy?

jarek · on Oct 26, 2012

You have not yet attained the Zen of HN :(

Tyrannosaurs · on Oct 26, 2012

Actually I think he does give a shit, he just feels that things could be a lot worse than they are and tinkering isn't necessarily going to improve things. Given how widely used it is there is some merit to this argument.

From a comment on Twitter yesterday in response to someone, it doesn't seem that Gruber is particularly interested in engaging with this (though I may be wrong). If that is the case I suspect we'll see a forking of the project and we'll get to see whether a committee will do better.

It may well do - Jeff has a good track record at getting things done and is well respected and well liked - but personally I wouldn't put my mortgage on it because as well intentioned as these things are we all know how design by committee usually turns out.

bigiain · on Oct 26, 2012

"Enraged" huh? For "_eight years_"?

Care to show us your alternative? It'll be on Github or GoogleCode, or maybe your personal blog, somewhere we can download it, try it out, and criticize it too, right?

Surely 8 years of rage is enough encouragement to write your own replacement for ~1000 odd lines of Perl?

Or by "enraged", did you mean "annoyed enough to write critical posts on random internet sites, but not motivated enough to spend an evening or two solving the problem myself"?

"NERD RAGE" indeed…

blasdel · on Oct 26, 2012

There are many great reimplementations; another would only compound the interoperability issue.

The real problem cannot be solved without some forward action on his part. He has refused over and over again.

bigiain · on Oct 26, 2012

"The real problem cannot be solved without some forward action on his part."

Sure it can - you write something better, then get everybody who's already using Gruber's version or one of its presumably also-flawed reimplementations to switch to yours. Or, is "8 years of enragement" really just keyboard-warrior-hyperbole on your part and an excuse to criticize someone who's achieved widespread adoption of some code you claim is 2nd rate, but which you haven't bothered to improve or replace?

Besides, it sounds to me like David Greenspan has come up with a fine solution.

jopt · on Oct 26, 2012

This is not a privilege issue. Nobody appointed Gruber. Just do a better job and you'll be fine without his permission.

SCdF · on Oct 26, 2012

> He's obviously not so terrible if he created this thing that has people so up in arms.

Being great at designing a format and writing code are two different things: one can be great at one while being terrible at the other.

blasdel · on Oct 26, 2012

The perceived simplicity of the format (driven by the naivete of the implementation) played a significant role in making it popular, but lays a minefield of bugs and ambiguities for implementors especially if they want any combination of sanity and interoperability.

gordonguthrie · on Oct 26, 2012

Wisest thing in this thread.

It is a great format.

The original parser (and specification) has serious problems.

squidsoup · on Oct 26, 2012

I find it disheartening that the top voted comment is so blatantly rude. I'm not sure what is gained by calling John Gruber a terrible programmer and maintainer. If you want to praise Jeff Atwood for taking over the stewardship of Markdown, great.

blasdel · on Oct 26, 2012

Go read Markdown.pl, and then consider the millions of uses it's had in the last eight years with neither maintenance, guidance, or abdication.

At some point there's no constructive criticism left to give.

His program is bad and he should feel bad

doesnt_know · on Oct 26, 2012

Christ, maybe I should pull all my open source projects that I no longer work on down from github for fear of people actually using it and then complaining when I don't maintain it indefinitely.

The dude released a script to the public under a free software license and people used it. If you think it's bad, fork it and fix it, otherwise don't use it, that's how the open source ecosystem works.

JeremyBanks · on Oct 26, 2012

There are a dozen MarkDown implementations that are better than Gruber's piece of shit, and a couple that are actually nice. The lack of a decent implementation is not the issue. The issue is that Gruber is (regrettably but naturally) seen as the authority on MarkDown by a whole lot of people, and he's still encouraging them to use his insecure, bug-ridden disaster instead of pointing them in the direction of something that anybody outside of his use case would ever want to use.

He just needs to add a sentence or two to his website and he'll save countless developers a ton of headache. He just doesn't give a shit.

It's pathetic.

Confusion · on Oct 26, 2012

Yes, you should take down such projects if you have been informed of problems and can't or won't take the time to document them. Infinite maintenance is a straw man, because that wasn't requested. Only some civility was requested.

Your project can cost people a lot of time if it promises, but doesn't deliver. I just spent quite some time searching for a decent XSD parser in Ruby, having to wade through a score of projects that promise to do what I want, but turn out to be incomplete, buggy or otherwise useless to me. Many people will perform the same quest and together a lot of time is wasted, which could have been prevented if people would not just publish any damned thing, but would also take the time to properly document its state.

Open source projects without proper documentation, I can do without. This problem will only become worse in decades to come. I sure as hell hope github will start purging old projects with too many 'not useful' votes within the next few years. Otherwise it'll be a morass of stink where the gems can no longer be found.

alexchamberlain · on Oct 26, 2012

I think the point he's trying to make is that if Gruber doesn't want to maintain Markdown, that's cool. However, he should announce that and let the community take over.

tptacek · on Oct 26, 2012

Grow up.

atacrawl · on Oct 26, 2012

Get over yourself. The guy wrote something that suited his own needs and released it so that others could use it too if they wanted. Programmers ported it to other languages because they liked the idea and wanted to see it thrive -- I've used the PHP port in many homespun web apps over the years. Is it perfect? No, and no software is. But when I've needed a script that easily converts line breaks and hyphens into paragraphs and unordered lists (the normal use case I've taken advantage of), it's done the job every time.

acuozzo · on Oct 26, 2012

> Is it perfect? No, and no software is.

You may want to look into seL4. (If you consider formal verification to be perfection, that is.)

oemera · on Oct 26, 2012

This is one of the reasons Why left the programming community. People are not thankful. Even if you release a great idea - with obviously _not_ the best code - the one thing you get is criticism. Maybe also the words that "you are the worst programmer on earth" following the words that "you put your family in such a shame".

You know what programmers think because of such comments? "My code is so bad I can't release it. Even if the idea is good." And this Sir helps no-one.

Stop this shit.

acuozzo · on Oct 26, 2012

> You know what programmers think because of such comments? "My code is so bad I can't release it. Even if the idea is good." And this Sir helps no-one.

I agree. This has even stopped me from __starting__ a few FOSS projects I've wanted to develop.

jarek · on Oct 26, 2012

Out of curiosity, what do you do now?

ta12121 · on Oct 26, 2012

I assume he means http://en.wikipedia.org/wiki/Why_the_lucky_stiff

lloeki · on Oct 26, 2012

That's why I write it _why, as it's unambiguous (and well, "_why".camelize him?)

jarek · on Oct 26, 2012

Thank you, my bad, I misread

Steko · on Oct 26, 2012

Gruber's Law: the highest upvoted comment for any Daring Fireball link or Markdown discussion on HN will tend to be a repulsive ad hominem whinge.

Tyrannosaurs · on Oct 26, 2012

What I love here is that not just satisfied with running down Gruber, you also feel the need to get in a little swipe at Jeff ("as much as [he] leaps at every chance to play the fool"), a man who seems broadly in agreement with you (though is far more constructive in his approach).

Stay classy.

kamaal · on Oct 26, 2012

>>John Gruber's original Markdown.pl is one of the worst small programs

Perl makes it look small; if you have to write something like this in Java or Python, multiply the LOC by at least 20. But I assure you it will be higher.

klibertp · on Oct 26, 2012

No. I was curious and checked - this: https://github.com/waylan/Python-Markdown is slightly below 3k loc mark (2977) excluding extensions, and by the look of it it's much more heavily commented (I didn't strip blank lines or comments at all). Markdown.pl has 1450 loc.

So - no, bullshit, you need to multiply by 2 at most in case of Python :)

draegtun · on Oct 26, 2012

Another interesting comparison would be with the Text::Markdown CPAN module which is Gruber's original code & comments converted to an OO interface with POD documentation added + bug fixes.

This comes (including blank lines, comments & POD) to 1739 loc - https://metacpan.org/source/BOBTFISH/Text-Markdown-1.000031/...

dgreensp · on Oct 26, 2012

Wow, I wasn't expecting my email to Jeff to end up as a front-page blog post!

The point here is that Markdown doesn't have a spec, nor do any of its variants to my knowledge, so I was proposing to come up with some Markdown-like language that does have a spec. Under discussion here is the more ambitious (but also appealing) plan of writing an official spec for Markdown, the same way JavaScript got a spec in the form of ECMAScript that we now identify with JavaScript itself.

A spec is a long, tedious, human-readable document that explains the behavior of a system in unambiguous terms. Specs are important because they allow us to reason about a language like Markdown without reference to any particular implementation, and they allow people to write implementations (Markdown processors) independently that behave identically. The Markdown Syntax Documentation is not a spec (it's highly ambiguous), nor is any implementation (not human-readable; some behaviors are probably accidental or incidental and difficult to port perfectly). The hard part of writing a spec is codifying the details in English, and secondarily making decisions about what should happen in otherwise ambiguous or undefined cases.

My motivation for working on a Markdown spec is first and foremost avoiding "bit rot" of content, which happens when we write content against one Markdown implementation and then later process it with another. We don't have this concern with HTML, JSON, or JavaScript, or at least we know what bounds to stay within to write code that will work on any implementation. This is achieved through specs, even if only implementers ever read them.

I would love pointers to Markdown processors that are implemented in a more principled way than the original code, for example using standard-looking lexing and parsing passes, but that still handle nested blockquotes and bullet lists together with hard-wrapped paragraphs.

greggman · on Oct 26, 2012

Specs are important but I'd argue a conformance test is equally or even more important.

With a conformance test people can test their implementations and when ambiguities arise new tests can be added or old tests fixed. Without, conformance tests different interpretations of a spec lead to divergent behavior. Once that behavior is out there long enough it becomes difficult to fix as people depend on it and/or its quirks.

mistercow · on Oct 26, 2012

I'm confused. How are you supposed to have a conformance test without first having a spec? To what would the implementations be conforming, if not a spec?

greggman · on Oct 26, 2012

Sorry, what I really meant to say is that specs are not sufficient. Sure, specs are important. But even with specs, without comprehensive conformance tests implementations will diverge.

I work on the WebGL spec and tests based off the OpenGL ES spec and tests. Even though the OpenGL ES spec is long and detailed, every edge case for which there is no test is broken or different in one driver or another.

jamesrom · on Oct 26, 2012

In an ideal world, the spec would be the test. A spec is just a human readable test. A test is just a machine readable spec.

ZeroGravitas · on Oct 26, 2012

It's entirely possible to build a conformance test suite around an existing 'black box' of code and accept that whatever it does, even stuff that might be considered a bug, is what any compliant implementation should also do.
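A rough sketch of that black-box approach in Python: record whatever the existing implementation produces, then hold other implementations to those recorded outputs. Everything here is hypothetical. `reference_render` is a toy stand-in for the existing code (it only wraps paragraphs), and the function names are made up for illustration.

```python
from typing import Callable, Dict, List

Renderer = Callable[[str], str]


def reference_render(source: str) -> str:
    """Hypothetical stand-in for the existing 'black box' implementation."""
    parts = [part.strip() for part in source.split("\n\n")]
    return "".join(f"<p>{part}</p>" for part in parts if part)


def record_expectations(reference: Renderer, inputs: List[str]) -> Dict[str, str]:
    """Freeze the reference's current behaviour, bugs included."""
    return {source: reference(source) for source in inputs}


def failing_cases(candidate: Renderer, expectations: Dict[str, str]) -> List[str]:
    """Return every input where the candidate's output differs."""
    return [
        source
        for source, expected in expectations.items()
        if candidate(source) != expected
    ]


def sloppy_render(source: str) -> str:
    """A second implementation that forgets to strip whitespace."""
    return "".join(f"<p>{part}</p>" for part in source.split("\n\n"))


inputs = ["one", "one\n\ntwo", "  padded  "]
expectations = record_expectations(reference_render, inputs)
print(failing_cases(reference_render, expectations))
print(failing_cases(sloppy_render, expectations))
```

The reference trivially passes its own recorded suite, while `sloppy_render` fails only on the padded input. Fixing an egregious bug then amounts to editing the recorded expectation for that input.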

(You could also fix some of the more egregious bugs, and then expect implementations to follow your new behaviour)

SideburnsOfDoom · on Oct 26, 2012

> Specs are important but I'd argue a conformance test is equally or even more important.

I agree, but the article mentions tests prominently. They didn't miss that out.

bergie · on Oct 26, 2012

Here are the tests we wrote for HTML-to-Markdown conversion. Could be useful when switched the other way around :-)

https://github.com/bergie/to-markdown/blob/master/test/tests...

draegtun · on Oct 26, 2012

Also worth looking at the tests for Text::Markdown which is the canonical version of Markdown on CPAN.

https://metacpan.org/source/BOBTFISH/Text-Markdown-1.000031/...

th · on Oct 26, 2012

> Without, conformance tests different interpretations of a spec lead to divergent behavior.

I think you mean either:

> Without conformance tests incorrect interpretations of a spec lead to divergent behavior

Or

> Without conformance tests different interpretations of an ambiguous spec lead to divergent behavior.

Conformance tests should test specific behavior defined in the spec. Tests will never (or rarely) prove spec conformance but spec conformance should always prove test conformance.

lmm · on Oct 26, 2012

I have yet to encounter a spec that was not in some way ambiguous.

jjr · on Oct 26, 2012

1 + 1 = 2 Is fairly unambiguous until you introduce operator overloading (and why i despise overloading).

_pferreir_ · on Oct 27, 2012

> 1 + 1 = 2 Is fairly unambiguous until you introduce operator overloading (and why i despise overloading).

It is only unambiguous if you specify the numbering system you are using ;)

</pedantic>

cynicalkane · on Oct 27, 2012

Is there actually a notation and + operator for which 1 + 1 = 2 is ambiguous? Only being semi-rhetorical, I'd be interested to know if there is one. (Modular arithmetic is not considered ambiguous, at least not in the math neck of the woods.)

josch · on Oct 27, 2012

Not really answering your question, but more to the point of the discussion in my opinion, a() + b() could be ambiguous if a() and b() are functions with side effects and the evaluation order of the +-operator is not specified.
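A small Python sketch of that point, with hypothetical functions `a` and `b` that mutate shared state. Python itself specifies left-to-right evaluation of operands, so the sketch simulates both orders explicitly to show what a spec that stays silent on order would leave open.

```python
def total_for(order: str) -> int:
    """Evaluate a() + b() with the operands run in the requested order."""
    state = {"x": 1}  # shared state that both functions mutate

    def a() -> int:
        state["x"] *= 10  # side effect: scale the shared value
        return state["x"]

    def b() -> int:
        state["x"] += 1  # side effect: bump the shared value
        return state["x"]

    if order == "a-first":
        left = a()
        right = b()
    else:
        right = b()
        left = a()
    return left + right


print(total_for("a-first"))  # 10 + 11
print(total_for("b-first"))  # 20 + 2
```

The same expression yields 21 or 22 depending only on which operand runs first, which is exactly the kind of detail two independent implementations can disagree on.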

ramses0 · on Oct 27, 2012

1 + 1 = 10b is what he's saying.

_pferreir_ · on Oct 29, 2012

Yes, that's what I meant - even "1 + 1 = 2" can be ambiguous if no context is provided.
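As a quick illustration in Python, the same sum can be written out in two numbering systems; the value is identical and only the notation changes.

```python
total = 1 + 1

decimal_text = format(total, "d")  # digits in base 10
binary_text = format(total, "b")   # digits in base 2

print(decimal_text)  # 2
print(binary_text)   # 10

# Reading "10" as binary and "2" as decimal gives the same number.
print(int("10", 2) == int("2", 10))  # True
```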

JadeNB · on Oct 26, 2012

> 1 + 1 = 2 Is fairly unambiguous until you introduce operator overloading (and why i despise overloading).

What is it specifying—or, perhaps a better question, what would an implementation look like? It might be more precise to say "I have yet to encounter an implemented spec that was not in some way ambiguous."

richbradshaw · on Oct 27, 2012

1 + 1 != 2 if the 1s are magnitudes of vectors - if the directions are opposite, then they can be 0. In fact, 0 <= 1 + 1 <= 2 in that system.

Dylan16807 · on Oct 31, 2012

That's not how addition works. You're either adding the vectors or you're adding the magnitudes. You don't get to sleight of hand by writing one and adding the other.

nollidge · on Oct 26, 2012

Overloading is a feature of natural language though, and therefore completely unavoidable in writing a spec.

anthonyb · on Oct 26, 2012

Hee hee, your interpretation of his sentence is introducing ambiguities :)

eslaught · on Oct 26, 2012

You should look at Pandoc[1]. It's a Markdown-to-everything converter, and though I'm not super familiar with the code I believe it's well written. One of my favorite tools for writing.

[1]: http://johnmacfarlane.net/pandoc/

kenko · on Oct 26, 2012

Pandoc is fantastic, the parsing code is clear, and its extensions to markdown are well thought out.

It's not just a markdown->everything converter, though. My install understands these:

Input formats: native, json, markdown, markdown+lhs, rst, rst+lhs, textile, html, latex, latex+lhs

Output formats: native, json, html, html5, html+lhs, html5+lhs, s5, slidy, dzslides, docbook, opendocument, latex, latex+lhs, beamer, beamer+lhs, context, texinfo, man, markdown, markdown+lhs, plain, rst, rst+lhs, mediawiki, textile, rtf, org, asciidoc, odt, docx, epub

Loic · on Oct 26, 2012

You also need to add that the parser of Pandoc is not regular expression based. It is a very robust parser, parsing the document in an internal abstract tree you can easily modify. You can also adapt the parser to your needs with simple extensions. You can then get very nice outputs in HTML or PDF[1]. As it is a nice command line tool, it is easy to generate everything with a simple Makefile too.

[1]: http://notes.ceondo.com/mongrel2-zmq-paas/

intractable · on Oct 26, 2012

+1 for Pandoc. It's awesome.

Also, it needs to be mentioned that Pandoc supports its own set of (IMO) sensible extensions / defaults to Gruber markdown [1].

[1] http://johnmacfarlane.net/pandoc/README.html#pandocs-markdow...

evolve2k · on Oct 26, 2012

Yes, I also came here to add a +1 for ensuring that any standards discussion includes Pandoc and its awesome implementation and converter.

From the link, scroll down to 'Pandoc’s Markdown' about a third of the way down the page for all the details. http://johnmacfarlane.net/pandoc/README.html

178 · on Oct 26, 2012

+1 for Pandoc's extensions. I can verify that they are sufficient for producing a doctoral dissertation (in German; not my own…), via LaTeX. The way Pandoc handles citations, footnotes, tables(!), etc. is just delightful.

mongol · on Oct 26, 2012

Yet another "Yay!" for pandoc. It is just so good.

buro9 · on Oct 26, 2012

I agree with all of this.

I'd also like to note that some parts of Markdown from the user perspective are non-intuitive and clumsy.

Such as links and images (inline).

Markdown works so well because it is intuitive and appeals to those who once saw old word processors. They don't have to worry about syntax, and can just enter their text into a textarea (free from JavaScript WYSIWYG interference and the inherent troubles of running that on old and new mobile phones, their playstation, web browsers, etc)... and it just works.

Yet some parts of markdown are simply not intuitive. Links and images are two places where I see in usability testing that the end user will constantly refer to help documentation to figure out how to do it.

Beyond getting the code consistent, maintainable, and testable I'd love to see the language itself solve some of the papercuts that trouble the lay end user.

Realising that what I was planning to do for my project (discussion forums, tumblr for forums) was to create an alternative to markdown that would resolve some of these user issues as well as parser issues, I had already decided that I would not call it markdown and that I would educate my users in something new that hopefully solves their and my needs and would remain very stable due to a well-thought out and documented design in the first place. If what you're proposing is in this vein then consider my hat thrown in, what help I can give I will... take me to your git repository.

ams6110 · on Oct 26, 2012

This is sort of my problem with the whole genre of "light" markup formats... ultimately they are still a bunch of arbitrary conventions to remember, and at that point you might as well just learn/write HTML. YMMV.

zmj · on Oct 26, 2012

That's not the problem being solved by Markdown. You don't want to be serving HTML written by users.

halostatue · on Oct 27, 2012

Strictly speaking, that isn't the problem being solved by Markdown, either.

Remember that Daring Fireball does not have comments, so this isn't a concern. Markdown was something originally created to solve an authorial problem, not something for forum creators to use for comments.

It's flexible enough to do that, but it isn't the purpose of Markdown.

georgemcbay · on Oct 27, 2012

Why not? You have to protect against malicious HTML injection whether or not the user is using markdown, plain text or html, so why not let them use a carefully restricted subset of html?

nsmartt · on Oct 28, 2012

While this is not wrong, markdown involves much less typing than HTML. Also, should one need to use HTML, markdown does allow one to switch between HTML and markdown.

wwweston · on Oct 27, 2012

They're arguably less arbitrary than HTML; most of them are based in things that people have done/tend to do with plain text when they're trying to create a formatted document. That yields two benefits: the source document is inherently formatted -- transparently so, since even people who don't know what Markdown or the like is will be able to read it -- and creates something of a visual mnemonic for the formatting convention.

Not saying any one implementation is perfect, just that they have these nice properties.

alexchamberlain · on Oct 26, 2012

What is your problem with the link syntax? I think it's one of the best bits.

fudged71 · on Oct 26, 2012

Having wasted a lot of my life reading miles of reddit comments, there are some people who don't understand the link syntax, or switch the brackets ()[]. It happens. But not that often, when the alternate way of posting a link is to just paste it in plain text with no title associated with it.

jacobr · on Oct 26, 2012

You have to remember 1) which comes first, url or link text? 2) is it () or []?

uvtc · on Oct 26, 2012

I look at it like this: [this] looks a bit like a button you might press (or a link you might click). It looks like what it does.

(this) looks like something you'd whisper off to the side --- to add to the conversation. For example, "That picture is nice (but it needs more trees).".

So links go like: [this is the thing you click](and btw, this is where it leads).

178 · on Oct 26, 2012

I struggled with that until I looked at it from a parser perspective: links start with [] because () are much more common in normal language.

spion · on Oct 26, 2012

There is actually an even easier method.

In normal text, we usually link to things by first citing them then providing the link. An example would be Google (http://www.google.com). It's natural that we surround the link in () in normal text.

Now the bit we add to help the parser is to tell it where the thing we're linking to begins and where it ends. [ and ] are not usually found in normal text and are therefore used for this purpose. Thus the modified example would be [Google](http://www.google.com).

SiVal · on Oct 26, 2012

I would much rather see a more mnemonic approach for links, where you would write either _click here_ [http://wherever.com] or _click here_ (http://wherever.com). The underscores look like an underline, which is the traditional appearance of link text in HTML. It's easy to remember. Whether the URL is in square brackets or parentheses (maybe either one would be acceptable as long as they are a matched pair) doesn't matter, but underlining the anchor text is the way to go, IMO.

daigoba66 · on Oct 26, 2012

Beautiful explanation. And one that I can remember while writing in markdown.

alexchamberlain · on Oct 26, 2012

I'm just wondering whether a better solution to this is a polite warning in the UI... Did you mean []() instead of ()[]?
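A rough sketch of what such a check could look like. The pattern and the function name `reversed_link_warnings` are hypothetical, and the heuristic only fires when the bracketed part looks like an http(s) URL.

```python
import re

# Hypothetical heuristic: "(label)[url]" where the bracketed part looks like a URL.
REVERSED_LINK = re.compile(r"\(([^()\n]+)\)\[(https?://[^\]\s]+)\]")

def reversed_link_warnings(text):
    """Return one suggestion per "(label)[url]" that was probably meant as "[label](url)"."""
    return [
        "Did you mean [{0}]({1}) instead of ({0})[{1}]?".format(m.group(1), m.group(2))
        for m in REVERSED_LINK.finditer(text)
    ]

for warning in reversed_link_warnings("See (Google)[http://www.google.com] for more."):
    print(warning)
```

This would live in the editor UI rather than in the parser, so the parser's output stays the same whether or not the hint is shown.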

ars · on Oct 26, 2012

I like the wikipedia syntax better for this [url text]. It's much easier to remember.

lloeki · on Oct 26, 2012

> It's much easier to remember.

How so?

Markdown has this brilliance in that it's not just random markup that produces HTML or some other output: its own human-targeted plain text output is its own source. As such, the link syntax is extremely easy to remember. When I write text, I want to read the label, whose reference is a link, hence the order is label then link. If I wrote this link in pure plain text I'd simply naturally write:

    Lorem ipsum dolor sit amet (http://www.amet.com), consectetur adipiscing elit.

So the order is the same in Markdown, since it's its own output. Still, the world is not ideal so we need a hint to tokenize a little, and since parentheses surround the link, let's use square brackets. This also fits very well with the array/hash/dictionary syntax, where you associate a key sitting between square brackets with a value.

If I want to make a more remote reference so as not to interrupt the reading flow with a big link (remember that markdown source is its own plain text output) I can simply do what I'd do if I inserted a reference, that is annotate the label.

    Lorem ipsum dolor sit amet[1], consectetur adipiscing elit.

The footer naturally follows, as we again assign a key to a value. And since there can be an optional title to the link, it comes afterwards so that urls vertically align.

    [1]: http://www.amet.com "optional title"
    [2]: http://adispicit.org "optional title"

Again, I find brilliance in Markdown in that it leveraged long (as in decades) established conventions (e.g headers, italics, lists, quotes, code blocks...) that apply directly to plain text in order for the source to be its own output, so that even for someone who doesn't have the first clue about Markdown it's readily readable as if it were not markup.
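The key-to-value reading of the footer can be sketched in a few lines. The patterns below are simplified assumptions for illustration, not the grammar of any real Markdown implementation, and the names `REF_DEF`, `INLINE`, `collect_references` and `inline_links` are made up.

```python
import re

# Footer definition:  [key]: url "optional title"
REF_DEF = re.compile(
    r'^[ \t]*\[([^\]]+)\]:[ \t]*(\S+)(?:[ \t]+"([^"]*)")?[ \t]*$',
    re.MULTILINE,
)

# Inline link:  [label](url)
INLINE = re.compile(r"\[([^\]]+)\]\((\S+?)\)")

def collect_references(text):
    """Map each footer key to its (url, title) pair; title is None when absent."""
    return {m.group(1): (m.group(2), m.group(3)) for m in REF_DEF.finditer(text)}

def inline_links(text):
    """Return (label, url) pairs for every inline link in the text."""
    return INLINE.findall(text)

source = (
    "Lorem ipsum dolor sit amet[1], consectetur adipiscing elit.\n"
    "\n"
    '    [1]: http://www.amet.com "optional title"\n'
)
print(collect_references(source))
print(inline_links("Try [Google](http://www.google.com) first."))
```

The footer really is a small dictionary: the bracketed key on the left, the URL and optional title on the right.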

alexchamberlain · on Oct 26, 2012

The problem is that this is a discussion about standardising the Markdown syntax, not changing it.

buro9 · on Oct 26, 2012

The fact that Meteor wish to call this Rockdown suggests to me it is to change it.

By creating a specification and tests, they will be modifying Markdown even if just subtly enough to remove bugs and ambiguity. Doing even that is to create something that isn't Markdown, existing Markdown parsers may do something different with the same text... so clearly this isn't Markdown.

It may share 90% (or more, or less) similarity with Markdown, but some areas would certainly need tweaking and changing to make the goal (specification, testability, consistent implementations) possible.

To my understanding this means that Rockdown is Markdown inspired/derived.

And as such, that there is a hope that a good specification and the ability to be implemented in a testable and consistent way... will, by mass use (on the sites mentioned), eventually result in Rockdown superseding Markdown.

In my universe, I'm simply looking to get rid of bbcode as the markup syntax of choice. And if you were at the support end of trying to tell people how to insert images on forums then you'd understand that the Markdown syntax is a real difficulty for a lot of people.

The path I prefer is to keep Markdown but modify the areas that cause the greatest difficulty in usability testing, making those more implicit and frictionless. Worrying about []() (or vice versa) is an obstacle to the user that shouldn't exist. But if everything else is good from a user perspective then I definitely want to keep that stuff.

ars · on Oct 26, 2012

Wikipedia also uses markdown. So since there is no standard format for URLs, let's pick a good one.

Also, it could support both of these syntaxes at once - they don't conflict.

tripzilch · on Oct 26, 2012

I still get those the wrong way around [text url] often enough. I only mess up MarkDown slightly less often because I write more of it.

The "() is more common than [] in regular text, so it makes parsing-sense to start a special markup code with []" is going to help me with that, though. That particular mnemonic only works for coders, though :)

Also, doesn't MediaWiki syntax sometimes also have a pipe | between the url and the text in a link? Or is that only for internal links?

buro9 · on Oct 26, 2012

A parser should not be an interactive application.

bitcartel · on Oct 26, 2012

Why are people still interested in Markdown - what problem is it solving and for whom?

It's 2012 and we have HTML 5 compliant WYSIWYG text editors. We no longer have to write plain text littered with special codes, for the purpose of running through a parser, to produce HTML which looks nice on a web page. Maybe it made sense a decade ago when web forms had terrible editors, but not anymore. I think Joe Internet writing blog posts and forum replies would agree with me.

For developers, a README file in plain text looks great everywhere, and avoids any Github vs Bitbucket vs Assembla display issues. If you need to write structured documentation for a system, there's probably already a designated markup language you're supposed to use, so Markdown doesn't help there.

ChiperSoft · on Oct 26, 2012

I think you are grossly overestimating the quality of current rich text editors... None of them produce reliable clean code. Some produce less shitty code than others, but they do so through tremendous amounts of post-processing.

Markdown just works and always produces clean, easy to style code.

jdbernard · on Oct 26, 2012

I am in love with plain-text files. I edit them easily no matter what device I am on, whether it is my desktop, my phone, over SSH, in a browser. I can use my more powerful text-editing software instead of being stuck in a weak WYSIWYG editor.

My blog is just a directory of Markdown files that get compiled into HTML files. No cumbersome WordPress or other blog software installation I need to maintain, just copy the HTML to the server. Simple.

Same thing with documentation. Write once, from anywhere, output clean code to multiple formats: HTML, LaTeX (and PDF, DVI, etc. from there). No need to worry about making sure complex documentation toolchains are installed, just a Markdown compiler. A huge advantage of Markdown is that the source is as readable as the output. JavaDoc, for example, sucks in this regard if you try to do any formatting (lists, tables, etc.). If I am reading the source code--and I prefer to read the source if I can--then the javadoc is often useless for complex functions where it is most needed, because the HTML formatting is unreadable. I am actually building my own documentation system because of this (see https://github.com/jdbernard/jlp if you are interested).

keithpeter · on Oct 26, 2012

"Why are people still interested in Markdown - what problem is it solving and for whom?"

I understand the question and wonder why it attracted down-votes. The executive summary of my answer: light markup is used in places other than text areas on Web sites.

Many of us use Markdown to keep formatted text in, and, ultimately, to generate valid html markup from. Markdown is not just for textareas in Web pages. Whole books have been marked up in markdown.

Markdown files are just text files with 'explicit' markup within them and so will always be accessible/editable/processable in the future. Markdown is 'flexible' in the sense that it allows alternative markup for the same output, and is 'light' in the sense that a limited range of styling and block formats are available. Compare that with LaTeX which shifts with each tex-live release, and which means that markup written some years ago may not format without hand editing on a modern LaTeX release.

There is the reference implementation available from Gruber's site, and, as others have mentioned, pandoc implements markdown in a way that can be extended. I'm dubious of the need for an 'official' specification; I rather like the scrappy streetwise nature of Markdown.

JadeNB · on Oct 26, 2012

> Compare that with LaTeX which shifts with each tex-live release, and which means that markup written some years ago may not format without hand editing on a modern LaTeX release.

What? Certainly details of individual packages may change, including addition or removal of functionality, over time, but I don't even know what it means to say that "LaTeX shifts". The underlying markup language for LaTeX, which is the same as that for TeX, is very flexible, but has had the same flexibility since at least 1982, if not 1978.

I understand the idea that built-in extensibility leaves a platform, to some extent, to blame for its plug-ins; but many people in this discussion have proposed building some sort of plug-in architecture for Markdown, too.

bitcartel · on Oct 26, 2012

Sure, but why are we using the lowest common denominator, plain text, and then adding formatting?

Why don't we start with something like a visual HTML or DocBook editor, and then apply parsers to generate plain text, PDFs, etc?

Is it because we as programmers live in a plain text environment and are happy with the status quo, or is it because the tools aren't good enough?

dspillett · on Oct 26, 2012

> Why are people still interested in Markdown - what problem is it solving and for whom?

For techie types it is familiarity: as a programmer I've used mark-down style syntax in code comments (and emails back when plain text was the norm) for years so it feels natural. I can type and edit text with mark-down faster than with full WYSIWYG control (even if the editor has good keyboard shortcuts). Also I can use the same text+markdown in code, emails and documentation that is desired to look a little fancier - I don't need to reformat for a second audience. Markdown isn't perfect, but for me it is a better solution for some things than WYSIWYG.

Simplicity and resulting cleanliness of mark-up can be an issue sometimes. I've seen WYSIWYG editors get tied in knots over bad HTML that they generated themselves.

Simplicity is another bonus for sites taking in user content, particularly sites like the StackExchange family. By effectively having a whitelist of formatting options you can keep a consistent look on your site more easily. If you aren't sure what problem this solves, think Geocities.

It makes security a little easier too: you don't have to worry so much about potential injection attacks that are very possible if trying to generate/accept full HTML and filter it for the bad stuff afterwards (you still have to be careful to think about these issues of course, but your available attack surface is much smaller and therefore easier to manage).

> Github vs Bitbucket vs Assembla display issues

This is the very problem that people are actually discussing - competing interpretations make mark-down less useful than it could otherwise be, and a global standard might help that issue if it gets widely adopted. Of course many incompatible interpretations already exist and any new standard will face adoption friction, so we may end up in a situation like the one parodied in http://xkcd.com/927/ (it worked well enough for Javascript in the long run, but I think there were fewer common and slightly incompatible implementations of that at the time than there are of markdown now).

silvestrov · on Oct 26, 2012

Markdown is fantastic because it restricts the formatting.

When users copy/paste from MS Word into HTML editors the paste often keeps the font from the Word document which is not the proper font for the html page. That results in a messy page layout.

In my experience it's easier for users to understand Markdown syntax than to understand the "hidden codes" in HTML editors when they want to fix the layout.

bitcartel · on Oct 26, 2012

Sure, but if we could paste HTML between editors and browsers without any rendering or layout issues, there would be no reason for users to look beneath the surface. I do agree that we're not quite there yet.

dspillett · on Oct 29, 2012

There is still the need for the destination system to fully validate the input to catch badly formed content which may cause unpredictable issues and (more importantly) to screen out injection attacks, which may not be a trivial task. Relative to validating markdown-decorated content it is far from trivial.

Gravityloss · on Oct 27, 2012

You can use markdown as source for formatting, but the source is very readable by itself.

This is extremely important for writing and also for fall back purposes. Any time there is any kind of a problem in writing or reading or interpreting, just fall back to source.

People can say html is human readable but the difference is huge. I didn't even know that markdown is parsed on hacker news, I just used stars as underscores as natural formatting.

There's also potentially much much less overhead. Ultimately, browsers and your cell phone apps etc should just format markdown natively, meaning huge performance boosts, as html, css, js etc has just become so complex for many tasks as a consequence of do-everything.

My phone's email client struggles browsing back and forth of what should be just a few small text files and file attachments. It uses way too many cycles since it's also a jpeg and css and js renderer with scaling and everything. As someone who's used text email and news groups, lightweight formatting is very natural and informative. Those services would not have been possible in the past if they had been bogged down with large amounts of excess complexity.

I guess reducing overhead is not sexy because it doesn't involve new features. In the future everybody will just create their typed static oneliner email as 10000x10000x10000 pixel 100 fps 3D movies of 2 minute length (you see, that's the time it takes to read it), and if some supposedly slightly technically literate person asks that perhaps they should use a word 2020 .doc instead for reduced file size, the answer is "but you can't do a space station flythrough in it".

thenomad · on Oct 26, 2012

The sheer popularity of Markdown implies that there are some use cases for it. For example, Reddit uses Markdown exclusively, and I don't believe ALL their users are highly tech-savvy.

Personally, I don't trust or use (if I can avoid it) any on-the-web WYSIWYG text editor, because a) they tend to be slow and clunky and b) it's difficult to predict what HTML they will turn out a lot of the time, and I will often end up needing to edit that HTML.

Also, for common blogging / commenting tasks, Markdown is as fast or faster to use than WYSIWYG editors. I've been faffing around with a forum that won't let me use Markdown recently, and forces me to use its clunky WYSIWYG editor to add lists into my posts. It didn't take long for me to start yearning for the simplicity of linebreak - asterisk - item rather than a multiple-click WYSIWYG process.

bitcartel · on Oct 26, 2012

If we assume WYSIWYG editors keep on improving, at some point, they'll become good enough. What then for Markdown?

falcolas · on Oct 26, 2012

Personally? I would still prefer to write plain text with a bit of simplistic markup.

YMMV, but I've never been fond of WYSIWYG editors. I'd rather say what I mean, instead of say what this other guy thinks I mean.

vidarh · on Oct 26, 2012

I find it more comfortable to write Markdown than to use a WYSIWYG text editor. That the WYSIWYG editors tend to produce really shitty output doesn't make things better.

Markdown and similar formats can also easily be used to produce other output than HTML.

And Markdown is plain text, and that's another appealing factor: Even if you read it somewhere that won't render the Markdown, it is still readable. The same can't be said about HTML.

fudged71 · on Oct 26, 2012

For some people and some uses, plain text is just more comfortable.

I love the ability to scratch something down in Markdown really quickly, and being able to convert it to LaTeX if I want to, without any hassle.

I'd love to write blog posts in Markdown. Google+ uses the *italic* and _bold_ syntax in their own posts, which is very useful.

I don't exactly know why it's better, to be honest. Maybe WYSIWYG editors are inherently dependent on their own implementations and rely on people learning the interface.

Text-As-An-Interface is beautiful in its simplicity. Markdown is the perfect mix of simple markup without the insanity of BBCode.

sopooneo · on Oct 26, 2012

But is it plain text? If it is, then we can make an argument that html documents are as well.

fudged71 · on Oct 26, 2012

I think that's the difference, though. You can read it easily without being compiled. HTML is very markup-intensive.

omaranto · on Oct 26, 2012

1. Could you recommend some of these good WYSIWYG editors you claim exist? The ones I've seen are all slow or awkward (basically not having enough keyboard shortcuts) or get confused when I delete part of the text (and usually have all of these problems).

2. It sounds like you are only concerned with producing HTML. I like using markdown for everything. Using the excellent Pandoc I produce both HTML and pretty PDFs from markdown text, with nice looking formulas too. (PDFs are produced through LaTeX; for HTML Pandoc offers several options, I use MathJax).

bitcartel · on Oct 26, 2012

As an end-user I find the Wordpress editor to be pretty good, and Google Docs ok. The Aloha editor looks great: http://aloha-editor.org/index.php

Pandoc looks great. So why not write in HTML or Docbook and then convert to plain text / RTF / PDF etc?

tripzilch · on Oct 26, 2012

It's not just for comments and blog posts.

I often find myself writing short plain text files, usually quick personal notes. Occasionally they grow larger than what I'd call a "short" plain text file, and at this point two extra requirements pop up:

1) they are going to need some structure.

2) odds are, that I'm going to want to show this document to someone else at some later point in time[1].

Markdown offers (to me) by far the easiest transition from "basic plain text note" to "slightly longer formatted document", because most times, all I have to do is continue writing the way I was doing already.

[1] corollary: as programmers know, when reading their own code, "someone else" also includes you in three months time.

Adrock · on Oct 26, 2012

In addition to all the other great answers here:

1. You can pry my {Emacs,vim} from my cold, dead hands.

2. Version controlled markdown is much easier to deal with.

gordonguthrie · on Oct 26, 2012

Yeah, but whereas git and other version control systems are great with lines of code - and diff works well on that - for text we need a diff that works on lexical units within a (much longer) line or structure.

I know git can be configured to use an arbitrary programme for diff - just not sure where to find one...
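For what it's worth, `git diff` does have a `--word-diff` option that marks changes within a line. The general idea of diffing on words rather than lines can also be sketched with Python's standard `difflib`; the function name `word_diff` is made up here, and splitting on whitespace is a deliberately crude stand-in for real lexical units.

```python
import difflib

def word_diff(old, new):
    """Return (operation, old_words, new_words) for each span that differs, word by word."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"   # keep only the changed spans
    ]

print(word_diff("the quick brown fox", "the quick red fox"))
```

Because the comparison ignores where the line breaks fall, re-wrapping a long paragraph would not show up as a change, which is most of what prose needs from a diff.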

chokma · on Oct 26, 2012

Many users like their WYSIWYG text editors (even though they are often problematic between older browser versions or with "paste-from-word" or conflicts with other JS-includes). And the expectations are rising - one customer of ours would like a WYSIWYG-editor "with Apple style interface" in the browser, with all the bells and whistles of a full blown publishing solution including complete layout control, input validation, CMS support and so on. The old TinyMCE fields of his legacy app will no longer cut it.

Meanwhile, I write the documentation for my software in Markdown, because it is the fastest way for me to get structured text from my mind to my machine. And with the help of pandoc, it's convertible into all other text formats, from HTML to PDF and LaTeX etc. (Although the html-with-images to PDF support of wkhtmltopdf is even better, so it's Markdown => pandoc [for HTML] + wkhtmltopdf [for HTML to PDF]).

krsunny · on Oct 26, 2012

Because in 2012 there are still lots of plain textareas, like the one I'm writing this reply in.

sweetdreamerit · on Oct 26, 2012

using Markdown you can:

- track versions with tools like git

- using pandoc, you can transform it in html, pdf, latex, beamer, rtf, slidy

- you are sure you are separating style from content

- it can be used everywhere: using console editors like vim (or the simple nano, that I love), text editors like notepad+, gedit. I found out a very nice android app, Epstile, that uses markdown and syncs the data using dropbox.

- just this morning, in the office where I'm working, there was a discussion like "this doc was saved using office 2010 or office 2007?". Using markdown, you can avoid such a mess.

city41 · on Oct 27, 2012

Most tech-type people don't like WYSIWYG. I know I don't. The product that my company makes has a very nice WYSIWYG editor built into it. For a hackathon I replaced it with a Markdown editor and all the other developers went nuts (sadly, for various reasons, it never made it into the real product).

The main appeal is Markdown lets us write formatted text without ever having to use the mouse or figure out what the current WYSIWYG editor's key commands are. Often times -- like making nested bulleted lists -- there just plain aren't any key commands. I'd much rather type a few square brackets and create a nice hyperlink on the fly than invoke some silly menu, paste in my link in the window that pops up, create the display text, etc etc.

alexchamberlain · on Oct 26, 2012

Safety.

If you exclude HTML tags, Markdown is safe. People can't inject code.

I know this can be done for HTML as well, but it's harder.
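A toy sketch of the idea, assuming a converter that supports only `*emphasis*`. The function `render_comment` is hypothetical; the point is the order of operations: escape everything first, then emit only the tags you chose to allow.

```python
import html
import re

def render_comment(text):
    """Escape all raw HTML, then convert one whitelisted construct (*emphasis*)."""
    safe = html.escape(text, quote=True)                    # neutralise any user-supplied markup
    return re.sub(r"\*([^*\n]+)\*", r"<em>\1</em>", safe)   # the only tag this converter ever emits

print(render_comment("*hi* <script>alert(1)</script>"))
```

This only holds as long as every later pass preserves the escaping, so the safety depends on the whole converter, not just the first step.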

blasdel · on Oct 26, 2012

Unfortunately, that is absolutely not true in any implementation derived from the original Markdown.pl without a whole lot of bugfixes.

Because of the chickenshit way he attempted to escape things by replacing them with their MD5 hashes and then switching them back, you can encode anything you want to be output by the markdown processor. Reddit was owned by a viral XSS comment because of this in 2009, it's been exploited a few other times since.
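Here is a deliberately simplified illustration of that class of bug. It is not the actual Markdown.pl code or the actual Reddit exploit; the function names `md5_hex` and `naive_protect_and_restore` are made up for the sketch.

```python
import hashlib

def md5_hex(s):
    """Hex MD5 digest of a string, used here as a placeholder token."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()

def naive_protect_and_restore(text, protected):
    """Toy hash-based escaping: hide `protected` behind its MD5, then swap it back."""
    token = md5_hex(protected)
    hidden = text.replace(protected, token)    # step 1: hide the fragment
    # ... other processing passes would run on `hidden` here ...
    return hidden.replace(token, protected)    # step 2: restores EVERY occurrence of the hash

payload = "<script>"
user_input = "hello " + md5_hex(payload)       # the user types the hash, never the tag itself
print(naive_protect_and_restore(user_input, payload))
```

The restore step cannot tell a placeholder it inserted from a hash the user typed, so text that never contained the fragment ends up containing it in the output.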

bitcartel · on Oct 26, 2012

Found the info, interesting. Easy to think plain text is safe and forget about the parser.

http://www.f-secure.com/weblog/archives/00001777.html

http://blog.reddit.com/2009/09/we-had-some-bugs-and-it-hurt-...

Too · on Oct 26, 2012

WYSIWYG always means either 1. proprietary file format and/or 2. Shitty code generation.

Neither works very well with version control or scenarios where you need compatibility between different systems.

Plain text ALWAYS works.

tomp · on Oct 26, 2012

Most WYSIWYG require the use of a mouse.

peteretep · on Oct 26, 2012

I can't be bothered to learn LaTeX to mark up my Haskell assignments, so I've been using Markdown + LHS

halostatue · on Oct 26, 2012

https://github.com/jgm/peg-markdown and https://github.com/fletcher/peg-multimarkdown

I haven't looked at the implementations of these, but they are most certainly grammar-based, not regex-based.

someone13 · on Oct 26, 2012

Another good parser is python-markdown2 [1], which has a very extensive set of test cases, and the code is very readable and well-commented.

[1] https://github.com/trentm/python-markdown2

arnarbi · on Oct 26, 2012

I do not agree that a spec has to be written in English. As in most natural languages, it is very easy to be ambiguous and to underspecify in it. Resolving this can make English just as human un-readable as anything else.

Specs like this should be written as executable reference implementations in a well defined programming language. This can very well be human readable, and should be done without regard for efficient execution. It's less ambiguous, amenable to automated conformance testing, and is easier to evolve than a natural language document.

chc · on Oct 26, 2012

If that's the spec, how do we know which aspects of the executable are really part of the specification and which are implementation details? For example (a somewhat extreme one, for the sake of illustration), if the spec is written for Ruby 1.8, do C implementations have to intentionally slow down to match its execution speed?

arnarbi · on Oct 28, 2012

Of course not. Such a spec would state what are its inputs and outputs, either as natural language descriptions or as types (say, if the reference language supports it).

Everything outside that (e.g. timing) would be fair game to optimize. Moreover, as I hinted at, a spec should not be optimized for speed or efficiency, but rather readability and unambiguity.

chc · on Oct 28, 2012

> Of course not. Such a spec would state what are its inputs and outputs, either as natural language descriptions or as types (say, if the reference language supports it).

So you agree it wouldn't work for specs to be "written as executable reference implementations in a well defined programming language"?

> Everything outside that (e.g. timing) would be fair game to optimize.

But sometimes you want to specify performance characteristics.

The simple fact is that without just embedding a traditional spec within it, an implementation cannot tell you what aspects of its behavior are "specified" and which are just implementation details. This makes reference implementations inadequate specifications. That's all I was getting at.

arnarbi · on Oct 28, 2012

> So you agree it wouldn't work for specs to be "written as executable reference implementations in a well defined programming language"?

No, then I would be disagreeing with myself. :)

> But sometimes you want to specify performance characteristics.

For a Markdown processor, hardly. But...

> ... cannot tell you what aspects of its behavior are "specified" and which are just implementation details.

That is a fair point indeed. I was merely disagreeing with the linked article's claim that a spec should be in a natural language, and I still hold to that disagreement. Such specs generally have problems with underspecifying, leaving too much up to implementation, which often ends with a few main implementations sort of becoming the actual practical spec. Look at browsers over their history for many examples. Your point seems to be that reference implementations might overspecify instead. This is true, but is easier to compensate for IMO. For example with clear types, natural language parts or even by conformance test suites being part of the specification.
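A conformance suite of that kind could be as small as a table of inputs and expected outputs plus a runner that accepts any implementation. This is a minimal sketch under assumptions: `failing_cases` and `CASES` are hypothetical names, and the two cases are illustrative rather than taken from any published suite.

```python
def failing_cases(render, cases):
    """Return (source, expected, actual) for every case the renderer gets wrong."""
    failures = []
    for source, expected in cases:
        actual = render(source)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures


# Hypothetical cases; a real suite would be published alongside the spec.
CASES = [
    ("*hi*", "<p><em>hi</em></p>"),
    ("plain", "<p>plain</p>"),
]
```

Any function that maps Markdown source to an HTML string can be passed in as `render`, so the same table checks every implementation regardless of how it is written internally.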

Too · on Oct 26, 2012

Use well-established standards. For example, BNF or railroad diagrams can cover the majority of a language specification without using English.

chc · on Oct 26, 2012

I think you're unintentionally moving the goalposts here. The idea I was questioning wasn't "Specs should be written in something other than English," it was "Specs should take the form of an executable reference implementation." AFAIK railroad diagrams are not executable. I'm also not sure how BNF is supposed to specify things like "the map function applies a function passed in to each argument in a list".

blasdel · on Oct 26, 2012

IETF standards are required to have multiple interoperable implementations of the usually post-hoc English-language spec.

One implementation can never be enough.

fiddlosopher · on Oct 26, 2012

[lunamark](https://github.com/jgm/lunamark/tree/master/lunamark) is another PEG-based implementation.

uvtc · on Oct 27, 2012

I'm not so sure what's needed here is a spec.

What people really want is for all Markdown implementations to be basically the same so they don't have to learn any implementation-specific idiosyncrasies to switch from one to another.

The problem though, as I see it, tends to go away under certain circumstances. And the circumstances, I think, are these:

* If you've got a strong implementation, robust, actively-maintained, and runs fast,

* is easy to obtain, install, and use,

* is well tested and well documented, and

* has just the right blend of sensible additions to the syntax (for example, tables, def lists, LaTeX math, etc.) --- done tastefully,

then folks will just use that, model their own implementations after that, and just overall start considering that to be the standard.

I think this has been slowly and steadily happening with Pandoc.

And, aside from all that, two additional "killer features" that Pandoc seems to have over other implementations:

1. it can convert to/from other doc markup formats, thus making it easy to just convert your existing docs to pandoc-markdown and then use that as your master source format to generate other formats you might need; and

2. with its carefully-chosen set of additional features, it has been slowly proving itself capable of being a replacement for raw LaTeX for certain types of longer technical documents.

My understanding is that there's even some features in the works (for the next release) for converting between markdown dialects --- which would make it even easier to convert markdown files of various flavors into plain standard pandoc-markdown.

So, if you're looking for a standard, I'd suggest that it's for the most part already here. :)

draegtun · on Oct 26, 2012

>I would love pointers to Markdown processors that are implemented in a more principled way than the original code, for example using standard-looking lexing and parsing passes...

Have a look at Markdent - An event-based Markdown parser toolkit.

https://metacpan.org/module/Markdent

gordonguthrie · on Oct 26, 2012

I wrote a manual parser with look-aheads in Erlang (making extensive use of Erlang's pattern matching).

https://github.com/hypernumbers/erlmarkdown

No regexps... except for URLs and emails :(

aleemb · on Oct 26, 2012

I hope some more thought goes into the usability as well. I find the syntax for links confuses a lot of users for whom I have enabled Markdown editing. Wiki style [[link text]] works great if you assume links have no space or maybe something similar. Similarly the syntax for images has also always bothered me.

vidarh · on Oct 26, 2012

I have no problems remembering the Markdown link format, but I never remember the expected order in what you describe as "Wiki style" (which is really a style employed by some wiki software - there is plenty of wiki software that uses other syntaxes, including Markdown).

ExpiredLink · on Oct 26, 2012

IMO, for the spec the most important point is to add mandatory versioning information on the first line of markdown scripts, e.g.

    _markdown_version_ Rockdown_1.0

This would allow processor implementers to support more than one markdown version (for a transition period or in general).
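A minimal sketch of how a processor might read such a line, assuming the `_markdown_version_` marker from the example above. The function name and the `"unversioned"` fallback are hypothetical, not part of any existing tool.

```python
def split_version_header(text, default="unversioned"):
    """Return (version, body) for a document with an optional version line."""
    first, _, rest = text.partition("\n")
    parts = first.split()
    # Only treat the first line as a header if it has the exact marker form.
    if len(parts) == 2 and parts[0] == "_markdown_version_":
        return parts[1], rest
    # No header: fall back to the default and leave the text untouched.
    return default, text
```

A processor could then dispatch on the returned version string and hand the body to whichever dialect it names.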

jdbernard · on Oct 26, 2012

Ugh, no. This defeats the simplicity of Markdown. I do not want to have to remember some obscure implementation versioning string whenever I want to start a new document or comment. Maybe as an option, sure, but I will probably never use it.

Helianthus · on Oct 26, 2012

I don't know if Reddit's implementation of Markdown is principled, but as a syntax it's pretty ubiquitous so you might have the traction to make it standard.

jamii · on Oct 26, 2012

> I would love pointers to Markdown processors that are implemented in a more principled way than the original code

This is almost what you want:

http://news.ycombinator.com/item?id=555153

EDIT: wrong link :(

X-Istence · on Oct 26, 2012

I might be the only one, but I actually prefer Markdown's handling of a single "enter" without spaces at the end to mean that the paragraph is not finished. It makes writing blogs and various other stuff in Vim much simpler, and I can more easily reformat text to wrap at 80 characters, and have better control over it.

Could I soft-wrap in my editor? Sure, but that would mean that the text files sitting on my hard drive now have very long strings in them making it harder to grep, making it harder to add to git (change a single character, entire line is now a diff :-().

I hope that doesn't become the default.

eob · on Oct 26, 2012

Lots of people are with you.

I think this behavior is the better route because it accommodates both crowds. The line-wrap folks can just press enter twice; no biggie. But the console and vim users of the world can continue using line-breaks the way that works best for their environment.

On the flip side, making a single enter start a new paragraph wouldn't really help the GUI users (what's the difference between one line break and two, really?) but it would really hurt the console users of the world.

jomar · on Oct 26, 2012

I guess there is a distinction between

* people typing markdown in text files, where you want to split paragraphs into word-wrapped lines of a sensible length, and consecutive non-blank lines form paragraphs just as they always have in text files;

* and people typing markdown into text entry boxes on web pages, where you would like pressing the <enter> key to actually mean something.

These two situations probably prefer a different default.

X-Istence · on Oct 26, 2012

Then make the form that the user enters data into be either parsed differently, by pre-processing it, or have some javascript magic that automatically adds the extra two spaces required to make it work.

This way if the user presses <Enter><Enter> then the JS can remove the extra two spaces and it will still be valid markdown.
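A rough sketch of that pre-processing step, written in Python rather than JavaScript to keep it short. It assumes the standard rule that two trailing spaces mark a hard line break; `add_hard_breaks` is a hypothetical name, and the function is deliberately naive (it does not special-case code blocks or lists).

```python
def add_hard_breaks(text):
    """Append two spaces to every line that is followed by a non-blank line."""
    lines = text.split("\n")
    out = []
    for i, line in enumerate(lines):
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        stripped = line.rstrip()
        # A line inside a paragraph gets the two-space hard-break marker;
        # a line before a blank line (or the end) ends the paragraph as usual.
        if stripped and next_line.strip():
            out.append(stripped + "  ")
        else:
            out.append(stripped)
    return "\n".join(out)
```

Because trailing whitespace is stripped before the marker is added, a line that ends a paragraph never carries the extra spaces, which covers the <Enter><Enter> case.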

halostatue · on Oct 26, 2012

If this is a change that Jeff really wants, he needs to fork Markdown. This change actively subverts one of the goals that Gruber set down for Markdown, which is that Normal People can use it.

Changing this, especially for people who implement Markdown parsers, is geek arrogance.

alanh · on Oct 26, 2012

You’re flat-out wrong.

The biggest deployments of Markdown — StackOverflow & GitHub — do not squash line breaks. They made this decision because authors did not assume their line breaks would be discarded.

The change is not geek arrogance. It is, in fact, a reaction to how Normal People use it.

halostatue · on Oct 26, 2012

If you think that users of StackOverflow and GitHub are “Normal People”, you need to adjust your personal RDF.

My wife is not a geek or a programmer. She will, if forced, learn the intricacies of an idiotic programmer‘s decision to make something easier for him (and it’s almost invariably a him) in order to participate in a particular forum. This is why people put up with crap like bbcode every single day. That doesn’t mean she enjoys it.

Markdown works because it makes plain text readable and formatted nicely. Markdown works because it works just as well in a plain text email as it does in an HTML-formatted email.

It doesn’t work because some geek says that the people who use heavily geek-oriented sites (StackOverflow & GitHub, to use your examples) are in fact ”Normal People”. (And, just for what it’s worth, I do find the fact that I have to join all my hard-wrapped lines back together in SO & GH comments to be immensely annoying. At least their markdown parser for .markdown files doesn’t do this.)

So, no I am not flat-out wrong. I hear geek whining that parsing is made harder because of a design feature that is present because most people don't think about or want to think about “correct” line-endings.

Before you decide that I’m full of hot air, understand that I am working with my wife’s novel that’s written in Markdown so that I can parse it (possibly using an existing well-written-and-maintained parser like peg-markdown or peg-multimarkdown as an intermediate mechanism) and then transform it into the LaTeX formatting required for making her PDF output and the HTML formatting required for making her EPUB (and thus MOBI) output.

I am well aware of how ugly some of it is…but it makes it easier for her to write her novel without falling back to proprietary formats like .docx. I’m willing to deal with some pain as an implementor in order to minimize what she has to deal with.

As such, I don’t really have a lot of patience for this type of geek whinging.

chipotle_coyote · on Oct 26, 2012

I'd respectfully submit that if your wife is writing a novel with actual line break characters at the end of each visual line, rather than only at the end of paragraphs, she's kind of an edge case. "Normal people" who have been using word processors since the late 70s have had it drilled into their head that you only hit return at the end of a paragraph.

It's my suspicion that Markdown was written the way it was solely to handle the case of blockquotes having ">" characters at the start of each line, and the reason that mattered is because BBEdit and Mailsmith had commands to wrap text that way, chiefly for quoting mail messages. This happens to fit in very nicely with the way Vim and Emacs wrap paragraphs, and I know there are people who've written novels in both. But I assure you that the vast majority of fiction writers are used to word processors, and for that matter, so are the vast majority of users typing things into text boxes. And if those users wrote a little poem

  That had lines
  Which looked like this
  And failed to rhyme
  Because they weren't really poets

...the chances are they would just hit return at the end of every line, because that is the way every program they have used in their entire lives works. This is why GitHub and StackOverflow made that change: because Markdown is not like HTML to most people. It's like plain text. In plain text, when you press return it means "end of paragraph." The way Markdown does it -- and the way Hacker News does it, I just discovered! -- violates the principle of least surprise. I am not writing HTML, I am writing text, and I expect it to act like text.

(For what it's worth, I've written a collection of short stories in Markdown and created the EPUB from that using my own Python scripts to do so. I am an edge case. Anyone who is writing fiction in a text editor instead of a word processor is also an edge case. Sorry, but it's the truth.)

X-Istence · on Oct 27, 2012

Stackoverflow requires you to add two spaces to the end of every one of those sentences in your poem to work as you just said it did ... Stackoverflow DID NOT make that change you suggested it did.

http://stackoverflow.com/editing-help#linebreaks

maxerickson · on Oct 26, 2012

If you are throwing away the ability to have visually formatted source text, all the teeth gnashing over the name of the new, properly standardized and beautifully implemented format becomes completely ridiculous. Nice looking source text is pretty much the core principle of markdown.

Which I suppose is the source of Gruber's apparent obstinance, if someone wants to make changes that exchange his design principles for other design principles (I don't think he cares much about naive users), why should he hand them the name of his project?

(to clarify, "apparent obstinance" is me doing a poor job of keeping sarcasm out of my post, I don't have a problem with Gruber insisting that the markdown name is his)

chipotle_coyote · on Oct 26, 2012

I agree -- but it doesn't seem to me that "treat return characters as hard line breaks" equates to "throwing away the ability to have visually formatted source text," unless one is using an editor that is unable to deal with soft-wrapped text gracefully.

maxerickson · on Oct 26, 2012

Well, if the editor is wrapping things then the text itself isn't actually visually formatted. I agree that the distinction isn't very significant.

But I like the idea that a shared file will look about the same without relying on there being a smart editor on the other end.

loup-vaillant · on Oct 27, 2012

Furthermore, what editor would correctly soft-wrap this?

  Really really […] really long line

  * Really […] really long list item 1
  * Really […] really long list item 2

Most editors I know would show that:

  Really really[…]
  really long line

  * Really […] really
  long list item 1
  * Really […] really
  long list item 2

Instead of this:

  Really really[…]
  really long line

  * Really […] really
    long list item 1
  * Really […] really
    long list item 2
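For comparison, a hard-wrapping tool can produce the second layout, because it is free to insert the continuation indent into the text itself. A minimal sketch using Python's standard `textwrap` module; `wrap_markdown_line` is a hypothetical helper that only recognizes `* ` bullets.

```python
import textwrap


def wrap_markdown_line(line, width=20):
    """Hard-wrap one line, indenting continuation lines of a '* ' list item."""
    # Continuation lines of a bullet line up under the item text, not the bullet.
    indent = "  " if line.startswith("* ") else ""
    return textwrap.fill(line, width=width, subsequent_indent=indent)
```

Plain paragraphs get no extra indent, so only list items are affected.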

saurik · on Oct 26, 2012

Why do you feel like normal people, who have been trained for decades--not just on computers but on typewriters as well--that hitting enter means "go to the next line", would have an easier time if that assumption was violated? You even see the formatting in front of you: you are now on the next line, and yet when rendered this information is somehow ignored and the text is pasted together. E-mail clients, which have been doing crazy things to our text for years, don't even do this: the closest they come is format=flowed, and that's a special set of rules designed to be generated by clients, not typed by mere mortals.

halostatue · on Oct 26, 2012

I feel this way because of the reality that many people will centre text in a document by inserting spaces in front of the text they want centered, rather than using their word processor's centre command. It drives me absolutely bonkers to see that, but it's reality.

I feel this way because of the reality that many people will hit enter two dozen times to go to a new page, or to position a bit of text.

In the time that I've been working with computers, what I've come to realize is that most people simply don't have a grasp of the abstractions of the computer—they remember particular incantations that worked for them last time and keep doing them again and again.

My wife is an intelligent person; probably smarter than I am. She speaks three languages and has taught for the last twenty-two years. Watching her use the computer drives me absolutely bonkers because she doesn't use any of the things that I know make the computer easier to use. Neither does her sister. Why? Because neither of them have time to specialize in the use of the computer. They use it a lot, and I've been able to introduce a few things—but only for things that simplify their lives because they integrate into how they already work.

If two of the most intelligent people I know don't have a lot of these expectations…why do people think that most people would expect things to work the way that Jeff has proposed?

In a sibling comment (I only want to reply once), stickfigure says:

“Normal People have a consistent expectation regarding the behavior of line breaks.”

Actually, I don't think that Normal People have any understanding of line breaks at all. They understand letters, words, sentences, paragraphs, and to a degree pages. They know they want something to show up on the page, but they have no understanding of how the computer does its magic to make that happen.

stickfigure: “Normal People are comfortable with markdown in the first place. Neither you nor your wife are Normal People.”

For the second sentence, I agree and disagree. I'm certainly a geek with a more-than-passing interest in layout, formatting, fonts, etc.… This stuff interests me in a way that most people and most geeks certainly would never be interested in it. My wife is also not normal people on some level—she’s married to a geek who can take care of things like this for her. At the same time, she is 100% a computer user and has no interest in learning to program or learning to fight with programs that actively get in her way.

She also doesn't know Markdown as such. She knows a tiny subset that I have introduced to her that provides her with exactly enough to write her novel and for me to format it the way she wants it. (That's the paragraph rules, italics, and headings.) I've shown her a few other things for when she blogs—but even when she's using the Squarespace WYSIWYG editor, I often have to go in and edit the links to make them right.

There's a lot more that I could show her, but she neither has the time nor the interest to learn it.

saurik · on Oct 27, 2012

Both of your examples--using spaces to center text and using a dozen newlines to position text--are based on the reasonable assumption of "what I see in my text is what I got when I formatted it": removing carriage returns entirely and pasting lines together to form larger lines violates that assumption. If the user expects that they can go to the second page by adding a dozen newlines, how could they possibly understand that newlines now don't mean anything at all? I will repeat: for decades, the enter key has meant a very specific behavior, and Markdown is the only setup I can come up with that either myself or my mother (or my grandmother) would have ever come across where the enter key doesn't cause at least a single line break.

halostatue · on Oct 27, 2012

Most people think that a block of text that is set together is a paragraph, especially as block paragraph format has become the norm for casual and business writing; indented paragraphs are the exception used when you have a layout process and not just depending on your word processor to get it right for you.

Email and HTML have not helped this; it is a common expectation that a blank line between two blocks of text represents the paragraph break—at least based on the emails that I've seen over the last two decades.

A fundamental philosophy of Markdown is that a Markdown document looks reasonable when sent via email as text or when sent as a formatted web page. That it's flexible enough to be usable for more than just email and HTML is a bonus.

saurik · on Oct 28, 2012

It is also a common expectation that if you put a newline in an e-mail message, even without a blank line intervening, that the person who receives the message will not get the two lines pasted together to form a single line. The problem with the newline behavior of markdown--and to a very real extent this is the only such feature of markdown that has this bug--is that the text you are looking at looks nothing like the text that ends up being rendered.

stickfigure · on Oct 26, 2012

Implied assumptions:

* Normal People have a consistent expectation regarding the behavior of line breaks.

* Normal People are comfortable with markdown in the first place.

Neither you nor your wife are Normal People.

JeremyBanks · on Oct 26, 2012

You're mistaken: Stack Overflow does collapse line breaks in posts. Example: http://meta.stackoverflow.com/a/152661/134300

It is a pretty common mistake for new users to make, though most of the time an editor will fix it pretty quickly and the user will learn.

alanh · on Oct 26, 2012

Hmm, case-in-point, then. Thanks for pointing that out.

X-Istence · on Oct 27, 2012

StackOverflow DOES:

http://stackoverflow.com/editing-help#linebreaks

Notice the requirement for two spaces at the end; ONLY GitHub doesn't require that.

bdr · on Oct 26, 2012

You seem to be in violent agreement.

RegEx · on Oct 26, 2012

I agree completely. Visually highlighting a paragraph that went a bit too wide and cleaning it up with 'gq' makes me a happy camper.

Evbn · on Oct 26, 2012

Do you know about

gqip

?

RegEx · on Oct 26, 2012

No, I didn't, but thanks! That's very useful. Additionally, google + the help command led me to :help text-objects, which describes a lot more goodies similar to 'ip'.

pjscott · on Oct 26, 2012

Believe me, you're not the only one who prefers this behavior. It makes working with hard-wrapped text a hell of a lot easier.

fiddlosopher · on Oct 26, 2012

I, too, would be strongly against "automatic return-based linebreaks." Given that markdown has constructions for lists and code blocks, one very rarely needs a hard line break anyway. Currently markdown works fine both for people who hard-wrap and people who soft-wrap. Let's keep it that way.

raldi · on Oct 26, 2012

I'd also advocate for accepting reversed ()[]'s on links.

In other words, let the user type:

    [something](http://whatever.com)

or

    (something)[http://whatever.com]

...and have both work exactly the same.

It will save a lot of trouble -- and especially when linking to a Wikipedia page whose URL contains parentheses.
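With the current syntax, the usual workaround is to percent-encode the parentheses in the URL before pasting it. A minimal sketch; `escape_parens` is a hypothetical helper name.

```python
def escape_parens(url):
    """Percent-encode parentheses so they cannot end a Markdown link early."""
    return url.replace("(", "%28").replace(")", "%29")
```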

AngryParsley · on Oct 26, 2012

I feel your pain. Parens in links made me never forget %28 and %29. But I think your solution causes more problems than it solves. The biggest issue is that it's not backwards-compatible with current markdown. For example, in a citation following parentheses:

  Blah blah blah (side note)[1](http://url1).

Your markdown generator could output

  <a href="1">side note</a>(http://url1).

or

  (side note)<a href="http://url1">1</a>.

The latter is current behavior, but your suggestion makes the grammar ambiguous. Most parsers are "greedy", so the former is more likely to be output. Unless you specified more complex behavior, people would have to go back and escape their parens near links.
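The ambiguity can be reproduced with two deliberately naive patterns. This is only a sketch: `STANDARD`, `REVERSED`, and `link_readings` are hypothetical names, and no real Markdown parser is claimed to work this way.

```python
import re

# Current syntax: [text](url). Proposed reversed syntax: (text)[url].
STANDARD = re.compile(r"\[([^\]]*)\]\(([^)]*)\)")
REVERSED = re.compile(r"\(([^)]*)\)\[([^\]]*)\]")


def link_readings(source):
    """Return the (text, url) pair each syntax finds first, or None."""
    readings = {}
    for name, pattern in (("standard", STANDARD), ("reversed", REVERSED)):
        match = pattern.search(source)
        readings[name] = match.groups() if match else None
    return readings
```

On the citation example above, the standard pattern finds the text `1` with the URL `http://url1`, while the reversed pattern finds the text `side note` with the URL `1`. Both readings match the same input, so a parser that accepts both forms needs an extra rule to choose between them.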

thaumaturgy · on Oct 26, 2012

I've just finished writing a hyperlink parser; getting it to handle your specific case would be totally do-able.

I think your overall point still stands -- there would be ambiguous edge cases -- but if Reddit comments are anything to go by, they really wouldn't be any worse than the current confusion, and making ()[] interchangeable would clear up a lot of the common problems that users have with making links.

AngryParsley · on Oct 26, 2012

To fix the ambiguous grammar, we need to add a url parser to markdown? This is a perfect example of how changes that seem simple to humans can have enormous impacts on software complexity. As you said, introducing that complexity wouldn't help in many cases:

  Blah blah blah (side note)[1](#ref_1).
  Blah blah blah (side note)[1](ref_1.html).

I don't mean to be hostile, but just because something is "doable" doesn't mean it should go in the spec. Adding things to a standard imposes costs that feature-proposers rarely consider. It forces programmers to write more code to support that feature. More code means more bugs. These bugs annoy users and cost valuable programmer time. Multiply these costs by the number of times the standard is implemented and you can easily end up with millions in lost value.

Vanilla markdown doesn't require a url parser. It doesn't have ambiguous grammar. It's small. It's simple. And that's a good thing, because markdown parsers have enough bugs in them already.

thaumaturgy · on Oct 26, 2012

1. URL parsing is a solved problem. Since I so frequently have my competence questioned on HN, I have to conclude that if I can do it, anyone can.

2. There are a number of people in this thread, and many many many more on Reddit and on other forums that use markdown, that have had a specific UX problem with markdown which we can solve. The solution will not solve all possible problems, but it will solve many of them.

3. I subscribe to the Linus Torvalds camp of software development, which is that software exists to solve user problems.

4. While the trade-off between additional code complexity (and better overall user experience) versus simpler code (and poorer overall user experience) isn't always justified, in this case I think it is. I mean, HN parses URLs for you and makes them clickable -- would you really support removing that as a feature, in favor of having simpler code, and having to copy & paste links into your browser?

5. This particular feature rather nicely lends itself to extensive testing so that you can make sure you've covered most of your edge cases. If we decide not to implement solved, testable features out of fear that some other programmer might not implement it correctly or test it thoroughly enough, then I don't know what to say about our industry other than that it's going to come to a standstill.

I'm afraid that's all the energy I have tonight for arguing about things I don't really have any influence over on HN.

cryptoz · on Oct 26, 2012

I've been annoyed by that on reddit for...like 6 years. "You can just escape it" people say. As if normal reddit users are gonna escape strings themselves these days.