What if journalists had story writing tools as powerful as those used by coders?

jawns · on Dec 10, 2014

When I was working as a web editor at a metro daily paper a few years ago, I proposed something similar: an XML-like syntax that would allow for metadata to be included in drafts of news stories, some (but not all) of which could be made use of in online versions of the story (such as a link to a map when you're referencing a location).

A lot of wire copy already includes metadata, but it's generally just in a header that accompanies the story.

What I was envisioning was something more like what is being proposed for the semantic web:

<name id="1394">John Smith</name> was elected president of the <organization id="2315">New Castle County Council</organization> on <date value="2014-12-10">Wednesday</date> at the <place lat="39.685881" long="-75.613047">county headquarters</place>.<source id="23" name="Mila Jones" title="New Castle County public relations officer"></source>
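For what it's worth, pulling that metadata back out is the easy part. A minimal sketch in Python, assuming the draft is wrapped in a `<story>` root element (my assumption, so that it parses as XML) and using a shortened copy of the sentence above; `extract_annotations` is a hypothetical helper name, not part of any newsroom system:

```python
import xml.etree.ElementTree as ET

# Hypothetical draft markup, wrapped in a <story> root so it parses as XML.
DRAFT = (
    '<story><name id="1394">John Smith</name> was elected president of the '
    '<organization id="2315">New Castle County Council</organization> on '
    '<date value="2014-12-10">Wednesday</date> at the '
    '<place lat="39.685881" long="-75.613047">county headquarters</place>.'
    '</story>'
)

def extract_annotations(markup):
    """Return (plain_text, annotations) for an annotated draft."""
    root = ET.fromstring(markup)
    # itertext() walks element text and tails in document order,
    # which gives back the sentence as a print reader would see it.
    plain_text = "".join(root.itertext())
    annotations = [
        {"tag": el.tag, "text": "".join(el.itertext()), "attrs": dict(el.attrib)}
        for el in root.iter()
        if el is not root
    ]
    return plain_text, annotations

plain_text, annotations = extract_annotations(DRAFT)
print(plain_text)
for note in annotations:
    print(note["tag"], note["text"], note["attrs"])
```

The print edition would use only `plain_text`; the online version could turn the `place` annotation into a map link.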

I also wanted to use the metadata to help copy editors trim wire stories:

<priority value="1">This amounts to de facto resegregation. <priority value="4">(And we all know how well segregation worked out the first time.)</priority> If the school district still values integrated schools, it must act swiftly to correct this effect.</priority>
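A rough sketch of how a trimming pass over that markup could work, in Python. It assumes every `priority` element carries a numeric `value` attribute and that a higher number means less important; `render` and `trim` are hypothetical names:

```python
import xml.etree.ElementTree as ET

PARAGRAPH = (
    '<priority value="1">This amounts to de facto resegregation. '
    '<priority value="4">(And we all know how well segregation worked out '
    'the first time.)</priority> If the school district still values '
    'integrated schools, it must act swiftly to correct this effect.</priority>'
)

def render(element, max_priority):
    """Render an element's text, skipping <priority> spans above the cutoff."""
    if element.tag == "priority" and int(element.get("value", "1")) > max_priority:
        return ""
    parts = [element.text or ""]
    for child in element:
        parts.append(render(child, max_priority))
        parts.append(child.tail or "")  # text after a child belongs to the parent
    return "".join(parts)

def trim(markup, max_priority):
    """Keep only spans whose priority value is at or below max_priority."""
    text = render(ET.fromstring(markup), max_priority)
    return " ".join(text.split())  # collapse the gaps left by removed spans

print(trim(PARAGRAPH, 4))  # full paragraph, aside included
print(trim(PARAGRAPH, 1))  # the aside is dropped
```

This only deletes spans and tidies whitespace. It does nothing about capitalization or punctuation that ought to change once a span is gone.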

It turns out, though, even when you create a UI that lets reporters and editors easily plug in this metadata without having to understand XML, they are not apt to fill it in, because they are just so overworked as it is.

Plus, in order for this to work on a larger scale, you'd have to get an incredible amount of buy-in. You'd have to get reporters and editors to agree that it's worth their time. You'd have to build software to support it. You'd have to get all of the different media companies out there to agree on standards.

It's just ... not what the media industry is (or should be) focused on right now. They've got bigger things to think about, like how to find a viable business model.

VonGuard · on Dec 10, 2014

Absolutely true that journalists are too busy to bother with markup. What would be great, however, is to have tools for copy editing that do things like recognize people's names, Google them and spell check them. Recognize people's titles, Google and confirm them, and spell check them. Style passes that could do simple grammatical edits around a site's style guide: call it a style filter.

I have to say, for me as a journalist in tech, the one thing that ends up taking the most time in my stories is looking up name spellings and people's titles, and most importantly, trying to figure out if your freaking company is spelled TheCompany, The Company, or Thee Cmpany or some crazy variant. You startups and your mid-word-capital-letters. The bane of copyeditors everywhere.
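The company-name half of such a style filter is easy to prototype. A minimal Python sketch, assuming the publication keeps a list of agreed spellings (`CANONICAL_NAMES` and `find_style_violations` are hypothetical names, and "TheCompany" is the placeholder from above):

```python
import re

# Hypothetical house style list; a real one would come from the style guide.
CANONICAL_NAMES = ["TheCompany"]

def _variant_pattern(canonical):
    """Match the canonical name with any casing and optional internal spaces."""
    chars = [re.escape(ch) for ch in canonical if not ch.isspace()]
    return re.compile(r"\b" + r"\s*".join(chars) + r"\b", re.IGNORECASE)

def find_style_violations(text, canonical_names=CANONICAL_NAMES):
    """Return (found, expected) pairs for names that are spelled off-style."""
    violations = []
    for canonical in canonical_names:
        for match in _variant_pattern(canonical).finditer(text):
            if match.group(0) != canonical:
                violations.append((match.group(0), canonical))
    return violations

sample = "TheCompany raised a round. Later, The Company and thecompany both appeared."
violations = find_style_violations(sample)
for found, expected in violations:
    print(f"found {found!r}, style guide says {expected!r}")
```

It only catches casing and spacing variants, so a true misspelling would slip through, and it will also flag a generic phrase like "the company". That is why it should suggest changes rather than apply them.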

scholia · on Dec 11, 2014

You can do quite a lot of that in Microsoft Word, including the use of research tools, but it's a lot of work to set up.

Agree about the problems coping with company and product titles, etc. The problem could be reduced by refusing to play that game, eg by using registered names rather than marketing styles or logos.

VonGuard · on Dec 11, 2014

That only works until marketing calls your sales people and says "you spelled our name wrong!"

scholia · on Dec 11, 2014

So you point to the company registration documents and tell them you are happy to display their logos in the ads they can buy to correct it ;-)

schnevets · on Dec 10, 2014

  <priority value="1">This amounts to de facto resegregation. <priority="4">(And we all know how we segregation worked out the first time.)</priority> If the school district still values integrated schools, it must act swiftly to correct this effect.</priority>

Forget about editors, use this kind of mark-up for your readers. Imagine changing an in-depth article into a truncated, 200-word summary with the click of a button. Activating a different tag would include the reporter's subjective commentary (or perhaps have multiple editorials based on the same "scaffolding")?

kd5bjo · on Dec 10, 2014

In print journalism, you're supposed to include information in descending order of importance. As a reader, you're expected to stop reading once you get to details you don't really care about, confident that you won't miss something more important buried farther down.

jawns · on Dec 10, 2014

That's largely true for hard news stories. But what about news features or op-eds or sports profiles, where the traditional inverted pyramid structure isn't employed?

apozem · on Dec 10, 2014

Exactly. Feature stories often use a delayed lead to set the scene or catch the reader's attention.

dredmorbius · on Dec 10, 2014

While I can understand the arguments for this, in practice the approach frustrates me to no end.

Worse are authors whose writing has no excerptable lede. Gina Kolata and mumber Morgenstern (health / Well articles in the NY Times) especially do this.

I find some old-school journos -- Dan Gillmor particularly comes to mind, I've called out a few others in G+ posts -- still practice strong ledes and heads. Many newer ones start with "My latest at <some website somewhere> read more".

Which ... tells me fucking nothing.

Lede with your lede. Trail with your link or call-to-action.

dredmorbius · on Dec 10, 2014

The so-called "inverted pyramid". I'm finding it applied less and less frequently.

I've also noted that it's become pretty much standard practice for news bureaus to write single-sentence paragraphs. Literally, every sentence of a story is its own paragraph. I don't know when that became standard practice, but sometime in the later 1990s or 2000s, particularly as stories moved online.

mpclark · on Dec 11, 2014

Sadly, hardly anyone writing now outside big traditional media orgs has done any sort of formal journalism training.

dredmorbius · on Dec 11, 2014

That's ... not all bad.

I think there's too much insularity within much of the journo community. But many of the newcomers are also subject to outside influences which call their credibility strongly into question. Lack of uniform copyediting, for better or worse, means a wide range of writing quality.

Though I'm seeing that even in long-standing brands -- NY Times, Forbes, and elsewhere.

ganeumann · on Dec 11, 2014

Zinsser mentions this one-sentence paragraph thing in "On Writing Well", citing an AP article from 1993. (In the "Paragraphs" section of Chapter 10.)

dredmorbius · on Dec 11, 2014

Thanks. And yes, it makes sense that it would be AP style or similar.

rafekett · on Dec 10, 2014

this seems to be a lost art, unfortunately.

coldtea · on Dec 11, 2014

>Forget about editors, use this kind of mark-up for your readers. Imagine changing an in-depth article into a truncated, 200-word summary with the click of a button.

I imagine it, and most users would still not care. They skim articles anyway.

jawns · on Dec 10, 2014

Actually, this is an extremely simplified example, and once you get into the nitty gritty, it becomes a lot more difficult to add/remove various elements. For instance, you might need to capitalize a word differently depending on whether something has been excised immediately before it, or you may need to adjust punctuation in ways that you can't do simply using an XML-like format. Really, you need what amounts to a Natural Language Generation library to implement a robust system.

jerf · on Dec 10, 2014

I implemented that for my blog somewhere around 2001 or so. It's quite tedious to write and I haven't done it since. Even two versions of a story is a lot.

arebop · on Dec 10, 2014

I did something like this in college, and I was thinking semantic annotations would be an editorial pass like copyediting. These days you'd probably let some ML system take a crack at it before using human effort to bring the quality up to your publication's standard. In any case, it doesn't have to get in the way of writing a good story.

ajuc · on Dec 11, 2014

You can do this with lisp-style syntax (which also (despite opinions to the contrary) is quite natural).

dangayle · on Dec 10, 2014

>> It turns out, though, even when you create a UI that lets reporters and editors easily plug in this metadata without having to understand XML, they are not apt to fill it in, because they are just so overworked as it is.

Nailed it. I'm a developer in a newsroom, and I'm dealing with flak just asking them to write a non-automated teaser text for their blog posts. They can't be bothered.

limelight · on Dec 10, 2014

> They've got bigger things to think about, like how to find a viable business model.

This is actually potentially a big part of that. People are reading more than they ever have—it's just not necessarily newspapers that they're reading.

basisword · on Dec 10, 2014

>> "People are reading more than they ever have—it's just not necessarily newspapers that they're reading."

Any stats on that? I agree people are reading more than ever but disagree that they're not reading newspapers (online). I think they are, they just aren't paying for it anymore.

limelight · on Dec 10, 2014

If you look at this NiemanLab post, table 1 shows that even as online time has increased, time on online newspapers has definitely decreased.

http://www.niemanlab.org/2014/06/are-online-ads-more-valuabl...

basisword · on Dec 10, 2014

Interesting, thanks.

lsseckman · on Dec 11, 2014

stats requested, stats delivered, hackernews

lstamour · on Dec 10, 2014

That said, you could end up applying this to a "news article IDE" automatically, with less human intervention required -- or at least, provide automatic suggestions. I couldn't find any clear links on the topic, but here are a few that can be followed with a bit of research:

http://www.nltk.org/book/ch07.html

http://stanbol.apache.org/docs/trunk/components/enhancer/nlp...

This last one was actually trained on WSJ content: http://nlp.lsi.upc.edu/freeling/index.php?option=com_content...

At this point, though, I'm thinking it'd be the equivalent to spelling and grammar suggestions in Word, appreciated somewhat but ultimately considered useless the first time it screws up. But still better than nothing, right? ;-)

mindcrime · on Dec 10, 2014

Stanbol could definitely be used as part of a system like this. In fact, that sort of thing is a big part of how we're using it at Fogbeam, although aimed at assorted knowledge workers in an enterprise setting, and not at journalists specifically.

djb_hackernews · on Dec 11, 2014

There are automated services that let you do this, I've used Open Calais and MetaCarta in the past with great results.

To be honest I'm surprised services like those aren't automatically used on new content within every major media outlet as a standard.

Cthulhu_ · on Dec 11, 2014

Yeah, just that: they're too busy. Most news articles are valid for just one edition of a newspaper, so about half a day or even less if the newspaper is published more than once a day. It's write it and move on, in a lot of cases. Investigative journalism probably has more use for a system like this, but even then I doubt they'd want to fill in XML forms when they could just write sentences. Besides, usually you can trust an editor to read an article and remove the bits that aren't relevant based on their own judgment, instead of a 'priority' hint by the original author.

7952 · on Dec 10, 2014

A lot of this information can be extracted directly from the text. It is not like "New Castle County Council" is particularly ambiguous. The trouble for a lot of reporting is that the stories simply lack depth and good links to external content.

EGreg · on Dec 11, 2014

I agree that this can be generated at publish time. The question is, how much value does it provide over something generated at read time by a browser plugin? The answer is - at publish time presumably someone at the publisher outfit will take a cursory look at it. That's it. May as well have readers mark up the articles!

Alex3917 · on Dec 10, 2014

I've been messing around with something similar, specifically though focused on tracking strengths and weaknesses in US infrastructure:

http://www.alexkrupp.com/Citevault.html

It's actually a lot easier than you'd think, because thanks to Zipf's law pretty much every article in these evergreen topic areas is using the same set of 1,000 or so facts. And these facts mostly come from the same set of government or NGO reports, which are updated at most once a year, and often only once every ten years.

The cool thing is that you can then use a javascript snippet to track which facts are being used in which documents, automatically mark facts as outdated when they change, etc.
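To make the bookkeeping concrete, here is a minimal Python sketch of the registry side of that idea. It is not the snippet itself: the class names, the version counter, and the example.com URLs are all assumptions for illustration, and the fact text is placeholder data.

```python
from dataclasses import dataclass, field

@dataclass
class Fact:
    fact_id: str
    statement: str
    source: str
    version: int = 1  # bumped whenever the underlying report is revised

@dataclass
class FactRegistry:
    facts: dict = field(default_factory=dict)
    citations: dict = field(default_factory=dict)  # document URL -> {fact_id: version cited}

    def add(self, fact):
        self.facts[fact.fact_id] = fact

    def cite(self, document, fact_id):
        """Record that a document used the current version of a fact."""
        self.citations.setdefault(document, {})[fact_id] = self.facts[fact_id].version

    def revise(self, fact_id, new_statement):
        """Replace a fact's text, e.g. when the source report is reissued."""
        fact = self.facts[fact_id]
        fact.statement = new_statement
        fact.version += 1

    def outdated(self):
        """Map each document to the fact ids it cites at a stale version."""
        stale = {}
        for document, cited in self.citations.items():
            old = sorted(f for f, v in cited.items() if v < self.facts[f].version)
            if old:
                stale[document] = old
        return stale

registry = FactRegistry()
registry.add(Fact("fact-1", "placeholder statement, first edition", "hypothetical annual report"))
registry.add(Fact("fact-2", "another placeholder statement", "hypothetical ten-year survey"))
registry.cite("https://example.com/story-a", "fact-1")
registry.cite("https://example.com/story-a", "fact-2")
registry.cite("https://example.com/story-b", "fact-2")
registry.revise("fact-1", "placeholder statement, second edition")
stale = registry.outdated()
print(stale)
```

Only story-a comes back as outdated, because it cited the fact that was revised.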

MrBuddyCasino · on Dec 10, 2014

That's a really great idea! We definitely need some primary source of truth to refute all this half-wisdom that's going around, and of course to self-educate.

Not to nitpick, but I looked up a single random article, namely http://www.alexkrupp.com/Citevault.html#parenting, and Wikipedia says it's not that simple:

"An attempt to replicate this study in 5934 8 year old children failed: No relationship of the common C allele to negative effects of formula feeding was apparent, and contra to the original report, the rare GG homozygote children performed worse when formula fed than other children on formula milk.[5] A study of over 700 families recently found no evidence for either main or moderating effects of the original SNP (rs174575), nor of two additional FADS2 polymorphisms (rs1535 and rs174583), nor any effect of maternal FADS2 status on offspring IQ.[6]"

Source: http://en.wikipedia.org/wiki/FADS2

Alex3917 · on Dec 10, 2014

Good catch, that's exactly why this needs to be open sourced!

I actually looked at that wiki article, but probably before that was added. And that's a good example of a fact that had some editorializing on my part, which I ultimately want to eliminate. The original idea was to give context to facts, but I think it would make more sense to create a metadata format to do this, e.g.:

- To understand this fact, you need to understand X, Y, Z.

- This fact is needed in order to understand facts X, Y, Z.

This way facts can live and die on their own, rather than being bundled together according to the tastes of a single curator.
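One way to sketch that format in Python: store only the "you need to understand X, Y, Z" direction and derive the other direction from it. The fact ids below are placeholders and the function names are hypothetical; `graphlib` is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata: each fact lists the facts you need to understand first.
REQUIRES = {
    "fact-a": set(),
    "fact-b": {"fact-a"},
    "fact-c": {"fact-a", "fact-b"},
}

def needed_for(requires):
    """Derive the inverse relation: which facts each fact helps explain."""
    inverse = {fact: set() for fact in requires}
    for fact, prerequisites in requires.items():
        for prerequisite in prerequisites:
            inverse.setdefault(prerequisite, set()).add(fact)
    return inverse

def reading_order(requires):
    """Order facts so that every prerequisite appears before its dependents."""
    return list(TopologicalSorter(requires).static_order())

print(needed_for(REQUIRES))
print(reading_order(REQUIRES))
```

Storing one direction and computing the other keeps the two lists from drifting apart, and `TopologicalSorter` raises `CycleError` if two facts end up requiring each other.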

EGreg · on Dec 11, 2014

I actually registered findmeaning.in to be the domain for a website I plan to build. Happy to accept partners! If you can find my email at http://qbix.com/about then hit me up.

The website was originally supposed to be an annotation site like rapgenius - and I have already spoken to people at LyricFind who would give me a license to use their entire database of songs!

But then I thought it could go much further. First of all, you could release a bookmarklet and browser plugins that would let people mark up websites for others to see. Each such mark would have its own associated discussions etc.

But we could go even further. We could source facts and debunkings of claims. We could in fact build a graph of dependencies between claims and give people a way to assess their truth value. In this way we could for example have a discussion of various religious claims once and for all, and any visitor would see what other claims they depend on. I thought, for instance, that Coca Cola invented the modern image of Santa Claus as wearing red. I was wrong.

Snopes, Politifact, etc. could all be sources. Something like facebook posts could definitely use this bookmarklet.

So do you want to findmeaning.in/The-US-Constitution? Or findmeaning.in/Pokerface or findmeaning.in/Book-Of-Jubilees?

If you want to help me build it, drop me a line.

PS: Also I find it really valuable to have a followup after a news story has faded, eg to see if that kid facing life in jail for hash brownies got off (he did).

alaskamiller · on Dec 11, 2014

Genius.com meets first SERP in a meta reincarnation of hypertext. It's so esoteric that it seems like such a good idea, but it requires so much definition and scoping in execution. Good luck :)

SandersAK · on Dec 10, 2014

This is amazing! I might be able to help get more media people to use it. I'm one of the founders at Beacon. Right now we're focused on helping journalism projects get crowdfunding but I'd love to chat more about what you're doing! Email me if you're game: adrian@beaconreader.com

Alex3917 · on Dec 10, 2014

Thanks, I'll send you a note.

mnarayan01 · on Dec 10, 2014

Bookmarked. Editing nit in the FAQ.

  C.f. comments on Peter Thiel's graph of the year...

Should be "Cf." -- also I believe "cf." should be read as "compare", so really it shouldn't even be that (see http://en.wikipedia.org/wiki/Cf.)

Alex3917 · on Dec 11, 2014

Thanks, updated!

indubitably · on Dec 10, 2014

aaaaaaand no one cares.

pestaa · on Dec 11, 2014

I didn't know about the English equivalent, so I did care.

worldsayshi · on Dec 10, 2014

Wow, this is.. Something else. I will definitely suggest this as a possible master thesis research topic at my work place.

Alex3917 · on Dec 10, 2014

Thanks! Yeah I spent a couple thousand hours on it, but for whatever reason have gotten pretty lukewarm feedback from various media organizations I've pitched it to. I think the power to make content creation 10x cheaper while vastly improving quality and total derived value is pretty self evident, but go figure. So for now it's been fairly dormant. I think the sections on antibiotic resistance and arrest statistics have both changed slightly since I last updated this, but everything else should be pretty current.

exelius · on Dec 10, 2014

Pretty sure the major news services handle this already through their CMSes. Editors can cut and paste "story blocks" that are updated over time. Unfortunately, many news agencies don't do actual research or fact-checking anymore, so your implementation probably failed because it targeted the way news did business 20 years ago, not the entertainment-driven news media we have today. You probably also didn't understand enough about their workflows and CMS systems.

I can't be bothered to look for it right now, but the NYTimes did a behind-the-scenes look at their CMS probably a year ago.

EDIT: Ok, never mind, I looked for it: http://open.blogs.nytimes.com/2014/06/17/scoop-a-glimpse-int...

Alex3917 · on Dec 10, 2014

I'm familiar with the scoop functionality you mention, and my last job involved making APIs to deliver content to media CMS, so I have some general familiarity with the space. Obviously infotainment will always be a large portion of what the media produces, and this isn't really relevant to that. But once you get beyond that type of content, media organizations aren't actively trying to produce low quality articles, it's just that that's what they're incentivized to produce. If you can make it less expensive and more profitable to produce high quality content, then that's what will get produced.

Also there was never really any implementation that failed per se, I actually still think this would work. I'm just currently spending my time on other stuff that I think has more potential. For the last several years this has been kind of a hobby project that I work on here and there whenever I get burnt out on whatever I'm supposed to be doing.

The main problem with it is that it just takes an enormous amount of time to add new content. Any given fact can easily take an entire day to add... By the time you go through all the competing claims, find the primary sources, read through the methodology for each one, etc., that's easily 8 - 10 hours per fact. Which is why there still isn't any code or UI even after multiple years of work.

davemel37 · on Dec 10, 2014

It's never easy to get someone to change their behavior. When you pitch them, all they hear is the pain of changing platforms. Your assumption that media companies actually care about better quality, is also flawed. They care about sensationalism.

The people want to be entertained, and unless you can show how this entertains them, you'll have a hard time getting past their resistance to change.

Retra · on Dec 10, 2014

Sounds to me like it would be an awesome resource for mathematicians. It's just a different brand of information, but many people would love to have a useable proof reference and knowledge dependency system. Especially if it can be audited and contributed to easily.

hammerandtongs · on Dec 10, 2014

This really seems valuable, you are on the right track and should keep pushing with this project.

Awesome!

limelight · on Dec 10, 2014

This is awesome! I didn't notice a license—is there one?

Alex3917 · on Dec 10, 2014

Good call. This is the first time I've shared a link to this publicly, so I haven't put much thought into this yet. A lot of the quotes, especially in the sections about healthcare and pharma, come from copyrighted books. So there might be some issues with allowing people to redistribute this entire thing wholesale, especially for commercial use. But for now you only need to cite the primary source for any given fact. At some point I'll dump this on citevault.org, and I'll have to put some thought into licensing issues I guess.

Hansi · on Dec 10, 2014

When I wrote my dissertation I used Scrivener: http://www.literatureandlatte.com/scrivener.php which is a wonderful tool for writing. An expanded open source application similar to that with a good plugin system might be very useful.

egypturnash · on Dec 10, 2014

My first thought upon seeing the article title was "How is this going to be different from Scrivener", which is what a huge chunk of the pro writers I know seem to use.

tezza · on Dec 10, 2014

Most journalists are barely informed cut'n'paste merchants.

Journalists value the sensational over the factual, and work hard (true) to tight deadlines.

So they already do not care about the tech features promised, namely indicating:

* there is not enough evidence to make a given point

* a certain person or company has not been investigated thoroughly enough

* a certain point is not relevant

[edit: data points]

Cut'n'paste obituary from Wikipedia: http://www.theregister.co.uk/2007/10/03/wikipedia_obituary_c...

"Hack": A self referential term journalists use for each other in the UK http://en.wikipedia.org/wiki/Hack_writer

jsankey · on Dec 10, 2014

Most programmers are barely capable Stack Overflow copy'n'paste merchants.

Programmers value the hacky over the elegant, and work hard to tight release cycles.

And yet there are programmers who care, and who seek out and use better ways of doing things, including the tools to help them. Just as there are journalists who care about getting things right, digging up and exposing the truth.

In fact, a great way to lift the general standard is to make the right way of doing something also the easy way of doing it: better tools can counteract short deadlines and occasional lapses of discipline.

superuser2 · on Dec 11, 2014

>Journalists value the sensational over the factual, and work hard (true) to tight deadlines.

No, people do. There are tons of highly professional journalists who want to do good work and write important, well-crafted, accurate stories. Who were inspired to get into the field by Watergate. Who are constantly begging their bosses to do labor-intensive features.

The economic reality is that there are not enough people who want this enough to fund it, except barely at a handful of institutions like NYTimes. Good work takes time and manpower and it doesn't sell. If management is doing its job (maximize shareholder value) then it is doing everything it can to turn its paper into Buzzfeed.

There are thousands of journalists who left their dying papers because they couldn't stand it anymore. Thousands more who were simply laid off, or took buyouts because they saw that they were going to be laid off if they didn't. My dad is one of these. They'd want nothing more than to work at a real paper again, but there aren't really real papers anymore.

Being sloppy and fast is absolutely about service to the customer. With a daily publication deadline, you can generally take the time to do it right. But the readership (and therefore management) wants stories on the internet as quickly as possible. Of course they are going to be sloppy.

(I agree with all your complaints about TV news, because that's what it's always been. In print... that's what they were forced to become when the money became tight.)

_tbgl · on Dec 10, 2014

Hard to see how this apparent stereotyping (a) wouldn't apply to people working in nearly every industry and (b) is relevant to this tool.

zo1 · on Dec 10, 2014

I guess it's relevant because the OP is saying that they wouldn't use it because they think that facts are irrelevant.

Definitely, we'd have to look at the studies in this space, namely whether journalists cite their facts, or write their pieces using citable studies/facts, etc. So I also won't accept the OP's statement of it as fact.

Though, in my, supposedly biased, opinion I'd say journalists are quite adept at twisting facts to suit their points, and also of omitting (i.e. cherry picking) facts that support their views/points.

_tbgl · on Dec 10, 2014

Of course. I just hope you recognize comments like these do precisely the thing they lament: make unsourced assertions designed to advance a point of view.

"Journalism" is a broad tent. I suspect there's plenty of room for tools like this one.

RobertKerans · on Dec 10, 2014

Hold on, you're using the fact that UK journalists ironically recycle a term that's normally used in a pejorative sense as justification for stereotyping them? What you've said can be effectively applied to any possible occupation in any field.

pavlov · on Dec 10, 2014

It's true that coders have lots of great tools for working with textual representations of programs... But IMHO programmers' tools are stuck in a certain kind of local maximum due to the difficulty of moving beyond text. We've done all we can to make textual programs easier to manipulate, but there are fundamental difficulties that can't be solved this way.

I'm personally interested in this question: What if coders had design tools as powerful as those used by architects and construction engineers?

I wrote a blog post about this recently: http://blog.neonto.com/?p=44

TallGuyShort · on Dec 10, 2014

I believe that's the idea behind this: http://www.mathworks.com/products/simulink/. I know a software engineer in the aerospace field that loves this product and describes it the way you do (except that it's more for embedded systems than for user experiences)

pbhjpbhj · on Dec 11, 2014

This makes me think of http://code.google.com/p/blockly/ .. particularly as I used it to do the hour of code with my 5yo this week. I find it actually an alright way to write some JavaScript in the limited context of code.org - it works OK on app.inventor too IMO.

declan · on Dec 10, 2014

I worked full-time as a journalist for a bunch of different news organizations before leaving to found the forthcoming http://recent.io/.

My suspicion is that neither I nor other journalists I know would use this newsclip.se tool. In a single newsroom I've seen people using Word, TextEdit, Google Docs, Notepad, BBEdit, Gmail, phone-based email clients, phone dictation, and even emacs (me, a few times) to write news articles. While the CMS is newsroom-wide, the writing and editing processes tend to be very personal. Journalists also tend to be individualistic and dislike being forced to use a standardized system without clear benefits.

Where something like newsclip.se might be beneficial would be as a kind of preprocessor/lint for the CMS. It could do a lot of what's being described in the linked article but without replacing the entire journalistic stack.

cryoshon · on Dec 10, 2014

Making a tool like this has occurred to me, but my idea had a slightly different focus.

Rather than attempt to link evidence to statements, which, if we're being honest, doesn't really bring us closer to the truth since many "sources" are merely other people's words, it was much simpler: identify weasel words, euphemisms, or use of the passive voice. Between these three features, I think that most factual writing could be improved a colossal amount.
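A crude version of that check fits in a few lines of Python. This is only a sketch: the word list is a hypothetical starting point, and the passive-voice test is a naive heuristic (a form of "to be" followed by a word ending in -ed or -en), so it will miss irregular participles and produce some false positives.

```python
import re

# Hypothetical starter list; a real one would be tuned by the copy desk.
WEASEL_WORDS = ["reportedly", "allegedly", "some say", "many believe", "arguably"]

# Naive heuristic: a form of "to be" followed by a word ending in -ed or -en.
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+\w+(?:ed|en)\b", re.IGNORECASE
)

def lint(text):
    """Return sorted (line, column, kind, matched_text) findings for a draft."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for phrase in WEASEL_WORDS:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, line, re.IGNORECASE):
                findings.append((lineno, match.start() + 1, "weasel", match.group(0)))
        for match in PASSIVE.finditer(line):
            findings.append((lineno, match.start() + 1, "passive", match.group(0)))
    return sorted(findings)

draft = "The vote was delayed.\nSome say the plan is reportedly sound."
findings = lint(draft)
for lineno, column, kind, matched in findings:
    print(f"{lineno}:{column}: {kind}: {matched}")
```

Because it reports line and column like a compiler warning, the same function could run as a lint step in front of a CMS without changing which editor anyone writes in.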

I would certainly appreciate an easier way to keep track of claims, citations, evidence, and interplay between a story's moving parts, though. I think that the article mentions a few tools which work toward that goal admirably. Right now, to make a really bulletproof piece, you need to be extremely scrupulous with self-identifying your claims and then providing written explication of evidence or hyperlinked evidence.

Additionally, it'd be really useful to have a tool which kept track of sentence structure and also allowed you to track logical rhetoric by keeping track of "If this, then that" style statements.

This is probably asking too much, though. A final hurdle is that journalists and writers tend to be old-school when it comes to technology, so it might be a hard sell to the older segment of that market.

lstamour · on Dec 10, 2014

Some of this (the identification of words, passive voice) exists already, though results vary: https://www.goodreads.com/topic/show/1094904-proofreading-so... (I link to this more because it's a comprehensive list of possible software even though the comparison was done by a vendor.)

bp999 · on Dec 10, 2014

Hemingway does some of what you're talking about, like identifying adverbs, use of passive voice and rating the reading level needed to understand what has been written: http://www.hemingwayapp.com/

LukaAl · on Dec 10, 2014

What do you mean by "writing tools as powerful as those used by coders"? In my experience good coders don't need very powerful tools. Super-complex IDEs are usually a disturbance to good coders rather than a help, and even the uber-geeks who use Emacs don't really use all the macros in their jobs.

A lot of good software engineers I know prefer a lightweight editor like the good old Vim or SublimeText over an IDE like Visual Studio, Xcode or Eclipse. That's because when you write complex code your brain is slower than your fingers, and the syntax and structure of the language you are using are under your skin; you don't need to think about them.

So, no, great coders don't use powerful editors. If they can choose, they use a very simple one with the minimum level of features they need to help them think faster (syntax highlighting, autocompletion, indentation) and stop there. All other features of a complex IDE are a disturbance rather than a help.

I don't know what a good journalist needs, but my guess is that they have the same problem: their brain is slower than their fingers, and the grammar of the language they are using should be well known to them. I don't know which tools they use, but in my opinion a simple text editor with a minimal spellchecking system (but not correction), well integrated with the publishing process, is more than enough.

From an innovation point of view, I'm usually suspicious when someone claims to have an innovative tool for an established industry. If the industry is established, the actors have already optimized their tools, and if there could be an improvement it is very small. What usually happens is that something changes in the industry and makes the current tool obsolete (e.g. a new process or a new technology becomes available). But if you want to create something really innovative, you have to find this change. Simply bringing something that exists in one field into a different field rarely works. And if it works, it is because you have a really good understanding of both fields.

debaserab2 · on Dec 10, 2014

While it's definitely true in some cases, what you've stated is a generalization: there are some great coders who use SublimeText or some other lightweight editor. Unfortunately, this does not mean using SublimeText makes you a good coder (or even is a sign of one). There is a lot of buy-in to the concept that using a more lightweight IDE makes you a better coder. I see a lot of entry level developers taking on this mentality.

Sometimes I cringe when I watch a developer using a lightweight IDE, missing so many obvious syntax errors that they're not going to notice until they go to compile or run their code. So many wasted workflow cycles.

I work on large codebases and I know that I don't have the mental capacity to hold the entire structure (object interfaces, app structures, third party library usage) in my head. It's great that I can offload this task to my IDE and concentrate on the problem I'm solving instead of getting disrupted by having to think about code mechanics.

On a side note, I think it's funny that people like to consider vim a lightweight IDE. All the good vim coders I know of make heavy use of plugins to get the exact same features a full fledged IDE has.

freshhawk · on Dec 10, 2014

As someone also in the "editor > IDE" camp, I agree that a single monolithic tool is maybe the wrong path to take.

But good coders still need powerful tools, just lots of tools, rather than one complex one. Simple and powerful are orthogonal.

I disagree that established industries have already optimized their tools. That's just blatantly untrue in real life, and for good reason. The tool makers always have the best tools. A journalist can't make a better tool for journalism without learning an entirely new career (and tool making requires more experience than average as well; beginners don't make good tools).

LukaAl · on Dec 10, 2014

Agreed on the need for powerful tools. And sometimes you end up spending time writing the right tool for a specific task when it is missing (I've done that more than once). As for your rebuttal on tools for established industries, I just said that it's a suspicious claim, not that it is false. The point is this: if an industry is a niche, the people who work in that niche usually have to build their own tools. In larger industries, like journalism, do you really think toolmakers never ask professionals which features they need to make their work faster, or that professionals never approach the toolmakers themselves to ask for better tools? But let's look at the problem from a different perspective. Could you name an innovation that is just the application of a consolidated concept from one area to another consolidated area, without a change in the underlying industry or technology? There's a lot of literature on this point. If you are interested, you could search for the Utterback S or the concept of paradigm shift in innovation.

efuquen · on Dec 10, 2014

> All other features of a complex IDE are a disturbance rather than a help.

The thing about IDEs that sucks productivity-wise is that instead of just giving you all the things you do want (syntax highlighting, autocompletion, indentation), they provide bloat that doesn't make a task faster to do, just easier. What I mean by easier is putting GUIs on top of command line tools and APIs. A great coder looks at all that GUI cruft and thinks, man, what the hell is the underlying API or CLI tool I can use instead to get rid of all this middleware crap that is probably full of bugs? Someone who is not comfortable with that instead leans on those GUI tools because they don't really want to understand what is going on under the hood; they need the wizards, text boxes, drop-downs and such to make it easier on themselves.

With that said, there is value in having an editor that can actually understand your target language all the way down to the AST level. That's how you get nice things like reliable refactoring without having to use grep, sed, and friends, or detection of syntax errors and such. Even though the Vim plugin system is doing great, there can definitely be more innovation in that space (and there clearly has been lately). But let's just leave out all the Spring XML file builders and Tomcat/Jetty container managers and focus on the code.
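To make the AST point concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). The `RenameIdentifier` class, the `rename` helper and the sample `report` function are made up for illustration. It is a toy: the rename is not scope-aware, and it is not meant to show how any particular IDE does it. What it does show is that a rename working on the syntax tree leaves the string literal `"total"` alone, which a blind `sed s/total/amount/` would not.

```python
import ast


class RenameIdentifier(ast.NodeTransformer):
    """Rename variable uses and function parameters, but nothing else."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Variable references and assignments are Name nodes.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # Function parameters are arg nodes.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename(source, old, new):
    """Parse, rename on the tree, and turn the tree back into source."""
    tree = RenameIdentifier(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


SOURCE = '''
def report(total):
    label = "total"
    return label + ": " + str(total)
'''

renamed = rename(SOURCE, "total", "amount")
print(renamed)
```

The parameter and its use become `amount`, while the string literal still says `total`, because string contents are `Constant` nodes that the transformer never touches.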

zo1 · on Dec 10, 2014

>"So, no, great coders don't use powerful editors; if they can choose, they use a very simple one with the minimum set of features they need to help them think faster (syntax highlighting, autocompletion, indentation) and stop there. All other features of a complex IDE are a disturbance rather than a help."

Says you, and your anecdotal evidence/experience. Sounds to me that you have more prejudice against "complex" IDEs than you do actual facts. Would countering your points by telling you of the countless bad "coders" I've worked with that use text-editors instead of IDEs convince you to change your mind? (I'm going to stop using your term 'coders' here) Or the absolutely great programmers that I've seen that use powerful IDEs to supplement and augment their abilities?

Do you write in a compiled language that can provide rich intellisense, or something more like Ruby/Python/bash that is a tad lacking in that department?

LukaAl · on Dec 10, 2014

I use "coder" because that's the term used in the question. I usually prefer "software engineer". I'm not a coder, or a software engineer, for what it's worth, but I'm doing it now, I understand how it works, and I use all sorts of languages, from C to Python. I have no prejudice against complex editors. In my opinion everyone should be able to freely choose the editor/IDE they prefer, and that's not always the case. My anecdotal evidence is that, when free to choose, the better you are at a language, the more you prefer an editor. Nevertheless, I could accept your point of view, but you should explain something to me: if there are different points of view, you have to accept that the advantage of this so-called "powerful tool" is less than clear.

mkehrt · on Dec 10, 2014

Unrelated to the actual point, but I'm curious. You seem to be loath to use the word "coder". Why is that? How is it different from programmer? Is there some sort of cultural shibboleth I'm unaware of here?

zo1 · on Dec 10, 2014

I don't think it's a broader cultural thing, or anything. And personally, I'm fine with the term. It's just that there was something about the way the OP was using the word "coder" that didn't quite sit well with me. Reading back over his post, I can't quite put a finger on what it was that actually bothered me about it.

kenrikm · on Dec 10, 2014

I think there may be a case for using something like GitHub to allow editors to edit, etc., and writers to pull the edits, though I'm pretty sure most media outlets have systems that do almost the same thing.

chc · on Dec 10, 2014

Git and GDB (or your preferred equivalents) are pretty powerful tools relative to what most journalists routinely work with, and pretty much every good coder uses those.

derekp7 · on Dec 10, 2014

That actually brings up a side topic that I've been trying to follow -- writing text so that it works better with version control. You put each thought or phrase on its own line, and let the markup system compile it into its final form.
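A rough sketch of that workflow, assuming nothing beyond the Python standard library. The helper names `to_one_sentence_per_line` and `compile_paragraphs` are made up, and the sentence splitter is deliberately naive (it will trip on abbreviations like "Dr."). The first function breaks prose into one sentence per line for storage in version control; the second "compiles" that source back into flowing paragraphs, treating blank lines as paragraph breaks.

```python
import re

# Split after ., ! or ? when followed by whitespace (naive on purpose).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def to_one_sentence_per_line(paragraph):
    """Rewrite a paragraph so that each sentence sits on its own line."""
    return "\n".join(_SENTENCE_END.split(paragraph.strip()))


def compile_paragraphs(source):
    """Join the lines of each paragraph; blank lines separate paragraphs."""
    paragraphs = re.split(r"\n\s*\n", source.strip())
    return "\n\n".join(
        " ".join(line.strip() for line in p.splitlines() if line.strip())
        for p in paragraphs
    )


draft = "The council met on Wednesday. It elected a new president! Was anyone surprised?"
source = to_one_sentence_per_line(draft)
print(source)
print(compile_paragraphs(source))
```

With the text stored this way, changing one sentence touches one line, so an ordinary line-based diff stays small and readable.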

Fede_V · on Dec 10, 2014

My distinct impression is that the limiting factor in quality journalism is gumshoe reporting, not an incredibly powerful CMS.

Buzzfeed has the most advanced CMS, but their reporting pales in comparison to the NYT's. When they do bother to do proper reporting (McKay Coppins, etc.) they get excellent stories; the rest of the time they use filler, because it's cheap to produce.

limelight · on Dec 10, 2014

I actually think technology can improve the quality of journalism. Not necessarily by making great stories better (though the NYT is definitely proving that's possible with some of their interactives) but by making reporting in general cheaper. If a great CMS can make it possible to produce all the economically required filler in half a day, that leaves time to do actual investigative reporting later.

Buzzfeed is definitely pushing this a little bit, though I do think they have a bit too much baggage to get far enough with it.

(PS. If anyone wants to help tackle this problem, we're hiring @ Cafe: http://cafe.com/careers)

Fede_V · on Dec 10, 2014

I'm sure technology can improve things - for example, the NYT is doing a much better job now of marketing its (excellent) cooking section.

My impression, though, is that what makes great journalism is reporting. Almost all the time, when I'm reading a great article, I do so using Readability to strip away all the crud and just end up with the text in a decent size and no pictures/hyperlinks/movies.

limelight · on Dec 10, 2014

> My impression though, is that what makes great journalism is reporting.

Yup. At the end of the day, I think in-depth textual reporting does rule. But what tech can do is make that great reporting slightly cheaper (and thus more plentiful/viable).

apozem · on Dec 10, 2014

This is absolutely true. I spend hours laying out print pages. A good CMS could cut that time down to nothing. It's a ton of work up front, but it saves you so much time later.

Edit: I also need a job soon and am applying to write for Cafe!

prawn · on Dec 10, 2014

"the rest of the time they use filler, because it's cheap to produce"

And because people read filler because it's easy and unchallenging.

aethertap · on Dec 10, 2014

This is a great idea. So much so that I have "Why didn't I think of that?" syndrome.

Here are a couple of suggestions that I could use if you're looking for feature requests. Most of these things exist in one place or another, but having them integrated into a one-tool workflow would be awesome.

1. Some kind of crowdsourced reputation system for sources (e.g. medical journal sites have high reputation, naturalnews.com has low)

2. Auto cross-referencing between articles based on content.

3. TODO list management

4. License-aware relevant image suggester (please!!!) This alone would be a killer feature for me. Pick out topic words and search selected image sites, then give me thumbnails to choose from.

patrics123 · on Dec 10, 2014

Yes please! Anybody needs such a thing for blogging anyway!

> 4. License-aware relevant image suggester (please!!!)

jseliger · on Dec 10, 2014

> Yes please! Anybody needs such a thing for blogging anyway!

I wrote this in another comment, but a tool like this already exists: http://www.stevenberlinjohnson.com/movabletype/archives/0002... and has for about a decade. I drop all my blog posts in Devonthink Pro, and I'm often surprised by the connections that emerge.

aethertap · on Dec 10, 2014

Thank you for this. I don't use a Mac, but sometimes tools like this make me wish I did.

mjklin · on Dec 11, 2014

Similar tools for Windows:

http://scan.sourceforge.net/

http://dtsearch.com/

phkahler · on Dec 10, 2014

I've always wondered why writers don't use things like revision control and decent diff tools. I'm not sure the existing tools are well suited to them (yet).

jonnathanson · on Dec 10, 2014

"I've always wondered why writers don't use things like revision control and decent diff tools. I'm not sure the existing tools are well suited to them (yet)."

Certain writers do. It just depends on the domain. For book writers, there are tools like Scrivener. For screenwriters, there are tools like Final Draft. These tools help tremendously with auto-formatting and version control. They also keep track of things like characters and settings, presumably by assigning them to certain classes and IDs "under the hood." (I've never looked under the hood, but I assume they are XML-based.) They are fantastic tools for the writers they're made for. They're also fairly WYSIWYG, and do not require writers to know anything about markup languages. That's pretty key, because many writers are not technically proficient. (Modest technical proficiency, at the very least with HTML and some CSS, is probably going to become a requirement in the future...but I digress.)

For journalism, however, Word is the default—and obviously, it has its strengths and its weaknesses. Wonky changelogs and substandard version control are two of the biggest. Some editors at some publications are moving over to Google docs, for the superior versioning and collaboration capabilities. But Google's editor doesn't offer the robustness and feature set of Word, and it's unfamiliar, so a lot of editors and writers are reluctant to adopt it. Personally speaking, I much prefer Google to Word when it comes to working with my editors. I'll gladly take in-doc comments and version control over emailing Word docs back and forth, any day of the week.

danielweber · on Dec 10, 2014

Microsoft Word has had revision control for at least 15 years, probably even longer. It's not like the document creation world doesn't have exposure to these ideas.

scholia · on Dec 11, 2014

Microsoft Word has a shedload of features that journalists can't be bothered to learn and/or use, and that even the author appears not to know about, judging by his reference to "replacing simple text processors like Microsoft Word".

runevault · on Dec 10, 2014

Some stuff (revision control) is at least reasonably good, though it depends somewhat on what format you are saving text into. Also, since most diff tools work at the line level, they aren't as useful as they would be if they split on, say, periods/exclamation points/question marks, since those are the real dividing points in any form of writing (journalism/prose at least).
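As a sketch of what splitting on sentence punctuation could look like, the snippet below feeds sentences rather than lines to Python's standard `difflib.unified_diff`. The `split_sentences` and `sentence_diff` helpers and the sample text are hypothetical, and the regex splitter is naive about abbreviations; this is one possible approach, not a description of how any existing diff tool works.

```python
import difflib
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_diff(old, new):
    """Unified diff whose unit of change is the sentence, not the line."""
    return list(
        difflib.unified_diff(
            split_sentences(old),
            split_sentences(new),
            fromfile="draft",
            tofile="edit",
            lineterm="",
        )
    )


old = "The council met on Wednesday. It elected John Smith. The vote was close."
new = "The council met on Wednesday. It elected John Smith as president. The vote was close."

diff = sentence_diff(old, new)
print("\n".join(diff))
```

Even though both versions are a single long line of text, only the edited sentence shows up as removed and added; the sentences around it appear as unchanged context.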

cpach · on Dec 10, 2014

I agree that prose requires different diffs than source code. Here is an example of how GitHub tackles that issue: https://news.ycombinator.com/item?id=7240122

runevault · on Dec 11, 2014

Oh, interesting. I either had not seen this or forgot about it. Good to know, thanks!

jkaunisv1 · on Dec 10, 2014

As with programmers, I think it's a matter of education and time pressures. If a cowboy coder hasn't been exposed to a strong version control culture, and works mostly alone, they won't have as strong a need for VC (or even know about it). When you're always under deadline pressures, you're focused on getting the product out above all else.

Basically, they don't know there's a better way. They're used to hacking through Word's track changes, and they always have a deadline to meet. That doesn't leave much time for improving your toolset.

Also, many writers have very idiosyncratic workflows. From what I gather their education and workplaces focus more on the craft of writing well, regardless of whether it's with a pencil or a keyboard. Whereas it's hard to contribute to a codebase without being exposed to version control.

27182818284 · on Dec 10, 2014

Hmm. I think it is akin to the success of Dropbox. Tools like Rsync and binary diffs already existed, but they weren't friendly enough for a lot of developers, dads, non-tech coworkers, etc to use until Drew Houston wrapped them up nicely in Python.

(I vaguely recall him mentioning that on his app even)

limelight · on Dec 10, 2014

More and more newsrooms are actually integrating diffing tools into their CMSs.

globalpanic · on Dec 10, 2014

Scrivener allows you to take snapshots, which is a kind of basic version control.

lastofus · on Dec 10, 2014

I've found that Draft Control is excellent, easy-to-use version control for Word/Pages:

http://www.draftcontrol.com/

limelight · on Dec 10, 2014

This is a very cool prototype, and I especially like the idea of calling it a "journalism IDE" (as opposed to a CMS).

At work (http://cafe.com), I've actually been working on something similar with our CMS (Monsoon). We're trying to use technology to make telling cohesive online narratives a lot easier.

Interestingly, one of the biggest hurdles so far has been in decomposing stories. Media traditionally treats each story as a big blob of text (in most cases, HTML), but we're trying to change that so that each story is actually just an arrangement of smaller tidbits (we call them droplets). Switching to that model helps us to encode a lot more semantic information, and also to reflow stories effectively for context.
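To illustrate the general idea of a story as an arrangement of smaller pieces, here is a purely hypothetical sketch. It is not Monsoon's actual data model, which isn't described here. The `Droplet` and `Story` classes, the `kind` and `priority` fields and the `render` function are all invented for the example. The story holds only an ordered list of droplet IDs, so the same droplets can be reflowed, for instance into a shorter version that keeps only essential pieces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Droplet:
    """One self-contained piece of a story (hypothetical shape)."""
    id: str
    kind: str          # e.g. "fact", "quote", "background"
    text: str
    priority: int = 1  # 1 = essential; higher numbers get trimmed first


@dataclass
class Story:
    """A story is just a headline plus an ordered arrangement of droplets."""
    headline: str
    droplet_ids: list = field(default_factory=list)


def render(story, droplets, max_priority=None):
    """Reflow a story, optionally dropping low-priority droplets."""
    parts = []
    for droplet_id in story.droplet_ids:
        droplet = droplets[droplet_id]
        if max_priority is None or droplet.priority <= max_priority:
            parts.append(droplet.text)
    return " ".join(parts)


droplets = {
    d.id: d
    for d in [
        Droplet("d1", "fact", "The council elected a new president on Wednesday."),
        Droplet("d2", "background", "The vote followed weeks of debate.", priority=3),
    ]
}
story = Story("Council elects president", ["d1", "d2"])

full = render(story, droplets)
brief = render(story, droplets, max_priority=1)
print(full)
print(brief)
```

Because droplets are stored separately from the story that arranges them, a second story could reference `d2` as background without copying its text.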

We're not yet to the point where we integrate/suggest droplets from other stories automatically, but that's definitely the goal. Maybe we could integrate something like Newsclip.se to encourage that.

(PS. If you want to help us get there, we're hiring: http://cafe.com/careers)

div · on Dec 10, 2014

I agree that it's a cool idea! It may be nitpicky, but I had the opposite reaction to calling it a journalism IDE.

It just seems as if the intended audience wouldn't have the faintest idea what an IDE is.

limelight · on Dec 10, 2014

So, I think it's a question of who you're marketing to. I'd never call this an IDE to the journalists themselves, but calling it a "journalism IDE" is a great way to recruit developers/technologists—which is actually a big challenge (in my experience, most great developers don't want to touch anything close to content, probably due to latent association with the abomination that is WordPress).

pudo · on Dec 10, 2014

Hi all, post author here! We're working on the code base and would love everybody's input here: https://github.com/pudo/tmi

barkingcat · on Dec 10, 2014

Hi pudo,

What is the relationship between the version of newsclipse that's running as a demo online and the repo:

https://github.com/Canvas-Hackathon-Teams/Newsclipse

It seems to be the same - is that where I get a copy to run locally?

pudo · on Dec 10, 2014

I've just decided to move the code base to run against a SQL backend rather than Mongo, and have begun to implement user permissions etc.

bhaumik · on Dec 10, 2014

Here's a neat collection of tools that might help: http://www.producthunt.com/e/tools-for-writers

grey-area · on Dec 10, 2014

Your original mockup link is broken on the GitHub repo.

dredmorbius · on Dec 11, 2014

Look up the hNews microformat, mentioned above.

couchand · on Dec 10, 2014

This somewhat plays into Rebecca Parsons' talk [0] at hack.summit() last week. She emphasized that we've done a great job building tools for other technical folks, but we've really dropped the ball on supporting non-technical fields. Her perspective was specifically on the potential of DSLs, but the call to developers to be more relevant is likewise compelling.