When I was working as a web editor at a metro daily paper a few years ago, I proposed something similar: an XML-like syntax that would allow for metadata to be included in drafts of news stories, some (but not all) of which could be made use of in online versions of the story (such as a link to a map when you're referencing a location).
A lot of wire copy already includes metadata, but it's generally just in a header that accompanies the story.
What I was envisioning was something more like what is being proposed for the semantic web:
<name id="1394">John Smith</name> was elected president of the <organization id="2315">New Castle County Council</organization> on <date value="2014-12-10">Wednesday</date> at the <place lat="39.685881" long="-75.613047">county headquarters</place>.<source id="23" name="Mila Jones" title="New Castle County public relations officer"></source>
I also wanted to use the metadata to help copy editors trim wire stories:
<priority value="1">This amounts to de facto resegregation. <priority value="4">(And we all know how well segregation worked out the first time.)</priority> If the school district still values integrated schools, it must act swiftly to correct this effect.</priority>
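A minimal sketch of how priority tags like these could drive trimming. It assumes every `<priority>` tag carries a `value` attribute, and `trim` is a hypothetical helper, not part of any real newsroom system. It drops any span whose priority number is above the cutoff and does nothing about capitalization or punctuation:

```python
import xml.etree.ElementTree as ET


def trim(markup: str, max_priority: int) -> str:
    """Return the story text, omitting spans tagged above max_priority."""
    # Wrap the fragment in a hypothetical <story> root so that text with
    # several top-level tags still parses as one XML document.
    root = ET.fromstring(f"<story>{markup}</story>")

    def render(node: ET.Element) -> str:
        parts = [node.text or ""]
        for child in node:
            # Only <priority> tags are filtered; other tags pass through.
            drop = (
                child.tag == "priority"
                and int(child.get("value", "1")) > max_priority
            )
            if not drop:
                parts.append(render(child))
            # Text after a closing tag belongs to the parent, so keep it.
            parts.append(child.tail or "")
        return "".join(parts)

    # Collapse the double spaces left behind by removed spans.
    return " ".join(render(root).split())


SAMPLE = (
    '<priority value="1">Keep this. '
    '<priority value="4">(Cut this aside.)</priority> '
    "And this.</priority>"
)

print(trim(SAMPLE, 1))
print(trim(SAMPLE, 4))
```

With a cutoff of 1 the parenthetical disappears; with a cutoff of 4 the full text comes back.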
It turns out, though, even when you create a UI that lets reporters and editors easily plug in this metadata without having to understand XML, they are not apt to fill it in, because they are just so overworked as it is.
Plus, in order for this to work on a larger scale, you'd have to get an incredible amount of buy-in. You'd have to get reporters and editors to agree that it's worth their time. You'd have to build software to support it. You'd have to get all of the different media companies out there to agree on standards.
It's just ... not what the media industry is (or should be) focused on right now. They've got bigger things to think about, like how to find a viable business model.
Absolutely true that journalists are too busy to bother with markdown. What would be great, however, is to have tools for copy editing that do things like recognize people's names, Google them and spell check them. Recognize people's titles, Google and confirm them, and spell check them. Style passes that could do simple grammatical edits around a site's style guide: call it a style filter.
I have to say, for me as a journalist in tech, the one thing that ends up taking the most time in my stories is looking up name spellings and people's titles, and most importantly, trying to figure out if your freaking company is spelled TheCompany, The Company, or Thee Cmpany or some crazy variant. You startups and your mid-word-capital-letters. The bane of copyeditors everywhere.
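A rough sketch of what an automated check for this could look like, assuming the newsroom keeps a list of canonical company names. `find_name_variants` and the sample names are made up for illustration. It only catches differences in spacing and capitalization, not misspellings, and it assumes names made of letters and digits:

```python
import re


def find_name_variants(text: str, canonical_names: list[str]) -> list[tuple[str, str]]:
    """Return (as_written, canonical) pairs for names spelled off-style."""

    def key(s: str) -> str:
        # Ignore case, spaces and punctuation when comparing names.
        return re.sub(r"[^a-z0-9]", "", s.lower())

    canonical = {key(name): name for name in canonical_names}
    words = re.findall(r"[A-Za-z0-9]+", text)
    found = []
    for i in range(len(words)):
        # Try runs of one to three words starting at each position.
        for n in (1, 2, 3):
            chunk = words[i:i + n]
            if len(chunk) < n:
                break
            written = " ".join(chunk)
            match = canonical.get(key(written))
            if match and written != match:
                found.append((written, match))
    return found


print(find_name_variants(
    "We spoke to The Company and Youtube today.",
    ["TheCompany", "YouTube"],
))
```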
You can do quite a lot of that in Microsoft Word, including the use of research tools, but it's a lot of work to set up.
Agree about the problems coping with company and product titles, etc. The problem could be reduced by refusing to play that game, e.g. by using registered names rather than marketing styles or logos.
<priority value="1">This amounts to de facto resegregation. <priority value="4">(And we all know how well segregation worked out the first time.)</priority> If the school district still values integrated schools, it must act swiftly to correct this effect.</priority>
Forget about editors, use this kind of mark-up for your readers. Imagine changing an in-depth article into a truncated, 200-word summary with the click of a button. Activating a different tag would include the reporter's subjective commentary (or perhaps have multiple editorials based on the same "scaffolding")?
In print journalism, you're supposed to include information in descending order of importance. As a reader, you're expected to stop reading once you get to details you don't really care about, confident that you won't miss something more important buried farther down.
That's largely true for hard news stories. But what about news features or op-eds or sports profiles, where the traditional inverted pyramid structure isn't employed?
While I can understand the arguments for this, in practice the approach frustrates me to no end.
Worse are authors whose writing has no excerptable lede. Gina Kolata and mumber Morgenstern (health / Well articles in the NY Times) especially do this.
I find some old-school journos -- Dan Gillmor particularly comes to mind, I've called out a few others in G+ posts -- still practice strong ledes and heads. Many newer ones start with "My latest at <some website somewhere> read more".
Which ... tells me fucking nothing.
Lede with your lede. Trail with your link or call-to-action.
The so-called "inverted pyramid". I'm finding it applied less and less frequently.
I've also noted that it's become pretty much standard practice for news bureaus to write single-sentence paragraphs. Literally, every sentence of a story is its own paragraph. I don't know when that became standard practice, but sometime in the later 1990s or 2000s, particularly as stories moved online.
I think there's too much insularity within much of the journo community. But many of the newcomers are also subject to outside influences which call their credibility strongly into question. Lack of uniform copyediting, for better or worse, means a wide range of writing quality.
Though I'm seeing that even in long-standing brands -- NY Times, Forbes, and elsewhere.
Zinsser mentions this one-sentence paragraph thing in "On Writing Well", citing an AP article from 1993. (In the "Paragraphs" section of Chapter 10.)
>Forget about editors, use this kind of mark-up for your readers. Imagine changing an in-depth article into a truncated, 200-word summary with the click of a button.
I imagine it, and most users would still not care. They skim articles anyway.
Actually, this is an extremely simplified example, and once you get into the nitty gritty, it becomes a lot more difficult to add/remove various elements. For instance, you might need to capitalize a word differently depending on whether something has been excised immediately before it, or you may need to adjust punctuation in ways that you can't do simply using an XML-like format. Really, you need what amounts to a Natural Language Generation library to implement a robust system.
I implemented that for my blog around 2001. It's quite tedious to write, and I haven't done it since. Even two versions of a story is a lot.
I did something like this in college, and I was thinking semantic annotations would be an editorial pass like copyediting. These days you'd probably let some ML system take a crack at it before using human effort to bring the quality up to your publication's standard. In any case, it doesn't have to get in the way of writing a good story.
>> It turns out, though, even when you create a UI that lets reporters and editors easily plug in this metadata without having to understand XML, they are not apt to fill it in, because they are just so overworked as it is.
Nailed it. I'm a developer in a newsroom, and I'm dealing with flak just asking them to write a non-automated teaser text for their blog posts. They can't be bothered.
> They've got bigger things to think about, like how to find a viable business model.
This is actually potentially a big part of that. People are reading more than they ever have—it's just not necessarily newspapers that they're reading.
>> "People are reading more than they ever have—it's just not necessarily newspapers that they're reading."
Any stats on that? I agree people are reading more than ever but disagree that they're not reading newspapers (online). I think they are, they just aren't paying for it anymore.
That said, you could end up applying this to a "news article IDE" automatically, with less human intervention required -- or at least, provide automatic suggestions. I couldn't find any clear links on the topic, but here are a few that can be followed with a bit of research:
At this point, though, I'm thinking it'd be the equivalent to spelling and grammar suggestions in Word, appreciated somewhat but ultimately considered useless the first time it screws up. But still better than nothing, right? ;-)
Stanbol could definitely be used as part of a system like this. In fact, that sort of thing is a big part of how we're using it at Fogbeam, although aimed at assorted knowledge workers in an enterprise setting, and not at journalists specifically.
Yeah, just that: they're too busy. Most news articles are valid for just one edition of a newspaper, so about half a day or even less if the newspaper is published more than once a day. It's write it and move on, in a lot of cases. Investigative journalism probably has more use for a system like this, but even then I doubt they'd want to fill in XML forms when they could just write sentences. Besides, usually you can trust an editor to read an article and remove the bits that aren't relevant based on their own judgment, instead of a 'priority' hint by the original author.
A lot of this information can be extracted directly from the text. It is not like "New Castle County Council" is particularly ambiguous. The trouble for a lot of reporting is that the stories simply lack depth and good links to external content.
I agree that this can be generated at publish time. The question is, how much value does it provide over something generated at read time by a browser plugin? The answer is - at publish time presumably someone at the publisher outfit will take a cursory look at it. That's it. May as well have readers mark up the articles!
It's actually a lot easier than you'd think, because thanks to Zipf's law pretty much every article in these evergreen topic areas is using the same set of 1,000 or so facts. And these facts mostly come from the same set of government or NGO reports, which are updated at most once a year, and often only once every ten years.
The cool thing is that you can then use a javascript snippet to track which facts are being used in which documents, automatically mark facts as outdated when they change, etc.
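To make the bookkeeping concrete, here is a small sketch in Python rather than JavaScript. The `FactRegistry` class and its method names are invented for illustration. It records which documents cite which fact and flags a document as outdated once the fact's text has changed since the citation:

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FactRegistry:
    # fact_id -> current text of the fact
    values: dict = field(default_factory=dict)
    # fact_id -> set of document ids that cite it
    used_in: dict = field(default_factory=lambda: defaultdict(set))
    # (doc_id, fact_id) -> text of the fact at the time it was cited
    cited: dict = field(default_factory=dict)

    def set_fact(self, fact_id: str, text: str) -> None:
        self.values[fact_id] = text

    def cite(self, doc_id: str, fact_id: str) -> None:
        self.used_in[fact_id].add(doc_id)
        self.cited[(doc_id, fact_id)] = self.values[fact_id]

    def outdated_docs(self, fact_id: str) -> list:
        """Documents whose cited text no longer matches the current fact."""
        current = self.values.get(fact_id)
        return sorted(
            doc for doc in self.used_in[fact_id]
            if self.cited[(doc, fact_id)] != current
        )


registry = FactRegistry()
registry.set_fact("population", "10 million")
registry.cite("story-a", "population")
registry.set_fact("population", "11 million")  # the report was updated
registry.cite("story-b", "population")
print(registry.outdated_docs("population"))
```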
That's a really great idea! We definitely need some primary source of truth to refute all this half-wisdom that's going around, and of course to self-educate.
"An attempt to replicate this study in 5934 8 year old children failed: No relationship of the common C allele to negative effects of formula feeding was apparent, and contra to the original report, the rare GG homozygote children performed worse when formula fed than other children on formula milk.[5] A study of over 700 families recently found no evidence for either main or moderating effects of the original SNP (rs174575), nor of two additional FADS2 polymorphisms (rs1535 and rs174583), nor any effect of maternal FADS2 status on offspring IQ.[6]"
Good catch, that's exactly why this needs to be open sourced!
I actually looked at that wiki article, but probably before that was added. And that's a good example of a fact that had some editorializing on my part, which I ultimately want to eliminate. The original idea was to give context to facts, but I think it would make more sense to create a metadata format to do this, e.g.:
- To understand this fact, you need to understand X, Y, Z.
- This fact is needed in order to understand facts X, Y, Z.
This way facts can live and die on their own, rather than being bundled together according to the tastes of a single curator.
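The two relations above could be stored as a plain mapping from each fact to the facts it requires. This is only a sketch: the fact ids are placeholders and the helper names are made up:

```python
def prerequisites(fact: str, requires: dict) -> list:
    """All facts needed, directly or indirectly, to understand `fact`."""
    seen = []
    stack = list(requires.get(fact, []))
    while stack:
        current = stack.pop()
        if current not in seen:  # also guards against circular references
            seen.append(current)
            stack.extend(requires.get(current, []))
    return sorted(seen)


def dependents(fact: str, requires: dict) -> list:
    """Facts that list `fact` as a direct requirement."""
    return sorted(f for f, needs in requires.items() if fact in needs)


# Placeholder ids: fact A needs X and Y, and X in turn needs Z.
REQUIRES = {"A": ["X", "Y"], "X": ["Z"]}

print(prerequisites("A", REQUIRES))
print(dependents("X", REQUIRES))
```

Only the "requires" direction is stored; the "is needed by" direction is derived from it, so the two can't drift out of sync.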
I actually registered findmeaning.in to be the domain for a website I plan to build. Happy to accept partners! If you can find my email at http://qbix.com/about then hit me up.
The website was originally supposed to be an annotation site like rapgenius - and I have already spoken to people at LyricFind who would give me a license to use their entire database of songs!
But then I thought it could go much further. First of all, you could release a bookmarklet and browser plugins that would let people mark up websites for others to see. Each such mark would have its own associated discussions etc.
But we could go even further. We could source facts and debunkings of claims. We could in fact build a graph of dependencies between claims and give people a way to assess their truth value. In this way we could for example have a discussion of various religious claims once and for all, and any visitor would see what other claims they depend on. I thought, for instance, that Coca Cola invented the modern image of Santa Claus as wearing red. I was wrong.
Snopes, Politifact, etc. could all be sources. Something like facebook posts could definitely use this bookmarklet.
So do you want to findmeaning.in/The-US-Constitution? Or findmeaning.in/Pokerface or findmeaning.in/Book-Of-Jubilees?
If you want to help me build it, drop me a line.
PS: Also I find it really valuable to have a followup after a news story has faded, eg to see if that kid facing life in jail for hash brownies got off (he did).
Genius.com meets first SERP in a meta reincarnation of hypertext. It's so esoteric that it seems like such a good idea, but it requires so much definition and scoping in execution. Good luck :)
This is amazing! I might be able to help get more media people to use it. I'm one of the founders at Beacon. Right now we're focused on helping journalism projects get crowdfunding but I'd love to chat more about what you're doing! Email me if you're game: adrian@beaconreader.com
Thanks! Yeah I spent a couple thousand hours on it, but for whatever reason have gotten pretty lukewarm feedback from various media organizations I've pitched it to. I think the power to make content creation 10x cheaper while vastly improving quality and total derived value is pretty self evident, but go figure. So for now it's been fairly dormant. I think the sections on antibiotic resistance and arrest statistics have both changed slightly since I last updated this, but everything else should be pretty current.
Pretty sure the major news services handle this already through their CMSes. Editors can cut and paste "story blocks" that are updated over time. Unfortunately, many news agencies don't do actual research or fact-checking anymore, so your implementation probably failed because it targeted the way news did business 20 years ago, not the entertainment-driven news media we have today. You probably also didn't understand enough about their workflows and CMS systems.
I can't be bothered to look for it right now, but the NYTimes did a behind-the-scenes look at their CMS probably a year ago.
I'm familiar with the scoop functionality you mention, and my last job involved making APIs to deliver content to media CMS, so I have some general familiarity with the space. Obviously infotainment will always be a large portion of what the media produces, and this isn't really relevant to that. But once you get beyond that type of content, media organizations aren't actively trying to produce low quality articles, it's just that that's what they're incentivized to produce. If you can make it less expensive and more profitable to produce high quality content, then that's what will get produced.
Also there was never really any implementation that failed per se, I actually still think this would work. I'm just currently spending my time on other stuff that I think has more potential. For the last several years this has been kind of a hobby project that I work on here and there whenever I get burnt out on whatever I'm supposed to be doing.
The main problem with it is that it just takes an enormous amount of time to add new content. Any given fact can easily take an entire day to add... By the time you go through all the competing claims, find the primary sources, read through the methodology for each one, etc., that's easily 8 - 10 hours per fact. Which is why there still isn't any code or UI even after multiple years of work.
It's never easy to get someone to change their behavior. When you pitch them, all they hear is the pain of changing platforms.
Your assumption that media companies actually care about better quality is also flawed. They care about sensationalism.
The people want to be entertained, and unless you can show how this entertains them, you'll have a hard time getting past their resistance to change.
Sounds to me like it would be an awesome resource for mathematicians. It's just a different brand of information, but many people would love to have a useable proof reference and knowledge dependency system. Especially if it can be audited and contributed to easily.
Good call. This is the first time I've shared a link to this publicly, so I haven't put much thought into this yet. A lot of the quotes, especially in the sections about healthcare and pharma, come from copyrighted books. So there might be some issues with allowing people to redistribute this entire thing wholesale, especially for commercial use. But for now you only need to cite the primary source for any given fact. At some point I'll dump this on citevault.org, and I'll have to put some thought into licensing issues I guess.
When I wrote my dissertation I used Scrivener: http://www.literatureandlatte.com/scrivener.php which is a wonderful tool for writing. An expanded open source application similar to that with a good plugin system might be very useful.
My first thought upon seeing the article title was "How is this going to be different from Scrivener", which is what a huge chunk of the pro writers I know seem to use.
Most programmers are barely capable Stack Overflow copy'n'paste merchants.
Programmers value the hacky over the elegant, and work hard to tight release cycles.
And yet there are programmers who care, and who seek out and use better ways of doing things, including the tools to help them. Just as there are journalists who care about getting things right, digging up and exposing the truth.
In fact, a great way to lift the general standard is to make the right way of doing something also the easy way of doing it: better tools can counteract short deadlines and occasional lapses of discipline.
>Journalists value the sensational over the factual, and work hard (true) to tight deadlines.
No, people do. There are tons of highly professional journalists who want to do good work and write important, well-crafted, accurate stories. Who were inspired to get into the field by Watergate. Who are constantly begging their bosses to do labor-intensive features.
The economic reality is that there are not enough people who want this enough to fund it, except barely at a handful of institutions like NYTimes. Good work takes time and manpower and it doesn't sell. If management is doing its job (maximize shareholder value) then it is doing everything it can to turn its paper into Buzzfeed.
There are thousands of journalists who left their dying papers because they couldn't stand it anymore. Thousands more who were simply laid off, or took buyouts because they saw that they were going to be laid off if they didn't. My dad is one of these. They'd want nothing more than to work at a real paper again, but there aren't really real papers anymore.
Being sloppy and fast is absolutely about service to the customer. With a daily publication deadline, you can generally take the time to do it right. But the readership (and therefore management) wants stories on the internet as quickly as possible. Of course they are going to be sloppy.
(I agree with all your complaints about TV news, because that's what it's always been. In print... that's what they were forced to become when the money became tight.)
I guess it's relevant because the OP is saying that they wouldn't use it because they think that facts are irrelevant.
Definitely, we'd have to look at the studies in this space, namely whether journalists cite their facts, or write their pieces using citable studies/facts, etc. So I also won't accept the OP's statement of it as fact.
Though, in my, supposedly biased, opinion I'd say journalists are quite adept at twisting facts to suit their points, and also at omitting (i.e. cherry picking) facts that support their views/points.
Of course. I just hope you recognize comments like these do precisely the thing they lament: make unsourced assertions designed to advance a point of view.
"Journalism" is a broad tent. I suspect there's plenty of room for tools like this one.
Hold on, you're using the fact that UK journalists ironically recycle a term that's normally used in a pejorative sense as justification for stereotyping them? What you've said can be effectively applied to any possible occupation in any field.
It's true that coders have lots of great tools for working with textual representations of programs... But IMHO programmers' tools are stuck in a certain kind of local maximum due to the difficulty of moving beyond text. We've done all we can to make textual programs easier to manipulate, but there are fundamental difficulties that can't be solved this way.
I'm personally interested in this question: What if coders had design tools as powerful as those used by architects and construction engineers?
I believe that's the idea behind this: http://www.mathworks.com/products/simulink/. I know a software engineer in the aerospace field that loves this product and describes it the way you do (except that it's more for embedded systems than for user experiences)
This makes me think of http://code.google.com/p/blockly/ .. particularly as I used it to do the Hour of Code with my 5yo this week. I find it actually an alright way to write some JavaScript in the limited context of code.org - it works OK on app.inventor too IMO.
I worked full-time as a journalist for a bunch of different news organizations before leaving to found the forthcoming http://recent.io/.
My suspicion is that neither I nor other journalists I know would use this newsclip.se tool. In a single newsroom I've seen people using Word, TextEdit, Google Docs, Notepad, BBEdit, Gmail, phone-based email clients, phone dictation, and even emacs (me, a few times) to write news articles. While the CMS is newsroom-wide, the writing and editing processes tend to be very personal. Journalists also tend to be individualistic and dislike being forced to use a standardized system without clear benefits.
Where something like newsclip.se might be beneficial would be as a kind of preprocessor/lint for the CMS. It could do a lot of what's being described in the linked article but without replacing the entire journalistic stack.
Making a tool like this has occurred to me, but my idea had a slightly different focus.
Rather than attempt to link evidence to statements, which, if we're being honest, doesn't really bring us closer to the truth since many "sources" are merely other people's words, it was much simpler: identify weasel words, euphemisms, or use of the passive voice. Between these three features, I think that most factual writing could be improved a colossal amount.
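A crude sketch of such a checker, to show how little it takes to get started. The word list and the passive-voice pattern are illustrative assumptions, not a vetted style guide. The pattern only catches a form of "to be" followed by a word ending in -ed or -en, so it misses irregular participles ("were made") and can misfire on phrases like "is seven". A euphemism list could be handled the same way as the weasel list:

```python
import re

# Illustrative list only; a real one would come from the style guide.
WEASEL_WORDS = ("reportedly", "arguably", "some say", "many believe", "it is said")

# Heuristic: a form of "to be" followed by a word ending in -ed or -en.
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+\w+(?:ed|en)\b",
    re.IGNORECASE,
)


def lint(text: str) -> list:
    """Return (line_number, kind, matched_text) for each suspect phrase."""
    issues = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in WEASEL_WORDS:
            if re.search(rf"\b{re.escape(phrase)}\b", lowered):
                issues.append((lineno, "weasel", phrase))
        for match in PASSIVE.finditer(line):
            issues.append((lineno, "passive", match.group(0)))
    return issues


for issue in lint("The budget was approved.\nThe mayor reportedly resigned."):
    print(issue)
```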
I would certainly appreciate an easier way to keep track of claims, citations, evidence, and interplay between a story's moving parts, though. I think that the article mentions a few tools which work toward that goal admirably. Right now, to make a really bulletproof piece, you need to be extremely scrupulous with self-identifying your claims and then providing written explication of evidence or hyperlinked evidence.
Additionally, it'd be really useful to have a tool which kept track of sentence structure and also allowed you to track logical rhetoric by keeping track of "If this, then that" style statements.
This is probably asking too much, though. A final hurdle is that journalists and writers tend to be old-school when it comes to technology, so it might be a hard sell to the older segment of that market.
Some of this (the identification of words, passive voice) exists already, though results vary: https://www.goodreads.com/topic/show/1094904-proofreading-so... (I link to this more because it's a comprehensive list of possible software even though the comparison was done by a vendor.)
Hemingway does some of what you're talking about, like identifying adverbs and use of the passive voice, and rating the reading level needed to understand what has been written: http://www.hemingwayapp.com/
What do you mean by "writing tools as powerful as those used by coders"? In my experience good coders don't need very powerful tools. Super-complex IDEs are usually a disturbance to good coders rather than a help, and even the uber-geeks who use Emacs don't really use all the macros in their jobs.
A lot of good software engineers I know prefer a lightweight editor like good old Vim or SublimeText over an IDE like Visual Studio, Xcode or Eclipse. That's because when you write complex code your brain is slower than your fingers, and the syntax and structure of the language you are using are under your skin; you don't need to think about them.
So, no, great coders don't use powerful editors; if they can choose, they use a very simple one with the minimum level of features they need to help them think faster (syntax highlighting, autocompletion, indentation) and stop. All other features of a complex IDE are a disturbance rather than a help.
I don't know what a good journalist needs, but my guess is that they have the same problem: their brain is slower than their fingers, and the grammar of the language they are using should be well known. I don't know which tools they use, but in my opinion a simple text editor with a minimal spellchecking system (but not correction), well integrated with the publishing process, is more than enough.
From an innovation point of view, I'm usually suspicious when someone claims to have an innovative tool for an established industry. If the industry is established, the actors have already optimized their tools, and if there could be an improvement, it is very small. What usually happens is that something changes in the industry and makes the current tool obsolete (e.g. a new process or a new technology becomes available). But if you want to create something really innovative, you have to find this change. Simply bringing something that exists in one field into a different field rarely works. And if it works, it is because you have a really good understanding of both fields.
While it's definitely true in some cases, what you've stated is a generalization: there are some great coders who use SublimeText or some other lightweight editor. Unfortunately, this does not mean using SublimeText makes you a good coder (or even is a sign of one). There is a lot of buy-in to the concept that using a more lightweight IDE makes you a better coder. I see a lot of entry level developers taking on this mentality.
Sometimes I cringe when I watch a developer using a lightweight IDE, missing so many obvious syntax errors that they're not going to realize until they go to compile or run their code. So many wasted workflow cycles.
I work on large codebases and I know that I don't have the mental capacity to hold the entire structure (object interfaces, app structures, third party library usage) in my head. It's great that I can offload this task to my IDE and concentrate on the problem I'm solving instead of getting disrupted by having to think about code mechanics.
On a side note, I think it's funny that people like to consider vim a lightweight IDE. All the good vim coders I know of make heavy use of plugins to get the exact same features a full fledged IDE has.
As someone also in the "editor > IDE" camp, I agree that a single monolithic tool is maybe the wrong path to take.
But good coders still need powerful tools, just lots of tools, rather than one complex one. Simple and powerful are orthogonal.
I disagree that established industries have already optimized their tools. That's just blatantly untrue in real life, and for good reason. The tool makers always have the best tools. A journalist can't make a better tool for journalism without learning an entire new career (and tool making requires more experience than average as well, beginners don't make good tools).
Agreed on the need for powerful tools. And sometimes you end up spending time writing the right tool for a specific task when it is missing (I've done it more than once).
On your rebuttal about tools for established industries, I just said that it's a suspicious claim, not that it is false. The point is this: if an industry is a niche, usually whoever works in that niche has to build their own tools. In larger industries, like journalism, do you think that professionals are never asked by toolmakers which features they need to make their work faster, or that they don't approach the toolmakers themselves to ask for better tools?
But let's look at the problem from a different perspective. Could you name an innovation that is just the application of a consolidated concept from one area to another consolidated area, without a change in the underlying industry or technology?
There's a lot of literature on this point. If you are interested, you could search for the Utterback S or the concept of paradigm shift in innovation.
> All other features of a complex IDE are a disturbance rather than a help.
The thing about IDEs that sucks, productivity-wise, is that instead of just giving you all the things you do want (syntax highlighting, autocompletion, indentation) they provide bloat that doesn't make a task faster to do, just easier. What I mean by easier is putting GUIs on top of command line tools and APIs. A great coder looks at all that GUI cruft and thinks, man, what the hell is the underlying API or CLI tool I can use instead to get rid of all this middleware crap that is probably full of bugs? Someone who is not comfortable with that instead leans on those GUI tools because they don't really want to understand what is going on under the hood; they need the wizards and the text boxes, drop downs, and such to make it easier on themselves.
With that said, there is a value in having an editor that can actually understand your target language all the way down to the AST level. That's how you can get nice things like reliable refactoring without having to use grep, sed, and friends. Or detecting syntax errors and such, and even though the Vim plugin system is doing great, there can definitely be more innovation in that space (and there clearly has been lately). But let's just leave out all the Spring XML file builders and Tomcat/Jetty container managers and focus on the code.
>"So, no, great coders don't use powerful editors; if they can choose, they use a very simple one with the minimum level of features they need to help them think faster (syntax highlighting, autocompletion, indentation) and stop. All other features of a complex IDE are a disturbance rather than a help."
Says you, and your anecdotal evidence/experience. Sounds to me that you have more prejudice against "complex" IDEs than you do actual facts. Would countering your points by telling you of the countless bad "coders" I've worked with that use text-editors instead of IDEs convince you to change your mind? (I'm going to stop using your term 'coders' here) Or the absolutely great programmers that I've seen that use powerful IDEs to supplement and augment their abilities?
Do you write in a compiled language that can provide rich intellisense, or something more like Ruby/Python/bash that is a tad lacking in that department?
I use coder because that's the term used in the question. I prefer software engineer usually.
I'm not a coder, or software engineer, for what it matters but I'm doing it now and I understand how it works and I use all sort of language, from C to Python. I have no prejudice against complex editor. In my opinion everyone should be able to choose without problems the editor/IDE that prefers and that's not always true. My anecdotal evidence is that, when free to choose, the better you are at a language the more you prefer an editor.
Nevertheless, I could accept your point of view, but you should explain me something, if there are different point of view you should accept that the advantage of this so called "powerful tool" is less than clear.
Unrelated to the actual point, but I'm curious. You seem to be loath to use the word "coder". Why is that? How is it different from programmer? Is there some sort of cultural shibboleth I'm unaware of here?
I don't think it's a broader cultural thing, or anything. And personally, I'm fine with the term. It's just there was something about the way the OP was using the word "coder" that didn't quite sit well with me. Reading back at his post, I can't quite put a finger on what it was that actually bothered me about it.
I think there may be a case for using something like GitHub to allow editors to edit and writers to pull the edits, though I'm pretty sure most media outlets have systems that do almost the same thing.
Git and GDB (or your preferred equivalents) are pretty powerful tools relative to what most journalists routinely work with, and pretty much every good coder uses those.
That actually brings up a side topic that I've been trying to follow -- writing text so that it works better with version control. You put each thought or phrase on its own line, and let the markup system compile it into its final form.
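A rough sketch of that reflow step in Python, for anyone who wants to try it. It assumes a naive rule that a sentence ends at ".", "!" or "?" followed by whitespace, so abbreviations like "Dr." will be split wrongly; the function name is made up for illustration.

```python
import re

# Naive sentence boundary: ., ! or ? followed by whitespace.
# Assumption for illustration only; abbreviations such as "Dr." get split incorrectly.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(paragraph: str) -> str:
    """Reflow a paragraph so each sentence sits on its own line."""
    sentences = _SENTENCE_END.split(paragraph.strip())
    return "\n".join(s for s in sentences if s)


if __name__ == "__main__":
    draft = "The council met on Wednesday. It elected a new president! Will the vote stand?"
    print(one_sentence_per_line(draft))
```

With the text stored this way, an edit to one sentence shows up as a one-line change in an ordinary line-based diff.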
My distinct impression is that the limiting factor in quality journalism is gum-shoe reporting, not incredibly powerful CMS.
Buzzfeed has the most advanced CMS, but their reporting pales in comparison to the NYT's. When they do bother to do proper reporting (McKay Coppins, etc.) they get excellent stories - the rest of the time they use filler, because it's cheap to produce.
I actually think technology can improve the quality of journalism. Not necessarily by making great stories better (though the NYT is definitely proving that's possible with some of their interactives) but by making reporting in general cheaper. If a great CMS can make it possible to produce all the economically required filler in half a day, that leaves time to do actual investigative reporting later.
Buzzfeed is definitely pushing this a little bit, though I do think they have a bit too much baggage to get far enough with it.
(PS. If anyone wants to help tackle this problem, we're hiring @ Cafe: http://cafe.com/careers)
I'm sure technology can improve things - for example, the NYT is doing a much better job now of marketing its (excellent) cooking section.
My impression though, is that what makes great journalism is reporting. Almost all the time, when I'm reading a great article, I do so using readability to strip away all the crud and just end up with the text in a decent size and no pictures/hyperlinks/movies.
> My impression though, is that what makes great journalism is reporting.
Yup. At the end of the day, I think in-depth textual reporting does rule. But what tech can do is make that great reporting slightly cheaper (and thus more plentiful/viable).
This is absolutely true. I spend hours laying out print pages. A good CMS could cut that time down to nothing. It's a ton of work up front, but it saves you so much time later.
Edit: I also need a job soon and am applying to write for Cafe!
This is a great idea. So much so that I have "Why didn't I think of that?" syndrome.
Here are a couple of suggestions that I could use if you're looking for feature requests. Most of these things exist in one place or another, but having them integrated into a one-tool workflow would be awesome.
1. Some kind of crowdsourced reputation system for sources (i.e. medical journal sites have high reputation, naturalnews.com has low)
2. Auto cross-referencing between articles based on content.
3. TODO list management
4. License-aware relevant image suggester (please!!!) This alone would be a killer feature for me. Pick out topic words and search selected image sites, then give me thumbnails to choose from.
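For suggestion 2, a minimal sketch of what content-based cross-referencing could look like, assuming plain word-count vectors and cosine similarity. The stop-word list, the function names, and the title-to-body archive layout are all invented for illustration; a real tool would likely use something stronger, such as TF-IDF or named-entity matching.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

# Tiny stop-word list, assumed for illustration; a real tool would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "is", "for", "at"}


def word_counts(text: str) -> Counter:
    """Lower-case the text and count the words that are not stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def related(draft: str, archive: Dict[str, str], top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank archived articles (title -> body) by similarity to the draft."""
    draft_counts = word_counts(draft)
    scored = [(title, cosine(draft_counts, word_counts(body))) for title, body in archive.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Calling `related(draft_text, archive)` returns the closest archived titles with their scores, which an editor UI could show as suggested links.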
Yes please! Anybody who blogs needs such a thing anyway!
I wrote this in another comment, but a tool like this already exists: http://www.stevenberlinjohnson.com/movabletype/archives/0002... and has for about a decade. I drop all my blog posts in Devonthink Pro, and I'm often surprised by the connections that emerge.
I've always wondered why writers don't use things like revision control and decent diff tools. I'm not sure the existing tools are well suited to them (yet).
"I've always wondered why writers don't use things like revision control and decent diff tools. I'm not sure the existing tools are well suited to them (yet)."
Certain writers do. It just depends on the domain. For book writers, there are tools like Scrivener. For screenwriters, there are tools like Final Draft. These tools help tremendously with auto-formatting and version control. They also keep track of things like characters and settings, presumably by assigning them to certain classes and IDs "under the hood." (I've never looked under the hood, but I assume they are XML-based.) They are fantastic tools for the writers they're made for. They're also fairly WYSIWYG, and do not require writers to know anything about markup languages. That's pretty key, because many writers are not technically proficient. (Modest technical proficiency, at the very least with HTML and some CSS, is probably going to become a requirement in the future...but I digress.)
For journalism, however, Word is the default, and obviously it has its strengths and its weaknesses. Wonky changelogs and substandard version control are two of the biggest weaknesses. Some editors at some publications are moving over to Google Docs for the superior versioning and collaboration capabilities. But Google's editor doesn't offer the robustness and feature set of Word, and it's unfamiliar, so a lot of editors and writers are reluctant to adopt it. Personally speaking, I much prefer Google to Word when it comes to working with my editors. I'll gladly take in-doc comments and version control over emailing Word docs back and forth, any day of the week.
Microsoft Word has had revision control for at least 15 years, probably even longer. It's not like the document creation world doesn't have exposure to these ideas.
Microsoft Word has a shedload of features that journalists can't be bothered to learn and/or use, and that even the author appears not to know about, judging by his reference to "replacing simple text processors like Microsoft Word".
Some stuff (revision control) is at least reasonably good, though it depends somewhat on what format you are saving text into. Also, since most diff tools work at the line level, they aren't as useful as they would be if they split on, say, periods, exclamation points, and question marks, since those are the real dividing points in any form of writing (journalism and prose, at least).
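To make that concrete, here is a small sketch of a sentence-level diff using Python's standard difflib. The sentence splitter is a naive assumption (split after ".", "!" or "?" followed by whitespace), and the function names are invented for illustration.

```python
import difflib
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on ., ! or ? followed by whitespace (a naive rule, assumed for illustration)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_diff(old: str, new: str) -> List[str]:
    """Return unified-diff lines where each 'line' is one sentence."""
    return list(difflib.unified_diff(
        split_sentences(old), split_sentences(new),
        fromfile="draft", tofile="edit", lineterm="",
    ))


if __name__ == "__main__":
    old = "The mayor resigned Tuesday. No reason was given. A vote is planned."
    new = "The mayor resigned Tuesday. She cited health reasons. A vote is planned."
    print("\n".join(sentence_diff(old, new)))
```

Only the changed sentence is flagged, even though the whole paragraph lives on a single line in the original text.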
As with programmers, I think it's a matter of education and time pressures. If a cowboy coder hasn't been exposed to a strong version control culture, and works mostly alone, they won't have as strong a need for VC (or even know about it). When you're always under deadline pressures, you're focused on getting the product out above all else.
Basically, they don't know there's a better way. They're used to hacking through Word's track changes, and they always have a deadline to meet. That doesn't leave much time for improving your toolset.
Also, many writers have very idiosyncratic workflows. From what I gather their education and workplaces focus more on the craft of writing well, regardless of whether it's with a pencil or a keyboard. Whereas it's hard to contribute to a codebase without being exposed to version control.
Hmm. I think it is akin to the success of Dropbox. Tools like Rsync and binary diffs already existed, but they weren't friendly enough for a lot of developers, dads, non-tech coworkers, etc to use until Drew Houston wrapped them up nicely in Python.
(I vaguely recall him mentioning that on his app even)
This is a very cool prototype, and I especially like the idea of calling it a "journalism IDE" (as opposed to a CMS).
At work (http://cafe.com), I've actually been working on something similar with our CMS (Monsoon). We're trying to use technology to make telling cohesive online narratives a lot easier.
Interestingly, one of the biggest hurdles so far has been in decomposing stories. Media traditionally treats each story as a big blob of text (in most cases, HTML), but we're trying to change that so that each story is actually just an arrangement of smaller tidbits (we call them droplets). Switching to that model helps us to encode a lot more semantic information, and also to reflow stories effectively for context.
We're not yet to the point where we integrate/suggest droplets from other stories automatically, but that's definitely the goal. Maybe we could integrate something like Newsclip.se to encourage that.
So, I think it's a question of who you're marketing to. I'd never call this an IDE to the journalists themselves, but calling it a "journalism IDE" is a great way to recruit developers/technologists—which is actually a big challenge (in my experience, most great developers don't want to touch anything close to content, probably due to latent association with the abomination that is WordPress).
This somewhat plays into Rebecca Parsons' talk [0] at hack.summit() last week. She emphasized that we've done a great job building tools for other technical folks, but we've really dropped the ball on supporting non-technical fields. Her perspective was specifically on the potential of DSLs, but the call to developers to be more relevant is likewise compelling.
It's funny, I was thinking about that while writing my dissertation.
After a year of coding on Visual Studio, during writing I would say something like "As mentioned in previous chapters, the decision was based on..." and then just click on "previous chapters" and press F12, expecting to be taken to the original reference.
Unfortunately, I found out Word does not offer such functionality just yet.
There's a standard called hNews, created by Jonathan Malek (Associated Press), Stuart Myles (Associated Press), Martin Moore (Media Standards Trust), and Todd B. Martin (Associated Press), which addresses this:
hNews is a microformat for news content. hNews extends hAtom, introducing a number of fields that more completely describe a journalistic work. hNews also introduces another data format, rel-principles, a format that describes the journalistic principles upheld by the journalist or news organization that has published the news item. hNews will be one of several open standards.
hNews is an XHTML microformat -- the "tags" are class names added to standard HTML elements.
Key among them are: hnews, hentry, source-org, dateline, geo, item-license, principles.
Other microformats, including hCard, can be used to identify people, companies, and organizations, similar to vCard properties. The elements will be familiar to those who've worked with address, Active Directory, or LDAP data: fn, n, nickname, org, email, tel, adr, and more.
An IDE that tied into these (or, possibly, other standards) could be useful.
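As a rough idea of what tying into these could involve, here is a sketch that pulls hNews-style fields out of marked-up HTML with Python's standard html.parser. The class names come from the field list above; the extractor class, the void-tag handling, and the sample markup in the comment are assumptions for illustration, not part of the hNews spec.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Class names taken from the hNews field list quoted above.
HNEWS_FIELDS = {"source-org", "dateline", "geo", "item-license", "principles"}
# Elements with no closing tag; skipped so the open-element stack stays balanced.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class HNewsExtractor(HTMLParser):
    """Collect the text inside elements whose class attribute names an hNews field."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: Dict[str, str] = {}
        self._stack: List[Optional[str]] = []  # field name (or None) per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append(next((c for c in classes if c in HNEWS_FIELDS), None))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Credit the text to the innermost enclosing field element, if any.
        for field in reversed(self._stack):
            if field:
                self.fields[field] = self.fields.get(field, "") + data
                break


# Hypothetical usage:
#   parser = HNewsExtractor()
#   parser.feed('<div class="hnews hentry"><span class="source-org">Example Wire</span></div>')
#   parser.fields  ->  {"source-org": "Example Wire"}
```

This assumes reasonably well-formed markup; a production tool would want a proper microformats parser rather than a hand-rolled one.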
You'd likely need some sort of natural-language processing in the copyediting or publishing process to apply this uniformly. Field reporters may work on a wide range of equipment and software (including pen-and-paper or simply voiced-in reports). And expecting reporters to incorporate tags into their copy is likely a stretch.
The developer in me likes this very much. I mean, taxonomies! Unfortunately, many non-journalists tend to romanticize and over-complicate the craft. Journalists care mostly about the storytelling, and it's not clear to me how this translates into a better story. I believe this to be a (beautiful and well-intentioned) example of overengineering a domain model that, at its most basic, involves just titles, deks and content such as text, images and video. To me it does not clearly push journalism toward a more profitable and sustainable future, and is thus a distraction from more challenging problems at hand.
This is vaguely reminiscent of some of the professional tools for scriptwriters and novelists, which do in fact start to resemble IDEs: you can shuffle around characters, outlines, plot fragments ...
"Yet once you know the formula, the seams begin to show. Movies all start to seem the same, and many scenes start to feel forced and arbitrary, like screenplay Mad Libs. Why does Kirk get dressed down for irresponsibility by Admiral Pike early in Star Trek Into Darkness? Because someone had to deliver the theme to the main character. Why does Gina Carano’s sidekick character defect to the villain’s team for no reason whatsoever almost exactly three-quarters of the way through Fast & Furious 6? Because it’s the all-is-lost moment, so everything needs to be in shambles for the heroes. Why does Gerard Butler’s character in Olympus Has Fallen suddenly call his wife after a climactic failed White House assault three-quarters of the way through? Because the second act always ends with a quiet moment of reflection—the dark night of the soul."
Let's be completely honest: there was very little good television on until 1999. The only thing that's changed is the volume of TV being produced (which naturally favors cheap reality shows). But more and more great things are being produced. Films are cheaper to make than ever before.
Most shows and film will be mediocre. But you don't have to watch the dregs - you only have to watch the best of the best! And by the measure of good things being produced, media has never been better.
> But you don't have to watch the dregs - you only have to watch the best of the best
This is my tactic. I have a television, but it's not actually connected to an aerial or a cable box, just my Apple TV. I only watch television shows when it's clear they're excellent and exactly to my tastes, and it's brilliant :)
We had some GitHub religious adherents who made a lot of noise about this. The net in our case ended up being "You can eliminate complex stuff like Word and just use markdown & GitHub!" (I almost spit out my coffee laughing)
I don't think that an IDE model makes sense because writing in human language is more complex and nuanced than programming languages, which are more limited in scope. At the end of the day, IDEs are shims to match human desires to the language the machine expects.
I'm not a journalist but I do a lot of writing, and I actually use a program called Devonthink Pro in the manner described here by Steven Berlin Johnson: http://www.stevenberlinjohnson.com/2005/02/devonthink_cont.h... . It's surprisingly close to the tools Lindenberg is describing.
As a former journalist, I think that http://hypothes.is will become another very important tool for reporters... It's an annotation layer for the Web.
Reading this, I feel like it would totally be possible to use CRM software like Salesforce to manage investigating a story. Leads and the like; it's almost like the workflow is the same.
* Your weekend newspaper would come out on Wednesday.
* There would be new editions many times daily... not with new stories, just corrections of the first edition, which was blatantly inaccurate and partly written in Moldovan.
* Every day, the newspaper would be in a different format which didn't fit the newspaper rack you just bought.
* Every week, the newspaper would get bigger, but contain no more content (just a new font). You would regularly be forced to buy a new newspaper rack.
* Also, once a week the paper boy would break into your house and steal your old papers. He would offer to sell them back to you in the new format, for a higher price.
* Also, the newspaper rack sellers would not let you store newspapers of which they disapproved.
* Rather than telling you about the world, the paper would track your behaviour and tell the world about you.
* Once a week, the front page would be 404 NEWS NOT FOUND.
* Reporters would be paid high six-figure salaries, but would be unable to relate or talk to anyone but other reporters.
* Many journalists would consider themselves brilliant, world-changing geniuses, with plans not just to report on government, but to replace it.
* At the same time, they would have secret deals with those governments to report people who read "subversive" news.
Believe it or not, this topic has come up on Hacker News before. To paraphrase my comment from then (https://news.ycombinator.com/item?id=7324897): while Moldovans speak a language that's very similar to Romanian, in a 2004 census 60% self-identified their primary language as 'Moldovan', as compared to 16.5% who say they speak 'Romanian'.
As linguists say, a language is a dialect with an army and a navy. Whether 'Moldovan' is its own language or simply a regional dialect of 'Romanian' is entirely a question of politics.
So, to reintegrate this datum into the ancestor post...
* The newspaper would advertise job openings for reporters with "5 years of experience writing in Moldovan" as a requirement. Reporters that list experience in "Romanian" are arbitrarily culled from consideration. Despite the requirement, 100% of the newspaper is written in English.
* The newspaper had a correspondent in Moldova about 3 years ago, who also wrote in English, and that section of the new job description was blindly copied from the advertisement for his replacement, a position which was actually cancelled after 4 weeks, for "business reasons". The newspaper loudly complains about a shortage of qualified reporters. They don't mention that the candidate that they interviewed rejected their offer for its ridiculously low base pay.
* The reporter job is eventually outsourced to Moldova. Readers wonder why the local police blotter has so many Andreis, Tanyas, and Nicolais in it, and begin to worry about the Russian troops quartered east of the river.
Sounds like Italy Italian and Switzerland Italian - I can go to Switzerland and understand what they say, but some turns of phrase are weird (imagine calling a sale "Action!", or saying "I mailed myself to Geneve" when you mean taking the bus).
> Believe it or not, this topic has come up on Hacker News before. To paraphrase my comment from then (https://news.ycombinator.com/item?id=7324897): while Moldovans speak a language that's very similar to Romanian, in a 2004 census 60% self-identified their primary language as 'Moldovan', as compared to 16.5% who say they speak 'Romanian'.
You've been refuted in the very comment thread you've linked to, lol.
Romania has existed as 4 smaller states [0] for a very long time, inhabited mainly by a people descendant from the native Dacians[1,2] and the colonising Romans. Moldova, the geographical region, was one of these states.
Romania's steps towards unification:
* in 1600 Michael the Brave manages to unify all the Romanian principalities, albeit very briefly
* in 1859 Wallachia and Moldova unify by electing the same leader
* Transylvania is added to the mix after WWI
* because of our initial German allegiance in WWII, we lose half of Moldova (now the Republic of Moldova) to Russia
The point of my short, not that accurate history lesson is that Romanians and Moldavians are the same friggin' people, who have had the same language for many, many centuries. Even if the language in the Republic of Moldova had diverged significantly since WWII (which it certainly has not), it would still be merely a dialect as opposed to a different language. Romanian and Moldavian are less different today than e.g. German and its Austrian dialect are (hence why Moldavian is called a subdialect [3]).
> As linguists say, a language is a dialect with an army and a navy. Whether 'Moldovan' is its own language or simply a regional dialect of 'Romanian' is entirely a question of politics.
Don't believe most elections/referendums from Moldova. There's an acute Russian influence in the Republic of Moldavia, manifested as both systemic brainwashing of older/poorer people, and control over politics and economy, and that's corroborated by the systemic electoral fraud (also present in Romania today [4,5] – sigh).
(Moreover, rumour has it that Russia has armed forces conveniently stationed not too far from the border with Moldova.)