Hacker News
Yanukovych leaks (yanukovychleaks.org)
321 points by mxfh on Feb 27, 2014 | 87 comments


One of the participants here.

It's been a crazy few days. We have documents drying in the former president's sauna, prosecutors waiting for each file to be scanned before confiscating it, and an incredible group of journalists working night and day to save as many documents as possible.

A few articles with more background: http://gijn.org/2014/02/25/yanukovychleaks-org-how-ukraine-j... http://thelede.blogs.nytimes.com/2014/02/25/ukrainian-journa... http://gijn.org/2014/02/27/yanukovychleaks-update-the-projec...


Corrupt governments' greatest nightmare is transparency.

It's amazing that these documents were discarded so carelessly. That alone probably says a lot about the competency of the Yanukovych government at large.

Good luck and Godspeed! Here's hoping that some of the information found here can be used as evidence and will speed the transition to a better government. Stay safe out there.


> It's amazing that these documents were discarded so carelessly.

If there are criminals (and there are), at least let's hope they are incompetent and stupid enough that they can be found and caught easily. Or that those chasing them are fast enough for the criminals to make mistakes on their way out.

The former East German government in its final days tried to shred documents, and it did. It left mountains of single strip shredded documents. Well, it turns out it is possible to painstakingly reconstruct those. There is software that can help as well.

http://www.spiegel.de/international/germany/puzzling-togethe...

They should have burnt them, then mixed the ash with water and dumped it in a landfill. But, well, I'm glad they were not competent enough, or that those enacting the change were coming fast enough on their heels.


name one government that is not corrupt

edit: funny, i'd like to know what the downvote is about. it's not a snarky response, i'm being fully serious here. i don't know a single one. that's not how governments operate


In general I'm with you on this one, but maybe the Nordic states? Or maybe extremely small states a la Liechtenstein, where the complexity and amounts of money involved in governance are so low that corruption hardly pays off at all.


You should look at DocumentCloud and see if it would be helpful for hosting and managing the documents you're finding.

http://www.documentcloud.org/home

https://github.com/documentcloud

It might be ideal for your needs. The project uses Tesseract's OCR to read documents and OpenCalais to pull data from them, including key people, places, and dates.


The project initially used DocumentCloud, but I guess that's built for scenarios which are a little less horrible than single-page JPEGs coming in via FTP.

Tesseract has been working beautifully, we've got the first 1500 documents sent through it. Unfortunately, we don't know the language of each document (trying to get up some crowdsourcing for that), so each gets sent through three times (eng, rus, ukr). Finally, all versions are indexed in ElasticSearch. If anyone has a neater way of doing this (e.g. a good, Python-based language detector), please shout!
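
A minimal sketch of one possible detector, not what the project actually runs. It assumes only the three languages named above and that it is fed the text from one of the Cyrillic OCR passes. It relies on the fact that Ukrainian uses і, ї, є and ґ, which Russian lacks, while Russian uses ы, э, ъ and ё, which Ukrainian lacks. The function name guess_language is made up for illustration.

```python
# Letters found in one alphabet but not the other (written as escapes
# so they cannot be confused with look-alike Latin letters).
UKRAINIAN_ONLY = set("\u0456\u0457\u0454\u0491")  # i, yi, ye, ghe with upturn
RUSSIAN_ONLY = set("\u044b\u044d\u044a\u0451")    # yeru, e, hard sign, yo


def guess_language(text):
    """Return 'eng', 'rus', 'ukr', or None if the text has no letters.

    Rough heuristic: mostly Latin letters means English; otherwise the
    distinctive Cyrillic letters decide. Ties (including no distinctive
    letters at all) fall back to 'rus'.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return None
    # Count letters in the basic Cyrillic Unicode block.
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04ff")
    if cyrillic < len(letters) / 2:
        return "eng"
    ukr = sum(1 for ch in letters if ch in UKRAINIAN_ONLY)
    rus = sum(1 for ch in letters if ch in RUSSIAN_ONLY)
    return "ukr" if ukr > rus else "rus"
```

Short or badly recognised pages will fool it, so it is probably best used to pick which OCR version to rank first rather than to discard the other two.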


I doubt Tesseract will do well with handwritten Ukrainian... and I am positive OpenCalais won't detect anything in that language.


I wish you guys all the best. :)

I've been arguing with a friend of mine the past few days who's convinced all of this is just some kind of CIA operation to get back at Putin for reasons. Or something. Also that people should never be able to protest about their "democratically-elected government".


The very idea that the CIA would somehow be involved in a violent and coordinated coup for the purposes of advancing its geopolitical interests.... Somebody get your friend a tinfoil hat amiright or amiright!

http://en.wikipedia.org/wiki/Covert_United_States_foreign_re...


Just like a model of the solar system of epicycles-within-epicycles that accurately predicts the location of Jupiter a few times must be the explanation for all further observations of Jupiter, right?


For the ones who have a hard time remembering how to spell it:

http://c.gg/yl


I took the admittedly stupid approach of just clicking the link, while at work, with my office door open. As it turns out, it's on the up-and-up, all good, works as expected.


No danger here :) Just making things easier to use.


Expert troll?


What's the site powered by? Did you just throw it together or is it a CMS?


Django, MySQL, S3+CloudFront, Tesseract and lots of duct tape...


It looks like the photos are upside-down, when you go to a specific page and click to zoom in.

Didn't know where else to report it.


The detail page has a "Rotate" button, down next to "Next" and "Previous".


Glory to Ukraine! And greetings from Moscow ;)


hey there - i'm a reporter w/ @mashable, wondering if you might be free to do a quick Q&A? sorry to barge in on the thread but this is super interesting to me personally. - @moneyries / moneyries at gmail.


Please let them do their important real work instead of your pseudo-journalism about social media.


Holy cow Sprint, that is a really vicious reply, and totally unwarranted.


Slightly warranted. It seems like there is some time-pressure (prosecutors confiscating files after scanning) that means any time taken away from archiving all this could lead to some things being hidden from public view if the prosecution suddenly decides to confiscate all the files.

It wasn't phrased particularly nicely, but it's got some reasoning behind it I think?


The BBC reported on this last night - saying something along the lines of there being a 72 hour recovery window (not sure if that means from being thrown into the lake or from being surfaced).

Quite a good report actually, they showed the drying and scanning. Apparently some documents were burned before being thrown into the lake also. http://www.bbc.co.uk/news/world-europe-26361455


It is harsh but I stand by it. Take a look at mashable.com and maybe you will understand my aggression.


Mashable reports a wide variety of stuff these days, like this current top story:

http://mashable.com/2014/02/27/russia-protect-ukraine-yanuko...


Of course! Anything that brings visitors who you can serve ads to is good content. Re-wording AP releases or stories from news websites does not take much time.


It's okay. News websites are easy to knock. Truth is we have spent months following this story and have a reporter on the ground in Kyiv. This is a sampling: http://mashable.com/category/ukraine-protests/. More here: https://www.google.com/search?q=ukraine+site%3Amashable.com&....


The article on Al Jazeera's imprisoned reporters was a really good read. I guess I should thank you for sending me there.


You mean the re-worded http://www.aljazeera.com/news/middleeast/2014/02/al-jazeera-... ? You are welcome for me sending you to Al Jazeera.


Welcome to HN, where the comments are bullshit and politeness doesn't matter.


So long as the journalist adds information on how to volunteer and help, press can be useful.


There's an email address listed on their profile, might be worth trying that.


Where's your public key?


Mashable writer with a public key? I think you are confusing them with a real news organization, not the site whose headlines currently include: "11 Funky Computer Mice That Really Click" and "25 Pieces of Food Jewelry Cute Enough to Eat"



Top Story:

The Justin Biebers of Minecraft Are Rewriting the Marketing Playbook

Just...


I hope you are deliberately lying about where this is still going on, otherwise why are you jeopardizing this potentially important operation by bragging about where you're conducting it? You are only inviting trouble from the thugs.

Would the downvoter care to come out of their dark corner and explain why I am wrong?


I've seen it on the BBC news. The place is swarming with activists and reporters. It's definitely not being done in secret.


I was unaware of the BBC (or any) major media coverage of this. It still worries me that this is going the route of "everybody look at this" to keep it secure. It does take a certain amount of trust among participants to go the route of "ok guys, we are going to handle these in a secret place until the job's done and we can distribute all at once," but the chosen route paints a very large bullseye on the operation.

I would imagine that it would not be hard for the police to justify a raid to reclaim classified documents, but maybe someone more familiar with the situation than myself could explain why this is less likely than my paranoid mind tells me?


The same activists who effectively just won the revolution are occupying the former president's residence, and have it open to the public. Journalists are everywhere. Said activists have the support of the military and the police and society at large. There's just no danger whatsoever of a raid, they can afford to be open. The thugs already lost.


That's very good news. Thank you for breaking it down for me!


> Said activists have the support of the military and the police and society at large.

Except in other parts of the country, Crimea for example, pro-Russian protesters are taking over government buildings and staging their own protests...

There's even talk of secession in Crimea...

The pro-EU activists may be a majority in some parts of Ukraine, but not in Ukraine at large...


Not the downvoter, and I don't think it's worth a downvote, but I don't think the comment said anything not already mentioned in (at least) the NY Times article.


One other consideration: by posting their comment, they've leaked their ip address, which is a proxy for their location.


I'm not ignorant of the needs and concerns of self-promotion in order to build a popular campaign...but I hope they have a technical advisor who will, at some point, inform them about the technique of OCR and how a large hashtag-watermark can obstruct such a technique.

Also, minor detail, but the images should also be rotated to their proper orientation. Crowdsourcing data collection has to be as frictionless as possible, and this is an easy fix.

Depending on how many actual documents there are (i.e. how many pages are in those 200 folders), it might be worth it to go the route of ProPublica's "Free the Files" project, in which they built a mini-app that let people voluntarily transcribe the important fields in each document:

https://www.propublica.org/series/free-the-files

Their Al Shaw wrote a piece about designing for efficient crowd-sourcing:

http://www.propublica.org/nerds/item/casino-driven-design

They even open-sourced the Rails plugin for it:

http://www.propublica.org/nerds/item/transcribable-free-the-...


> the images should also be rotated to their proper orientation

there's a rotate button when you click through to the detail page. We're tracking where images get rotated to, and setting the orientation according to that. It's still a bit buggy, but we're getting there
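
A rough sketch of how rotation clicks could be turned into a stored orientation, assuming each visitor's final angle is recorded in degrees. The function name settled_rotation and the min_votes threshold are invented for illustration; this is not the site's actual code.

```python
from collections import Counter


def settled_rotation(clicks, min_votes=3):
    """Pick the most common final rotation (in degrees) reported by visitors.

    `clicks` is a list of angles, one per visitor. Angles are folded into
    0..359 so that 540 and 180 count as the same vote. Returns None until
    at least `min_votes` visitors agree on one angle.
    """
    normalized = [angle % 360 for angle in clicks]
    if not normalized:
        return None
    angle, votes = Counter(normalized).most_common(1)[0]
    return angle if votes >= min_votes else None
```

Requiring a few matching votes before changing the default keeps one stray click from flipping a page that was already the right way up.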

> the technique of OCR and how a large hashtag-watermark can obstruct such a technique

We're running OCR over non-watermarked versions. We're hoping to have a search function up later today

Thanks for the links -- we'll look at them, and see what we can use


> there's a rotate button when you click through to the detail page.

When rotating some of the images (for example Img 999) it seems to cut the edges of the image off and clicking to zoom doesn't help.
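
Purely a guess at the cause: if a landscape scan is turned 90 degrees but still drawn into a box of the original width and height, the ends get clipped. A small sketch of the size the box would need after a quarter turn (the function name is hypothetical; if the rotation were done server-side with Pillow, passing expand=True to Image.rotate handles this case):

```python
def canvas_after_rotation(width, height, degrees):
    """Return (width, height) of the box needed after rotating by `degrees`.

    Only multiples of 90 are handled: an odd number of quarter turns
    swaps width and height, an even number leaves them unchanged.
    """
    if degrees % 90 != 0:
        raise ValueError("only multiples of 90 degrees are handled here")
    quarter_turns = (degrees // 90) % 4
    return (height, width) if quarter_turns % 2 else (width, height)
```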


This is one of those Walls coming down moments for me.

I have banged on about how governments and states are losing their privacy as much as individuals - and whilst this is happening after a political crisis, it is incredible to see a few determined hackers with some simple off-the-shelf equipment throwing open the doors to a hidden state. Truly eye-opening, and something we should look for in our nice safe democracies.

PS Plus I know a few Ukrainians and hope that this can get resolved with no further bloodshed.


They could crowdsource finding interesting stuff - e.g., the Guardian had an app where members of the public were shown random expense reports from politicians, and could flag unusual/odd expenses.


That's definitely the plan. We've been working flat out on this for the past few days, and the immediate priority is getting the documents preserved. [many of them were waterlogged, and have to be separated and dried in the presidential sauna]

But crowdsourcing is certainly going to happen


What if some of the documents contain private information of non-involved individuals? Any steps being taken to protect their details?


Good question -- this is definitely an issue we need to wrestle with.

For now, the vast bulk of the documents going up are business papers involving Yanukovich's companies and the management of his estates. I think those are fair game. But we'll get into murkier areas as time goes on.

For now, if you see something that you think shouldn't be public, drop an email to yanukovychleaks@gmail.com and we'll look into it.


Isn't that a bit too late at that point in time?

Once something has been on the internet...


If nothing else, this is a reminder that any "private" information you reveal to the state may or may not remain private.

Hence the importance of "The right of the people to be secure in their ... papers ..., against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized." That includes bureaucratic compulsion under penalty. Maybe this will persuade Ukrainians to demand real privacy protections (as in: the government can't have private info).


... you did not just quote the US Constitution at a Ukrainian.


Works pretty well for us when followed. Just making a suggestion, since constitutional changes seem in the works for Ukraine.


Could you get some pics of drying the docs in the sauna? That would be amazing!


Here's a collage I found with a sauna picture included. https://twitter.com/gijn/status/439047949042319360

Edit: Whoops! Found a bigger sauna picture right after this posted! https://twitter.com/gijn/status/439022212105134080


here are some in a speedboat: https://twitter.com/kgorchinskaya/statuses/43735984982826188...

There's at least one photo from the sauna somewhere online, but I can't remember where


I think this is awesome, but I worry about it triggering a Russian response if there is something in there which ties back to the Putin administration.

What I really like is the way in which Ukraine is going about this, unlike the Mideast (Egypt comes to mind), where it seems the only thing the people who took over cared about was exacting retribution on the former government, rather than having "Make a stable and just government" as their first priority, and "Investigate and punish any crimes that may have occurred" as their second priority.


The first thing this new "government" did before doing anything else was repeal a law that allows regional rights to minority languages (Russian included) and demanded that the International Criminal Court charge the democratically elected Yanukovich with crimes against humanity.

They are only now trying to name an interim government. As for how stable this government will be, that remains to be seen, but if the US was willing to invest some resources in supporting the revolution and several of the candidates being named this week, as the assistant secretary of state Victoria Nuland seems to suggest in this intimate conversation (http://www.bbc.com/news/world-europe-26079957), perhaps they'll be willing to bail out Ukraine's economy?


This should really be served with https.


Why? Any snooper still sees a DNS lookup for this domain followed by a connection to it. This is not such a dynamic site - it's fairly obvious what you will be accessing.


A site like this can have interesting corners not easily found. The knowledge that you know about such corners could itself be used against you. That's a good reason to serve it up as https.
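
To make the distinction concrete, a simplified sketch of what a passive observer can read from a request with and without https. It ignores traffic analysis such as response sizes, and the function name visible_to_snooper is invented for illustration.

```python
from urllib.parse import urlsplit


def visible_to_snooper(url):
    """Split a URL into the parts a passive observer can and cannot read.

    Simplification: the hostname leaks either way (DNS lookup, connection),
    while the path and query string are only readable over plain http.
    """
    parts = urlsplit(url)
    path = parts.path + ("?" + parts.query if parts.query else "")
    if parts.scheme == "https":
        return {"visible": [parts.hostname], "hidden": [path]}
    return {"visible": [parts.hostname, path], "hidden": []}
```

So over https the snooper learns that you visited the site, but not which corner of it you went to.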


That's a good point. You've convinced me :)


I'll take the images to the priest of the Ukrainian Catholic parish I attend...


This is in Russian (or Ukrainian)... are the HN readers supposed to read Cyrillic fluently?


I can fluently read Cyrillic. I learned it as a child in a few minutes for fun. It's fluently reading any actual language written in Cyrillic that gets me :P


It is English, as well. Look at the right-hand side of the screen.


I'm pretty sure he meant the actual documents and not the header text.


It is Ukrainian.


> assuming all of HN only knows english


You know what, I'm French ... but when I come on HN I expect to do my reading in English unless otherwise specified in the title of the thread.


You expect to read freshly dried and scanned Ukrainian government documents in English?


I expect not to have content that is unreadable by probably 90% of HN readers ...


Given the common coverage of esoteric programming languages, I have no idea how you have arrived at that expectation.


This doesn't belong here at Hacker News.


Would you like to instead:

  Talk about how to pivot yanukovychleaks.org to a billion dollar startup idea.

  Critique their choice of web framework.

  Talk about how to optimize bandwidth usage.

  Critique other design choices, usage/non-usage of CSS

  Discuss SEO strategies?

  Other?


I wish there were more JavaScript posts. Did you know you can do functional programming with it?! /s


He's just mad they are not using Erlang


http://ycombinator.com/newsguidelines.html

"Please don't submit comments complaining that a submission is inappropriate for the site. If you think something is spam or offtopic, flag it by going to its page and clicking on the "flag" link."


Strongly disagree. This is the epitome of the hacker mentality. Just because it's not being applied toward building a company doesn't mean it's not worthy of discussion.


Why not?

Edit: Think hard about your answer because to me it seems like this is a beautiful example of the power of technology outside of the norm here in the states.


Overthrowing the government and breaking down walls that prevent transparency is surely an epic "hack" in the best sense of the word.

Note to USA: You could learn from this.


Mind explaining why?



