We believe that users own their data. We’re working on a system that will extract all user data from the old Digg infrastructure. In August we’re launching an archive website for users of the old Digg to find, browse, and share a history of their submissions, diggs, and comments.
If you’d like to be notified when your data will be available, just enter your email address here[1]. Then stay tuned, and let us know if you have any questions.
Meh, I was on Digg from 2005 to 2007. I didn't delete my account because I thought someday I might want to look through my old stuff.
I went to take a look sometime around 2010, and they had nothing of mine. No comments, no submissions, nothing. Probably there somewhere, but nowhere accessible to me.
I wouldn't hold my breath on getting anything back.
Not to mention, an archive website still doesn't help anyone following links that have been placed on the internet over the last decade.
The spammers and the clowns paid to spam Digg for SEO lost all their work. That's quite possibly the best part of the new Digg, especially if they can be kept out.
Yeah, I find the "loss" of the useless former data to be a positive, not a negative. If their site was socially "corrupted" by astroturfers and paid Diggers, starting over is the best way to create a new culture.
From a TechCrunch article published yesterday, following an interview with John Borthwick:
"According to Borthwick, it would have cost “hundreds of thousands per month” to keep the site running on its old platform. Even though the site was state-of-the-art just a few years ago, most of the infrastructure would be considered legacy technology by a modern startup."
That's why. SF Weekly, and the author Keith Plocek, obviously did not reach out to Betaworks before running this article.
Would it not have been possible to "cement" those pages into static HTML and throw them up on a CDN? The only thing "digg" about Digg now is the domain name. Why they bought that and then threw out the baby with the bathwater is beyond me.
For what it's worth something like this is still probably possible. I'd be shocked if any sensible person would get access to Digg's dataset only to delete it immediately thereafter. While they've replaced the site with something else to get the ball rolling, I don't get the impression that they're crass and/or ignorant enough to just destroy the whole thing, so we may yet see an archive of the old Digg.
Also, I think the old Digg demonstrated that it didn't really care much about the data anyway.
Considering all the popular reposts on Reddit, they could use the database of already submitted Digg content to populate the daily stream of links when not much else is happening.
Take one guy and have him, at the very least, backport user accounts into the new system. There's absolutely no reason why you cannot write an ETL job once you have your new database structure ready to go. Sure, it's more work, but we're also talking about an Internet property that at one point was a goldmine.
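To make the ETL idea concrete, here is a minimal sketch in Python using sqlite3. Everything about the schemas is an assumption for illustration: the `legacy_users` and `users` tables, their column names, and the `migrate_users` helper are hypothetical and say nothing about how Digg's old or new databases are actually laid out.

```python
import sqlite3


def migrate_users(old_db: sqlite3.Connection, new_db: sqlite3.Connection) -> int:
    """Copy user accounts from a legacy schema into a new one.

    Returns the number of rows passed to the load step.
    Table and column names are hypothetical.
    """
    # Extract: read every account from the (assumed) legacy table.
    rows = old_db.execute(
        "SELECT user_name, email_addr, joined FROM legacy_users"
    ).fetchall()

    # Transform: trim names, lowercase emails, skip rows with no usable email.
    cleaned = [
        ((name or "").strip(), email.strip().lower(), joined)
        for name, email, joined in rows
        if email and email.strip()
    ]

    # Load: INSERT OR IGNORE keeps the job safe to re-run after a partial failure.
    with new_db:  # commits on success, rolls back on error
        new_db.executemany(
            "INSERT OR IGNORE INTO users (username, email, created_at) "
            "VALUES (?, ?, ?)",
            cleaned,
        )
    return len(cleaned)
```

The real work would be in the transform step (mapping old IDs, password hashes, and so on), but the overall shape of the job stays about this small.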
Oh no, I always begin my morning with digged articles from 2007. Actually, no I don't. So, who cares, apart from SEO guys?
And I love how they say:
What would've been the harm in leaving those archives up? It's not that complicated.
Of course it's not complicated, it's actually fun! What can be better than starting a new project with importing old and unwanted records from a screwed up database?
Mainly, it's always bad form to delete users' data. But also, this stuff is useful to historians. For example, here's a site that just posts interesting things about the Geocities archive (which Yahoo deleted but some archivists mostly mirrored first). http://contemporary-home-computing.org/1tb/archives/2559
IIRC we used nofollow on stories in upcoming. Once they were promoted to popular, the nofollow was removed. This was done mainly to prevent people from trying to gain juice by submitting spam stories.
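A rough sketch of that rule in Python, for anyone unfamiliar with how nofollow gets toggled. The `Story` class, its `promoted` flag, and `render_story_link` are made up for illustration and are not Digg's actual code; the only thing taken from the comment above is the behavior (nofollow while upcoming, removed once popular).

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Story:
    title: str
    url: str
    promoted: bool = False  # hypothetical flag: True once the story reaches "popular"


def render_story_link(story: Story) -> str:
    """Render a story's anchor tag, with rel="nofollow" until it is promoted."""
    # Upcoming stories carry nofollow so spam submissions gain no link value.
    rel = "" if story.promoted else ' rel="nofollow"'
    return f'<a href="{escape(story.url, quote=True)}"{rel}>{escape(story.title)}</a>'
```

With this shape, promotion is the only thing that changes the markup, so a spam submission that never gets promoted never produces a followed link.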
I feel like Digg should be charging to "turn this off" for blogs, or even on a per article basis. I don't mind sponsored stories making it into the mix, but this is another form of revenue that could be explored with more sites that have a metric ass-ton of traffic.
Links with rel="nofollow" pass no link "juice". However, Digg.com itself is a very prominent and heavily trafficked website, so any links that are not capped with a nofollow have extraordinary value. (A very basic judge of "link quality" from a particular domain is its "PageRank" value, which Digg.com scores at an 8 of 10.)
Digg has no SEO value. The content they have from the past is rarely relevant, so they have no evergreen content, unlike sites like Quora or even Yahoo Answers, where content written in the past still has value right now.
I realised how much sense this makes for Betaworks: they have so much social data from bitly and knowledge in the space. They can make a real consumer end to bitly as well as the enterprise tools they're building.
They can make a semi-curated and semi-data driven broadsheet on any topic. It's a beautiful interface for a V1 and I think if they use all of the data from bitly it could be powerful. They can get a grasp of how many people are actually clicking on links, how many social shares and upvotes there are.
Essentially a much better version of Flipboard that is data driven.
There's nothing wrong with making a version of Digg that embraces social elements but Digg 4 was not it. It was trying to turn Digg into a Twitter clone.
Forgive me as I did not follow the original story very closely, but what exactly did Betaworks buy? Are they even the owners of the old data, or did they just buy the brand and code, whereas perhaps LinkedIn now owns the actual data?
Obviously lots of people in the comments here are far from the point made in this article.
Digg v1 not redirecting and keeping old content is removing all of its presence on Google. What percentage of Digg's traffic was coming from Google? I would say more than 50%.
They still have a few days until the next Google dance, but they're basically screwed now.
IIRC about 40% of Digg's traffic came from Google. But when Google changed its algo ~2 years ago to favor original/source content, our overall traffic dropped by about 30%. The value of that traffic in advertising dollars was significant, but in terms of participation in the community and growth of the site, it was almost nothing. The new site isn't ad based, so I don't think anyone really cares.
They seem to be implying that Digg was too big to fail, from an SEO standpoint.
"But it only takes a couple clicks to realize all of the archives are gone. All of them. All of the Ron Paul idolatry. All of the ASCII facepalms. All of the linkbait. Gone."
That should be the problem, but obviously that content won't be missed.
You cannot come up for air without pissing someone off. Digg's carcass was festering on the internet. I'm glad to see it being taken in another direction.
What about my data from the old Digg?
[1]http://digg.com/archive