Jason Calacanis’ Backup Plan For Replacing Content: Steal It From Wikipedia (blogsblogsblogs.com)
105 points by scofflaw on March 12, 2010 | 70 comments



Calacanis' fantasy all along was to own Wikipedia but fill it full of ads. He used to rant endlessly about how much money they'd make if they monetized Wikipedia. Mahalo is his attempt to make a seedier low-brow version of Wikipedia that's ad-laden. This move should not be a surprise. That's just my opinion though.


Kinda off topic, but I think he's right about wikipedia serving ads. They have so many page views, they could put up a couple of image or text ads on the side and make it look very tasteful--unlike mahalo and the other ad filled pages of the internet. Expanding on that, maybe they could only serve non-profits, giving them ridiculously low click or view rates. How could Jimmy Wales not be interested in that?


It would release a hornet's nest worth of issues. For example, advertisers might change articles in which their ads appear in order to influence click-throughs. Then contributors will ask for a cut of the money. Then nobody will donate because they have to look at ads anyway.

Then advertisers will start demanding from their sales people that certain content be changed or taken off "or I will remove my ads."

So it should only be considered as a desperation move if their donations plummet.


To really think about it now, it's truly remarkable that such a powerful entity can exist in such a material world, free of spam, ads, etc. I don't think we cherish this great resource as much as we should.


Making money off volunteers tends to make those volunteers disappear. Restricting themselves to charity may help, but it'll be an endless debate that will hurt them a lot.


People still volunteer to help at the Olympics, but I suspect you're right that adding ads would cause people to leave the community


> People still volunteer to help at the Olympics

Really!?


This stuff happens all the time. I paid $200 to "volunteer" to work at the PGA Senior Open


well, I wouldn't call this stuff volunteering - you're probably interested in some particular (sporting) event and willing to pay for a good access with your work (and maybe some money)

wikipedia editors don't get those kinds of privileges relative to any random visitor, I'd argue.


I'd say they could go for pure sponsorships - just a set of logos with a link to each sponsor's homepage, under the heading "Wikipedia Gold Sponsors". There would be very few options for sponsoring just a part of Wikipedia (i.e. it would be sensible to allow this at the language level, not at the keyword level). That would probably still make the foundation a healthy sum of money each year - many large corporations would like to be seen as good-hearted contributors to the world's largest source of free curated information.


Even if they adopted an official policy of allowing criticism, there would still be a stigma to including criticism in articles about the sponsors.


Non-profit with extremely low click rates is an excellent idea.


Wales could put three Google text ads on the front page and finance the entire operation.


Really? My experience with Wikipedia is that topic pages are frequently the first result in Google so I don't think I've seen the homepage in at least a year.


Personally, I never see the front page unless I'm on a new browser install and need to right-click > "Create Keyword For This Search" in the search box. But I imagine people who don't use nav-bar keywords go to the front page all the time to search for something.


We really need to stop putting Calacanis discussions on here.


I'm kind of ambivalent on that point. Part of me suspects that bad things happen when good people just stand by and watch.

I also think each of these is another data point in a Google discussion, more than about Jason.


So, let's badger Matt Cutts, Google's Mr. Anti-Spam, until he gives an official position on this instead of discussing the same stuff over and over again.


I agree with you, more Matt Cutts stories as opposed to Calacanis ones. He doesn't deserve the press, negative or positive.


I agree this should not be about Mahalo.com or Jason Calacanis.

It should be about how do we persuade Matt Cutts through facts and oratory to enforce the rules.


I don't understand the "bad thing" which is happening here. Are we supposed to be so thoroughly invested in the quality of Google search results that it's a personal affront when someone games the system?


The "bad thing" is the inconsistency with Google letting large sites get away with the shady practices while punishing the little guy.


Yes. Google has massively raised the barriers to entry in Internet search, and profited mightily from doing so. It is a big deal if they are gameable, so that sites like Mahalo can force us to pay an attention tax every time we want to find something.


I am, when they game the system without adding value to the end user (and note that's a general claim, not aimed at Mahalo specifically). And if you let one get away with it, others will follow, until the search-engine system breaks down.

Now that could mean a shift as revolutionary as the shift to search engines originally was. Or maybe, because at the moment almost all gaming is targeted at the Google gorilla, it may mean Google loses some relevance and falls back to the pack in terms of market share. Maybe either of those would be good. But why take the risk?


Which got me thinking: vote-up sites show the same view to every user. It would be interesting to see how user preferences could be detected and mixed in (like a personal ranking system), though I'm wary of a self-inflicted echo chamber.


My startup does this.

It uses total upvotes like HN, but mixes in your past preferences for similar material and a bit of Bayesian magic.

The idea is to provide a site that is able to scale better than the usual up/down vote sites.
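
For illustration only, here is a minimal Python sketch of that kind of blend. It is not the actual implementation behind the site described above. The names `UserProfile` and `personal_score`, the per-topic counts, the Beta(1, 1) prior, and the `weight` cap are all assumptions made up for the example. Each topic's preference is the posterior mean of a Beta prior updated with the user's past upvotes and skips, and the cap limits how far personal taste can move a story away from its global score.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical per-user history: topic -> count of upvotes / skips."""
    liked: dict = field(default_factory=dict)
    skipped: dict = field(default_factory=dict)

    def record(self, topic, upvoted):
        # Count one more story on this topic that the user upvoted or skipped.
        counts = self.liked if upvoted else self.skipped
        counts[topic] = counts.get(topic, 0) + 1

    def preference(self, topic, prior_likes=1.0, prior_skips=1.0):
        # Posterior mean of a Beta(prior_likes, prior_skips) prior after
        # observing this user's upvotes and skips on the topic.
        # With no history this is 0.5, i.e. neutral.
        likes = self.liked.get(topic, 0) + prior_likes
        skips = self.skipped.get(topic, 0) + prior_skips
        return likes / (likes + skips)


def personal_score(upvotes, topic, profile, weight=0.5):
    # Scale the global upvote count by a multiplier in
    # [1 - weight, 1 + weight]; a neutral preference (0.5) leaves the
    # global score unchanged, and weight=0 disables personalization.
    pref = profile.preference(topic)
    multiplier = 1.0 + weight * (2.0 * pref - 1.0)
    return upvotes * multiplier
```

With `weight=0` every user sees the plain upvote ranking; raising it trades a shared front page for more personalization, and keeping it below 1 means a popular story is damped rather than hidden outright.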


I'd prefer a "blacklist" field in prefs where you could add strings for stories that you don't want to see. Probably wouldn't help HN's caching strategy though...
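
A rough sketch of what such a filter could look like, assuming it runs over story titles with case-insensitive substring matching. `filter_stories` and its arguments are hypothetical names, not anything HN provides.

```python
def filter_stories(titles, blacklist):
    """Return the titles that contain none of the blacklisted strings."""
    # Normalize the blacklist once; ignore empty or whitespace-only entries.
    blocked = [term.strip().lower() for term in blacklist if term.strip()]
    return [
        title for title in titles
        if not any(term in title.lower() for term in blocked)
    ]
```

Since the result differs per user, a filter like this would have to run after any shared page cache, or on the client.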


There is a Greasemonkey script named HN Toolkit that allows you to accomplish this:

http://userscripts.org/scripts/show/25039


Thanks for the tip! Do you know if it works in Chrome?


Why should I care about this? Because right now, I don't, and I'm about ready to start flagging any article that mentions Jason Calacanis.

If you hate someone and what they do, don't give them reams of free press! Nobody would ever have heard of him or Mahalo if it hasn't been for people whining about him/it on their blog...

Also, the GFDL, one of Wikipedia's licenses, says you can "copy and distribute the Document in any medium, either commercially or noncommercially".


"Nobody would ever have heard of him or Mahalo if it hasn't been for people whining about him"

Except maybe people completing a Google search looking for meaningful information, and finding an MFA page with minimal or no content.


I've found plenty of pages on the Internet with wrong or worthless information. Welcome to the Internet.


Fair enough, but if I do a total reversal of my Canadian Socialist morality and play free market advocate, isn't this a problem that the market for search engines can solve? Why can't Bing show off how their search engine ignores mahalo and other wikipedia scraper sites and thus provides a more useful experience?


It's a fair point - the consequences won't necessarily be negative - http://news.ycombinator.com/item?id=1185598. But is it worth the risk?


I'm beginning to suspect that some or all of the Mahalo/Calacanis stories here are submitted by himself, just to get the names known.


I am imagining Calacanis as a zombie rummaging through the web, looking for new content to replace the old content, while being hunted by Matt Cutts who wears those peril-sensitive sunglasses from The Hitchhiker's Guide.

In some ways this is hilarious.


I think it's pathetic.

And Matt Cutts doesn't seem to be doing much hunting, it wouldn't take more than one 'shot' to take mahalo out of circulation.


Which is why I said that he is wearing those peril-sensitive sunglasses: if I remember correctly, they would prevent you from seeing anything that could disturb you.


No, they're so that you can maintain your cool attitude in the face of extreme danger. (Because you can't see it).


Actually at one point they are described as hiding things that might disturb you too (massive massive Hitchhiker's fan :-))


You may be right, I'd have to check, but I thought that was the 'SEP', the somebody-else's-problem field.


The SEP hides things which don't fit into your worldview, the glasses hide things which might disturb you, which can fit into your worldview or not.


Ah I might be slightly off: the proper quote says (along with your bit) they hide things that might alarm you.

The radio play uses disturb.


He seems to follow the wikipedia licence.


So does every other wikipedia clone site. But that's beside the point: he claims that mahalo is built with their users' content, and clearly that is not the case.


Sure, but using Wikipedia content in a manner permitted by its license is not 'stealing' it.


Fair enough, that's a misleading title. 'Lift' instead of 'Steal' or maybe 'copy' would be better.


"Redistribute under the terms of the GFDL."

Let me guess... Linux distributions are "stealing" Firefox? Oh, there isn't some holy war against that, so it's a good thing. Right.


Do Linux distributions take Firefox and load it with advertisements?


Some do. Ubuntu replaces the default home page with an Ubuntu-specific search box that points to the highest bidder's search engine.

It is perfectly compliant with GPL/GFDL to take the product and fill it with ads. There is nothing wrong with doing that.


They'd certainly be within their rights if they did.


And it is perfectly within everyone else's rights to frown upon it.

The liberal licensing of Free software/content does not mean every use of those freedoms is a good and positive thing. And nobody is being inconsistent by treating it as a bad thing, while still recognizing it's within one's rights to do.


It said right there on the page I looked at that the information was from wikipedia.


As the title of this posting suggests, that's apparently the backup plan because they lack enough original content.

The fact that they attribute it is nice but does not detract from that point.

JC is on the record stating more than once that mahalo.com is built using user supplied content, he seems to have stretched that definition to now include the content of other sites as well.


Actually, no he doesn’t. Just linking back to the Wikipedia page is insufficient. To wit:

> You must include a copy of, or the Uniform Resource Identifier (URI) for, this License with every copy of the Work You Distribute or Publicly Perform. [...] You must keep intact all notices that refer to this License and to the disclaimer of warranties with every copy of the Work You Distribute or Publicly Perform. [...] If You Distribute, or Publicly Perform the Work or any Adaptations or Collections, You must [...] keep intact all copyright notices for the Work and provide, reasonable to the medium or means You are utilizing [...]

Mahalo has a link to the CC-BY license 3.0 (even though the link is oddly titled "CC License 2.0"). It is not acceptable to take content from wikipedia (licensed under CC-BY-SA 3.0), modify it, and release it under a different license with conflicting terms.

This could be cleared up by unambiguously stating on the Mahalo pages that the content was available under the CC-BY-SA license 3.0, and providing a link directly to the terms of the license.


In fact, as a wikipedia user and contributor, if I remember right the change from CC2 to CC3 at wikipedia was done to handle these sorts of issues with credit not being given correctly.

I wonder, do the DMCA takedown rules apply here?


I am disappointed by the degree to which people are piling on here. When you run a user generated content site, it's common to seed it any legal way you can at first. As long as Jason is following the GFDL, I don't see a problem with using Wikipedia content to kickstart his community and get traction.


He's not kickstarting it with Wikipedia. Mahalo is not new.

The two main issues with all of these scraped pages are that a) Jason recently boasted how Wikipedia was, in his words, simply a "free for all" (thus leaving the impression that he feels that his site was therefore better, being moderated), and b) that his pages were all UGC built by "his" users, not bots.


You've made some valid criticisms of Mahalo in the past, but the content and tenor of this article makes it look like this has gotten personal for you and now you're reaching.

> He's not kickstarting it with Wikipedia. Mahalo is not new.

This isn't about age, this is about size and activity of user base. Again, I can empathize with anyone trying to bootstrap a UGC business. You practically have to seed it with content.

> The two main issues with all of these scraped pages are that a) Jason recently boasted how Wikipedia was, in his words, simply a "free for all" (thus leaving the impression that he feels that his site was therefore better, being moderated), and b) that his pages were all UGC built by "his" users, not bots.

While both of these are certainly reasons why you may personally find his use of Wikipedia's content distasteful, neither of them rises to the level of being "stealing," ban-worthy, or even particularly unethical, assuming that he's in compliance with the appropriate license.


I'd be curious to know what the position of Mahalo's investors is in this debacle. Are they aware of what they're investing in?

(for reference, Mahalo's investors include Michael Moritz, Elon Musk, Mark Cuban and Ted Leonsis)

I'd be willing to set up a scraper site this weekend if they want to give me money too.</sarcasm>


as long as it makes money, who cares?


Well, I care.

Not for myself, but you have teams of guys out there building legitimate web applications that could have a meaningful impact on the web community and instead, douchebags with scraper sites are getting invested in.

For all we know, we lost the next YouTube (or whatever app you want to talk about) due to Calacanis' antics (I don't personally hate the guy, I just don't approve of his methods).

---

Lastly, apologies for the tardy reply. I've got maxvisit set to 30 and minaway to 240 mins, so replying won't ever be close to instantaneous.


Who is this guy and why do I care?


for as much as he rips on Jimmy Wales, I'm sure Jimmy would love to make a spectacle of this


If he did, he'd make a fool of himself. Wikipedia explicitly allows this sort of thing, it's one of the founding principles of the site. And I think it's great, IMO it's one of the reasons the site is so successful.

That someone running a shitty recycled content farm is making money off of it doesn't alter a damn thing.

And that's not to say that Calacanis is adding anything of value to the web or anything like that, he's certainly not, as far as I can tell, but I just can't get too worked up when someone actually makes use of the license freedoms that we all fight so strenuously to spread...

WTF re: Google, though, I can't believe for one second that this is not on their radar?


Sometimes, I wish the internet had permabans.


Seriously, this level of attention doesn't seem to be rational anymore.

The guy is trying to build a business that creates jobs, and he's putting some of his own capital on the line to do it. That's the most important point here - he's making jobs.

The guy isn't selling blood diamonds or anything, at worst he's gaming a search engine. Sure, he's using some wikipedia content to run some tests. But, haven't the same people on here also run tests on their businesses - maybe incomplete features, bugs, overselling, trying to keep up. That's all part of building something from nothing.

Give him a break - he's trying to create more jobs, but he's in a position where he doesn't have to bother.


Creating jobs by doing a bad thing to a large number of others is not a net social good. There are plenty of other problems that good software engineers should be working on, rather than spamming up the internet for the benefit of a company that will ultimately funnel a decent percentage of its generated wealth to a professional troll.

If it ever gets to the point where Calacanis gets a big exit, business magazine/blog articles will be written about him and how successful he was and detailing some of what he did to generate that wealth, and some future entrepreneurs will see that, look up to him, and say "Hm, that doesn't sound so hard, I can do that too", and then go off and make their own spam sites for fun and profit.

It's really lose, lose, win (for Calacanis) if we let him get away with that, and as disproportionately heavy internet users, we're the ones that will be hurt the most by search engines becoming more spammy.

No break for Mahalo, please. If Google dropped them and all the other MFA sites, that would be better than 5 Christmases.


He will not get the big exit. Have you reviewed mahalo.com's long-term stats on Quantcast?

Currently he has to revise and update the feature set to create boom-bust cycles of visitor upswings, and yet long term they only amount to a few percentage points in growth.


You're making him sound like a saint. He's not trying to make jobs, he's trying to make money. He'll do it any way necessary. He's a businessman through-and-through. Whether you think that's a bad thing or not is up to you but don't pretend like his intentions are philanthropic.



