Uh, that's not a response to Aaron's article. That's Jason acting naive and oblivious to dodge the accusations, and taking on this whole 'aww shucks, can you help me out a bit?' attitude to shift the discussion and get on Aaron's good side. If you had a case against what you were accused of doing, you would've published a well thought-out article, much like some of the articles you've written that have been on subjects you're clearly confident and well-versed in. But you have nothing this time.
Also, stating that such pages only amount to less than 1% of your revenue is in no way a justification of the content theft you're committing. You're cowardly sidestepping the issue. What have you got to say about the actual content theft, rebranding of such content as Mahalo's, and knocking down the original authors of that content by skipping on the credits and outranking them?
"We can't tell whether Jason is misleading us about the proportion of scrape-generated pages on Mahalo without access to any Mahalo page statistics."
Well, when doing a site search in Google for
site:mahalo.com "Links Powered by Google"
there are 553,000 pages indexed in Google which are using scraped search result content (with optimized page titles) to help pull in traffic.
http://www.google.com/search?q=site%3Amahalo.com+Links++Powe...
and keep in mind that is just links from Google...there are also chunks of content from Google blog search, Twitter, and other sources (images, videos, news) on those pages
he is full of ____ if he is trying to get anyone to buy that doing the above is responsible for less than 1% of their traffic when Compete.com shows their search referral traffic as being ~ 60% of their referrals
It is not just a few (thousand) 100% auto-generated (experiment/stub/zebra/spam) pages that have scraped content on them...the above search shows Google estimates over a half million pages in their index contain content from their own search index...total regurgitation of 3rd party content :D
And let's not forget that 1.) he is using people's optimized page titles as content on his pages 2.) search traffic monetizes better via ads than other traffic forms...especially the search traffic that lands on a page for some random longtail keyword made up by arbitrarily combining chunks of 3rd party content mixed together and re-aggregated. 3.) in addition, there is a $0 editorial cost to scraping these millions and millions of content snippets and re-displaying them. 4.) he is making at least 5 figures a day from that content scraping...with 100% certainty.
his 1% remark is just another form of misinformation. nothing new there!
Mahalo is monetized through AdSense (well, some affiliate links too). Plus, the fact that the result is usually incomplete means people are more likely to hit an ad to find a better answer.
So it's most likely a big cash cow for Google.
Which do you think it's better for them to send a user to? Mahalo powered by Adsense or __________ powered by Tribal Fusion?
Oh and just to clarify I'm not saying Google is doing anything evil. They don't adjust their algos to help push Adsense sites higher. But when an adsense site reaches the #1 position, they do appreciate the extra revenue.
Yes, that they don't notice. People are very quick to say companies make decisions based upon money, but they don't pay attention to the fact that the money is usually far smaller than even the short term loss incurred by not doing the right thing.
Our pages are built by our community, so the quality will vary. That page doesn't have a "vertical manager" yet, but it will. Then we would build out the content a little more.
It's not a perfect system, people can't put multiple search boxes in there.
However, that page will never rank well in a search engine (unless by a fluke). In order to rank well you really need to have 500+ original words.
We're in the process of moving all pages to that standard. It's really a self-regulating thing: if our contributors make short pages they never rank and never make money. They get frustrated and we teach them how to make longer pages and some day they may rank.
... it's really not a problem, and the truth is we rank for three things well:
1. video game walkthroughs (typically 2-10,000 words!)
2. how to articles (typically 800 to 5,000 words)
3. question & answer pages (typically 300 to 10,000 words),
Isn't this basic SEO (and I'm no expert)? Build original content and you might rank. Build short pages, you don't rank.
All pages start short (just like wikipedia stubs do), and over time we make them longer. that's the normal process.
I think vaksel's comment "Mahalo is monetized through Adsense..." at least partly explains it. Other than that, is anyone lodging policy-violation complaints when they find dodgy pages?
Semi-unrelated to the actual content of this particular post:
(I apologize about that, by the way)
Am I the only one who simply can't stand Jason Calacanis?
He's often propped up as some sort of guru/authority/etc. of start-ups and the Web in general, and I just don't see it at all. I've never read any words of his and felt like a smarter or more knowledgeable person afterward; I only ever see rather mundane platitudes.
Perhaps I just haven't read the right pieces of his? If anyone believes this might be the case, please consider responding to this post with a link or two. I'm seriously very baffled by his image (and to be honest, I don't think highly of Mahalo as a concept, for many various reasons I won't detail here unless someone is interested in them).
Congratulations! You've hit upon the tragic core of this industry: Namely, that it's full of semimature people who are only well-known because they're well-known. This is such a new field of study that even the biggest figures are usually only halfgood at what they do.
That's cruel. Let me rephrase: This isn't a new field. It's a series of new fields, all developing in tandem. Web design, web coding, web marketing, are all necessarily only a couple of decades old. The tools we're using are younger than that. Two years ago we were still debating how to align a webpage vertically. We've only begun to have real web typography in the same period of time. And programming is similarly new, and we still don't have good web marketing. But those of us who're invested in the web don't like to think about that, so we convince ourselves that we're as advanced and sophisticated as anything else. To do that, we lower the bar.
Celebrities in any field tend to balance skill and marketability. Rarely is somebody both ultrafamous and cutting-edge. So in a world where nobody is really that famous and nobody is really that talented, our famous icons are doubly pathetic. Even our best icons, like Zeldman and Fried, are bright but not really operating on a new plane. (I think there are some exceptions, like Dean Allen, but Dean isn't the size of Calacanis.)
That's why the brightest talents in our sphere come from other mediums. I think the best designers I know are all twee and twenty and annoying as fuck. For all their immaturity, that's the first generation that could grow up immersed in all the aspects of the business. But admitting their superiority is admitting just how fucking clueless we all are, and that wouldn't be as fun for the community.
Calacanis is a particular jerk, though. I never got why we like him here. Mahalo's been a joke from launch day.
(Sorry if there were spelling errors. I wrote this on an iPod.)
> Am I the only one who simply can't stand Jason Calacanis?
One of the rules of this site is that you ought to write things that you would feel comfortable saying to someone's face. And often it's very practical because all sorts of people turn up to read stuff here. Including Jason, it appears.
That said, I get a bad vibe too: why is everything he is associated with seemingly "Jason's this" or "Jason's that". Generally, a startup is known first and foremost as the company, not as "Alexis Ohanian's Reddit" or "Sergey Brin's Google", but you always find that guy's name in the same phrase as Mahalo. I can't quite place my finger on why this bugs me, but perhaps it's something to do with building something up as a free standing entity versus lending your famous name to something.
Jason's recent fits of sanctimony over other companies' behavior contrast poorly with his posturing when people criticize him in turn. His dismissal of any accusations as "silly", old news, or naive mistakes rings hollow and insincere when, in the next breath, he boasts about his long experience and great team of smart people.
I think it's this hypocrisy that's caused sentiment to turn ugly more than the (depressingly common) behavior of his company.
>One of the rules of this site is that you ought to write things that you would feel comfortable saying to someone's face. And often it's very practical because all sorts of people turn up to read stuff here. Including Jason, it appears.
You know, despite being a total noob here, I still thought I had that one checked off on my list of things to watch for...until of course Jason himself showed up on the board (assuming it's really him, of course). I really did not see that coming. But what can I do now other than apologize and hope he wasn't offended? Sorry, Jason. :) No real defamation intended, just wanted to open a discussion on the validity of the opinions you propagate.
I think it's still quite possible to say negative things to/about people, but it really helps to imagine saying them directly to the person in question.
Right, of course. I didn't mean to imply that I thought it a mistake to be critical, just regretted and apologized for the manner in which I criticized.
I can't speak about Mahalo, but I have to say that I've become a big Jason Calacanis fan because of the podcast: http://thisweekinstartups.com/
He can be annoying at times, but I love the way he tells it like it is. Every episode, entrepreneurs call in and he helps us by dishing out no BS advice. Good advice. The kind you need to hear when you are too close to the problem to understand what to do. He doesn't patronize (well maybe just a little), but instead, he seems genuinely interested in helping other entrepreneurs.
Yes, his self promotion can get tiring, but hey, the guy is a marketing genius that we can all learn from. Check out the podcast.
From what I've seen/read, it seems like his strongest skill is finding the right people and letting them do their job and excel at it.
After all without people building the content/handling things, Weblogs inc wouldn't have been such a hot property that sold to AOL for however much money it ended up selling for.
AOL was mainly buying into the trend of blogging / lower cost content production models when they bought Weblogs, Inc. They didn't buy a huge cash generating company.
Jason was bragging about having about a million a year in revenues (revenues, not profits) with some absurdly large number of writers (like 30). And part of that cash flow was selling text links that flow PageRank to online casinos...over 100 of them carpet-bombed his network of blogs.
http://www.threadwatch.org/node/6312
And (ironically) back then he was also selling links to scraper sites
http://www.threadwatch.org/node/5295
(maybe that is where the Mahalo idea came from)?
gosh Aaron... you are attacking me for mistakes two companies ago! funny.
Actually, we didn't know we were selling text ads to people selling page rank back in 2005... just like Tim O'Reilly didn't. We both turned those ads off when we found out.
so, we actually did the RIGHT THING and you're still trying to paint me as some evil spammer.
to this day Joystiq, Hackaday, Engadget, TechCrunch50, This Week in Tech, This Week in Startups, Open Angel Forum, Blippy, Gowalla, GDGT and dozens of other brands/projects I've been involved with/invested in are doing great work.
You're making me into this black hat who is trying to cheat... it's not how I operate. I try to make and support great products and improve them consistently.
A lot of folks I know like the brands I'm involved with and I'm very proud of those brands.
Have I made mistakes over the last 15 years of building brands? of course... but i'm always honest and engage folks who point them out. in fact, i thank them.... like i've done here.
your attacking me for what you think are my weakest points just gets my team focused on fixing them--and for that I thank you!
Honestly, thank you! if there are things we can do better we want to do them. We're in it for the long-haul. We want to make a great brand and we don't want to do it by cutting corners.
I know you were hurt when I said "seo is bullshit" a million years ago, but back then that is what i thought. I didn't know anything about SEO back then and I really don't know much about it now. i was wrong when i said it was bullshit and I have said that it was an off the cuff, uninformed statement many times.
"you are attacking me for mistakes two companies ago! funny."
Well I am still in the same company I was back when you called my industry scum (while you were selling links to shady casinos). As far as I am concerned your opportunistic jumps between companies are all part of the same general trend / strategy. You still have not changed as a person.
"i was wrong when i said it was bullshit and I have said that it was an off the cuff, uninformed statement many times."
Yes but the difference is you stand on a podium at conferences and yell to the press when you slag it off.
When you say it has value you do it on a twitter post or a quick blog post that does not even hit your homepage. You are not quite as loud in this case.
I would say that to make it right you would need to (at a minimum)...
- Direct link out to the sites you are scraping content from. FIX THE ISSUE.
- Make an in-depth post on your blog highlighting how important SEO is to all web based businesses. Describe how you gamed social media sites and worked media and nepotistic angles to build links for Mahalo. Also describe the bait and switch public relations and attack bait angles you used to build links. Better yet...make this an in-depth case study!
- Make sure a copy of that post lands in the inboxes of your media email list and your Jason email list.
- Issue a press release announcing the above blog post.
If you do all of those then there is a pretty strong chance I will think your apology is sincere. Anything less and I will realize that this is once more another round of posturing.
> Well I am still in the same company I was back when you called my industry scum (while you were selling links to shady casinos). As far as I am concerned your opportunistic jumps between companies are all part of the same general trend / strategy. You still have not changed as a person.
I'll probably get voted down for this, but I think you're crossing a line here. I agree with some of your opinions, but these kinds of rants against a person in particular, and some kind of personal dislike you have for them, really hurt your case. It's just petty bullshit and it reveals personal bias that really clouds the issue.
OK, I will get started this week on these issues Aaron. I really have no ill-will toward you or SEOs.
I made a simple, uninformed comment five years ago during a Q&A session when someone asked me about SEO. Back then when we were doing Engadget and Joystiq and Hackaday we didn't do ANY SEO and our opinion of SEO was it was a waste of time.
I've personally learned from great SEOs like you and Michael Grey how to do white hat stuff, and what the best practices are.
Our goal at Mahalo is to produce great content, and yes, rank for that content when we have a GREAT page in search engines.
We don't want to rank for lower quality pages or pages that are being built out. Your anger towards me is misplaced I think... i was joking saying SEO is bullshit and it's not my fault that everyone takes my one-liners and turns them into gospel.
I will make a concerted effort to rehabilitate the damage I've done with my flippant comments. Honestly, I will. Even willing to do a joint press release with you and have you on This Week in Startups to discuss...
... provided you keep giving me all this great free advice! :-p
You are correct, my best skill is finding amazing people and supporting the heck out of them.
However, I would give myself a modest amount of credit for finding verticals a little earlier than average (silicon alley, blogs, human-powered search, etc), as well as marketing, branding and product design.
I never asked to be an expert... I just state my opinion and folks can respond to it. Sometimes i'm right, many times I'm wrong.
I don't claim to be smarter than anyone... I just go to work every day and try to do a little better each day.
> I never asked to be an expert... I just state my opinion and folks can respond to it. Sometimes i'm right, many times I'm wrong.
I'm not your biggest fan, but can everybody piling on here read this and take it to heart? The best thing about the Internet is that you can say anything whenever you want and learn as you go. Just because Jason's more famous than you are doesn't mean he expects his words to be taken as canon, so if he apologizes for saying something and backtracks it's probably less slimy than it is genuine people-not-being-perfectly-consistent-always.
I notice people are piling on the downvotes right now, which is a shitty way to handle a debate. Even if you disagree with people, vote based on whether they're contributing to the conversation. When the guy who made Mahalo comes in, he's contributing something rare and unique and if you pound it into the dirt you're making us all look like immature jerks.
of course he is acting polite. he is still stealing people's content.
I am just calling a spade a spade.
I have never met the guy in person...I just think he should hold himself to a higher standard. If he is going to trash an entire industry, he should educate himself on best practices within it rather than claiming ignorance while building a deceptive business model.
Sounds like there's some bad blood between Jason and Aaron. Maybe I don't know enough about SEO to get the sting of Aaron's arguments, but on the surface Jason has been cordial and Aaron has been vicious. That's not helping your credibility Aaron.
That's one of the problems - most people don't know enough about SEO to know what exactly is going on. They place blind trust in those who are featured in the mainstream media.
While Jason goes out in public and says one thing, based on the examples in Aaron's blog post he does just the opposite in practice.
Sites like Mahalo that auto generate content and scrape content are increasingly becoming a problem online as they clutter the search results and hurt the publishers they scrape and steal from. Google is doing worse than turning a blind eye to this as they actually are encouraging these very large "content providers" to create content for filling ad space (see the Wired article on Demand Media).
With SEO being an industry that has a negative perception by many, those involved in the industry feel the need to defend their profession.
It goes beyond SEO being an industry. Many "SEOs" who are actually good at what they do spend a lot less time on SEO as a profession and spend 100x more time on building businesses. Auto-generated content sites like Mahalo, eHow & other Demand Media junk are hurting their legitimate businesses with debatably unethical business practices.
Let's use a different industry to illustrate what is happening. Let's say a band named The Beatles records a new album. The local radio station gets a copy of their album and plays their song. The listeners love it so they play it more often, but they don't mention who the band is, and on their website they put up a link to download the song... but without any credits. Their audience grows. They get advertisers to advertise to their audience. They say, "hey, playing good songs gets us more listeners and more listeners get us more advertisers, which gets us more $$. Let's do this more often." So they go do this 500,000 times, each time never mentioning who the artist is. They grow and prosper while the artists starve. Oh, in the meantime they call the artists scum.
In the above metaphor, the artists are the bloggers whose content Mahalo is using. The radio station ripping off the artist is Mahalo. The Federal Communications Commission is like Google, who is allowing all this to continue because the radio station is giving them a cut from the advertising revenue.
Hope this helps make it a little more clear why what they are doing is wrong, needed to get exposed and needs to get fixed.
This is like Tiger Woods claiming he desired privacy for his family deeds (after he had gone outside his own family). if you want the public relations when you are hyping your own company and/or trashing other people's livelihoods then you accept the greater responsibility.
you can't just choose to take all the benefit and fall back on that "I am no expert" crap when you get caught in a lie.
"if he apologizes for saying something and backtracks"
That is the big problem. The apology is insincere.
He is not doing any real backtracking or shifting of strategy...just repeating a fake apology WITHOUT addressing the issues that were brought up.
MOST CRUCIALLY putting nofollow on the links to the sites he is "borrowing" content from.
I'm not going to argue with you, because I haven't been following this, because I could give a rat's ass about Mahalo. I think what I said about downvoting still stands. You don't have to upvote him, but when he's down at -4 a few minutes after he comments, it looks less like we're having a good debate and more like a bunch of us are kneejerk attacking him. It doesn't give Jason motivation to keep talking to us when the site visually displays how little we care about his words.
keep in mind that this is like the second or third thread I have ever participated in here...I click the left arrow up to upvote a few times, but I don't even know how to downvote yet...haven't done a single one yet.
How can you call Mahalo a search engine when the majority of the traffic you get comes from other search engines? How is Mahalo any different than About.com?
Jason's sweeping "it's less than 1%" comments are getting tired.
We can't tell whether Jason is misleading us about the proportion of scrape-generated pages on Mahalo without access to any Mahalo page statistics.
I'm not for or against Jason on this matter, I'm just saying that we have no data on which to base any conclusions. It's possible he's telling the truth.
It would be interesting for someone to take up the challenge of creating a small web app that finds all Mahalo URLs, heuristically examines them for spamminess and generates some statistics.
Not that such an app would be of any particular long term use, but it might be interesting nonetheless.
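For what it's worth, the heuristic part of such an app could start out very small. Below is a rough Python sketch, not a real crawler: it assumes you have already fetched the pages somehow (the `pages` dict is hypothetical), and it flags a page when it contains the "Links Powered by Google" phrase from the site: search quoted elsewhere in this thread and has fewer than 500 words in total. The 500 figure is borrowed from Jason's "500+ original words" comment; total word count is only a crude stand-in for *original* words, so treat the output as a rough estimate at best.

```python
import re
from html.parser import HTMLParser

# Assumptions for illustration only: the marker phrase comes from the
# site: search quoted in this thread, the threshold from the "500+
# original words" remark. Neither is an official definition of spam.
SCRAPE_MARKER = "Links Powered by Google"
MIN_WORDS = 500


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_text(html):
    """Return the page text with runs of whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    return " ".join(" ".join(parser.chunks).split())


def looks_scraped(html):
    """Flag pages that carry the marker phrase and are short overall."""
    text = page_text(html)
    word_count = len(re.findall(r"\w+", text))
    return SCRAPE_MARKER in text and word_count < MIN_WORDS


def scraped_fraction(pages):
    """pages: hypothetical dict mapping URL -> already-fetched HTML."""
    if not pages:
        return 0.0
    flagged = sum(1 for html in pages.values() if looks_scraped(html))
    return flagged / len(pages)
```

Run over a reasonably random sample of URLs, `scraped_fraction` would at least give a number to compare against the "less than 1%" claim, with the caveat that it measures pages, not traffic or revenue.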
"we accidentally took that off when we moved to Mahalo 3.0 i think." Jason is the last person who would accidentally make a move like that. It's all planned.
His business would come crashing down if he admits even to an inch of this. I have to respect that he knows how to run his business and respond to these types of concerns that are raised about his ethics. But that still doesn't excuse the facts.
... the truth is having short content pages indexed works against you--that's why i noindexed them to begin with.
Again, I'm not as big of an expert on SEO as Aaron, but I think there is something called "page rank sculpting" in which you push your site's page rank to high-quality pages and no-index the ones that are shorter in terms of original content. We did that because one of our people read about it on a blog.
We're probably holding ourselves back having removed this and we are putting it back on because we didn't realize it was off.
That is why I thanked Aaron.
Is the page rank sculpting thing not a good idea? I thought this was a fairly certain thing: only index the best pages.
A mistake is something you do where you try to make it right (at least if you acknowledge something as a mistake AND MEAN THAT YOU WANT TO MAKE IT RIGHT you should fix it).
Scraping 3rd party content (without permission) and then putting nofollow on the links is bogus.
But you keep avoiding that issue because you realize what you are doing, and your goal is to cash in on others' content for as long as you possibly can.
That's fair. There's no denying Jason is good at what he does and understands the web business better than most. I feel like this is an opportunity for those on HN to learn the way he rolls and manages his business.
Of the pages indexed in Google that are of this type of spam in nature, Mahalo accounts for less than .1% .. I think this is more Google's issue than Mahalo's.
"Of the pages indexed in Google that are of this type of spam in nature, Mahalo accounts for less than .1%"
Huh? I'm not sure which way I stand on this whole mess, but the fact that he makes up a tiny portion of the "spam" doesn't seem relevant? Is a petty thief less bad because he accounts for less than .00001% of theft in the world?
> Remove the nofollows from the attribution links to content you've scraped. Anything else is just plain rude.
I think this is a question we all, other [slightly petty] arguments aside, want to hear an answer to. I doubt we will though - care to prove us wrong Jason?
1) "MORE QUALITY PAGES and more QUALITY links = better SEO from what the top SEOs have told me. "
For this purpose, indexed by Google counts as "quality." Please read the PageRank paper, and you'll quickly see that more pages in the index == more PageRank.
That doesn't always mean putting crap pages into Google is a good idea, but strictly speaking, it will raise the total amount of PageRank your site has. Seriously, read the paper yourself.
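To make that concrete, here is a toy Python sketch of the unnormalized formula from the original paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)). It is only an illustration: the `hub_site` helper and its page names are made up, the graph is a closed site where every page has at least one outlink, and it says nothing about how Google actually ranks pages today.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a link graph.

    links maps each page to the list of pages it links to. This sketch
    assumes every page has at least one outlink and that every link
    target is itself a key of links.
    """
    rank = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_rank = {page: 1.0 - damping for page in links}
        for source, targets in links.items():
            # Each page splits its damped rank evenly across its outlinks.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def hub_site(n_pages):
    """Hypothetical site: a home page linking to n_pages - 1 subpages,
    each of which links back to the home page. Requires n_pages >= 2."""
    subpages = [f"page{i}" for i in range(1, n_pages)]
    links = {"home": subpages}
    for page in subpages:
        links[page] = ["home"]
    return links


small = pagerank(hub_site(3))
large = pagerank(hub_site(6))
print(sum(small.values()), sum(large.values()))
print(small["home"], large["home"])
```

Under these assumptions the ranks in a closed graph sum to the number of pages, so the six-page site holds twice the total of the three-page one, and the home page's share grows with it. Whether that extra PageRank is worth anything when the added pages are thin is a separate question.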
2) "but we don't scrape content. our search results do have abstracts but they are smaller than Google's, and in some cases they actually are google's."
Perhaps scrape is the wrong term. But you absolutely do include other people's content and nofollow it, for example all of the images on this page:
NoFollow is meant for links to content which you don't want to "vouch" for. If you've included their content in your page, I think you can vouch for it.
Maybe an honest oversight, maybe not. Either way, you know about it now, so fix it.
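If anyone wants to check a page for themselves, a quick way is to list which links carry rel="nofollow". Here is a small Python sketch using only the standard library; the `audit_links` name and the sample HTML are made up for illustration, and it only looks at `<a href>` links, not at page-level robots meta tags.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Sorts <a href> links into followed and nofollowed lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html):
    """Return (followed, nofollowed) lists of hrefs found in html."""
    parser = NofollowAudit()
    parser.feed(html)
    return parser.followed, parser.nofollowed


sample = (
    '<a href="http://example.com/source" rel="nofollow">original article</a>'
    '<a href="http://example.com/internal">more on this topic</a>'
)
followed, nofollowed = audit_links(sample)
print("followed:", followed)
print("nofollowed:", nofollowed)
```

If the attribution links to the sources all land in the nofollowed list while internal links do not, that is the pattern being complained about here.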
you are not scraping directly...you are taking 3rd party abstracts of publisher content and using them against the publishers. this is actually even worse than scraping directly because there is no way to opt out of Mahalo "borrowing" a publisher's content without the publisher also blocking Google from indexing it.
keep in mind all of your content is creative commons licensed...so if you keep up your scrape and nofollow attribution game, you are going to be in for some surprises where you feel the effects of what you are doing onto others.
your choice...either fix the nofollow anti-attribution issue yourself or have the market fix it for you.
I think that the overly "I love you man" tone of his response indicates to me that he's thumbing his nose at the situation and just being a bit of an ass.
not the intention... not thumbing my nose at Aaron. he's great at Fisking someone and I'm willing to hear him out and do a better job if there is merit to his arguments.
I'm also willing to jump into this mud pit and get my ass kicked by you guys. :-)
Also, stating that such pages only amount to less than 1% of your revenue is in no way a justification of the content theft you're committing. You're cowardly sidestepping the issue. What have you got to say about the actual content theft, rebranding of such content as Mahalo's, and knocking down the original authors of that content by skipping on the credits and outranking them?
You got caught, dude.