I read the paper when it first showed up on HN. [1] The most important thing they did was create a training set with far finer-grained data than anything previously available. Using that training set, their algorithm achieved 85% positive/negative accuracy on sentences, while the previous state-of-the-art algorithms went from 80% to 83% accuracy when adapted to the new data. So although their algorithm appears to be better than everything they tested against, this is fundamentally an incremental improvement, not groundbreaking research. The real win here came from using a better dataset.
It probably linked to [1], but all we can tell is that it was something at python.org. The post was by plessthanpt05, an HN user for about two years with 838 karma. All of their posts from the past year or so are dead (about 60 of them), but they don't appear to be a bot.
Python 3.4.0a1 isn't something that would interest everyone on HN, but it certainly doesn't seem like the kind of thing that should have been killed.
Well, that user also posted over 60 links in the last year, while never receiving a single upvote or comment. I'm not sure why a flesh-and-blood user would still be posting after that.
"Saved links" might be one reason. While I wouldn't advocate using HN as a personal bookmark service it does lessen the burn a bit if nobody else comments/votes on it. The good thing is that if you use saved links for that it makes you think twice about what sort of things you should be submitting.
Their posting frequency dropped significantly over the past year, probably as a result of never receiving any upvotes. Around the time they were (presumably) hellbanned, they were averaging roughly one post per day; it's much less frequent now. They seem to go through a week or two in which they post several links, then forget about HN for a while.
I won't try to defend all of their posts; most of them are things that I don't even find interesting. But it seems that someone basically used banning as a method of cutting down on uninteresting material.
That may even be a good way of maintaining the signal-to-noise ratio on HN: if we ban the users who post a lot of uninteresting links, the community won't have to see them. But it wasn't a tactic I was aware of before today.
According to the statistics site the OP used, 54 million Americans are single. [1] Ignoring the 62 million Americans who are under fifteen, that means only about 21% of Americans aged fifteen and over are available. [2] So if your area's ratio is 9:10 and ~80% of people are taken, you end up with roughly a 7:12 ratio among the singles (ignoring homosexuality).
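To make that last step concrete, here's a quick back-of-the-envelope sketch. It assumes the 9:10 ratio is men to women and that the ~80% who are taken pair off into one-man/one-woman couples:

    # Rough arithmetic behind the 7:12 claim: per 19 people, 9 men and 10 women,
    # with ~80% of everyone paired into one-man/one-woman couples.
    men, women = 9.0, 10.0
    taken = 0.80 * (men + women)      # ~15.2 people in couples
    couples = taken / 2               # ~7.6 couples, each removing one man and one woman
    single_men = men - couples        # ~1.4
    single_women = women - couples    # ~2.4
    print(single_men / single_women)  # ~0.583, i.e. about 7:12 (7/12 = 0.583)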
This sounds eerily familiar. Around a decade ago, a data analytics company called Pharmatrak was actually found guilty of breaking federal wiretapping statutes for doing something very similar. [1] In their case, they had built a network that tracked HTTP GET requests to pharmaceutical companies' websites using a web bug [2] and an attached cookie. But because some of the pharmaceutical companies were using GET as the method on their HTML forms (remember, this was ten years ago), users ended up making GET requests with personally identifying information in the URL-encoded parameters. Since those GET requests were logged by Pharmatrak, and neither party (the users nor the pharmaceutical companies) had consented to giving that personal information away, Pharmatrak was found guilty of wiretapping.
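In case the GET-form problem isn't obvious, here's a minimal sketch of why it leaks data; the URL and field names below are made up for illustration, not taken from the case:

    from urllib.parse import urlencode

    # A form submitted with method="GET" encodes every field into the URL itself,
    # so anything that logs request URLs (the site, or a third-party tracker
    # loaded on the page) ends up storing that data too. Hypothetical fields:
    fields = {"name": "Jane Doe", "email": "jane@example.com", "drug": "examplemab"}
    print("https://pharma.example.com/refill?" + urlencode(fields))
    # https://pharma.example.com/refill?name=Jane+Doe&email=jane%40example.com&drug=examplemab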
Pharmatrak eventually won on appeal though, arguing that they had no intention of collecting personal information, which exonerated them because only intentional eavesdropping is a crime.
The company in the OP's article could make no such argument, though. I suspect the main difference is that they make no assurances of confidentiality to the websites using their software the way Pharmatrak did, which 1) is just really creepy, and 2) sets them up for trouble with users in California, because California's wiretapping statute makes it a crime unless both parties consent. [3]
I don't think so. "Statistically significant" is a relative term, and when testing an entire population is infeasible (as it often is), we instead sample some fraction that we believe is "statistically significant", on the assumption that it will accurately reflect the whole.
The point of this article is that a sample only accurately reflects the whole in some ways. Variability, in particular, scales inversely with the square root of the sample size: De Moivre's equation says the standard deviation of a sample mean is σ/√n, so smaller samples produce much more extreme-looking results. And since misconceptions about variability have been at the heart of many controversies (male vs. female intelligence, school size, cancer risk, etc.), De Moivre's equation is important, even dangerous in the sense that ignorance of it has led to billions of dollars wasted.
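A quick simulation shows the effect; the population numbers here are made up, and the only point is how the spread of sample means shrinks as n grows:

    import random
    import statistics

    # Sketch of De Moivre's equation: the standard deviation of a sample mean
    # is sigma / sqrt(n), so small samples yield much more variable averages.
    random.seed(0)
    mu, sigma = 100.0, 15.0  # hypothetical population mean and standard deviation

    def sd_of_sample_means(n, trials=2000):
        means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
                 for _ in range(trials)]
        return statistics.stdev(means)

    for n in (10, 100, 1000):
        print(n, round(sd_of_sample_means(n), 2), "vs. theory:", round(sigma / n ** 0.5, 2))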
You can also find more stuff that he's done at his website, http://brand.io/