Perspective API – An API that makes it easier to host better conversations (perspectiveapi.com)
114 points by flinner on Feb 23, 2017 | 53 comments



I took the slider to the left end, and it was a lot of climate change denial. I thought "ugh, is this going to opine on left vs right ideology? That seems Orwellian", and dragged the slider to the right end, where, to my surprise, all the comments were insulting/useless.

It's pleasing to know that it doesn't care about your opinion, just about how eloquently it's expressed. Sounds very useful.


"The discovery of the Jewish virus is one of the greatest revolutions that has taken place in the world. The battle in which we are engaged today is of the same sort as the battle waged, during the last century, by Pasteur and Koch. How many diseases have their origin in the Jewish virus! ... We shall regain our health only be eliminating the Jew." – Hitler

17% toxic.


For fun I tried to change the meaning of the quote but leave most of the words and syntax:

"The discovery of the Polio virus is one of the greatest revolutions that has taken place in the world. The battle in which we are engaged today is of the same sort as the battle waged, during the last century, by Jonas Salk. How many diseases have their origin in the Polio virus! ... We shall regain our health only be eliminating Polio."

5% toxicity. So just talking about Jews makes a quote 4 times more toxic. Interestingly, the word "toxic" itself also makes a quote 4 times more toxic.

But personally I don't think Hitler quotes should be rated as toxic. The point isn't to control what people say, but how they say it.
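For anyone who wants to reproduce this kind of side-by-side comparison outside the demo page, here is a minimal Python sketch against Perspective's Comment Analyzer endpoint (v1alpha1 as of this writing); the API key is a placeholder and the response layout is taken from the published docs, so treat it as illustrative rather than definitive:

    # Minimal sketch: score two texts with Perspective's Comment Analyzer.
    # Assumes the v1alpha1 endpoint and a valid API key (placeholder below).
    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder, not a real key
    URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze?key=" + API_KEY)

    def toxicity(text):
        """Return the summary TOXICITY score in [0, 1] for `text`."""
        body = {
            "comment": {"text": text},
            "requestedAttributes": {"TOXICITY": {}},
        }
        resp = requests.post(URL, json=body)
        resp.raise_for_status()
        scores = resp.json()["attributeScores"]
        return scores["TOXICITY"]["summaryScore"]["value"]

    for text in ("We shall regain our health only by eliminating the Jew.",
                 "We shall regain our health only by eliminating Polio."):
        print(f"{toxicity(text):.0%}  {text!r}")

Scoring both variants in one run makes it easy to see how much a single substitution moves the number.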


Depends how you define toxicity - a toxic idea or toxic wording. The Hitler comment is obviously toxic in meaning but is 'eloquently' worded (or at least as eloquently as it could be, given that it is literally Nazism).

I am at least relieved that this doesn't appear to be an idea-police AI, nor a particularly liberally biased classifier.


If I use my full comment above (not just the part in quotation marks) it says 25% toxic.

I think it is mostly a profanity filter.


For anyone who's used Google Translate, it should be entirely unsurprising that the best "AI" we can cook up appears to be doing little more than rating individual words.

I admire the effort, but the tools seem to be so fundamentally lacking.


Give it some time. ML improves with usage and training data.


While this is true, ML isn't a panacea that will solve all problems given time and training data.


The same quote is only 12% "toxic" if you remove the "..." inside, and only 10% if you wrap the entire thing in quotation marks.


"Toxic" seems to be the wrong word for this tool. Another commenter used the word "polished". I think we can agree that the sentence you quoted is "polished"?


Hitler's speeches are anything but polished. In my opinion, his speeches display only a rudimentary grasp of rhetoric. I'd disagree that that sentence is polished in any way beyond basic syntax and grammar.


I'm genuinely curious how many outlets would be enthusiastic about this without the ability to slide the conversation towards their narrative.


Why would they want to skew posters' narratives? If the entire point of adding comments to outlet sites is to increase retention and engagement, then ideological/belief 'conflict' is exactly what they would want.


I would hope that they wouldn't, and hopefully I'm very wrong in suspecting they'd prefer uniform agreement over intelligent discourse. That's why I'm very curious what interest there'd be in adopting such a tool.


It seems to really hate profanity. `I love it` is 1% toxic; `I bloody love it.` is 38%; `I fking love it` is 95%. In many circles, the more profane the phrasing, the more congratulatory it is.


Even when used positively, hyperbolic words like profanity indicate a level of close-minded passion about one side or another of a discussion. The purpose of Perspective seems to be to foster rational discussion between open-minded individuals in a comment string. I'd be hard-pressed to find someone who moved a conversation in a non-toxic direction by typing "I bloody respect your opinion, but here's why assumption X is wrong."


The implicit assumption you are making is that a comment can contain only one thing. Something like the following would be both perfectly acceptable and moving the conversation in a good direction, IMO:

So fucking awesome.

By the way, how do you do X? Whenever I've tried I always run into problem Y. Also, I'm interested in your thoughts on Z.

The initial congratulatory statement used profanity for effect and to convey emotion, but in a positive way. The following statements specifically contribute to and flesh out the conversation. 85% toxic according to them (but helpfully it did allow me to correct them, so maybe they'll get better).


It looks to me more like it encourages non-specificity (questioning, uncertainty, or intellectual language).


"The author of the previous comment has a simian countenance which displays a lineage rich in species diversity" - 2% toxic. So insults are OK as long as they are from Watterson. I approve!


Great tool, by the way. I think it might actually work!

I started off with a highly toxic comment (in the window on the tool, not in "real life") and I tried to be just as insulting while lowering the toxicity level.

It was informative that when I did this, the sentence sounded more educated, more polished, but was just as rude.

If this spreads wide, I suspect it will usher in a new era of veiled insults and implied disfavor, but that will be a vast improvement over what we have today.


I think HN is basically already like this. It has its own set of problems, but I agree that it's an improvement.


You're sure passive aggression is a good idea? Personally, I'd prefer direct insults to someone being passive-aggressive.


To me that raises the question of why I would read the comments in the first place, because the "filter to the left" end contains not one post that would interest me. It also doesn't give me a feel for relevant differences of opinion.

If I could filter, I would love to filter out everybody at the "completely to the left" as well as the "completely to the right" end of the spectrum and just keep the ones in the middle. The ones on the left are insanely boring and conformist, and the ones on the complete right are really just idiots.


Very interesting, though I'd love more details about the signals they're using.

I hold out a slim hope that this discussion doesn't once again devolve into "but how can we even define 'truth'??" This seems more analogous to spam detectors - perfection is absolutely theoretically impossible, but low-error implementations are incredibly useful in practice.


This seems to be a tone detector, not a truth detector.


I remember (from building a sentiment analysis irc bot[1] back in the day that used the afinn wordlist) that sentiment analysis is effective because robots can do mid-70%-accurate classification, but humans only agree around 80% of the time on 'positive/negative' classification.

So I always wonder, if simple models like the afinn wordlist work at close-to-human levels, how much total value is added by the more robust model. Still very cool!

---

[1] https://github.com/mrluc/cku-irc-bots/blob/master/feelio.cof... - included the following line of code, which is a coffeescript crime but also a cute sentence:

    if i_feel_you then @maybe @occasionally, => say i_feel_you
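For context, here is a toy AFINN-style scorer in Python; the word table below is a tiny hand-rolled stand-in for the real AFINN list, which assigns each of a few thousand words an integer valence from -5 to +5:

    # Toy AFINN-style sentiment scorer. The word list is a hand-rolled
    # stand-in for the real AFINN data (integer valences from -5 to +5);
    # a comment's score is just the sum over its words.
    AFINN = {"love": 3, "great": 3, "useless": -2, "idiot": -3, "awesome": 4}

    def sentiment(text):
        """Sum per-word valences; words not in the list contribute 0."""
        return sum(AFINN.get(word.strip(".,!?"), 0)
                   for word in text.lower().split())

    print(sentiment("I love it"))         # 3
    print(sentiment("You are an idiot"))  # -3

The point is that a pure bag-of-words sum like this already gets surprisingly close to the human agreement ceiling mentioned above.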


"Disobedience, in the eyes of any one who has read history, is man's original virtue. It is through disobedience that progress has been made, through disobedience and through rebellion."

Oscar Wilde, being 14% "toxic".

--

"Politeness , n. The most acceptable hypocrisy."

Ambrose Bierce, being 48% "toxic"


Also 14% toxic:

> "If parental love is so distorted that it demands submission and dependence for its self-confirmation, social adjustment turns into a test of obedience and the child’s efforts to comply bring with them the loss of genuine feelings. The human being then becomes the true source of evil."

-- Arno Gruen (though putting it in quotes knocks off a full 3% toxicity)


I tried two phrases:

    59% You're a potato.
and:

    60% You're a potato
Just losing the period makes it more toxic! Wow.

Now what can I do with this?


Putting your comment in quotes seems to make it considerably less "toxic" than the same sentence without quotes (unless the rating is already very high, in the 90s).

"You're a potato."

35% toxic


Just "potato" on its own is 19% toxic.


53%: Your a potato.

55%: your a potato

61%: ur a potato

36%: a potato is you


You can have all the technology in the world but eventually what censorship really means is someone else stopping you from speaking your mind, and that person can be benign or malignant.

"host better conversations" is a nice little marketing jingle but what it really means is "host conversations closer to what you consider 'better'". And 'better' is in the eye of the beholder.


I'm getting <10% toxicity ratings with sarcastic, mocking comments. Those ones will be a bit hard to fight off, it seems.


This is how the team built its toxicity model:

What's toxic?

This model was trained by asking people to rate internet comments on a scale from "Very toxic" to "Very healthy" contribution. Toxic is defined as... "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion."

I'm not sure it will ever be possible for algorithms/sentiment analysis to identify sarcasm, as it is extremely situational and is often misread or misunderstood even by humans.

Seems to me that sarcasm will keep human moderators employed for some time to come.


It seems easy to me: all they have to do is look for the /s! (/s)


Lazy youngsters, can't even finish your pseudo-tags! In my day, it was </sarcasm>.

</sarcasm>


Sometimes I can't even tell when I'm being sarcastic


> We are also open sourcing experiments, models, and research data to explore the strengths and weaknesses (e.g. potential unintended biases) of using machine learning as a tool for online discussion.

Does that mean it will be practically possible to run the whole thing on your own server, without any Google API, once those things are released?



I just tried some different terms in the test bed on perspectiveapi.com and so far I noticed that 'Podesta' is a very toxic term, the same as 'Pizzagate' and 'Skippy' for some reason.

Kind of strange. Seems like this is more a tool for shaping the narrative than it is a tool for keeping a discussion sober.


Of the total number of discussions you've had or witnessed regarding pizzagate, what percentage have been sober?


Most of them have been sober, as sober as can be considering most threads I have partaken in contained content from his instagram...


This reminds me of the work Crystal is doing to coach you to write in the style most appropriate for your audience.

https://www.crystalknows.com/


Seems like lemmatizing the text before it's rated by humans (during the training of the model, that is) would get around a healthy portion of the grammatically-induced scoring differences.
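As a sketch of that idea, here is one way to normalize comments before they reach human raters, using NLTK's WordNet lemmatizer (one choice among several; stripping punctuation is an extra step beyond lemmatizing proper, added so the period/no-period variants above collapse to one string):

    # Sketch: normalize comments before human rating. Assumes NLTK is
    # installed; downloads the 'wordnet' corpus on first run.
    import string
    import nltk
    from nltk.stem import WordNetLemmatizer

    nltk.download("wordnet", quiet=True)
    lemmatizer = WordNetLemmatizer()

    def normalize(text):
        """Lowercase, drop punctuation, and reduce tokens to lemmas."""
        words = text.lower().translate(
            str.maketrans("", "", string.punctuation)).split()
        return " ".join(lemmatizer.lemmatize(w) for w in words)

    print(normalize("You're a potato."))  # youre a potato
    print(normalize("You're a potato"))   # identical output

With both variants mapped to the same string, the raters (and hence the trained model) can no longer assign them different scores.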


Detecting toxicity (and warning commenters about it) seems like a really useful idea. I would certainly like to browse many Brexit discussions with the top 40% of toxic comments cut out.


I think people often forget how much toxicity is filtered out of a conversation by talking face-to-face with a real person. I like to reserve conversations on political topics for emails, texts, calls, or real life chats.

Internet anonymity enables a ton of toxicity.


I agree that anonymity increases the potential for toxicity. But you're basically saying that because of that you're only willing to discuss politics in private, which to my mind doesn't really count as engaging in politics in a meaningful way. Given the proportion of public discourse that takes place on the internet these days, it makes perfect sense to develop tools like these to improve its quality.

Opting out and engaging in political discussion only in private is in a way a privilege, one that many feel they can no longer afford.


In addition to being a basic term match, it would also mark any conversation about controversial subjects as toxic. Try saying something innocuous about suicide or rape.


This thing does not like exclamation points. I'm all for being less toxic and more positive in my language, but sometimes I get really excited about it!!!


Sentiment analysis, basically?


I wonder what the results are on HN comments.


'You have a nice behind' scores a lot differently than 'you have a nice ass'. Hmmmm.



