I found it curious that this bot is really bad at recognizing apes: chimpanzees and gorillas specifically.
I fed it a lot of the images from a Google image search for these animals, and more often than not it either didn't recognize anything or considered them bears.
I don't mean to offend, but I'm left wondering if the creators of image recognition services disincentivize their neural nets from recognizing something as an ape, gorilla or chimpanzee so as to avoid the same mistake Google made when it falsely recognized black people as gorillas [1].
Wow. 10 years ago it would have been seen as a comical blunder by a stupid AI. Something to be fixed, for sure, but not a Serious Social Issue by any means. Nowadays, it apparently warrants several follow-up articles in the mainstream media, "social" commentary, and 3,309 retweets.
From the blog post:
“The bias of the Internet reflects the bias of society,” she said.
In some cases - yes, but this one seems more like society deliberately projecting human motivations onto a primitive algorithm and jumping to conclusions about what its errors "really" mean.
-
For people who disagree, here is a scenario you might want to consider. Imagine that you built an image tagging service. Imagine that someone found a glitch in your service that they consider offensive. Imagine them tweeting about it (before or instead of contacting you directly) and getting a similar kind of reaction, complete with extensive social commentary and media coverage. A nice, big crowd of people using your company and your service as a convenient example of things-that-are-wrong-with-our-society. How would you feel in that case?
The issue here is that race relations in the U.S. are so convoluted and have so much history that it's literally impossible to tell what would be considered "racist" without a broad understanding of the culture and a comprehensive list of past racial slurs and grievances.
Witness the KFC ad (https://www.youtube.com/watch?v=ZaIhf41ctkM) which was broadly labeled as "racist", despite the fact that the 'black people love fried chicken' stereotype is (as far as I know) only a U.S. construction.
> The issue here is that race relations in the U.S. are so convoluted and have so much history that it's literally impossible to tell what would be considered "racist" without a broad understanding of the culture and a comprehensive list of past racial slurs and grievances.
I understand, but isn't the logical conclusion that one cannot make a piece of technology (like an image caption bot) that is unaware of, for instance, such complex racial relations? And if so, should technological progress really be hampered by people's sense of outrage, even in the absence of malicious intent?
The logical (to me) conclusion is that people need to unbunch their panties and stop looking for things to be offended about.
I don't think it's reasonable for anyone to expect an AI system, at our current level of technology, to have a complete understanding of every nuance of human pique to the degree where it will never do anything which could be interpreted as offensive by anyone.
Hell, that's a far higher bar than we humans can hope to meet in today's 'outrage culture'.
Yes, I think people overreacted a bit. But part of the backlash was because if these algorithms were better trained on a diverse set of faces, they might not have made that mistake. I think that's a fair criticism.
> if these algorithms were better trained on a diverse set of faces, they might not have made that mistake.
Is this assumption based on something specific? I.e. are there good reasons to believe that they weren't trained on a diverse set of people and that such training would prevent this error from happening?
People in general look very much like gorillas. They are close cousins after all. Telling humans and other apes apart can't be very easy for a poor, overworked neural network.
Also, why are people offended by gorillas or apes? Imagine being mislabeled as a bear or a giraffe; would people get as offended? What about being identified as a parrot?
I get where you're coming from, but black people being compared to apes is kind of an age old racist trope, so it is understandable that people would be unhappy about it.
Being mislabeled as a bear, on the other hand, does not generally carry the same negative connotations as calling people apes, so I suspect that this is why CaptionBot is erring on this side of the classification.
I'm aware people may want to _use_ it as a racist tool, but I fail to understand that transformation. We're all apes[1], the offender and the target/victim alike. So more than anything, it shows ignorance on the part of the victimizer/bigot.
I suppose it's like calling someone a Neanderthal, but with racial implications. Interestingly, we're learning Neanderthals were not as cognitively lacking as perhaps some supposed.
It's like saying, hey, you are hairy... Right, we're all hairy (bald or not) with very few exceptions. It's a weird thing.
[1] In the Indonesian language, people and apes share the "orang" root: orang, orangutan, orang nakal, etc.
I don't think there is as much to understand as you make it out to be. People feel insulted when called apes, because it is frequently used as an insult.
It may well be taxonomically correct to say that humans as well as apes are primates and members of the Hominidae family, but that doesn't make it any less insulting.
Just because something is technically correct doesn't mean it isn't dehumanizing and suitable as an insult. I'm sure nobody would like to be called a sack of meat either.
Still, this is a machine that's doing the labelling, and one would have to believe someone purposely programmed it to do that, rather than the engineers simply lacking the foresight to anticipate a possible misrecognition.
C'mon, there's a long history of calling black people monkeys or apes as a way to remove their humanity. It's willful ignorance to ignore that history.
Imagine this tech matures and is incorporated into police bodycams: when an officer confronts a subject with objects in their hands, it might be able to confidently estimate the probability that the object is a firearm, with better reliability than the police themselves.
While we're here, let's go the full way and set up a proveable and public way to train a robocop, and I'd trust that more than a human cop. The awkward moment when AIs have more brains than cops (at least under the US system).
Likely before we get to robocops, the "robocops" will be integrated into people who have proven risky, based on previously known behavior in addition to social signals.
So Jane, a truant with convictions for petty theft or battery, gets off with probation if she agrees to embed her own personal "robocop". Yes, invasion of privacy, etc. But the alternative for her would be time in the pen, for example. So in this case, people become their own robocops, which turn the host in to the authorities if certain conditions are met (e.g., they engage in previously restricted activities).
I think this is more likely than a roving robotic cop which looks out for misdeeds.
God only knows. I guess if he shows up at school and people say, "hey, you look just like that Microsoft kid!", I'd feel slightly guilty. Seems unlikely, and I'd have reasonable grounds for a C&D.
I uploaded a photograph of a bunch of snow piled on top of a round table[0], which looks a lot like a marshmallow to the human eye. But it came back with "I am not really confident, but I think it looks like a polar bear lying in the snow." Not terrible :)
First of all, congratulations on a) the science (built on the shoulders of giants...) and b) the accessibility/interface and service.
Wondering if you plan to open up a caption API of any sort? I could definitely use something like this. If you want training feedback, that could be added as part of the API as well; I'd be willing to provide it for some images. So if you do add a training-feedback API, please make it optional. Something like the sketch below is roughly what I have in mind.
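(Purely illustrative — the endpoint URL and JSON fields here are invented, not a real API:)

    # Hypothetical sketch of a caption API -- the endpoint and fields
    # are made up for illustration, not an actual Microsoft service.
    import requests

    def caption_image(path: str) -> str:
        """Upload an image and get back a caption."""
        with open(path, "rb") as f:
            resp = requests.post("https://example.com/v1/caption",  # hypothetical
                                 files={"image": f})
        resp.raise_for_status()
        return resp.json()["caption"]

    def send_feedback(image_id: str, stars: int) -> None:
        """The optional training-feedback call suggested above (also hypothetical)."""
        requests.post("https://example.com/v1/feedback",  # hypothetical
                      json={"image_id": image_id, "stars": stars})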
Have you considered an 'abstract' type of version? As in, rather than simply describing the image as the caption, take the information that would be used, fill it in a MAD-LIBS[1] style setup, and see how that turns out? Maybe it's just me but a surrealist CaptionBot could be pretty fun.
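As a toy illustration of what I mean (the templates and tags below are made up; a real version would pull the slots from whatever the recognizer detected):

    # Toy sketch of the surrealist MAD-LIBS idea: drop recognized tags
    # into fixed templates. All templates and slot values are invented.
    import random

    TEMPLATES = [
        "A {adjective} {subject} dreaming of {object} near {place}.",
        "The {subject} that mistook {object} for {place}.",
    ]

    def surreal_caption(slots: dict) -> str:
        """Fill a randomly chosen template with tags extracted from the image."""
        return random.choice(TEMPLATES).format(**slots)

    print(surreal_caption({"adjective": "melting", "subject": "giraffe",
                           "object": "a train ticket", "place": "the beach"}))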
Thank you very much for putting your work out there for people to have fun with (and criticise!).
CaptionBot seems to have a bit of trouble with simple two-colour outline drawings. In one case I saw it even get the colour wrong ("red and white" for a black and red image).
Is that something you would have expected?
Also, I notice it doesn't do very well with character recognition either. Is that surprising?
It would be great if you added a text box under the stars so I can (optionally) tell you why I didn't give you five stars.
For example, I uploaded a picture of my daughter as an infant, and it said, "This is a baby on a bed and he's :D", which I gave four stars because it said "he" instead of "she". But honestly, you really have no way of knowing that. :)
I worked in Image Processing and Vision for a long time. If you'd told me 2 years ago that something like this could be possible, I would have laughed you out of the room. But in the last year or so, I've been stunned beyond belief at how well these networks work.
I got the same "not really confident, but I think it's a cell phone" response. Mine was a cluster of little buildings with gardens on top; yours is a ketchup bottle. Maybe 'cell phone' is the default response when it doesn't know.
It's only a matter of time before a repeat of Microsoft's last AI experiment (Tay), when the Internet teaches CaptionBot all of the positions in the Kama Sutra.
I had to share this one, I sent a screenshot of the Yahoo homepage from a while ago (yeah, I had that hanging around...) with the main image being of Donald Trump.
The caption guess was "I am not really confident, but I think it's a television screen and he seems ."
My lab is trying to do something similar for answering questions about images. We have a significantly better system than the current system that's online, but we haven't had a chance to update it yet: http://askimage.org
It is far from perfect, but is near state-of-the-art. I'm guessing it won't hold up to HN.
It is almost as smart as a child. I uploaded a vacation photo of Notre-Dame, and the caption was "A person standing in front of a church"... which is close to my son's "mommy standing in front of that church we went to".
Yep, on the stuff it recognizes. It recognized a picture I tested taken from the Grand Tetons as "a lake with a mountain in the background", which was quite correct, but also kind of generic.
On the other hand, it described a picture of Grand Prismatic Spring in Yellowstone as "a train with smoke coming out of the water." Which also is kind of like the crazy things kids sometimes say when they see something new.
As far as I know, this was the first research to do the super cool thing of combining multiple neural nets trained on different data:
"Now, what if we replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images? Normally, the CNN’s last layer is used in a final Softmax among known classes of objects, assigning a probability that each object might be in the image. But if we remove that final layer, we can instead feed the CNN’s rich encoding of the image into a RNN designed to produce phrases. We can then train the whole system directly on images and their captions, so it maximizes the likelihood that descriptions it produces best match the training descriptions for each image."
AND
"Our alignment model is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding"
Android users: do you get a lack-of-memory/resources error when you try to take a pic instead of selecting one from the gallery? It's a silly bug where the camera activity kills the browser activity that called it.
Google: we cannot move forward with 'Progressive Webapps' if you guys don't fix these silly bugs.
Take-a-picture-and-do-something is a common feature of webapps, like CaptionBot here!
A couple of days ago, I think there was a post about Google doing a lot of development and research around creating systems that understand/categorize/comment on/recognize images.
One thing I took away from reading about it is that Google has billions of images to train it with from all their different ventures.
Does Microsoft have access to anywhere near the same numbers of pictures?
The Chinese company has structured a deal with Getty to take over licensing outside of China.
However, while having access to world-class photography is great, the images that will (probably) be the most interesting for Microsoft to recognize are selfies, other "crowd"-created amateur photography, and possibly memes.
Personally, I'd think it would be cool to see if the bot could traverse the Getty collection and recognize the photographer of an image it had not seen before. Why yes, this is Leibovitz.
I wish services like this would be released without any kind of moral filter on the subjects it classifies.
I uploaded a picture of Michelangelo's David to the service to see what CaptionBot would say about it, and I got back the message "I think this may be inappropriate content so I won't show it."
It feels like there are two sides to this: either recognition is amazing, or it's really, really far off.
It seems that after it generates the caption, the result needs to be fed into some semantic pipeline, so that a caption like "a plane sitting on a book" would be rejected as not making sense and the system would try again.
After all, it really depends on the training data. If a picture of a train ticket was never seen by the NN, how could it answer correctly? However, it should try to reduce the answer to some more meaningful info: for example, instead of "two giraffes near a tree", ideally it would have said it's text and attempted OCR.
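Something like this toy re-ranking step, say (everything below is a stub I invented — the plausibility scorer and OCR hook are stand-ins, not any real service's pipeline):

    from typing import Callable

    def choose_caption(candidates: list[tuple[str, float]],
                       plausibility: Callable[[str], float],
                       run_ocr: Callable[[], str],
                       threshold: float = 0.5) -> str:
        """Pick the best caption whose combined score clears the threshold;
        otherwise assume the image is mostly text and hand off to OCR."""
        scored = [(conf * plausibility(text), text) for text, conf in candidates]
        best_score, best_text = max(scored)
        if best_score >= threshold:
            return best_text
        return "Text detected: " + run_ocr()

    # Stubbed usage:
    candidates = [("two giraffes near a tree", 0.31),
                  ("a plane sitting on a book", 0.12)]
    print(choose_caption(candidates,
                         plausibility=lambda s: 0.2,       # stand-in language model
                         run_ocr=lambda: "TRAIN TICKET"))  # stand-in OCR engine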
I gave it a photo of a Cylon [0] and it said "I am not really confident, but I think it's a close up of a motorcycle." Close but not really there; Google's reverse image search has a better detection in this case. As an aside, it'd have been really cool if it said it was a picture of a toaster.
Pretty impressive - gave it a few profile photos and it did surprisingly well, correctly identifying "A couple walking on a beach at sunset," "a man looking out a window", etc.
It struggled with wildlife photos - a pack of arctic wolves was "a sheep standing in the snow", and penguins swimming was "a bird flying over a body of water" (close but no cigar).
(I tried the spacex landing pictures too - it correctly identified "a boat in a large body of water" but ignored the ten-story rocket above said boat.)
My results ranged from impressive to awful. It recognized Pete Carroll with 96% confidence from a meme picture where he struts and chews gum. Then it thought a picture of the Super Bowl field before the game was boats on a table.
My photos did not do too well. My coral looks like a cake, my lizard looks like a bird, my boy fishing looks like a man next to a river, and a waterfall looks like a close-up of a rock.
I uploaded the sad Michael Jordan meme face and it responded "I think it's Michael Jordan wearing a suit and tie and he seems :(", sounds about right...
So I looked for a random photo on my phone and fed it a picture of a spot on my leg that I'm keeping an eye on. Close-up of a cat, apparently. Damn these hairy legs.
[1] http://blogs.wsj.com/digits/2015/07/01/google-mistakenly-tag...