Frinkiac: Simpsons quote search engine

reaperhulk · on Feb 5, 2016

One of the authors here. I blogged a bit the other day about how we built this: https://langui.sh/2016/02/02/frinkiac-the-simpsons-screensho...

simcop2387 · on Feb 5, 2016

Where did the subtitle data come from? It's got a misspelling of Horatio McCallister in it:

https://frinkiac.com/?p=search&q=captain+mcallister https://www.google.com/search?q=horatio+mccallister

Also when are you going to get seasons 16+? I really need to be able to find Jeff Albertson! (Comic book guy)

reaperhulk · on Feb 5, 2016

Interesting, thanks for the report there. Coincidentally, this misspelling is also present in all the subtitles that Simpsons World uses!

We chose the season 15 cutoff pretty much arbitrarily. We're not necessarily opposed to later seasons, but we'd like to have some better season/episode filtering in place before expanding more.

simcop2387 · on Feb 5, 2016

Makes sense with the seasons. It certainly helps when you've got issues like season 11 being off like it is. There are other reports of the search functionality being literal with punctuation, and other things to work out. It'd be really neat with more polish and possibly more TV shows added. I imagine the data needs are pretty substantial, but I wonder if there might be a good way to deal with that by having the server generate the JPGs on the fly from the better-compressed video files. That might be a really big win if BPG or other formats actually take off and begin replacing JPG.

freshyill · on Feb 6, 2016

Awesome work. I saw when you announced it on r/thesimpsons. What do you think of Josh Weinstein getting such a kick out of this on Twitter?

Also please give the text a slight drop shadow, if possible.

Ps. This makes me so happy. Thank you for making it.

acomjean · on Feb 5, 2016

Cool.

"We also parse subtitle files and correlate each subtitle line's timecode with the timecode of the screenshot. Finally, the frinkiac binary can upload the data set to frinkiac-server. "

Could you elaborate on this parsing of the subtitle files? I've seen the "open source" Star Wars gifs file with the dialog and time codes[1], but I'm not sure how they pulled the text from the closed captioning. (edit: someone else asked something similar...sorry).

Also, aside from the two-character search index you describe, how are you searching the quotes with Postgres? Are you using Postgres's full-text search[2] or something else?

Thanks, I love The Simpsons and this is really cromulent[3] and cool.

[1] https://github.com/LindseyB/starwars-dot-gif/blob/master/sub...

[2] http://www.postgresql.org/docs/current/static/textsearch.htm...

[3] https://frinkiac.com/?p=caption&q=cromulent&e=S07E16&t=10420....

mikewhy · on Feb 5, 2016

> Could you elaborate on this parsing of the subtitle files.

Not the creator, but subtitles are easy to find, and super easy to parse.

    1
    00:02:17,440 --> 00:02:20,375
    Senator, we're making
    our final approach into Coruscant.

    2
    00:02:20,476 --> 00:02:22,501
    Very good, Lieutenant.
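
As a rough illustration of how little parsing a .srt file needs, here is a minimal Python sketch. It is not Frinkiac's actual code, which isn't shown in this thread; the names `Cue`, `parse_srt`, and `cue_at` are made up for the example. It turns the two cues above into millisecond ranges and looks up which cue covers a given frame timestamp, which is roughly the "correlate the timecodes" step.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Matches an SRT timecode such as 00:02:17,440 (hours:minutes:seconds,millis).
TIMECODE = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


@dataclass
class Cue:
    """One subtitle entry (hypothetical type, for illustration only)."""
    start_ms: int
    end_ms: int
    text: str


def to_ms(timecode: str) -> int:
    """Convert an SRT timecode to milliseconds."""
    h, m, s, ms = map(int, TIMECODE.fullmatch(timecode.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(data: str) -> List[Cue]:
    """Parse SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = [line.strip() for line in block.splitlines()]
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that is not index / timing / text
        start, end = lines[1].split("-->")
        cues.append(Cue(to_ms(start), to_ms(end), " ".join(lines[2:])))
    return cues


def cue_at(cues: List[Cue], timestamp_ms: int) -> Optional[Cue]:
    """Return the cue on screen at timestamp_ms, or None if there is none."""
    for cue in cues:
        if cue.start_ms <= timestamp_ms < cue.end_ms:
            return cue
    return None


SAMPLE = """\
1
00:02:17,440 --> 00:02:20,375
Senator, we're making
our final approach into Coruscant.

2
00:02:20,476 --> 00:02:22,501
Very good, Lieutenant.
"""

cues = parse_srt(SAMPLE)
frame = cue_at(cues, 138000)  # a screenshot taken at 00:02:18,000
print(frame.text if frame else "(no dialogue)")
```

A linear scan is fine for one episode; for a whole series you would probably sort the cues and use `bisect` instead.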

wodenokoto · on Feb 6, 2016

.srt files are quite straightforward, but some of the other formats can get quite annoying in my experience.

irq-1 · on Feb 5, 2016

This is great work.

https://frinkiac.com/?p=meme&q=drinky&e=S04E21&t=769718&m=+A...

tptacek · on Feb 5, 2016

This is fucking awesome.

gegtik · on Feb 5, 2016

Consider lucene indexing if you want free fuzzy searching

frik · on Feb 6, 2016

or Sphinx search, similar to Lucene but coded in C++: https://en.wikipedia.org/wiki/Sphinx_(search_engine)

st3v3r · on Feb 5, 2016

Are there plans for an API? I think it'd be fun to try and get a Pebble Time app working with this.

yolesaber · on Feb 5, 2016

There is one - open up the page and look in the web inspector.

mumrah · on Feb 5, 2016

Was this done using the DVDs? I'm curious about any potential licensing issues with the screen caps and subtitles. Did you have to get permission/sign something - or does this fall under fair use?

matthewmcg · on Feb 5, 2016

https://frinkiac.com/meme/S06E11/1070785.jpg?lines=+Yes%2C+y....

giaour · on Feb 6, 2016

https://frinkiac.com/meme/S06E21/903635.jpg?lines=You+better...

cplease · on Feb 6, 2016

No way this is licensed (no copyright notice, even; not even a mention of Fox), and no way it is fair use. It has frame-by-frame, full resolution images and full transcripts of every episode up for browsing. This is textbook mass copyright infringement. Short of offering unlicensed video downloads for a fee, it could hardly be more clear-cut.

Yeah, it's cool, I get it, but you can't just steal and redistribute content en masse for your cool project. Well, he did, but I expect he'll be hearing from Fox's lawyers soon.

jrochkind1 · on Feb 6, 2016

It is arguably fair use in the U.S. I don't think there is enough case law to be sure. It's hard to predict how it would go in litigation. I think you're right that the defendants wouldn't have a particularly strong case, but they wouldn't have the weakest.

The courts have generally judged significant "transformation" of the source material to be powerful in determining fair use. I think that would be to their benefit. Also, it could be argued that this has very little effect on the market for the original copyrighted material, which would be in their favor. Of course, the copyright holder would see and argue it differently if they chose to sue. And "the amount and substantiality of the portion taken" would not look good for the defendants. But even though common belief focuses on this factor almost exclusively (thinking that as long as you copy only 10 pages or whatever you're good, and if you don't you're definitely not), that's not how it works. It's just one factor, and one that the courts in the past couple of decades have somewhat de-emphasized.

But I don't think we can say "no way it is fair use", or "it could hardly be more clear cut." It could go either way. Fair use in the U.S. for novel things, not already well established as fair use or not, almost always looks like this.

matthewmcg · on Feb 6, 2016

Counterpoint: copying every single page of every book and making it searchable can be fair use. It just takes only 10 years of litigation and appeals to determine that. See Authors Guild v. Google. https://www.eff.org/document/ruling-appeals-court

Also, the "TV Eyes" system that recorded television newscasts and made them searchable was fair use, though certain features were found to be infringing. See https://www.eff.org/deeplinks/2015/08/dangerous-decision-fai....

Point is, the law is hardly clear cut and never is with new technologies. Without someone willing to take a risk and develop a potentially infringing technology we would never have had VCRs, MP3 players, YouTube.... I applaud the creators for making an incredibly useful resource and I hope if they do face legal threats they get a zealous pro-bono defense from someone like the EFF or Larry Lessig.

dopeboy · on Feb 5, 2016

This is impressive. It found everything I tried. If the author is reading, showing GIFs or a small video clip instead of a static image would be preferable.

My favorite Simpsons quote: https://frinkiac.com/?p=caption&q=up+and+atom&e=S07E02&t=673....

Coach: Up and atom!

Rainier Wolfcastle: Up and at them.

Coach: Up and atom!

Rainier: Up and at them!

Coach: [annoyed] Up and atom!

Rainier: [louder] Up and at them!

Coach: Better.

acomjean · on Feb 5, 2016

Simpsons gifs as a service... à la:

https://twitter.com/StarWarsDotGif

It's open source and maybe could be repurposed?

https://github.com/LindseyB/starwars-dot-gif

jrochkind1 · on Feb 6, 2016

Ooh, that's an awesome idea, and should be quite do-able technically, animated GIFs.

asd · on Feb 5, 2016

I love this. It found everything I threw at it. I hope the Fox lawyers don't take it down.

https://frinkiac.com/?p=search&q=THERE%27S+A+STUFFED+PEPPER+...

md224 · on Feb 5, 2016

I searched for "moon pie" and didn't find what I was looking for. :(

Gorbzel · on Feb 5, 2016

Yeah, I was saying Boo-urns, and it couldn't find it.

Also, yeah, this is coming down as soon as the lawyers get ahold of it.

rconti · on Feb 5, 2016

They may or may not (be allowed to) have a sense of humor about it. Our 24 Hours of LeMons car's publicity was sent to Matt Groening by a friend, and he passed it around the office. Apparently he asked their publicity folks if they could invite us up to show off the car about the same time that legal asked about sending us a cease and desist.

In the case of the car, it's probably fair use and the only issue was likely that we have non-Fox-approved sponsorship on it, but they probably decided their advertisers wouldn't complain about it because it's not exactly big bucks changing hands here.

So yeah, we got to meet Matt Groening and David X Cohen and Al Jean and a lot of the writers. It was definitely a cool experience.

http://www.thehomercar.com

mbrubeck · on Feb 5, 2016

The spelling in the captions can be hard to predict, and it's not good at fuzzy matching:

https://frinkiac.com/?p=search&q=I+was+saying+buu-urns
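
For a feel of what fuzzy matching buys you, here is a minimal Python sketch using the standard library's `difflib` rather than a real search engine. The vocabulary and the caption spelling "boo-urns" are assumptions for the example, not data pulled from Frinkiac, and `fuzzy_lookup` is a made-up helper name.

```python
import difflib

# Hypothetical vocabulary of words taken from indexed captions.
VOCABULARY = ["i", "was", "saying", "boo-urns", "moon", "pie"]


def fuzzy_lookup(word, vocabulary=VOCABULARY, cutoff=0.6):
    """Return indexed words similar to `word`, best match first."""
    return difflib.get_close_matches(word.lower(), vocabulary, n=3, cutoff=cutoff)


# The misspelled query term still reaches the (assumed) caption spelling.
print(fuzzy_lookup("buu-urns"))
```

This only corrects single words against a small list; it says nothing about how a proper index would rank whole phrases.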

asd · on Feb 5, 2016

Oddly, it works with quotes.

https://frinkiac.com/?p=search&q=%22moon+pie%22

mbrubeck · on Feb 5, 2016

Or even with just one quote:

https://frinkiac.com/?p=search&q=%22moon

It looks like it uses really naive word breaking, so it considers the punctuation part of the word.
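
A minimal Python sketch of the difference, assuming the index simply splits captions on whitespace (that is a guess from the behavior above, not something the authors have confirmed). The caption text and both function names are invented for the example.

```python
import string


def naive_tokens(text):
    """Split on whitespace only, so punctuation stays glued to the words."""
    return text.lower().split()


def normalized_tokens(text):
    """Split on whitespace, then trim punctuation from both ends of each word."""
    trimmed = (word.strip(string.punctuation) for word in text.lower().split())
    return [word for word in trimmed if word]


# Hypothetical caption, not an actual line from the subtitle data.
caption = 'Would you like a "Moon Pie"?'

print(naive_tokens(caption))       # the quote mark is part of the token
print(normalized_tokens(caption))  # plain words, searchable without quotes
```

Trimming only the ends keeps words like "where's" intact while still letting a bare `moon` match.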

jrochkind1 · on Feb 6, 2016

They could definitely use some better text indexing/relevancy ranking implementations. I had mixed success. I'd recommend lucene or something based on lucene (Solr, ElasticSearch).

6stringmerc · on Feb 5, 2016

Just in time for the Grammys!

https://frinkiac.com/?p=caption&q=grammy&e=S10E14&t=1008006&....

Once the AV Club finds this I think a black hole will open and consume us all. The website is quite cool though!

Analemma_ · on Feb 5, 2016

If I could use this to get subtitled gifs of the scene in question, not just screenshots, it would go from amazing to godlike. On the roadmap for v2, hopefully?

tptacek · on Feb 5, 2016

CLICK THE IMAGES

https://frinkiac.com/?p=caption&q=lousy+smarch+weather&e=S07....

shawabawa3 · on Feb 5, 2016

For me I don't get gifs, just a list of stills from the same scene

LinkDJ · on Feb 5, 2016

I think he's making a joke about clicking the "next image" fast enough that it appears to be animated.

navbaker · on Feb 5, 2016

I tell people every day that you don't win friends with salad. Glad I finally have the images to go with it!

redwards510 · on Feb 5, 2016

I know that feel[1][2] bro.

[1] http://www.redbubble.com/people/babushack/works/12826784-you...

[2] http://www.redbubble.com/people/newdamage/works/9371721-i-wo...

ColinCochrane · on Feb 5, 2016

Great work! Tried out some of my favourites and it worked like a charm.

https://frinkiac.com/?p=search&q=that%27s+a+paddlin

https://frinkiac.com/?p=search&q=thrillho

https://frinkiac.com/?p=search&q=now+where%27s+me+toothpick

acomjean · on Feb 6, 2016

All my favorites are there:

"we tried nothing and we're all out of ideas" https://frinkiac.com/?p=caption&q=we+tried+nothing+&e=S08E08...

'see you suckers' https://frinkiac.com/?p=caption&q=suckers&e=S14E18&t=688354&...

'the buddy system... foolproof' https://frinkiac.com/?p=caption&q=buddy+system&e=S14E03&t=94....

'I'll never be the darling...' https://frinkiac.com/?p=caption&q=stoke+their+beards&e=S06E0...

'mountain of sugar' https://frinkiac.com/?p=caption&q=mountain+of+sugar&e=S06E02...

'childrens dance recital' https://frinkiac.com/?p=caption&q=parents+expect+a+children%....

'wookie' https://frinkiac.com/?p=caption&q=wooki&e=S06E02&t=1301984&m...

'idiots island' https://frinkiac.com/?p=caption&q=any+sign+of+inteligence&e=....

'George Harrison' https://frinkiac.com/?p=caption&q=george+har&e=S05E01&t=9790...

'Beer' https://frinkiac.com/?p=caption&q=killing+you+with+beer&e=S0....

'alcohol' https://frinkiac.com/?p=caption&q=+the+cause+of&e=S08E18&t=1...

derman232 · on Feb 6, 2016

'help yourself to some more stock' https://frinkiac.com/?p=caption&q=help+yourself+to+more+stoc...

acomjean · on Feb 6, 2016

"bubbles can burst?" https://frinkiac.com/?p=caption&q=golden+age+&e=S13E18&t=107....

j45 · on Feb 5, 2016

This is great and long overdue.

For those of us who grew up having conversations in Simpsons dialog, this will help those in my life who don't have such habits develop them :)

thepies · on Feb 5, 2016

small point - the encaptionator should put a 1/2px black stroke around the white text so it is visible against any background colour

edit - after reading the FAQ I see you are working on this

I withdraw my question https://frinkiac.com/?p=caption&q=withdraw&e=S08E14&t=688870....

OhHeyItsE · on Feb 5, 2016

This is the reason the internet exists.

nefitty · on Feb 5, 2016

You call this a tax return!?

https://frinkiac.com/meme/S07E17/702635.jpg?lines=+You+call+...

ringofgyges · on Feb 5, 2016

Some great screencaps compiled in this article:

https://www.inverse.com/article/11007-frinkiac-is-the-visual...

Kluny · on Feb 5, 2016

I can't believe how fast it is.

martythemaniak · on Feb 5, 2016

Who can write a Simpsons quote search engine?

https://frinkiac.com/?p=search&q=the+garbage+man+can

dalke · on Feb 6, 2016

Any chance of OCR? I searched for "Pharm Team", which was the name of the company at https://frinkiac.com/?p=caption&q=major+league+baseball&e=S1... though the name was never said.

vlunkr · on Feb 5, 2016

This is great! My only complaint is that it comes up with lots of near duplicates. The images look like they are different frames, but the quotes they reference are the same.

joe_coin · on Feb 5, 2016

I think that's a feature. You get to choose which frame you prefer.

volaski · on Feb 5, 2016

This is amazing. I hope there's an api for this

kentbrew · on Feb 5, 2016

Break out your Chrome inspector and follow along in our exciting home version:

https://frinkiac.com/api/search?q=hoyvin

Results look like this:

[{"Id":1745953,"Episode":"S13E08","Timestamp":797213,"Filename":""}]

Concatenate the episode and timestamp to get the image:

https://frinkiac.com/img/S13E08/797213/medium.jpg

Caption here:

https://frinkiac.com/api/caption?e=S13E08&t=797213
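
The three URL shapes so far (search, image, caption) can be sketched in Python as follows, using only what appears in this comment. The helper names are hypothetical, the endpoints are undocumented (they were spotted through the browser inspector and may change), and `medium` is the only image size shown here, so treat the `size` parameter as an assumption.

```python
import json
from urllib.parse import urlencode

BASE = "https://frinkiac.com"


def search_url(query):
    """URL of the search endpoint for a query string."""
    return f"{BASE}/api/search?" + urlencode({"q": query})


def image_url(result, size="medium"):
    """Concatenate episode and timestamp to get the screenshot URL."""
    return f"{BASE}/img/{result['Episode']}/{result['Timestamp']}/{size}.jpg"


def caption_url(result):
    """URL of the caption endpoint for one search result."""
    params = {"e": result["Episode"], "t": result["Timestamp"]}
    return f"{BASE}/api/caption?" + urlencode(params)


# The sample search response quoted above, parsed instead of fetched.
SAMPLE_RESPONSE = '[{"Id":1745953,"Episode":"S13E08","Timestamp":797213,"Filename":""}]'
results = json.loads(SAMPLE_RESPONSE)

for result in results:
    print(image_url(result))
    print(caption_url(result))
```

Nothing here touches the network; fetching is left out on purpose since this isn't an official API.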

Look for the Subtitles array:

"Subtitles":[{"Id":138914,"Episode":"S13E08","StartTimestamp":794266,"EndTimestamp":796533,"Content":" ( gavel pounding ) So, Professor,"},{"Id":138915,"Episode":"S13E08","StartTimestamp":796533,"EndTimestamp":799834,"Content":"tell us about Operation Hoyvin-Mayvin."}]

nkrisc · on Feb 5, 2016

You'd have to be stupider than a monkey to not like this. Are you stupider than a monkey?

https://frinkiac.com/?p=caption&q=how+big+of+a+monkey&e=S12E...

seppo0010 · on Feb 6, 2016

I made this Chrome extension to generate animated GIFs from frinkiac https://chrome.google.com/webstore/detail/frinkiac-gif/dlaba...

ChrisArchitect · on Feb 5, 2016

on the legal/lawyer talk tip - there have been a few notable other simpsons screencap repositories (like Lardlad) that have remained online for years. Wondering if there's some leeway or can't chase after a single frame (rather than video with picture and sound, which they are notoriously strict on youtube about etc)

bootload · on Feb 5, 2016

This is a great tool. Any copyright issues? I tried it, "but it disappeared into 'fat air'."

daok · on Feb 5, 2016

Every time you type a character in the search box, it adds a browser history entry. That is not great...

tptacek · on Feb 5, 2016

WORTH IT.

https://frinkiac.com/?p=meme&q=gamblor&e=S05E10&t=1151749&m=...

jameshart · on Feb 5, 2016

Hmm... https://frinkiac.com/?p=meme&q=gamblor&e=S05E10&t=1151749&m=...

jrochkind1 · on Feb 6, 2016

They should be using replaceState; it looks like they may be using pushState instead.

sirsean · on Feb 5, 2016

You're right! But that behavior should be a little better now.

iztyhi · on Feb 5, 2016

No Milpool! :(

More seriously:

1) Awesome!!!

2) It would be great if the search results page listed the quotes in addition to showing the images.

tptacek · on Feb 5, 2016

It totally does have milpool:

https://frinkiac.com/?p=meme&q=i+think+i+left+my+glasses+in+...

It's just that nobody ever says "Milpool" in the dialogue.

iztyhi · on Feb 5, 2016

Excellent!

ChrisArchitect · on Feb 5, 2016

curious about how it works/was developed https://news.ycombinator.com/item?id=11036894

anindyabd · on Feb 6, 2016

First thing I searched for: "Kids, you tried your best, but failed miserably. The lesson is, never try." Got the exact episode. This is great :)

doodpants · on Feb 5, 2016

I was hoping to find the quote in which Grandpa Simpson mentions Estes Kefauver, but searching for "Kefauver" yields no results. :-(

squeaky-clean · on Feb 5, 2016

It looks like there are no episodes indexed past season 15, and this quote is from season 20.

https://frinkiac.com/?p=episode&e=S20E14 Should be the episode.

fungos · on Feb 5, 2016

Authors: Can you describe the backend infrastructure?

I'm just a bit curious here about the costs of running a toy service like this.

mjklin · on Feb 6, 2016

"And that is why The Lord of the Rings can never be filmed!"

Stumped ya Frinky. It didn't have to go down like this.

happyopossum · on Feb 5, 2016

This is an amazing feat of human ingenuity.

silveira · on Feb 5, 2016

Awesome. I could find an episode about "tiger-repellent rock" just by searching for "tiger".

morsch · on Feb 5, 2016

Much cooler than expected. So I assume this is fairly trivial to adapt to any other set of subtitled videos?

huangc10 · on Feb 5, 2016

"Hi Super Nintendo Chalmers" LOL...this is freaking awesome. GIFs would be an improvement :)

noobie · on Feb 6, 2016

Though you may be rat-like in appearance, you are truly king among men for sharing this!

ChrisArchitect · on Feb 5, 2016

also, why didn't this get picked up in the duplicate post algo HN? For the blog writeup @reaperhulk you should have put 'Show HN' in your original post to get more traction or something

tehbeard · on Feb 5, 2016

I'm getting a nothing found error? Is this a mobile bug?

GurnB · on Feb 5, 2016

Getting the same results from my laptop at the moment also. Everything is returning 'Nothing Found' Error. It was working earlier today. (Can you tell it is Friday?)

sdh · on Feb 5, 2016

needs a random button

peruvian · on Feb 5, 2016

How is this so fast?

pbhowmic · on Feb 5, 2016

Brilliant. just what I need to needle the wifey

rglover · on Feb 6, 2016