Show HN: PodText.ai – Search anything said on a podcast, highlight text to play (podtext.ai)
219 points by anonbuilder on Feb 9, 2023 | 84 comments
Hi HN, wanted to share a project that I’ve been working on recently.

PodText allows users to find anything said on a podcast. You can also listen to and share a clip of a specific part of the podcast audio, simply by highlighting the text of that part. Currently there are just over 25k podcast episodes and I’m adding a lot more in the coming weeks (yes my GPU bill is painful).

In order to monetize it, I’m building a sponsorship database to help sponsors find podcasts and vice versa. This will be sold in the form of a $99/month “PodText Business” subscription. I bet I could charge a lot more to large sponsors but I’ll tweak that as I talk to potential customers.

Right now the UI is very bare bones (doesn’t even have pagination) but I’ll polish it once the data pipeline is working well. Please let me know if you run into any bugs or have any questions about the site or business model.

PS: I'm a regular on HN under my real name but can't post from that account since my employer would fire me if they found out about this project :-)




This is a really interesting project with a lot of potential.

If I were a sponsor looking for a podcast I would want my search process to look something like this:

- Search for a term relevant to my line of business

- See a list of podcasts ordered by % of utterances which contain my key phrase throughout their last N episodes

- Annotation of how many listeners each podcast had in last N episodes
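
A rough sketch of the ranking step in Python, assuming utterances arrive as `(podcast, episode_number, text)` tuples. That shape and the name `rank_podcasts` are made up for illustration, not anything the site exposes. It uses plain substring matching, so it would have the same "notebook" vs. "Notebook.ai" noise problem as any keyword search.

```python
from collections import defaultdict

def rank_podcasts(utterances, phrase, last_n=10):
    """Rank podcasts by the share of utterances that mention `phrase`.

    `utterances` is a list of (podcast, episode_number, text) tuples;
    this input shape is an assumption made for the example.
    """
    phrase = phrase.lower()
    # podcast -> episode_number -> list of lowercased utterance texts
    by_podcast = defaultdict(lambda: defaultdict(list))
    for podcast, episode, text in utterances:
        by_podcast[podcast][episode].append(text.lower())

    scores = {}
    for podcast, episodes in by_podcast.items():
        recent = sorted(episodes)[-last_n:]  # the N highest episode numbers
        texts = [t for ep in recent for t in episodes[ep]]
        hits = sum(phrase in t for t in texts)  # substring match, not semantic
        scores[podcast] = hits / len(texts)
    # Highest share of matching utterances first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Listener counts would have to come from a separate source and be joined onto this list; nothing here estimates them.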


They could and probably should offer semantic search, which would be far more powerful than searching exact match keywords.

If you could identify podcasts that often talk about a domain more broadly, you'll have a higher hit rate and overall a better audience fit.


I’ve been creating a semantic search using embeddings tonight against my own podcast transcripts. I’d be happy to have my own content surfacing mechanism like this!


Appreciate the feedback! I'll keep these use cases in mind as I build out PodText Business


Building out the search a little more to support exact matches would also be super useful in this flow. For example, I've been on several podcasts talking about Notebook.ai, but searching for the name also matches "notebook", which results in an unusable signal-to-noise ratio (seeing every podcast that says the word "notebook"). Likewise, it'd be great to quote-search exact matches for "Andrew Brown", instead of seeing all podcasts that mention "Andrew" or "brown".
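
To illustrate what I mean by exact matching, here is a minimal Python sketch. It is not how the site or Algolia works; the tokenizer and the `contains_exact` helper are hypothetical. The idea is to keep "Notebook.ai" as a single token and require quoted words to appear next to each other.

```python
import re

# Keep internal dots and apostrophes so "notebook.ai" stays one token,
# while a trailing sentence period is dropped.
TOKEN = re.compile(r"[a-z0-9]+(?:[.'][a-z0-9]+)*")

def tokenize(text):
    return TOKEN.findall(text.lower())

def contains_exact(text, query):
    """True if the query's tokens appear consecutively, in order, in the text."""
    words, target = tokenize(text), tokenize(query)
    n = len(target)
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )
```

With this, "I wrote it in my notebook." does not match the query "Notebook.ai", and "Andrew likes brown bread" does not match "Andrew Brown".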


Happy to exchange notes with you on our learnings building https://podsearch.page


Heads up:

the "How it Works" link is broken on the home page of podsearch.page


I thought about making something like this, but one important part, which seems to be missing here, is speaker diarization (identifying who says what).

In a world of increasing automated content generation, the "who" might become just as important as the "what" of information.


Amazing. I was literally just building the same thing. I have a little over 50k podcast episodes with karaoke style word by word transcription.

On to the next project I guess.


There are dozens of us! This looks nice and the UI is better than I would have done. Good job, OP. I guess I can stop running my GPU every night.

I might still try to release a nice bundled-up docker container that can STT a podcast RSS into a text RSS. Some podcasts are enjoyable to listen to, but some I just want to skim the text.


Tbh this is like the single most common idea every tech guy gets when they’re frustrated with the problem. Many like me move on when we realize how niche and potentially small a TAM it could be (of course any idea can generalize in creative ways). So no harm as long as it was fun. Also many options can co-exist!


I don't like that attitude ;)


This is amazing. How do you search for two terms at once? e.g. "aboriginal" and "origin". Doesn't seem possible to require both terms are present.


Thanks! The search function is built with Algolia, I'm sure they support boolean ops like "AND" but I'll need to dig into their API. I think if you search both terms, transcripts containing both should be ranked higher.


I’m doing a similar personal product. Highly recommend switching to Typesense before your Algolia trial is up. I’ve heard good things about Meilisearch but Typesense has been rock solid for me.


I second that. Typesense is fantastic. I used it for a job board with 3 million and it did great.


Haven't heard about Typesense - thanks for the pointer! Btw if you want to trade notes on our projects, feel free to email me: team@podtext.ai


You might want to try semantic search instead of fiddling with keywords. Disclaimer: I'm building a plug-and-play semantic search API at https://kailualabs.com


I've been thinking about how to improve search, would love to try a demo of your product!


Surprised to see how many other small projects doing this same thing there are. Kinda seems like a solved problem with ListenNotes. Not affiliated, but I use the service a lot and they have a lot more features than just transcripts including publicly accessible APIs (some of which could probably be utilized by some of the projects posted here)


I'd echo this. ListenNotes API is great and I thought podcast search was already solved for devs.

I've always enjoyed this question on their FAQ that gives some tips for potential competitors - https://www.listennotes.com/api/faq/#faq2

> There are at least 3,035,027 podcasts and 156,316,374 episodes on the Internet...


In addition, it's free for users and the API's free plan is sufficient for most personal uses. Hard to see how any of these newcomers could compete unless they had funding behind them


There are similar products on the market, but please take that as encouragement that this is a real business, not as discouragement that it's been done. We use podscribe at my place. For us it's more about better targeting than discovery because we have an ad sales team, but suffice to say there is an established market for this kind of thing.


Do you get permission from podcast creators?

Because these transcripts are probably derivative works.


It hasn't been litigated yet, but the closest analog is closed captioning. You can legally create closed captions for someone else's work, and you then own the copyright.


I think the closest analog is audiobooks. You can't create audiobooks without permission from the original creator of the book. Your derivative work has its own copyright, but it only applies to the work you've done, i.e. the audio. You can't steal someone's original work just by adding something to it. You get copyright on the addition only.


Nice idea but it didn't work for me... you need to look into why "poppadoms or bread" isn't turning up any results.

It's one of the lines in every episode of the Off Menu Podcast, one of the biggest podcasts in the UK.


This is great, I was working on something similar in the last few days, but since it is hard to cover every podcast, I stopped to think of a way to niche down. I feel your pain with GPU costs and the scalability of transcribing podcasts.

I was thinking of adding something like this for the UI https://github.com/johan-akerman/SpotifyTranscripts in case you find it useful.

Good luck! It is a really nice project.


Thanks! I'll check out their UI, mine definitely needs a lot of work :)


Really interesting service; well done! Looking forward to seeing how it will evolve.

Quick feedback: it seems you're using fuzzy search, which may or may not be what the user expects. For example, searching for "FPGA" also matches "figa", "fuga", "Fagan", "FPNA", "PGA" (tour), and a bunch of other irrelevant terms. Using quotes to indicate I'm looking for an exact match didn't help.
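
My guess, and it is only a guess since I haven't seen how the search is configured, is that the engine tolerates typos by edit distance. A small Python sketch of plain Levenshtein distance shows why short queries suffer: most of the bad matches are a single edit away from "FPGA".

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]
```

By this measure "figa", "fuga", "fpna", and "pga" are all at distance 1 from "fpga", and "fagan" is at distance 2, so a tolerance of one or two typos on a four-letter query lets all of them through.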


What software do you use to transcribe the speech? Whisper?


Yes, using Whisper running on banana.dev :)


Do I understand correctly that Banana's pricing is $1.87 per hour, so an hour of audio with the large model costs you about $1? That's probably a bit too expensive compared to cloud providers.


Whisper can be run on CPUs and produce high-quality transcripts. Doing it on CPU is like a 10th the price and more horizontally scalable.

This is using CPUs: https://modal-labs--whisper-pod-transcriber-fastapi-app.moda....

Transcribes a 1hr podcast in 1min for ~3-5 cents.


> 1 hour of audio processing with Whisper on Banana costs <30 cents.

From their page on Whisper: https://www.banana.dev/deploy-whisper


That's probably the medium model


This is a game-changer for topic research. Thank you! I also agree that this will be a valuable resource for potential business sponsors and worth the money. One note, I like that it shows you the exact location of the text with the term you are searching for. Is there a way to see the entire transcript or listen to the entire show on your site?


I was also exploring this idea a while back, congratulations on taking it past the finish line. Highly recommend you set up CloudFront for serving S3 static resources - right now you are linking directly to S3 which is much more expensive for content distribution, especially since you're serving large audio files.


Would be nice to have a slider to adjust the timestamp interval. With timestamps every 3 seconds, reading 4 words per line is distracting. Being able to adjust to 10/30/60 second timestamps would give you a proper paragraph to read (I guess you may not be able to indent cleanly, but you get my point).


This should be possible, since Whisper forks can support exporting results with timestamps per word. Indexing entire sentences / clauses is a more complex problem though.
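
For the simpler version (coarser timestamps, not sentence boundaries), a minimal Python sketch could look like this. It assumes the transcript is available as `(start_seconds, text)` pairs; that shape and the name `regroup` are assumptions for illustration.

```python
def regroup(segments, window_seconds=30):
    """Merge short (start_seconds, text) segments into one paragraph
    per fixed-length window, labeled with the window's start time."""
    paragraphs = []
    for start, text in segments:
        # Start time of the window this segment falls into
        bucket = int(start // window_seconds) * window_seconds
        if paragraphs and paragraphs[-1][0] == bucket:
            paragraphs[-1][1].append(text.strip())
        else:
            paragraphs.append((bucket, [text.strip()]))
    return [(start, " ".join(parts)) for start, parts in paragraphs]
```

A slider would just change `window_seconds` and re-render. Windows cut purely on time, so a paragraph can still end mid-sentence, which is the harder problem mentioned above.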


Seems like Chapo Trap House and The Adam Friedland Show are inexplicably unrepresented not only here but on other similar webapps linked elsewhere in this thread... Strange, considering that those are two incredibly popular podcasts, at least by the metric of Patreon revenue.


I couldn't find either of Jim Cornette's podcasts, Tiger Belly, Flagrant, Brilliant Idiots, or Matt and Shane's Secret Podcast, which I think is number 1 on Patreon.

I would assume they started with the podcasts they currently subscribe to, then asked their friends, then started going down one of the charts that rank by downloads.

I never heard of the ones you mentioned, and you may not have heard of the ones I mentioned. There are so many podcasts that there are always going to be missing ones. Not to mention, if the goal is to sell ads, then high Patreon earners would not really be your target.


Stop baiting partisan bickering


People don’t like advertising with Nazis so it wouldn’t make sense to pay to transcribe that.


Cool idea and I wish you the best. But talking about monetizing might be too early at this point. After a couple of tests it seems like it is completely unusable:

- It can't search non-English words

- It will change the spelling of any word to show completely irrelevant results


If you want to grow your audience user-wise, allowing people to add alerts on specific terms would be pretty cool.

There's a set of people I always enjoy listening to whenever they're on a podcast, but it's hard to keep track of everything out there.


Looks very cool! Besides the search, which works great and feels like a powerful tool, I like how snappy the word highlighting is in the transcripts. Can you elaborate a bit on how that's done? Thank you, this is really great work!


If anyone is interested, there's a podcast app called Snipd that integrates transcription. They also use AI to create chapters and snippets. It then creates a feed of snippets, like Instagram reels. Pretty neat.

(Not my app; just a happy user.)


Here's my crack at a podcast transcription website: https://podscription.app

I made this while unemployed and the skills I learned from making it helped me land my new job!


Any chance you'd be open to sharing source? Also, curious if you used any tuning on Whisper to get transcription more accurate?



It's not particularly pretty but the source can be found here: github.com/zachbellay/podscription


I thought for Whisper (https://github.com/openai/whisper) you actually do not need GPU's and can use a CPU?


You can use CPU, but using GPU in my experience has been about a 20x speedup.


Super interesting - I did a search on a common business name like "Dropbox" and it's clear some ads show up. Wonder if there is any way to parse these out so they don't show in the results?


That's a great point, I'm parsing out ads to determine sponsorships anyway so filtering these from search should be straightforward. Thanks for the feedback :)


This is neat, but would it be possible to search for a multi-word phrase? If I search for a sequence of words I just get results that match one or more of the words but not the phrase itself.


Very neat! Is it possible to browse by topic instead of by podcast? You mentioned 25k podcast episodes but right now I can only browse ~100 of them or so unless I come up with some keywords.


Nice work! It would be great if after clicking on the text within the podcast that matches my term, I was brought to that section of the transcript rather than to the beginning.


That's a great piece of feedback! I'll get this fixed soon, super useful suggestion


Super cool. This could help make podcasts much more social. Do you generate the text or use provided transcripts? Are there any legal issues associated with that?


Wow, I had the same idea for my next home project. I put off working on it for too long. And this looks exactly like what I was hoping to achieve. Guess I have to find another idea. As mentioned in the comments there are so many great services in this space.

I had this problem of vaguely remembering something on the podcast that I listen to, and I wanted to find it by keywords. Using podtext.ai I was able to do it. So I consider it solved


How are you guys affording the transcription bills?

25,000 episodes at 1 hour long could cost 30,000 USD at a conservative market rate. Are you guys self funded?
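
For comparison, here is the same back-of-the-envelope math in Python using the per-hour figures quoted elsewhere in this thread (under 30 cents on Banana, 3-5 cents for the CPU setup). I haven't verified those prices; the helper names are just for this example, and rates are kept in whole cents to avoid float rounding surprises.

```python
HOURS = 25_000  # 25,000 episodes at roughly 1 hour each

def total_cost_usd(hours, cents_per_hour):
    """Total transcription cost in dollars for a per-audio-hour rate in cents."""
    return hours * cents_per_hour / 100

def implied_cents_per_hour(total_usd, hours):
    """Per-audio-hour rate in cents implied by a total bill in dollars."""
    return total_usd * 100 / hours

print(implied_cents_per_hour(30_000, HOURS))  # rate behind the $30,000 estimate
print(total_cost_usd(HOURS, 30))              # at 30 cents per audio hour
print(total_cost_usd(HOURS, 5))               # at 5 cents per audio hour
print(total_cost_usd(HOURS, 3))               # at 3 cents per audio hour
```

So the 30,000 USD figure implies about $1.20 per audio hour, while the rates quoted in this thread would put the total somewhere between $750 and $7,500.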


Haha I built this for fun a few months ago to full text search my own favorite podcasts. What are you guys using to do full text search / indexing?


Something like this would be interesting to me, many podcasts I listen to are not here or on some of the other links shared by others. Sometimes I can jump around and find roughly the right space for the segment to share but often I can't. It doesn't feel helpful to share a 2 hour podcast for a minute of discussion.


Currently I'm using Algolia but others have pointed out some alternatives I'll have to check out. Would love to hear any feedback/ideas you have from your project!


Awesome, I just tweeted at your proj re: gpu


Maybe something like Zoom's linking of different speakers! Is there an open-source version of this? Google transcribes all its podcasts automatically!


The transcription price would be big for 25,000 episodes at 1 hour long. That's like 30,000 USD.

How are you guys funding this?


How can I search for a full name or phrase? Something like "Steve Jobs"? Now I would get Steve and Jobs results.


I did the same at https://voilib.com ;)



also adding https://steno.ai to the list


Wow, how'd you manage to get interviews split into different speakers?


(not the creator, but I've built something similar for personal use)

This is a great library for determining which speaker is speaking during each time in an audio file (this is called speaker diarization); I imagine they used it or something like it. Works really well out of the box!

https://github.com/pyannote/pyannote-audio


thanks!


Would be nice to have some sort of help--at the moment I'm confused about what it's all about


From Noiser's latest episode of "Short History Of", at 0:14, the phrase "he's a law student" is transcribed as "his aloer student".

Apparently aloer is a real word, but not one I've ever heard, and not one that has anything to do with Indian history (the subject of said podcast episode).


Hello, this looks fantastic, I was looking for something like this a while ago.


Have you tweaked Whisper in any specific ways to improve accuracy?


Very cool, looking forward to testing the platform!


How do you discover new podcasts to index?


feel free to post any corrections

  podtext.ai           --this post
  www.listennotes.com  --focused on a lot more than transcripts, but probably the most complete of all here.
  steno.ai             --most professional seeming and possibly second most complete
  podscription.app     --good search, very limited inventory
  podscript.ai         --pretty but a little buggy, doesn't show timestamps or transcripts in search results
  podsearch.page       --only generates for specific podcasts. can't search across podcasts


Hate to be that guy, but is this legal? I would assume to do something like this you would need to have permission from each of the copyright holders.



