Project Naptha

me_again · on July 11, 2022

What I would really like is a little bit like this but not quite the same: full text search over everything I have ever seen on the computer. It would read and index the emails, web pages, word docs, etc as I open them, then later when I think "I know I saw a doc about cache oblivious algorithms", I can search for it without being distracted by 100K documents I haven't seen. Or I can find that email I read, without finding the same phrase in a bunch of junk mail I never opened.

Does anything remotely similar exist?

bmn__ · on July 11, 2022

The pieces exist, you can string them together with a Perl one-liner. You are interested in the set intersection of the following two topics:

Full indexing: <https://lesbonscomptes.com/recoll>, <https://userbase.kde.org/Akonadi>, <https://addons.mozilla.org/firefox/addon/falcon_extension> (If you're not content with a piece, then research substitutes on <https://alternativeto.net>.)

Recent: `.local/share/recently-used.xbel`

This does not help with the email part because email programs do not register opened messages in recently used. Work-around: install a DBus or AT-SPI hook and write your own database of recently opened messages.

Happy hacking!
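A minimal sketch of that intersection, with Python standing in for the Perl one-liner. It assumes the recently-used file has the usual XBEL layout (plain `<bookmark href="file://...">` entries) and that the full-text indexer returns a list of file paths. The function names `recent_paths` and `seen_hits` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urlparse


def recent_paths(xbel_text):
    """Return the set of local file paths listed in an XBEL document.

    Assumption: entries are un-namespaced <bookmark href="..."> elements,
    as in ~/.local/share/recently-used.xbel.
    """
    root = ET.fromstring(xbel_text)
    paths = set()
    for bookmark in root.iter("bookmark"):
        parsed = urlparse(bookmark.get("href", ""))
        if parsed.scheme == "file":
            # hrefs are percent-encoded file:// URIs
            paths.add(unquote(parsed.path))
    return paths


def seen_hits(search_hits, xbel_text):
    """Keep only the search hits that also appear in the recently-used list."""
    seen = recent_paths(xbel_text)
    return [path for path in search_hits if path in seen]
```

In practice `search_hits` would come from whatever indexer you picked, and `xbel_text` from reading the file under your home directory.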

shishironline · on July 11, 2022

Thank you for sharing this

isaacimagine · on July 11, 2022

I've seen this called a 'personal search engine' before. One person who is well known for their personal search setup is thesephist[0]. My friend is also working on an extension that uses NLP to semantically index your browsing history so that any text on the internet can be turned into a hyperlink to something else you've read[1].

[0]: https://thesephist.com/posts/monocle/

[1]: (WIP) http://espial.uzpg.me

suby · on July 11, 2022

I've read comments from people (don't remember the forum, perhaps HN) where people have said that they did this. No idea if there's a public project for this that you can use, but people have definitely done it. I agree that it'd be useful to have, though you probably need a good way to filter out irrelevant stuff.

AB1908 · on July 11, 2022

Try looking at karlicoss' promnesia and its background for similar ideas and tools.

ryanfox · on July 11, 2022

I’ve been working on exactly that! [0]

My info is in my hn profile, if you (or anyone reading) would like to chat about it.

[0] https://apse.io

DocTomoe · on July 11, 2022

For Windows, Google used to have something like that. Because it's Google, it has since been discontinued [1].

Mac's finder is close to what you have described, and works reasonably well for me.

On Unix, this sounds like something a grep one-liner (maybe with some document depacking/packing pipe for Office documents) would do.

[1] https://en.wikipedia.org/wiki/Google_Desktop

applgo443 · on July 11, 2022

I considered doing this - take screenshots of your screen constantly, OCR them and index them. It's fairly simple. However, there are some problems

- OCR constantly running in the background is power consuming
- What granularity do you take your screenshots? Imagine each screenshot is 500 KB and you take one each second. This'd result in 40 gigs of data per day. How are we gonna store it? How many days data do you want to keep?
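The storage estimate is easy to play with. A rough sketch, assuming decimal units (1 GB = 1,000,000 KB) and a fixed screenshot size; the function name and parameters are just for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_storage_gb(screenshot_kb, interval_s):
    """Raw storage per day in GB for one screenshot every interval_s seconds."""
    shots_per_day = SECONDS_PER_DAY / interval_s
    return shots_per_day * screenshot_kb / 1_000_000  # KB -> GB (decimal)


print(f"{daily_storage_gb(500, 1):.1f} GB/day")   # 43.2 GB/day
print(f"{daily_storage_gb(500, 10):.2f} GB/day")  # 4.32 GB/day
```

So one 500 KB screenshot per second lands a bit above 40 GB per day, and dropping to one every ten seconds cuts that by a factor of ten. Keeping only the OCR text rather than the images would shrink it much further, though how far depends on the content.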

billwashere · on July 11, 2022

That's Apse – A Personal Search Engine https://news.ycombinator.com/item?id=27965979

nly · on July 11, 2022

Privacy?

capableweb · on July 11, 2022

Since parent is taking power consumption and disk storage into consideration, it's fair to assume they are considering a local approach, meaning privacy is as good/bad as any other local data you have on disk today.

whilenot-dev · on July 11, 2022

There was once a thread here on hacker news about missing features of operating systems, as such a thing could only be achieved on OS level... can't find it anymore, unfortunately. It was mentioned that passwords etc. could be a nightmare. A feature like that would be a big dream of mine: Some kind of an individualized semantic archiving processing and a vector search engine to search through it.

bitL · on July 11, 2022

Open-text question answering. Just make your own; index all paragraphs of all documents using TF-IDF as you access them, then when trying to search for something, use this index to get a set of candidate paragraphs and run them through BERT-QA trained on SQuAD v2. You can extend this to the content of images - first run image captioning using CNN and transformers, then index the resulting paragraphs the same way (in both cases, include a link to the original in the metadata). You might need to write some browser plugin/system driver to do it automatically as you access documents/images.
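A rough sketch of the first stage only (TF-IDF candidate retrieval over paragraphs), using nothing but the standard library. The class name `ParagraphIndex`, the tokenizer, and the smoothed IDF formula are illustrative choices, not anything prescribed above. The BERT-QA reranking step is left out; it would take the returned candidates as its input.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ParagraphIndex:
    """In-memory TF-IDF index over paragraphs, filled as documents are opened."""

    def __init__(self):
        self.paragraphs = []       # (source link, paragraph text)
        self.term_counts = []      # one Counter of term counts per paragraph
        self.doc_freq = Counter()  # number of paragraphs containing each term

    def add(self, source, text):
        counts = Counter(tokenize(text))
        self.paragraphs.append((source, text))
        self.term_counts.append(counts)
        self.doc_freq.update(counts.keys())

    def search(self, query, k=3):
        """Return up to k (source, text) pairs, best TF-IDF match first."""
        n = len(self.paragraphs)
        scores = defaultdict(float)
        for term in set(tokenize(query)):
            df = self.doc_freq.get(term, 0)
            if df == 0:
                continue  # term never seen, contributes nothing
            idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed IDF
            for i, counts in enumerate(self.term_counts):
                tf = counts.get(term, 0)
                if tf:
                    # term frequency normalised by paragraph length
                    scores[i] += (tf / sum(counts.values())) * idf
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        return [self.paragraphs[i] for i in ranked]
```

Usage would be something like calling `index.add(url_or_path, paragraph)` whenever a document is opened, then `index.search("cache oblivious algorithms")` to get candidate paragraphs along with the link back to the original.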

dirkc · on July 11, 2022

There used to be a project that kind of did this: https://beagle-project.org/. It's long since defunct and I'm always surprised that nothing emerged to fill the gap?

EDIT: I did a bit of Wikipedia rabbit holing only to discover that tracker [1] is currently running on my computer and indexing my files

[1]: https://en.wikipedia.org/wiki/Tracker_(search_software)

solardev · on July 11, 2022

I think windows and Mac both do this by default, no? Just disable the web search and your local full text is what you're left with.

sneak · on July 11, 2022

It doesn't index text in local image files, and it doesn't index over the full text of all the webpages and epubs I've read.

The5thElephant · on July 11, 2022

I believe they meant they want something that searches only content the user has personally directly accessed, not ALL local content.

ricardobeat · on July 11, 2022

Spotlight on Mac will show you recently opened, or frequently opened, files first.

invalidusernam3 · on July 11, 2022

Ordering by "Date Last Opened" or "Date Modified" does a fairly good job in some cases

theK · on July 11, 2022

This does introduce sorting contention though: what to sort by first, relevance or date accessed? Ideally you would want to introduce date accessed as an aspect of relevance itself.
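One way to fold date accessed into relevance, as a sketch only: blend the text-match score with an exponentially decaying recency term. The 30-day half-life and the 0.3 weight are arbitrary assumptions, and `relevance` is assumed to already be normalised to the range 0 to 1.

```python
def blended_score(relevance, days_since_access,
                  half_life_days=30.0, recency_weight=0.3):
    """Combine a 0..1 relevance score with a recency score that halves
    every half_life_days since the document was last accessed."""
    recency = 0.5 ** (days_since_access / half_life_days)
    return (1 - recency_weight) * relevance + recency_weight * recency
```

With these numbers, a document opened yesterday with relevance 0.70 outranks one with relevance 0.75 that was last opened a year ago, which is roughly the behaviour being asked for. The weight would need tuning against real searches.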

fnord123 · on July 11, 2022

I thought we all disabled access time (mounting with noatime) to avoid trashing SSDs so quickly.

rhn_mk1 · on July 11, 2022

For the web part, there's a tool called Recoll, and a browser plugin Recoll-we.

joshu · on July 11, 2022

this has been built before. the problem is that it also needs attention for ranking

2Gkashmiri · on July 11, 2022

you know,,,, there was an April Fools' Day announcement on torrentfreak years ago, maybe a decade, it was describing this behaviour. that was nice

conorcleary · on July 11, 2022

Submitters need to get back to more descriptive titles on HN. If this post is the first and only exposure to this project for a user, "Project Naptha" alone doesn't give me confidence to roll the dice and click on an unknown link unless there are existing comments I can investigate. Thus, chicken and egg, scroll by.

cyberbanjo · on July 11, 2022

Browser extension OCR: From the homepage "Project Naptha automatically applies state-of-the-art computer vision algorithms on every image you see while browsing the web. The result is a seamless and intuitive experience, where you can highlight as well as copy and paste and even edit and translate the text formerly trapped within an image. "

lmm · on July 11, 2022

They probably put a descriptive title and then had a mod edit it to use the non-descriptive page title instead, as per HN's usual policy and practice.

em-bee · on July 11, 2022

i don't think that happened. in my understanding that rule applies to article titles, not to site titles.

in my opinion, when posting a site, then a description, ideally taken from the site, should be included.

this is kind of a variation of a "show HN", like "look what i found"

1123581321 · on July 11, 2022

I never heard of it and I clicked because the name intrigued me. You might benefit from a link preview extension to reduce the friction to explore links before comments.

jessmartin · on July 11, 2022

Adding the "subtitle" from the site itself would provide at least a bit of helpful context: "highlight, copy, and translate text from any image."

latchkey · on July 11, 2022

True, that said, it made it to the front page... and while not everything on the front page is super interesting, it does give it a bit of weight on the dice roll fun.

keyle · on July 11, 2022

Interesting but I thought my mac already does this? Maybe just M1 and Monterey.

I was trying to debug some image the other day, upon inspection I got confused by the shadow dom doing weird stuff, but only in safari; to then realise that it was macOS converting the text in the image to text in shadow dom! ... Good/bad experience report I suppose.

ref. https://support.apple.com/en-au/guide/preview/prvw625a5b2c/m...

irae · on July 11, 2022

Macs and iPhones have been doing selectable text for a while now. Especially if you use Safari instead of Chrome. All images have selectable text since Monterey and iOS 15. It existed to a lesser extent in previous years.

Some of the features Naptha aims to do are not part of it though, like translating, removing the text from the image, and some other right click behavior. On the other hand, Apple is using this without any connection, which Naptha says is available but with degraded quality and slower speed. Apple probably uses their optimized ML silicon, so I would imagine it is more battery efficient for using on the go.

mzs · on July 11, 2022

seems in English only

eastendguy · on July 11, 2022

The Mac built-in tool does not work on Youtube videos, which is one of my use cases for the Copyfish OCR tool. This, and the integrated translation.

modeless · on July 11, 2022

Android does this too, though you have to go into the app switcher for it to work. Super useful.

oittaa · on July 11, 2022

It would be great if Google allowed that everywhere. Does anyone know if it's on the roadmap?

modeless · on July 11, 2022

I think the extra step of going into the app switcher is not really an issue. It works for every app, and I actually like that I know whether I'm using the OCR or traditional copy/paste because OCR isn't perfect. When I copy from the app switcher I know I need to double check the results after pasting. If it just did OCR automatically all the time I'd probably be surprised after copying some text and realizing later that there was an OCR mistake in the result because I didn't realize that OCR had been used.

brian_herman · on July 11, 2022

Yes I think you can do it in preview too.

kbouck · on July 11, 2022

... and on iphone/ipad

https://support.apple.com/en-us/HT212630

jordemort · on July 11, 2022

This needs a year annotation - I almost emailed to express interest in a Firefox version until I noticed the references to Chrome 36 and Google+

eastendguy · on July 11, 2022

Copyfish is a good alternative to Project Naptha and works in Firefox:

https://ocr.space/copyfish

Ajedi32 · on July 11, 2022

Yeah, I wonder if there's a more modern version of this somewhere. On-device OCR is probably good enough now that a fully-offline-by-default version of this might make sense, and that's something I might actually use.

wanderlust2021 · on July 11, 2022

Copyfish uses offline OCR.

crorella · on July 11, 2022

hehe, I was thinking the same, then I read your comment.

I hope they add Firefox support in the future, getting text (and even modifying it!) is something I need to do often.

input_sh · on July 11, 2022

Opening via Firefox shows me this:

> Depending on the number of sign-ups, a Firefox version may be released in a few weeks. If you're interested in Naptha for other browsers, email me.

So they're at the very least considering it.

EDIT: Never mind, it says so since 2014: https://web.archive.org/web/20140425003753/https://projectna...

shubhamjain · on July 11, 2022

This makes me feel old. I saw it for the first time in 2014 on HN, just when it was announced. I felt kind of envious of Kevin Kwok, author of this project. I had just graduated and he was still an undergrad. He had already shipped so many complex projects, including a full-fledged Flash Animator[1] for the web.

It's pretty surprising to see that his site hasn't been updated since 2015 and not many projects have been shipped since then.

[1]: https://antimatter15.com/project/ajax-animator/

zxexz · on July 11, 2022

Ahh, I did an Algolia search just now and found the 2019 post[0] but thought that felt a bit recent for what I remember. I can't find that 2014 post, however - do you have a link?

[0] https://news.ycombinator.com/item?id=20919147

shubhamjain · on July 11, 2022

Here: https://news.ycombinator.com/item?id=7629396

gfd · on July 11, 2022

i remember the last time i saw his name was on https://news.ycombinator.com/item?id=14894653 (which i guess lost against tensorflow.js?) in around 2017 or so.

zxexz · on July 11, 2022

I thought this seemed familiar. Discussed on HN 3 years ago [0]

I remember using this for a while on a separate Chrome profile. It was quite useful, albeit quite CPU intensive.

[0] https://news.ycombinator.com/item?id=20919147

draugadrotten · on July 11, 2022

This extension is a privacy nightmare. "By default, when you begin selecting text, it sends a secure HTTPS request containing the URL of the specific image... The server responds with a list of existing translations and OCR languages that have been done."

That is some pretty sensitive data to keep around. There seems to be some rudimentary thinking around privacy: "no user tokens, no website information, no cookies or analytics" Yet keeping an index of all the image requests from any IP would not pass muster by any GDPR lawyer I have met.

http://my-support-group/advice-for-disease.jpg

http://my-political-group/campaign-ideas.jpg

http://my-therapy-group/suicide-prevention.jpg

https://ec.europa.eu/info/law/law-topic/data-protection/refo...

nprateem · on July 11, 2022

There are probably some hacking angles too, e.g. I wonder if the API will helpfully tell me the contents of https://mybank.com/user/latest-statement.jpg or whatever

emmelaich · on July 11, 2022

FWIW, there is a Chrome app for cloud vision, OCR.

https://chrome.google.com/webstore/detail/cloud-vision/nblmo...

By a Google employee I understand but not official Google product of course.

metadat · on July 11, 2022

Does Naptha still work? I recently reviewed all installed chrome extensions and it seemed broken, so I removed it.

Nition · on July 11, 2022

A related trick for text that you can't usually select: Hold Alt. For instance try using the mouse to select text in http://www.google.com with and without Alt.

tim-- · on July 11, 2022

I had this extension on my Chrome browser a few years ago, and was dumbfounded when I thought that Chrome had added the ability to not only OCR PNG files - but also replace the text in them!

Completely forgot that I installed this extension years earlier.

Crazy extension!

mcintyre1994 · on July 11, 2022

I love this idea, Apple recently added it for saved photos on iOS and I think in preview too? It doesn't seem to be working for me though. I highlighted some text, hit ctrl + c and got this in my clipboard:

<[ TEXT RECOGNITION IN PROGRESS / MORE INFO: http://projectnaptha.com/process/ (IDX:a:0-a:1-a:2-b:7&a:0&a:0&168&817:XDI) / ELAPSED 26.11SEC / DATE Mon, 11 Jul 2022 08:08:50 GMT / TEXT RECOGNITION IN PROGRESS ]>

The right-click translate doesn't seem to work either, it just selects a whole paragraph.

_bkyr · on July 11, 2022

It works on all images in Safari on macOS as well. It's actually been helpful but was very odd when all of a sudden the functionality appeared out of the blue.

irae · on July 11, 2022

Even though I watch WWDC every year and the iPhone release event, I was surprised by how it works, and how fast it is. Basically hovering my mouse over any image (outside of Google Chrome/Brave/Firefox) changes the cursor to selectable after half a second or so.

Took me a while to get used to. But it has been actually useful a number of times. My most common use-case (as a front-end developer) is when people send me screenshots of bugs, and I can select, copy, paste texts from the screenshots to find the source code files =)

wahnfrieden · on July 11, 2022

it has a full dev sdk as well

_tom_ · on July 11, 2022

Nice! This works slightly better than Apple's version, which makes it very hard to select the main image, once it detects text in the image. Naptha seems to handle this correctly!

ralfd · on July 11, 2022

This is a cool project, but this is a bit embarrassing:

> I started building a text recognizer algorithm specifically designed for Impact font, and it was actually working pretty well, but I kind of misplaced the code somewhere. So, until I find it or replace it, you'll have to use Tesseract configured with the "Internet Meme" language.

jzer0cool · on July 12, 2022

With a Chrome extension, are there any trust, security, or privacy issues in enabling an extension from an unfamiliar organization, such as leaking which sensitive images are being transferred?

I wonder also why such "convert" options are not made available or are they in the works?

jtth · on July 11, 2022

I find things like TextSniper to be more useful than this functionality, which has been in iOS and MacOS for a little bit. I use it more on hypothetically selectable elements that for whatever DOM-related reason aren't selectable than I use it on text in images.

LegitShady · on July 11, 2022

the front page still contains references to google+

rexreed · on July 11, 2022

This project is almost a decade old! I wonder why it resurfaced? Coincidentally or perhaps not the book in the post "How to do nothing with nobody all by yourself" also is a top HN post right now. Maybe related?

poulpy123 · on July 11, 2022

what I would like to find is a free or cheap way to OCR my handwritten notes. I know that handwriting is much more difficult but it would be possible to use supervised learning on my specific handwriting.

CoastalCoder · on July 11, 2022

I want something similar for digitization of photos from whiteboarding sessions.

I think it would be an awesome way to capture notes from design sessions while still allowing the fluidity of a real whiteboard.

pbhjpbhj · on July 11, 2022

OneNote's recognition of my handwriting--once described by a tutor as 'dogs dribble'--is nothing short of miraculous.

OneNote only does recognition for search and by text-block, you can't select as in the OP; would love that feature.

poulpy123 · on July 12, 2022

thanks I will have a look, I think my job has a license for it

holler · on July 11, 2022

Really cool! I went to try the translate feature but each time I left-click to open the nav menu (macOS) it deselects the text? I know it says right-click but for me it's left-click to open the menu.

a-dub · on July 11, 2022

last time i saw this the ocr component of it was tesseract (originally hp's c++ ocr engine, later acquired and open sourced by google, even later rewritten to use neural networks) compiled into webassembly.

avmich · on July 11, 2022

> Unfortunately, your browser is not yet supported

Latest version of Firefox? Seriously?

nit987 · on July 11, 2022

Hey, loved it. How much time it took to develop this?

adastra22 · on July 11, 2022

> Unfortunately, your browser is not yet supported

Safari on macOS.

terramex · on July 11, 2022

Safari has its own, built-in implementation of that called Live Text since last year. Just try highlighting any text in the image and voila - it works.

https://support.apple.com/en-au/guide/preview/prvw625a5b2c/m...

chrisseaton · on July 11, 2022

Ironic, because Safari already does this natively.

WesternWind · on July 11, 2022

Same with Firefox on MacOS here.

irae · on July 11, 2022

Very interesting that Firefox would have this level of integration with the OS. Firefox of old was criticized specifically for being completely not native to macOS. Times change. I might give Firefox a shot in the near future.

adastra22 · on July 11, 2022

I don’t think he is saying it is working?

WesternWind · on July 14, 2022

Sorry yes, it doesn't work. It only works on Chrome afaik.

moneywoes · on July 11, 2022

What's the catch?

solardev · on July 11, 2022

Today it's replacing the text in PNGs.

Tomorrow it's replacing us. Duh duh duh...