Perplexica: Open-source Perplexity alternative

yosito · 2024-05-24T06:46:17 1716533177

It would be awesome if this could also search my Obsidian notes at the same time, and if it worked seamlessly on all of my devices.

nine_k · 2024-05-24T13:22:36 1716556956

In this regard, maybe JetBrains should dust off Omea [1] and add an appropriate LLM to it.

[1]: https://www.jetbrains.com/omea/

tony-vlcek · 2024-05-24T07:31:54 1716535914

Logseq user here with an upvote.

anewhnaccount2 · 2024-05-24T07:38:20 1716536300

I guess it didn't eat your notes yet: https://discuss.logseq.com/t/data-loss-happened-twice-i-cant...

jddj · 2024-05-24T10:36:33 1716546993

I've had Google photos eat ~12 months worth of pictures, I don't really trust anyone to keep my data safe.

Logseq has been fine for me over several years, but it also makes it extremely easy to auto-commit to git.

jhund · 2024-05-24T12:51:24 1716555084

Logseq's git auto-commit is a great insurance policy and should make recovery a breeze.

eichin · 2024-05-24T15:21:14 1716564074

Ah, those appear to be entirely multi-device-sync problems. (I use logseq with git-autocommit for storage and backup - but since the multi-node sync stuff wasn't available for self hosting anyway, I've never tried it, and thus dodged the problem entirely. Obviously for a lot of people multi-device use is the entire point, but for some of us, logseq is "just an editor"...)

spiralk · 2024-05-24T20:28:46 1716582526

I actually do something like this with my logseq notes. Since all of the files are .md in a directory, one can load them all in to a vectordb and use it for RAG. Logseq has an API too but using the .md files is easy.
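A minimal sketch of that idea, assuming the notes are plain .md files under one directory. It is not the commenter's actual setup: a bag-of-words cosine similarity stands in for a real embedding model and vector DB, and the names `load_notes` and `search_notes` are made up for illustration.

```python
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def load_notes(notes_dir):
    """Read every .md file under notes_dir into a {path: text} dict."""
    return {
        str(path): path.read_text(encoding="utf-8")
        for path in sorted(Path(notes_dir).rglob("*.md"))
    }


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_notes(notes, query, top_k=3):
    """Return up to top_k (score, path) pairs with a nonzero score, best first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(text))), path)
        for path, text in notes.items()
    ]
    scored.sort(reverse=True)
    return [(score, path) for score, path in scored[:top_k] if score > 0]
```

The top-scoring notes would then be pasted into the LLM prompt as context. Swapping the term counts for real embeddings should only change `cosine`'s inputs, not the overall shape.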

reckless · 2024-05-24T06:16:34 1716531394

Looks similar to what I've been using for a few weeks https://github.com/miurla/morphic

number6 · 2024-05-24T07:24:44 1716535484

Is it worth the install or is it just a gimmick?

hackerlight · 2024-05-24T07:44:48 1716536688

No need to install, go to www.morphic.sh

insane_dreamer · 2024-05-24T20:59:59 1716584399

just tried it; sweet!

octobus2021 · 2024-05-24T15:03:02 1716562982

I absolutely love this and will try as many as possible very soon. I think "intelligent search" (asking LLM questions to search on the Web by communicating, preferably by voice) is one of the few solid use cases for LLM. I hate the idea of having this happen in the cloud with someone having my data, so doing this locally with my local LLM would be ideal.

nilkn · 2024-05-24T18:42:18 1716576138

Even after the release of GPT4o, Perplexity Pro with Claude 3 Opus is by far my most used LLM application. For me, the writing quality of Claude 3 combined with a wider variety of information sources makes it far surpass raw ChatGPT for most non-creative/non-interactive tasks.

bufo · 2024-05-24T21:38:16 1716586696

I recommend Phind.com, it’s been much better and faster for me than Perplexity Pro. I typically use their custom 70B model but you can also use GPT4 o or Turbo, or Claude 3 Opus.

abdullahkhalids · 2024-05-24T18:35:58 1716575758

What would be even better is if it could also search my local repository of ebooks and PDFs. Most of the stuff I do needs serious answers from books or papers I have already selected. Random webpages on the web don't cut it.

Citing the book section/page/paragraph would be magic.

sc077y · 2024-05-29T10:16:51 1716977811

This is 100 percent doable. Building something like this at scale might be a pain but locally it's fairly easy.

applied_heat · 2024-05-25T16:27:52 1716654472

I used to use qiqqa for local full text search of pdf library, I think the world has moved on to mendeley, and paperless(-ng) I believe performs the same function

jahewson · 2024-05-24T16:45:40 1716569140

The web search itself is still happening in the cloud though? And instead of searching one provider it now searches multiple… not sure how much better this is really.

berz01 · 2024-05-24T16:04:42 1716566682

[flagged]

dang · 2024-05-24T18:00:22 1716573622

Please don't be snarky or post in the flamewar style to HN. We're trying for something else here: https://news.ycombinator.com/newsguidelines.html.

idle_zealot · 2024-05-24T16:46:44 1716569204

You're being downvoted, and I think the reason is this: there is a perceived difference between intentionally contributing something like a post on HN and having your personal searches collected by Google.

octobus2021 · 2024-05-24T17:48:38 1716572918

I would replace "perceived" with "significant" but yeah, pretty much.

allanrbo · 2024-05-24T05:44:35 1716529475

First I’m hearing of the meta search engine SearxNG too. Neat. Feel like we’ve come full circle, going back to meta search engines again.

sorokod · 2024-05-24T06:20:42 1716531642

Same here, here is a list of public instances[1]. Docs link [2].

[1] https://searx.space/

[2] https://docs.searxng.org/
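For anyone who wants to script against an instance, here is a rough sketch of calling SearxNG's search endpoint with `format=json`. Assumptions: the instance has the JSON output format enabled in its settings (many public instances do not), the address `http://localhost:8080` is a placeholder for a self-hosted instance, and the helper names are made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(instance_url, query, page=1):
    """Build the /search URL asking the instance for JSON results."""
    params = urlencode({"q": query, "format": "json", "pageno": page})
    return f"{instance_url.rstrip('/')}/search?{params}"


def search(instance_url, query, page=1, timeout=10):
    """Fetch one page of results as a list of (title, url) pairs.

    Assumes the response is a JSON object with a "results" list whose
    items carry "title" and "url" keys.
    """
    url = build_search_url(instance_url, query, page)
    with urlopen(url, timeout=timeout) as resp:
        data = json.load(resp)
    return [(r.get("title"), r.get("url")) for r in data.get("results", [])]


# Example (placeholder address, requires a running instance):
# for title, url in search("http://localhost:8080", "open source search"):
#     print(title, url)
```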

rashadphil · 2024-05-24T14:45:33 1716561933

Here is another open-source alternative: https://github.com/rashadphz/farfalle (Disclaimer: I made it)

asadalt · 2024-05-24T15:30:47 1716564647

which one is better

sc077y · 2024-05-29T10:11:34 1716977494

Very interesting. I'm building a RAG chatbot and I haven't done the inline citations yet. I honestly thought it was a lot more complicated than just telling the LLM to cite with a number and then putting numbers next to the sources. I did something to that extent as kind of a joke and it worked, but the LLM didn't always listen. I thought either post-processing (checking cosine distance between sentences and retrieved chunks) or function calling would be the way to go.
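Both ideas fit in a few lines. The sketch below is illustrative only: the prompt wording, the 0.2 threshold, and the function names are assumptions, and word-count cosine similarity stands in for real sentence embeddings. It also assumes each citation sits before the sentence-ending punctuation.

```python
import math
import re
from collections import Counter


def build_prompt(question, chunks):
    """Number each retrieved chunk and ask the model to cite by number."""
    sources = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the sources below. "
        "End each sentence with the number of its source, like [1].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


def _vec(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(c * b[t] for t, c in a.items() if t in b)
    na = math.sqrt(sum(c * c for c in a.values()))
    nb = math.sqrt(sum(c * c for c in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def check_citations(answer, chunks, threshold=0.2):
    """Post-process an answer: is each cited chunk similar to its sentence?

    Returns a list of (sentence, cited_number, is_supported) tuples.
    Citations pointing at a source number that does not exist are unsupported.
    """
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        text = re.sub(r"\[\d+\]", "", sentence)
        for match in re.finditer(r"\[(\d+)\]", sentence):
            n = int(match.group(1))
            in_range = 1 <= n <= len(chunks)
            ok = in_range and _cosine(_vec(text), _vec(chunks[n - 1])) >= threshold
            results.append((sentence, n, ok))
    return results
```

Sentences flagged as unsupported could be dropped, re-asked, or shown without a citation marker. Word overlap is a weak signal, so a real system would likely want embeddings here.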

behnamoh · 2024-05-24T16:43:31 1716569011

It was about time someone made an alternative to Perplexity.

spdustin · 2024-05-26T01:22:26 1716686546

I want to like it. But not supporting deployments and closing tickets submitted by people who are trying to get it running in their homelab turned me right off. The configuration shouldn't be that fragile.

Cilvic · 2024-05-24T09:16:38 1716542198

This is cool. My biggest question was "does it work?" Then I had another look at the repo and saw the "Repocloud" one-click deployment, and it's quite well done. Apart from signing up for the Repocloud account ($3 free credit) and waiting for the deployment (5 mins)... I'm now waiting for my first answer, which doesn't seem to come through, and there are not many ways to troubleshoot as far as I can see... I've asked on Discord.

KRAKRISMOTT · 2024-05-24T06:09:11 1716530951

Can you add support for Serp API? I prefer to pay for a managed proxy farm instead of using SearxNG which requires too much babysitting.

throwanem · 2024-05-24T21:24:58 1716585898

Oh interesting, with the collapse of Google result quality lately I've been thinking about trying out SearxNG in my homelab. If you want to expand on the headaches you've run into, I'd be interested to hear!

fudged71 · 2024-05-24T16:05:16 1716566716

Are there any benchmarks to compare these online research agents? There’s so many to choose from now but it’s hard to compare them

sanjayk0508 · 2024-05-24T09:29:38 1716542978

There have been many other good alternatives to Perplexica before.

michelsedgh · 2024-05-24T09:57:42 1716544662

Care to share which ones?

2024-05-24T10:01:10 1716544870

[dead]

bravura · 2024-05-24T13:38:11 1716557891

Why doesn't this site have any way to contact the maintainer?

Even their TOS makes it seem like they aren't an actual company (the counterparty is "RepoCloud.io")

jon309 · 2024-05-24T22:30:12 1716589812

Super cool! I would love if we could make this serverless and easily deployable with CDK or Terraform. Maybe I’ll take that up as a side project, who knows!

jakozaur · 2024-05-24T10:27:09 1716546429

Sorry to say, but this looks like a trademark violation. Though the project may be cool, it immediately put me off:

https://www.trademarkia.com/perplexityai-98400215

I'm not a lawyer, but trademarks are well protected. You can't provide similar services and confuse customers by using almost identical names. Don't do Gooogle search engine, Macrosoft OS, etc.

If they get traction, Perplexity could force them to rebrand.

Terretta · 2024-05-24T10:50:01 1716547801

Perplexity is an information theory term, not a brand:

Perplexity of a probability model -- A model of an unknown probability distribution p, may be proposed based on a training sample that was drawn from p. Given a proposed probability model q, one may evaluate q by asking how well it predicts a separate test sample x1, x2, ..., xN also drawn from p.

https://en.wikipedia.org/wiki/Perplexity
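To make the definition concrete, here is a small sketch of the usual formula: the perplexity of q on the test sample is the exponential of the negative average log-probability, exp(-(1/N) * sum(log q(x_i))). The function name and the list-of-probabilities input are illustrative choices, not anything from the linked article.

```python
import math


def perplexity(probabilities):
    """Perplexity of a model on a test sample.

    probabilities[i] is q(x_i), the probability the model assigned to the
    i-th test item. Lower is better; 1.0 means every item was predicted
    with certainty.
    """
    if not probabilities:
        raise ValueError("need at least one probability")
    avg_log = sum(math.log(p) for p in probabilities) / len(probabilities)
    return math.exp(-avg_log)
```

A model that assigns probability 0.25 to every test item has a perplexity of about 4: it is as uncertain as a fair choice among four options.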

llamaimperative · 2024-05-24T11:27:07 1716550027

Not how the law works. I’m not certain Perplexity has trademarked their name but the question of whether it’s an information theory term or not wouldn’t prevent them from doing so, nor would it prevent them from defending that trademark.

Engineer-y people trying to interpret law has to be one of the most reliably silly things on HN.

michael-ax · 2024-05-24T11:29:38 1716550178

Have you ever tried to trademark a random noun?

marcinzm · 2024-05-24T11:33:28 1716550408

No but lots of other people have: https://tmsearch.uspto.gov/search/search-results

Feel free to release a computer named Apple to prove me wrong.

michael-ax · 2024-05-24T13:27:36 1716557256

Alright, read up on domains, then try arguing that 'perplexity' as company and noun are in different spaces! I grant you that if they were, the company could trademark that noun. But it seems clear that Perplexity named itself after the noun and by so doing gave up the option of trademarking its company name.

mdp2021 · 2024-05-24T11:33:38 1716550418

> Engineer-y people trying to interpret law

It must come from how perplexing the apparent hiatus between legitimacy and positive law can be.

marcinzm · 2024-05-24T11:25:25 1716549925

That doesn't mean in any way that it can't be a legal trademark.

calny · 2024-05-24T13:31:01 1716557461

I’m an IP lawyer & AI dev: my first reaction was, “hmm there are trademark issues here.” From a US perspective: “Perplexity” certainly CAN be a trademark, and the company has applied for one—to my knowledge it’s still pending. If the term was merely “descriptive” of the service provided, like “American Airlines”, then the company would need to show that the term has acquired distinctiveness: ie, that purchasers associate the term with that specific company. But perplexity is probably more than merely descriptive here.

Assuming that they have a valid trademark, the issue becomes whether there is a likelihood of confusion between Perplexity and Perplexica. That is a fact-specific, multifactor test, which I’ll spare you. But there could be arguments both ways IMO

EDIT: trademark issues aside, cool project!

jay-barronville · 2024-05-25T00:02:06 1716595326

HN is so incredible. The topic can be just about anything and there’s someone here with just the right expertise and/or set of skills to share their two pennies. The current topic is AI and IP law and here comes someone who’s an IP lawyer and AI engineer. I truly love this place.

michael-ax · 2024-05-24T11:28:22 1716550102

Which is why Trademarks are a non-issue here. My bet is that the Devs understood that.

tcsenpai · 2024-05-24T13:16:06 1716556566

I have been waiting for this moment for months. Sir, you are the GOAT

webprofusion · 2024-05-24T07:49:43 1716536983

When making an alternative to something, don't reference the name of the thing you're copying if that thing has (or can afford) a legal team to protect their brand. If your product can reasonably be confused with the original (it can) they will eat your soul.

iforgotpassword · 2024-05-24T08:45:55 1716540355

Huh? So reactos shouldn't say they build an alternative to Windows? As long as you build it yourself and don't steal any resources or secrets, there is no problem mentioning that it's an alternative or replacement for another product. What's much more dangerous is picking a name for your own product that resembles the original.

twobitshifter · 2024-05-24T11:17:08 1716549428

More that they should not have called it Windowz

rcxdude · 2024-05-24T17:22:08 1716571328

You can reference the competitor, but you don't want there to be any risk that a moron in a hurry might confuse your product with theirs, else you're in for a trademark violation.

TeMPOraL · 2024-05-24T08:52:25 1716540745

Wonder how Gitlab survived next to Github then. To first approximation, the names are the same, and so are the products...

digi59404 · 2024-05-24T12:02:05 1716552125

Git is a registered trademark of neither GitLab nor GitHub. Both GitLab and GitHub have negotiated the usage of the Git trademark. Provided they follow the rules set out for them, they can continue to use it.

As an employee of one of them I personally bought the git.new domain. I paid a good chunk for it and was going to build a new project template builder on it. I got... talked to by legal about this, because as an employee it actually violated one of those rules.

So that’s the how, and why I know.

jakozaur · 2024-05-24T10:32:23 1716546743

The dispute happens only if one party owns the trademark and sends a Cease & Desist letter. Different companies have different approaches to aggression here.

Second, it has to prove that it confuses customers (e.g. if you pick ten end users and do tests if they find that confusing). Maybe a sophisticated tech audience is better at finding differences than the general public.

mihaic · 2024-05-24T09:34:44 1716543284

Both of these are built on top of git, an open source project, so Gitlab is not a riff on Github. Perplexica on the other hand seems like a direct reference to Perplexity, not on the concept of being perplexed by something.

wrasee · 2024-05-24T10:34:19 1716546859

Yet the way git is used is still similar. Both lead with ‘git’ in their name, both append a pithy three letter suffix to ‘git’ that both describe some kind of space where people meet to do stuff. Surely that’s more than just coincidence.

TeMPOraL · 2024-05-24T10:11:41 1716545501

Isn't "Perplexity" itself a direct reference to a machine learning term that, among other things, is very relevant to large language models, on top of which Perplexity is built?

IanCal · 2024-05-24T11:36:04 1716550564

That's a far more tenuous link than "gitlab hosts git repos".

qxfys · 2024-05-24T08:00:30 1716537630

michelsedgh · 2024-05-24T08:37:01 1716539821

Actually I loved it. I don't think they have any grounds to sue. It's different and close enough. Also, they wouldn't sue a project on GitHub; if they do, they show their faces and it's worse for them. Also, many forks will happen and they would have to sue many. Worst case, you change the name of the repo. That's the power of open source ;)

tartrate · 2024-05-24T09:44:21 1716543861

Isn't Yuzu a good counter example?

theturtletalks · 2024-05-24T11:23:12 1716549792

Yuzu’s downfall was not the repo, it was their Discord. They were sharing DRM cracking keys on there and getting paid $30K/month on Patreon. It’s the same reason most emulators require you to bring your own BIOS.

freehorse · 2024-05-24T11:17:22 1716549442

It does not sound relevant to me, because that was a case of "video game piracy". It was not about the name per se.

pants2 · 2024-05-24T06:27:03 1716532023

I made my own version of this for personal use some time ago, it's a fun project! I use Kagi for the search backend and Colly/ScrapingFish (which has plans starting at $2) for getting the content. Both work really well!

mdp2021 · 2024-05-24T11:28:45 1716550125

An article would be nice...

tremarley · 2024-05-24T07:19:50 1716535190

Release it please

chakintosh · 2024-05-24T10:42:00 1716547320

I've been using Perplexity for months now on the free tier (with the 5 Pro searches per 4 hours) and it's been plenty for me; it has completely replaced Google for me. So I'm not sure where Perplexica fits in my use case, especially since I'd have to install and maintain it and use lesser models than Perplexity's.

hosh · 2024-05-24T14:34:30 1716561270

Some people want to self-host this technology. AI is very powerful, and not everyone wants that to be controlled by large corporations or institutions.

dcreater · 2024-05-24T14:01:24 1716559284

Anyone used it yet? Was posted here a while back. I'm interested to hear whether it works and how good it is rather than many "this looks great" comments. Perplexity.ai itself has been pretty poor for me after I got past the honeymoon phase

throwaway_ab · 2024-05-24T08:26:41 1716539201

[flagged]

zone411 · 2024-05-24T08:41:32 1716540092

Not only that, but it opens the project up to having to deal with a trademark cease and desist letter and then having to rebrand. Perplexity would be obligated to send one in order to protect its trademark if they become aware of this. How are seemingly decent software developers so unaware of anything besides coding?

jasonvorhe · 2024-05-24T08:32:46 1716539566

So that people can more easily guess what it's about.

It's not like some multinational is saving on advertising by stealing a small company's existing brand recognition. Most users of the "original" will most likely never hear of it nor care enough to setup some docker stuff and do their searches locally.

michelsedgh · 2024-05-24T08:32:40 1716539560

Thank you so much for posting this and ofc the creators. My brother and I were in a debate and this just proved my point. Feels real good to see it. Cant wait to try it ;)

hackernewds · 2024-05-24T05:41:07 1716529267

How is this related to Perplexity?

viraptor · 2024-05-24T06:21:21 1716531681

Does the first paragraph of the page answer the question?

asadm · 2024-05-24T06:13:36 1716531216

It's an open source version of Perplexity.ai

rvz · 2024-05-24T11:34:08 1716550448

Both Perplexica and Perplexity are bad names for a search engine.

Very perplexed as to who was the smart person that chose this dreadful name for the company.

Yes, it has another definition in the context of information theory, but my point is that I used the first definition, like a normal person would, which is commonly associated with...

'...a state of confusion or a complicated and difficult situation or thing.' - Cambridge English Dictionary [0]

None of them can ever become a verb that makes sense like 'google it'.

[0] https://dictionary.cambridge.org/dictionary/english/perplexi...

rors · 2024-05-24T14:47:28 1716562048

Perplexity is a term from information theory. It's one measure of the quality of an LM, i.e. how perplexed is my model? To an experienced researcher it's a unit of measurement like metres or kg. https://en.wikipedia.org/wiki/Perplexity

I agree that it doesn't transfer out of that specialised domain.

Aloisius · 2024-05-24T16:13:58 1716567238

Eh. Still a weird name given one generally wants to reduce perplexity.

Might as well call it Uncertainty.

robertlagrant · 2024-05-24T13:25:43 1716557143

I Encartered the concept of verbs and I agree.

mdp2021 · 2024-05-24T15:52:33 1716565953

'Perplexity' is "through the complexity".

mdp2021 · 2024-05-26T14:39:49 1716734389

'Perplexity' is "through the complexity",

which is much better than "proplexity" (see e.g. "profanity") - where you never entered the complexity at all.