Teclis – Non-commercial web search (teclis.com)
239 points by samcrawford on March 23, 2022 | 79 comments



Lots of great new search engines popping up that search the rest of the web that Google tends to ignore.

Other ones worth checking out include:

- https://search.marginalia.nu/ (A non-commercial search engine)

- https://wiby.me/ (Tends to have those really weird and cool indie sites)

- https://searchmysite.net/ (An index of personal websites)

- https://indieweb-search.jamesg.blog/ (Search IndieWeb websites)

- https://millionshort.com/ (Ignore the first million results from Google)


I think this is among the more complete lists:

https://seirdy.one/2021/03/10/search-engines-with-own-indexe...


I was thinking of making a small site showcasing indie search engines or small search engine projects ig


Wish someone would meta search these search engines


There's fairly aggressive bot activity when you run a search engine, which is why many of us are behind cloudflare or similar bot-mitigation services.

Best guess is the bots are attempting to manipulate Google or Bing's suggestion algorithms this way (since a lot of small search engines are basically just forwarding results from bigger engines). Unfortunately they can be very aggressive to the point of almost bringing down your service. I saw up to 30k searches per hour before I cried uncle and got behind cloudflare.

I do offer an API, and I suspect some others in the indie search space might as well (or might at least be willing to set one up). That way I can rate limit on a per-consumer basis without affecting everyone.
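Per-consumer rate limiting of the kind described above is often done with a token bucket. The following is a minimal sketch, assuming consumers are identified by an API key; the class name, parameters, and key scheme are hypothetical and not taken from any of the engines mentioned.

```python
import time


class TokenBucket:
    """Per-consumer token bucket: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.buckets = {}  # api_key -> (tokens, last_seen_time)

    def allow(self, api_key):
        now = self.clock()
        tokens, last = self.buckets.get(api_key, (self.capacity, now))
        # Refill according to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[api_key] = (tokens - 1, now)
            return True
        self.buckets[api_key] = (tokens, now)
        return False
```

Because each key has its own bucket, one aggressive consumer running dry does not affect anyone else's quota.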


Searx and SearxNG support a few of these, but instances get blocked and unblocked left and right.

eTools is a metasearch engine that uses commercial APIs for its search providers, so it does not get blocked. However, each search is a bit expensive for eTools, so users might get blocked by eTools instead.
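For the merging step of a metasearch engine, one common technique (not something described in this thread) is reciprocal rank fusion. A minimal sketch, assuming each engine returns an ordered list of URLs; the function name and the constant k=60 are conventional choices, not settings of Searx, SearxNG, or eTools.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked URL lists; each list contributes 1 / (k + rank) per URL."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

URLs that several engines agree on rise to the top, without needing the engines' raw scores to be comparable.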



+1 for andisearch also. Totally different approach. Have barely seen it talked about on here. The results are dope, however. I very much appreciate marginalia as well. I applaud the programmers making new and interesting tools in this space.


Credit to them for trying some new things on the UI front, but it looks like the organic results are from Bing (like most other alt/privacy search engines). It would be interesting to learn more about how/if they plan to build their own index, or set themselves apart.

IMO, Kagi and Brave search are the two best alternative general search engines right now.

Runnaroo was pretty good as well ;-)


I think Mojeek is a better alternative "general" engine than Brave because Brave's ranking algorithm was optimized against Google SERPs (back when it was called Cliqz), making many SERPs too similar to Google's.

Kagi uses a mix of Teclis and other engines (claims to use Bing and Google) but the ability to adjust ranking yourself is its wild card. Neeva is similar, combining its own index with some ability to influence ranking.

But personally, I'm trying to reduce my use of "one engine to rule them all" and instead use specific engines for specific tasks they're good at.


Mojeek is excellent, and because they use 100% their own index they have a much higher hill to climb.

When I say "Best" for a general search engine, my definition is that it would fulfill the needs of myself and my non-technical family members. Kagi and Brave Search both do that while being different enough to not be just another Bing clone. I use Mojeek often, think it is great, and having their own index is a tremendous asset, but it doesn't quite meet that full definition yet.


An additional thing of interest. The text of the results such as heading and summary on andisearch is not the same as the other search engines. I think it displays more of each web page's content from its index.


Interesting. With search 'java lambda function equivalent' I see results different to Bing, Google and DuckDuckGo. Greater difference still for "elon musk latest news". My guess, they use Bing where they lack their own index.


Curious as to how these engines accept new sites into their ranks. A big problem most of us have had is spammy results outranking everything else by building large, fake networks of sites that boost each other's rankings via interlinking. Many of the higher end networks are undetectable, as they have legitimate content and never link to more than 1 other internal site (among a mix of external sites, some affiliated with still other networks).


I use Personalized PageRank giving disproportionate voting power to a bunch of people whose entire identity is their passionate dislike for the commercial web. To get a really high rank in my search engine, you need to convince those people to link to your site.
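A minimal sketch of Personalized PageRank, assuming a small in-memory link graph. All restart probability goes to a trusted seed set, which is what gives those sites disproportionate voting power. The function name, the damping value of 0.85, and the dangling-node handling are illustrative assumptions, not the commenter's implementation.

```python
def personalized_pagerank(links, trusted, damping=0.85, iterations=50):
    """links: dict node -> list of outgoing nodes. trusted: non-empty set of seeds."""
    nodes = set(links) | {t for targets in links.values() for t in targets} | set(trusted)
    # Teleport vector: the random surfer only ever restarts at a trusted seed.
    teleport = {n: (1.0 / len(trusted) if n in trusted else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: hand its rank back to the seeds.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * teleport[n]
        rank = new_rank
    return rank
```

In this sketch a cluster of sites that only link to each other, and that no seed-reachable site links to, ends up with zero rank no matter how densely it is interlinked.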


One approach is to have human testers. When a low quality site gets high rank, you investigate in detail how that happened and downrank the linking sites.

It shouldn't be that hard to find the bad network if you're systematically investigating all the time. Google has people testing search results often.

The problem is that this is fairly expensive. But quite possibly not the largest cost a search engine would have.



Wiby has been a great experience. I love being able to contribute sites to it, too.


http://yup.is for business news


Curious if anyone has recommendations for great enterprise search?


Algolia. It is used here for the search bar at the bottom of the page.


It is insanely expensive though. I think all of your indexes are stored in RAM.


you.com is worth checking out


Hey all - creator here. It looks like the next page of results does not work currently because of a wrong query param (it should be "q" instead of "topics"). Easy enough to manually change if you need it.
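The manual workaround amounts to renaming one query parameter. A small sketch using only the Python standard library; the `page` parameter in the usage example is a hypothetical stand-in, since the thread does not show the actual next-page URL.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def fix_next_page_url(url):
    """Rename a 'topics' query parameter to 'q', leaving other parameters alone."""
    parts = urlparse(url)
    params = [("q" if key == "topics" else key, value)
              for key, value in parse_qsl(parts.query)]
    return urlunparse(parts._replace(query=urlencode(params)))
```

For example, `fix_next_page_url("http://teclis.com/search?topics=sysadmin+horror+stories&page=2")` rewrites the query string to `q=sysadmin+horror+stories&page=2`.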

As a few of you noticed, narrow searches do not work very well because this is not a general web search engine and has a tiny index compared to Google. Use Teclis to discover more about a broader topic you are interested in and to discover writing from 'clean' websites on the web.

Looking forward to feedback to improve!


> As a few of you noticed, narrow searches do not work very well because this is not a general web search engine and has a tiny index compared to Google. Use Teclis to discover more about a broader topic you are interested in and to discover writing from 'clean' websites on the web.

Are you getting better results with vector search?

I've been looking at this problem with my search engine as well. I've recently side-loaded all of stackoverflow and stackexchange, and searching in that part of the index is still not great at finding narrow results like you can on bigger search engines, when that reasonably speaking should be possible.

I think, beyond the fact that my index is DIY and fairly crude, algorithms like BM25 are designed to identify topical keywords, and they do that rather well, but narrow searches go far beyond merely the topic and often involve words that aren't important to the document but are important to some particular context within it.
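For reference, a minimal sketch of BM25 scoring over pre-tokenized documents. It uses the non-negative IDF variant and the common defaults k1=1.2 and b=0.75; these are assumptions for illustration, not the commenter's index code. Each term's contribution depends only on its rarity and its frequency in the whole document, which fits the observation that the score captures topic rather than local context.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents need more hits to score the same.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```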

I may have some ideas to get around this, but they're fairly half baked. Experiments are needed.


Not OP but I am working on a search engine with vector ranking. Why do you say that vector search would help with narrow queries? In my experience, semantic search helps broaden the query to search for adjacent ideas without exact term matches.

Hybrid approaches that use vector search for broad matches and rerank using BM25 could be what you’re looking for. See https://blog.vespa.ai/efficient-open-domain-question-answeri...


> "Hybrid approaches that use vector search for broad matches and rerank using BM25"

Hybrid approaches, e.g. Learning To Rank, normally do it the other way around, given the main benefit of hybrid is to mitigate the cost (time) of vector search, i.e. use a non-vector search (e.g. BM25) to get a broadly relevant set of results first (and quickly), and then the much more computationally expensive vector search to rerank the smaller results set. There are various approaches to try to make vector search more viable across large corpuses, e.g. Locality Sensitive Hashing and Approximate Nearest Neighbour Search, but if you've implemented one of those then I'm not sure there'd be any benefit in retaining a hybrid approach.
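The two-stage ordering described in this comment can be sketched as follows. This is a toy illustration under stated assumptions: plain term overlap stands in for BM25 in the first stage, the document vectors are supplied by the caller (no embedding model is assumed), and all names are hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_terms, query_vec, docs, first_stage_k=100, final_k=10):
    """docs: list of dicts with 'id', 'terms' (set of str) and 'vec' (list of float)."""
    # Stage 1: cheap lexical filter. Term overlap stands in for BM25 here.
    lexical = [(len(query_terms & d["terms"]), d) for d in docs]
    ranked = sorted(lexical, key=lambda pair: -pair[0])
    candidates = [d for score, d in ranked if score > 0][:first_stage_k]
    # Stage 2: the expensive vector comparison runs only on the small candidate set.
    reranked = sorted(candidates, key=lambda d: -cosine(query_vec, d["vec"]))
    return [d["id"] for d in reranked[:final_k]]
```

The cost of the vector stage is bounded by `first_stage_k` rather than by the size of the corpus, which is the time saving the comment refers to.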


> Why do you say that vector search would help with narrow queries?

I was just asking whether he'd seen better results. I haven't experimented very much with it on my search engine. It's as crude as they get, and in part I want to see how far I can push old fashioned 1970s search algorithms :P


Vector search is good for broad searches. Narrow searching is a problem of crawling, not ranking IMO. Teclis crawls a very particular and small portion of the web, which is the main reason it cannot find results for more specific searches.


Thanks for making Kagi! I hope you and your team can figure out a way to make a flat monthly fee feasible so I can continue using the site!


I was a little surprised to see Fandom.com results come up in one of my test searches, given that they are notorious for being very far from "clean" (I counted 25+ uBO blocked when checking the page in Vivaldi, which is far above the threshold of 5 mentioned on your page). Might be worth looking at in more detail.

Also, Marginalia Search link on front page is broken.


Teclis is the name of a High elf wizard in Warhammer (a miniature fantasy strategy game). Is that where the name comes from?


Yes, although I was more of a Wood elf player in WHFB.


Really great work! It is exciting that people are working on alternative indexes of content, especially ones that prioritize content written by individuals for smaller audiences. The uBlock heuristic is an interesting way to capture that.

Matches well with our thesis we wrote about here: https://re-search.xyz/writing/mapping-the-new-world-towards-...

Disclaimer: We’re a research group that is also working on a new kind of search engine. Our approach is a little different though. We think that information is now scattered across different semi-open silos, so the future of search will not look like a search bar and ten blue links to web pages.


I'm also very interested in new paradigms to explore the internet. I've built a sort of explorable graph of adjacent websites based on my search engine database, was on Show HN a while back:

https://search.marginalia.nu/explore/news.ycombinator.com

If you click 'similar' under any site, you get a list of its neighbors.

I think it would be neat to extend the metaphor not to just websites, but ... I dunno, something more general, links, topics, what have you. Like a browsable web of connected things. Maybe like with a bookmarking or annotation system. I think it could be super neat. Still a bit of a hand-wavy idea, but I want to build it, or someone else to build it.

I do think the search box is a bit limiting.


Yep, graph structure of topics enables users to wander around topic space. Good for less directed, more exploratory searching.

Another graph that is useful is the graph of people -> topic clusters. See https://twitterverse.net/ . Such a graph can help rank content from people deeply invested in a particular topic, and it's hard to fake because they would presumably have to trick all their peers about their expertise.

We'll have our own Show HN soon but it's great to see similar ideas bouncing around. Would love to connect over email to learn more about your thoughts.


Yeah, I think there's real room to build something very cool and useful in the space of exploration and discovery.

Do shoot me an email, my address is in my profile.


> The way detection works is we count the number of uBO blocked requests on the page, and if too many (threshold is set to 5), we kick it out, leaving only "clean" pages in the index.

I'm genuinely surprised there were any pages left to crawl.

Unfortunately this also kicks out genuinely useful blogs and other pages that are otherwise helpful but happen to be using a platform or framework that makes a few block-worthy requests.

I can't figure out if all of Wikipedia is in the removed set or just ranked too low to show up in results. On the browser, the site seems clean.
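The detection rule quoted above can be sketched roughly like this. It assumes the crawler has a list of the request URLs a page made, approximates uBlock Origin's filter lists with a plain hostname blocklist (real filter lists are much richer), and treats "too many" as strictly more than the threshold, which the quote leaves ambiguous. The hostnames are placeholders.

```python
from urllib.parse import urlparse

# Hypothetical blocklist standing in for uBlock Origin's filter lists.
BLOCKED_HOSTS = {"ads.example.com", "tracker.example.net"}


def count_blocked(request_urls, blocked_hosts=BLOCKED_HOSTS):
    """Count requests whose hostname is, or is a subdomain of, a blocked host."""
    blocked = 0
    for url in request_urls:
        host = urlparse(url).hostname or ""
        if any(host == b or host.endswith("." + b) for b in blocked_hosts):
            blocked += 1
    return blocked


def is_clean(request_urls, threshold=5):
    # Assumption: a page is kicked out only when it exceeds the threshold.
    return count_blocked(request_urls) <= threshold
```

A hard cutoff like this also explains the concern raised here: a useful page on a platform that makes six blockable requests is treated the same as an ad farm.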


> I can't figure out if all of Wikipedia is in the removed set

Turns out it was filtered out by mistake, back now!


This is such a fantastic search engine. Obviously not perfect, but the search results are information rather than blogspam/ads/etc. Breath of fresh air.

Funnily enough I somehow ranked #1 for "ADHD" but I don't know what's particularly special about my landing page. Does your crawler prioritize/crawl HN by any chance?


Not in particular, there is a triple ranking system in place plus some heuristics and it decided to rank you #1. Nice to be there for a change, huh? :)


I don't feel worthy to be honest


This is a really cool concept. Reminds me of the old(er) days when the web was a bit quieter and there wasn't an entire apparatus designed to steal your attention and focus.

This seems like a really good way to do research as well: people offering information without the expectation of getting paid for it.


This is the first time I've tried Teclis and it was a very positive experience. Always happy to see anything new in this space (search engineering?) and this seems particularly aligned with my interests.

My queries didn't get (obviously) mangled behind the scenes! Thank you for treating me with respect. Having said that, Teclis doesn't seem to treat alternate spellings of '-ise' words (e.g. normalise/normalize) as equivalent -- this is one case of auto-correction that I do appreciate in other search engines.
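Treating '-ise' and '-ize' spellings as equivalent could be done by normalizing tokens at both index and query time. A deliberately naive sketch: the regular expression and the exception list are illustrative assumptions, are far from exhaustive, and are not how Teclis or any other engine is known to do it.

```python
import re

# Match -ise, -ises, -ised, -ising, -isation(s) after at least three letters.
_ISE = re.compile(r"(?<=[a-z]{3})is(e|es|ed|ing|ation|ations)$")

# Stems of words that end in -ise but have no -ize variant (not exhaustive).
EXCEPTION_STEMS = {"prom", "exerc", "adv", "surpr", "prec", "expert", "comprom", "superv"}


def normalise_spelling(token):
    """Map British -ise spellings to -ize so both forms share one index term."""
    word = token.lower()
    match = _ISE.search(word)
    if not match or word[:match.start()] in EXCEPTION_STEMS:
        return word
    return word[:match.start()] + "iz" + match.group(1)
```

Applying the same function to indexed terms and query terms keeps the rest of the query untouched, so it need not conflict with the no-mangling behaviour praised above.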

I just noticed the semantic search mode tip. I haven't tried it yet, but I like that it's not the default way to interpret my query.

I found it easy to find "technical" results and even (relevant) websites that I've never seen or heard of within the first ~10 hits. I wonder about the link between "non-commercial" as Teclis defines it and authentic, non-abusive, or otherwise desirable search results.

Also good:

- I didn't need to turn on javascript.

- clear info on the front page (the info itself and the fact that it's right there)

- results are actual normal links

- result snippet is normal selectable text (not a giant link)


Yay! Finally a good alternative search index :)

Plus I'm impressed that kagi.com, teclis.com and the Orion browser are all the same guy ^^

EDIT: And "Kagi was created in 2018 and is running on tight budget, bootstrapped by the founder's funds from the previous exit. "


Bookmarked! I found https://c3js.org/ which is exactly what I want for my personal project. I also found https://github.com/javascriptdata/danfojs which looks interesting as well. All within 2 searches.


Love the idea. The first few things I searched had very few results, and when I got into more 'mainstream' topics, I was surprised to still see Quora et al in the results (I get a "7" flag on my uBlock icon when I visit a Quora page so I'm not quite sure how that ties in with the '5' threshold mentioned on the homepage).


The number can vary greatly based on browser, other extensions and location accessed.


Does this also exclude Wikipedia? One of the first queries I usually try on search is literally "test", and I usually expect a Wikipedia article for testing (either as an assessment or a scientific test or a programmatic test) on page 1 or 2, but here there was none.


The results seem crappy to me, teclis found what I wanted zero percent of the time.

Hopefully they can keep iterating and improving this; a new entrant to Search is always welcome!

Because we desperately need something better and more useful than Google. It'll take a paradigm shift, for sure.


What were your failed searches?


My real name

Projects on GitHub (if it found anything, it was shitty, unmaintained forks)

Current events like the war in Ukraine

Wikipedia articles

Terms found on websites I host or frequent which do not serve any form of advertisement (not indexed apparently, the hits were completely irrelevant with zero matching terms)


I just want the old Google search back. It's very frustrating when I submit a search term within quotes and I get back a bunch of pages that don't contain that term.


Beyond search results, a significant amount of Google's value is in its “apps”, or whatever they call the functional snippets like calculator, translator etc. built into the engine.

I wonder if it would make sense to have cross platform plug-ins, so that all of these interesting nascent search engine efforts could automatically benefit from new plug-ins and an ecosystem could start to develop.

It’s great to have an alternative, but obviously it’s such a huge effort that the efficiency of development will be important.


SAAAS - Search Augmentation as a Service?



It is not meant to search a specific term like angular documentation, but to explore a topic like angular - and for that I see very interesting results.

https://teclis.com works just fine, site was submitted using its http link (maybe mods can change it).


I hate to say it but for "best laptop" I'd rather get a typical SEO affiliate result than a 4 year old article talking about why they think MacBooks from 10 years ago were the best laptop.

The fact is "best laptop" is what's called a commercial intent query, where people are looking to make a purchase. They want recent results and recent products, not informational articles.


Yeah, "best laptop" seems a strange search phrase to list as the first example for this engine. "best laptop 2022" does return a couple of results, both of them at least somewhat useful (Consumer Reports and PCWorld).

> The fact is "best laptop" is what's called a commercial intent query

However, there's a place for a search engine that doesn't see it as one - there've been quite a few times when I was trying to research a topic, when the search engine assumed it was a commercial intent query and made it almost impossible to get the historical view I needed from the search.


Someone needs to create a search engine specifically for products and services. One thing I end up doing is searching places like reddit.

For example, enter "cool shirts for summer" and have it search places like Reddit, fashion forums, etc., basically all areas where UGC is relatively authentic. And then have a toggle for "paid for blogs", like Strategist, Wirecutter, RTINGS, etc.


First page results are interesting, paging to the second page gives me a:

"A query would help :-)"

One thing I noticed playing with Teclis is that it gives useful results for 'A vs B' queries. I don't know a single other search engine that still delivers remotely useful results for this type of query.


I know it's not meant for narrow results, but I still chuckled a bit when searching my own nick surfaced an entire serp with nothing but unrelated blog posts about being dyslexic. Wonder if it's just because "drusepth" sounds similar enough to "dyslexic"?


Really like the design of this. Best search results layout I've seen since the old days of Google.


* Fun Challenge Find a query that has only one result in Teclis! Then read that page.

I think I found one but failed to read that (which?) page.

http://teclis.com/search?q=sysadmin+horror+stories


Try http://teclis.com/search?q=%28sysadmin+horror+stories%29 for more results (semantic search)
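Going by the two URLs in this exchange, wrapping the whole query in parentheses is what switches on semantic search. A small sketch that builds both forms; the function name is hypothetical, and the URL shape is inferred only from the links above.

```python
from urllib.parse import quote_plus


def teclis_search_url(query, semantic=False):
    # Per the thread, a query wrapped in parentheses is treated as a semantic search.
    if semantic:
        query = f"({query})"
    return "http://teclis.com/search?q=" + quote_plus(query)
```

`quote_plus` percent-encodes the parentheses as `%28` and `%29` and turns spaces into `+`, matching the link given in the reply.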


Thanks, it worked


It wasn't hard. My second query "minecraft speedrun" had 1 result.


That's another way of asking the same question; I have chosen the one I would like to hear :)


I don't think anyone's mentioned it yet but I like the name!


Fun bug, searching for the letter e crashes the site

http://teclis.com/search?q=e


Or any other single character query.


It badly fails the "which countries are using ivermectin" test I use to evaluate search engines. Yandex does the best.


Seems like it drops keywords in some cases and returns results that only match part of the query. Please don't do this.


Doesn’t work when I try to search for 14.2 MacBook Pro reviews. Gave me some links to 2018 MacBooks. Might be the uBlock filter.


Review sites are littered with advertising likely preventing any results from being indexed. It also doesn't help that most reviews are now in the form of video.



Does anyone know of a decent search engine that searches and shows results of only vintage type sites - you know, the ones built with the old school HTML tables or frontpage kind of stuff? Often, these are the most valuable in content, with less promotional bullshit like a random popup asking to signup for a newsletter or perhaps some dubious GDPR notice with all personal data collection toggles set to "off".


Try mojeek.com (disclaimer, I work there)

The index is 5bn+ pages and entirely independent as opposed to the Bing/Google offshoots.

With Google's algo tending to favour "mobile optimised" sites, I suspect a lot of older sites get buried.


where does the name come from?



