> Lots of startups are launching new “vector databases”—which are effectively databases that are custom built to answer nearest-neighbour queries against vectors as quickly as possible.
> I’m not convinced you need an entirely new database for this: I’m more excited about adding custom indexes to existing databases. For example, SQLite has sqlite-vss and PostgreSQL has pgvector.
Do we still feel specialized vector databases are overkill?
We have AWS promoting Amazon OpenSearch as the default vector database for RAG knowledge bases, and that service is not cheap.
Also, I would like to understand a bit more about how to pre-process and chunk data in a way that optimizes the vector embeddings, storage, and retrieval ... any good guides I can refer to? Thanks!
1. It depends on how many embeddings we are talking about. A few million? Probably yes. Hundreds of millions to billions? You likely need something custom.
2. Vectors are only one way to search for things. If your corpus contains content that doesn't carry semantic weight (think part numbers) and you want to find the chunk that contains that information, you'll likely need something that uses tf-idf.
3. Regarding chunk size, it really depends on your data and the queries your users will run. The denser the content, the smaller the chunk size.
4. Preprocessing - again, it depends. If it's PDFs with just text, try to remove footers/headers from the extracted text. If it contains tables, look at something like TableFormer to extract a good HTML representation. Clean up other artifacts from the text (like dashes from line-break hyphenation, square-bracketed reference numbers in scientific papers, ...).
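To make points 3-4 concrete, here is a minimal Python sketch of the cleanup and chunking steps; the regexes, chunk size, and overlap are illustrative assumptions, not tuned recommendations:

```python
import re

def clean_extracted_text(text: str) -> str:
    # Re-join words hyphenated across line breaks ("embed-\nding" -> "embedding").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop square-bracketed reference numbers like [12] or [3, 4].
    text = re.sub(r"\[\d+(?:,\s*\d+)*\]", "", text)
    # Collapse whitespace left behind by extraction.
    return re.sub(r"\s+", " ", text).strip()

def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    # Fixed-size word windows with overlap; shrink chunk_size for denser content.
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, max(len(words) - overlap, 1), step)]
```

In practice you would also strip repeated headers/footers first (for example, by dropping lines that recur on every page) before chunking.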
I had the same idea, but now I have a Postgres database with very high latency for simple queries because the CPU is busy building large HNSW indexes.
My impression is that it might be best to do vector index construction separately from the rest of the data, for performance reasons. It seems vector indexes are several orders of magnitude more compute intensive than most other database operations.
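One way to act on that, sketched in Python: push the expensive construction into a separate worker process so the query-serving path keeps its CPU. The `build_index` below is a hypothetical stand-in for a real HNSW build, not pgvector's actual mechanism; the point is only the offloading pattern.

```python
from concurrent.futures import ProcessPoolExecutor

def build_index(vectors):
    # Stand-in for an expensive index build: O(n^2) pairwise distance scan
    # that counts "edges" between nearby vectors.
    n = len(vectors)
    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
            edges += d < 1.0
    return edges

if __name__ == "__main__":
    vectors = [[i / 10, i / 7] for i in range(50)]
    # Offload the build so the main (query-serving) process stays responsive.
    with ProcessPoolExecutor(max_workers=1) as pool:
        future = pool.submit(build_index, vectors)
        # ... keep serving cheap queries here while the build runs ...
        print(future.result())
```

With Postgres specifically, the analogous moves are building the index on a replica or during off-peak hours; the sketch just shows the separation of concerns.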
Using a built-in vector extension is convenient if you want to integrate vector similarity (“semantic search”) with faceted search. That said, some dedicated vector stores (e.g., Qdrant) also support attaching attributes to vectors and filtering on them.
As another comment mentioned, an advantage of using a separate vector store (on different hardware) is that (re-)building vector indices can cause high CPU load and therefore drive up latency for regular queries.
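A toy illustration of combining a facet filter with a similarity scan, in plain Python (brute force; a real system would push the filter into a SQL WHERE clause alongside pgvector, or into a vector store's payload filter):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def filtered_search(query_vec, items, where, top_k=3):
    # items: dicts with a "vec" key plus arbitrary facet attributes.
    # `where`: attribute -> required value (the facet filter). Filtering
    # happens before the similarity scan, as a filtered vector query would.
    candidates = [it for it in items
                  if all(it.get(k) == v for k, v in where.items())]
    return sorted(candidates,
                  key=lambda it: cosine(query_vec, it["vec"]),
                  reverse=True)[:top_k]
```

The function names and the dict-based schema here are made up for illustration; only the filter-then-rank shape carries over to real systems.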
RAGs are the ControlNet of image diffusion. They exist for many reasons: context windows are small, instruct-style frontier models haven’t been adequately trained on search tasks, and, reason #1, people say they need RAGs, so an industry sprouts up to give it to them.
Do we need RAGs? I guess for now yes, but in the near future no: two of those three reasons will be solved by improvements to frontier models that are eminently doable and probably underway already. So let’s answer the question for ControlNets instead, to illuminate why just because someone asks for something doesn’t mean it makes any sense.
If you’re Marc Andreessen and you call Mike Ovitz, your conversation about AI art generation is going to go like this: “Hollywood people tell me that they don’t want the AI to make creative decisions, they want AI for VFX or to make short TikTok videos or” something something, “the most important people say they want tools that do not obsolete them.” This trickles down to the lowly art director, who may have an illustration background but who is already sending stuff overseas to be done in something that functionally resembles a dehumanized art generator. Everybody up and down this value chain has no math or English Lit background, so to them the simplest, most visual UX that doesn’t threaten their livelihood is what they want: sketch-to-image.
Does sketch-to-image make sense? No. I mean, it makes sense for people who cannot be fucked to do the absolutely minimal amount of lift to write prompts, which is many art professionals who, for the worse, have adopted “I don’t write” as an identity, not merely a technical skill specialization. But once you overcome the incredibly small obstacle of writing 25 words to Ideogram instead of 3 words to Stable Diffusion, it’s obvious: nobody needs to draw something and then have a computer finish it. Of course it’s technologically and scientifically tractable to get all the benefits of ControlNets (namely control and consistency) with ordinary text. But people who buy software want something that is finished; they are not waiting around for R&D projects. They want some other penniless creative to make a viral video using Ideogram, or they want their investor’s daughter’s heir boyfriend, who is their boss, to shove it down their throats.
This is all meant to illustrate that you should not be asking people who don’t know anything what technology they want. They absolutely positively will say “faster horses.” RAGs are faster horses!
I’m baffled by your remarks about sketch-to-image. Maybe I’m missing something in what you mean, but sketch-to-image, to various degrees of completion (that is, it’s more than just sketch-to-image), lets you guide things in a way that prompts simply don’t. It’s rather useful if you’re trying to compose something specific, and plenty useful in other scenarios too.
Does sketch-to-image make sense? Um… yes? Absolutely? I genuinely don’t understand how you could even deny it, let alone act as though that’s the obvious answer?
Why does this site accept GitHub login ONLY? Why not a simple email sign-up or other options?
I am rather wary about sharing my GitHub login/profile with third-party sites -- I prefer to keep a separation from the profile that links to many of my other automations and deployments.
Someone else may not have a GitHub profile, for many reasons.
This appears to be poor system design, but also a clever discovery and exploit. The amount involved here is meager (less than USD 3,000). Hopefully the vulnerability is now patched; otherwise this news just published a manual for millions of opportunities.
About finding fascinating hand-scribbled notes in books of the past ...
I discovered this in a scanned collection of old books maintained by our government ministry of culture ...
There are comments on some pages (see pages 8, 11, 17, for example), including the cover. In some places the scribbler, who seems to be an authority on the subject himself, expresses rebuttals/commentary on the text's shortcomings.
PS: this was the first time I stumbled across, and discovered, that "Gentoo" was another name the British used for the "Telugu" language of Southern India. I am a native speaker of Telugu and I never knew this.
Often the estimates need to be competitive, or the lowest, for you to win the contract.
No one acknowledges the unknowns and risks at the beginning of the project.
Writing down more than 4-5 risks along with the bid amount is taken as a sign of a team that will not be easy to work with and will fight over every bullet point, whether something is in scope or a change request.
And often the one who estimates and wins the project may not be on the actual project team with delivery accountability.
I like that it supports background playback. I "installed" it as an app by tapping "Add to Home Screen" in Chrome on Android.
The touch target for selecting a dot seems finicky on mobile; I had to tap multiple times to select a station on the map.
Some names/descriptions are too wide and push the play button offscreen.
The time displayed is in the device's local TZ. Can we optionally display the radio station city's TZ?
Tangential Idea:
Can I pick 3-4 favorites to cycle through automatically every N minutes?
How about using AI to detect ads and/or blabbering by RJ hosts between music tracks, and using that to skip to another station? Of course this should only happen in music mode and not skip useful content like talk shows and news.
Privacy and security risks of the future loom big.