Hacker News new | past | comments | ask | show | jobs | submit | olup's comments login

I am a father of two, and I could not have penned that any better.

It's Counter Strike of course


Common Sense


Famous french radio program about lives and experience, like the moth meets Bourdieu. For this episode, they wrote and made the voices all in ai, relating to the Paris ai submit. The episode is used to trigger réflexions about gen ai.


On the prompt side, it's very simple, and can probably be done in a variety of ways. How we did it is to prepare a prompt with multiple "user" messages. The first one gives the instruction

you are given a reference and three candidates, which one of the candidates do you think is a match to the reference? Only output its identifier or a code when none is found

Not exactly that but something along those lines.

Then one "user" message per car (reference + candidates) with image + text indicating the type (reference or candidate) and an identifier (can be as simple as the index for the candidates).


Poster here. We would have loved that, and it was one of our first proposal - a QR code or some kind of marker. However, the client is understandably very controlling on the aesthetics of their wall as a central element of their scenography. We would have pushed for it again in the last resort, but would probably have lost the contract.


This is completely offtopic, but I would bet it was a government-funded museum. A reasonable institution would have worked with you to find an acceptable compromise, something much easier to implement with a small sacrifice of aesthetics.

Anyway, great work, and thank you for taking the time to share it!


Really? I would much less expect a government museum to be particular about aesthetics. Privately run museums/collections/exhibitions on the other hand tend to have very finicky owners -- after all, they're putting up their own money to achieve their vision, and so of course they tend to not want to compromise on how it might look.


First time for me posting this kind of story - I thought it would make an interesting case on solving a hard computer vision problem with a crafty product engineer team.


Just a small feedback… I have switched to the reader mode because the font used is very challenging to read for me.


Also, having a blog post about image detection, and not showing a single picture in the whole post was quite frustrating.


Especially given the detailed description surely the author could just generate a similar image


Just thinking that. Spend a few minutes trying to have chatgpt generate some images with Dall-E 3. Flux would probably be better to get all the specific details but ya


Thanks for sharing. Interesting approach. As other commenters mentioned, article could do well with some hypothetical images. Maybe on a follow-up blog post? Also since you mentioning your Company's name you missing an opportunity for marketing by not providing a link.


Can you tell me what font is this?


One of these:

> ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace


The single-character-width 'fi' ligature is quite jarring in a mono-spaced font.


I just loved the wooden spirograph thing I got my 5yo daughter for Christmas (she does too, what fun). But then I thought making it an app to start exploring how those shapes work with her. And because it's 2024 I just asked an AI (here bolt.new) to build it, and refine by prompting. Thought someone else might enjoy it.


I use supermaven and cline with my own API key, a setup superior to cursor imo. Tried to go back to gh copilot yesterday but couldn't bear it for a full workday, and reverted to my previous arrangement.


FYI supermaven has joined Cursor

https://www.cursor.com/blog/supermaven


This looks really interesting, cursor has been way better than copilot for me but supermaven looks great. I went down the rabbit hole with: https://www.youtube.com/watch?v=zLQuBSuzu2w&t=661s

Your setup sounds interesting. What sort of API key do you use?


We have an openai account for the company, so I mainly use gpt4o or 4o mini with supermaven and cline. I think Claude 3.5 works even better.


I am interested, but why should I use this one over jina ai reader (which is also free) or firecrawl, or the ten other puppeteer + readability + turndown pipeline (or even a AWS lambda doing the same) ? This is not sarcastic I am genuinely looking for something fresh in the field.


do you need to embed it directly in pinecone ?

If yes then DataFuel is the right choice. Adding this feature as we speak.

Please let me know :)


Interesting but we process documents before embedding them, and have specific requirements for the embedder.

Having developed a couple of page to markdown myself, I think the bigger challenge is to make sense of so many pages that rely on spacial organisation of information that only makes sense to human, or even presence of images. One way to do it is to render the page as an image and extract data with a vision llm. But you do need heuristic on when to do classic extraction and when to use vision, plus get rid of cookie banner and overlays. This is more complex and costly, but have real business value, for the one that can pull it off.


what would be your specific requirement?

Right now adding chunk size, model for embedding, what else?

Image is a great challenge with OCR can be solve as you mentioned


We, as many players, have custom pipelines on embedding. We don't split docs based on chunk size but do semantic chunking and chunk augmentation. We embed everything with two embeddings services to always have a fallback if one provider is not available.

If I were in your shoes I would not think embedding and inserting in a vector store would be my responsibility, especially since there are so many different stores on the market.


They say the models are under 3b parameters. If only for voice generation it sounds pretty good, no ?


Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: