More

olup · 2025-05-06T10:34:18 1746527658

I am a father of two, and I could not have penned that any better.

olup · 2025-02-19T06:49:13 1739947753

It's Counter Strike of course

vaylian · 2025-02-19T14:43:41 1739976221

Common Sense

olup · 2025-02-07T21:34:50 1738964090

Famous french radio program about lives and experience, like the moth meets Bourdieu. For this episode, they wrote and made the voices all in ai, relating to the Paris ai submit. The episode is used to trigger réflexions about gen ai.

olup · 2025-01-14T09:12:00 1736845920

On the prompt side, it's very simple, and can probably be done in a variety of ways. How we did it is to prepare a prompt with multiple "user" messages. The first one gives the instruction

you are given a reference and three candidates, which one of the candidates do you think is a match to the reference? Only output its identifier or a code when none is found

Not exactly that but something along those lines.

Then one "user" message per car (reference + candidates) with image + text indicating the type (reference or candidate) and an identifier (can be as simple as the index for the candidates).

olup · 2025-01-14T08:31:11 1736843471

Poster here. We would have loved that, and it was one of our first proposal - a QR code or some kind of marker. However, the client is understandably very controlling on the aesthetics of their wall as a central element of their scenography. We would have pushed for it again in the last resort, but would probably have lost the contract.

psandor · 2025-01-14T16:38:32 1736872712

This is completely offtopic, but I would bet it was a government-funded museum. A reasonable institution would have worked with you to find an acceptable compromise, something much easier to implement with a small sacrifice of aesthetics.

Anyway, great work, and thank you for taking the time to share it!

achierius · 2025-01-14T17:04:19 1736874259

Really? I would much less expect a government museum to be particular about aesthetics. Privately run museums/collections/exhibitions on the other hand tend to have very finicky owners -- after all, they're putting up their own money to achieve their vision, and so of course they tend to not want to compromise on how it might look.

olup · 2025-01-10T21:02:32 1736542952

First time for me posting this kind of story - I thought it would make an interesting case on solving a hard computer vision problem with a crafty product engineer team.

caioariede · 2025-01-13T21:50:24 1736805024

Just a small feedback… I have switched to the reader mode because the font used is very challenging to read for me.

littlestymaar · 2025-01-13T22:31:44 1736807504

Also, having a blog post about image detection, and not showing a single picture in the whole post was quite frustrating.

Oarch · 2025-01-13T22:38:26 1736807906

Especially given the detailed description surely the author could just generate a similar image

bl4ckneon · 2025-01-13T23:32:45 1736811165

Just thinking that. Spend a few minutes trying to have chatgpt generate some images with Dall-E 3. Flux would probably be better to get all the specific details but ya

yannis · 2025-01-14T02:38:17 1736822297

Thanks for sharing. Interesting approach. As other commenters mentioned, article could do well with some hypothetical images. Maybe on a follow-up blog post? Also since you mentioning your Company's name you missing an opportunity for marketing by not providing a link.

idkman_oops · 2025-01-14T10:28:33 1736850513

Can you tell me what font is this?

martin_a · 2025-01-14T11:57:13 1736855833

One of these:

> ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace

nmeofthestate · 2025-01-14T17:44:54 1736876694

The single-character-width 'fi' ligature is quite jarring in a mono-spaced font.

olup · 2024-12-27T21:06:04 1735333564

I just loved the wooden spirograph thing I got my 5yo daughter for Christmas (she does too, what fun). But then I thought making it an app to start exploring how those shapes work with her. And because it's 2024 I just asked an AI (here bolt.new) to build it, and refine by prompting. Thought someone else might enjoy it.

olup · 2024-12-18T19:42:24 1734550944

I use supermaven and cline with my own API key, a setup superior to cursor imo. Tried to go back to gh copilot yesterday but couldn't bear it for a full workday, and reverted to my previous arrangement.

kgilpin · 2024-12-19T03:24:38 1734578678

FYI supermaven has joined Cursor

https://www.cursor.com/blog/supermaven

jasondjk · 2024-12-18T20:31:02 1734553862

This looks really interesting, cursor has been way better than copilot for me but supermaven looks great. I went down the rabbit hole with: https://www.youtube.com/watch?v=zLQuBSuzu2w&t=661s

Your setup sounds interesting. What sort of API key do you use?

olup · 2024-12-18T21:19:55 1734556795

We have an openai account for the company, so I mainly use gpt4o or 4o mini with supermaven and cline. I think Claude 3.5 works even better.

olup · 2024-12-13T05:34:30 1734068070

I am interested, but why should I use this one over jina ai reader (which is also free) or firecrawl, or the ten other puppeteer + readability + turndown pipeline (or even a AWS lambda doing the same) ? This is not sarcastic I am genuinely looking for something fresh in the field.

sachou · 2024-12-13T05:39:20 1734068360

do you need to embed it directly in pinecone ?

If yes then DataFuel is the right choice. Adding this feature as we speak.

Please let me know :)

olup · 2024-12-13T06:03:35 1734069815

Interesting but we process documents before embedding them, and have specific requirements for the embedder.

Having developed a couple of page to markdown myself, I think the bigger challenge is to make sense of so many pages that rely on spacial organisation of information that only makes sense to human, or even presence of images. One way to do it is to render the page as an image and extract data with a vision llm. But you do need heuristic on when to do classic extraction and when to use vision, plus get rid of cookie banner and overlays. This is more complex and costly, but have real business value, for the one that can pull it off.

sachou · 2024-12-13T06:07:22 1734070042

what would be your specific requirement?

Right now adding chunk size, model for embedding, what else?

Image is a great challenge with OCR can be solve as you mentioned

olup · 2024-12-13T06:12:01 1734070321

We, as many players, have custom pipelines on embedding. We don't split docs based on chunk size but do semantic chunking and chunk augmentation. We embed everything with two embeddings services to always have a fallback if one provider is not available.

If I were in your shoes I would not think embedding and inserting in a vector store would be my responsibility, especially since there are so many different stores on the market.

olup · 2024-11-26T20:10:29 1732651829

They say the models are under 3b parameters. If only for voice generation it sounds pretty good, no ?