Hacker News | pbronez's comments

> it is still a LLM and not a "pure" OCR

When does a character model become a language model?

If you're looking at block text with no connections between letter forms, each character mostly stands on its own. Except capital letters are much more likely at the beginning of a word or sentence than elsewhere, so you probably get a performance boost if you incorporate that.

Now we're considering two-character chunks. Cursive script connects the letterforms, and the connection changes based on both the source and target. We can definitely get a performance boost from looking at those.

Hmm, you know, these two-letter groupings aren't random. "ng" is much more likely if we just saw an "i". Maybe we need to take that into account.

Hmm, actually, whole words are related to each other! I can make a pretty good guess at what word that four-letter-wide smudge is if I can figure out the word before and after...

and now it's an LLM.
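The slide from character model to language model above can be shown in a few lines of Python. The corpus here is a toy, hypothetical stand-in for training text; the point is just that bigram counts already give you a (tiny) language model:

```python
from collections import Counter

# Toy corpus standing in for training text (hypothetical, for illustration only).
corpus = "sing ring king wing bring icing"

# Count adjacent character pairs: the "two-letter groupings" discussed above.
pairs = Counter(zip(corpus, corpus[1:]))

# Conditional distribution P(next char | previous char == 'i').
after_i = {b: c for (a, b), c in pairs.items() if a == "i"}
total = sum(after_i.values())
probs = {b: c / total for b, c in after_i.items()}

# The most likely character after 'i' in this corpus is 'n',
# which is exactly the "ng is likely after i" effect.
best = max(probs, key=probs.get)
```

Scale the same idea up from characters to words, and from raw counts to learned weights, and the "pure OCR" system has quietly become a language model.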


Microsoft's Power Platform should be a big advantage. If you already have your data in Outlook/SharePoint, the Power Platform makes it easy to access. Unfortunately, I've encountered several roadblocks deploying Copilot Studio and Power Platform for my enterprise. Note: I'm using GCC (Government Community Cloud), so everything is worse than normal.

1) Incomplete integration. Often I just want to write a prompt that creates structured data from unstructured data, e.g., read an email and create a structured contact record. There's a block for this in Power Platform, but I can't access it. Copilot Studio can do this pretty well, but...

2) Copilot Studio sucks at determinism. You really need to build higher-level tools in Power Automate and call them from Studio. Because of (1), this makes it hard to compose complex systems.

3) Permissions. We haven't been able to figure out a secure way for people to share Copilot Studio agents. This means you need to log into Studio and use the debug chat instead of turning the agent on in the main Copilot interface.

4) IDE. Copilot Studio bogs down really fast. The UI gets super laggy, creating a terrible DX. There should be a way to write agents in VS Code, push the definitions to source control, and deploy to Copilot, but it isn't obvious.

5) Dumb by default. The Power Platform has hooks into Outlook and Active Directory. Copilot has access to the latest OpenAI models. Copilot Studio has an MCP server for Calendar. Out of the box, I should be able to tell Copilot "schedule a 30-minute meeting with Joe and Larry next week." Nope. Maybe if I struggle through Copilot Studio to create an agent? Still no. WTF, Microsoft.

I guess I'll stop there. I really wanted to like Copilot Studio, but it just didn't deliver. Maybe I'll circle back in a couple of months, but for now I'm exploring other platforms.

PS: don't even get me started on how excited we were to retire our home-grown chat front end for the Azure OpenAI Service in favor of Copilot, only to have our users complain that Copilot was a downgrade.

PPS: also don't talk to me about how Copilot is now integrated into Windows and SIGNS YOU INTO THE FREE COMMERCIAL SERVICE BY DEFAULT. Do you know how hard it is to get people to use the official corporate AI tools instead of shadow AI? Do you know how important it is to keep our proprietary data out of AI training sets? Apparently not.


Interesting project, very dense post. I like the idea of a genuine personal search engine. You’d think that Windows and macOS would do this well, but they really don’t.

The project's GitHub is here: https://github.com/eagledot/hachi


I have also been surprised that personal search engines are not a solved problem. “We” have actually known how to do decent search for a long time, including across images and the entire freaking internet for over two decades, but it’s not simple or commonplace to get a good semantic search interface for your own files, local or remote.

Chrome currently offers a semantic search across your browser history, but it’s buried. The major photo services allow for search across your photos. Windows and Mac have indexed keyword search across files, but the interface feels primitive.

I increasingly want a private search index across my browsing history, my photos, my notes/files, my voice recordings, GitHub projects, etc.
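For what it's worth, the "decent search" we've known how to do for decades really is small. Here is a minimal TF-IDF keyword-search sketch over a few made-up files (all paths and contents hypothetical), using only the standard library:

```python
import math
from collections import Counter

# Toy corpus standing in for "my notes/files" (all hypothetical content).
docs = {
    "notes/kagi.md": "personalized search engine for my browser history",
    "notes/photos.md": "photo library with face and object search",
    "notes/recipes.md": "sourdough bread baking schedule",
}

def tf_idf_vectors(texts):
    """Classic TF-IDF weighting: term frequency times inverse document frequency."""
    tokenized = {k: v.lower().split() for k, v in texts.items()}
    df = Counter(w for toks in tokenized.values() for w in set(toks))
    n = len(tokenized)
    vecs = {
        k: {w: c * math.log(n / df[w]) for w, c in Counter(toks).items()}
        for k, toks in tokenized.items()
    }
    return vecs, df, n

def search(query, texts):
    """Rank documents by cosine similarity between query and document vectors."""
    vecs, df, n = tf_idf_vectors(texts)
    q = {w: math.log(n / df[w]) for w in query.lower().split() if w in df}

    def cosine(a, b):
        dot = sum(a[w] * b.get(w, 0.0) for w in a)
        na = math.sqrt(sum(x * x for x in a.values()))
        nb = math.sqrt(sum(x * x for x in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    return sorted(texts, key=lambda k: cosine(q, vecs[k]), reverse=True)

ranking = search("search my browser history", docs)
```

A real personal search engine would swap the TF-IDF vectors for learned embeddings to get semantic matching, but the indexing-and-ranking skeleton is the same.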

I thought a paid, personalizable search engine like Kagi would be a good place to get or build a personalized internet search index based on my browser history, but they don’t really offer the tools for that.

There are some enterprise search engines trying to solve this for orgs, so maybe I should be looking there?

I’m glad to see projects like Hachi, and am curious what others are doing or reaching for.


“Windows and Mac have indexed keyword search across files, but the interface feels primitive.”

The functionality is further obscured when (at least on Windows) the local file results are intermingled with results from afar, which I guess come from Bing.


For me it just doesn't work at all. I don't know why, but every Windows instance I've used since Win7 has not been able to find files even with the exact filename supplied. I don't disable the indexer; I can see it using CPU and disk resources, but it just doesn't find anything relevant when I search. When I instead use Search Everything on Windows, it works perfectly.

No money to be made in making your life easier that way, therefore no KPI is generated for its implementation.

Plus, that would also mean fewer incentives to upload personal data to their servers...

I don't know about macOS, but I've found Spotlight awesome since switching to an iPhone last year. The only issue I have is that some apps that I would really like to search don't index their data with it.

Reminds me of Danswer, actually. That’s an LLM-powered personal search engine. Looks like they’re making an enterprise play now.

https://danswer-website.vercel.app


Yeah, same. I get that people worry about Kagi staying profitable and alive. But honestly the T-shirt thing and the hub align with my understanding of their brand: Hard Way. Kagi takes the hard road that nobody else will. They never cheap out. They keep quality really, really high even when they could drive better margins by lowering their standards to follow the herd.

Now, if the company dies because of genuine financial mismanagement, I will be pissed. I rely on Kagi's search and AI offerings. For now, those core offerings keep getting better even with these side quests. For example, their Ki research assistant just left beta:

https://blog.kagi.com/kagi-assistants


Yup, this is a fantastic project and probably the most mature attempt at a global knowledge graph for contemporary news.


I wonder if it’s possible to implement anti-cheat as a USB stick. Your GabeCube or gaming PC would stay open by default, but you could buy an anti-cheat accessory that plugs into a free USB port. Connecting that device grants access to matchmaking with other people who have the device.

There are several products that rely on a USB device like this for DRM solutions. It’s probably much easier to unlock static assets than validate running code, but I don’t have insight on the true complexity.
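For context on the DRM comparison: those USB devices typically do a challenge-response proof that they hold a secret key. A rough sketch of that pattern, with entirely made-up names (a real dongle keeps the key in tamper-resistant hardware and never exposes it to the host PC):

```python
import hmac
import hashlib
import os

# Hypothetical per-device secret, provisioned at manufacture. In a real
# product this lives inside the tamper-resistant USB device only.
DEVICE_KEY = os.urandom(32)

def device_respond(challenge: bytes) -> bytes:
    """Runs inside the USB device: prove possession of the secret key."""
    return hmac.new(DEVICE_KEY, challenge, hashlib.sha256).digest()

def server_verify(challenge: bytes, response: bytes) -> bool:
    """Runs on the matchmaking server, which knows each device's key."""
    expected = hmac.new(DEVICE_KEY, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

challenge = os.urandom(16)
ok = server_verify(challenge, device_respond(challenge))
```

Note that this authenticates the device, not the code running on the PC, which is why the scheme works well for unlocking static assets but says nothing about whether the game binary next to it has been modified.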


> I wonder if it’s possible to implement anti-cheat as a USB stick. Your GabeCube or gaming PC would stay open by default, but you could buy an anti-cheat accessory that plugs into a free USB port. Connecting that device grants access to matchmaking with other people who have the device.

What does the USB stick actually do? The hard part of implementing the anti-cheat (i.e., either invasive scanning or attestation with a hardware root of trust) is entirely unaddressed, so your description is as helpful as "would it be possible to implement a quantum computer as a USB stick?"


If Homebrew auto-signed third-party code, that would put them on the line for the security of that code. The whole point of macOS developer certificates is to increase the trustworthiness of the software you run on your machine. The trust comes from the formal relationship between Apple and the software developer, which includes a traceable financial transaction. If signed software proves to be malicious, attribution is trivial.

If the Homebrew team signed everything, they would immediately become a target for bad actors. Bad actors would flood Homebrew with malicious binaries, which Homebrew would auto-sign, users would download and run, and the bad actors would laugh all the way to the bank.


Yeah, it makes sense that Homebrew doesn't sign everything with their own certs. I was suggesting that Homebrew could run codesign locally with the user's local certificate as part of the install process.

> The bad actors would flood homebrew with malicious binaries, which homebrew would auto-sign, users would download & run, and the bad actors would laugh all the way to the bank.

Every software distributor has this problem, code-signed or not. Either this is already happening to Homebrew (without code signing involved) or there's some other reason it isn't happening.
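Concretely, the "run codesign locally" suggestion above could look roughly like this. This is a speculative sketch, not an actual Homebrew hook: the binary path is made up, and the ad-hoc identity "-" stands in for a real self-signed certificate name in the user's keychain:

```shell
#!/bin/sh
# Hypothetical post-install step: sign a freshly installed binary with the
# user's own local identity. "-" requests ad-hoc signing; a real setup
# would pass the name of a self-signed certificate in the keychain.
BIN="${1:-./just-installed-binary}"   # illustrative path only

if command -v codesign >/dev/null 2>&1 && [ -f "$BIN" ]; then
  # Sign in place, then verify the signature took.
  codesign --force --sign - "$BIN" \
    && codesign --verify --strict "$BIN" \
    && RESULT="signed OK"
  RESULT="${RESULT:-signing failed}"
else
  RESULT="skipped (no codesign on this machine, or no such binary)"
fi
echo "$RESULT"
```

Since the signature is made on the user's own machine, it attests only "this is what I installed," not "this code is trustworthy," which sidesteps the liability problem described above.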


> The whole point of macOS developer certificates is to increase the trustworthiness of the software you run on your machine.

Tim Cook is laughing all the way to the bank on that one


Optional anti-cheat could be really interesting. Make it a matchmaking option; let the players decide who they want to play with. This effectively makes "PC without anti-cheat" a new platform in cross-platform matchmaking.

I can imagine a whole scene popping up where everyone cheats to the max, creating whole new game modes.


This already existed in CS:GO; it was called Hack vs Hack. Private servers could choose whether to run anti-cheat. You'd see some with names like "HvH" and join to find people spinning in circles, comparing whose aimbot was the most dominant.


> I can imagine a whole scene popping up where everyone cheats to the max, creating whole new game modes.

That would be very interesting. I also bet that people would start developing bots that play the game better than a human could and eventually it would essentially turn into digital BattleBots.


That’s a great observation. I’m hitting the same thing… yesterday’s hacks are today’s gospel.

My solution is decision documents. I write down the business problem, background on how we got here, my recommended solution, alternative solutions with discussion of their relative strengths and weaknesses, and finally an executive summary that states the whole affirmative recommendation in half a page.

Then I send that doc to the business owners to review and critique. I meet with them and chase down ground truth: yes, it works like this NOW, but what SHOULD it be?

We iterate until everyone is excited about the revision, then we implement.


There are two observations I've seen in practice with decision documents: the first is that people want to consume the bare minimum before getting started, so such docs have to be very carefully written to surface the most important decision(s) early, or otherwise call them out for quick access. This often gets lost as word count grows and becomes a metric.

The second is that excitement typically falls with each iteration, even while everyone agrees that each is better than the previous. Excitement follows more strongly from newness than rightness.


Eventually you'll run into a decision that was made for one set of reasons but succeeded for completely different reasons. A decision document can't help there; it can only tell you why the decision was made.

That is the nature of evolutionary processes and it's the reason people (and animals; you can find plenty of work on e.g. "superstition in chickens") are reluctant to change working systems.


The audio is weirdly messed up


Yes, sorry about that. We had tech issues, and did the best we could with the audio that was captured.

