
I did this and never looked back.

It’s called a “WebView app”, and you can get a really good experience on all platforms using one. Just:

- don’t make any crazy decisions on your fundamental UI components, like breadcrumbs, select dropdowns, etc

- add a few platform-specific specialisations to those same components, to make them feel a bit more familiar, such as button styling, or using a self-simplifying back-stack on Android

- test to make sure your webview matches the native browser’s behaviour where it matters. For example, sliding up the view when the keyboard is opened on mobile, navigating back & forth with edge-swipes on iOS, etc

I also went the extra step and got service workers working for a basic offline experience, and added a native auto network diagnostic tool that runs on app startup and checks “Can reach local network”, “Can reach internet (1.1.1.1)”, “Can resolve our app’s domain”, etc., so users can share where it failed and get quicker support. But this is an app for small-to-medium businesses, not consumer-facing, and the HTML5 part is served from the server and cached. I haven’t thought much about what you might need to do additionally for a consumer app or a local-first app.
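For illustration, a rough TypeScript sketch of that kind of startup diagnostic on the web side (the real tool described above is native; every probe URL here is a placeholder):

  // Hypothetical connectivity probes; all URLs below are placeholders.
  async function probe(name: string, url: string, ms = 3000): Promise<string> {
    const ctrl = new AbortController();
    const timer = setTimeout(() => ctrl.abort(), ms);
    try {
      // mode: "no-cors" yields an opaque response, which still proves reachability
      await fetch(url, { mode: "no-cors", signal: ctrl.signal });
      return `${name}: ok`;
    } catch {
      return `${name}: FAILED`;
    } finally {
      clearTimeout(timer);
    }
  }

  export async function networkDiagnostics(): Promise<string[]> {
    return Promise.all([
      probe("local network", "http://192.168.1.1/"),         // placeholder gateway IP
      probe("internet", "https://1.1.1.1/"),
      probe("app domain", "https://app.example.com/health"), // placeholder endpoint
    ]);
  }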


In SQLite, transactions by default start in “deferred” mode. This means they do not take a write lock until they attempt to perform a write.

You get SQLITE_BUSY when transaction #1 starts in read mode, transaction #2 starts in write mode, and then transaction #1 attempts to upgrade from read to write mode while transaction #2 still holds the write lock.

The fix is to set a busy_timeout and to begin any transaction that does a write (any write, even if it is not the first operation in the transaction) in “immediate” mode rather than “deferred” mode.

https://zeroclarkthirty.com/2024-10-19-sqlite-database-is-lo...
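A minimal sketch of both fixes, shown here with the better-sqlite3 Node client (any driver that lets you control how a transaction begins works the same way; the jobs table is invented for illustration):

  import Database from "better-sqlite3";

  const db = new Database("app.db");
  db.pragma("busy_timeout = 5000"); // wait up to 5s for the lock instead of failing at once

  // Hypothetical schema: a jobs table with a status column.
  const claim = db.transaction((id: number) => {
    db.prepare("UPDATE jobs SET status = 'running' WHERE id = ?").run(id);
  });

  // .immediate() wraps the function in BEGIN IMMEDIATE rather than the default
  // BEGIN DEFERRED, so the write lock is taken up front.
  claim.immediate(42);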


https://ai.google.dev/gemma/docs/gemma-3n#parameters

> Gemma 3n models are listed with parameter counts, such as E2B and E4B, that are lower than the total number of parameters contained in the models. The E prefix indicates these models can operate with a reduced set of Effective parameters. This reduced parameter operation can be achieved using the flexible parameter technology built into Gemma 3n models to help them run efficiently on lower resource devices.

> The parameters in Gemma 3n models are divided into 4 main groups: text, visual, audio, and per-layer embedding (PLE) parameters. With standard execution of the E2B model, over 5 billion parameters are loaded when executing the model. However, using parameter skipping and PLE caching techniques, this model can be operated with an effective memory load of just under 2 billion (1.91B) parameters, as illustrated in Figure 1.


> Since my PKMS is hosted online to manage notes across devices, I have multiple layers of security to ensure my notes are kept private. {Screenshot of a login form}

The biggest life hack I can recommend for a self-hoster is to set up a VPN on your local network and then just never expose your services on the public internet unless you're specifically trying to serve people outside your own household.

Before I did this I was constantly worried about the security implications of each app I thought about installing or creating. Now it's not even worth setting up auth on a lot of simple services I build because if someone is able to hit their endpoints I'm already in deep trouble for many other reasons.


These days, I usually paste my entire repo (or some of it) into Gemini and then APPLY changes back into my code using this handy script I wrote: https://github.com/asadm/vibemode

I have tried aider/copilot/continue/etc., but they each fall short in one way or another.


I use them as follows:

o1-pro: anything important involving accuracy or reasoning. Does the best at accomplishing things correctly in one go even with lots of context.

deepseek R1: anything where I want high quality non-academic prose or poetry. Hands down the best model for these. Also very solid for fast and interesting analytical takes. I love bouncing ideas around with R1 and Grok-3 bc of their fast responses and reasoning. I think R1 is the most creative yet also the best at mimicking prose styles and tone. I've speculated that Grok-3 is R1 with mods and think it's reasonably likely.

4o: image generation, occasionally something else but never for code or analysis. Can't wait till it can generate accurate technical diagrams from text.

o3-mini-high and grok-3: code or analysis that I don't want to wait for o1-pro to complete.

claude 3.7: occasionally for code if the other models are making lots of errors. Sometimes models will anchor to outdated information in spite of being informed of newer information.

gemini models: occasionally I test to see if they are competitive, so far not really, though I sense they are good at certain things. Excited to try 2.5 Deep Research more, as it seems promising.

Perplexity: discontinued subscription once the search functionality in other models improved.

I'm really looking forward to o3-pro. Let's hope it's available soon as there are some things I'm working on that are on hold waiting for it.


For any given thing or category of thing, a tiny minority of the human population will be enthusiasts of that thing, but those enthusiasts will have an outsize effect in determining everyone else's taste for that thing. For example, very few people have any real interest in driving a car at 200 MPH, but Ferraris, Lamborghinis and Porsches are widely understood as desirable cars, because the people who are into cars like those marques.

If you're designing a consumer-oriented web service like Netflix or Spotify or Instagram, you will probably add in some user analytics service, and use the insights from that analysis to inform future development. However, that analysis will aggregate its results over all your users, and won't pick out the enthusiasts, who will shape discourse and public opinion about your service. Consequently, your results will be dominated by people who don't really have an opinion, and just take whatever they're given.

Think about web browsers. The first popular browser was Netscape Navigator; then, Internet Explorer came onto the scene. Mozilla Firefox clawed back a fair chunk of market share, and then Google Chrome came along and ate everyone's lunch. In all of these changes, most of the userbase didn't really care what browser they were using: the change was driven by enthusiasts recommending the latest and greatest to their less-technically-inclined friends and family.

So if you develop your product by following your analytics, you'll inevitably converge on something that just shoves content into the faces of an indiscriminating userbase, because that's what the median user of any given service wants. (This isn't to say that most people are tasteless blobs; I think everyone is a connoisseur of something, it's just that for any given individual, that something probably isn't your product.) But who knows - maybe that really is the most profitable way to run a tech business.


The Vercel AI SDK abstracts over all LLMs, including locally running ones. It even handles file attachments well, which is something people are using more and more.

https://sdk.vercel.ai/docs/introduction

It uses zod for types and validation; I've loved using it to make my apps swap between models easily.
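For flavour, a small sketch of the zod + model-swapping pattern (the model names and prompt are just examples):

  import { generateObject } from "ai";
  import { openai } from "@ai-sdk/openai"; // swap providers by changing this import
  import { z } from "zod";

  const { object } = await generateObject({
    model: openai("gpt-4o"), // e.g. an Anthropic or local model instead
    schema: z.object({
      summary: z.string(),
      tags: z.array(z.string()),
    }),
    prompt: "Summarize this support ticket and suggest tags: ...",
  });

  console.log(object.summary, object.tags); // typed and validated by zod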


People are sticking up for LLMs here and that's cool.

I wonder, what if you did the opposite? Take a project of moderate complexity and convert it from code back to natural language using your favorite LLM. Does it provide you with a reasonable description of the behavior and requirements encoded in the source code, without losing the detail needed to recreate the program? Do you find the resulting natural language description easier to reason about?

I think there's a reason most of the vibe-coded applications we see people demonstrate are rather simple. There is a level of complexity and precision that is hard to manage. Sure, you can define it in plain English, but is the resulting description extensible, understandable, or more descriptive than a precise language? I think there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping.


If anyone here hasn't read Borges, I'd strongly recommend him. Pretty much everything he wrote was short, <20 pages, and so it's really easy to sit down and read one of his stories over a lunch break. The common recommendation would be to try out Tlön, Uqbar, Orbis Tertius and see if you like it. If so, it's part of Labyrinths, which is (in my opinion) his best collection of short stories. The best edition in English is probably Penguin's Collected Fictions.

Regarding the content of this interview:

>If you compiled an enormous dataset of everything Borges read, and combined it with an exquisitely sensitive record of every sensory experience he ever had, could you create a Borges LLM?

This is my Kantian way of thinking about epistemology, but I don't think that LLMs can create synthetic a priori knowledge. Such knowledge would be necessary to create Borges out of a world without Borges.

In this interview, Simon's view feels much more like the way Hume viewed people as mechanical "bundles of sensations" rather than possessing a transcendent "self". This led to his philosophical skepticism, which was (and still is I guess) a philosophical dead end for a lot of people. I think such epistemological skepticism is accurate when applied to machines, at least until some way of creating synthetic a priori knowledge is established (Kant did so with categories for humans, what would the LLM version of this be?)



Or, as the developer, you can play some silent audio in the background via an `<audio>` element: https://github.com/donbrae/onscreen-piano-keyboard/blob/main.... This will ensure the Web Audio API produces sound even with the ‘silent’ switch active.
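A minimal sketch of the trick (`silence.mp3` is a placeholder asset you would ship yourself):

  // Start looping a silent file on the first user gesture; while it plays,
  // audio output is routed through the media channel, which ignores the
  // ringer/silent switch.
  function unlockAudio(): void {
    const el = document.createElement("audio");
    el.src = "silence.mp3"; // placeholder: a short silent file
    el.loop = true;
    el.play().catch(() => {
      /* play() can still reject outside a user gesture */
    });
    document.removeEventListener("touchend", unlockAudio);
  }
  document.addEventListener("touchend", unlockAudio);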

If anyone wants to delve into machine learning, one of the best resources I have found is Stanford's “Probability for Computer Scientists” (https://www.youtube.com/watch?v=2MuDZIAzBMY&list=PLoROMvodv4...).

It delves into the theoretical underpinnings of probability theory and ML, IMO better than any other course I have seen. (Yeah, Andrew Ng is legendary, but his course demands some mathematical familiarity with linear algebra topics.)

And of course, for deep learning, 3b1b is great for getting some visual introduction (https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQ...).


> Am I right in assuming that this works only with local text files

One of the screenshots shows a .xlsx in the “Temporary Resources” area.

Also: I haven’t checked, but for a “Local-first” app, I would expect it to leverage Spotlight text importers from the OS, and run something like

  mdimport -t -d3 *file*
on files it can’t natively process.

Some other alternatives (a little more mature / feature-rich):

anythingllm https://github.com/Mintplex-Labs/anything-llm

openwebui https://github.com/open-webui/open-webui

lmstudio https://lmstudio.ai/


Until you hit scale, the database you're already using is fine. If that's Postgres, look up SELECT FOR UPDATE SKIP LOCKED. The major convenience here - aside from operational simplicity - is transactional task enqueueing.
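A sketch of the claim step with the node-postgres (`pg`) client; the jobs table and its columns are invented for illustration. Transactional enqueueing then falls out for free, since the INSERT into jobs simply rides along in whatever transaction performs the related business writes.

  import { Pool } from "pg";

  const pool = new Pool(); // connection settings come from PG* env vars

  // Claim one queued job, or return null if none are available.
  async function claimJob(): Promise<{ id: number; payload: unknown } | null> {
    const client = await pool.connect();
    try {
      await client.query("BEGIN");
      const { rows } = await client.query(
        `SELECT id, payload FROM jobs
          WHERE status = 'queued'
          ORDER BY id
          LIMIT 1
          FOR UPDATE SKIP LOCKED` // other workers skip rows we hold
      );
      if (rows.length > 0) {
        await client.query("UPDATE jobs SET status = 'running' WHERE id = $1", [rows[0].id]);
      }
      await client.query("COMMIT");
      return rows[0] ?? null;
    } catch (e) {
      await client.query("ROLLBACK");
      throw e;
    } finally {
      client.release();
    }
  }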

For hosted, SQS or Google Cloud Tasks. Google's approach is push-based (as opposed to pull-based) and is far and away easier to use than any other queueing system.


I always tell this whenever the topic of Kalman Filters comes up:

If you're learning the Kalman Filter in isolation, you're kind of learning it backwards and missing out on huge "aha" moments that the surrounding theory can unlock.

To truly understand the Kalman Filter, you need to study Least Squares (aka linear regression), then recursive Least Squares, then the Information Filter (which is a different formulation of the KF). Then you'll realize the KF is just recursive Least Squares reformulated in a way to prioritize efficiency in the update step.

This PDF gives a concise overview:

[1] http://ais.informatik.uni-freiburg.de/teaching/ws13/mapping/...
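To make the equivalence concrete, here is a scalar sketch: estimating a constant from noisy measurements. The recursive least-squares update below is exactly the Kalman measurement update for a static state (state-transition and measurement matrices both 1, no process noise).

  // Recursive least squares for a scalar constant x, observed as z = x + noise.
  function makeEstimator(x0: number, p0: number, r: number) {
    let x = x0; // current estimate
    let p = p0; // variance of the estimate
    return (z: number): number => {
      const k = p / (p + r); // gain: how much to trust the new measurement
      x = x + k * (z - x);   // innovation-weighted update
      p = (1 - k) * p;       // variance shrinks as data accumulates
      return x;
    };
  }

  // Noisy readings of a value near 5 converge toward 5.
  const update = makeEstimator(0, 100, 4);
  for (const z of [5.1, 4.8, 5.3, 4.9]) console.log(update(z));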


I tried Hoarder and I didn't like the way its list view works. I prefer the simplicity of the view provided by Linkding. I do find Hoarder's new auto-tagging with Ollama something I want to use, though, because I am lazy.

For reference, there are many options in the self-hosted bookmarking market. Besides Hoarder, these are the best-known:

Linkwarden (https://github.com/linkwarden/linkwarden)

Shaarli (https://github.com/shaarli/Shaarli)

LinkAce (https://www.linkace.org/)

Linkding (https://github.com/sissbruecker/linkding)

Wallabag (https://wallabag.org/)

Shiori (https://github.com/go-shiori/shiori)


I run a label that has direct deals with certain major DSPs. We do over a billion streams a year.

The entire “wellness” music category is programming driven. Much of my energy is spent building and maintaining relationships with the programmers, even with our direct deals. We take a reduced payout on the master side in return for preferential treatment on playlist positions.

I have an active roster of extremely talented producers. It’s a volume play. I’ve made tracks that I’m quite proud of in 90 minutes that have done 20+ million streams.

It’s a wild system but we’ve made it work. Not really a critique or an endorsement - just making a living making music.

Edit: fun fact, Sleep Sounds is generally the #1 streamed playlist on the entire Apple Music platform.


One of the classics and must-reads in music technology.

I read it over and over again when I was building: https://glicol.org/

One of the motivations for building Glicol is to quickly let more people understand sound synthesis and music programming in the browser.

Also recommended:

Designing Audio Effect Plugins in C++ by Will Pirkle

Audio Effects: Theory, Implementation and Application by Joshua Reiss, Andrew McPherson

And all the books by Julius O. Smith III https://ccrma.stanford.edu/~jos/filters/Book_Series_Overview...


This seems highly relevant: https://arxiv.org/abs/2406.01506

> In this paper, we study the two foundational questions in this area. First, how are categorical concepts, such as {'mammal', 'bird', 'reptile', 'fish'}, represented? Second, how are hierarchical relations between concepts encoded? For example, how is the fact that 'dog' is a kind of 'mammal' encoded? We show how to extend the linear representation hypothesis to answer these questions. We find a remarkably simple structure: simple categorical concepts are represented as simplices, hierarchically related concepts are orthogonal in a sense we make precise, and (in consequence) complex concepts are represented as polytopes constructed from direct sums of simplices, reflecting the hierarchical structure.

Basically, LLMs already partially encode information as semantic graphs internally.

With this in mind, it is less surprising that augmenting them with external knowledge graphs has a lower ROI.


Great resource! For those interested in learning the fundamentals of audio programming, I highly recommend starting with Rust.

The cpal library in Rust is excellent for developing cross-platform desktop applications. I'm currently maintaining this library:

https://github.com/chaosprint/asak

It's a cross-platform audio recording/playback CLI tool with a TUI. The source code is very simple to read. PRs are welcome, and I really hope Linux users can help test and review new ones :)

When developing Glicol(https://glicol.org), I documented my experience of "fighting" with real-time audio in the browser in this paper:

https://webaudioconf.com/_data/papers/pdf/2021/2021_8.pdf

Throughout the process, Paul Adenot's work was immensely helpful. I highly recommend his blog:

https://blog.paul.cx/post/profiling-firefox-real-time-media-...

I am currently writing a wasm audio module system, and hope to publish it here soon.


On mac/iOS, you get this using the AVAudioEngine API if you set voiceProcessingEnabled to true on the input node. It corrects for audio being played from all applications on the device.

I had the same problem until I realized that bookmarks are more or less pointless, especially without context or additional information.

Most of the time I used bookmarks either to "read it later" or to "look it up again". In both cases I often desperately tried to find a specific bookmark in a collection of unsorted crap. After organizing them into topics / folders / subfolders, it worked a bit better, but I still had the same problem: how to find the one piece of information I'm looking for in a whole bunch of unsorted content? And what if it has changed or become unavailable somehow (except via archive.org)?

These days I use a knowledge collection (flatnotes, but Notion, Obsidian or Logseq would work just fine). If a bookmark is only short content (let's say an interesting git solution or a quick fix for something), I immediately "extract" this piece of information into my knowledge base as well-organized markdown with headlines, including a reference to the URL. If it is a "read later" thing, I invest at least 2 minutes to write a short summary (2 sentences) of what the article is about and what I expect it to help me with.

This has proven to work MUCH better to organize "bookmarks" and informational content. Bookmarks are just for backup now.


RSS still leads to a lot of wasted time. Being in charge of your own feed does not mean it will remain lean and sweet.

The ideal of browsing your RSS feed in the morning, cup of coffee in hand, just like your father read the morning paper before getting on with his day, is mostly just a fantasy.

What happens in reality is that you more or less quickly amass a big list of blogs and sources that gets updated by the hour, and you end up either ignoring most of the entries or checking them multiple times a day, wasting your time sieving through a stream of countless new content.


I've been through this thought-process many times.

1. Google isn't working well any more.

2. Therefore bring humans back into the system of flagging good and bad pages.

3. But the internet is too big - so we have to distribute the workload.

4. Oh, a distributed trust-based system at scale... it's going to be game-able by people with a financial incentive.

5. Forget it.

---

Edit: it's probably worth adding that whoever can solve the underlying problem of trust on the internet -- as in, you're definitely a human, and supported by this system I will award you a level of trust -- could be the next Google. :)


So let's imagine this is how it went down:

1. The person impresses everyone during the interview process with their skills and potential.

2. Initially, the person doesn’t meet the expected productivity levels. They give various plausible reasons and commit to improving.

3. Time passes but there is no significant improvement in their output. Management invests time and resources in identifying possible support and interventions to help enhance their performance.

4. It’s eventually discovered that the person is only dedicating 2-3 hours per day to their role with your company, instead of the agreed 8 hours, because they are simultaneously pulling the same scam with two other companies.

In this situation, I don't think it would be rational for a manager to try and work out a suitable working relationship. The person has already shown they are dishonest and cannot be trusted. This is not about envy or anger, but about using past behaviour to predict future behaviour.


I’m curious to try it out. There seem to be many options to upload a document and ask stuff about it.

But, the holy grail is an LLM that can successfully work on a large corpus of documents and data like slack history, huge wiki installations and answer useful questions with proper references.

I tried a few, but they don’t really hit the mark. We need the usability of a simple search engine UI with private data sources.


I'd at least think that quadrature samples would be preferred, as they offer instantaneous phase information. I don't think there is anything to be gained by forcing the model to derive this information from the time-series data when the computation is so straightforward. Instead of a 48 kHz stream of samples you feed it a 24 kHz stream of I&Q samples; nothing to it.
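As a rough sketch of how cheap that conversion is (quadrature mixing, with a two-sample average standing in for a proper lowpass/decimation filter; `fc` is an assumed center frequency):

  // Convert a real-valued stream (e.g. 48 kHz) into an I/Q stream at half the
  // rate (e.g. 24 kHz). The two-sample average is a deliberately crude lowpass.
  function toIQ(x: Float32Array, fs: number, fc: number) {
    const n = Math.floor(x.length / 2);
    const i = new Float32Array(n);
    const q = new Float32Array(n);
    for (let k = 0; k < n; k++) {
      let si = 0, sq = 0;
      for (const m of [2 * k, 2 * k + 1]) {
        const ph = (2 * Math.PI * fc * m) / fs;
        si += x[m] * Math.cos(ph);  // in-phase: mix with cosine
        sq -= x[m] * Math.sin(ph);  // quadrature: mix with -sine
      }
      i[k] = si / 2;
      q[k] = sq / 2;
    }
    return { i, q, rate: fs / 2 };
  }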

I would draw an analogy here between NeRF and Gaussian Splatting -- like, OK, it's great that we can get there with a NN, but there's no reason to do that after you have figured out how to optimally compute the result you were after.

I also believe that granular synthesis is a deep well to draw from in this area of research.


Are you a musician? Have you ever used a DAW like Cubase or Pro Tools? If not, have you ever tried the FOSS (GPLv3) Audacity audio editor [1]? Waves and waveforms are colloquial terminology, so the terms are familiar to anyone in the industry as well as your average hobbyist.

Additionally, PCM [2] is at the heart of many of these tools, and is what is converted between digital and analog for real-world use cases.

This is literally how the ear works [3], so before arguing that this is the "worst possible representation of signal state," try listening to the sounds around you and think about how it is that you can perceive them.

[1] https://manual.audacityteam.org/man/audacity_waveform.html
[2] https://en.wikipedia.org/wiki/Pulse-code_modulation
[3] https://www.nidcd.nih.gov/health/how-do-we-hear

