renjimen's comments

There's a large difference between reading a book and reading social media posts or news articles: books require hours of concentration to consume, have long-spanning arcs of plot or other structure, and require significant use of our imagination (particularly fiction). You just don't get those three things from any other form of written media.

Signed in just to agree with this. Over the years, I have seen many people bring up some trite form of online reading in an attempt to raise it to the level of what we typically think of when referring to book reading. It's a stance I really can't take seriously, both as a concept and due to personal experience. "The C Programming Language" did not give me vivid dreams or boost my writing skills, general creativity, and vocabulary when I first read it. A mid-level sci-fi book from the 80s, when I haven't done any for-fun reading in a while, does. There is clearly a cognitive difference between the very act of consuming each of these. In comparison, social media posts or news articles are far closer on the broad spectrum of reading to the likes of road signs and subscriber agreements. The mechanical act of reading is simply not equivalent over the broad range of content that can be read, and so its effects on the mind are also not equivalent. Related? Sure, but only by the fundamental understanding of symbols and language. Apologies for the rant.

One of their primary motivations was understanding changes in literacy, so it makes sense they wouldn't include audiobooks (or podcasts).

They do include audiobooks

Ah, I misunderstood the parent then :)

Seems unintuitive to include audiobooks if they're interested in literacy though (literally their first motivation).


Outer Wilds is amazing if you like puzzles and/or sci fi.


Noted. Is on my definite list now :-)


Seconded. And play Half Life 1 and 2. They’re arguably the best single player FPS games of all time.


I watched a couple of Half-Life videos/movies and found the storyline engaging. I haven't tried playing the games, though. I guess the next step will be that, provided they support controllers, as I don't use a mouse.


But there’s an opportunity cost that needs to be factored in when waiting for a stronger signal.


One solution is to gradually move instances to your most likely solution.

But continue a percentage of A/B/n testing as well.

This allows you to balance speed vs. certainty.


Do you use any tool for this, or simply crank up the dial slightly each day?


There are multi-armed bandit algorithms for this. I don't know the names of the public tools.

This is especially useful for something where the value of the choice is front-loaded, like headlines.
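To make the idea concrete, here is a minimal Thompson-sampling sketch for Bernoulli rewards (pure stdlib; the three arms and their click rates are made-up numbers, not from any real tool):

```python
import random

def thompson_step(successes, failures):
    """Pick an arm by sampling each arm's Beta posterior (Thompson sampling).
    Beta(s+1, f+1) is the posterior under a uniform prior."""
    samples = [random.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda i: samples[i])

# Hypothetical headline test: arm 1 has the highest true click rate.
true_rates = [0.04, 0.06, 0.05]
successes = [0] * 3
failures = [0] * 3

random.seed(0)
for _ in range(20000):
    arm = thompson_step(successes, failures)
    if random.random() < true_rates[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

pulls = [s + f for s, f in zip(successes, failures)]
# Traffic concentrates on the best arm while the others keep getting explored,
# which is exactly the speed-vs-certainty trade-off described above.
```

Because exploration never fully stops, this behaves like a continuously re-weighted A/B/n test rather than a one-shot experiment.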


We've used this Python package to do this: https://github.com/bayesianbandits/bayesianbandits


There is, but you can decide that up front. There are tools that will show you how long it'll take to reach statistical significance. You can then decide if you want to wait that long or accept a softer p-value.
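Those calculators generally run a standard power calculation under the hood. A rough sketch of the two-proportion version, with z-values hardcoded for a two-sided alpha of 0.05 and 80% power (the baseline rate and lift below are illustrative):

```python
import math

def sample_size_per_arm(p_base, mde):
    """Approximate visitors needed per arm to detect an absolute lift of
    `mde` over baseline rate `p_base`, using the normal approximation to
    the two-proportion z-test (alpha=0.05 two-sided, power=0.80)."""
    z_alpha = 1.96  # two-sided alpha = 0.05
    z_beta = 0.84   # power = 0.80
    p2 = p_base + mde
    p_bar = (p_base + p2) / 2
    n = ((z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * math.sqrt(p_base * (1 - p_base) + p2 * (1 - p2))) ** 2) / mde ** 2
    return math.ceil(n)

# Detecting a 1-point lift on a 5% baseline needs roughly 8000 visitors
# per arm; doubling the detectable lift cuts that by about 4x.
```

Dividing the required sample size by your traffic rate gives the waiting time the parent comment mentions, which is exactly the number you weigh against a softer p-value.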


Even if you have to be honest with yourself about how much you care about being right, there’s still a place for balancing priorities. Two things can be true at once.

Sometimes someone just has to make imperfect decisions based on incomplete information, or make arbitrary judgment calls. And that’s totally fine… But it shouldn’t be confused with data-driven decisions.

The two kinds of decisions need to happen. They can both happen honestly.


You can self host an open-weights LLM. Some of the AI-powered IDEs are open source. It does take a little more work than just using VSCode + Copilot, but that's always been the case for FOSS.


An important note is that the models you can host at home (i.e. without buying a rig costing tens of thousands of dollars) won't be as effective as the proprietary models. A realistic size limit is around 32 billion parameters with quantisation, which will fit on a 24GB GPU or a sufficiently large MacBook Pro. These models are roughly on par with the original GPT-4 - that is, they will generate snippets, but they won't pull off the magic that Claude in an agentic IDE can. (There's the recent Devstral model, but that requires a specific harness, so I haven't tested it.)
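A back-of-the-envelope check of that size limit (the 20% overhead factor for KV cache and activations is an assumption; real usage varies with context length):

```python
def model_memory_gb(params_billion, bits_per_weight, overhead=1.2):
    """Rough VRAM estimate: parameter count times bytes per weight,
    plus an assumed ~20% for KV cache and activations."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

# 32B parameters at 4-bit quantisation: ~19 GB, inside a 24GB GPU.
# The same model at fp16: ~77 GB, i.e. multi-GPU territory.
```

The same arithmetic shows why a ~670B-parameter model like DeepSeek-R1 needs a multi-GPU node even when quantised.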

DeepSeek-R1 is on par with frontier proprietary models, but requires an 8xH100 node to run efficiently. You can use extreme quantisation and CPU offloading to run it on an enthusiast build, but you'll be closer to seconds-per-token territory.


The imbalance between content and consumers on the internet is huge and just getting larger with AI. My advice having just started creative writing: Don't publish on the internet. Share with family, friends and colleagues if you want. Heck, even share with an LLM. But, if even a small part of why you create is for internet points with random strangers, then you're not going to get as much meaning out of it and you'll end up disappointed (even when you do get some internet points).


The speed this can build makes me think software is soon to become a lot more fluid than our traditional iterative approach. Apps could ship minimal and build whatever else they need to at the user’s behest.


The challenge for LLMs over the next year is to get them to operate on large data sets/code bases with millions/billions of tokens through some kind of distributed hierarchical framework, with each LLM operating on a local set of 20k or whatever subset of tokens.


any reading?


I’m just a user, trying out the models first hand on a large project, learning as I go.


Xarray is great. It marries the best of Pandas with Numpy.

Indexing like `da.sel(x=some_x).isel(t=-1).mean(["y", "z"])` makes code so easy to write and understand.

Broadcasting is never ambiguous because dimension names are respected.

It's very good for geospatial data, allowing you to work in multiple CRSs with the same underlying data.

We also use it a lot for Bayesian modeling via Arviz [1], since it makes the extra dimensions you get from sampling your posterior easy to handle.

Finally, you can wrap many arrays into datasets, with common coordinates shared across the arrays. This allows you to select `ds.isel(t=-1)` across every array that has a time dimension.
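A minimal sketch of the selection patterns above (random data with made-up dimension names, assuming xarray and NumPy are installed):

```python
import numpy as np
import xarray as xr

# Hypothetical 4-D field with dimensions (t, x, y, z).
da = xr.DataArray(
    np.random.rand(3, 4, 5, 6),
    dims=["t", "x", "y", "z"],
    coords={"t": np.arange(3), "x": np.linspace(0.0, 1.0, 4)},
)

# Select by coordinate value, then by position, then reduce named dims:
result = da.sel(x=0.0).isel(t=-1).mean(["y", "z"])  # 0-d after the reductions

# Datasets share coordinates across variables, so one selection hits all of them:
ds = xr.Dataset({"temp": da, "pressure": da * 2})
last = ds.isel(t=-1)  # applied to every variable with a "t" dimension
```

Because reductions name their dimensions explicitly, there is no axis-number bookkeeping, which is what makes the broadcasting unambiguous.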

[1] https://www.arviz.org/en/latest/


As a data scientist, my main gripe with all these AI-centric IDEs is that they don’t provide data centric tools for exploring complex data structures inherent to data science. AI cannot tell me about my data, only my code.

I’ll be sticking with VSCode until:

- Notebooks are first class objects. I develop Python packages but notebooks are essential for scratch work in data centric workflows

- I can explore at least 2D data structures interactively (including parquet). The Data Wrangler in VSCode is great for this


I recently saw a framework that interacts directly with notebooks, but I forgot what it was. Every day there's a new thing.

