I have done something similar. I cheated by using the BigQuery dataset (which somehow keeps getting updated): export the data to Parquet, download it, and query it with DuckDB.
On the software side, building an OS (distribution) from scratch is a step above bare-metal programming[0].
It builds familiarity with the different things a kernel does, via programs/scripts that make use of the kernel.
Actually writing binary code for the kernel bits can be done under QEMU[1][2], i.e. you don't need to buy actual hardware, can use 'software probes' to view what's going on, etc.
You don't have to worry about crashing/trashing the box you're running on: you just crash the QEMU software and lose only what was done in that QEMU session, if you didn't export/save it to a location outside the session.
"Reading OpenBSD source code daily (blog.tintagel.pl)" from [hn: 3]: an automated way to review code.
Use a `notes/TODO.md` file to maintain a checklist of objectives between chats. You can have Claude update it.
Commit to version control often, for code you supervised that _does_ look good. Squash later.
This glitch often begins to happen around the time you'd be seeing "Start a new chat for better results - New chat" on the bottom right.
If you don't supervise, you will get snagged, and if you miss it and continue, it'll continue writing code under the assumption the deletion was fine: potentially losing the very coverage you'd hope to have gained.
If it does happen, try scrolling up to the point in the chat before it happened and use "Restore checkpoint".
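For the `notes/TODO.md` checklist mentioned above, here is a minimal sketch of how you could list what is still open before starting a new chat. It assumes the file uses GitHub-style task items (`- [ ]` / `- [x]`), which is my assumption and not something the tip specifies; `parse_tasks` and `open_tasks` are hypothetical helper names.

```python
import re
from pathlib import Path

# Assumed format: GitHub-style task items such as "- [ ] write tests" or "- [x] done".
TASK_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def parse_tasks(text):
    """Return a list of (done, description) tuples, one per checklist line."""
    tasks = []
    for line in text.splitlines():
        match = TASK_RE.match(line)
        if match:
            done = match.group(1).lower() == "x"
            tasks.append((done, match.group(2).strip()))
    return tasks


def open_tasks(text):
    """Return the descriptions of objectives that are still unchecked."""
    return [desc for done, desc in parse_tasks(text) if not done]


if __name__ == "__main__":
    todo = Path("notes/TODO.md")  # path taken from the tip above
    if todo.exists():
        for desc in open_tasks(todo.read_text(encoding="utf-8")):
            print("still open:", desc)
```

Pasting the "still open" lines at the top of a fresh chat is one way to carry the objectives over without relying on the old conversation.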
But malicious code can break the system like this:
wasmtime run --dir .::/ python.wasm -c 'open("python.wasm", "wb").write(b"blah")'
And now it fails with an error if you try to run it, because we overwrote python.wasm. Even if I move python.wasm out of the current directory, I'd still be able to break things by breaking those other lib files.
Although... I guess I could use unix filesystem permissions to make those read-only? That could work.
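A rough sketch of that read-only idea, done on the host before launching the sandbox. It assumes the runtime is started as a regular (non-root) user, since root ignores these permission bits, and `make_read_only` is a hypothetical helper name. It only clears write bits; it is not a claim about how wasmtime itself should be configured.

```python
import os
import stat
from pathlib import Path

# All three write bits: owner, group, other.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(root):
    """Clear the write bits on root and everything under it.

    Returns the list of paths whose mode was changed. Symlinks are skipped
    so that nothing outside the tree gets touched.
    """
    root = Path(root)
    changed = []
    for path in [root, *root.rglob("*")]:
        if path.is_symlink():
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & WRITE_BITS:
            os.chmod(path, mode & ~WRITE_BITS)
            changed.append(path)
    return changed
```

Something like `make_read_only("lib")` (the directory name is illustrative) would cover the lib files. Making the directories read-only as well matters, because otherwise files could still be deleted or replaced even if their contents cannot be rewritten in place.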
phi3:instruct does a good job with simple summarization. I haven’t tested it for grammar checking (I am focusing on automation and tool use), but it would likely work.
I want to make a Discord bot that impersonates all my friends and continues to refine the model as the conversations continue. Basically this [1] post, but with a more modern model and, ideally, reinforcement learning. Seems like this would fit the bill.... Is there anything else that would make this easier?
Karpathy has an excellent zero-to-hero series on the topic in which he explains the very core of neural networks, LLMs and the related concepts. With no background on the topic, I was able to get an idea of what this is all about and even become dangerous: https://karpathy.ai/zero-to-hero.html
There's something enlightening in hands-on learning without using metaphors. He even opens the code of production-grade tools to show you exactly how the concepts he explained and built up are actually implemented IRL.
This is a style of teaching that clicks with me. I don't learn well with metaphors and high abstractions. I find it magical to remove the magic from amazing things and bring them down to pieces that are easy to reason about, which compose into a complex structure, so you can treat the complexity as something separate from the core.
If you're serious about learning a Lisp but motivated more by, as you say, "having more fun" rather than, say, landing a six-figure job writing it professionally... then may I recommend Janet[0] for your consideration. Janet is an embeddably-small, yet surprisingly batteries-included Lisp implemented in pure C. In terms of syntax and core library it borrows more directly from Clojure than from Scheme, but all the modern Lisps have their bits of influence. I've found both the language and the tiny little community that exists around it delightful.
As an example of the latter, somebody smart wrote a real actual book[1] about Janet recently that was on the HN front page for a day or so when he first released it. It's a gentle introduction not just to Janet but to Lisp in general, and assumes only general proficiency with JavaScript to get you up to speed. I recommend it.
StackEdit[0] pretty much perfected what I needed out of a markdown editor - I just needed somewhere to write my tickets/docs that wasn't GitHub so that I could format them properly while writing. I still use it from time to time.
1. Use a good checkpoint. Vanilla stable diffusion is relatively bad. There are plenty of good ones on civitai. Here's mine: https://civitai.com/models/94176
2. Use a good negative prompt with good textual inversions. (e.g. "ng_deepnegative_v1_75t", "verybadimagenegative_v1.3", etc.; you can download those from civitai too) Even if you have a good checkpoint this is essential to get good results.
3. Use a better sampling method instead of the default one. (e.g. I like to use "DPM++ SDE Karras")
There are more tricks to get even better output (e.g. controlnet is amazing), but these are the basics.
I'm one of the builders of an open source project (https://buildwithfern.com/docs) to improve API codegen. We built Fern as an alternative to OpenAPI, but of course we're fully compatible with it.
We rewrote the code generators from scratch in the language that they generate code in (e.g., the python generator is written in python). We shied away from templating - it's easier but the generated code feels less human.
This paper is my favorite introduction to compilers. It's short and hands-on, and goes from compiling a primitive program that does nothing but return a single integer to a full-blown implementation of a real programming language in 24 small steps: http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf
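To give a feel for how small that first step is, here is a minimal sketch in Python (not the paper's own code) of a "compiler" for programs that are just a single integer. The `scheme_entry` label, the 32-bit range check and the AT&T assembly syntax are assumptions of this sketch, and `compile_program` is a hypothetical name.

```python
def compile_program(value):
    """Emit x86 assembly for a program whose only job is to return `value`.

    The emitted function puts the integer in %eax (the usual return
    register) and returns, so a small C driver could call it and print
    the result.
    """
    if not isinstance(value, int) or isinstance(value, bool):
        raise TypeError("this first step only compiles integer programs")
    if not -(2**31) <= value < 2**31:
        raise ValueError("integer does not fit in a 32-bit register")
    lines = [
        "    .text",
        "    .globl scheme_entry",  # assumed entry-point name
        "scheme_entry:",
        f"    movl ${value}, %eax",
        "    ret",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(compile_program(42), end="")
```

Every later step grows this same emitter a little (more data types, then primitives, then variables and so on), which is what makes the incremental approach easy to follow.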
Another solution that is surprisingly powerful is Logitech Media Server [0], which despite its name is open source and cross-platform. The server can run on any unix-ish machine, and clients can be any number of Raspberry Pis or ESP32s, or custom boxes, or any computer. Multiroom syncing is great. Works for local library and streaming. There is even a modern web interface available [1]. I looked into many of the other solutions in this thread, and LMS suited my needs best.
I use navidrome[0]; it's a music streaming server you can self-host and then use with any player that supports the Subsonic API for playback. I use the strawberry[1] music player on my desktop and substreamer[2] on Android. Navidrome can also scrobble your music to last.fm if you tell it to. The actual music files are mounted with rclone and --vfs-cache-mode full to a directory.
I also use Jellyfin along with an Android app called Symfonium. It has heaps of customisation and works with many media providers, including Emby, Plex, Subsonic, etc.
I use this in one project, but its bundle size is very large for what it does. I think Preact, with its React-compatible API, is usually a nicer option (and no, it doesn't require build tools and NPM if your IDE of choice is notepad).
It's 3KB. Even less if you opt to not have any React compatibility at all.
It also links to https://try.ruby-lang.org/ which has a series of quick tutorials to go through learning Ruby in the browser, without having to install anything on your computer.
I don't think "poorly documented" is a fair description of those resources.
For JS in general: anything from https://2ality.com - articles, books, walk-throughs, etc. Axel is thorough and systematically reconstructs complex topics from core building blocks.
For Cloudflare Workers specifically: first, learn regular Web Workers, then read everything from kentonv who's the tech lead for CF Workers, and a user here. He's written many (disjointed) pieces on Workers both here on HN and on the CF community board. Additionally, Cloudflare Developers Discord (https://discord.com/invite/cloudflaredev) has a very active channel for Workers with people discussing implementation details and edge cases.
I don’t know why hyperskill.org isn't getting any love on HN: zero videos, bite-sized lessons that get to the point, followed by questions and an assignment. There are also mini projects every 10 or so lessons, which eventually make up the final project. Their learning map is really awesome, mapping the topics to know before proceeding, and the integration with the JetBrains IDE is nice, so you can submit projects or questions directly.