It makes me deeply happy to hear success stories like this for a project that's moving in the opposite, and correct, direction to the rest of the world.
Engildification. Of which there should be more!
My soul was also satisfied by the Sleeping At Night post which, along with the recent "Lie Still in Bed" article, makes for very simple options to attempt to fix sleep (discipline) issues.
It's a function of scale: the larger the team/company behind the product, the greater its enshittification factor/potential.
The author recently went full time on their Marginalia search engine. AFAIK it's a team size of 1, so it's the farthest away from any enshittification risk. Au contraire, like you say: it's at these sizes that you make jewels. Where creativity, ingenuity and vision shine.
This comment is sponsored by the "quit your desk job and go work for yourself" gang.
This is something I tried doing 10-15 years ago, and while talent, drive, and a good message are all absolutely necessary, there's also some kind of catalyst needed to get the idea in front of an audience, and that's something I was never able to crack. So yeah, small teams bring a greater possibility of a good product, but there's a little bit of lottery ticket to the exercise, too.
Capital structure may be more important than size.
A bootstrapped company can resist enshittification indefinitely, but if it gets any investment, resisting it becomes harder and harder, up to the point where a publicly traded one can't resist it at all.
The only issue is motivation. With a team size of 1, people don't realize how much they depend on having just one other person there. Some people can do it. I think a lot are in for a surprise.
I wonder if there isn't an inherent survivorship bias that only makes it seem like most smaller projects are, as you call them, "jewels". The small bad ones don't make it to you, while the big ones are too big to fail, that sort of thing. You could extend that to big good ones seeming less surprising and more normal because "it just works".
Elon is running a company, not working a desk job for someone else. The point is following YOUR vision, either by yourself on your project, or by running your OWN company. The goal is intellectual and creative freedom.
Or don't, I guess the world needs its mindless drones, that discourage everybody else from taking the leap and following their dreams.
Fortunately, he's running multiple companies and he can't focus on all of them, so SpaceX can work fine, unlike the company formerly known as Twitter, which he is right now focused on running and ruining.
My impression is (i) Musk has a great team doing the work at SpaceX and (ii) those people have a clear, compelling mission they believe in. SpaceX may not need a lot of his attention.
I joke, but a lot of people just want to show up and do the bare minimum until quitting time. Disturbingly, many seem employed in customer-facing positions.
To be fair, if they’re paying minimum wage, the employer has the same attitude towards the employees.
Or you're working in companies that basically stop you from being productive.
Managers questioning developers' decisions, a focus on sales topics instead of fixing the existing product, and colleagues taking shortcuts that exponentially increase the technical debt are things I basically have to deal with every day.
I now have worked on multiple projects on the side, both alone and with others who know their stuff and I've never had any of those issues there.
That's why I'm currently in the process of starting a company with a few of those "good guys" to get out of this madness.
I believe we all suffer, in technology and society at large, from excess resources.
Too much processing power, too much memory and storage. Too many data centers, bandwidth. We lost all sense of traction and we're running wild trying to fill up all the extra space and burn up all the extra resources. Software is currently way more complicated than it needs to be, and it makes it overall excessively insecure as well.
I've said this before, and it's becoming official this fall: I'm getting off the wheel. I wish I had the competence of Marginalia's author to brave this storm. But my vessel is making water fast and I need to make landfall. Next month I embark on a new stint in academia, hoping to change direction for good.
Good luck! I went back to university research after building websites for 5 years and then being a lawyer for 10 years. It was 100% the right choice for me. I hope it is for you, too.
Best of luck to you. I moved in the opposite direction: the enshittification of higher ed hurt my heart too much.
There might be an argument that education has never been the primary responsibility of the academy; that its institutional responsibility is "learning", not "education." If your goals are primarily research oriented, there's still room to do that, especially in STEM.
It's shitty everywhere. My mistake was aiming primarily at a field, as I did when I entered IT. This time I'm not aiming to enter or stay in academia; I'm seeking knowledge and networking, and I'll keep my mind open to opportunities.
I can totally relate to the screen post. Mine is even 24" (after having messed around with 27" and 32"). I think it's something not talked about enough (i.e. manufacturers mostly produce crap monitors when it comes to small diameters, with the exception of LG/Apple and Dell AFAICT), and it deserves an extra post.
Do you just use a single screen? I've noticed a trend that people seem to be moving to one giant screen (and now apparently one small one), instead of dual monitors. I don't think it should be forgotten what an enormous productivity boost multiple monitors are, whatever the resolution.
> I don't think it should be forgotten what an enormous productivity boost multiple monitors are, whatever the resolution.
Or tiling window managers with virtual workspaces. When most applications I use are two keystrokes away, the need for more than one screen seriously lessens. Besides, looking at a screen on the side hurts my neck.
I'm using a 24" (4K, I wanted the DPI). I don't even use the laptop screen (either clamshell mode or I just turn it off). It's enough for having two windows side by side (docs + code). Anything else is a context switch, so I don't mind Alt|Command+tab or using virtual desktop (when I need to preserve the context).
>"I've noticed a trend that people seem to be moving to one giant screen"
I've had 3 monitors at one point on my main dev box, of which 2 were 32" 4K. It became tiring very fast. I went to a single one. First it was 40" 4K, but it proved to be too much neck bending to look at. So I've finally settled on a single 32" 4K and am happy (I use it at 100% scaling). Smaller monitors - no thanks, not my cup of tea. I know it is possible, and I used to program on tiny ones in my young days, but fuck it. Do not want to go back.
I tried multiple screens but realized I always tend to use just one of them and the others can be replaced by virtual desktops.
In terms of size I use 28" and that works well for me; 14" on the laptop is fine too but feels a bit limited now and then. In the long term I'll probably go for something in between, with an aspect ratio closer to 4:3.
MBP + 24", where the monitor is from Apple/LG and a pretty good match to the MBP's screen in terms of color, dot pitch, and glare. I've tried but a giant screen doesn't cut it for me; I'm just messing it up with hundreds of windows, have to move the mouse and the head a lot, etc.
Yeah, I'm not arguing for humongous screens. It just seems like window managers aren't up to scratch for that much real estate, and the traditional "two full-screen windows side by side" setup enabled by dual monitors is simpler. I am just skeptical that a single screen (especially 1080p, as in the article) could possibly be the most productive environment for a programmer - that goes against what everyone experienced when multiple monitors started to become the norm, albeit with worse resolutions back then, I guess.
They do, and PowerToys also has some additional functionality in FancyZones, but it's still not perfect. It's hard to express "I want exactly this window alongside exactly this other window" purely via the keyboard, it's hard to cycle between multiple layouts, it's hard to launch a new window and have it appear exactly where you want, etc. I'm not very interested in VR personally, but I am hopeful in the long term that eye tracking, gestures, and voice control make interacting with computers spatially a lot easier, even just with the already-quite-good 2D paradigm we've got.
I've never really seen it as multi-tasking, just that a lot of knowledge work involves inputs and outputs (edit web page/ previous web page, edit code/view test results, read requirements/write spec etc). Those can often be separate windows and being able to focus on either at will works well for me.
I guess that’s where the ergonomics kick in and break it for me. I gotta mention that I wear glasses, so everything not in the center of my vision is basically blurred. Which means lots of head turning with dual or wide screens.
I found something with an effective resolution of about 2560x1440 to be a sweet spot, where I can fit windows needed for a single task in one screen. Provided those windows don’t waste too much space with unnecessary stuff of course.
Gotcha. I must admit on Linux especially where I just have Ctrl+1, Ctrl+2, Ctrl+3 mapped to virtual desktops, I'm not unhappy on a single screen. I'm just aware that 20 years ago when it started to become more common, it was one of the few completely black and white productivity gains you could get as a programmer (alongside solid state disks). Seems odd if that doesn't hold in general today.
I quickly realized I absolutely hate setups with multiple monitors, when I was given one at work. I quickly gave back my second monitor, because it just made it worse for me, and additionally wasted space on my desk. If you want to improve my circumstances at work, give me a bigger monitor, not more of them.
I've never used multiple monitors. I've also never talked about that fact, because why would I? Only people who use multiple monitors have reason to talk about them, so discussion is biased in favor of seeming like people use them.
(I've never even tried multiple monitors. I could immediately think of several reasons why they would not be an improvement for me, so made a quick decision in 1998 and moved on.)
It's interesting because traditionally "gold plating" can have connotations of overdoing something, making something slow and expensive that could have been simpler.
That would be how outsiders see those who don't shirk the extra work that goes into building their own infrastructure to escape the merchants of convenience.
When I stay up late and go on a walk after morning coffee, I just walk like a zombie and then fall asleep for the rest of the day. I think it is rationalization; maybe something else changed for the better, which makes you both want to go for a walk and also sleep better.
I'm wondering if humans are mostly incapable of producing great things without (artificial) restrictions.
In this case, Marginalia is (ridiculously) efficient because Victor (the creator) is intentionally restricting what hardware it runs on and how much RAM it has.
If he just caved in and added another 32GiB, it would work for a while, but the inefficient design would persist and the problem would just show its head later; by then there would be more complexity around that design and it might not be as easy to fix.
If the original thesis is correct, then I think it explains why most software is so bad (bloated, slow, buggy) nowadays. It's because very few individual pieces of software nowadays are hitting any limits (in isolation). So each individual piece is terribly inefficient, but with the latest M2 Pro and a gigabit connection you can just keep ahead of the curve where it becomes a problem.
Anyway, this turned into a rant, but the conclusion might be to limit yourself, and you (and everyone else) will be better off long term.
For most applications it simply does not make any sense to spend this much time on relatively small optimizations. If you can choose to either buy 32GiB of RAM for your server for less than $50 or spend probably over 40 hours of developer time at at least $20 / hour, it is quite obvious which one makes more sense from a business perspective. Not to mention that the website was offline for an entire week - that alone would've killed most businesses!
A lot of tech people really like doing such deep dives and would happily spend years micro-optimizing even the most trivial code, but endless "yak shaving" isn't going to pay any bills. When the code runs on a trivial number of machines, it probably just isn't worth it. Not to mention that such optimizations often end up in code which is more difficult to maintain.
In my opinion, a lot of "software bloat" we see these days for apps running on user machines comes from a mismatch between the developer machine and the user machine. The developer is often equipped with a high-end workstation as they simply need those resources to do their job, but they end up using the same machine to do basic testing. On the other hand, the user is running it on a five-year-old machine which was at best mid-range when they bought it.
You can't really sell "we can save 150MB of memory" to your manager, but you can sell "saving 150MB of memory will make our app's performance go from terrible to borderline for 10% of users".
What if runtime performance and developer performance aren’t inversely proportional?
Might it just be that, to a certain degree, we're not actually getting any business efficiency from creating bloated and slow software?
A lot of things, especially in business IT, are built on top of outdated and misleading assumptions and are leaning on patterns and norms touted as best practices.
We sometimes get trapped in this belief that any form of performance improvement somehow costs us something. What if it’s baggage that we didn’t need in the first place?
I say this having worked in the software business for coming on 15 years: I don't think an organization where business is calling the shots would be capable of building an Internet search engine.
The entire project is a fractal of this type of business-inscrutable engineering, and in any organization where engineers aren't calling the shots, that engineering isn't going to get done, and the project is going to be hideously slow and expensive as a result.
In a parallel universe where I had gone the classic startup route and started out using the biggest off-the-shelf pieces I could find, gluing them together with Python (instead of building a bespoke index); then thrown VC money at hardware when it started struggling; even more when it struggled again; I'd be absolutely dumbfounded when it yet again hit a brick wall and my hardware cost is tens of thousands every month (as opposed to $100/mo now).
Since I've instead built the solution from scratch, I've also built a deep understanding of both the domain and the solution, and when I'm faced with scaling problems, they're solvable in software rather than hardware. I can just change how the software works until it does. It's a slower route, but it's also able to take you places where conventional-wisdom-driven-development does not.
> We sometimes get trapped in this belief that any form of performance improvement somehow costs us something
But it does always cost you something. Developer time isn't free, after all.
If we only cared about performance, we would be handwriting SIMD intrinsics for baremetal applications. But we don't, because it is easily worth a 20% performance penalty to write code in a modern programming language. We're willing to trade 10% performance for a well-tested framework and library ecosystem which greatly reduces development time. Nobody cares how efficient your application is when it never ships.
Even "bloated" and "slow" do not always mean the same thing. Just look at something like database indexes: they take up space (bloated), but make your application faster. Often that's a worthwhile tradeoff, but creating indexes for literally everything isn't a good idea either. It is all about finding the right balance.
I do agree that a lot of user-facing applications have gone way too far, though. Even completely trivial Android apps are 150MB+ binaries these days, and the widespread use of memory-hogging Electron tools is a bit worrying. When your app runs on millions of devices, you should care about resource usage!
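To make the index trade-off above concrete, here is a minimal sketch using Python's built-in sqlite3 with a made-up table; the index costs extra storage, but the same filtered query stops being a full table scan.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, payload TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        ((i % 1000, "x" * 50) for i in range(100_000)),
    )

    # Without an index: SQLite reports a full table scan.
    print(conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42"
    ).fetchall())

    # With an index: extra bytes on disk, but the lookup becomes an index search.
    conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
    print(conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42"
    ).fetchall())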
> We're willing to trade 10% performance for a well-tested framework and library ecosystem which greatly reduces development time. Nobody cares how efficient your application is when it never ships.
I think this is the sticking point. People assert without any real evidence that whatever framework greatly reduces development time, and if that were the tradeoff, it might make sense. e.g. Rails and Laravel bill themselves this way.
Meanwhile, I've found that a more barebones framework in Scala is more productive to develop with, and also gets at least 100x the performance (e.g. a request rate in the 10s of thousands/second is easy to do on laptop hardware), which also makes it operationally easier since now you don't need a giant cluster.
> But it does always cost you something. Developer time isn't free, after all.
According to the rest of the comment we seem to largely agree. But this statement is what I want to challenge a bit.
Basically it comes down to this: We're doing plenty of unnecessary stuff all the time, especially in business IT (web applications, CRUD, CRM, CMS, reporting etc.) software.
Your Electron example points to the right direction, but there are also a lot of practices regarding code organization (patterns, paradigms), being too liberal with pulling down dependencies, framework bloat etc.
Simply getting rid of things that we don't need and thinking about performance in very rough terms gets us _both_ better performance and minimal code. I would wager that this isn't a tradeoff as you describe. We might, especially in the mid-to-long term, actually gain developer time.
This often means thinking about DB access patterns, indexing and so on as you mentioned. Meaning we think in SQL and lean on the capabilities and heuristics of our DB. What does the DB actually need to do when I do these queries? How can I model the data in a way that gives me just enough flexibility with reasonable performance? Which parts of the system needs to know what? How does the data need to flow through it?
All that stuff we sometimes put on top (ORMs, OO patterns etc.) can get in the way of that. Does it really make us more productive to put these abstractions on top? Do we gain anything from doing these OO incantations?
The article in question is a really good example of removing complexity and drastically increasing performance at the same time.
I have a good example as well.
Our internal image optimization module, which we use to generate differently sized images and formats in order to accommodate browser capabilities and screen sizes, was getting noticeably slow, especially for e-commerce related sites and brochures that typically feature a gallery per product/service.
Long story short: it got 50-60x faster simply by removing a convenience layer, writing SQL directly, processing the images in grouped batches and so on. AKA all just low-hanging fruit. The end result is also simpler. It took work/time, but it didn't need to be that slow in the first place.
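For flavor, a hedged sketch of that kind of change (the schema, table, and the resize_variants step are made up, not our actual code): fetch pending work with one query per batch and commit once per batch, instead of one ORM round-trip per image.

    import sqlite3

    def resize_variants(path):
        """Placeholder for the actual image processing (hypothetical)."""

    def optimize_pending(conn, batch_size=100):
        """Process images in grouped batches: one SELECT and one commit per batch."""
        while True:
            batch = conn.execute(
                "SELECT id, path FROM images WHERE optimized = 0 ORDER BY id LIMIT ?",
                (batch_size,),
            ).fetchall()
            if not batch:
                break
            for image_id, path in batch:
                resize_variants(path)
            conn.executemany(
                "UPDATE images SET optimized = 1 WHERE id = ?",
                [(image_id,) for image_id, _ in batch],
            )
            conn.commit()  # one commit per batch rather than one per image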
And we are our own users too! We have to maintain, extend, test our code. Faster code gives us faster iteration cycles. Fast feedback loops lead to much higher productivity and flow.
> In my opinion, a lot of "software bloat" we see these days for apps running on user machines comes from a mismatch between the developer machine and the user machine. The developer is often equipped with a high-end workstation as they simply need those resources to do their job, but they end up using the same machine to do basic testing.
Incidentally, I think the reason they need those specs is the same: the people building the dev tools all have top-end hardware, and what's fast enough for them is good enough to ship. I don't think the people building the dev tools at Meta, or Apple, or Google are seriously considering the use case of a developer working on an old dual-core 8 GB machine, but that's the reality in large parts of the world.
On the other hand, development tools simply require a lot of resources: a debug build is always going to be more demanding than a production build, and a full recompilation of something like Firefox is never going to be fast.
I reckon a lot of developers are more than happy to sacrifice 8GB of RAM for near-instant advanced autocomplete and typechecking, but we should keep in mind that it should not become a requirement to do basic development.
Oh, I agree that this is the cause, but no thanks.
I will keep using the heavyweight development tools that run through all of my code discovering problems, analyzing its quality, turning it into more performant code, deducing all kinds of new code to complete my high-level view, and providing all the modern benefits that use my computer to replace my work.
I can get behind testing the resulting executable on a low-power machine. But not on using it for everything.
So I guess that is the point the GP is making. From a practical perspective one would just spend $50 on RAM and forget about it. But then you miss the opportunity to make something great, in terms of algorithm improvements for example, even if it costs you more.
So here the artificial constraint is "you can't have more RAM", and you need to find other, more creative solutions.
But OP could have more RAM, and the end user doesn't care about how clever the algorithm is. They only care that the service is offline for a week while the developer is having fun implementing their new toy algorithm.
Working on improvements like this makes a lot of sense in academia, when you are running thousands of servers, or when you need so much memory that you can't buy a big enough server. It's a nice backlog item for when there is literally nothing else to do.
I have definitely fallen into this rabbit hole myself. Solving difficult problems with clever solutions is a lot of fun! But my manager - rightfully - chastised me for it because from a business PoV it was essentially a waste of time and money. Fine for a hobby project, not so much when you are trying to run a business.
> Working on improvements like this makes a lot of sense in academia, when you are running thousands of servers, or when you need so much memory that you can't buy a big enough server.
It also makes sense for all desktop and mobile apps, per gnyman’s point on not hitting individual limits. Also for anything that other people will be running, like libraries. Because your thing might perform marvelously in isolation on a beefy dev machine and still run like ass on every user’s somewhat underpowered computer with ten other apps competing for RAM (hello Slack).
See also: “craptop duty”[1]; how the 512K Mac reduced the competitive advantage of MS’s ruthlessly-optimized office apps but the MultiFinder, né Switcher, restored it[2].
Thanks for the links, interesting story about switcher.
And yes, hearty recommendation on the craptop.
I was discussing this with a colleague a while back when he said he had to get one, as he was unable to reproduce a problem on his M2 even with the 6x CPU throttling you can do in Chrome. On the old laptop he dug out of storage he managed to reproduce it instantly.
I need to check with him how easily he could fix it. One argument I have for why some optimisation is always good is that it's easier to fix things as you go along versus having to go back and try to make things better. I mean, if there is one thing making it slow then it's easy, but I have a feeling it's often a mix of things, and at that point you might not be able to replace the slow parts as everything else depends on them.
(I might have actually linked the wrong Switcher story. The one I remembered explained why its existence was strategic for Microsoft, but control over it was not: multitasking made memory optimization matter on a larger machine.)
I intentionally underestimated the cost to provide a lower bound.
In reality the listed changes once you consider all team members are likely going to run closer to 80-120 hours at $150 / hour - but that only confirms the final conclusion.
Yeah this aligns with my view. Limitations breed ingenuity, and that isn't limited to demo scene outputs. You're going to run into scaling problems sooner or later, and they're a lot easier to deal with early than late. If your software runs well on a raspberry pi[1], it's going to be absurdly performant on a real server.
It's actually how we used to build software. It's why we could have an entire operating system perform well on a machine like a Pentium 1 with most of what you'd expect today, etc. while at the same time we have web pages that struggle to scroll smoothly on a smartphone with literally a thousand times more resources across all axes. The Word 95 team were constantly faced with limits and performance tradeoffs, and it very clearly worked or did not.
If I had just gone and added more RAM (or whatever), I would still have been stuck with an inferior design, and soon enough I would need to buy even more RAM. The crazy part about this change is that it isn't just reducing the resource utilization, it's actually making the system more capable, and faster because free RAM means more disk caching.
I think it’s why old computers felt good and also why old games were so good.
Maybe it has something to do with the complexity of the systems we deal with.
When you have a restricted amount of some resource (RAM, physical space, food, materials, time, money …) you have to plan how you will use it. You are forced to be smart.
When you have a virtually infinite resource, you can make whatever you feel like making, but you don't really have to care about the final state; you just start and you'll see when it works.
I’m not exactly a true gamer, but I’ve always been amazed by the fact humans were capable to store so much emotion, adventures and time to enjoy in the good old cartridges with some kb/mb of rom. I mean, Ocarina Of Time rom is just the size of the last 8 photos I took with my iPhone.
The guy who made VirtualDub (virtualdub.org) has a blog where he said essentially that. His video program is super small because it doesn't use packaged libraries; he programmed everything against hardware / OS interfaces directly.
American Airlines ran SABRE, a sizeable airline ticketing and reservation system, in the mid-1970s on two System/360 mainframes that could only process a few tens of millions of instructions per second.
A Raspberry Pi 2 can do over 4 billion Dhrystone instructions per second, and a Pi 4 over 10 billion per second.
Of course, by modern standards mid-1970s SABRE was pretty barebones for an airline's main system, but it's at least theoretically possible to run simplified systems for over 100 airlines simultaneously on a single Pi 2...
So yes, modern programs are very far from optimized. 1000x or 10,000x improvements are possible, less so for math-heavy stuff.
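Back-of-envelope, using the rough figures above (all of these numbers are assumptions, not measurements):

    sabre_mips = 2 * 20           # two S/360-class machines, a few tens of MIPS combined
    pi2_mips = 4_000              # ~4 billion Dhrystone instructions/s on a Pi 2
    print(pi2_mips / sabre_mips)  # ~100 simplified SABRE-scale workloads, ignoring I/O and memory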
I think Microsoft has a huge problem with this. Even $3,000 laptops from 5 years ago struggle with running a Teams call, some Office instances and a browser with 30 tabs at the same time without slowing down to unacceptable levels.
They test stuff individually and running one thing alone is fine, but that's not what people do.
I'd imagine that artificial limits in the form of run time on well-defined hardware that are only raised after an explicit decision could be the solution to this.
But then again I only write business software where the performance aspect comes down to "don't do stupid shit with the database and don't worry about the rest because the client won't pay for those worries", so I might be on the wrong track entirely.
- relying on a high-end monitor to provide sufficient color contrast
- loading an unreasonable amount of resources only to provide laggy animations
...I'm wondering if the responsible designer(s) only have 32-in Retina displays and the latest Macbooks to work with. Because on any other combination of devices, the website looks and feels awful.
And I know this because I was formerly guilty of it!
> I'm wondering if the responsible designer(s) only have 32-in Retina displays and the latest Macbooks to work with. Because on any other combination of devices, the website looks and feels awful.
I think often it’s that they aren’t users themselves. They make it “pretty” but not functional.
It's extremely rare that UX designers are also users of the products they develop. Maybe they use Figma or Amazon's order entry even if they don't work for Figma or Amazon, but which of them is going to use the order entry form for Random Customer N again after they've started working on the registration form of Random Customer N+1?
My GF is a designer and she always says that the problem is that nobody tests anything. I've been helping with a project for one of her clients, and pretty much everything was done on the fly.
I was able to spot obvious flaws in the design, she agreed but said that she had no time and such is life.
There are so many tools to see how a website looks on multiple screens / devices. Even full-on emulation. I can see a designer making this oversight, but a UX designer doing that kind of makes the UX part of the title irrelevant.
> I'm wondering if humans are mostly incapable of producing great things without (artifical) restrictions.
On the one hand, I agree with this. I think an awful lot of great art and great work comes from the enforced genius of operating within constraints, and there's a profound feeling that comes from recognizing that kind of brilliance.
I'll also say, though, that there's also something about seeing the results of absolutely turning every knob to 11 - about seeing the absolute unfettered apex of what particularly talented people can actually do with no constraints whatsoever. It's a very different experience, and I deeply respect the genius of making art under constraint, but sometimes you've just gotta put a dude on the moon, you know?
> I'm wondering if humans are mostly incapable of producing great things without (artifical) restrictions.
Here is an opinion to support your hypothesis from a couple of different domains: poetry and music.
While some people prefer free verse and avant-garde music, what stays most in my mind, and what seems to endure longest overall, are poetry with regular rhyme and meter and music that follows standard patterns of melody, rhythm, and harmony. Having to force their creativity into those sometimes rigid frameworks seems to enable many artists to produce better works.
Maybe a counterexample to it, not sure if it can be applied:
I write software in ABAP, which is a weird and ridiculously complicated language that has inline SQL and type checking against the database and has never had a major version that breaks old stuff, so code from 30 years ago will (and does) still run.
I used to have fun working around its quirks and finding solutions that work within the limitations, but now I'm just frustrated by having to solve problems that haven't existed for the past 15 years in the rest of the world, and by looking at terrible code that can't be made any nicer because of those limitations or because customers don't give a shit as long as it works most of the time.
There are definitely some artists who feel that way. Jack White comes to mind. He'll deliberately use restrictions (like writing music for only two instruments) and even physically obstruct things at live performances. See this (very good) interview with Conan: https://m.youtube.com/watch?t=890&v=AJgY9FtDLbs&feature=yout...
And as a developer or a team, you're bound by how long development takes, not by the required resources.
You won't be asked by a business stakeholder "oh, and how much RAM does it take?" or "why is it $2,000 a month instead of $1,000?". These questions tend to come much later when profit needs to be ironed out.
And later, when performance becomes important, it is often much harder to improve than early on. Especially with legacy db schemas with a lot of existing customer data.
Interesting observation, and it aligns with my experience of really enjoying small, focused tools and apps. This website is a good example.
Further, it feels like there's a corollary here for companies, where financially constrained companies that are smaller and more focused provide a better customer experience than cash-flush competitors.
> I'm wondering if humans are mostly incapable of producing great things without (artifical) restrictions.
I think the real issue is that there isn't a programming language that produces a compiler error if the given code can exceed a maximum specified latency.
Even working on a program with soft-realtime scheduling, I've had to constantly push back against patches that introduce some obscure convenience without having measured worst case latency.
The problem is so bad I doubt most people realize it's there. I don't know what the answer is, but I have the feeling there's an intersection with timing attacks on software/hardware. Some kind of tooling that makes both worst-case times and variance as visible as the computed CSS in DevTools would probably help. Added to some kind of static analysis, it would perhaps let devs hack their way to decently responsive interfaces and services.
It would probably help if the whole "developer spec" thing would go away. I never understood why people think they need 32GB of RAM and top of the line CPU to write code. If you're compiling a lot (especially C++ I guess) then you need a build server. I wonder how much better things would be if "developer spec" actually meant something close to median or representative spec.
I completely agree. Every creative I've ever trusted has the same philosophy: freedom through constraints. I've found in my life, too, I can focus more closely on elegant solutions when they become (perhaps artificially) necessary, not merely aesthetically pleasing. I'm actually having a similar experience of insane efficiency improvements in a personal project, much smaller in scope, that came down to using bit operations and as-branchless-as-possible methods for an Arduino Nano.
Without getting into efficiency and priorities, I think it's easy enough to claim that a great way to spark creativity or great solutions is imposing constraints.
It's about specialization around the use of a few elements to achieve a goal, versus a paradox of choice or falling back on common, known patterns.
Jams can be great for this and people realize they can work so much more efficiently and focus on the core of their idea.
I very often use the high-speed 3G throttling option in the network tab when developing web UIs, to give myself some serious constraints instead of assuming everyone is using a developer workstation.
Oh, thank you. I have been doing a hobby project on search engines, and I kept searching for variations of "Magnolia" for some reason; "Marginalia", at least for me, is hard to remember. Currently, I am trying to find my way around Searx.
Does Marginalia support "time filters" for search, like past day, past week etc.? According to the special keywords, the only search params accepted are based on years:
year>2005 (beta) The document was ostensibly published in or after 2005
year=2005 (beta) The document was ostensibly published in 2005
year<2005 (beta) The document was ostensibly published in or before 2005
The search index isn't updated more than once every month, so no such filters. The year-filter is pretty rough too. It's very hard to accurately date most webpages.
> In brief, every time an SSD updates a single byte anywhere on disk, it needs to erase and re-write that entire page.
Is that actually true for SSDs? For raw flash it’s not, provided you are overwriting “empty” all-ones values or otherwise only changing 1s to 0s. Writing is orders of magnitude slower than reading, but still a couple orders of magnitude faster than erasing (resetting back to “empty”), and only erases count against your wear budget. It sounds like an own goal for an SSD controller to not take advantage of that, although if the actual guts of it are log-structured then I could imagine it not being able to.
>> In brief, every time an SSD updates a single byte anywhere on disk, it needs to erase and re-write that entire page.
> Is that actually true for SSDs? For raw flash it’s not, provided you are overwriting “empty” all-ones values or otherwise only changing 1s to 0s.
Maybe it depends. I wrote the driver for more than one popular flash chip (I don't remember which ones now, but that employer had a policy of never using components that were not mainstream and available from multiple suppliers), and all the chips I dealt with did reads and writes exclusively via fixed-size pages.
Since SSDs are collections of chips, I'd expect each chip on the SSD to only support fixed-size paged IO.
In this scenario I was basically rewriting the entire drive in a completely random order, which is the worst-case scenario for an SSD.
Normally the controller will use a whole bunch of tricks (e.g. overprovisioning, buffering and reordering of writes) to avoid this type of worst case pattern, but that only goes so far.
1. Writable Unit: The smallest unit you can write to in an SSD is a page.
2. Erasable Unit: The smallest unit you can erase in an SSD is a block, which consists of multiple pages.
So if a write operation impacts only 1 byte within a page, the SSD cannot erase just that byte. However, it does not need to erase the entire block either.
The SSD can perform a "read-modify-write" type of operation:
- Read the full page containing the byte that needs to change into the SSD's cache buffer.
- Modify just the byte that needs updating in the page cache.
- Erase a new empty block.
- Write the modified page from cache to the new block.
- Update the FTL mapping tables to point to the updated page in the new block.
So, a page does need to be rewritten even if just 1 byte changes. Whole-block erasure is avoided until many pages within it need to be modified.
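A toy sketch of that read-modify-write flow (the page size, geometry, and mapping scheme are all made up for illustration; real controllers are far more sophisticated):

    PAGE_SIZE = 4096          # assumed write unit
    PAGES_PER_BLOCK = 64      # assumed erase-unit geometry

    class ToyFTL:
        """Toy flash translation layer: logical pages are remapped onto physical pages."""

        def __init__(self, num_blocks):
            total = num_blocks * PAGES_PER_BLOCK
            self.phys = [bytearray([0xFF] * PAGE_SIZE) for _ in range(total)]
            self.free = list(range(total))  # erased pages available for new writes
            self.map = {}                   # logical page number -> physical page number

        def write_byte(self, logical_addr, value):
            lpage, offset = divmod(logical_addr, PAGE_SIZE)
            # 1. Read the affected page into a buffer (all 0xFF if never written).
            old = self.map.get(lpage)
            data = bytearray(self.phys[old]) if old is not None else bytearray([0xFF] * PAGE_SIZE)
            # 2. Modify just the one byte in the buffered copy.
            data[offset] = value
            # 3. Write the whole page to a fresh, already-erased physical page.
            new = self.free.pop()
            self.phys[new] = data
            # 4. Update the mapping; the old physical page is now stale and is only
            #    reclaimed when garbage collection erases its whole block later.
            self.map[lpage] = new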
Not precisely. The logical view of a page living at some address of flash is not the reality. Pages get moved around the physical device as writes happen. The drive itself maintains a map of what addresses are used for what purpose, their health and so on. It’s a sparse storage scheme.
There’s even maintenance ops and garbage collection that happens occasionally or on command (like a TRIM).
In reality a “write” to a non-full drive is:
1. Figure out which page the data goes to.
2. Figure out if there’s data there or not. Read / modify / write if needed.
3. Figure out where to write the data.
4. Write the data. It might not go back where it started. In fact it probably won’t because of wear leveling.
You’re right that the controller does a far more complex set of steps for performance. That’s why an empty / new drive performs better for a while (page cache aside) then literally slows down compared to a “full” drive that’s old, with no spare pages.
Source: I was chief engineer for a cache-coherent memory mapped flash accelerator. We let a user map the drive very very efficiently in user space Linux, but eventually caved to the “easier” programming model of just being another hard drive after a while.
Just a shout out to my boss at Mojeek, who presumably has had a very similar path to this (the post resonates a lot with past conversations). Mojeek started back in 2004, and for the most part it has been a single developer who built the bones of it, and with that, pretty much all of the IR and infrastructure.
Limitations of finance and hardware, making decisions about 32 vs 64 bit ids, sharding, speed of updating all sound very familiar.
Reminds me of Google way back when and their 'Google dance' that updated results once a month, nowadays it's a daily flux. It's all an evolution, and great to see Marginalia offering another view point into the web beyond big tech.
Lots of people treat optimization as some deep-black-magic thing[1], but most of the time it's actually easier than fixing a typical bug; all you have to do is treat excessive resource usage exactly as you would treat a bug.
I'm going to make an assertion: most bugs that you can easily reproduce don't require wizardry to fix. If you can poke at a bug, then you can usually categorize it. Even the rare bugs that reveal a design flaw tend to do so readily once you can reproduce it.
Software that nobody has taken a critical eye to performance on is like software with 100s of easily reproducible bugs that nobody has ever debugged. You can chip away at them for quite a while until you run into anything that is hard.
1: I think this attitude is a bit of a hold-out from when people would do things like set their branch targets so that the drum head would reach the target at the same time the CPU wanted the instruction, and when resources were so constrained that everything was hand-written assembly with global memory-locations having different semantics depending on the stage the program was in. In that case, really smart people had already taken a critical eye to performance, so you need to find things they haven't found yet. This is rarely true of modern code.
I agree in general, but I think bugs are a lot easier to track down with divide and conquer strategies. If you're able to reproduce the bug by sending request X to service Y, gradually shrink the test case down until you've found the culprit.
Optimization is often an architectural problem. Sure there are cases where you're copying a thing where you could recycle a buffer, but you run out of those fairly quickly, and a profiler will tell you what you need to know.
A lot of the big performance wins are in changing the entire data logistics, possibly eliminating significant portions of the flow until the code does what it needs to in as few steps as possible.
This is a good point; bugs are less often an architectural problem, and fixing architectural problems (bugs or otherwise) is more difficult than fixing localized problems.
I'd just like to put an opposite vote on the scale and say that SQLite is such an excellent default choice for this and you were right to do it this way. I can't really see why people argue for databases which have the "relationality" feature deliberately removed.
Maybe I'm old school, but I would just stuff that kind of data in a struct. It should be real, real small as a packed binary tree. Just mmap the file and you're there. If you're never writing to it, or only write to it via filesystem full-file overwrites, you can lay it out most conveniently in memory. We did a similar thing with the sparse matrix data for the Netflix prize back in the day - in a database it was 6 GB, in C structs it was ~600 MB.
Of course I do respect the encapsulation of the existing SQL query basis but there is sometimes a time for something even more compact.
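A minimal sketch of that approach (a hypothetical fixed-size record layout, not the actual Netflix-prize data): mmap the file and index into it with struct, so nothing is parsed or copied up front.

    import mmap
    import struct

    RECORD = struct.Struct("<Qf")  # hypothetical record: little-endian u64 id + f32 score

    def read_record(path, index):
        """Return the fixed-size record at `index` straight out of the mapped file."""
        with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return RECORD.unpack_from(mm, index * RECORD.size)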
Yeah tempting for sure. I tried to resist it on the basis that I already have a bunch of similar solutions in the system, and while they're fast and effective, they're also always a bit of a headache to maintain and debug; and this isn't something that is extremely space constrained.
I also like how they decided to mix in sqlite alongside the existing MariaDB database because it gets the job done, and "a foolish consistency is the hobgoblin of little minds".
"I wish I knew what happened, or how to replicate it. It’s involved more upfront design than I normally do, by necessity. I like feeling my way forward in general, but there are problems where the approach just doesn’t work"
Yes, immediate (or soon enough) gratification feels good... To me, and maybe it's because I am an old fart, this is the difference between programming and engineering.
I took a start script from 90 seconds to 30 seconds yesterday, by finding a poorly named timeout value. Now I'm working on a graceful fallback from itimer to alarm instead of outdated C directives.
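Not the actual code, but a sketch of what that kind of fallback can look like using Python's signal module, where both interfaces exist: prefer the sub-second setitimer, fall back to whole-second alarm.

    import signal

    def arm_timeout(seconds):
        """Deliver SIGALRM after `seconds`, using setitimer when available."""
        if hasattr(signal, "setitimer"):
            signal.setitimer(signal.ITIMER_REAL, seconds)  # sub-second resolution
        else:
            signal.alarm(max(1, int(round(seconds))))      # whole seconds only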
“So it’s a search engine. It’s perhaps not the greatest at finding what you already knew was there. Instead it is designed to help you find some things you didn’t even know you were looking for.”[0]
If all you want is the same results as Google I say you should just use Google. I'm not trying to compete with them, I give the results you won't find on the big G.
Google still kills for useful applications. I use marginalia when I have either A) something deeply obscure I want to research or B) when I just want to read some entertaining long-form text posts about something in a category that interests me.
Of course it is. There are many Russian sources in English on official government sites, twitter, whatever. And they represent a point of view of at least tens of millions of people, but probably more, invisible to the western audience.
I enjoyed reading this but I also fundamentally don't get it at a basic level like... why re-implement stuff that has already been done by entire teams? There are so many bigger and productionised search and retrieval systems. Why invest the human capital in doing it all again yourself? I just don't get it.
How does one learn new things if not first by understanding them, and then looking to evolve them?
Sure, we can shell out to libraries and to other people's work, but at some point, you will have to understand the thing that you've abstracted away if you want to evolve it.
Not really. The original usage of it was completely positive and it has evolved into something more ironic but not necessarily negative. I think the person you were replying to used it absolutely the way most people use the phrase.
It can be more or less translated to "I want to see the outcome of what they're doing" usually used after the person who is "cooking" is criticized for their methods.
Most of what exists doesn't work for my application. It either assumes an unbounded resource budget, or makes different priorities that don't scale by e.g. permitting arbitrary realtime updates.
I'm building stuff myself because it's the only way I'm aware of to run a search engine capable of indexing quarter of a billion documents on a PC.
I want to do my own version of something like this to have a personally curated search function. The "it's mine" factor is enticing, if it does something unexpected, then I know all the dependent, interacting parts so I can trace the problem and fix it.
But I'm a privacy and self-hosting nut, which is probably just another way of saying the same thing.
(I will probably never actually do it, but that doesn't stop it being on the list).
It's kind of funny really: many good projects are actually maintained by a small group of people. You genuinely cannot throw more people at a project and make it faster and better. A lot of the cool infra and protocol stuff that we use day to day is made by a handful of people at best.
Independent efforts are as important, if not more important, than larger corporate efforts, because independents have different priorities and are often driven by satisfying curiosity instead of some bottom line.
Read any articles about how amazing Google's search is lately? Me neither.
Funnily enough, this repetition encapsulates what is wrong with this reasoning. If you don't put time into learning the fundamentals, telling yourself there's no need to re-invent the wheel, you end up repeating old mistakes. Turns out computer systems are not as transparently obvious as a wheel...