SQLite is likely used more than all other database engines combined (sqlite.org)
131 points by ysabri on June 15, 2024 | 78 comments


In the past, I've used it as a file format for an online application. Users often wanted to download bits and bobs of data from the app, not to view it or edit it on their desktop, but more as backups or to share a particular configuration. Previously we'd used a json+zip pair, but the lack of an enforced schema became a problem. Switching over to a gzipped sqlite db with a "custom" file extension worked incredibly well.
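
A minimal sketch of what such a gzipped-SQLite export could look like in Python. The `settings` table, the function names and the `.myapp` extension are made up for illustration; the original app's schema isn't described.

```python
import gzip
import os
import shutil
import sqlite3
import tempfile


def export_config(settings: dict, out_path: str) -> None:
    """Write settings into a SQLite db, then gzip that db to out_path."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "export.db")
        conn = sqlite3.connect(db_path)
        # Unlike free-form JSON, the schema is enforced by SQLite itself.
        conn.execute(
            "CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        conn.executemany("INSERT INTO settings VALUES (?, ?)", settings.items())
        conn.commit()
        conn.close()
        with open(db_path, "rb") as src, gzip.open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)


def import_config(in_path: str) -> dict:
    """Gunzip the file to a temporary db and read the settings back."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "import.db")
        with gzip.open(in_path, "rb") as src, open(db_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        conn = sqlite3.connect(db_path)
        rows = conn.execute("SELECT key, value FROM settings").fetchall()
        conn.close()
        return dict(rows)
```

Usage would be something like `export_config({"theme": "dark"}, "backup.myapp")` followed by `import_config("backup.myapp")` on the receiving side.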


In terms of installations, sure.

In terms of total data, probably not.

In terms of important data, 100% not. Important data needs guarantees, and its privacy is second to its longevity.

Server side and centralized beats client side decentralized for all but sensitive consumer data.


You may be shocked to learn how much of the world's "important data" lives in Excel files.


A lot of the world’s important data is on paper.

As far as important data served from a relational database is concerned, MySQL most likely beats the hell out of everything in terms of popularity. The bulk of SQLite’s usage has to be various runtime caches, file data, and so on.


Postgresql superseded mysql ever since the maria fork/oracle acquisition.


The topic is not quality but popularity.


Yes. In terms of popularity (installs/time) psql superseded mysql. Although of course at the tipover point mysql may still have beaten psql in terms of total installations, I'm pretty sure even by that metric psql has already won.


Source? I’ve only been using PostgreSQL, but I believe most legacy systems are not going to rush to update, and the world consists of legacy systems and systems that will become legacy tomorrow.


I'm the source

We have been living in a psql era for longer than the mysql era lasted, so it has had enough time to catch up.

If you feel otherwise, it's probably that memory bias effect where distant events seem closer and we underestimate the passage of time.

There's an xkcd for that

https://xkcd.com/891/

"The death of mysql(2010) is closer to its birth(1995) than to the present (2025)" will become a true statement next year.


So it's essentially your gut feeling?

It's obviously going to be hard to determine a number for a statistic like "most popular", and there are always very different interpretations of the available numbers, ending up with vastly different rankings.

I.e. this one keeps Oracle DB at place 1, mysql at second: https://db-engines.com/en/ranking

It links to their method at the top of the statistics. (Oracle has basically a DB monopoly if you believe that ranking, as these are both owned by them)

Its dataset is probably flawed, but I think we can say

"Rumors of MySQL's death have been greatly exaggerated"


There are stats that claim MySQL is used more, but I did not link to them because I don’t know if they are trustworthy. You, meanwhile, just made a fact out of thin air.


It is unfortunate how far HN discussions have fallen. Not only are most (if not all) DB discussions completely ignoring MySQL, it is as if Oracle and MS SQL Server aren't a thing.


Fun fact: my (French) company audited the French navy and found out they share important nuclear submarine maintenance data as Excel files through email.


Former US military officer here. This does not surprise me at all. I wasn’t in subs but we absolutely did this for maintenance of other multimillion dollar assets.


Microsoft is in so many government offices around the world it's not even funny.


I’m certain MS is in all government offices almost globally, except that one municipality in Germany that keeps attempting a Linux transition every other year.


Not shocked, not a problem.

Easier to copy/replicate, packaged with a GUI, provider with a commercial relationship, programmatic support ( formulas, vba, COM, raw data access)

Would be closer to sqlite than server side though, but its capacity for replication and sharing makes it more adequate for sqlite.

Sqlite's lack of a GUI gives data a programmer bias; important data uses excel unless you are in a technocratic environment.


There are quite a few GUIs available, e.g. here's mine: https://www.timestored.com/qstudio/database/sqlite You raise an interesting idea. What if sqlite databases had also been .exe executables, to give at least a CLI? That would have been interesting, similar to redhead.


I'm not sure I buy the "important data" one: my Lightroom catalog, to me, constitutes incredibly important data. Losing it and just having the base files would destroy over a decade of work.

It also happens to be a SQLite3 database. And the same is true for a slew of other applications that (quite rightly) use SQLite databases as their file format.

You might be thinking of things like financial transactions, or medical records, but that's not the only kind of important data there is.


The number of society-wide problems that occur if your Lightroom database gets corrupted is zero.

The amount of hell that would be unleashed if the financial system's layers upon layers of database transactions got broken is impossible to comprehend.

So if you mean “important” as “necessary for society to function”, then no, your browser bookmark files, contact list, or the other two dozen things your laptop and phone use SQLite for are not important.


This is a false equivalence, as one user’s particular usage of SQLite is not comparable to all financial institutions’ databases.

Better to compare to the prospect of all smartphones being irreversibly corrupted at once.


Literally all smartphones dying at once is preferable because they are all backed up into heavy duty databases. You can restore every phone from the cloud in this thought experiment.

You cannot restore a data center from a pile of phones.


The goalposts keep moving here. You can just take any arbitrary definition of important and use it to exclude SQLite deployments.

It's all semantics, and also irrelevant to the original article anyway, since nowhere does it argue that SQLite holds the most data (important or otherwise).


I didn’t comment earlier so I can’t move a goalpost I never set down. I agree with the root comment that while there may be more instances of SQLite, the most important data is not in SQLite.

For what it’s worth, I have used any number of databases over the years and SQLite is very good for a number of things.

None of those things are the core infrastructure that stores your emails, money, and other must have, shared, high availability data.

There are different tools for different jobs, that’s fine.


How do you physically store it? An external disk?

I venture a guess that for each sqlite db holding important work data, you have dozens or hundreds of traditional db datasets, that are critical to your work. And yes, I mean your bank, but also messaging apps, online services.

If you back up your hard drive in the cloud at all, you are already depending on trad dbs for the very use case supposedly highlighting your dependence on sqlite.


+1

So are the contacts list, SMS messages, WhatsApp data, browser data, etc. These are separate installations of sqlite, hundreds per phone. And they ARE important, at least to me.


Interesting.

My contacts are stored in my sim chip or google (server), and they are also shared with many apps (server). Whatsapp msgs are backed up to google drive (server), and are themselves stored on whatsapp's servers.

Browser data is sqlite, yes, but it is also flushed every couple of months, and is by design ephemeral.


How would you go about measuring that, or determining what's important?

Huge SQL for servers is just a different use case.

Someone's temporary browser data or settings for my android alarm app could be just as important and sensitive as an average record on those big servers.

Fortunately, SQLite fits the role of a small transactional engine very well to support those small use cases for a large number of applications. It deserves the good reputation.


We don't need to measure everything, we may talk about variables that are hard or subjective to measure.


Right. In terms of installations, I bet textfiles still win. Doesn't mean much, does it?


Gross theoretical error. Text files are an (ambiguous and undefined) file format, while sqlite is an application.


> In terms of total data, probably not.

> In terms of important data, 100% not.

Choosing not to focus upon quantity or quality is likely the reason why SQLite is so popular. It is a lot easier to manage and develop for SQLite than it is to do the same for large databases that require high availability and reliability under heavy loads. There are many more applications for small databases that contain data that is important to the end user even though that data may not be important to anyone else.


You can also use a plain file system; it allows data to be exported more easily and through an unbiased app like a file explorer.

I'd say if the data is important and client side, you'd probably want it to be accessible in a protocol format rather than a sqlite file format.

It's cool that sqlite is open source and you can technically recover the data, but at the end of the day excel data is more accessible, which defeats the ethos of open sourceness.


Would you classify a nurse call system as requiring important data retention and access? Mapping backbone devices with passive and interactive end user devices with means to communicate who should provide assistance?

Works great there too with μC/OS-II. Competition was using Windows XP and Microsoft Access.

Only time ever using Oracle has been with internal business analytics, not very important.


No I would not, the application is important and critical, but the data? What data are you holding? Historical auditable data regarding calls? I have some experience with hospitals, and deep data like this would never be used for diagnostics or even epidemiology tracing.


> Server side and centralized beats client side decentralized

SQLite works on the server too. Important data can live in a SQLite database.


I remember reading or seeing this post about how SQLite can really scale and supports a really high number of transactions per second. I don't know if it supports distributed loads that well, which might be the reason why its use didn't pick up server side.


SQLite solves a lot of problems well enough that you can focus on other things.

Have an mvp you’re not sure will ever have more than 2 users - yep.

Storing a little data for an application on the disk and don’t want to write your own file format - yep.

Want to teach someone how databases work without setting up a server - sure.


A few others:

- Want to distribute data to users that don't want to manage a server? A lot of people don't want to manage a server and don't need the best possible performance.

- Want to take data with you on a thumb drive and work with it offline? It's extremely convenient to be able to use SQLite for an app that has to work offline.

- Does the app mostly just read from the database and fit in memory? It's undervalued to just put the entire database into memory so you don't hit the disk and don't introduce network latency. For example, the following website does all enrichment with in-memory SQLite databases: https://shdn.io/analyze?target=ycombinator.com

At Shodan, we distribute versions of our datasets as SQLite and they're a popular way to consume the data without having to manage infrastructure.
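
For the read-mostly case, here is a rough sketch of pulling a SQLite file entirely into memory with Python's standard `sqlite3` module. The function name is made up, and this is just one way to do it, not necessarily how the site above does it.

```python
import sqlite3


def load_into_memory(path: str) -> sqlite3.Connection:
    """Copy an on-disk SQLite database into a fresh in-memory connection."""
    disk = sqlite3.connect(path)
    mem = sqlite3.connect(":memory:")
    # Connection.backup (Python 3.7+) copies every page from disk to mem.
    disk.backup(mem)
    disk.close()
    return mem
```

After this, queries on the returned connection never touch the disk or the network; the trade-off is that the whole dataset has to fit in RAM.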


Astonishing no commenter mentions that SQLite unleashes the power (and pain) of SQL for what were in ancient times dark binary blobs or obscure text formats. I.e. guaranteed consistency, easy versioning with better up/down compatibility, excellent complex retrievals, ...
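
To make the consistency and versioning point concrete, a small sketch with Python's `sqlite3` module. The `album`/`track` tables and the version number are invented for the example; `PRAGMA user_version` is an integer slot in the file header that applications can use however they like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are off by default
conn.executescript("""
    CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE track (
        id INTEGER PRIMARY KEY,
        album_id INTEGER NOT NULL REFERENCES album(id),
        seconds INTEGER NOT NULL CHECK (seconds > 0)
    );
    PRAGMA user_version = 2;  -- hypothetical file-format version
""")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


try_insert("INSERT INTO album (id, title) VALUES (?, ?)", (1, "Demo"))
insert_track = "INSERT INTO track (album_id, seconds) VALUES (?, ?)"
ok = try_insert(insert_track, (1, 180))      # valid row
bad_check = try_insert(insert_track, (1, -5))  # violates the CHECK
bad_fk = try_insert(insert_track, (99, 60))    # no such album
version = conn.execute("PRAGMA user_version").fetchone()[0]
```

A reader of the file can check `user_version` first and decide whether to migrate, refuse, or open as-is.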


For niche uses of database I am sure this is true. But if you are not in that niche, then Postgresql is usually a much better choice. For example Firefox uses SQLite, because it wants to store lots of application specific data. It is good for that.

Personally, I find the tooling, documentation, familiarity and quality of Postgresql make it my choice even in situations where SQLite might work.


If I'm not mistaken, Apple's Core Data is based on SQLite, so pretty much every Apple device for the last several years (or decade, probably) is constantly using tens (or more) of SQLite databases, per OS. That includes Macs, Phones, Watches, AppleTV etc. That has got to be more than 3 billion devices (since there are at least 2 billion of the phones in active use).

SQLite is also quite often a CI/CD default when building and testing software where you might want to do some tests without starting an entire RDBMS server.

But I suppose it's different if we think in terms of networked databases, or multiuser.


It's also heavily used on Android, so that's some 3bn devices (based on active users numbers from earlier this year). Plus it's in Firefox, Chrome, and so on.


It's probably on all Electron apps as well, and considering some of the parts of Android and Chromium are used on Smart TVs, head units in cars and game consoles, that's probably at least another 1bn.

Suffice to say SQLite is probably at least 6bn active usages in size.


I think the headline statistic negates the claim that these are "niche" uses of database.


At some point, such local usages were sufficiently non-niche that the temp file extension was changed from "sqlite" to "etilqs" to stem the flood of angry emails about local application databases taking up too much disk space.

(cf. Daniel Stenberg getting angry emails because everybody and their dog embeds libcurl, sometimes multiple times in the same app)


Here's the commit from 18 years ago explaining that change: https://github.com/sqlite/sqlite/commit/fd288f3549a1ab9a309a...

    ** 2006-10-31:  The default prefix used to be "sqlite_".  But then
    ** Mcafee started using SQLite in their anti-virus product and it
    ** started putting files with the "sqlite" name in the c:/temp folder.
    ** This annoyed many windows users.  Those users would then do a 
    ** Google search for "sqlite", find the telephone numbers of the
    ** developers and call to wake them up at night and complain.
    ** For this reason, the default name prefix is changed to be "sqlite" 
    ** spelled backwards.  So the temp files are still identified, but
    ** anybody smart enough to figure out the code is also likely smart
    ** enough to know that calling the developer will not help get rid
    ** of the file.


The most frequent use of it is local storage of application data. Postgres is overkill for that.


The article is about number of installations, instances across all types of devices


I'd guess most programs ever written are single-process programs with no need for the benefits postgres provides over sqlite. Postgres doesn't even have analogous functionality for eg an in-memory sql instance!


It most definitely does and I’m serving several read-only production databases from memory. It is insanely fast if you do it right.

https://www.postgresql.org/docs/current/pgprewarm.html


I wonder what happened to bdb. I guess maybe the authors lost interest/sold to Oracle and the license changed, or development just didn't keep up with the times. Bdb used to be the sqlite before sqlite existed and ended up used by various embedded applications or even graphical interfaces e.g. to index file thumbnails and such. I'm not sure but I think MacOS at one point used it this way.


I haven't even thought of Berkeley DB in years. Used to use it quite a bit but most notably perhaps with apache mod_auth_db to handle authentication and subscriber information on a fairly high traffic site (yes, this was long long ago ;) ).


I’m curious, is the prevalence of SQLite similar to how PHP is widely used because it’s behind many WordPress sites?


Couple of major companies like Adobe, Apple, Google, Mozilla use it in their products. That would add up quick.

https://www.sqlite.org/famous.html


Chrome's use of SQLite for history, cookies, and local storage would probably put it in first place on its own. Its extensive use in iOS and Android just guarantees the "win".


I'm trying to understand this sentence: it seems to suggest that PHP is only popular because of Wordpress, which just... makes no sense? Wordpress uses PHP because PHP existed, and was good enough to build Wordpress on. And it's certainly not just clinging onto life, hanging around just because Wordpress is built on top of it, today?

So it's hard to understand which similarity you're asking about, but SQLite is popular because it does one thing, and does that one thing really well, and that thing also happens to be a perfect fit for complex file formats.


I think Sqlite's accessibility and ease of use definitely help. However, the fact you can distribute an entire db with a single file is something that shouldn't be overlooked, nor sqlite's legendary reliability and insane test coverage. Finally the fact you can embed it in memory makes for an amazing testing story.
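
As a sketch of that testing story: each test can get its own throwaway database via `:memory:`, with no server to start or clean up. The `users` table and `add_user` function are hypothetical stand-ins for real application code.

```python
import sqlite3


def add_user(conn: sqlite3.Connection, name: str) -> int:
    """Application code under test: insert a user and return its row id."""
    with conn:
        cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    return cur.lastrowid


def make_test_db() -> sqlite3.Connection:
    """Build a fresh, isolated database that vanishes when closed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
    )
    return conn


def test_add_user():
    conn = make_test_db()
    assert add_user(conn, "alice") == 1
    assert add_user(conn, "bob") == 2
    names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]
    assert names == ["alice", "bob"]
    conn.close()
```

A test runner such as pytest would pick up `test_add_user` as written; calling it directly works too.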


Kind of, at least in regards to mobile.

When developing iOS and Android apps, the "default" frameworks for storing data (Core Data for iOS, Room for Android) are wrappers around sqlite.

I would guess >50% of apps installed on your phone use sqlite.


It's also the default database used when creating new projects using many web frameworks, like Rails or (IIRC) Django.


I love SQLite. Its pluggable interfaces and concise C code make it enjoyable to use (and sometimes to debug - as long as you don’t have to go through the single file version!).

Unfortunately, I wish there were multiple implementations of it, and that the file format was documented and stable.


The sqlite file format is documented and probably has the most insane backwards compatibility out there. The developers pledged to keep it backwards compatible until 2050. In fact Library of Congress has recognised it as one of the few data formats for long term archival.

https://sqlite.org/fileformat2.html https://sqlite.org/lts.html https://sqlite.org/locrsf.html
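
The documented format is simple enough to poke at by hand. A small sketch that checks the 16-byte magic string and reads the page size from the header, following the layout in the file format page linked above; the function name is made up.

```python
import struct

MAGIC = b"SQLite format 3\x00"  # first 16 bytes of every SQLite 3 database


def read_page_size(path: str) -> int:
    """Validate the header magic and return the database page size in bytes."""
    with open(path, "rb") as f:
        header = f.read(100)  # the database header is 100 bytes long
    if header[:16] != MAGIC:
        raise ValueError("not a SQLite 3 database file")
    # Page size is a big-endian 16-bit integer at offset 16.
    (page_size,) = struct.unpack(">H", header[16:18])
    # The stored value 1 is a special case meaning 65536.
    return 65536 if page_size == 1 else page_size
```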


Most likely, indeed.


There are far more ants in the world than human beings. So what?


Once that is said, does OSS move forward? Not an inch, good luck getting any change upstream


OSS does not mean anyone can contribute. It means anyone can view source


It's still OSS, you're allowed to fork and maintain Sqlite if you don't like the upstream merge requirements.



The test suite is (partially) proprietary. Good luck maintaining a fork for a DB without tests.

https://www.sqlite.org/th3.html see bottom


You can buy a license, and maintain your public fork while also letting everyone know you validated your fork with the same TH3 as the origin version. You just won't be able to show people your TH3 since I doubt you can buy a license that allows that.


I find it ironic that one of the most gatekeepy, closed, walled garden companies in the history of the planet, Bloomberg, is a sponsor of one of the most free, most open pieces of software ever.


Bloomberg sells packaged public information, they rely on commonwealth and transparent regulation disclosures, they are agents of such transparency.

They gatekeep their product sure, but you need to do that if you want to sell it.


Sqlite is not open for contributions.

Amazing piece of code however you look at it.

It'd also probably win a test-to-code ratio contest.


Another irony is that according to Wikipedia [1] Bloomberg maintains a fork of BDB.

[1] https://en.wikipedia.org/wiki/Berkeley_DB


The majority of these SQLite use cases that the article is celebrating are... as the internal data store in an app where it is locking up user data in a proprietary, closed, gatekept walled garden.

This irony has layers. Like an onion.


Do you use a music player? Most, including open source ones, use SQLite inside. There are open source tiling map formats like MBTiles, based on SQLite. Just dig around, you can find plenty of examples. It's the other way around, since the database format is well known with tooling available.


I’m not remotely claiming ‘I never use software that has SQLite inside it’. I use loads of programs that do. And that’s precisely the kind of walled garden I’m talking about.

Keeping your playlists in a specific music player app’s SQLite database forces you to use that app to search or delete or share those playlists. Heck, some apps use SQLite blob storage to store images or even music files.

Open software should use open file formats and store data in the file system where other programs can access it.


> Open software should use open file formats and store data in the file system where other programs can access it

That's what SQLite is.
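
For what it's worth, here is a sketch of what "other programs can access it" looks like in practice: any SQLite file can be opened and its schema listed without the app that wrote it. The function name is made up; `sqlite_master` is SQLite's built-in schema table.

```python
import sqlite3


def list_tables(path: str) -> list:
    """Return (table name, CREATE statement) pairs for any SQLite file."""
    conn = sqlite3.connect(path)
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    conn.close()
    return rows
```

Pointing this at a music player's library file, for example, shows which tables hold the playlists, after which plain SQL can read them.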



