The Datasette Ecosystem (datasette.io)
186 points by Tomte on April 27, 2022 | 22 comments



Fun detail: I originally built this page of the documentation to help support my application for a JSK journalism fellowship at Stanford, see https://simonwillison.net/2019/Sep/10/jsk-fellowship/

The page used to have a lot more details on it, but a while ago I split it off into two sections on the Datasette website.

https://datasette.io/plugins lists 86 plugins - there are actually a few more that I haven't added to the site yet though, which I found recently by searching the Google BigQuery PyPI download statistics for packages starting "datasette-".

https://datasette.io/tools lists 38 tools - mostly command line utilities I've written for importing data from different sources into a SQLite database file so you can explore them using Datasette.
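
A minimal sketch of the general idea behind such import tools, using only the Python standard library. This is not the code of any of the listed tools: the `import_csv` helper, the `plugins` table, and the sample rows are all hypothetical, and every column is simply stored as TEXT.

```python
import csv
import io
import sqlite3


def import_csv(conn, table, csv_text):
    """Hypothetical helper: load CSV text into a new table, all columns TEXT."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column names
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    # The remaining rows are inserted with bound parameters
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# ":memory:" keeps the example self-contained; a file path such as "data.db"
# would produce a database file that can be explored afterwards.
conn = sqlite3.connect(":memory:")
sample = "name,downloads\ndatasette-a,10\ndatasette-b,25\n"  # made-up data
import_csv(conn, "plugins", sample)
rows = conn.execute("SELECT name, downloads FROM plugins ORDER BY name").fetchall()
```

Written to a file instead of memory, the result is an ordinary SQLite database that Datasette can open.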


I worked in journalism years ago (jQuery was just the newest, coolest thing, along with Ruby on Rails). I remember Django being touted as the best thing for journalism, and people were looking at how the NYTimes did it. We ultimately built our own JavaScript library instead (that didn't end well) and went with PHP.


I was expecting it to be a revival of the old Commodore 64 storage medium.


While I'm aware of the storage device, I was thinking more of the chiptune band that did an amazing soundtrack for Space Rubbish:

https://datassette.bandcamp.com/


Yeah, I was wondering how many users of this tool are aware of the origin of the name...


Every time I see the name, I have to double-take.


Your username definitely checks out


Sort of a form of extreme retro sadomasochism, as though loading a 40kB game didn't take long enough on a 1541.


Preferably with proper TAR compatibility. I want my SSH keys on tape dammit! :D


I find the reuse of the term "datasette" quite jarring. You'll pollute my retro searches! :-)

The same thing happened with the "BBC Micro:bit" which was roughly in the same field, unlike this one.


One of the reasons I picked the name (aside from having grown up with a C64) is that I assumed it would be unique enough today that I could use it to get alerts for mentions.

Doesn't work as well as I hoped: My F5Bot subscription on Reddit catches a lot more C64 retro chat than it does mentions of my software! Quite a bit of it in German.

Turns out there are way more people out there using a tape drive that was released in the 80s than I had expected.


I saw this post this morning and thought it was a cool name but didn't open it. Then I just watched the newest 8-Bit Guy video and saw it again, came back here to see if this was about that and found these comments :)


For columnar analytics queries SQLite should probably be replaced with DuckDB.


DuckDB is one of my favourite projects of late.

Some of it is a little rough around the edges, but it has very quickly replaced much more complicated setups for me. It's fast and very simple, and it pairs beautifully with having a bunch of data just in Parquet files.

https://duckdb.org/


When do we need columnar queries? I never really found a need. Yes I used AWS Redshift, but used it like a Postgres db.


Every GROUP BY boils down to a columnar operation. When doing analysis you normally do a lot of grouping/slicing/dicing. True, traditional DBs can do this, and more modern ones support alternative storage engines and index modes for that kind of query.
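
To illustrate the access pattern being described, here is a small sketch using Python's built-in `sqlite3` module. The `sales` table and its rows are made up for the example. The aggregate only needs `region` and `amount`, two of the four columns; a column-oriented engine can read just those, while a row-oriented store such as SQLite reads whole rows to answer the same query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: four columns, of which the query below needs two
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER, note TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("east", "apples", 10, "a"),
        ("east", "pears", 5, "b"),
        ("west", "apples", 7, "c"),
    ],
)

# Group by one column, aggregate another: the typical analytics query shape
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```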


I was downvoted for my question, but it was an honest question. This was what I needed to know. Thanks a lot.


Lower query times provide for quicker iterations.


@simonw : Some feedback on the main home page:

It would be good to have the two- or three-step process to get up and running right on the home page, near the top. E.g.:

    $ pip install datasette
    $ datasette data.db -o
(Maybe add a toggle/tab in a widget for the Docker version)


Interesting - I hadn't thought about putting that right on the homepage. The GitHub README has that: https://github.com/simonw/datasette

I worry that my target audience for the homepage won't necessarily have access to a working Python and pip, and so won't be able to just run that command without additional guidance (see https://docs.datasette.io/en/stable/installation.html )

That's why I emphasize trying out a hosted demo on the homepage instead.


It reminds me of Woob (Web outside of browser) https://woob.tech/ , formerly known as weboob.


Datasette is good shit.



