The interesting ideas in Datasette (simonwillison.net)
135 points by simonw on Oct 4, 2018 | 11 comments



Great write-up. It's the classic innovator's dilemma (squared) all over again: the simple things, e.g. SQLite, CSV, etc., work far better than modern "state-of-the-art" offerings. I'm also doing my fair share, e.g. trying to improve CSV (CSV v1.1, anyone? -> https://csv11.github.io) and collecting text into SQLite dbs for football -> https://github.com/openfootball (aka football.db); beer -> https://github.com/openbeer (aka beer.db); and the world -> https://github.com/openmundi (aka world.db). All of these datasets / SQLite dbs are in the public domain (no license, no copyright, no rights reserved) and ready for use with Datasette. Thanks for the great (open) data publishing tool.


If you're excited about GraphQL you may enjoy this one: SQL as a query language for a JSON API: https://simonwillison.net/2018/Oct/4/datasette-ideas/#SQL_as...


Worth it just for the thoughts on keyset pagination alone:

https://simonwillison.net/2018/Oct/4/datasette-ideas/#Keyset...

https://use-the-index-luke.com/sql/partial-results/fetch-nex...

I've implemented this myself (poorly) for mutable data as I slowly began to lose my mind. For example a feed of users' posts (ordered uniquely for each user based on their personal preference) that can be individually created and deleted, and you don't have enough memory to cache all of the query results to be returned to each user immutably. As far as I can tell, this is an open problem.
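For readers unfamiliar with the basic technique, here is a minimal sketch of keyset pagination over a single shared ordering, using Python's built-in sqlite3 module. The `posts` table, its columns, and the `fetch_page` helper are hypothetical and chosen only for illustration; this does not address the harder per-user ordering case described above, and it is not Datasette's actual implementation.

```python
import sqlite3


def fetch_page(conn, after=None, page_size=3):
    """Return (rows, next_cursor) for posts ordered by (created, id).

    `after` is the (created, id) pair of the last row already seen,
    or None for the first page. The table and columns are hypothetical.
    """
    sql = "SELECT created, id, title FROM posts"
    params = []
    if after is not None:
        # Resume strictly after the last seen key; id breaks ties on created.
        sql += " WHERE created > ? OR (created = ? AND id > ?)"
        params += [after[0], after[0], after[1]]
    sql += " ORDER BY created, id LIMIT ?"
    # Fetch one extra row to learn whether another page exists.
    params.append(page_size + 1)
    rows = conn.execute(sql, params).fetchall()
    has_more = len(rows) > page_size
    rows = rows[:page_size]
    next_cursor = (rows[-1][0], rows[-1][1]) if has_more else None
    return rows, next_cursor


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, created TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts (id, created, title) VALUES (?, ?, ?)",
        [(i, f"2018-10-0{(i + 1) // 2}", f"post {i}") for i in range(1, 8)],
    )

    seen_ids = []
    rows, cursor = fetch_page(conn)
    seen_ids += [row[1] for row in rows]

    # Delete a row the reader has already seen, between page requests.
    conn.execute("DELETE FROM posts WHERE id = 2")

    # An OFFSET-based second page would now start one row too late.
    offset_page = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM posts ORDER BY created, id LIMIT 3 OFFSET 3"
        )
    ]

    # The keyset cursor resumes from the last seen key instead.
    while cursor is not None:
        rows, cursor = fetch_page(conn, after=cursor)
        seen_ids += [row[1] for row in rows]
    return seen_ids, offset_page


if __name__ == "__main__":
    print(demo())
```

In this sketch the deletion makes the OFFSET query skip post 4, while the keyset cursor still visits every remaining post once. It relies on all readers sharing one stable sort key, which is exactly what the per-user ordering case lacks.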


Not sure if I still agree with what I wrote here, but related article from many years back:

http://cra.mr/2011/03/08/building-cursors-for-the-disqus-api

There's also an implementation in Sentry that is probably too complex, awful code, and hard to read, but it is open source if anyone's curious:

https://github.com/getsentry/sentry/blob/master/src/sentry/u...


Interesting how there is also another post on the front page on the thing I associate with Datasette:

https://news.ycombinator.com/item?id=18129349


This is an amusing coincidence that confused me quite a bit as I hadn't heard of either before...


I have been consistently impressed with your work. Keep it up!

This has been very, very nice for throwing up a quick explorer for some research data I use regularly.

You mention SQLite's JSON support; do you have plans to expose more of it?


I've been playing a bit with JSON in one of my more experimental plugins: https://github.com/simonw/datasette-json-html

I also have a feature where you can tell the JSON API "I know this field is JSON - don't give it to me as an escaped string, give me the JSON raw". Here's an example: https://latest.datasette.io/fixtures-dd88475/select.json?_sh...
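As a rough illustration of the idea (not Datasette's code or its query-string parameters), the sketch below serializes rows to JSON and parses the columns the caller names as JSON instead of leaving them as escaped strings. The `rows_to_json` function and its `json_columns` argument are hypothetical names.

```python
import json


def rows_to_json(columns, rows, json_columns=()):
    """Serialize rows as a JSON array of objects.

    Columns listed in `json_columns` (a hypothetical option) are parsed
    and embedded as real JSON values rather than as escaped strings.
    """
    out = []
    for row in rows:
        obj = dict(zip(columns, row))
        for name in json_columns:
            value = obj.get(name)
            if value is None:
                continue
            try:
                obj[name] = json.loads(value)
            except (TypeError, ValueError):
                # Not valid JSON text: leave the stored value untouched.
                pass
        out.append(obj)
    return json.dumps(out)


if __name__ == "__main__":
    columns = ["id", "tags"]
    rows = [(1, '["sqlite", "json"]')]
    print(rows_to_json(columns, rows))
    print(rows_to_json(columns, rows, json_columns=["tags"]))
```

The first call emits `tags` as a string containing JSON text; the second emits it as a nested array.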


I do think that if SQL were to come out now, it would be like the hottest thing ever. It’s funny how that works.


One thing I'm interested in is a data format that's like CSV, so it's streamable, but that also has a binary compressed version, an ASCII version for viewing on the CLI, and a fixed width per row so it can be memory mapped and accessed randomly.
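To make the fixed-width part concrete, here is a small sketch using Python's standard struct and mmap modules. The row layout (a 4-byte integer, an 8-byte float, and a 16-byte padded name) and the helper names are assumptions made up for this example; it covers only random access, not the compressed or ASCII variants.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical row layout: little-endian int32 id, float64 value,
# and a name padded with NUL bytes to 16 bytes. Every row is 28 bytes.
ROW = struct.Struct("<id16s")


def write_rows(path, rows):
    """Append-friendly writer: rows can be streamed out one at a time."""
    with open(path, "wb") as f:
        for row_id, value, name in rows:
            f.write(ROW.pack(row_id, value, name.encode("ascii")))


def read_row(path, index):
    """Read row `index` directly by computing its byte offset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            row_id, value, name = ROW.unpack_from(mm, index * ROW.size)
    return row_id, value, name.rstrip(b"\0").decode("ascii")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "rows.bin")
    write_rows(path, [(1, 1.5, "alpha"), (2, 2.5, "beta"), (3, 3.5, "gamma")])
    print(read_row(path, 2))
```

Because every row has the same size, row `i` starts at byte `i * ROW.size`, so no index or scan is needed. The trade-off is that variable-length text has to be padded or truncated to fit the fixed field.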


How about FlatBuffers?



