Dataset: databases for lazy people (github.com/pudo)
111 points by shagunsodhani on Feb 7, 2016 | 38 comments



In the 80's there was a product called 'DataEase'. For once a product actually lived up to its name: fairly complex databases could be created using a very simple user interface. So simple that plenty of non-programmers used it to create functional applications that were hard to distinguish from custom-built stuff (usually using dBase III or FoxPro or something along those lines).

I've never seen anything come close to the simplicity with which that package managed to connect the various tables to each other and to generate forms with sensible rule based validation (types, ranges, sets).

Databases for 'lazy people' should have a good look at what has already been built, to prevent re-inventing the wheel badly. JSON files are easy for programmers, but the world is a lot bigger than that. (Or maybe they intended to write 'databases for lazy programmers', then it makes more sense.)

Edit: they're (surprise!) still around after 30 years:

http://dataeaseuk.com/products/dataease/dataease-dos


Even the full-blown xBase languages (dBase, FoxPro, Clipper etc.) are so much easier for quickly manipulating data than SQL. The concept of a persistent data store is built into the language, or rather, the language revolves around data access, as opposed to modern languages that are primarily about abstraction.

We've come so far from the simple old days where a TodoMVC would have been just `use todo; browse`.


The company I work for currently still uses it to run reports for the main MRP system. You're bringing so much terror to my Sunday morning.


Hehe, sorry about that. Yes, the product has its limitations and definitely isn't fit for all purposes, but it does have a very good user interface and that is what I was commenting on. In a way the fact that it is still in use says something about it too: if there were an easy way to replace it functionality-wise I'm sure it would have been done.

The problem with tools like these is that they allow people to be far more dangerous than they would otherwise be, but if properly used and with a good eye towards their limitations they can be game changers.

Another product like that is FileMaker Pro. I've seen people do absolutely amazing things with it and I've seen people build the stuff of nightmares with it.


I'm talking OG DOS DataEase here. An extract from Oracle gets manipulated by DataEase, which outputs some HTML files through Windows batch files. It's a horrible process that we've virtualized on VMware so we could have NT running it. This is what happens when management doesn't want to pay for an upgrade. I don't want to talk about it!


Maybe you could convince management that maintaining such a hodgepodge solution is actually more expensive than paying for an upgrade? (I assume you don't work for free :) ).

> I don't want to talk about it!

So don't!


I thought https://fieldbook.com/ was pretty slick. I would love to have a similar UI as an open-source interface to sqlite.


Great find. Another one for the bonus list: https://news.ycombinator.com/item?id=11051676.


You wouldn't have a video of that anywhere, would you? Curious as to what it looks like, and searches on YouTube seem to just turn up a newer graphical version.


No, I'm sorry. I haven't seen that particular version in actual use since I left the bank I worked at in the 80's and 'video' back then would have meant shooting a VHS tape or something to that effect...

Incidentally, those text mode interfaces were wicked fast given the hardware they were running on.


Ah well, figured it was a long shot. Thanks for the stories in any case!


Be sure to check out Oracle APEX (https://apex.oracle.com/en/)


Being mostly a database guy I often feel dumb when I read discussions comparing a popular Python framework over some cutting edge js "unicorn powered" new framework...

So it's suddenly quite reassuring that some others are terrified of SQL, which looks to me like a joyful playground.

I guess that should be everyone's mantra in CS. Don't underestimate yourself, don't overestimate others, just accept that you can't master everything anymore.


Yes! "Don't underestimate yourself, don't overestimate others, just accept that you can't master everything anymore." "Master what inspires you" seems to be working for me right now.

A major life lesson for me. Last week a friend told me the hardest part about being my friend was the sadness he felt watching me sell myself short. Now that I'm getting my bearings again heading into the job market I'm really appreciating his observation.

I guess I should have been paying more attention to Stuart Smalley on SNL.


SQL is powerful and awesome. You can process almost any information in such a way that you don't need any extra steps after obtaining the results. That is, of course, if your database is well designed. That part is hard to automate.


I took a Coursera course over a year ago on the R programming language. One of the last projects was getting data from SQL into R. Doing this required installing MySQL. Almost no one was able to do it, and the instructor gave up and declared it was optional. I don't remember the technical details, but it was a big complex beast that was very difficult to figure out and didn't work right. It left a bunch of crap on my computer too, starting up some kind of server on startup.

That's not to say my issues couldn't be solved, just that it's not beginner-friendly, which may explain why people are terrified of it.


"It left a bunch of crap on my computer too, starting up some kind of server on startup."

That "some kind of server" would be the MySQL server. Were you trying to install the MySQL client instead?

I've never had a problem. `apt-get install mysql-server` and I'm done.


> Because managing databases in Python should be as simple as reading and writing JSON files.

Not the best sales pitch. :D


I could imagine this being very handy for a lot of ETL stuff and one-off data migrations.

But really I just love the logo:

https://dataset.readthedocs.org/en/latest/


It is. And I use it for a lot of the tasks you mentioned, as well as a low-friction way to start a new project when I'm not exactly sure where the data model is headed.


Dataset is awesome. I've been using it consistently for the past several years. It is lovely in ETL scripts, especially considering you can use it just to send regular SQL when needed.


Yep, I've used this too in simple Flask programs. Pretty simple, and it's great when you don't really need to do anything beyond simple data storage and retrieval. Its upsert functionality is pretty handy, particularly pre-DB support. Not that any of this couldn't be done by hand, but why when you don't have to?


Any thoughts on the performance?


It's an abstraction layer on top of SQLAlchemy. There might be some performance hits, but none that have made me question whether I should be using it. The convenience of having a more pythonic way to work with data for ETL scripts is worth it, to me.


Pandas read_sql does so much more for me I'm okay losing the abstractions. http://pandas.pydata.org/pandas-docs/stable/generated/pandas...


I like this. Even though I tend to use statically-typed programming languages so I'd probably never use this, if I'm bashing out a script in something dynamic I often wish I could store data in a way that was as quick and hackable as the language I'm using, but more powerful than flat files.

SQLite has always been a bit dynamically typed, so it only makes sense to finish the job and lazy-create tables and columns as needed.

I can't see anything in the docs about how the upsert command works, though. Too many upsert commands use the primary key for upserting, which fails with autogenerated keys.
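
For illustration, here is a minimal sketch of an upsert that matches on caller-chosen columns instead of the primary key, so an autogenerated `id` is not a problem. It uses only the standard-library `sqlite3` module; the helper name `upsert`, the `people` table, and the sample rows are hypothetical, and this is not a claim about how dataset implements its own upsert.

```python
import sqlite3


def upsert(conn, table, row, keys):
    """Update the row matching `keys` if one exists, otherwise insert `row`.

    Hypothetical helper: `table` and column names are trusted identifiers,
    values are passed as bound parameters.
    """
    where = " AND ".join(f'"{k}" = ?' for k in keys)
    key_values = [row[k] for k in keys]

    # Look the record up by the chosen key columns, not by primary key.
    exists = conn.execute(
        f'SELECT 1 FROM "{table}" WHERE {where} LIMIT 1', key_values
    ).fetchone()

    if exists:
        others = [c for c in row if c not in keys]
        if others:
            assignments = ", ".join(f'"{c}" = ?' for c in others)
            conn.execute(
                f'UPDATE "{table}" SET {assignments} WHERE {where}',
                [row[c] for c in others] + key_values,
            )
        return "updated"

    columns = ", ".join(f'"{c}"' for c in row)
    marks = ", ".join("?" for _ in row)
    conn.execute(
        f'INSERT INTO "{table}" ({columns}) VALUES ({marks})', list(row.values())
    )
    return "inserted"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY AUTOINCREMENT, email TEXT, name TEXT)"
)
first = upsert(conn, "people", {"email": "a@example.com", "name": "Ann"}, ["email"])
second = upsert(conn, "people", {"email": "a@example.com", "name": "Ann B."}, ["email"])
rows = conn.execute("SELECT id, email, name FROM people").fetchall()
```

The second call finds the existing record by `email` and updates it, so the table still holds a single row with its original autogenerated `id`.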


Hm. For reading stuff, this looks simply like a normal ORM-like wrapper around a SQL library (edit: SQLAlchemy it is).

For writing on the other hand, this has the wonderful side-effect of doing schema changes on the fly, like adding some new columns to a table because one new record has that field.

I know it's not meant to be used that way, but inevitably this will be used against large existing databases (used inside some company for example), and then those side-effects will show up and make a lot of people very happy I guess.
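
To make the side effect concrete, here is a rough sketch of what "add columns on the fly" amounts to at the SQL level, written against the standard-library `sqlite3` module. The function name `insert_with_lazy_columns` and the `people` table are made up for this example, and the real library may well go about it differently.

```python
import sqlite3


def insert_with_lazy_columns(conn, table, row):
    """Insert `row`, creating the table and any missing columns first.

    Hypothetical helper: identifiers are assumed to be trusted.
    """
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (id INTEGER PRIMARY KEY)')

    # PRAGMA table_info returns one record per column; index 1 is the name.
    existing = {info[1] for info in conn.execute(f'PRAGMA table_info("{table}")')}

    for column in row:
        if column not in existing:
            # This is the schema change that happens as a side effect of a write.
            conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}"')

    columns = ", ".join(f'"{c}"' for c in row)
    marks = ", ".join("?" for _ in row)
    conn.execute(
        f'INSERT INTO "{table}" ({columns}) VALUES ({marks})', list(row.values())
    )


conn = sqlite3.connect(":memory:")
insert_with_lazy_columns(conn, "people", {"name": "Ann"})
insert_with_lazy_columns(conn, "people", {"name": "Bob", "age": 41})

column_names = [info[1] for info in conn.execute('PRAGMA table_info("people")')]
rows = conn.execute("SELECT name, age FROM people ORDER BY id").fetchall()
```

One record carrying an extra `age` field is enough to alter the table for every existing row, which is exactly the behavior that would surprise people on a shared production database.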


Some questions I can't figure from reading the docs: indexes? Are they inferred by my selects or do I have to create them explicitly? Or do they just not exist? Ditto keys, although if you want FK constraints in a loose ad-hoc system like this it's probably the wrong tool for the job.

I imagine the nested transactions are implementation-defined, since some DBs do "rollback rolls back current trans" and some do "rollback rolls back entire stack".
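
As a point of comparison, the "rollback rolls back current trans" behavior can be had with SQL savepoints. The sketch below uses the standard-library `sqlite3` module with manual transaction control; the `log` table and the savepoint name `inner_tx` are invented for the example, and nothing here is a statement about what dataset does under the hood.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / SAVEPOINT / COMMIT statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")

conn.execute("BEGIN")                           # outer transaction
conn.execute("INSERT INTO log VALUES ('outer')")

conn.execute("SAVEPOINT inner_tx")              # nested level
conn.execute("INSERT INTO log VALUES ('inner')")
conn.execute("ROLLBACK TO SAVEPOINT inner_tx")  # undo only the nested work
conn.execute("RELEASE SAVEPOINT inner_tx")

conn.execute("COMMIT")                          # outer work is kept

messages = [r[0] for r in conn.execute("SELECT msg FROM log")]
```

Only the row written inside the savepoint is discarded; a plain `ROLLBACK` at the same point would instead have thrown away the outer insert as well.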



> In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files.

But SQL is the best DSL for databases. Period.


Just noticed that this other project popped up a few hours after with a slightly different approach on the same question. https://news.ycombinator.com/item?id=11053525



This is a great find... seems to be made to copy this. Another reason to use peewee I suppose. Could create quick yaml file support similar to dataset.


How does this compare to TinyDB[1]?

[1] https://github.com/msiemens/tinydb


Not quite. TinyDB is itself a new database, while dataset seems like a wrapper for popular relational databases.


Doesn't Python have an active record binding?


It's similar to pydal and I prefer pydal.


There's already MongoDB.



