
Writing a database is a good exercise I could recommend to everybody.

Writing a general-purpose database is very difficult. On the other hand, if you know what kind of load to expect, and you have a clear idea of the performance and consistency requirements, it becomes much simpler.

I rolled my own transactional database for a commercial project. This was a decade ago; the device had at most 1 MB of memory budget for compiled code, stack, heap, and database storage. The database had to be stored on non-wear-leveled flash, which required precise control over which bytes were erased and how frequently. We needed the ability to store small objects (tens to a few hundred bytes at most) of varying size, and to record changes to multiple objects at the same time, transactionally (think modifying two accounts). The requirement was that the database stay consistent no matter what happened, regardless of any application error or power cycle. These devices were operated by hundreds of thousands of customers and power cycled liberally whenever anything didn't work.

The database ran in constant memory. It allowed the application to register hooks to calculate the delta between two versions of a record, and to apply a delta to a base version to produce the result version. The database didn't care how the data was encoded; that was an application detail.
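To make the hook idea concrete, here is a minimal sketch (all names invented, not the author's code): the store keeps opaque bytes and asks the application, via two registered callbacks, to diff and patch them.

```python
# Hypothetical sketch of the delta-hook idea: the database treats record
# contents as opaque bytes and delegates diffing/patching to callbacks
# registered by the application.

class DeltaStore:
    def __init__(self, compute_delta, apply_delta):
        self.compute_delta = compute_delta  # (old_bytes, new_bytes) -> delta
        self.apply_delta = apply_delta      # (base_bytes, delta) -> new_bytes
        self.base = {}                      # record key -> base version
        self.deltas = {}                    # record key -> list of deltas

    def write(self, key, new_value):
        if key not in self.base:
            self.base[key] = new_value      # first write stores the base
        else:
            old = self.read(key)            # reconstruct current version
            self.deltas.setdefault(key, []).append(
                self.compute_delta(old, new_value))

    def read(self, key):
        # Replay all deltas on top of the base version.
        value = self.base[key]
        for d in self.deltas.get(key, []):
            value = self.apply_delta(value, d)
        return value

# Toy codec: the "delta" is simply the full new value.
db = DeltaStore(lambda old, new: new, lambda base, delta: delta)
db.write("acct:1", b"100")
db.write("acct:1", b"90")
print(db.read("acct:1"))  # b'90'
```

In the real system the callbacks would produce compact binary deltas so that small changes to a record cost only a few bytes of log space.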

The database was stored as a transaction log. During operation, all writes were directed to the end of the log. When the log reached half of the available space, all live records would be coalesced and written to the other half, and the original half would be erased.
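The two-half scheme can be sketched roughly like this (a hypothetical in-memory model, not the original code; a real version would write erase-aligned flash pages):

```python
# Sketch of a two-half log: writes append to the active half; when it
# fills, only the newest entry per key survives into the other half,
# and the old half is erased. This is what bounds flash wear.

class HalfLog:
    def __init__(self, capacity):
        self.half_cap = capacity // 2
        self.halves = [[], []]        # each half holds (key, value) entries
        self.active = 0

    def put(self, key, value):
        self.halves[self.active].append((key, value))
        if len(self.halves[self.active]) >= self.half_cap:
            self._compact()

    def _compact(self):
        # Coalesce: keep only the latest value per key.
        live = {}
        for key, value in self.halves[self.active]:
            live[key] = value
        other = 1 - self.active
        self.halves[other] = list(live.items())
        self.halves[self.active] = []  # "erase" the old half
        self.active = other

    def get(self, key):
        # Linear scan, newest entry first -- simple, not fast.
        for k, v in reversed(self.halves[self.active]):
            if k == key:
                return v
        return None

log = HalfLog(capacity=8)
for i in range(6):
    log.put("x", i)
log.put("y", 99)
print(log.get("x"), log.get("y"))  # 5 99
```

Note that compaction never overwrites data in place: the coalesced copy is fully written to the other half before the original is erased, which is what keeps the log recoverable across a power cycle.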

The read-path algorithmic complexity was horrible, with some algorithms being cubic(!). This was absolutely fine, since there could never be enough records to make it a problem. Having extremely simple algorithms allowed the entire implementation to be very compact. Everything fit in about 200 lines of code.

I really enjoyed the project.



How long did it take to implement, and how large was the team?


Not the GP, but I'll share my story, as well.

In early 2001, for the project I was working on then, we needed something like a SQLite-lite. (SQLite existed, but was less than a year old and was still way bigger than our needs.) One of the other engineers and I paired over a weekend to build something that worked well enough to run with, and then we polished and improved it along the way, until the project was abandoned a few months later because of the dot-com bust.


I did this alone. The database itself was only about 200 lines of code but took about three weeks of effort, which included writing the initial implementation, refactoring it, writing a performance test, optimizing the implementation for the intended load, and integrating it into the application.

The application was much larger, about 40k LOC, and took about two and a half years to complete, certify, and deploy to production.


A database in 200 lines of code is incredible. I'd love to see that, even better if it was annotated.


As I said, there is nothing spectacular about this.

This notion that a database must be a difficult and complex thing and best left for experts is just turning people off from exploring and learning.

When it doesn't need to be general purpose, has only one thread accessing it, stores at most 1 MB of data, offers only key-value access (so it's basically a persistent hashmap), and doesn't have to be blazing fast, the resulting code can be very simple.
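For a sense of how small that shape can be, here is a toy "persistent hashmap" along the lines described above (my own illustrative sketch, single-threaded, append-only; the original ran on raw flash, not a filesystem):

```python
# Toy persistent KV store: every put appends one record to a log file
# and fsyncs it; opening the store replays the log to rebuild the map.
# Crash mid-append at worst loses the last, uncommitted record.

import json
import os

class TinyKV:
    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):               # replay the log on open
            with open(path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        with open(self.path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())               # durable across power cycles
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

kv = TinyKV("/tmp/tinykv.log")
kv.put("name", "demo")
print(TinyKV("/tmp/tinykv.log").get("name"))  # demo
```

Even this toy has the key properties: constant-time reads from memory, sequential appends on write, and recovery by replaying the log. Compaction (rewriting only live keys) is the one missing piece.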



