
I'm going out on a limb here with this idea/question, as I'm not sure it makes total sense... I've been researching different methods for database interoperability and have been poking at Apache Avro. Would it be viable to tie the DB into Apache Avro, host a JSON schema that clearly indicates how the file is stored, and add a schema for whatever other database you want to connect to? "Avro relies on schemas. When Avro data is read, the schema used when writing it is always present. This permits each datum to be written with no per-value overheads, making serialization both fast and small. This also facilitates use with dynamic, scripting languages, since data, together with its schema, is fully self-describing. When Avro data is stored in a file, its schema is stored with it, so that files may be processed later by any program. If the program reading the data expects a different schema this can be easily resolved, since both schemas are present. When Avro is used in RPC, the client and server exchange schemas in the connection handshake. (This can be optimized so that, for most calls, no schemas are actually transmitted.) Since both client and server both have the other's full schema, correspondence between same named fields, missing fields, extra fields, etc. can all be easily resolved. Avro schemas are defined with JSON. This facilitates implementation in languages that already have JSON libraries." I know of a few other ways to achieve this result (letting the user choose which database to use while maintaining interoperability), but I would love feedback on a better approach to understanding this process.



I don't think Avro helps in any way.

Right now, the common solution is to have some query-builder library which is able to build up a SQL string for a given database based on higher level descriptions of what you want. See ActiveRecord, SQLAlchemy, etc.

This lets the query-builder library hide the differences between SQL dialects as part of its internals.
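
To make that concrete, here is a toy sketch of the idea, not how ActiveRecord or SQLAlchemy actually implement it. The names `Dialect` and `select_where` are made up for illustration. It only covers two real differences: identifier quoting (backticks in MySQL, double quotes in SQLite) and the parameter marker the usual Python drivers expect (`%s` for MySQL drivers, `?` for `sqlite3`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialect:
    """Hypothetical per-database settings a query builder keeps internally."""
    name: str
    quote: str        # character used to quote identifiers
    placeholder: str  # bound-parameter marker expected by the driver


MYSQL = Dialect("mysql", "`", "%s")
SQLITE = Dialect("sqlite", '"', "?")


def select_where(dialect, table, columns, filters):
    """Build a SELECT statement and its parameter list for one dialect."""
    q = dialect.quote

    def ident(name):
        # Quote an identifier, doubling any embedded quote character.
        return f"{q}{name.replace(q, q + q)}{q}"

    cols = ", ".join(ident(c) for c in columns)
    sql = f"SELECT {cols} FROM {ident(table)}"
    if filters:
        where = " AND ".join(
            f"{ident(col)} = {dialect.placeholder}" for col in filters
        )
        sql += f" WHERE {where}"
    # Values are passed separately so the driver can bind them safely.
    return sql, list(filters.values())
```

The caller describes the query once (`select_where(SQLITE, "users", ["id", "email"], {"email": "a@b.c"})`) and swaps the dialect object to target the other database.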

These query-builder libraries typically come as part of an ORM, which means you define your data as an object in your language (e.g. a ruby class).

From what I can tell, Avro would only be useful for replacing the definition of an object (the programming language class), but that's not a sticking point right now, and defining classes in your chosen language is easier than pulling in a separate schema language... and replacing that portion still doesn't help with the true problem, which is that a SQL string written for MySQL may not work on SQLite.
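
For comparison, a rough sketch of the same record described both ways. The `User` record and its fields are made up for illustration; the JSON follows the Avro record schema format, and the class is a plain Python dataclass rather than any particular ORM's model.

```python
import json
from dataclasses import dataclass, fields

# An Avro record schema (Avro schemas are written in JSON).
AVRO_USER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": "string"}
  ]
}
""")


# The same shape, defined directly in the host language.
@dataclass
class User:
    id: int
    email: str


avro_field_names = [f["name"] for f in AVRO_USER_SCHEMA["fields"]]
class_field_names = [f.name for f in fields(User)]
```

Both describe the same two fields, and neither says anything about which SQL to send to which database.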

I don't see how Avro could help with the different wire formats that different SQL databases use, nor could it help build queries as far as I can tell (e.g. knowing that 'ON DUPLICATE KEY ...' in MySQL is kinda like 'ON CONFLICT ...' in SQLite). Even if it could store these differences, the query-builder library already stores them quite efficiently in its own language, and such libraries exist for almost every language.
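
A small sketch of that upsert difference, assuming a made-up `kv` table and a hypothetical `UPSERT` lookup. Only the SQLite statement is executed here, via the standard-library `sqlite3` module; the MySQL string is shown for comparison and is not run. SQLite's `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer, and the MySQL form uses the older `VALUES(v)` syntax, which newer MySQL versions deprecate in favor of row aliases.

```python
import sqlite3

# Hypothetical lookup table: the same "insert or update" intent per dialect.
UPSERT = {
    "mysql": (
        "INSERT INTO kv (k, v) VALUES (%s, %s) "
        "ON DUPLICATE KEY UPDATE v = VALUES(v)"
    ),
    "sqlite": (
        "INSERT INTO kv (k, v) VALUES (?, ?) "
        "ON CONFLICT(k) DO UPDATE SET v = excluded.v"
    ),
}


def upsert_sqlite(conn, key, value):
    """Insert the row, or overwrite v if the key already exists."""
    conn.execute(UPSERT["sqlite"], (key, value))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
upsert_sqlite(conn, "color", "red")
upsert_sqlite(conn, "color", "blue")  # same key: updates instead of failing
rows = conn.execute("SELECT k, v FROM kv").fetchall()
```

After the second call the table still holds a single row for `color`, now with the value `blue`.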

I do not see how Avro helps anything in this area. If you could describe in more specific detail where you think Avro might be relevant, I'd be happy to consider it from that angle. Your comment above is a bunch of words that talk about Avro, but not about how it might apply to this specific problem. (Also, paragraph breaks help readability.)



