Using Haskell at SQream Technologies

thinkpad20 · on Sept 30, 2013

Looks really interesting. I wrote an SQL parser and compiler frontend for a SQLite-style database last summer. I started off in C using Lex/Yacc, but for the last piece of the project I used Haskell. This step was to take SRA ("sugared relational algebra"), which is essentially a transition step between SQL and simple relational algebra, and desugar it into relational algebra. The code is up at https://github.com/thinkpad20/sql if anyone is interested, with the code for that portion being contained in the haskell folder. I might translate the whole thing into Haskell (maybe using Attoparsec) at some point.

I was curious that you used a different version of relational algebra than the one I had been taught. I've usually seen RA described in terms of six fundamental operators: Project (pi), Select (sigma), Rename (rho), Cross Product, Union and Difference. What gave rise to the model you chose?

Anyway, it's great to see that more people are using Haskell. I would love to be able to work in it some day.

jakewheat · on Sept 30, 2013

That is really fascinating, I will have a good look at that, thanks! I have a SQL parser in Haskell on github here (using Parsec): https://github.com/JakeWheat/hssqlppp

The model in the article is roughly based on the version in Date and Darwen's work which is almost the same as what you mention: project, rename, extend, join, union and not matching. The summarize by is kind of syntax sugar in this system. I left out some of the operators which weren't used anywhere in the code in the article. Industrial grade conversion from SQL to relational algebra is a real challenge!

thinkpad20 · on Sept 30, 2013

Wow, your parser is a behemoth! My parser ended up being around 500 lines of Yacc. I know I didn't fully implement SQL, but I thought I had gotten pretty close. Are you doing any extra fancy stuff inside of the parser? Clearly the documentation is pretty extensive. I like the lhs work; I might start doing some of my projects in LHS.

SQL to RA is definitely quite a task, but it was pretty fun (up until the end, when it just started getting annoying...). Then again, I think compilers and programming languages are a blast anyway. Haskell makes it even more so :)

When I wrote this code, I wasn't really comfortable using monads, so all of the threading is explicit. If I were to revisit it, I would probably use a state monad, which might also make it easier to make it tail-recursive (as it is, my desugarer is not written in a tail-recursive manner).

jakewheat · on Oct 1, 2013

The parser also does typechecking, and attempts to cover a good subset of DML, DDL and procedural SQL as well. Maybe it could be smaller!

How would you handle correlated subqueries when converting from SQL to RA?

Ixiaus · on Sept 30, 2013

I think it is worth taking the risk to use a relatively unusual and unproven technology.

should be:

I think it is worth taking the risk to use a relatively unusual technology.

jakewheat · on Sept 30, 2013

I think you are right. It should have said relatively unproven technology for SQL DBMS implementation.

pnathan · on Sept 30, 2013

Why is using Haskell a risk? It's not a proprietary system where the company might shut down.

egl2001 · on Sept 30, 2013

1. Speed and space can be difficult to predict without considerable experience. 2. The libraries are not as extensive and tested compared to conventional choices such as Java or C++.

evincarofautumn · on Sept 30, 2013

Using strict data structures is usually the simplest way to regain predictability:

    data QueryExpr
      = RelVar !Text
      | Restrict !ScalarExpr !QueryExpr
      | Select !(HashMap Text ScalarExpr) !QueryExpr
      | SummarizeBy !(Vector Text) !(HashMap Text ScalarExpr) !QueryExpr

Excessive strictness can cost you performance if you end up evaluating things you didn’t have to, but ASTs should probably be strict. Using Criterion for benchmarking helps find the sweet spot.

I have found Haskell’s library ecosystem quite good—any time I’ve needed a library, there has been a suitable one. In the rare case that changes are necessary, maintainers tend to be responsive and helpful. We probably just work on different kinds of software.

jakewheat · on Sept 30, 2013

Many people see it as a big risk (such as investors and customers) which makes it self-fulfilling.

carterschonwald · on Sept 30, 2013

otoh, theres a few mostly haskell shops, and that number is growing! Its really exciting to see the growing haskell adoption at a number of companies doing interesting things

pnathan · on Sept 30, 2013

Can you describe better why they see it as such? The key risk I see in Haskell is the lack of enterprise-y libraries.

jakewheat · on Sept 30, 2013

I guess a lot of it boils down to them never having heard of Haskell before.

raphinou · on Sept 30, 2013

It would be interesting to know if the devs there are long time haskell developers, or if new hires could easily be trained in haskell.

jakewheat · on Sept 30, 2013

We have one other Haskell programmer, who previously worked with some other functional programming languages (mainly some dialect of ML I think) and was working with Erlang in his previous job, and he picked it up quite easily. I think he is unusual though, I found it much more difficult to learn (but extremely rewarding).

raphinou · on Sept 30, 2013

What about learning/discovering the haskell language and its ecosystem? Did you have a lot of refactorings, replacements of libraries wrongly chosen, etc? Were these changes facilitated by haskell? Very curious to know because haskell is high on my list as the language to use in the near future.

thinkpad20 · on Sept 30, 2013

In my experience, the more you learn about Haskell, the more you can pick out spots in your code where you could have made something more concise, more elegant, more efficient, etc. The excellent thing about pure functions and Haskell's strong typing, though, is that refactoring is a breeze -- changing a function will never break another function as long as the inputs and outputs to that function remain the same.

jakewheat · on Sept 30, 2013

The was and continue to be a lot of refactorings, but they were/are pretty much all related to the complexity and sheer number of features in SQL. Haskell really gets out of the way, and helps make the refactorings much more easy.