
K/KDB is a pile of shite. It's loved by quants and managers in financial firms. But it's hated by programmers who actually have to write code in it and deal with shitty code written by quants. K is full of arbitrary restrictions (too many global/local variables - what is it, the 1970s?) and crazy quirks (a function's closing brace has to be indented, otherwise you'll get a weird error). All in the name of performance, of course. The error reporting is insanely cryptic. It doesn't tell you what caused an error or even where it happened. It just tells you something like "'rank" - and that's it. Once your code grows beyond 10 lines, you can kiss your sanity goodbye.

And let's talk about performance. K is inherently single threaded. If you have a powerful 96-core machine, it's of no use to you because your K code will be using only a single core. I've seen numpy easily beating K when numpy uses MKL library with automatic parallelization.

There is a reason K/KDB is not very popular outside of a certain domain. It's not because it's proprietary and it costs a fortune. It's just bad.




Having worked a bit with KDB, my experience is totally different. The investment bank I worked at had a large and thriving KDB community of developers. In one particular project, we were able to replace a system consisting of tens of thousands of lines of fairly good but heavily abstracted Java code and reduce it to a few hundred lines of Q. The Java application took 7 hours to run its worst-case query, and the Q code ran the same query in less than a second. The Q code solved the business case better and was more readable, mainly because there were no unnecessary abstractions. There was still an OOP middle tier, but it was mainly pass-through. Calling any technology a "pile of shite" should be done with great care. Most solutions solve a use case, and KDB/K solves its use case exceptionally well. To be honest, understanding what the 'Rank' error means is not that hard. For the project I referenced earlier, I personally found it much easier to debug Q than to find out what an AbstractCalculationFactorybuilder was doing. There are very few poor languages, just poor developers.


This comment reads like someone who worked with the language very briefly. The gripes are all things beginners might say and aren't real issues.

This isn't a critique that's really worth giving much thought to. "A pile of shite"

My guess is that they were a grad with FD for a short period and had a bad experience.


Anything will be better than Java, especially enterprise Java ;)

Please compare it to a modern functional or array/data-parallel PL, or a combination of them, e.g. the ML family, Julia, Futhark, Clojure, Elixir, etc.


The only thing this anecdote tells us is that the legacy Java code was a "pile of shite"...


> K is full of arbitrary restrictions

The restrictions, lack of lexical scoping, and weird error messages are indeed frustrating, especially when starting out.

> And let's talk about performance. K is inherently single threaded. If you have a powerful 96-core machine, it's of no use to you because your K code will be using only a single core. I've seen numpy easily beating K when numpy uses MKL library with automatic parallelization.

I'm not sure where you got that from, but it's not true. You can use peach to great effect. For example, to do a matrix multiply that'll utilize all slave threads (which you can set at runtime):

mmu[;x] peach y

For more info, check out https://code.kx.com/q/basics/peach/. And while the parallelization isn't automatic, it's pretty damn easy to just s/each/peach/g (you should profile first, though, of course). Moreover, you can just run multiple interpreters and communicate through IPC. It works really well and is pretty easy to do. Give me some numerical Python code and I guarantee k/q can do it at least as fast. If your pipeline is something like do some number crunching and save a bunch of rows to MySQL, kdb+/q is always going to be faster, hands down. The performance benefits of having the db and language in one process should not be underestimated.
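For anyone who doesn't read q, here's a rough Python analogue of that each-to-peach swap. It's not q, just a sketch of the same shape. The function names are made up for illustration, and a thread pool stands in for q's worker threads. In CPython the GIL means threads won't actually speed up pure-Python arithmetic like this, so for real CPU-bound work you'd reach for a process pool instead.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def row_times_matrix(x, row):
    """Multiply one row vector by matrix x (a list of rows)."""
    return [
        sum(r * x[k][j] for k, r in enumerate(row))
        for j in range(len(x[0]))
    ]


def matmul_each(y, x):
    """Serial version: apply the row multiply to each row of y in turn."""
    return [row_times_matrix(x, row) for row in y]


def matmul_peach(y, x, workers=4):
    """Parallel version: same per-row function, handed to a pool of workers.

    The worker count is a hypothetical default, loosely analogous to the
    number of worker threads you would configure for q.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the rows of y
        return list(pool.map(partial(row_times_matrix, x), y))


y = [[1, 2], [3, 4]]
x = [[5, 6], [7, 8]]
serial = matmul_each(y, x)
parallel = matmul_peach(y, x)
```

The only difference between the two versions is who runs the per-row function, which is the point of the each/peach swap: the work has to be independent per item for it to be that easy.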

> There is a reason K/KDB is not very popular outside of a certain domain. It's not because it's proprietary and it costs a fortune. It's just bad.

I strongly disagree. If it was up to me, I would throw out all the crap numerical Python code I write every day and switch to kdb+/q. kdb+ is a pleasure to work with and the query language is so much better than SQL it's not even funny.

The only real problem with kdb+/q is that there aren't many libraries for it; in particular, neural net and ML libs would be on my wishlist. Though, I'm sure if you're a paying customer First Derivatives will whip up some super high quality shit for you.

Also, your argument doesn't really make sense: if k just sucks, why is it used at all? Because quants and their managers are stupid?


> it's hated by programmers who actually have to write code in it

False. Many people really enjoy programming in it (myself included).

> It doesn't tell you what caused an error or even where it happened. It just tells you something like "'rank" - and that's it.

Since version 3.5 (~2 years ago?) it tells you exactly where the error occurred (prints out the line with a caret showing where in the line the error occurred). And with a little practice you easily learn what the common errors like 'rank & 'type mean. I actually find the error messages pretty helpful.

> If you have a powerful 96-core machine, it's of no use to you because your K code will be using only a single core.

Technically you can run q multi-threaded, but the typical architecture used in kdb+ systems will be using multiple processes, each doing one thing (data ingest, in-memory intraday data storage, data persistence, historical querying, gateway). For example it's generally fairly trivial to add a bunch of extra processes for historical queries so many users can query in parallel.


I work with KDB daily. This comment is misleading and uninformed. I'll agree the language is cryptic, but that's what makes it fast. As for it being single-threaded, that's just false. There are many options, from slave threads to parallel processing, that you can use to speed things up if your data sets are large (billions+) or you need to query historical data that runs to petabytes. Also, once you learn the syntax and error handling it's not bad, just like every other language.



