From the announcement: “As of now, we have mined 1,580 PySpark tests from the Spark codebase, among which 838 (53.0%) are successful on Sail. We have also mined 2,230 Spark SQL statements or expressions, among which 1,396 (62.6%) can be parsed by Sail.”

Kinda early to call this a drop-in replacement with those numbers, no?

But with enough parity, this project could be a dream for anybody dealing with Spark’s dreadful performance. Kudos to the team.

The next paragraph explains that: "When looking at the test coverage numbers alone, Sail’s capability may seem limited. But we have found that there is a long tail of failed tests due to formatting discrepancies, edge cases, and less-used SQL functions, which we will continue tackling in future releases."

I am with you that it is still very, very early. I'll personally keep an eye on the project.


I'll keep an eye on it too, but for a query engine, formatting compliance and edge cases tend to be almost all of the work. It's easy to implement SELECT x FROM y WHERE z.


Yeah, but the website literally says “zero code changes”. It’s the long tail that’s dangerous, since most people don’t understand it as well as the core functions.
