I find the example benchmark to be rather ridiculous. Recursion is slow in R because R is an interpreted language, and does not implement tail-call optimization.
If I wanted to implement the Fibonacci numbers in R, I would do it like this:

    x<-numeric(25); x[1:2] <- 1; for(i in 3:25) x[i] <- x[i-1] + x[i-2]; x[25]

This takes about 1/10,000th the time of the recursive version (about 0.4 milliseconds on my machine).
So the conclusion I draw from the article is not that Julia is faster than R, but that you should know what your language's strengths are and write code accordingly.
I'd love to see some examples of real statistical work where Julia's syntax is as easy as R's, but the performance is superior.
Agreed, one should know the strengths of one's language and avoid the pitfalls. Every language has its share of both. But the effort, I suppose, is to increase the surface area of the strengths.
One of the claims of Julia is that you write code in the most natural manner and depend on the compiler to make it fast. So, for example, the claim is that adding the optional type annotations does not necessarily make your code faster: type inference is good enough in most cases (there are obvious counterexamples, e.g. with global variables, but it holds in the majority of cases). Also, unlike in, say, Matlab or R, vectorising your code does not necessarily produce performance gains; plain loops over arrays are fast enough. One of the side effects is that the standard library is implemented in the language itself, and thus extending built-in types is a breeze.
Therefore, I think it's a very interesting effort in itself. Whether it is good enough to displace any other language is a completely separate issue. As said below, the base of useful libraries in languages such as R is phenomenal. But that should not, I think, preclude admiring a very interesting new language for the possibilities that it hints at, if only as a highly engaging mental activity.
Personally and subjectively, I think the combination of the above and multiple dispatch gives the language a highly pleasing sense of elegance.
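To make the multiple-dispatch and "extending built-in types is a breeze" points concrete, here is a minimal sketch in current Julia syntax (the Point type is made up for illustration, not anything from the standard library):

    struct Point
        x::Float64
        y::Float64
    end

    # Extending a built-in operator is just adding another method:
    import Base: +
    +(a::Point, b::Point) = Point(a.x + b.x, a.y + b.y)

    Point(1.0, 2.0) + Point(3.0, 4.0)   # Point(4.0, 6.0)

Dispatch picks the method from the types of all the arguments, and the new method sits alongside the standard library's own + methods, which are themselves written in Julia.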
1-based array indices are indeed a very opinionated decision. But I must say it isn't a very difficult thing to switch one's mental model around. It's very easy to get used to.
Not to be pedantic, but Fortran 1-based array indices are from Fortran 66 and earlier. Fortran 77 and on include the ability to index from any arbitrary integer, including negative ones.
You can declare a 21 element array indexed from -10 like so:
real x(-10:10)
So hopefully anyone familiar with Fortran isn't hung up on 1-based indices.
Though it's designed as a scientific language, its goals of having C-like speed and yet Ruby-like beauty make it the perfect target for such a project.
The idea of a dynamic Ruby-like language with C-like performance reminds me of Avi Bryant's talk "Bad Hackers Copy, Great Hackers Steal": http://vimeo.com/4763707
He explains that dynamic languages are not inherently slow. It's their current implementation that's slow. Java is not fast thanks to static typing, it is fast thanks to the Hotspot VM, which builds upon the StrongTalk VM implementation.
Quote taken around the 13-minute mark (some trolling included ;) ): "Ruby has the dynamic part, and Java has the fast part. And this has led to this assumption that people have that the reason that Java is fast and Ruby is slow is because Java is static and Ruby is dynamic. And that's just not true, that's just a myth. Nothing about Java being fast has anything to do with it being static. It's simply that the papers that the Java people read were the ones that told you how to make it fast, and the papers that Matz read were the ones that told you how to make it usable."
What do you have in mind? A microframework, like Flask or Sinatra? I like that idea. I'd have to really dive into the internals of something like Werkzeug before trying to help out. But the Julia language itself looks fairly straightforward.
I just finished up building a microblogging framework using Flask, and I'd love to do the same with a lean MVC framework in Julia, particularly if you achieve anything close to C-like performance.
Exactly what I was thinking... something small and light, but elegant (very much like Julia itself!). I have little experience in this area and have only built the most lightweight frameworks for myself, but if we start a git repo I'll be the first to dive in and start poking around. I think it would be a really cool project.
Please let me know if you put up a git repo. I can't promise that I'll be able to make any significant contributions at this point in time (working on big project for the next few weeks), but I'd love to try to help when I do get some free time.
By the way, have you seen Brubeck (http://brubeck.io/)? It would be interesting to build this framework on top of Mongrel 2. Not saying we should necessarily try. But it could be fast, fast, fast.
Have you considered building it on top of Go? Or is it more of a "Julia seems cool, I wonder if it's good for this" type idea? Just wondering, because I'm thinking of doing the same thing once I get familiar with the golang environment.
I'm interested, but no idea how much help I'll be :-) Feel free to ping me; I'd love to try to help, as I've been wanting some context to learn and play with Julia since finding out about it, and being able to do web work in it would be great. It'd be nice to learn something about web framework design and whatnot at the same time!
Count me as interested. And be warned, I have less than an hour of experience with Julia though. Please post the repo link once you flesh out the basics.
A lot of times, I really don't think there is a need for a whole new language. I use R and python for scientific computing all of the time. There are packages which have been well validated and a community of folks who support the software. Making a new language may be interesting, but if it doesn't really bring anything new to the table, it's just asking for a micro-niche user base (just think of the million and one web frameworks out there).
Julia looks interesting, but I don't see anything that makes me jump up and say, "I want to use that!" Everything I have seen so far can already be done with R, Python, or a combination of the two. If there are a few minor drawbacks to any process in R or Python, I would rather live with it than learn yet another language. Speed is not an issue -- if it is then I am likely coding incorrectly. If I have coded correctly, then it is worth taking the time to create a C library (and since everything else is in perspective, this is just "monkey coding" and never takes as long as we think).
I wish all of the folks who are bright enough to create their own language would voice their wants and needs to the existing coding communities to see if their needs can't be met rather than making something brand new. This kind of community interaction is crucial for a language to mature. Wouldn't you rather have a few very mature languages rather than a million young ones (each of which has its own pros and cons)?
Languages like Matlab, R, and Python (via NumPy) let you do efficient computations as long as you can make them happen in the core, which is written in C and Fortran. This is very often possible, but in some cases you find that it is not, and then it will either be very slow or you need to write a C module for it. None of these languages is designed to make it easy to build a compiler/runtime that is efficient from the ground up. They are instead a rather inefficient interpreter with a really good library. This is where Julia differs: it is designed exactly to be efficient from the ground up.
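As a small illustration of "efficient from the ground up", here is the kind of plain loop that would be considered a mistake in R or Matlab but is idiomatic in Julia (a sketch; the sumsq name is made up for the example):

    # Sum of squares as an ordinary loop: no vectorization, no type
    # annotations. The compiler specializes the function for the
    # element type of whatever array is passed in.
    function sumsq(v)
        s = 0.0
        for x in v
            s += x * x
        end
        return s
    end

    sumsq(rand(10^6))   # runs as compiled machine code, not interpreted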
Possibly sexist question: Does giving a female name to a programming language increase the rate of its adoption? AFAIK, this is the only such language (perl was nearly named gloria, though!). Does anyone know the etymology of the name (julialang.org seems to be down, so I couldn't check there)?
There are Julia sets, but those come from the last name of the mathematician Gaston Julia.
I don't see a strong pattern. There are others named after female first names that aren't particularly popular (most, really), though the set of popular languages is small enough that it's not clear there's enough data to draw any conclusions.
For a more powerful example (lifted straight from the Julia docs), we can compare Julia and R code for computing the 25th Fibonacci number recursively.
How is it a powerful example? How is it meaningful when comparing programming languages (and implementations)? Maybe it is just a 4/1 joke? Or is it really meaningful for people who otherwise use R?
I don't think anyone has trouble with the arithmetical inequality. The issue is that the recursive Fibonacci number algorithm is notoriously unrealistic (it's exponential in its argument, growing as φ^n). Because of this, the claim that the example is more "powerful" is a little dubious.
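For reference, the definition being benchmarked looks roughly like this (it's the example the article lifts from the Julia docs):

    fib(n) = n < 2 ? n : fib(n - 1) + fib(n - 2)

    fib(25)   # 75025

Each call spawns two more, so the total number of calls grows roughly like φ^n; that is exactly what makes it a stress test of function-call overhead rather than a sensible way to compute Fibonacci numbers.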
As for replacing R, the whole problem is that no one is really a fan of the syntax, but there is a staggering number of packages ready to use. Bioconductor (tools for analysis of biological data: http://www.bioconductor.org/) alone comprises over 500 packages whose functionality is most likely not implemented for any other language/platform.
> "We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab."
I've thought having it both ways was an insurmountable barrier, and that one had to sacrifice the ease of use of familiar notation for the power of homoiconicity. I guess a resolution is to cheat, and have both sets of constructs (redundantly).
Disclaimer: I don't know anything about Julia, except from reading a couple simple programs just now, and seeing that it uses syntax that looks somewhat like Algol.
I don't think there's any inherent contradiction between the two, depending on what you mean by "the power of homoiconicity" (1). Homoiconicity doesn't mean you need to use lists: you could make a language like C but with native structures to hold all of the syntax. It's just that "cons cell and symbol" happen to be a really simple set of tools from which you can build everything you need.
I suspect that languages that try to have both do it by sacrificing simplicity. They have syntax that looks like C, and then make it homoiconic by building essentially a DOM for all of that syntax. Perhaps the idea is that writing and reading "x = y + z" is deemed sufficiently common that they're OK with making metaprogramming more complex if they get to keep Algol syntax.
(1) By "depends on what you mean", I meant: is simplicity requisite for power? I would say "yes", but I can imagine there's someone who would say "no" and consider Algol-plus-DOM (is that what Julia is?) to be as powerful as simpler languages.
A DOM would give the benefits of homoiconicity, though the quote is "We want a language that's homoiconic". As an example, recent versions of Java include the compiler in Java, and therefore the classes for the AST (though undocumented). That's a DOM; you can use it to implement macros. (There's also ANTLR and BCEL to do AST twiddling - just consider them part of the standard classes of a Java variant).
But I think to be homo- (same) iconic (symbols), the language has to be those symbols. By this strict definition, it has to be a lisp. Maybe there's a looser definition possible, between a DOM and Lisp. I think a parallel lisp-syntax would do it (i.e. you can write everything using a lispy syntax, but there's also a friendlier syntax).
Hmmm, you could do this for any language, provided an AST representation (i.e. lispy) that is syntactically unambiguous with the rest. That subset of the language would then be homoiconic. E.g. add a first-class AST syntax to Java. Is that the kind of DOM you were thinking of?
Good point. I guess I'd agree that it "has to be those symbols", but not that it has to be a Lisp. I don't know Java's new AST classes, but if I could say something like:
    public class HelloWorldApp {
        public static void main(String args[]) {
            System.out.println("my class name is: " + HelloWorldApp.Ast.Name);
            System.out.println("i have " + HelloWorldApp.Ast.Methods.Length + " method(s)");
        }
    }
then I would consider that "homoiconic": the things I'm defining are objects in the language itself. It doesn't sound like Java does that. My guess is that the AST is only available for text you pass to the compiler at runtime.
That's a problem in Java: the compilation model is so constrained that even if you could write this, I don't think you could use it to make macros, since there's no way to control "read-time" versus "compile-time" versus "run-time" as you can in Lisp. So perhaps homoiconicity isn't enough: you need that, plus something like EVAL-WHEN.
That's your DOM idea; I'd argue it's not homoiconic, because it's too indirect. But I wouldn't argue very strenuously as I don't know the actual definition of homoiconic. :-)
You're right that the AST is available only for passed text; I hadn't thought of that as a limitation, but you're right. I was thinking of macros, that you define and manipulate within a program: the limitation doesn't affect this use case. Doing it this way, you also get control of {read,compile,run}-time.
I think it's fair to say there are degrees of homoiconicity; you can get some of the benefits by having just a subset of the features. I conjecture that "a language with full homoiconicity" is identical with "a lisp" - but it doesn't really matter, if we're interested in benefits.
I'm admittedly a huge fan, but it's hard to find any downsides, even for such a young language. A couple of things you might notice if working in the REPL:
- There's no quick way to del/clear the current environment. Namespaces are planned but not yet implemented.
- If a function is redefined, there is no automatic recompilation of dependent functions. It's easy though to list or recall existing functions.
Give it a go. It's a lot of fun. Some of the core team have been invited to present at Lang.NEXT.
Depending on the extent of your plotting needs, Julia's graphics/plotting capabilities will probably not yet match something like gnuplot or ggplot2.
Currently Julia handles parallelism on its own: you use the parallel primitives or higher-level functions provided by the language, not MPI, and run over multiple processes using either "julia -p <N>" (on a single machine) or "julia -hostfile <host list>" (on a cluster). The Julia runtime does the networking itself in the background: currently I'm pretty sure this tunnels over ssh.
I rather like Julia's parallel syntax, but I'd really prefer that the inter-host communication be done via MPI on the back end for performance. Something to hack at if I ever get time...
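For anyone curious what those primitives look like, here is a minimal sketch using the higher-level functions (in current Julia these live in the Distributed standard library; slow_square is a made-up function for the example):

    using Distributed
    addprocs(4)        # same effect as starting with `julia -p 4`

    @everywhere slow_square(x) = (sleep(0.1); x^2)   # define on every worker

    pmap(slow_square, 1:16)   # farm the map out across the worker processes

The runtime ships the code and data between processes itself; there is no MPI anywhere in the stack.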
I love it when new projects come out of the gate with a `/debian/` directory. Great way to reduce barriers to adoption. I'm actually surprised that it is not in experimental yet...