I had to learn Fortran about 3 years ago, when I got into particle physics.
I had the same reaction: Fortran? Still used? Omg! But after three years, I learned to like it. In its modern form, it is simple, fully vectorized, and fast. I like its syntax more than Python (explicit end is better than significant whitespace). And did I mention it is fast?
So — you have a modern vector math-oriented language comparable to MATLAB or Numpy, but much much faster. And with excellent, state of the art libraries. What’s not to like?
> And with excellent, state of the art libraries. What’s not to like?
Here's my gripe with Fortran: as a computer scientist who works with combustion chemists, astrophysicists, and nuclear physicists, it's disheartening to see the number of domain-specific optimizations that these scientists hand-code over and over and over ad nauseam in hundreds of code bases all around the world, simply because they think their Fortran compilers are good enough. They don't realize that it would be much more productive if their communities standardized around individual DSL compilers capable of doing higher-level transformations specific to their individual domains, so they don't have to do complicated transformations by hand (and often mess them up). It's fine if those DSL compilers generate Fortran out the back end so they can interoperate with existing optimized Fortran libraries, but holy hell people, we are living in the 21st century; can we please stop programming in primitive data types (without any unit type checking to boot!) and calling MPI send/recv directly? Scientists have better things to be doing with their time than managing bits directly; raise the level of abstraction by building community-specific DSLs, standardize, and let's get on with our lives.
Meanwhile, you can take the code from a paper of 20 years ago and it will run flawlessly, without going through dependency hell and battling compiler versions. Code reuse is not always a good thing; self-contained, zero-dependency code is often extremely important. I’ll stay with Fortran for my physics needs, thank you.
It's true that it could happen with any language, but it seems to be especially prevalent and widespread with Fortran. Something about the culture of the language encourages people to dig their heels in and assert that it is good enough.
I have to use Fortran as part of my day job and mostly don't like it. What I hate most of all is the lack of any decent string data type and of a built-in C++-STL-like data structures library. It would be great to have hash maps, etc. The poor built-in library compared to Python and SciPy is extremely frustrating. It's very hard to get anything done which is not just array manipulation.
The Fortran I worked with in my minicomputer days was for the most part 100% standard; the exception was its string handling, which was excellent. So I wrote all kinds of text-translating tools (and lots of other tools) using that Fortran compiler. That was a long time ago...
I agree with you. It really reads like Python and seems more explicit, from the little I have seen. I wonder how you went about learning it? Can you recommend a book? Thanks.
Yep. That's the community I had in mind. Particle physicists love Fortran :-)
However, I'm not so sure about the speed argument anymore. And in any case for most applications (which are rarely big simulations) it's not worth the sacrifice of readability and modern tools IMO.
Even at CERN people are re-writing some tools in python. (probably wrapping some C++ code, but still ... it's not Fortran).
A YouTuber I follow recently finished his PhD and used FORTRAN extensively. He said that he started with Python and it was incredibly slow, so he rewrote everything in FORTRAN.
It may be that his use case (something with cloud or other weather-modeling functions) was just particularly well suited to Fortran.
Any compiled language will be drastically faster than an interpreted scripting language like Python (although numpy/scipy are generally C or Fortran libraries compiled to be called from Python). Having said that, I used Fortran for my PhD to do radiation simulations because I was implementing a new methodology into existing code.
In my opinion Fortran is easier and safer to use for scientists with little background in coding because it was built to do math/simulations really well AND it's readable almost like pseudocode (compared with C or C++). Also, modern Fortran is object-oriented, which makes building reusable tools and large packages pretty easy.
Having said all of this I think a lot of the physics/nuclear/aerospace/finance community is probably transitioning into more modern languages and starting to employ people with actual CS backgrounds to build with more updated coding practices.
Did they write the Python correctly, though? If you vectorize your algorithms well enough to exclusively use numpy/scipy and never use Python loops, then Python is really, really fast (not because Python is fast but because numpy/scipy are fast).
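To illustrate the vectorization point, here is a minimal sketch (function names are mine, purely illustrative): the same reduction written as a pure-Python loop and as a single NumPy call, which runs its loop inside compiled C/Fortran code.

```python
# Same computation, two ways; names here are illustrative, not from any library.
import numpy as np

def dot_loop(a, b):
    # Pure-Python loop: every iteration pays interpreter overhead.
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def dot_vectorized(a, b):
    # Single NumPy call: the loop runs inside compiled BLAS code.
    return float(np.dot(a, b))

a = np.arange(100_000, dtype=np.float64)
b = np.ones_like(a)
assert abs(dot_loop(a, b) - dot_vectorized(a, b)) < 1e-6
```

On arrays of this size the vectorized call is typically orders of magnitude faster, which is the whole point of writing "numpy-style" Python.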
Exactly. That's why I say that the speed argument is a bit obsolete by now. We can have the best of both worlds with Python: elegance/simplicity and speed.
The focus on the speed of execution also downplays two other important aspects of python: speed of writing code and possibility to visualize results easily. As a theorist these two aspects outweigh any argument in favor of Fortran. The only serious competition to python for me is mathematica.
The truth is, any language other than Fortran/C is slow for any serious high-performance-computing task (which is not the same as everyday back-of-envelope calculations and plotting).
CERN is an exception; they are famously a C++ shop. I really really don’t like C++; nearly anything else is preferable to me. Fortran is simple and math-oriented; C++ is horribly complicated and system programming oriented.
There is a subset of C++ that is really pleasant to work with, performant, and simple-ish. Unfortunately, the community around the standards/development process just seems to want to keep adding more, more, and more to an already top-heavy language. Every idea in language design just has to be added in.
When you add in the problems from legacy C compatibility, you have a language that no one person can ever hope to grasp in its entirety. I mean, it's so complex now it pushed Scott Meyers into semi-retirement.
I wish they would have just dumped C legacy stuff and started over some time in the 90s. Actually, the language D is precisely what I wish C++ had become. I just wish D was more widely adopted.
For me, modern Fortran is the way to go for high-performance-computing applications. It's intuitive and easy to use, and has a large support base with highly optimized compilers. Physics has been backwards in the past, sticking with Fortran 77 for too long. But modern Fortran is way better than anything else in physics/HPC, for many of the reasons given in the presentation.
I urge colleagues and students not to fall for Fortran prejudices based on crusty old Fortran 77 code. Python is nice and dandy, but it's not for HPC. When you suggest that students stop using Fortran/C and use Python, they might have been using the wrong tool in the first place, and this has nothing to do with the language.
They use the Fortran 90 version I suppose. I should've said that I'm a theorist though and HPC is not really an issue for us (And if I do want something to be fast I'd use C.)
Heh. I was talking to an older relative of mine recently who explained how excited he was when FORTRAN IV came out. Finally there was a programming language for 'normal' people which didn't require that you know about computers or programming, but just let you type in what you wanted to solve and it would just do it for you.
Keep in mind that the Python data science community is built around Fortran - NumPy and SciPy have BLAS/LAPACK at their core and you still need a Fortran compiler to build them fully from source. I agree that Python and Jupyter are a much nicer interface for students / end users, but if you want to hack on the numerical algorithms themselves, it still seems like Fortran is the best language for it.
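A minimal illustration of that layering: `np.linalg.solve` dispatches to LAPACK's `*gesv` routines, which come from the Fortran LAPACK code base, even though the user never sees that.

```python
# Solving a small linear system; NumPy hands this to LAPACK's dgesv.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)       # LAPACK dgesv under the hood
assert np.allclose(A @ x, b)    # x == [2., 3.]
```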
Is Fortran really the best language, or is it just too hard to rewrite the code in C / Rust / Julia? I believe it's just easier to wrap it, as it contains tricky numerical edge cases that are too complex to understand for an average software programmer.
Thirty years of edge case fixes are harder to re-write than anyone expects. Even harder is trusting the result. Did you get all the edge cases in the re-write? Will my code be inaccurate because of one you missed?
I should clarify - I mean that it's the best language because there's a large body of existing code and a community around it, not because Fortran is inherently better. I think you could rewrite it in Rust or Julia or something if you had a bunch of engineering effort and also enough organizing effort to make a good community around it and convince the non-NumPy/SciPy users to move to your new thing too.
To pre-empt any wrong interpretations or conclusions: the Rust version heavily relies on optimization annotations and direct calls to SSE functions. The Fortran version uses no special code and relies purely on the compiler optimizing code and algorithms. The Rust version relies on hand-tuned pieces and is several times longer. It's just a matter of work for someone to write an equivalent version in Fortran or C that will be equally fast or faster.
If you want to get a real-world impression look at the other Rust implementations that roughly correspond to the Fortran code. They are almost 2 times as slow. This gives you some real-world insight on how much performance you can achieve using Fortran instead of Rust and spending the same time writing code.
That is the big point about Fortran that keeps getting overlooked. Sure, heavily optimized C/Rust written by an expert in fast numeric code will absolutely hold its own against the equivalent Fortran. However, naively written C/Rust, written by non-programmers in the clearest, most obvious way possible, will almost always be much slower than the equivalent Fortran code.
The C version was ported to be the Rust one, it seems, and uses those intrinsics too. And, eventually this code will get to be a bit higher level while having the same output; those libraries are still a bit experimental though.
> The C version was ported to be the Rust one, it seems, and uses those intrinsics too.
No. Although the Rust program was initially presented to me as a "port of fastest C SIMD variant" the programmer made additional optimizations not found in the C program:
- Moving the loop from outside into "bodies_advance(..)" (SSE pipelining(?))
- Bundle intermediate variables/arrays as struct NBodySim (caching)
- Fit array-sizes within struct NBodySim to the number of bodies (caching)
Matmul is a very common operation; you get problems when you have more exotic computations.
As an example where it got harder for me to find code is numerical CDF function approximation for bivariate / trivariate normal distributions. This is of course just my example, but I'm sure there are a lot of similar operations that are really hard to rewrite because the math is so complex.
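As a side note on how these routines tend to get reused rather than rewritten: SciPy's multivariate normal CDF wraps (I believe) Alan Genz's Fortran MVNDST code, and for the standard bivariate case there is a closed form to check it against.

```python
# Hedged sketch: bivariate normal CDF via SciPy (Fortran MVNDST underneath,
# to my knowledge), checked against the closed form for the standard case:
# P(X <= 0, Y <= 0) = 1/4 + arcsin(rho) / (2*pi).
import numpy as np
from scipy.stats import multivariate_normal

rho = 0.5
mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
p = mvn.cdf([0.0, 0.0])

expected = 0.25 + np.arcsin(rho) / (2 * np.pi)
assert abs(p - expected) < 1e-4
```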
That stuff is actually the easy stuff. If you write standard Julia code for those kinds of algorithms, you'll hit C or Fortran speeds your first time if you know what you're doing. Lower-level kernels like BLAS have a lot of cache optimizations in how they do things like blocking, so it took a while for things like StaticArrays to be used in a way that recreates what's going on there. There is still some work needed on mutable stack-allocated buffers in order to optimize more of the cache handling, but it is quite close now!
If it's so easy can you show me an example code for it?
The speed doesn't matter; it's the algorithm that's important. Please make sure that you handle all the numerical instabilities, and show me the Julia code for bivariate and trivariate normal CDF computation that is not using integration, but a fast, correct, and precise approximation.
Did you not look at StatsFuns.jl? Bivariate is here: https://github.com/JuliaStats/StatsFuns.jl/blob/e21bc26b1773... . For trivariate you'd just do the same kind of translation process, if you have a Fortran code with an appropriate license of course. If you have an example of a Fortran file for trivariate, we can open an issue on StatsFuns.jl and get an undergrad via Google Summer of Code to translate it over, or implement it from scratch from a paper's description.
The reason why this kind of code is easy to translate is because there is a direct mapping of language features (difficult to translate code is code that uses unique language structures). The only real difficulty of mapping (non-object oriented non-distributed) Fortran into a higher level language is keeping the speed.
In my opinion, the biggest inconvenience with Modern Fortran is that the toolchain is in no way modern. There is no language-level support for a linter, a package manager, a document generator (I use Doxygen, but it is not Fortran-specific), etc. As a language geared toward numerical computing, I feel that it should have a big standard library that covers advanced math, linear algebra, statistics, ODE solvers (like Julia). Yet if you want to get something done, you may still need to reinvent the wheel on top of LAPACK. These things hurt productivity more than the syntactical features.
There are several toolchains, but they are either internal at the institutional level or commercial. The main numerical library is NAG. Still commercial, but cheap compared to the price of most cluster systems that the code is developed for. Heck, Fortran is probably the only language where people pay significant sums for the compilers (e.g. PGI, ifort).
Back in my first job we bought the NAG libraries, as it was far more cost-effective to buy them in rather than having expensive engineers reinvent the wheel.
Update: some of my coding style is still based on looking at that NAG code :-)
Although I would prefer to not ever see Fortran code (as I don't know how to call into it easily, C implementations are easier to embed in any language), some numerical methods with simple code but heavy math are just not reimplemented too often.
I believe the real value in Fortran is in these libraries that handle numerical edge cases as well. It would be much easier to move them to a newer language if they would contain unit tests, but they usually don't, so it's just not worth the risk sometimes.
The Fortran language defines a pretty clean FFI to bind to other languages. It’s not really any harder than binding to C code, which is why it’s called BIND(C) and ISO_C_BINDING. That was introduced in F03. The only period when it was murky how to interoperate was in the 90s and early 00s when compiler vendors had their own array descriptor structures that didn’t agree with each other. I worked on a language interop tool back then and spent way more time than I’d have liked reverse engineering those undocumented data structures for all of the compilers at the time.
That is surely an implementation-specific complaint; calling FORTRAN code from other languages was always very easy on VMS (same with Pascal, BASIC, whatever else).
Just read a comment here about how Python and C are slow, and why FORTRAN helps the compiler generate fast code. Maybe it is a DSL. Maybe ... a language survives for a reason.
In my first computer lecture, the lecturer asked why Fortran does not die; it should, he said. That was 1979.
I'm surprised for 2 reasons: 1/ CS folks are still talking about Fortran. 2/ The words "Modern" and "Fortran" in the same sentence.
Up to a few years ago, I often found myself in discussions with colleagues on the benefits of Fortran vs. less horrible languages such as C. And I always thought that we are so backwards in academia (physics) compared to CS people, who buried Fortran a long time ago. There are even colleagues who use Fortran 77.
Fortunately, things are changing fast now with more and more people using Python. I learned it myself about a month ago and urged my students to stop using Fortran and/or C. I see 0 reasons to torture a new student with Fortran. I am using Jupyter notebooks with my students now and things are so much better: easier/faster debugging, appealing code and interface, higher level features, visualization tools, etc.
> Up to a few years ago, I often found myself in discussions with colleagues on the benefits of Fortran vs. less horrible languages such as C.
In the context of numerics, C is in no way a "less horrible language" than Fortran. The necessity to use pointers in C for multidimensional arrays, and the large amount of "undefined behavior" in the C standard, make it almost impossible to write bug-free number-crunching code. In particular when the person writing the code is a scientist/engineer, not a professional programmer or computer scientist, they are guaranteed to shoot themselves in the foot with C or C++.
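To make the contrast concrete, here is a small NumPy illustration of the safety gap being described: the same out-of-bounds access that is undefined behavior in C is a loud, catchable error in a higher-level array language.

```python
import numpy as np

a = np.zeros((3, 4))   # a real 2-D array: no manual pointer arithmetic
a[1, 2] = 7.0          # natural multidimensional indexing

caught = False
try:
    _ = a[3, 0]        # one row past the end
except IndexError:
    caught = True      # in C/C++ this would silently read garbage (UB)
assert caught
```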
Yeah... I can write reasonably performant Fortran, but have no clue how to do matrices in C without resorting to the numerical methods book that doesn't allow you to use its algorithms and has the dumbest license ever (Numerical Recipes in C, I think). The Amazon comments are illuminating.
Why is it a surprise that CS people still talk about something that is, at the moment, alive and well? Fortran remains one of the best ways to get native vectorized code for fast math operations. Yes there are higher abstractions like python, but if you want to write code in the same domain as Fortran, you are using numpy, scipy, pandas, etc., and those are all literally wrappers over low level C and Fortran code. Abstractions don't remove the need for low level, "out-dated" coding models, they just make it easier for most people to build on them.
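For instance (a small sketch using SciPy's low-level wrappers): `scipy.linalg.blas` exposes the underlying BLAS routines, with their Fortran names, directly from Python.

```python
import numpy as np
from scipy.linalg import blas

# Fortran (column-major) memory layout, as BLAS expects.
a = np.array([[1.0, 2.0], [3.0, 4.0]], order='F')
b = np.array([[5.0, 6.0], [7.0, 8.0]], order='F')

# DGEMM is the classic Fortran BLAS matrix-multiply routine.
c = blas.dgemm(alpha=1.0, a=a, b=b)
assert np.allclose(c, a @ b)  # same answer as NumPy's own operator
```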
Keep in mind too that a lot of long scientific simulations are more complicated than vectorized loops or linear algebra. Monte Carlo algorithms were developed to simulate neutrons in the Manhattan Project and programmed with the help of von Neumann.
What's wrong with Fortran? Your argument seems to be "it's old." What are the main features you find to be lacking? What language would you propose should be used for the use cases of Fortran, and why?
Age is actually a pretty good argument for computer languages, IMO. More abstraction is better, and this usually comes with time and progress. Otherwise we would still be using assembly, which incidentally should be faster than Fortran, since most people here seem to focus on speed of execution. Python is more abstract, i.e. higher level, and this makes it more suitable for scientific tasks than Fortran for the vast majority of problems. As I said earlier, not all scientists are doing heavy simulations on a daily basis. And for scientists, the goal is their science, not the coding.
This is not a post with CS folks talking about Fortran--that is not TACC's role.
Despite the enormous amount of snark directed Fortran's way, practitioners in HPC still use it, and Python cannot seriously be discussed in the same context.
Yes, but how many of the people these slides were designed for are doing cosmological simulations? That's the relevant quantity. It's especially funny because the author's area of research is cosmological simulations. :-)
A lot? I would guess the slides are meant for advanced undergrads or first-year grad students in a physics program. See how the math-y parts of the slides are pretty sophisticated (e.g., Bessel functions) even though the programming part is pretty simple?
Yeah, but people still do all kinds of other numerical work in Fortran, like power systems, and it is much easier to read than the new C++ replacements I've seen.
Check out GFortran; it's 100% free as part of the GNU project. I believe the Intel Fortran compiler is also free (as in money, maybe not as in freedom). The speed argument does matter too: how many high-frequency-trading/fintech/big-data/banking/engineering companies are built, at some point in the stack, on high-speed libraries written in Fortran? I don't know any numbers, but any company that uses numpy is transitively also using Fortran in its stack.
I'm not sure that after a month you know enough to really evaluate a language (any language, not just Python). You know the problems and headaches that went away. But you haven't found all the new ones yet...