意外に公正で草

どうせ julia 上げ記事かと思ってみてませんでしたが、覗いてみると意外に Fortran も上げられてました。お金が絡むとシビアです。

要旨

During the last months, I have been working intensively on quantitative intraday strategies. As a side result, I have tested workflows for similar tasks in Python, C, Fortran and Julia. Here are my findings.

I work on projects related to instruments trading (i.e. I design and simulate derivatives market algorithmic/quantitative strategies).
I have not used Machine Learning or AI techniques in these strategies, just plain/vanilla statistics and simulation.
Handled data sets are large but not huge, normally my simulations cover 30 million records data sets per asset/instrument, every data is used several times, and I do parametric and Monte Carlo analysis. This implies a large number of iterations.
I am not an expert programmer and I am not interested in becoming one, I just want to focus on the market logic and the strategies that exploit profitable edges.

I need a language that can deal easily and without efforts with large data sets.
I need speed.
I do not need that much speed to require multi-core or parallel processing.
I do not need —at this time— Machine Learning or AI libraries.

言語

The beginnings: Python and R
C/C++
Python + NumPy + Pandas
Python + C
Fortran
Julia, the newcomer

一部引用（太字は私の強調）

Fortran is now a niche language (currently around position 30 in the list of most used programming languages in the world) but it still has an active community in specialised fields such as high-energy physics, astrophysics and molecular dynamics. Together with C, it is also the main programming language for supercomputing. Standards such as OpenMP and MPI which define ways to distribute computing among cores and distributed computers are always implemented first for C and Fortran. 　

　

While modern Fortran has nothing to do with what most people think (upper case code, goto instructions and punched cards), its syntax shows some age. It allows a good code structure and it is quite easy to learn (even to master). My experience while porting the last simulation to Fortran was that coding was easy and that specifically, the algorithmic parts written in C were much easier to code in Fortran. The structured code was also superior to Python, although this is a personal opinion and I know many would disagree with this statement.
　　

There were also some problems: a harder integration with graphical tools and the fact that variables need to be declared in advance (something I do not like) but in my opinion, the main issue was that debugging was hard because Fortran at the end ends up calling a C compiler. So my experience is that debugging Fortran was a bit harder than debugging the Python+C solution.
　

On the positive side, Fortran also had some unique solutions to deal with array and structured data, which includes custom index (array[2000:2020] is a valid index range, something that no other language can achieve), vector operations and simple ways to initialise variables and structures.
　

Fortran or C is the way to go if you need speed and plan to do analysis requiring multi CPU and/or multi-core —HFT industry uses C++ intensively — , and it is not casual that both AMD and Intel keep their compiler divisions selling both C++ and Fortran compilers. But if you do not need that much speed, it might be better to trade off some performance for a more friendly environment easier to debug and the possibility to use DataFrames.
　

Fortran performance is astonishing. Fortran was able to read a whole year of prices from a CSV file, convert them to integers to avoid loss of precision, round them down to contract ticks and store them into memory in around 1 second. This means that a strategy using 20 years of 1-minute prices would be loaded into memory and ready to be used in 20 seconds. It simply outperforms by far any other language I tried.
　

Its native usage of arrays also made calculations very fast too. In many operations Fortran outperforms C. Believe it or not, 60 years later, it is still the number one number cruncher and its structure is well suited for simulation problems because it was designed with that kind of problems in mind.