in the implementation of the tail_xxx functions does allocate a new array.
I wrote a test program to use foldl to calculate the sum of an array, and it seems to have n^2 time complexity. Also a test program summing an array of length 10000 of 64bit integers uses 395816 kbytes of memory, which is close to 8 * 10000^2 / 2 / 1024 = 390625.
Thank you jerf and sampo for your very helpful comments. This is something that I had on my mind during the implementation but focused primarily on expressiveness.
While the array slicing in Fortran is O(1), the implementation of tail does make a copy. Though a good compiler should be able to replace tail(x) with x(2:), the Fortran standard does not guarantee it to do so. The first step forward that I see would be to replace tail(x) with x(2:) in the internal implementations of fold[lr].
In any case, the time complexity for these functions is definitely something I plan to document soon.
> replace tail(x) with x(2:) in the internal implementations of fold[lr].
Tried this for foldl with gfortran. Totally helps. Summing 10 000 int64's, memory consumption drops from 400M to 4M. Also time complexity becomes linear.
I wrote a test program to use foldl to calculate the sum of an array, and it seems to have n^2 time complexity. Also a test program summing an array of length 10000 of 64bit integers uses 395816 kbytes of memory, which is close to 8 * 10000^2 / 2 / 1024 = 390625.