Recovering ancient FORTRAN FFT Code
Why?
I stumbled upon a Twitter Post by Rudiger Mohr stating that he uncovered a scan of some FORTRAN code used for FFT and that he needed help transcribing the listing attached to the document. He ended with
Warning: You might learn some Fortran
Don't tempt me with a good time I thought and got to work.
The document he found was a 1967 technical report titled "Three Fortran Programs that perform the Cooley-Tukey Fourier Transform" upon basically all of the FFT code we use today is based on. Specifically the FOUR1
subroutine.
However there are three implementations in the document. And the comments documenting the FOUR1
program mention something along the lines of "btw this other implementation is faster".
PROGRAMS FOUR2 AND FOURT ARE AVAILABLE THAT RUN TWICE AS FAST
From the preamble of the
FOUR1
subroutine
The FOUR2
subroutine from the same document which is nowhere else on the internet to be found. It being faster than the carbon copy of all FFT code definitely seems quite intriguing.
How hard can it be?
The task seems simple: Copy the code from the annex of the paper into a file, build it and it works. That's the theory.
But, the print of the listing is very bad in a few places and the scan doesn't make it better.
This part is easy:
IF(M-NP1)160,140,140
J=J+M
This is a bit harder but managable:
CONTINUE
IPAR=3-IPAR
And this is basically impossible:
! ?!
The Code
The programs are written in Fortran II and intended to be punched onto puchcards. Therefore some restrictions that seem very strage for the comtemporary programmer apply.
The code is in fixed format, that means one line of code can b e at most 72 charachters in length and the first six columns have special meaning. If there is anything (here a C
) in the first column the row is considered a comment.
The columns 2-5 are reserved for labels. Those have to be numeric and unique per program but can be in any order. Being restricted to four characters means that there are at most 9999 labels in a program. But if you manage to program yourself into that corner, you have definetely different problems. Chances are that there is no computer (remember, you are in 1967) in the world with enough memory for your program anyway.
Since some code lines may need more than the space available in one of the fixed rows you can continue on the next row. Anything in the 6th column means that this row is a continuation of the previous one.
There is no distinction between upper case and lower case, the program listings are in all-upper-case.
Space is precious in Fortran programs so there is no indentation or unneccessarhy whitespace whatsoever. In fact, whitespace has no meaning in a Fortran program and you can completely omit it. It would make for an absolutely unreadable code though.
The document is from 1967 so it employs some programming techniques that are considered ancient even in the Fortran 77 standard.
A simple if
statement has a strage format and this way of writing if's can only be found in very old Fortran programs.
The format is
IF (number1)number2,number3,number4
and is basically a three-way goto: If number1
is less than 0
, goto the statement with the label number2
. If it is equal to 0
goto number3
and if it is greater than 0
goto number4
.
The full transcription for both the programs FOUR2
and FOURT
can be found in the GitHub repository.
FOUR2.f
The FOUR2
subroutine is actually quite short (200 lines with comments, 135 code) with most of the code readable in the listing. Still, in the readable parts determining the difference between +
and *
requires some sort of divination.
The best thing to do in the really hard portions of the listing is guess based on context and patterns and try.
In the beginning of the PDF there is an "errata" clarifying some of the really hard to read parts. It helps a lot, but there are still some places left where our best strategy is to guess.
FOURT.f
The subroutine FOURT
is a bit longer than FOUR2
(557 lines with comments, 394 code) and thus there are more places for transcription errors to occur.
I've located (through the art of googling) another Fortran 77 subprogram also called FOURT.f
archived through some ESO FTP server as part of the STSDAS IRAF package. The package that the Fortran code comes from was assembled in 1993 or 1994. 25 years after the release of the 1967 document!
The code in the ESO archive is definetely based on a copy of the code in the 1967 document, but is has some differences. Differences that also are interesting to study, but not in the scope of this post. It would be interesting to know where I. Brusco (he created the package where this file is used) got his copy of the FFT routine from.
The two versions of FOURT
are similar enough to really aid in the transcription of the listing.
Chasing errors
The Compiler
Since reading of the listing is not as straight forward as anticipated, there inevitably are some errors in the initial transcription. Some tiimes there are guesses on what might be written based on the surrounding code or on patterns found elsewhere. And then there are typos made during the transcription. The way the listing is printed also lends itself to skipping lines when reading, which also doesn't help.
Finished the first transcription and the first barrier to pass is the compiler. It has some checks and it will yell at you if you for example reference a label that doesn't exist.
Crashing
Second hurdle: Running the code.
The code given in the paper is a SUBROUTINE
, so it has to be called. So, I write up a quick PROGRAM
that calls the FFT code. But what to expect, how do we know that this is indeed doing a fast fourier transform and not some nonsensical mathematical manipulations? We could try to understand the FFT by heart and generate some test values. But that would require effort.
Instead we do a forward transform followed by an inverse transform. One would expect (if the transform is injective, which the FFT should be) that after a forword followed by an inverse transform the original input data is reproduced.
! NOTE THAT IF A FORWARD
! TRANSFORM IS FOLLOWED BY AN INVERSE TRANSFORM, THE ORIGINAL DATA
! WILL REAPPEAR MULTIPLIED BY NN(1)*NN(2)*...
Okay, all set for a test run, let's go!
./test-four2
Program received signal SIGSEGV: Segmentation fault
Well, not so fast then. Let's go hunting for memory invalid accesses. Probably some wrong transcriptions, calculating the index into the data array wrong or getting some loop parameters wrong.
That's a good excuse to get some old gdb
knowledge back. To get nice backtraces and information about where the heck we crashed exactly, we need to add the -g
flag to the gfortran
call.
Running correctly
After a lot of back and forth with wrong indices, pluses, minues and asterisks the code actually runs. It runs without crashing. Success! But is our goal to just run without crashing? No, we want to do the fast fourier transform! Do we actually?
Not at first, but there definetely were some correct numbers in the output of the first iteration of the code that actually did run.
I used a sample data set from the RosettaCode project and added some pretty printing to the console for checking to the test program.
There were a lot of correct numbers in the output, but they seemed to be shuffled a bit. The imaginary part appears one data point too early and has it's sign flipped. Strange. Nothing a good ol' hard read of the listing and some diff
'ing with the code from the ESO archive can't fix.
And it got fixed!
The sample data from Rosetta now gets correctly transformed and after transforming the data back, everything is multiplied by 8
as expected. The 57-year-old code lives once again!
The output of the FOUR2
program is a lot worse with all the imaginary parts missing completely and the real parts also not correct.
What's still to do?
The FOUR2
program needs to be running correctly, as it now does something but certainly not the right thing.
More rigorous testing of the FOURT
program would be nice. Multiple dimensions, large datasets, that kind of stuff. It would also be nice to cover the four different cases that the input data gets sorted into (see line 164).
Checking the claim that the code in FOUR2
and FOURT
is twice as fast as the FOUR1
variant would be nice, as well as comparisons to current implementations.
Doing the very unfair comparison with Python transforming the 8-element complex vector from above ten million times takes about 8
times as long using numpy
.
Code | Time |
---|---|
FOURT.f |
1.665 s |
numpy | 13.969 s |
Created using hyperfine
with two warmup runs
Conclusion
This was a very fun expedition into some very old code that is relevant up to this date. I learnt some Fortran for which I was looking for an excuse for quite some time now. All I need now is an 1401 to let this run on era-appropriate hardware.
If you find any errors and want to help, feel free to open a PR on Rudiger's GitHub or shoot me an E-Mail.