
Problem Background

Recently, I joined a government agency as a software engineer/scientist/analyst. Previously, I worked in the software industry, where I gained 3 years of software engineering experience (on top of about 7 years in computational science/scientific computing). My current job is to come up with a strategy for modernizing a legacy scientific program.

The scientific program to modernize is a large legacy computational system that basically does mathematical optimization. Development started in the 1990s and has not kept up with best practices, unfortunately. It was/is written by scientists and analysts.

The main component of the system is a Fortran-based program that does the optimization (written against various standards starting from Fortran 90, with some newer features incorporated, and compiled with a 2018 compiler). The program consists of 400K lines of Fortran code, 20K lines of shell scripts, and 60K lines of external math solver code. There is no test suite, hence the legacy label. The program can be thought of as a dozen modules, each describing a particular physical component's behavior in the optimization. The general flow of the Fortran program is described in a main routine, where these dozen modules are called sequentially. The main routine does some other data orchestration and I/O as well. There is some interface to commercial products and optimization solvers, probably through a home-grown Fortran wrapper. One of the biggest issues IMO is the use of global variables: both main and the modules have access to these globals, so changes to the state can be made from anywhere (see my specific question).

There is a lot of home grown code for sub-systems or utilities that manage the main Fortran program, written mainly as shell scripts. These sub-systems include:

  • a queuing system that manages the executions of the main Fortran program on internal on-prem Windows servers,
  • a post-processor that converts the Fortran UNF files to CSV and Excel format,
  • a custom visualization package written in Visual Basic that plots the results of the Fortran program,
  • version control utilities written as wrappers around the RCS version control system,
  • a compiler utility that wraps the Fortran compilation.

Those are the main sub-systems or utilities necessary to work with the Fortran program and its input/output, but there are loads of other Fortran programs and shell scripts that do longer-term things like server space management and license management.
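For the post-processor in particular, here is a minimal Python sketch of what a replacement might look like. It rests on assumptions not stated above: that "UNF" means Fortran sequential unformatted output, that each record is framed by 4-byte little-endian length markers (a common compiler default, but compiler-dependent), and that every record holds only 64-bit reals. The function names are hypothetical, and `make_record` exists only to fabricate demo input.

```python
import csv
import io
import struct


def read_records(stream):
    """Yield the raw payload of each sequential unformatted record.

    Assumes 4-byte little-endian length markers before and after each record.
    """
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of file
        if len(head) != 4:
            raise ValueError("truncated record header")
        (size,) = struct.unpack("<i", head)
        payload = stream.read(size)
        tail = stream.read(4)
        if len(payload) != size or tail != head:
            raise ValueError("record markers do not match")
        yield payload


def records_to_csv(stream, out):
    """Write each record as one CSV row, assuming it holds only float64 values."""
    writer = csv.writer(out)
    for payload in read_records(stream):
        if len(payload) % 8 != 0:
            raise ValueError("record is not a whole number of float64 values")
        count = len(payload) // 8
        writer.writerow(struct.unpack(f"<{count}d", payload))


def make_record(values):
    """Build one record in the assumed layout (demo helper only)."""
    payload = struct.pack(f"<{len(values)}d", *values)
    marker = struct.pack("<i", len(payload))
    return marker + payload + marker


if __name__ == "__main__":
    raw = io.BytesIO(make_record([1.0, 2.5]) + make_record([3.0]))
    text = io.StringIO()
    records_to_csv(raw, text)
    print(text.getvalue())
```

Real output files usually mix integers, reals and character data within a record, so the per-record layout would have to come from the Fortran `WRITE` statements that produce them.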

My immediate team is responsible for the Fortran code execution and integration with other modules. Not all 400K lines of Fortran are in our scope, just maybe 10-20%; the rest is with other groups responsible for the dozen modules, which introduces some organizational pains since we have no control over their code. My team consists of me and another software developer, both mid-level software developers converted from scientific computing. A junior software developer with a traditional background in software and CS is joining shortly. Our senior software developer (one of the original developers of the entire system) is retiring in 1 month, and we are in the process of trying to find a replacement.

Problem

My question is: What are the components and sequence of the modernization plan/strategy that I should consider? The modernization is basically the process of moving from legacy to a more modern process, both technically (e.g., architecture, frameworks) and organizationally (e.g., agile process management for development).

Proposed strategy

Currently, at a high level, my plan is to:

  1. assess extent of home grown code for systems that are not part of the main Fortran program;
  2. replace each of these home-grown solutions with a best-practice open source solution, so we maintain as little code as possible;
    • current order is modern VCS (Git/Gitlab), then queuing system, then viz package, but order will be determined by how much code there is per sub-system.
  3. with the remainder of the code - hopefully just the main Fortran program and not some vital sub-system that we cannot find an open source solution for - capture current behavior with characterization tests;
  4. refactor (update Fortran, port all functionality that doesn't do number crunching from Fortran to Python, etc.), make sure tests pass, repeat;
  5. "futurize" code by updating architecture to enable cloud compute (to avoid vendor lock in), using Docker for containerization.
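To make step 3 concrete, here is a minimal sketch of a characterization ("golden master") check in Python. It is an illustration under assumptions, not a prescribed harness: `run` is a hypothetical stand-in for invoking the Fortran executable (for example via `subprocess`) and parsing its output into a flat list of numbers, and the baseline is stored as one JSON file per case.

```python
import json
import math
import tempfile
from pathlib import Path


def check_against_baseline(run, case_name, inputs, baseline_dir, rel_tol=1e-9):
    """Record the output on the first run; afterwards compare to the recording.

    `run` is a hypothetical callable standing in for the legacy program.
    Returns "recorded", "unchanged" or "changed".
    """
    baseline_file = Path(baseline_dir) / f"{case_name}.json"
    actual = [float(x) for x in run(inputs)]
    if not baseline_file.exists():
        baseline_file.write_text(json.dumps(actual))
        return "recorded"
    expected = json.loads(baseline_file.read_text())
    if len(actual) != len(expected):
        return "changed"
    # Compare with a relative tolerance, since a new compiler or refactor
    # can legitimately perturb the last few bits of a floating-point result.
    same = all(math.isclose(a, e, rel_tol=rel_tol) for a, e in zip(actual, expected))
    return "unchanged" if same else "changed"


if __name__ == "__main__":
    def legacy(inputs):  # hypothetical stand-in for the Fortran executable
        return [sum(inputs), max(inputs)]

    with tempfile.TemporaryDirectory() as tmp:
        print(check_against_baseline(legacy, "case1", [1.0, 2.0], tmp))  # recorded
        print(check_against_baseline(legacy, "case1", [1.0, 2.0], tmp))  # unchanged
```

The tolerance is a design choice: exact comparison flags harmless floating-point noise, while a loose tolerance can hide real regressions, so it likely needs tuning per output quantity.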

Research

I've looked at some great discussion of similar topics:

But notice that some of these questions and answers are almost 10 years old, so I wonder if there are better approaches available. Also, I am dealing with a procedural scientific computing environment, rather than a heavy OOP business app, so perhaps the principles mentioned in the above Stack Exchange links don't carry over as well. I am also not a senior software engineer, so I'm not sure if I am even using the right terms in search and question formulation. There is also the complication of the scripts and utilities in the system: this effort is not just about porting or refactoring Fortran, which makes this situation and problem unique.

Thanks!

ximiki
  • It sounds like you already have a pretty good handle on this with your proposed strategy. Did you have any *specific* questions about that strategy? – Robert Harvey Feb 10 '20 at 23:15
  • There's one thing you haven't mentioned in your question that I think you need: a *software architecture.* At a minimum, that should include a class diagram, an API specification and a UI framework. – Robert Harvey Feb 10 '20 at 23:18
  • By the time you finish refactoring, Fortran spaghetti will be in vogue again. – Ewan Feb 10 '20 at 23:52
  • Here's a thought for you to consider. The 6DOF simulation group at TI DSEG Forest Lane had to develop a simulation of a vehicle sight. Traditionally, these were written in FORTRAN. As an experiment, they did it in PASCAL. They said, afterwards, that they lost a few percentage points in performance, but they more than got it back in maintainability, to the point that they would never again do a simulation in FORTRAN. – John R. Strohm Feb 10 '20 at 23:52
  • *"But notice that some of these questions and answers are almost 10 years old, so I wonder if there are better approaches available."* - no, the answers in these posts are pretty timeless and still valid. Voting to close as a duplicate of the first one, since it contains an extensive strategy for handling this kind of legacy system (no OOP required for this). – Doc Brown Feb 11 '20 at 06:59
  • @DocBrown, there are two other differences that I state besides the age: first, the procedural vs OOP programming style, and second, the existence of the "baggage" of utilities and shell scripts for orchestration (so the problem scope isn't just the codebase, it is all the other stuff to help the code do its job). I don't see how the complexity of this question is **directly and completely** answered in the **one** question that you propose answers it. I believe this is unfair in that future answers can't be posted to this complex problem. – ximiki Feb 13 '20 at 16:59
  • @ximiki: I see almost nothing in that linked answer which cannot be applied to non-OOP, scientific software where the code base also contains utilities like shell scripts. But in case someone wants to write an answer which brings in something really new, they can always ask me or a diamond for reopening (don't ask gnat, that will be a waste of time). – Doc Brown Feb 13 '20 at 18:19
  • @DocBrown, understood, thanks. That's the clarification I was looking for (that those approaches in the other OOP-focused answer are general enough and applicable to non-OOP as well). – ximiki Feb 13 '20 at 20:08

1 Answer


OK in all seriousness, this code has worked for 30 years. It will work for 30 more.

You could spend your life 'modernising' it and only add bugs.

Start walling bits off and componentising so you can write the new bits the way you want.

Concentrate on measurable improvements in performance, fixing bugs and new features. Gradually improve the code as you work on it rather than attempting to refactor for its own sake.
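As one possible reading of "walling bits off", here is a hedged Python sketch: new code depends on a narrow interface, and a single adapter is the only place that knows how the old program is invoked. All names here (`Optimizer`, `LegacyOptimizer`, `Result`, `report`) and the assumed legacy output convention are hypothetical illustrations, not anything taken from the system in the question.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


@dataclass(frozen=True)
class Result:
    objective: float
    solution: tuple


class Optimizer(Protocol):
    """The narrow interface that new code is written against."""

    def solve(self, coefficients: Sequence[float]) -> Result: ...


class LegacyOptimizer:
    """Adapter: the only place that knows how the old program is called."""

    def __init__(self, invoke: Callable[[Sequence[float]], Sequence[float]]):
        # `invoke` could be a function that shells out to the Fortran binary.
        self._invoke = invoke

    def solve(self, coefficients: Sequence[float]) -> Result:
        # Assumed legacy convention for this sketch: [objective, x1, x2, ...]
        raw = self._invoke(coefficients)
        return Result(objective=raw[0], solution=tuple(raw[1:]))


def report(optimizer: Optimizer, coefficients: Sequence[float]) -> str:
    """New code: depends on the interface, not on the legacy program."""
    result = optimizer.solve(coefficients)
    return f"objective={result.objective:.3f} n={len(result.solution)}"


if __name__ == "__main__":
    def fake_legacy(coefficients):  # stand-in for the real program
        return [sum(coefficients), *coefficients]

    print(report(LegacyOptimizer(fake_legacy), [1.0, 2.0]))  # objective=3.000 n=2
```

With this shape, a rewritten component can later be swapped in behind the same interface without touching the code that calls it.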

Ewan
  • Fully agree. Yes, this code is messy because it was written by scientists, not programmers. However, *because* it was written by scientists, not programmers, *and* is messy, it is *full of implicit hidden domain knowledge that you will never be able to grasp* because you are not a scientist. It is almost guaranteed that you will introduce bugs when you touch this code *and not even notice it* because you lack the domain knowledge. If this code had been designed using Software Engineering Best Practices, the domain knowledge would be explicit, but alas, it wasn't, so it isn't. – Jörg W Mittag Feb 11 '20 at 02:48
  • I would start by adding automated tests, so that modules can be measured as they are replaced. – Robert Baron Feb 12 '20 at 18:24
  • The purpose of the refactor is to improve maintainability, so that a larger pool of programmers can understand the code. Unfortunately, this isn't a readily measurable improvement like bugs fixed, but it is an important part of software engineering. – ximiki Feb 20 '20 at 01:20
  • Most of the parts that are messy - and that I want to prioritize rewriting or finding open source solutions for - are the home-grown utilities and shell scripts. The domain knowledge code that is captured in Fortran will likely remain largely unchanged; the only changes I plan on making to these Fortran parts are to modularize them and loosely couple the modules. Totally agree that messing around with years of scientific knowledge expressed in less than ideal, complex code is a bad time. – ximiki Feb 20 '20 at 01:31
  • @RobertBaron, I like the idea of automated tests, but how do you know when you have added enough coverage for edge cases? I understand that characterization tests can give a good test bed of larger regression- or integration-type tests, but the concern is that these tests may not be rigorous enough, and some functionality may be missed during the refactor and surface as a bug much later, when it would be nearly impossible to trace. – ximiki Feb 20 '20 at 01:37
  • @ximiki Firstly, any testing is better than no testing. You start by writing a few tests. Then, whenever a bug is found, you add a test that replicates it, and that test becomes part of the test suite after the bug is fixed. You can use a code coverage tool to see which parts of the code are being exercised and which parts are not. The toughest thing to do is to change corporate culture to enable this, as both management and developers need to be behind it. – Robert Baron Apr 09 '20 at 11:10