Twenty years ago, when I was at Princeton, my fellow graduate students in physics and I were all required to pass two foreign-language achievement tests in order to get our degrees. Since then, apparently convinced that such skills are of diminishing importance, the Princeton physics department, like most other graduate schools, has dropped the requirement.
On the other hand, skill in the “foreign” language of computer programming has increasingly been recognized as a most important piece of equipment in the scientist’s tool kit. Although a demonstration of expertise in a computer language or two has not yet become a degree requirement for grad students, having this skill is so obviously essential that young scientists will want to acquire it, requirement or not.
Not only computer scientists and mathematicians, but many physicists, biologists, and chemists need to know how to program or at least work in a lab where someone does. There just isn’t as much off-the-shelf scientific and laboratory software as people might think. And often, the off-the-shelf packages are too slow or not quite specific enough. A scientist who becomes comfortable with programming can design special programs or customize off-the-shelf packages.
It follows that science mentors must face the issue of what specific languages they should recommend that their students master. Although the Fortran language remains popular with “the older generation”—those who cut their computer teeth a decade or more ago—I am convinced that today’s students should focus on two other languages: Pascal and C. These are the most popular languages for personal computers, and PCs are fast becoming an indispensable scientific tool.
Pascal and C are both “structured” languages: program variables must be declared at the beginning of a program, unlike Basic, in which each unrecognized word is considered a new variable. The structured languages also present the code in a more logical way. Someone who already knows a little about programming can look at a Pascal program and probably figure out what it is designed to do. C, on the other hand, is more terse; it is closer to machine language and has been referred to as a “write-only” language.
As with foreign languages, learning several programming languages usually gives students a better understanding of all of them. Today’s students might want to become fluent in, or at least familiar with, both Pascal and C, and also familiar with an assembly language or a language used in artificial-intelligence programming.
The tools discussed below are designed to reduce the tedium of programming and debugging. They all fall in the “language-independent” category; that is, they work with any programming language. Many other tools are specific to a language, or even to a particular compiler, and will be discussed in a subsequent issue.
Developing computer programs is often described as an “edit, compile, debug” process. The scientist first writes the program in an editor. At this point the program, called source code, looks very much like English. In the compiling process, the source code is changed from something a person can understand to something the machine understands. The compiler does exactly what the scientist tells it to do, which isn’t always exactly what he or she really wants it to do. As a result, debugging is usually an essential step. The scientist returns to the editor to correct or improve the instructions the compiler will follow.
The importance of a good editor to program development is clear: most programming time is invariably spent inside the editor. Many compilers come with built-in editors, but these cannot match the best stand-alone editors. A good editor is line-oriented rather than stream- or paragraph-oriented, with the ability to mark and move several lines effortlessly. Thinking in terms of lines is especially important for maintaining the structure of a program during reformatting. With the prevalence of modular programming, scientists should choose an editor that allows loading of multiple files. Finally, the editor should include a macro language capability that will let the programmer customize the keyboard with easy-to-remember commands.
To understand the power and function of a macro language, consider the following scenario: A scientist writing a program is ready to test it. He saves, exits the editor, calls the compiler, compiles the source code, waits for the compiler to list the program’s errors, calls the editor, and starts over. It isn’t a difficult process, but it is time-consuming. By employing a language editor’s macro capability, the scientist can create a macro program that will, with a single keystroke, save the current file, enter DOS, compile the source code, store the error messages from the compiler, exit DOS to the file editor, place the cursor at successive error locations, and display the compiler’s error messages.
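The error-handling half of such a macro can be sketched in a few lines of Python. This is only an illustration, not the macro language of any particular editor: the function name `parse_errors`, the sample file name `orbit.c`, and the message format `file(line): message` are all assumptions made for the example, and real compilers format their messages differently.

```python
import re

# Assumed (hypothetical) message format: "file(line): message".
ERROR_LINE = re.compile(r"^(?P<file>[^()\s]+)\((?P<line>\d+)\)\s*:\s*(?P<msg>.*)$")


def parse_errors(compiler_output):
    """Return (file, line, message) tuples for each recognized error line."""
    errors = []
    for raw in compiler_output.splitlines():
        match = ERROR_LINE.match(raw.strip())
        if match:
            errors.append(
                (match.group("file"), int(match.group("line")), match.group("msg"))
            )
    return errors


# Made-up compiler output, used only to exercise the parser.
sample_output = """Compiling orbit.c
orbit.c(12): error: undefined symbol 'x'
orbit.c(30): warning: unused variable 'tmp'
"""

for file_name, line_number, message in parse_errors(sample_output):
    # An editor macro would move the cursor here and display the message.
    print(f"{file_name}, line {line_number}: {message}")
```

Lines that do not fit the assumed format, such as the banner line in the sample, are simply skipped. A real macro would then load each file and place the cursor on each reported line in turn.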
Two of the best editors currently available are Mansfield Software Group’s Kedit and Solution Systems’ Brief. Of the two, Brief offers some magnificent extras. My favorite lets the programmer undo the last 30 actions with a single command, a process that otherwise might have required the laborious moving of the cursor. Kedit, on the other hand, does not offer such a full undo function. Another difference? With Brief, a scientist can load an unlimited number of files, using a disk for overflow, while Kedit can only handle 15 files.
When programming or debugging, a scientist may want to search the source code for key words or phrases, but these searches are typically more sophisticated than those performed in a word processor. Kedit’s search targets are flexible but not as comprehensive as those allowed by Brief’s “pattern matches.”
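To see why a programmer’s search needs more than a word processor’s “find,” consider the short sketch below. It uses the regular-expression syntax of Python’s standard `re` module as a stand-in; the pattern syntax of Brief and Kedit is not shown here and may well differ. The helper name `search_source` and the source fragment are invented for the example.

```python
import re


def search_source(lines, pattern):
    """Return (line_number, text) pairs for lines matching a regular expression."""
    regex = re.compile(pattern)
    return [
        (number, text)
        for number, text in enumerate(lines, start=1)
        if regex.search(text)
    ]


# A made-up fragment of C source, one string per line.
source = [
    "subtotal = 0;",
    "total = total + x;",
    "/* print the total */",
]

# Find assignments to the variable "total" only: the word boundary (\b)
# rules out "subtotal", and the trailing "=" rules out the comment.
hits = search_source(source, r"\btotal\s*=")
for number, text in hits:
    print(number, text)
```

A plain-text search for “total” would stop on all three lines; the pattern stops only on the second.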
For C programmers, Brief definitely is the best bet among editors. For scientists not familiar with that language, Kedit is a good alternative. And for a scientist who thinks all this macro stuff sounds neat but who has neither the need nor the time to tinker, there is another option: Qedit, from SemWare, an excellent, inexpensive, multifile editor with a reconfigurable keyboard, windows, and drop-down menus.
And there are other tools as well. Polytron’s PVCS (for Polytron version control system) is invaluable for keeping track of the successive versions of a program’s source files. It lets the programmer maintain special “logfiles,” one for each “workfile” in the program. Each logfile keeps a copy of the most recent workfile together with the differences between previous versions. Special software lets the scientist “check in” a new version and update the logfiles, or “check out” an old version, in which case the program uses the stored information to reconstruct the old source code.
No matter what language scientists choose for programming, they should consider coding in assembly language those portions of the program that need to run faster than others. While the standard in the assembler world is Microsoft’s Masm, SLR Systems’ Optasm assembles code four to five times faster. Masm was written in C, while Optasm was written in assembly language. A scientist may have to compile hundreds of times when debugging a program, so a few seconds saved here and there can really add up.
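The logfile idea behind a version control system such as PVCS, keeping the newest version whole and only the differences needed to step back to older ones, can be sketched as follows. This is a toy illustration built on Python’s standard `difflib` module, not PVCS’s actual file format or algorithm; the names `LogFile`, `make_delta`, and `apply_delta` are invented for the example.

```python
from difflib import SequenceMatcher


def make_delta(new_lines, old_lines):
    """Describe how to rebuild old_lines starting from new_lines."""
    delta = []
    matcher = SequenceMatcher(a=new_lines, b=old_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))  # reuse new_lines[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("add", old_lines[j1:j2]))  # text only in the old version
        # "delete": text only in the new version, so nothing is recorded
    return delta


def apply_delta(new_lines, delta):
    """Rebuild the older version from the newer one plus a delta."""
    rebuilt = []
    for op in delta:
        if op[0] == "copy":
            rebuilt.extend(new_lines[op[1]:op[2]])
        else:
            rebuilt.extend(op[1])
    return rebuilt


class LogFile:
    """Toy logfile: the latest revision in full, older ones as deltas."""

    def __init__(self):
        self.latest = None
        self.deltas = []  # deltas[k] rebuilds revision k from revision k + 1

    def check_in(self, lines):
        if self.latest is not None:
            self.deltas.append(make_delta(lines, self.latest))
        self.latest = list(lines)

    def check_out(self, revision):
        """Return the lines of a revision; revision 0 is the oldest."""
        lines = list(self.latest)
        for delta in reversed(self.deltas[revision:]):
            lines = apply_delta(lines, delta)
        return lines


log = LogFile()
log.check_in(["x = 1;", "y = 2;"])
log.check_in(["x = 1;", "z = 3;", "y = 2;"])
log.check_in(["x = 10;", "z = 3;"])
oldest = log.check_out(0)  # rebuilt by stepping back through two deltas
```

Checking out the newest revision costs nothing, while older revisions are rebuilt one step at a time; that trade-off suits the common case in which the programmer mostly works on the current source.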
Barry Simon is IBM Professor of Mathematics and Theoretical Physics