
THE HISTORY OF PROGRAMMING LANGUAGES

This is a research paper I just wrote for my English class. I thought I might as well put it online, since Word97 can save as HTML anyway... I also thought some people might actually be interested in it: I looked around the net for quite a while to find material about the history of programming languages, and this paper might bring all those things together in one place.


   The history of programming languages is more interesting than most people think. Many people assume there is just one programming language, or maybe two, and that at any rate there are no big differences between them. For the user this is actually almost true, since in general every program can be developed in almost any language. For programmers, on the other hand, choosing the right programming language for the task they want to accomplish is very important.

   When the first electronic computers were built in the 1940s, programs were ‘written’ with screwdrivers: technicians actually walked inside the computer (the machines were as big as houses) and changed the cable connections. These computers could work on only one specific task at a time; changing the software meant changing the hardware.

   Computers then became programmable; they learned their first language: machine language. Machine language is made up of sequences of zeros and ones (on and off, current and no current) that tell the computer what to do. These zeros and ones are grouped into operation codes (op-codes for short) that accomplish specific tasks such as adding two numbers or moving a number from one memory cell to another[1]. Programming became a lot easier this way: the programmer only had to give the computer the right sequence of zeros and ones, and the computer would do the rest by itself.

   The disadvantage was that memorizing operation codes is not easy (even then there were a lot of them), and reading programs written in zeros and ones is not easy either. A lot of time was spent testing programs and correcting (= debugging) them. Memorizing things and working with high accuracy, on the other hand, is exactly what computers are good at. So programmers came up with the idea of mnemonics (from the Greek word for memory), which are simply substitution codes for op-codes. The computer does not understand mnemonics, but programs were written to translate them into machine language. These programs are called assemblers, and the language made of mnemonics is called assembly. Still, accomplishing complex tasks (such as adding ten numbers together and printing the result) required many instructions. For the more common groups of instructions, macroinstructions (from the Greek for large) were introduced: a single mnemonic standing for perhaps a hundred op-codes. Assemblers also took care of calculating jump addresses, which made changing programs a lot easier[2]. Before that, whenever one wanted to change a program, all addresses had to be recalculated and put into the machine code by hand.

   In 1953 it was estimated that half the cost of running a computer center went to programmers' salaries, and that up to half the computer time was used to develop and debug programs; taken together, that means roughly 75 percent of the total cost of running computers was spent developing programs[3]. Also, old programs did not work on new computers: since op-codes differ from one kind of computer to the next, assembly programs are not portable at all. Programmers therefore came up with the idea of so-called high-level languages, made up of macros much broader than mnemonics, which could be translated (= compiled) into different machine code for different computers[4]. These macros also took parameters. For example, the same macro could print "Hello." or "Good evening!", with the only difference being the parameter: the line of text actually printed. Mathematical formulas were allowed as well. The single line ‘c = sin(0.5)’ would correspond to hundreds of lines of assembly.
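   As a rough modern illustration of that last point (a Java sketch; the class name is my own), the single library call below stands in for all the machine-level work the text describes:

      // A one-line formula in a high-level language: the compiler and
      // the standard library hide the machine instructions underneath.
      public class Formula {
          public static void main(String[] args) {
              double c = Math.sin(0.5);  // one high-level statement
              System.out.println(c);     // prints 0.479425538604203
          }
      }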

   The first widespread high-level language was FORTRAN (= FORmula TRANslator), developed by a team led by John Backus at IBM, starting in 1954. Efficiency was the most important goal in developing the language, and little attention was paid to its elegance[5]. FORTRAN is unstructured and definitely not the language of choice today, but it was a tremendous step in the development of programming languages. It became a standard language for programming, and programs are still being developed in it. Programs written in FORTRAN could easily be ported to other computers and exchanged between sites. FORTRAN was not the only language, though: by the late ‘50s, several dozen different programming languages existed.

   When computers became more widespread at universities, students from fields other than computer science also wanted to use them; psychology students, for example, wanted to analyze their statistical studies. FORTRAN was a powerful tool, but it was very difficult to learn. As a result, BASIC (= Beginner's All-purpose Symbolic Instruction Code) was developed by Thomas Kurtz and John Kemeny at Dartmouth College. BASIC introduced a new concept: interpretation. Earlier programming languages (and still most languages today) are compiled; that is, they are translated into machine code once, the compiled code is passed around, and the source code is kept locked away for future changes. With an interpreted language, the source code is translated into machine code on the fly, instruction by instruction, by an interpreter program. This made BASIC much easier to learn, because the programming environment is much more interactive: mistakes show up immediately when the program runs, and instructions can be tested right away. On the other hand, interpretation eats up a lot of resources. Both the interpreter and the interpreted program have to be in memory at the same time, and interpreted programs can run around 300 times slower than compiled ones[6].
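   The core of any interpreter is a loop that reads one instruction at a time and carries it out immediately. Here is a minimal sketch in Java (the three-instruction toy language is invented for illustration):

      import java.util.HashMap;
      import java.util.Map;

      // A toy interpreter: each source line is translated and executed
      // on the fly, the way early BASIC interpreters worked.
      public class TinyInterpreter {
          public static void main(String[] args) {
              String[] program = { "LET X 2", "ADD X 40", "PRINT X" };
              Map<String, Integer> vars = new HashMap<>();
              for (String line : program) {        // one instruction at a time
                  String[] t = line.split(" ");
                  switch (t[0]) {
                      case "LET":   vars.put(t[1], Integer.parseInt(t[2])); break;
                      case "ADD":   vars.put(t[1], vars.get(t[1]) + Integer.parseInt(t[2])); break;
                      case "PRINT": System.out.println(vars.get(t[1])); break;  // prints 42
                      default: throw new IllegalArgumentException("unknown instruction: " + t[0]);
                  }
              }
          }
      }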

   When microcomputers were developed in the ‘70s, BASIC became widely used. Most microcomputers came with a BASIC interpreter already installed, and since it was easy to use, it was ideal for people who just wanted to play around with computers. BASIC is not good for developing big programs, though. The original version by Kurtz and Kemeny relied almost entirely on the GOTO-statement (which is the same as a jump in assembly) to control the program flow, and offered few other control structures. Control structures are essential parts of every program, and FORTRAN and most other programming languages had them by that time. Programming in BASIC also does not encourage structured programming at all; rather, it encourages programmers to start programming before actually starting to think. All this makes complex BASIC programs difficult to read and almost impossible to maintain efficiently[7].

   The total opposite of BASIC is Pascal, named after Blaise Pascal, the inventor of one of the first calculating machines. Niklaus Wirth in Zurich created Pascal in the early 1970s as an instructional language, meant to help learners develop good programming habits. It is structured and strongly typed[8]. Unlike BASIC, Pascal has hardly any use for the GOTO-statement; instead, it offers a wide variety of control structures for loops and if-statements. Pascal programs are said to be hard to write, and writing them slow. That is true in one sense, but on the other hand, once an outline for a Pascal program is written down, the actual program is easy to write and little time is needed for debugging[9]. Unlike BASIC, Pascal encourages programmers to think first and write later. Its structure also makes reusing code easier than BASIC does, and makes Pascal programs fairly easy to read, maintain and change.

   Both Pascal and BASIC were developed to teach programming, not to produce software, which is why many programmers regard their wide use as a mistake. These programmers usually program in C or C++, the most common programming languages today. Dennis Ritchie at Bell Labs developed C in 1972[10]; it was created for writing UNIX. It uses few keywords and many abbreviations, which make it efficient to write but somewhat cryptic to read. C spread very fast, was soon widely used, and an ANSI standardization effort began in 1983. Even though most of today's C/C++ compilers understand ANSI C, extensions were made. In 1980, Bjarne Stroustrup at Bell Labs added classes to C and named the result C with Classes; in 1983, further extensions such as operator overloading and virtual functions resulted in C++, making C suitable for object-oriented programming[11].
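   To illustrate what ‘virtual functions’ buy a programmer, here is a Java analogue (in Java, instance methods behave like C++ virtual functions by default; the class names are my own):

      // Virtual dispatch: the method that runs is chosen by the object's
      // actual class at run time, not by the declared type of the variable.
      class Shape {
          double area() { return 0.0; }
      }

      class Circle extends Shape {
          double r = 2.0;
          @Override double area() { return Math.PI * r * r; }
      }

      public class Dispatch {
          public static void main(String[] args) {
              Shape s = new Circle();        // declared Shape, actually a Circle
              System.out.println(s.area());  // calls Circle.area(): ~12.566
          }
      }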

   Today's C++ is a very powerful tool, but it suffers from being an extension of C. It has a lot of redundant ways to do things, too many keywords, and differences from system to system. Originally intended for low-level programming, it is also not perfectly suited to writing applications for operating systems such as Windows95. The abbreviations of C/C++ and extensions like operator overloading can be very confusing, and maintaining C++ code is not always as easy as it could be. Nevertheless, C/C++ is the most used programming language today: UNIX is written almost entirely in C, and large parts of OS/2, Windows95 and NT are written in C/C++ as well.

   In 1990, a new idea took shape at Sun Microsystems: to fight back against the variety of operating systems and APIs (= Application Programming Interfaces) and the difficulties this variety causes for programmers. Patrick Naughton suggested creating a single programming toolkit and a single windowing technology for all kinds of appliances, such as computers, remote controls or dishwashers, and building networking capabilities into the language itself to make interaction between these devices easier[12]. The Green team was formed to research this field, and it came up with a simple object-oriented programming language named Oak.

   Oak never became widely used, though, and Sun was never able to sell it. But in 1993 the National Center for Supercomputing Applications (NCSA) released Mosaic, the first widely used graphical web-browser, and the new World Wide Web (WWW) made the Internet exciting[13]. Suddenly there was a place where Oak could be successful. It was renamed Java and changed to fit its new role: a language for producing online multimedia software.

   Java's syntax is a lot like that of C++, to make it easy for experienced programmers to switch to Java. Java is much simpler than C++, though, because it reduces the language to its core features. At the same time, it includes modern features such as garbage collection and multi-threading built into the language rather than added on, as in C++. It also introduces a new concept: it is both compiled and interpreted. First, a compiler translates the source code into what is called Java bytecode, which is more general than real machine code. A highly optimized interpreter (= the Java Virtual Machine) then executes this bytecode in real time. This has several advantages: the bytecode is often much smaller than machine code and can be understood by any computer that has an interpreter. Interpretation also helps security, because the program never gets full control of the computer, which is very important in networked environments[14]. Moreover, it is possible to produce microchips that understand Java bytecode as their native machine language, which would speed up programs by removing the disadvantages of interpretation.
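   The two-step process looks like this in practice (file and class names are my own):

      // HelloJava.java -- javac compiles this file to HelloJava.class,
      // a platform-independent bytecode file; the Java Virtual Machine
      // then runs that bytecode, whatever the computer's native op-codes.
      public class HelloJava {
          public static void main(String[] args) {
              System.out.println("Hello from the JVM!");
          }
      }

   Compiling with ‘javac HelloJava.java’ produces the bytecode file HelloJava.class; ‘java HelloJava’ then runs it, on any computer with a Java Virtual Machine.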

   A special form of Java program, the so-called applet, is designed to run inside a web-browser to add real interactivity to HTML (= HyperText Markup Language, 1992) documents. HTML is not a real programming language; like VRML and XML, it is a markup language, that is, a language for describing something. HTML describes the way webpages are supposed to be displayed by a browser: the designer of a webpage can define background colors and images, different text styles and much more. But HTML does not offer everything needed for advanced webpage design, and that is where Java comes in: in Java, it is possible to do just about anything.
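   A minimal applet looks like this (a sketch; the class name is my own). The browser loads the compiled class and calls its paint() method whenever the page area needs redrawing:

      import java.applet.Applet;
      import java.awt.Graphics;

      // A minimal applet: the browser, not the programmer, decides when
      // paint() is called; the applet just draws into the area it is given.
      public class HelloApplet extends Applet {
          public void paint(Graphics g) {
              g.drawString("Hello from an applet!", 20, 20);
          }
      }

   The applet would be embedded into a webpage with a tag such as <applet code="HelloApplet.class" width="200" height="50"></applet>.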

   Another non-programming language is CGI. CGI stands for Common Gateway Interface and is used, for example, to build search engines on the WWW. CGI only defines the way a web-browser and a web-server talk to each other; it takes a programming language like Perl to actually perform the task the programmer has in mind. Perl is a scripting language that works very well for text operations and is therefore well suited to automated webpage creation.
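   The interface itself is simple: the server hands the program the request (for example, the query string in an environment variable), and whatever the program writes to its standard output goes back to the browser. A sketch, written here in Java rather than Perl (the class name is my own):

      // A minimal CGI program: read the query from the environment,
      // print an HTTP header, a blank line, then the page itself.
      public class CgiEcho {
          public static void main(String[] args) {
              String query = System.getenv("QUERY_STRING"); // e.g. "q=history"
              System.out.println("Content-type: text/html");
              System.out.println();                         // blank line ends the header
              System.out.println("<html><body>You searched for: " + query + "</body></html>");
          }
      }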

   Over time, a change in how programming languages treat programs has become visible. In the beginning, every program that ran had total control of the computer it was running on. When multi-tasking became popular, it became important that one program could not ruin the work of another program running at the same time: the protected mode was born. Nowadays security is a very important issue, as more and more computers are connected to the Internet and diseases like computer viruses, or simply errors in programs, can cause immense costs. Programs are therefore written so that they cannot harm the whole system and destroy the work of weeks. The Java runtime environment is a good example of such a safe place: a program cannot do anything outside its own memory, and all communication between the program and other programs or computers is watched carefully.

   Modern programming languages are much more like human languages than those of 50 years ago. Even though they have a strict grammar, they are more structured and understand more intelligent commands. Maybe in another 50 years we will live in a world in which programmers simply speak to their computers when they "write" a program, using a next generation of word processors that take spoken rather than typed input.


BIBLIOGRAPHY

[10] Academic American Encyclopedia, 9th ed. (1991), s.v. "C".

[12] Bank, David. "The Java Saga." Wired Digital Inc., 1994-1997. Online; accessed 1 March 1998. http://www.wired.com/wired/3.12/features/java.saga.html

[3], [5] Campbell-Kelly, Martin, and William Aspray. Computer: A History of the Information Machine. New York: BasicBooks, 1996.

[11] Easton, Richard J., and Guy J. Hale. Foundations of Programming using C and C++. No publication information, summer 1996.

[1], [2], [4], [6], [7], [9] Lampton, Christopher. Computer Languages. New York: Franklin Watts Library Edition, 1983.

[8] Schildt, Herbert. Advanced Turbo Pascal Programming and Techniques. Berkeley, CA: Osborne McGraw-Hill, 1986.

[13], [14] Youmans, Bryan. "Java: Cornerstone of the Global Network Enterprise?" Online, n.p., spring 1997; accessed 1 March 1998. http://ei.cs.vt.edu/~history/Youmans.Java.html


(c) March 1998 by Timo Baumann, all rights reserved
If you like this research paper and/or if it is useful to you, please let me know.