RETRIEVE

RETRIEVE
Paradigm: Imperative, Declarative
Family: dBASE
Designed by: Arden Scott
Developer: Richard Moore
First appeared: 1971
Implementation language: FORTRAN
Platform: SDS 940
License: Proprietary
Influenced: dBASE, JPLDIS, RECALL, many others

RETRIEVE is a database management system (DBMS) offered on Tymshare's systems starting in August 1971. It was written in Tymshare's own SUPER FORTRAN on the SDS 940. It offered basic single-file, non-relational database functionality using an interactive programming language. It is one of the earliest examples of software as a service (SaaS).[1]

RETRIEVE was highly influential and spawned a number of relatively direct clones. Wang Laboratories' RECALL on the Wang 2200 minicomputer was almost identical to RETRIEVE, to the point that the differences were detailed in a single page. JPL made a version known as JPLDIS for the UNIVAC 1108 in 1973 that was also very similar.

Wayne Ratliff, a contractor at JPL for many years, was inspired by JPLDIS to port it to the IMSAI 8080 to manage his football pool, later releasing it commercially as Vulcan for CP/M in 1979. Ashton-Tate licensed Vulcan and re-released it as dBASE II in 1980, which sparked the microcomputer database market. Most of RETRIEVE's original syntax remains unchanged in dBASE and the many xBASE clones that survive into the 21st century.

History

In 1969, Jim Ryan of Tymshare designed an expanded version of IBM System/360 H-level (256 kB) FORTRAN, adding strings and other features that took advantage of their SDS 940 systems. Ryan hired Richard Moore and Franck Bracher to develop it as SUPER FORTRAN, and it went live on their systems in 1970.[note 1] Soon after SUPER FORTRAN was released, Arden Scott asked Moore to use SUPER FORTRAN to develop his vision of a database management system (DBMS). The first version was ready in a few weeks, and immediately proved very popular with Tymshare customers.[1][note 2]

During the 1970s, Tymshare began moving their systems from SDS to the PDP-10 platform, eventually running TOPS-10. This led to an effort to build an entirely new database engine for this platform, known as MAGNUM. MAGNUM was a complete relational database engine, and in many references it is claimed to be the first such system offered commercially when it went live in October 1975.[3] Although most Tymshare customers, and internal users, switched to MAGNUM, by this time RETRIEVE had been ported to a number of platforms and these versions remained very popular outside the company.

In 1970, the Jet Propulsion Laboratory (JPL) installed three UNIVAC 1108 machines. Fred Thompson had been using RETRIEVE to manage a database of mechanical calculators at JPL, and decided to bring the system in-house when the 1108s arrived. In 1971, he collaborated with JPL programmer Jack Hatfield to produce JPLDIS. Hatfield left JPL in 1974, and the project was assigned to another JPL programmer, Jeb Long, who added a number of features.[4]

In 1973 the Wang 2200 was released, a desktop minicomputer with cassette tape storage.[5] RETRIEVE was ported to this platform under the name RECALL. A report for the US Army detailed the differences in a single page and concluded "Differences between the two implementations are very minor."[6]

While working at JPL as a contractor, Wayne Ratliff entered the office football pool. He had no interest in the game, but felt he could win the pool by processing the post-game statistics found in newspapers. In order to do this, he started looking for a database system and, by chance, came across the documentation for JPLDIS. He used this as the basis for a port to PTDOS on his kit-built IMSAI 8080 microcomputer, and called the resulting system Vulcan (after Mr. Spock on Star Trek). It was later ported to CP/M when that system became almost universal in the S-100 bus market.[7]

In 1980, George Tate saw an advertisement for Vulcan being sold for $49.[8] He arranged a licensing deal with Ratliff, renamed it dBASE II to make it sound like the second version, and put it on the market for $695. The system was a success, and in 1981, IBM commissioned a port to the still-unreleased PC DOS.[9] This was a runaway success, one of the big three packages, along with WordPerfect and Lotus 1-2-3, that made up the equivalent of an office suite in the early DOS market.[10]

Description

Basic operations

RETRIEVE was a non-relational database, that concept not being introduced until 1970. RETRIEVE databases contained a single table stored in a single file, which typically used a short form of the database name as the filename. The table structure was defined when the database was created, allowing fields of character, integer or free-form numeric inputs.[11] A database could have a maximum of 98 fields, with a maximum of 72 characters per field, and the total for any one row was limited to 768 characters, or 256 words. Field names had to begin with a letter and could include additional letters, digits, a period and the @ character, with a maximum of 31 characters in total.[12]
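
The limits above can be sketched as a small validator. This is only an illustration in Python, not RETRIEVE's own code: the names `is_valid_field_name` and `check_schema` are hypothetical, accepting lower-case letters is an assumption, and the three-characters-per-word figure is inferred from the 768-character, 256-word row limit.

```python
import re

# Limits taken from the description above; all names here are illustrative.
MAX_FIELDS = 98
MAX_FIELD_WIDTH = 72
MAX_ROW_CHARS = 768
CHARS_PER_WORD = 3  # inferred: 768 characters / 256 words

# A letter first, then letters, digits, periods or @, at most 31 characters.
# Accepting lower-case letters is an assumption.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.@]{0,30}")


def is_valid_field_name(name):
    """Return True if `name` fits the field-naming rule described above."""
    return FIELD_NAME.fullmatch(name) is not None


def check_schema(fields):
    """Return a list of problems for a {field name: width} schema."""
    problems = []
    if len(fields) > MAX_FIELDS:
        problems.append(f"too many fields: {len(fields)}")
    for name, width in fields.items():
        if not is_valid_field_name(name):
            problems.append(f"bad field name: {name}")
        if width > MAX_FIELD_WIDTH:
            problems.append(f"field too wide: {name}")
    if sum(fields.values()) > MAX_ROW_CHARS:
        problems.append("row too long")
    return problems
```

Calling `check_schema({"EMP.NUM": 6, "NAME": 30, "SALARY": 10})` returns an empty list, since every name and width is within the stated limits.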

The system was interactive, using a command prompt within the system for user interaction.[13] For instance, to start a new database the user would type CREATE at the command prompt, which would then respond by asking the user to type in the name for the database and then prompt for the field definitions. An empty line stopped this process and sent it into data-entry mode, allowing rows to be entered.[14] Any step of this operation could be skipped by providing the data in the original command. For instance, if one typed CREATE EMPLOYEES instead of CREATE, the system no longer asked for the file name.[15]

There were three database file formats, which could be specified during CREATE: the normal character-format SYMBOLIC, BINARY which saved numbers in their 24-bit-based internal formats, and SCRAMBLED which encrypted the file with a user-supplied password.[16]

Existing databases were loaded using LOAD or BASE (they were equivalent).[17] Data was saved as it was changed, but there was also a SAVE command to write out data (or selections of it) to flat files.[18] QUIT exited the system and returned to the underlying operating system.[17]

Retrieving data

Once the database was defined and populated, or an existing one was loaded, data could be displayed using LIST. Without any parameters, it printed out all the records in the order they were entered, with each record's number, its "RECNO", printed at the front of the line. One could also provide a field list to select and reorder the fields in the printout, like LIST EMP.NUM,NAME,SALARY. The PRINT statement worked almost identically, differing only in that it did not print out the RECNO at the start of the lines.[19] FAST was similar to PRINT, but suppressed the field headers as well.[18]

Individual records could be selected using the "record number addressing system", which was prepended to LIST or PRINT. This was a flexible format allowing both single RECNOs separated by commas and ranges written with colons to be specified. For instance, 1,7:20,50 PRINT would print out record 1, everything from 7 to 20, and then 50. It also included the pseudo-record $ to represent the last record, so 100:$ LIST printed out all records from 100 on.[20]
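
The addressing format described above can be sketched as a short parser. This is a Python illustration under stated assumptions, not the original implementation: `parse_selector` is a hypothetical name, ranges are treated as inclusive, and the number of the last record is passed in explicitly so that `$` can be resolved.

```python
def parse_selector(selector, last_recno):
    """Expand a selector such as '1,7:20,50' into a list of record numbers."""

    def resolve(token):
        # '$' stands for the last record in the database.
        token = token.strip()
        return last_recno if token == "$" else int(token)

    recnos = []
    for part in selector.split(","):
        if ":" in part:
            # A colon marks an inclusive range, e.g. 7:20.
            start, end = part.split(":", 1)
            recnos.extend(range(resolve(start), resolve(end) + 1))
        else:
            recnos.append(resolve(part))
    return recnos
```

With a database of 103 records, `parse_selector("100:$", 103)` yields the record numbers 100 through 103.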

Records could also be selected by fields meeting certain criteria based on their contents, using the FOR syntax. To list all the employees with a salary greater than $40,000, one would PRINT FOR SALARY>40000. The same syntax could be used to select single records based on their contents, not the RECNO, for instance LIST FOR NAME="Bob Smith".[21][note 3] RETRIEVE supported all basic comparisons, =, <, >, <=, >= and # for not-equals, which was found in some contemporary BASICs. Unusually, RETRIEVE also included English expansions of traditional comparisons, so one could use either SALARY>40000 or SALARY GREATER THAN 40000. Such expressions could also include basic math, including +, -, * for multiplication, / for division, and an operator for exponentiation.[22] These could be further combined with boolean expressions using AND, OR and NOT.[21]
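
A rough Python sketch of FOR-style selection follows. It models only a single comparison, not the full expression grammar; `select_for`, `COMPARISONS` and the sample `EMPLOYEES` rows are hypothetical, and GREATER THAN is the only English form included because it is the only one quoted above.

```python
import operator

# RETRIEVE comparison symbols mapped to Python equivalents.
COMPARISONS = {
    "=": operator.eq,
    "#": operator.ne,  # RETRIEVE's not-equals
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
    "GREATER THAN": operator.gt,  # English form of >
}

# Made-up sample data for illustration only.
EMPLOYEES = [
    {"NAME": "Bob Smith", "SALARY": 45000},
    {"NAME": "Ann Jones", "SALARY": 38000},
    {"NAME": "Cy Young", "SALARY": 40000},
]


def select_for(records, field, comparison, value):
    """Return (RECNO, row) pairs whose field satisfies the comparison."""
    compare = COMPARISONS[comparison]
    return [
        (recno, row)
        for recno, row in enumerate(records, start=1)  # RECNOs start at 1
        if compare(row[field], value)
    ]
```

Here `select_for(EMPLOYEES, "SALARY", ">", 40000)` and `select_for(EMPLOYEES, "SALARY", "GREATER THAN", 40000)` select the same single record, mirroring the two spellings of the comparison.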

Additionally, the SUM, COUNT and AVERAGE statements worked similarly to LIST or PRINT, including the same record selection concepts. As the names imply, each output a single value computed over the selected records. For instance, COUNT FOR NAME='Bob Smith' would likely return 1, while AVERAGE SALARY FOR SALARY>40000 might return 42500.[23]
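
The aggregate statements can be approximated in Python as below. This is a sketch only: the function names and the sample rows are invented for illustration, and the FOR clause is modelled as an optional predicate function.

```python
def count(records, where=lambda row: True):
    """COUNT: number of records matching the predicate."""
    return sum(1 for row in records if where(row))


def total(records, field, where=lambda row: True):
    """SUM: total of one field over the matching records."""
    return sum(row[field] for row in records if where(row))


def average(records, field, where=lambda row: True):
    """AVERAGE: mean of one field, or None if nothing matches."""
    values = [row[field] for row in records if where(row)]
    return sum(values) / len(values) if values else None


# Made-up sample data for illustration only.
STAFF = [
    {"NAME": "Bob Smith", "SALARY": 44000},
    {"NAME": "Ann Jones", "SALARY": 38000},
    {"NAME": "Cy Young", "SALARY": 41000},
]

bobs = count(STAFF, lambda row: row["NAME"] == "Bob Smith")
high_average = average(STAFF, "SALARY", lambda row: row["SALARY"] > 40000)
```

With this sample data, `bobs` is 1 and `high_average` is 42500.0, matching the figures used in the examples above.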

In addition to LIST, PRINT and FAST, a further output statement was REPORT. This worked in a similar fashion, but had a number of options to pretty-print the output. It could be invoked alone or with qualifiers as above, but when used it entered an interactive mode that asked a number of questions about where to send the output (with T being the terminal), whether it should be single or double spaced, include headers and totals, etc.[24]

Modifying data

Records could be deleted using the DELETE statement, using the same record selectors or field expressions as above. New records were inserted using APPEND. APPEND FIELDS entered an interactive mode allowing the user to type in additional records field-by-field rather than entering comma-separated values a row at a time.[25] APPEND FROM filename read data from a comma-delimited text file into the current database already in memory.[26] MERGE was used to update existing records; it worked similarly to APPEND, loading into the current database from another file, but in this case included an additional qualifier ON. For instance, MERGE ON NAME FROM ADDRESSES would read data from the file ADDRESSES and look in that file for a column where the first entry was "NAME". It would then process the file row-by-row, looking for entries in the database with that NAME and then updating the rest of the fields with the data from that row in ADDRESSES.[27]
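
The MERGE behaviour described above resembles a keyed update, which can be sketched in Python as follows. The name `merge_on` is hypothetical, the incoming file is represented as a list of dictionaries rather than parsed from disk, and ignoring incoming rows that match nothing is an assumption, since the text does not say what RETRIEVE did with them.

```python
def merge_on(database, key, incoming):
    """Update rows in `database` whose `key` value appears in `incoming`.

    Returns the number of database rows that were updated.
    """
    # Index the incoming rows by key; a later duplicate overrides an earlier one.
    updates = {row[key]: row for row in incoming}
    changed = 0
    for row in database:
        match = updates.get(row[key])
        if match is not None:
            row.update(match)  # overwrite the remaining fields
            changed += 1
    return changed


# Made-up sample data for illustration only.
people = [
    {"NAME": "Bob Smith", "CITY": "Reno"},
    {"NAME": "Ann Jones", "CITY": "Boise"},
]
addresses = [
    {"NAME": "Bob Smith", "CITY": "Tulsa"},
    {"NAME": "Zed Doe", "CITY": "Provo"},  # no match; ignored (assumption)
]
updated = merge_on(people, "NAME", addresses)
```

After the call, only the Bob Smith row has changed and no new rows have been added, which is the difference from APPEND.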

RETRIEVE supported two methods to update existing records, CHANGE and REPLACE. REPLACE worked similarly to the modern SQL equivalent, UPDATE, taking a selector expression of some sort, one or more fields, and the new values. For instance, one might REPLACE SALARY=SALARY*1.05 WHERE YEARS.EMP>5.[28] While CHANGE ultimately did the same thing, it did so using an interactive mode. One invoked CHANGE without the new value, for instance, CHANGE FOR 'Bob Smith' IN NAME. The system printed out that row and then allowed the user to edit the record.[29] If the selection matched more than one unique value, for instance, CHANGE FOR YEARS.EMP>5, each value was printed on a separate line and changes were sent to all matching records.[30] MODIFY was essentially identical to CHANGE but did not print out the existing values first.[31]

Sorting was not performed at retrieval time, but was implemented by modifying the database and re-writing its contents in sorted order. This was accomplished with SORT BY, followed by up to twenty field names. The original unsorted data was written to a backup file.[32]
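
A minimal Python sketch of this sort-in-place behaviour is shown below. It is an illustration only: `sort_by` is a hypothetical name, ascending order is assumed, and the backup file is stood in for by a returned list that preserves the original record order.

```python
from operator import itemgetter

MAX_SORT_FIELDS = 20  # limit stated in the text


def sort_by(records, *fields):
    """Sort `records` in place by the given fields; return the old ordering."""
    if not 1 <= len(fields) <= MAX_SORT_FIELDS:
        raise ValueError("SORT BY takes between 1 and 20 field names")
    backup = list(records)  # stand-in for the backup file (order only)
    records.sort(key=itemgetter(*fields))
    return backup


# Made-up sample data for illustration only.
roster = [
    {"NAME": "Cy Young", "DEPT": "B"},
    {"NAME": "Ann Jones", "DEPT": "B"},
    {"NAME": "Bob Smith", "DEPT": "A"},
]
unsorted = sort_by(roster, "DEPT", "NAME")
```

Because the list itself is rewritten, later listings see the new order without any sorting at retrieval time, while `unsorted` still holds the rows in their original sequence.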

The RESULTS TO modifier could be used with any of the data modification statements to redirect the results to a new database. For instance, APPEND FROM FEBSALES RESULTS TO CURSALES would append the data in the file FEBSALES to the current database and then save the results to CURSALES without updating the already-opened database. Or one might SORT BY NAME RESULTS TO SORTEMP.[33]

Other commands

Utility commands included STRUCTURE, which printed out the database schema, and SIZE, which returned the number of records.[34]

Programming RETRIEVE

Although RETRIEVE was often used interactively, the system also included the ability to save lists of commands to files and then play them back.[35] Command files could also include a number of other "helper" statements, including TYPE 'a string' to output any string, HUSH to suppress the command prompt (a period), TALK to turn the prompt back on, and ECHO ON and ECHO OFF to control whether the playback appeared on the terminal.[36]

These command files were run using the DO filename statement from the internal command line[37] or COMMAND filename from outside RETRIEVE, in EXECUTIVE.[38] If the script intended to leave the user in RETRIEVE at the end, one could put a COMMAND T at the end, "running" the Terminal, which specified what should happen next. Scripts could be strung together with COMMAND to form more complex workflows.[35]

When run, the commands inside the files operated just as they would if the user typed them in. This meant that if a statement was provided that would normally require additional user input, for instance, a CHANGE with no parameters, the interactive mode would be invoked as normal. This allowed the command files to invoke user-based input and then perform additional instructions. One could, for instance, use REPLACE ALL WEEK.SAL=SALARY/52 to catch any recent changes in salary, call CHANGE HOURS to have the system present each employee record and ask for their weekly hours, then REPLACE ALL PAY=HOURS*WEEK.SAL to calculate the weekly paycheck for all employees, and finally REPORT to send it all to a printer.[39]
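
The payroll script described above can be mirrored step by step in Python. This is a sketch, not a translation of any real command file: `run_payroll` is a hypothetical name, the interactive CHANGE HOURS step is modelled as a callback that supplies the hours for each row, and the final REPORT step is left out.

```python
def run_payroll(employees, ask_hours):
    """Apply the three data steps of the payroll script to each row."""
    # REPLACE ALL WEEK.SAL=SALARY/52
    for row in employees:
        row["WEEK.SAL"] = row["SALARY"] / 52

    # CHANGE HOURS: the interactive step, modelled here as a callback
    for row in employees:
        row["HOURS"] = ask_hours(row)

    # REPLACE ALL PAY=HOURS*WEEK.SAL (formula as given in the text)
    for row in employees:
        row["PAY"] = row["HOURS"] * row["WEEK.SAL"]

    return employees


# Made-up sample data; the lambda stands in for a user typing the hours.
payroll = run_payroll(
    [{"NAME": "Bob Smith", "SALARY": 52000}],
    ask_hours=lambda row: 2,
)
```

Keeping the user-input step between the two bulk updates reflects the point of the passage: a command file could pause for interaction and then continue with further statements.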

Comparison with dBASE

Although separated by almost a decade, and having moved across four platforms in the process, dBASE on DOS remained very similar to RETRIEVE. Ratliff stated there was a "sort of progression from Retrieve to JPLDIS to my program, which I called Vulcan."[40] John Walker, better known for AutoCAD, also used JPLDIS and stated flatly that "DBase II, the Ashton-Tate database system, was a copy, a reimplementation of a package developed at the Jet Propulsion Laboratory called JPLDIS."[41]

LOAD/BASE became USE and periods in field names were replaced by colons, but most other commands and features remained unchanged other than to support differences in the underlying platforms, like numeric formats. For instance, the original dBASE User Manual uses this example:[42]

use people
list

This is nearly identical to the equivalent instructions in RETRIEVE:

LOAD people
LIST

The overall operation of the statements is largely identical between the two systems. dBASE's primary differences are related to the programmability; dBASE added variables, could LIST columns made of formulas like LIST SALARY*1.05, and added a much wider variety of functions to manipulate data, including functions for returning the length of a string or the data type of a given field.[43]

Notes

  1. It is not clear exactly when SUPER FORTRAN went live, but the manual is dated April 1970 so it is likely some time that year.[2]
  2. Gregory gives what appear to be conflicting dates for the release; if it was a "few weeks" after SUPER FORTRAN that would suggest some time in 1970, but the manual is from 1971 and he uses that date within the text.[1]
  3. Strings could use single or double quotes.[22]

References

Citations

  1. Gregory 2018, p. 136.
  2. Gregory 2018, p. 127.
  3. Gregory 2018, p. 140.
  4. "How it Started - JPLDIS: How Came The Idea". http://www.foxprohistory.org/jeblong_jpldis.htm. 
  5. "Wang 2200 Computer: Everything You Need to Know". 2021-01-04. https://history-computer.com/wang-2200-guide/.
  6. Bloom 1976, p. 1.
  7. Lammers 1986, p. 111.
  8. Lammers 1986, p. 115.
  9. Lammers 1986, p. 117.
  10. "ASHTON-TATE: Confronting a Hard Life in the World of Software". The Los Angeles Times. May 10, 1987. http://articles.latimes.com/1987-05-10/business/fi-6728_1_database-software. 
  11. Manual 1971, p. 5.
  12. Manual 1971, p. 6.
  13. Wiederhold 2001, p. 416.
  14. Manual 1971, p. 9.
  15. Manual 1971, p. 31.
  16. Manual 1971, pp. 28-29.
  17. Manual 1971, p. 25.
  18. Manual 1971, p. 47.
  19. Manual 1971, p. 14.
  20. Manual 1971, p. 15.
  21. Manual 1971, p. 19.
  22. Manual 1971, p. 18.
  23. Manual 1971, pp. 54-56.
  24. Manual 1971, pp. 67-68.
  25. Manual 1971, p. 44.
  26. Manual 1971, p. 35.
  27. Manual 1971, p. 46.
  28. Manual 1971, p. 23.
  29. Manual 1971, p. 20.
  30. Manual 1971, p. 21.
  31. Manual 1971, p. 57.
  32. Manual 1971, p. 59.
  33. Manual 1971, p. 61.
  34. Manual 1971, p. 48.
  35. Manual 1971, p. 79.
  36. Manual 1971, p. 84.
  37. Manual 1971, p. 80.
  38. Manual 1971, p. 83.
  39. Manual 1971, p. 89.
  40. Lammers 1986, p. 114.
  41. Walmsley, Kean (19 September 2008). "An interview with John Walker - Part 2". https://through-the-interface.typepad.com/through_the_interface/2008/09/an-interview--1.html. 
  42. dBASE 1983, p. 3.10.
  43. dBASE 1983, p. Appendix C.

Bibliography