1. Introduction to Linux
What's Unix?
Unix, the original ancestor of Linux, is an operating system. Or at least it was an operating system; the original system known as Unix proper is not the "Unix" we know and use today; there are now many "flavors" of Unix, of which Linux has become the most popular.
A product of the 1960s, Unix and its related software were invented by Dennis Ritchie, Ken Thompson, Brian Kernighan, and other hackers at Bell Labs in 1969; its name was a play on "Multics," another operating system of the time.
In the early days of Unix, any interested party who had the hardware to run it on could get a tape of the software from Bell Labs, with printed manuals, for a very nominal charge. (This was before the era of personal computing, and in practice, mostly only universities and research laboratories did this). Local sites played with the software's source code, extending and customizing the system to their needs and liking.
Beginning in the late 1970s, computer scientists at the University of California, Berkeley, a licensee of the Unix source code, made their own improvements and enhancements to the Unix source during the course of their research, which included the development of TCP/IP networking. Their work became known as the BSD ("Berkeley Software Distribution") flavor of Unix.
The source code of their work was made publicly available under licensing that permitted redistribution, with source or without, provided that Berkeley was credited for their portions of the code. There are many modern variants of the original BSD still actively developed today, and some of them--such as NetBSD and OpenBSD--can run on personal computers.
NOTE: The uppercase word `UNIX' became a trademark of AT&T (since transferred to other organizations), to mean their particular operating system. But today, when people say "Unix," they usually mean "a Unix-like operating system," a generalization that includes Linux.
If you'd like further information on this topic, you might be interested in consulting A Quarter Century of UNIX by Peter H. Salus (Addison-Wesley 1994), which has become the standard text on the subject.
What's Free Software?
Over the years, Unix's popularity grew. After the divestiture of AT&T, the tapes of the source code that Bell Labs provided became a proprietary, commercial product: AT&T UNIX. But it was expensive, and didn't come with the source code that made it tick. Even if you paid extra for a copy of the sources, you couldn't share with your programmer colleagues any improvements or discoveries you made.
By the early 1980s, proprietary software development by for-profit corporations was quickly becoming the norm--even at universities. More software was being distributed without source code than ever before.
In 1984, while at the Massachusetts Institute of Technology in Cambridge, Massachusetts, hacker Richard Stallman saw his colleagues gradually accept and move to this proprietary development model. He did not accept the kind of world such proprietism would offer: no sharing your findings with your fellow man, no freedom for anyone to take a look "under the hood" of a published work to see how it worked so that one could understand it or build upon it; it would mean no freedom to improve your copy of such works, or do what you please with your copy--including share it with others.
So instead of giving in to the world of non-free computing, Stallman decided to start a project to build and assemble a new Unix-like operating system from scratch, and make its source code free for anyone to copy and modify. This was the GNU Project ("GNU's Not Unix").
The GNU Project's software would be licensed in such a way that everyone was given the freedom to copy, distribute, and modify their copy of the software; as a result, this kind of software became known as free software.
Individuals and businesses may charge for free software, but anyone is free to share copies with their neighbors, change it, or look at its source code to see how it works. There are no secrets in free software; it's software that gives all of its users the freedom they deserve.
Proprietary software strictly limits these freedoms--in accordance with copyright law, which was formulated in an age when works were normally set and manipulated in physical form, not as the non-physical data that computers copy and modify.
Free software licensing was developed as a way to work around the failings of copyright law, by permitting anyone to copy and modify a work, though under certain strict terms and conditions. The GNU Project's GNU General Public License, or GNU GPL, is the most widely used of all free software licenses. Popularly called a "copyleft," it permits anyone to copy or modify any software released under its terms--provided all derivatives or modifications are released under the same terms, and all changes are documented.
What's Open Source?
The term open source was first introduced by some free software hackers in 1998 to be a marketing term for "free software." They felt that some people unfamiliar with the free software movement--namely, large corporations, who'd suddenly taken an interest in the more than ten years' worth of work that had been put into it--might be scared by the word "free." They were concerned that decision-makers in these corporations might confuse free software with things like freeware, which is software provided free of charge, and in executable form only. (Free software means nothing of the sort, of course; the "free" in "free software" has always referred to freedom, not price.)
The Open Source Initiative (OSI) was founded to promote software that conforms with their public "Open Source Definition," which was derived from the "Debian Free Software Guidelines" (DFSG), originally written by Bruce Perens as a set of software inclusion guidelines for Debian. All free software--including software released under the terms of the GNU General Public License--conforms with this definition.
But some free software advocates and organizations, including the GNU Project, do not endorse the term "open source" at all, believing that it obscures the importance of "freedom" in this movement.
Whether you call it free software, open source software, or something else, there is one fundamental difference between this kind of software and proprietary, non-free software: free software always ensures that everyone is granted certain fundamental freedoms with respect to that software.
What's Linux?
In the early 1990s, Finnish computer science student Linus Torvalds began hacking on Minix, a small, Unix-like operating system for personal computers then used in college operating systems courses. He decided to improve the main software component underlying Minix, called the kernel, by writing his own. (The kernel is the central component of any Unix-like operating system.)
In late 1991, Torvalds published the first version of this kernel on the Internet, calling it "Linux" (a play on both Minix and his own name).
When Torvalds published Linux, he used the copyleft software license published by the GNU Project, the GNU General Public License. Doing so made his software free for anyone to use, copy, and modify--provided any copies or variations were kept equally free. Torvalds also invited contributions from other programmers, and those contributions came: slowly at first, but as the Internet grew, thousands of hackers and programmers from around the globe contributed to his free software project. Linux was so greatly extended and improved that the Linux-based system of today is a complete, modern operating system, which can be used by programmers and non-programmers alike; hence this book.
What's Debian?
It takes more than individual software programs to make something that we can use on our computers--someone has to put it all together. It takes time to assemble the pieces into a cohesive, usable collection, and test it all, and then keep up to date with the new developments of each piece of software (a small change in any one of which may introduce a new software dependency problem or conflict with the rest). A Linux distribution is such an assemblage. You can do it yourself, of course, and "roll your own" distribution--since it's all free software, anyone can add to it or remove from it and call the resulting concoction their own. Most people, however, choose to leave the distribution business to the experts.
For the purposes of this book, I will assume that you are using the Debian GNU/Linux distribution, which, of all the major distributions, is the only one designed and assembled in the same manner that the Linux kernel and most other free software is written--by individuals.
And when I say "Linux" anywhere in this book (including in the title), unless noted, I am not referring to the bare kernel itself, but to the entire working free software system as a whole. Some people call this "GNU/Linux."
There are many other distributions, and some of them are quite acceptable--many users swear by Red Hat Linux, for example, which is certainly popular, and reportedly easy to install. The SuSE distribution is very well-received in Europe. So when people speak of Debian, Red Hat, SuSE, and the like in terms of Linux, they're talking about the specific distribution of Linux and related software, as assembled and repackaged by these companies or organizations. The core of the distributions are the same--they're all the Linux kernel, the GNU Project software, and various other free software--but each distribution has its own packaging schemes, defaults, and configuration methods. It is by no means wrong to install and use any of these other distributions, and every recipe in this book should work with all of them (with the exception of variations that are specific to Debian systems, and are labelled as such in the text).
In Debian's early days, it was referred to as the "hacker's distro," because it could be very difficult for a newbie to install and manage. However, that has changed--any Linux newbie can install and use today's Debian painlessly.
NOTE: I recommend Debian because it is non-corporate, openly developed, robust (the standard Debian CD-ROM set comes with more than 2,500 different software packages!), and it is entirely committed to free software by design (yes, there are distributions which are not).
Unix and the Tools Philosophy
To understand the way tasks are performed on Linux, some discussion on the philosophy behind the software that Linux is built upon is in order. A dip in these inviting waters will help clarify the rôle of this book as "cookbook."
The fact that the Unix operating system has survived for more than thirty years should tell us something about the soundness of its design considerations. One of these considerations--perhaps its most endearing--is the "tools" philosophy.
Most operating systems are designed with a concept of files, come with a set of utility programs for handling these files, and then leave it to the large applications to do the interesting work: a word processor, a spreadsheet, a presentation designer, a Web browser. (When a few of these applications recognize each other's file formats, or share a common interface, the group of applications is called a "suite.")
Each of these monolithic applications presumably has an "open file" command to read a file from disk and open it in the application; most of them, too, come with commands for searching and replacing text, checking spelling, printing the current document, and so on. The program source code for handling all of these tasks must be accounted for separately, inside each application--taking up extra space both in memory and on disk. This is the anti-Unix approach.
And in the case of proprietary software, all of the actual program source code is kept from the public--so other programmers can't use, build on, or learn from any of it. This kind of closed-source software is presented to the world as a kind of magic trick: if you buy a copy of the program, you may use it, but you can never learn how the program actually works.
The result of this is that the code to handle essentially the same function inside all of these different applications must be developed by programmers from scratch, separately and independently, each time--so the progress of society as a whole is set back by the countless hours of time and energy programmers waste reinventing the same software functions to perform the same tasks, over and over again.
Unix-like operating systems don't put so much weight on application programs. Instead, they come with many small programs called tools. Each tool is generally capable of performing a very simple, specific task, and performing it well--one tool does nothing but output the file(s) or data passed to it, one tool spools its input to the print queue, one tool sorts the lines of its input, and so on.
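To make this concrete, here is a small sketch (not taken from a specific example in the text) using three such tools--`cat', which outputs the contents of the files passed to it; `lpr', which spools its input to the print queue; and `sort', which sorts the lines of its input. It assumes a hypothetical text file called `notes' in the current directory and, for the `lpr' line, a configured printer:

$ cat notes RET
$ sort notes RET
$ lpr notes RET

Each command does its one small job and nothing more: the first writes the contents of `notes' to the screen, the second writes those same lines in sorted order, and the third sends the file to the printer.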
An important early development in Unix was the invention of "pipes," a way to pass the output of one tool to the input of another. By knowing what the individual tools do and how they are combined, a user could now build powerful "strings" of commands.
Just as the strength of an alloy can be greater than the combined strength of the individual metals that compose it, multiple tools could then be combined to perform a task unpredicted by the function of the individual tools. This is the concept of synergy, and it forms the basis of the Unix tools philosophy.
Here's an example, using two tools. The first tool, called `who', outputs a list of users currently logged on to the system. The second tool is called `wc', which stands for "word count"; it outputs a count of the number of words (or lines or characters) of the input you give it.

By combining these two tools, giving the `wc' command the output of `who', you can build a new command to list the number of users currently on the system:

$ who | wc -l RET
4
$

The output of `who' is piped--via a "pipeline," specified by the vertical bar (`|') character--to the input of `wc', which through use of the `-l' option outputs the number of lines of its input.

In this example, the number 4 is shown, indicating that four users are currently logged on the system. (Incidentally, piping the output of `who' to `wc' in this fashion is a classic tools example, and was called "the most quoted pipe in the world" by Andrew Walker in The UNIX Environment, a book that was published in 1984.)
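The same technique applies with other tools on either side of the pipe; as another small sketch (not part of the original example), piping the output of `who' to the `sort' tool lists the users currently logged on in alphabetical order:

$ who | sort RET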
Another famous pipeline from the days before spell-check tools goes something like this:
$ tr -cs A-Za-z '\012' | tr A-Z a-z | sort -u | comm -23 - /usr/dict/words RET
This command (typed all on one line) uses the `tr', `sort', and `comm' tools to make a spelling checker. After you type this command, the lines of text you type (until you end the input) are converted to a single-column list of lowercase words by the two calls to `tr', sorted in alphabetical order with all duplicates removed, and the resulting list is then compared with `/usr/dict/words', the system "dictionary"--a list of properly spelled words kept in alphabetical order. Any words not found in the dictionary are output as possible misspellings.
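You don't have to type the text in by hand, either. Assuming a hypothetical file called `essay' exists in the current directory, you could check its spelling with the same pipeline by redirecting the file to the pipeline's input; any words in the file that do not appear in the system dictionary are then written to the screen:

$ tr -cs A-Za-z '\012' < essay | tr A-Z a-z | sort -u | comm -23 - /usr/dict/words RET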
Collective sets of tools designed around a certain kind of field or concept were called "workbenches" on older Unix systems; for example, the tools for checking the spelling, writing style, and grammar of their text input were part of the "Writer's Workbench" package.
Today the GNU Project publishes collections of tools under certain general themes, such as the "GNU text utilities" and "GNU file utilities," but the idea of "workbenches" is generally not part of the idiom of today's Unix-based systems. Needless to say, we still use all kinds of tools for all kinds of purposes; the great bulk of this book details various combinations of tools to obtain the desired results for various common tasks.
You'll find that there's usually one tool or command sequence that works perfectly for a given task, but sometimes a satisfactory or even identical result can be had by different combinations of different tools--especially at the hands of a Unix expert. (Traditionally, such an expert was called a wizard.)
Some tasks require more than one tool or command sequence. And yes, there are tasks that require more than what these simple craft or hand tools can provide. Some tasks need more industrial production techniques, which are currently provided for by the application programs. So we still haven't avoided applications entirely; at the turn of the millennium, Linux-based systems still have them, from editors to browsers. But our applications use open file formats, and we can use all of our tools on these data files.
The invention of new tools has been on the rise along with the increased popularity of Linux-based systems. At the time of this writing, there were a total of 1,190 tools in the two primary tool directories (`/bin' and `/usr/bin') on my Linux system. These tools, combined with necessary applications, make free, open source software--for perhaps the first time in its history--a complete, robust system for general use.
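A count like that can itself be produced with the kinds of tools described in this chapter. The following pipeline is only a sketch--the exact number will differ from system to system--but it merges the listings of the two directories, removes any duplicate names with `sort', and counts the remaining lines with `wc':

$ { ls /bin; ls /usr/bin; } | sort -u | wc -l RET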
What to Try First
The first four chapters of this book contain all of the introductory matter you need to begin working with Linux. These are the basics.
Beginning Linux users should start with the concepts described in these first chapters. Once you've learned how to power up the system and log in, you should look over the chapter on the shell, so that you are familiar with typing at the command prompt, and then read the chapter on the graphical windows interface called the X Window System, so that you can start X and run programs from there if you like.
If you are a Linux beginner and are anxious to get up to speed, you might want to skip ahead and read the chapter on files and directories next, to get a sense of what the system looks like and how to maneuver through it. Then, go on to learning how to view text, and how to edit it in an editor (respectively described in the chapters on viewing text and text editing). After this, explore the rest of the book as your needs and interests dictate.