Translation of Advanced Programming in the UNIX Environment – Part 1: Overview of the UNIX System

Tram Ho

Overview of the UNIX system


All operating systems provide services for the programs they run. Typical services include executing a new program, opening a file, reading a file, allocating a region of memory, and getting the current time. The goal of the book is to describe the services provided by various versions of the UNIX operating system.

A strictly step-by-step description of the UNIX system is not attempted here; it is almost impossible (and would be very dull). This chapter is a quick tour of the UNIX system from a programmer's perspective. We briefly describe a few features and go into the details in the following chapters. For developers new to UNIX, the chapter serves as an introduction to and overview of the services the system provides.

UNIX architecture

In a strict sense, an operating system can be defined as the software that controls the computer's hardware resources and provides an environment in which programs can run. We generally call this software the kernel, since it is relatively small and resides at the core of the environment. Figure 1.1 shows the UNIX architecture.

The interface to the kernel is a layer of software called the system calls. Libraries of common functions are built on top of the system call interface, but applications are free to use both. (We will say more about system calls and library functions later.) The shell is a special application that provides an interface for running other applications.

In a broader sense, an operating system consists of the kernel and all the other software that makes a computer useful and gives the computer its personality. This other software includes system utilities, applications, shells, libraries of common functions, and so on.

For example, Linux is the kernel used by the GNU operating system. Some people refer to the combination as the GNU/Linux operating system, but it is more commonly called simply Linux. Although this usage is not strictly correct, it is easy to understand.

Log in

user name

When we log in to a UNIX system, we enter our user name, followed by our password. The system then looks up the user name in its password file, usually /etc/passwd. If we look at our entry in the password file, we see that it consists of seven colon-separated fields: the user name, encrypted password, numeric user ID (205), numeric group ID (105), a comment field (Stephen Rago), home directory (/home/sar), and shell program (/bin/ksh).

sar:x:205:105:Stephen Rago:/home/sar:/bin/ksh

All contemporary systems have moved the encrypted password to a different file. In Chapter 6, we will look at these files and a few functions to access them.


Once we log in, some system information messages are typically displayed, and then we can type commands to the shell program. (Some systems start a window manager when you log in, but you generally end up with a shell running in one of the windows.) A shell is a command-line interpreter that reads user input and executes commands. The user input to a shell is normally from the terminal (an interactive shell) or sometimes from a file (called a shell script). The common shells are summarized in Table 1.2.

The system knows which shell to execute for us based on the last field in the password file.

The Bourne shell, developed by Steve Bourne at Bell Labs, is provided with almost every UNIX system. Its control-flow constructs are reminiscent of Algol 68.

The C shell, developed by Bill Joy at Berkeley, is provided with all BSD releases. In addition, the C shell was provided by AT&T with System V/386 Release 3.2 and was also included in System V Release 4 (SVR4). (We will say more about the differences between UNIX systems in the next chapter.) The C shell was built on the 6th Edition shell, not the Bourne shell. Its control flow looks more like the C language, and it supports features that the Bourne shell did not provide: job control, a history mechanism, and command-line editing.

The Korn shell is considered a successor to the Bourne shell and was first provided with SVR4. The Korn shell, developed by David Korn at Bell Labs, runs on most UNIX systems, but before SVR4 it was usually an extra-cost add-on, so it is not as widely used as the other two shells. It is upward compatible with the Bourne shell and includes the features that made the C shell popular: job control, command-line editing, and so on.

The Bourne-again shell (bash) is the GNU shell provided with all Linux systems. It was designed to be POSIX conformant while remaining compatible with the Bourne shell, and it also supports features from both the C shell and the Korn shell.

The TENEX C shell is an enhanced version of the C shell. It borrows several features, such as command completion, from the TENEX operating system (developed in 1972 at Bolt Beranek and Newman). The TENEX C shell adds many features to the C shell and is often used as a replacement for it.

The shell was standardized in POSIX 1003.2. The standard is based on features from the Korn shell and the Bourne shell.

The default shell differs among Linux distributions. Some distributions use the Bourne-again shell. Others use a replacement for the Bourne shell called dash (the Debian Almquist shell, originally written by Kenneth Almquist). The default user shell on FreeBSD is derived from the Almquist shell. The default shell on Mac OS X is the Bourne-again shell. Solaris provides all the shells listed in Table 1.2. All of these shells are available on the Internet.

Throughout the book, we show interactive shell examples to run the programs we develop. The examples use features common to the Bourne shell, the Korn shell, and the Bourne-again shell.

Files and directories

File system

The UNIX file system is a hierarchical tree of directories and files. Everything starts in the directory called root, whose name is the single character /.

A directory is a file that contains directory entries. Logically, we can think of each directory entry as containing a filename along with a structure of information describing the attributes of the file. The attributes of a file include the type of file (regular file, directory), the size of the file, the owner of the file, the permissions for the file, and the time the file was last modified. The stat and fstat functions return a structure of information containing all the attributes of a file. In Chapter 4, we will examine all the attributes of a file in detail.

We make a distinction between the logical view of a directory entry and the way it is actually stored on disk. Most UNIX file system implementations do not store attributes in the directory entries themselves, because it is difficult to keep them in sync when a file has multiple hard links. This will become clearer when we discuss hard links in Chapter 4.


The names in a directory are called filenames. The only two characters that cannot appear in a filename are the slash (/) and the null character. The slash separates the filenames that form a pathname (described below). Nevertheless, it is good practice to restrict the characters in a filename to a subset of the normal printing characters (if we use characters that are special to the shell, it can cause problems). For portability, POSIX.1 recommends restricting filenames to letters (a-z, A-Z), digits (0-9), period (.), dash (-), and underscore (_).

Two filenames are automatically created whenever a new directory is created: . (called dot) and .. (called dot-dot). Dot refers to the current directory, and dot-dot refers to the parent directory. In the root directory, dot-dot is the same as dot.

The Research UNIX System and some older versions of UNIX System V restricted a filename to 14 characters. BSD versions extended this limit to 255 characters. Today, almost all commercial UNIX file systems support filenames of at least 255 characters.


A sequence of one or more filenames, separated by slashes and optionally starting with a slash, forms a pathname. A pathname that begins with a slash is called an absolute pathname; otherwise, it is called a relative pathname. Relative pathnames refer to files relative to the current directory. The name for the root of the file system (/) is a special-case absolute pathname that has no filename component.

For example

Listing the names of all the files in a directory is not difficult. Figure 3 shows a bare-bones implementation of the ls(1) command.

The notation ls(1) is the normal way to reference a particular entry in the UNIX system manuals. It refers to the entry for ls in Section 1. The sections are normally numbered 1 through 8, and the entries within each section are arranged alphabetically.

Historically, UNIX systems bundled all eight sections together into what was called the UNIX Programmer's Manual. As the page count grew, the sections were split among separate manuals: one for users, one for programmers, and one for system administrators.

Today, most manuals are distributed in electronic form. The manual page for ls can be viewed with a command such as man 1 ls (or man -s1 ls on some systems).

The program in Figure 3 prints the names of all the files in a directory. If the source file is named myls.c, we compile it into the default a.out executable file by running cc myls.c.

Historically, cc(1) is the C compiler. On systems with the GNU C compilation system, the C compiler is gcc(1). On such systems, cc is usually linked to gcc.

Some examples of output are:

Note that the directory entries are not stored in alphabetical order; the ls command sorts the names before printing them.

There are many details to consider in these 20 lines of code.

First, we include a header of our own: apue.h. We use this header in almost every program in the book. It includes some standard system headers and defines numerous constants and function prototypes that we use throughout the examples. A listing of this header is in Appendix B.

Next, we include a system header, dirent.h, to pick up the function prototypes for opendir and readdir, along with the definition of the dirent structure. On some systems, these definitions are split across multiple header files. For example, on Ubuntu 12.04, /usr/include/dirent.h declares the function prototypes and includes bits/dirent.h, which defines the dirent structure (and is actually stored in /usr/include/x86_64-linux-gnu/bits).

The declaration of the main function uses the style supported by the ISO C standard. (We will say more about the ISO C standard in the next chapter.)

We take an argument from the command line, argv[1], as the name of the directory to list. In Chapter 7, we will look at how the main function is called and how the command-line arguments and environment variables are made accessible to the program.

Because the actual format of directory entries varies from one UNIX system to another, we use the functions opendir, readdir, and closedir to work with the directory.

The opendir function returns a pointer to a DIR structure, and we pass this pointer to the readdir function. We do not need to care what is in the DIR structure. We then call readdir in a loop to read each directory entry. The readdir function returns a pointer to a dirent structure or, when it has finished with the directory, a null pointer. All we examine in the dirent structure is the name of each directory entry (d_name). Using this name, we could then call the stat function (Section 4.2) to determine all the attributes of the file.

We call two functions of our own to handle errors: err_sys and err_quit. The err_sys function prints an informative message describing the type of error encountered (such as "Permission denied" or "Not a directory"). These two error functions are shown and described in Appendix B. We will say more about error handling in Section 1.7.

When the program is done, it calls the exit function with an argument of 0. The exit function terminates the program. By convention, an argument of 0 means OK, and an argument between 1 and 255 means that an error occurred. In Section 8.5, we will see how any program, such as a shell or a program that we write, can obtain the exit status of a program that it executes.

Working Directory

Every process has a working directory, sometimes called the current working directory. This is the directory from which all relative pathnames are interpreted. A process can change its working directory with the chdir function.

For example, the relative pathname doc/memo/joe refers to the file or directory joe, in the directory memo, in the directory doc, which must be a directory within the working directory. From this pathname we know that doc and memo must be directories, but we cannot tell whether joe is a file or a directory. The pathname /usr/lib/lint is an absolute pathname that refers to the file or directory lint in the directory lib, in the directory usr, which is in the root directory.

Home Directory

When we log in, the working directory is set to our home directory. Our home directory is obtained from our entry in the password file (Section 1.3).


Source : Viblo