INTRO(1) User Commands INTRO(1)

intro - introduction to commands

The Heirloom Toolchest is a collection of standard Unix utilities that is intended to provide maximum compatibility with traditional Unix while incorporating additional features necessary today. To achieve this, utilities are derived from original Unix sources where their licenses permit. This means that material from Unix 6th Edition, Unix 7th Edition, and Unix 32V was used, since these systems were put under an Open Source license by Caldera in January 2002. In addition, 4BSD source (governed by the University of California's copyright and partially derived from 32V) has been used. (Other sources were Sun's `OpenSolaris', Caldera's `Open Source Unix[tm] Tools', the MINIX utility collection, Plan 9, and Info-ZIP's compression code.) Where no freely available Unix sources existed (for example, for tools introduced in System III or System V), utilities were rewritten from scratch. (The exact license terms are provided in a separate document.)

The tools in this collection are modeled on the specifications or systems named below. Since these are not fully compatible with one another, some tools are present in more than one version.

System V Interface Definition, Third Edition (UNIX System Laboratories, 1992) (SVID3). This specification corresponds to a System V Release 4 or Solaris 2 system. Utilities in /usr/5bin are modeled after this specification and related system environments. Extensions introduced in POSIX.2 or POSIX.1-2001 (see below) were incorporated in these utilities as well, provided they did not conflict with the behavior at this level. This is the most traditional personality available with the Heirloom Toolchest; most notably, regular expressions do not have any of the internationalization features (see ed(1) and egrep(1)), and awk is the old version, oawk(1). Use this personality to get the best compatibility with traditional System V behavior.
System V Interface Definition, Fourth Edition (Novell, Inc., 1995) (SVID4). This specification corresponds to a System V Release 4.2 MP system. Utilities in /usr/5bin/s42 are modeled after this specification and related system environments. Extensions introduced in POSIX.2 or POSIX.1-2001 (see below) were incorporated in these utilities as well, provided they did not conflict with the behavior at this level. The most essential differences between this and the SVID3 personality are internationalized regular expressions and the choice of the new awk, nawk(1), for awk. Use this personality to get traditional System V behavior combined with internationalized regular expressions.
ISO/IEC 9945-2:1993 / ANSI/IEEE Std 1003.2-1992 (POSIX.2), with the extensions of The Single UNIX Specification, Version 2 (The Open Group, 1997). Utilities in /usr/5bin/posix are intended to fully comply with this specification even in cases of conflict with historical behavior. Non-conflicting extensions to POSIX.2 found in the environments described above are also present in these utilities. Use this personality if you need POSIX.2 features in preference to traditional System V ones.
ISO/IEC 9945-1:2001 / ANSI/IEEE Std 1003.1-2001 (POSIX.1-2001), with the extensions of The Single UNIX Specification, Version 3 (The Open Group, 2001). Utilities in /usr/5bin/posix2001 are intended to fully comply with this specification even in cases of conflict with historical behavior. Non-conflicting extensions to POSIX.1-2001 found in the environments described above are also present in these utilities. Use this personality if you need POSIX.1-2001 features in preference to traditional System V ones.

To use the Heirloom Toolchest, select one of these personalities and put the corresponding directory at the beginning of the PATH environment variable, immediately followed by the toolchest base directory, /usr/5bin (which contains the tools that are the same for all personalities). For example, to use the toolchest with a SVID4 personality, execute

PATH=/usr/5bin/s42:/usr/5bin:$PATH export PATH

You must select exactly one of the personalities above; you do not have access to the complete set of tools otherwise.
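
The selection can be wrapped in a small shell function. The following is only a sketch: the function name heirloom_path and the personality keywords svid3, svid4, posix and posix2001 are hypothetical and not part of the toolchest; the directories are the ones named above.

```shell
# heirloom_path PERSONALITY OLDPATH
# Print OLDPATH prefixed with the directories for one personality.
# (Hypothetical helper; the keywords are made up for this example.)
heirloom_path() {
    case $1 in
    svid3)     hp_dir= ;;                  # /usr/5bin itself holds the SVID3 set
    svid4)     hp_dir=/usr/5bin/s42 ;;
    posix)     hp_dir=/usr/5bin/posix ;;
    posix2001) hp_dir=/usr/5bin/posix2001 ;;
    *)         echo "heirloom_path: unknown personality: $1" >&2
               return 1 ;;
    esac
    # The personality directory comes first, then the base directory.
    printf '%s\n' "${hp_dir:+$hp_dir:}/usr/5bin:$2"
}

# Show the search path an SVID4 personality would get.
heirloom_path svid4 /usr/bin:/bin
```

To apply the result to the current shell, one might run PATH=$(heirloom_path posix2001 "$PATH") && export PATH.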

The manual pages generally note which behavior corresponds to which utility version. They also mark whether options and arguments were part of System V, were introduced with POSIX.2 or POSIX.1-2001, or are extensions provided by the Heirloom Toolchest (possibly modeled on extensions introduced by other vendors). Such extensions are subject to change without a grace period; they are intended only for interactive use and should not be included in scripts.

The toolchest also includes some utilities modeled after the BSD Compatibility environment of System V; these roughly correspond to 4.3BSD or SunOS 4 systems. These tools can be found in /usr/ucb; since they do not form a full personality set like the ones described above, they should be used in addition to one of them, for example:

PATH=/usr/ucb:/usr/5bin/s42:/usr/5bin:$PATH export PATH
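
Because the shell takes the first match it finds along PATH, a tool present in /usr/ucb hides the version of the same name in the personality directories. The sketch below demonstrates this ordering with a throwaway directory tree rather than a real installation; the tool name demo and the helper first_match are hypothetical, and mktemp -d is assumed to be available.

```shell
# Build a scratch tree that mimics the directory layout described above.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/ucb" "$root/5bin/s42"

# Each directory gets a stub "demo" that reports where it lives.
for d in ucb 5bin/s42 5bin; do
    printf '#!/bin/sh\necho "%s"\n' "$d" > "$root/$d/demo"
    chmod +x "$root/$d/demo"
done

# first_match DIRLIST: run "demo" with DIRLIST placed in front of PATH.
# A subshell keeps the caller's PATH unchanged.
first_match() {
    ( PATH=$1:$PATH; export PATH; demo )
}

first_match "$root/ucb:$root/5bin/s42:$root/5bin"   # the ucb stub runs
first_match "$root/5bin/s42:$root/5bin"             # the s42 stub runs
```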

While the Heirloom Toolchest is intended to be as compatible as possible with historical practice in general, annoying static limits of historical implementations are not present any longer. Input lines of unlimited length are generally accepted (as long as enough memory is available); most utilities are also able to handle binary input data (i.e. ASCII NUL characters in the input stream).

The Heirloom Toolchest includes support for multibyte character encodings; if the underlying C library supports this and the LC_CTYPE locale (see locale(7) for an introduction) is set appropriately, multiple input bytes can form a single character and are handled as such in regular expressions, display width computations etc.
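
The effect of the locale on character counting can be observed with wc -m. The sketch below assumes that a locale named en_US.UTF-8 is installed, which is not guaranteed; count_chars is a hypothetical helper name. LC_ALL is set rather than LC_CTYPE so that a value of LC_ALL inherited from the environment cannot override the choice.

```shell
# count_chars LOCALE STRING: print the number of characters that wc -m
# sees in STRING under LOCALE.  (Hypothetical helper for illustration.)
count_chars() {
    printf '%s' "$2" | LC_ALL=$1 wc -m | tr -d ' '
}

sample=$(printf '\303\244')    # U+00E4 encoded in UTF-8, two bytes

count_chars C "$sample"              # single-byte locale: every byte counts
count_chars en_US.UTF-8 "$sample"    # multibyte locale, if it is installed
```

If the UTF-8 locale is missing, the second call typically falls back to byte counting, so a result equal to the first one is a hint that the locale could not be set.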

Multibyte character support was designed with special regard to the UTF-8 encoding. Additional supported encodings are EUC-JP, EUC-KR, Big5, Big5-HKSCS, GB 2312, and GBK. Other encodings may also work, with the following restrictions:

The character set must be a superset of ASCII (more specifically, of the International Reference Version of ISO 646). All ASCII characters must be encoded as a single byte with the same value as the ASCII character. This excludes 7-bit encodings like UTF-7. In addition, the C language implementation must map each ASCII character to a wide character with the same value.
The first byte of each multibyte character must have the highest bit set, i.e. it must not be an ASCII character. This excludes encodings whose sequences start with ASCII characters like TCVN 5712.
Locking-shift encodings, like those that use ISO 2022 escape sequences, are not supported.
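
To find out which encoding the current locale selects, the output of locale charmap can be compared with the encodings named above. This is a sketch only: charmap spellings differ between systems, the names below follow those used by glibc (an assumption), and charmap_listed is a hypothetical helper. An encoding that is not listed may still work if it meets the restrictions above.

```shell
# charmap_listed NAME: succeed if NAME is one of the multibyte encodings
# named in the text.  Spellings follow glibc's charmap names (assumption).
charmap_listed() {
    case $1 in
    UTF-8|EUC-JP|EUC-KR|BIG5|BIG5-HKSCS|GB2312|GBK) return 0 ;;
    *) return 1 ;;
    esac
}

# Ask the system for the charmap of the current locale, if it can tell.
cm=$(locale charmap 2>/dev/null) || cm=unknown

if charmap_listed "$cm"; then
    echo "$cm: listed multibyte encoding"
else
    echo "$cm: not in the list"
fi
```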

Character comparison, regular expression matching and similar tasks are generally performed on the character representation obtained from the locale processing of the C library. A glyph formed by the application of combining characters to a base character will thus not normally be considered equal to the same glyph represented by a single base character. For string comparison, the results depend on the collation mechanism of the locale, which might or might not respect such relations.

Processing of multibyte character encodings is often notably slower than that of singlebyte character encodings. Since many widely-used languages (especially European ones based on Latin letters) contain few multibyte characters if encoded in UTF-8, and since experience shows that large amounts of textual data tend to be machine generated and to contain mostly ASCII characters (e.g. log files), while international language texts are mostly created by humans and tend to be smaller, processing of text in multibyte locales has generally been optimized for ASCII text. The performance penalty for using a multibyte locale is thus usually low if no or few multibyte characters actually occur in the data processed.

A problem with multibyte encodings that does not normally occur in singlebyte encodings is that of illegal byte sequences. In a singlebyte locale, each byte is treated as a character entity even if its value is not defined in the coded character set. For example, bytes with their highest bit set are simply passed through in the default `C' or `POSIX' locale, and can appear in option arguments as well as in input data. In multibyte locales, however, byte sequences that do not form a valid character cannot be handled this way, because it is not always clear which bytes are to be grouped together. As an example, suppose that the `\200' byte introduces a multibyte sequence. If this byte occurs in a string to be matched by a utility but is not followed by a valid continuation byte, it is unclear whether it should match any byte sequence containing this byte, including valid ones that form a character, or whether matches should be restricted to occurrences in other incomplete sequences. For this reason, this implementation generally treats illegal byte sequences in command line arguments or programming scripts as syntax errors. Utilities do not issue a warning or terminate with an error if such sequences appear in input data, though, since this frequently occurs in practice when processing binary or foreign-locale files. In most cases, the sequences are passed to the output unaltered. The fact that data is accepted or generated by a utility can thus not be taken as an indication of its validity with respect to the current character encoding.
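
Since the utilities pass illegal sequences through silently, validity has to be checked separately where it matters. One possible approach, sketched here for UTF-8, is to let iconv(1) convert the data to its own encoding and look at the exit status. iconv is not part of the list below and is assumed to be provided by the system; valid_utf8 is a hypothetical helper name.

```shell
# valid_utf8: succeed if standard input is well-formed UTF-8.
# Relies on the system's iconv rejecting illegal input sequences.
valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# A two-byte character followed by a newline is well-formed.
printf 'caf\303\251\n' | valid_utf8 && echo "well-formed"

# A lone \200 byte cannot start or continue a character here.
printf 'a\200b\n' | valid_utf8 || echo "illegal byte sequence found"
```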

Name Appears on Page Description
apply apply(1B) repeatedly apply a command to a group of arguments; select arguments
apropos apropos(1) locate commands by keyword lookup
banner banner(1) make posters
basename basename(1) return non-directory portion of a pathname
basename basename(1B) (BSD) return non-directory portion of a pathname
bc bc(1) arbitrary-precision arithmetic language
bdiff bdiff(1) big diff
bfs bfs(1) big file scanner
cal cal(1) print calendar
calendar calendar(1) reminder service
cat cat(1) concatenate and print files
catman catman(8) create the formatted files for the reference manual
chgrp chown(1) change owner or group
chmod chmod(1) change mode
chown chown(1) change owner or group
chown chown(1B) (BSD) change file owner
chroot chroot(8) change system's root directory and execute a command there
cksum cksum(1) write file checksums and sizes
cmp cmp(1) compare two files
col col(1) filter reverse line feeds
comm comm(1) select or reject lines common to two sorted files
copy copy(1XNX) (XENIX) copy groups of files
cp cp(1) copy files
cpio cpio(1) copy file archives in and out
csplit csplit(1) context split
cut cut(1) cut out selected fields of each line of a file
date date(1) print or set the date
dc dc(1) desk calculator
dd dd(1) convert and copy a file
deroff deroff(1) remove nroff/troff, tbl, and eqn constructs
deroff deroff(1B) (BSD) remove nroff, troff, tbl and eqn constructs
df df(1) disk free
df df(1B) (BSD) disk free
dfspace df(1) disk free
diff diff(1) differential file comparator
diff3 diff3(1) 3-way differential file comparison
dircmp dircmp(1) directory comparison
dirname dirname(1) return the directory portion of a pathname
du du(1) summarize disk usage
du du(1B) (BSD) summarize disk usage
echo echo(1) echo arguments
echo echo(1B) (BSD) echo arguments
ed ed(1) text editor
egrep egrep(1) search a file for a pattern using full regular expressions
env env(1) set environment for command invocation
expand expand(1) convert tabs to spaces
expr expr(1) evaluate arguments as an expression
factor factor(1) factor a number
false true(1) provide truth values
fgrep fgrep(1) search a file for a character string
file file(1) determine file type
find find(1) find files
fmt fmt(1) simple text formatter
fmtmsg fmtmsg(1) display a message in standard format
fold fold(1) fold long lines
getconf getconf(1) get configuration values
getopt getopt(1) parse command options
grep grep(1) search a file for a pattern
groups groups(1) show group memberships
groups groups(1B) (BSD) show group memberships
hd hd(1XNX) (XENIX) display files in hexadecimal format
head head(1) display first few lines of files
hostname hostname(1) set or print name of current host system
id id(1) print user and group IDs and names
install install(1B) (BSD) install files
join join(1) relational database operator
kill kill(1) terminate a process
lc ls(1) list contents of directory
line line(1) read one line
listusers listusers(1) print a list of user logins
ln ln(1) make a link
ln ln(1B) (BSD) make links
logins logins(1) list login information
logname logname(1) get login name
ls ls(1) list contents of directory
ls ls(1B) (BSD) list contents of directory
mail mail(1) send or receive mail among users
man man(1) find and display reference manual pages
mesg mesg(1) permit or deny messages
mkdir mkdir(1) make a directory
mkfifo mkfifo(1) make FIFO special file
mknod mknod(1M) build special file
more more(1) browse or page through a text file
mt mt(1) magnetic tape utility
mv mv(1) move or rename files and directories
mvdir mvdir(1) move a directory
nawk nawk(1) pattern scanning and processing language
newform newform(1) change the format of a text file
news news(1) print news items
nice nice(1) run a command at low priority
nl nl(1) line numbering filter
nohup nohup(1) run a command immune to hangups
oawk oawk(1) pattern scanning and processing language
od od(1) octal dump
page more(1) browse or page through a text file
paste paste(1) merge same lines of several files or subsequent lines of one file
pathchk pathchk(1) check pathnames
pax pax(1) portable archive interchange
pg pg(1) file perusal filter for CRTs
pgrep pgrep(1) find or signal processes by name and other attributes
pick apply(1B) repeatedly apply a command to a group of arguments; select arguments
pkill pgrep(1) find or signal processes by name and other attributes
pr pr(1) print files
printenv printenv(1) print out the environment
printf printf(1) print a text string
priocntl priocntl(1) process scheduler control
ps ps(1) process status
ps ps(1B) (BSD) process status
psrinfo psrinfo(1) displays information about processors
ptime time(1) time a command
pwd pwd(1) working directory name
random random(1XNX) (XENIX) generate a random number
readlink readlink(1) displays the target of a symbolic link
readlink readlink(1B) (BSD) displays the target of a symbolic link
renice renice(1) alter priority of running processes
rev rev(1) reverse lines of a file
rm rm(1) remove directory entries
rmdir rmdir(1) remove directories
sdiff sdiff(1) print file differences side-by-side
sed sed(1) stream editor
seq seq(1) print a sequence of numbers
setpgrp setpgrp(1) set process group ID and session ID
settime settime(1XNX) (XENIX) change the access and modification dates of files
shl shl(1) shell layer manager
sleep sleep(1) suspend execution for an interval
sort sort(1) sort or merge files
spell spell(1) find spelling errors
split split(1) split a file into pieces
stty stty(1) set the options for a terminal
stty stty(1B) (BSD) set the options for a terminal
su su(1) become super-user or another user
sum sum(1) sum and count blocks in a file
sum sum(1B) (BSD) sum and count blocks in a file
sync sync(1M) update the super block
tabs tabs(1) set terminal tabs
tail tail(1) deliver the last part of a file
tape tape(1) magnetic tape maintenance
tapecntl tapecntl(1) tape control for tape devices
tar tar(1) tape archiver
tcopy tcopy(1) copy a magnetic tape
tee tee(1) pipe fitting
test test(1) condition command
test test(1B) (BSD) condition command
time time(1) time a command
timeout timeout(1) execute a command with a time limit
touch touch(1) update file access and modification times
tr tr(1) translate characters
tr tr(1B) (BSD) translate characters
true true(1) provide truth values
tsort tsort(1) topological sort
tty tty(1) get terminal name
ul ul(1) underline
uname uname(1) get system name
unexpand unexpand(1) convert spaces to tabs
uniq uniq(1) report repeated lines in a file
units units(1) conversion program
uptime uptime(1) show how long system has been up
users users(1) display a compact list of users logged in
w w(1) who is on and what they are doing
wall wall(1M) write to all users
watch watch(1) keep an eye on the output of a command
wc wc(1) word count
what what(1) extract SCCS version information from a file
whatis whatis(1) display a one-line summary about a keyword
who who(1) who is on the system
whoami whoami(1) display the effective current username
whodo whodo(1) who is doing what
write write(1) write to another user
xargs xargs(1) construct argument list(s) and execute command
yes yes(1XNX) (XENIX) print string repeatedly

Page Description
fspec(5) format specification in text files
man(7) macros to typeset manual
1/22/06 Heirloom Toolchest