Source code listing for the Lions' Commentary in PDF and PostScript

UNIX OPERATING SYSTEM SOURCE CODE LEVEL SIX
Released April 2004
More than 15 years ago I produced a replica of the Lions' source code listing because I was unhappy with the quality of my n-th generation copy. There was no TUHS, and I had no access to any old source code. But in 1988 I discovered an old 9-track tape, a PDP11 backup, that was being discarded. It was hard to determine what the machine had been running, but the tape did have an intact /usr/src/ tree in which most of the files were timestamped 1979; even at that time it seemed ancient. So it was either 7th edition or a derivative like PWB, which I believe it was.

I used this as a basis and hand edited the source back into 6th edition form. Some code was identical; some required only the easy edit of changing the modern += token into the archaic =+. Other files just needed casts removed, while some code had to be completely retyped, though not much of it.
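
For illustration only (the edits described above were done by hand), the token change could be approximated mechanically. The sketch below assumes every compound assignment has the form of a single-character operator followed by =. The function name to_v6_assign is made up for this example, and the substitution is purely textual, so it would also rewrite matches inside strings and comments.

```shell
# Hypothetical helper: rewrite modern compound assignments (+=, -=, ...)
# into the archaic V6 order (=+, =-, ...). Reads stdin, writes stdout.
# Comparisons (==, !=, <=, >=) are left alone because their first
# character is not in the bracket list.
to_v6_assign() {
    sed 's,\([-+*/%&|^]\)=,=\1,g'
}

# Small demonstration on two made-up lines of C.
printf 'x += 1;\nif (a != b) y -= 2;\n' | to_v6_assign
```

A result produced this way would still need reading by eye, since the old =- order is ambiguous in expressions such as x=-1.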

Because of copyright and trade secret restrictions it was not to be circulated, so it remained closely held. In fact it was lost even to me until recent discussions, after I acquired the out-of-print Lions' book, made me look for it. Since you can now get almost exactly this source code from the TUHS V6 tree, as well as older and newer source code in a much more usable form than my document, I feel more comfortable providing this derivative work now.

It's not exact, as my cross reference was programmatically generated, but it is a superset of the original one. It is also based on the first edition of Lions' commentary. The script that I wrote to produce the book from the source files was stored along with it. I have included it below, with my notes on how long it took to run, because I found the metrics interesting and because it used significant resources at the time. It took over 15 minutes on a 3B20S and 3 minutes on mainframe hardware. Just for kicks I ran it today and it still works. Now it runs in under 6 seconds, and that's not even on a particularly fast machine, a 900MHz Sun V880. It could probably run in under a minute on mobile phones these days.
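
As a rough illustration of how a programmatically generated cross reference of this kind can be assembled, the sketch below groups "identifier line-number" pairs into one row per identifier. It follows the spirit of the final sort, uniq, and awk stage of the script below but is not a copy of it: the function name xref_rows and the sample data are made up, and the four-numbers-per-row wrapping is left out.

```shell
# Hypothetical helper: turn "identifier line-number" pairs on stdin into
# one cross-reference row per identifier, listing each line it appears on.
xref_rows() {
    sort -k1,1 -k2,2n | uniq | awk '
    $1 != last {                    # new identifier: start a new row
        if (NR > 1) printf("\n")
        printf("%-10s", $1)
        last = $1
    }
    { printf(" %s", $2) }           # append this line number to the row
    END { if (NR > 0) printf("\n") }'
}

# Made-up sample pairs; the duplicate "alpha 0300" is dropped by uniq.
printf 'alpha 0300\nbeta 0200\nalpha 0102\nalpha 0300\n' | xref_rows
```

Because every occurrence of every identifier is emitted, a listing built this way can easily contain more entries than a hand-curated one.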

DOWNLOAD the document here as either PDF or gzip'd PostScript. If you plan to print it on a duplex printer, some (most) devices will print the back side of the page upside down, because the document is in landscape and the binding is assumed to be along the left 11" side of the paper, which is the bottom of our page. You should test your printer by printing only the first four pages to see how they look. If you have this problem, use the alternates with the even-numbered pages flipped upside down: alternate page flipped PDF or alternate page flipped gzip'd PostScript.

A replica of the commentary itself can be found at another site here.

 
#!/bin/sh
### make v6 unix source listing book
### whutt!wally		WH 3A-327	AUGUST MCMLXXXVIII
# This takes forever: it will take more than 15 minutes on a 3b20.
# Here's a timex of printing this on a lightly
# loaded IBM 3081k (2 CPU 16 MIP) running UTS:
#	real     3:07.13
# 	user     1:23.45
# 	sys        15.93
# Nearly all this time is used in making the Reference pages.
# If you don't want the reference pages made set "REFS=false" below
#
# set up
REFS=${REFS:-true}
umask 066
UNIX="UNIX Operating System"
TMPHOLD=/tmp/.v6.out
TMP=/tmp/unix
ASMLST=/tmp/asm.lst
PLST=/tmp/procs.lst
ASM_ONLY=/tmp/.v6.s.tmp
NO_ASM=/tmp/.v6.no.s.tmp
XREF=/tmp/xlist.tmp
mkdir $TMP
trap "rm -rf $TMPHOLD $TMP $ASMLST $PLST $ASM_ONLY $NO_ASM $XREF;exit" 0 1 2 3
SRCDIR=`pwd`
AWK=nawk
XFILE=/tmp/unix/.unumber
# Set up order of files
SET1="param.h systm.h seg.h proc.h user.h low.s m40.s main.c slp.c prf.c malloc.c"
SET2="reg.h trap.c sysent.c sys1.c sys4.c clock.c sig.c"
SET3="text.h text.c buf.h conf.h conf.c bio.c rk.c"
SET4="file.h filsys.h ino.h inode.h sys2.c sys3.c rdwri.c subr.c fio.c alloc.c iget.c nami.c pipe.c"
SET5="tty.h kl.c tty.c pc.c lp.c mem.c"
ORDER="$SET1 $SET2 $SET3 $SET4 $SET5"
# Go!
> $XFILE
for i in $ORDER
do
 # this awk script "eats" the first "^I" because "4.4%d %c%c%c" is the
 # first tab stop.  Make each first one 2 "^I"s.
 sed 's/^	/		/' $i | \
 $AWK '
  BEGIN {XFILE="/tmp/unix/.unumber"
	getline start < XFILE
	if (start<100) start=100;
	lineno=start;lx=0;
	TAIL1="Reproduced under license from the Western Electric Company, NY";
	TAIL2="Copyright, J. Lions, 1976"
	}
  { printf("%4.4d ",lineno);
    print $0;
    lineno++;
    lx++;
    if(lx == 50) {
	printf("\n\n%s\n%s\n\nSheet %2.2d\n",TAIL1,TAIL2,(lineno-50)/100);
	lx=0;
    }
  }
 END {print lineno > XFILE}' >$TMP/$i
done
cd $TMP/..
DIR=`echo $TMP | $AWK '{x=split($1,S,"/"); print S[x]}'`
# need to stick $DIR in front of every element in $ORDER; sure would be nice
# to have the functionality of "rc" from "Plan 9", which would do it easy.
for i in $ORDER
do
  XORDER="$XORDER $DIR/$i"
done
pr -w66 -l66 $XORDER  > $TMPHOLD

if $REFS
then

# Make Reference Pages Before the book!
cd $SRCDIR
# Make procedure list sorted alfa..
# first - treat assembly special they start with an _
grep '^....[ 	]_[a-z][a-z]*[0-9]*:' $TMPHOLD | $AWK '{print $1, $2}'  > $ASMLST
for i in `cut -d_ -f2 $ASMLST | cut -d: -f1`
do
	A=`grep -l "[ 	(&*!|+-=/:?]*$i[ 	]*(" $TMPHOLD`
	if [ "$A" != "" ]
	then
		grep "_$i:" $ASMLST
	fi
done | sed 's/_//' > $PLST
# now do all C functions.
grep '^....[ 	][a-z][a-z]*[0-9]*(.*)[ 	]*$' $TMPHOLD | cut -d'(' -f1  >> $PLST
sort +1 $PLST | pr -4 -w132 -l66 -h "$UNIX Procedures Sorted Alphabetically"

# Make procedure listing file by file and get line number from above
for i in $ORDER 
do
  echo "   File  $i"
  FT=`echo $i | cut -d. -f2`
  if [ "$FT" = "s" ]
  then
	# this is an assembly module
	A=`grep '^_[a-z][a-z]*[0-9]*:' $i | cut -d_ -f2 | cut -d: -f1`
	for j in $A
	do
		grep " $j:\$" $PLST
	done | sed 's/ / _/'
  else
	A=`grep '^[a-z][a-z]*[0-9]*(.*)[ 	]*$' $i | cut -d'(' -f1`
	for j in $A
	do
		grep " $j\$" $PLST
	done
  fi
done | pr -5 -w132 -l66 -h "$UNIX Files and Procedures"

# Make list of all #define'd symbols and their values
grep  '#define' $TMPHOLD | \
   $AWK '{printf("%s %s  \t%s\n",$1,$3,$4)}' | sort +1 |\
	pr -5 -w132 -l66 -h "$UNIX Defined Symbols"

# Now the hard part - the Cross listing 
# Use a tokenizer to get all tokens (and line numbers)
# Since rules for C and asm differ; must use sep. progs.

SSTR=`grep -n 'unix/.*\.s' $TMPHOLD | sed -n 1p | cut -d: -f1`
SEND=`grep -n 'unix/.*\.s' $TMPHOLD | tail -1 | cut -d: -f1`
SEND=`expr $SEND + 60`
sed -n $SSTR,${SEND}p $TMPHOLD > $ASM_ONLY
sed $SSTR,${SEND}d $TMPHOLD > $NO_ASM

# first the assembly --
grep '^....[ 	]*[A-Za-z_][A-Za-z_0-9]*:' $ASM_ONLY | cut -d: -f1 |\
  $AWK '{print $2, $1}' | sed 's/_//' > $XREF
$AWK '
BEGIN { go() }

function gettok() {
   if (tok == "(eof)") return "(eof)"
   sub(/^[ \t]+/, "", line)
   tokno++;
   while ((nlen=length(line)) == 0) {
	if(getline line == 0)
		return tok = "(eof)"
	if (line ~ /^[0-9][0-9][0-9][0-9]/) {
		lineno = substr(line, 1, 4);
		line = substr(line, 5);
	} else line =""
        sub(/^[ \t]+/, "", line)
	tokno = 0
   }
   if (line ~ /^[A-Za-z]/) {
	nxtn = 1
	xline = substr(line, 2);
	while((xline ~ /^[A-Za-z_0-9]/) && (nxtn < nlen)) {
		nxtn++;
		xline = substr(xline, 2);
	}
    }
    else if (line ~ /^./) {
	nxtn =1 
    }
    tok = substr(line, 1, nxtn)
    line = substr(line, nxtn+1)
    return tok
}

function go() {
     gettok();
     while ( tok != "(eof)" ) {
	filter();
	gettok();
   }
}

function filter()
{
	if( tok != "/*") {
	   if((tok ~ /^[A-Za-z]/) && (tokno != 0))
		print tok, lineno
	   if(tok == "/" )
		line = ""
	   if(tok == "\"") 
	     if((stp=index(line, "\"")) != 0) 
		line = substr(line, stp+1)
	}
	else while((tok != "(eof)") && (tok != "*/")) gettok();
}
'  $ASM_ONLY  |\
sed -e '/^[bfA-Z] /d' -e '/^globl /d' -e '/^r[0-9] /d' -e '/^sp /d' -e '/^PS /d' -e '/^pc /d' >> $XREF

# this only prints the identifier tokens
$AWK '
BEGIN { go() }

function gettok() {
   if (tok == "(eof)") return "(eof)"
   sub(/^[ \t]+/, "", line)
   while ((nlen=length(line)) == 0) {
	if(getline line == 0)
		return tok = "(eof)"
	if (line ~ /^[0-9][0-9][0-9][0-9]/) {
		lineno = substr(line, 1, 4);
		line = substr(line, 5);
	} else line =""
        sub(/^[ \t]+/, "", line)
   }
   if (line ~ /^[A-Za-z]/) {
	nxtn = 1
	xline = substr(line, 2);
	while((xline ~ /^[A-Za-z_0-9]/) && (nxtn < nlen)) {
		nxtn++;
		xline = substr(xline, 2);
	}
    }
    else if((line ~ /^(\/\*)/) || (line ~ /^(\*\/)/)) {
	nxtn = 2
    }
    else if (line ~ /^./) {
	nxtn =1 
    }
    tok = substr(line, 1, nxtn)
    line = substr(line, nxtn+1)
    return tok
}

function go() {
     gettok();
     while ( tok != "(eof)" ) {
	filter();
	gettok();
   }
}

function filter()
{
	if( tok != "/*") {
	   if(tok ~ /^[A-Za-z]/)
		print tok, lineno
	   if(tok == "\"") 
	     if((stp=index(line, "\"")) != 0) 
		line = substr(line, stp+1)
	}
	else while((tok != "(eof)") && (tok != "*/")) gettok();
}
'  $NO_ASM  |\
    sed -e '/^for /d' -e '/^do /d' -e '/^while /d' -e '/^case /d' -e '/^switch /d' -e '/^include /d' -e '/^if /d' -e '/^define /d' -e '/^return /d' -e '/^struct /d' -e '/^int /d' -e '/^char /d' -e '/^register /d' -e '/^[a-im-pstA-Z] /d' -e '/^default /d' -e '/^goto /d' -e '/^break /d' -e '/^continue /d' -e '/^sizeof /d' >> $XREF
# the above sed removes C dirty words and other nasties

sort -f $XREF | uniq | $AWK '
BEGIN {last1 = ""; col=0}
{
 if (last1 == $1) {
	if(col>3) {
		printf("\n          ");
		col = 0
	}
	printf("%s ",$2);
	col++;
 } else {
	if (col > 0) printf("\n")
	printf("%s",$1);
	x=10-length($1)
	while((x--)>0) printf(" ");
	printf("%s ",$2);
	col =1;
	last1 = $1
 }
}
END {print}
' |\
pr -4 -w132 -l66 -h "$UNIX Source Code Cross Reference Listing"

fi #end-if of if $REFS

# Now, finally print the source code!!
# sed is to fix up stuff chopped off because of "pr -2"
pr -t -2 -w132 -l66 $TMPHOLD | \
sed -e '/lists: for/s/associat./associat-/' -e '/\*f_offset/s/nter	\*./nter	*\//' -e '/)(ip->i_addr\[0\])$/s/$/;/' -e '/\[0\],v)./s/v)./v);/' -e '/[ers] \*$/s/$/\//' -e '/==0$/s/$/)/' -e '/++$/s/$/)/'
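
The numbering step near the top of the script carries one running line counter from file to file through a small scratch file. A minimal sketch of that idea follows, assuming a POSIX awk. The function name number_file and the demo file names are invented for this example, and the 50-line sheet footers are left out.

```shell
# Hypothetical helper: print a file with 4-digit line numbers, continuing
# from the value saved in a counter file (or from 100 if none is saved).
# usage: number_file COUNTERFILE SOURCEFILE
number_file() {
    awk -v xfile="$1" '
    BEGIN {
        # an empty or missing counter file means "start at 100"
        if ((getline start < xfile) <= 0) start = 0
        close(xfile)
        start += 0
        if (start < 100) start = 100
        lineno = start
    }
    { printf("%4.4d %s\n", lineno++, $0) }
    END { print lineno > xfile }    # save the next number for the next file
    ' "$2"
}

# Demonstration with two made-up files in a scratch directory.
demo=$(mktemp -d)
printf 'int a;\nint b;\n' > "$demo/one.c"
printf 'int c;\n' > "$demo/two.c"
: > "$demo/counter"
number_file "$demo/counter" "$demo/one.c"
number_file "$demo/counter" "$demo/two.c"
rm -rf "$demo"
```

Keeping the counter in a file rather than a shell variable lets each awk invocation stay independent, which is why the script can simply loop over the file list.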



See also my 7th edition documents in PDF.
