
Programming Constructs

This chapter outlines the various structures provided for connecting statements together: describing serial relationships, conditional execution, iteration and how procedures are defined and used.


Serial Execution

As we’ve seen already, commands typed on successive lines are executed serially, one after the other. Writing several commands on one line with semicolons between them does the same thing.


368 D% echo hello; echo world
hello
world
369 D% _

The semicolon is not passed to the application you invoke. If you really do want to pass a semicolon, you have to escape it or put it inside quotes.

A non-zero return code is not normally considered an error: regardless of the return code from any particular command, serial execution continues. We can demonstrate this with the rcode utility in the samples directory, which prints the return code value you pass it on the command line and then exits with that value. This example also shows how you can retrieve the return code of the last child process by referring to the built-in status variable.


369 D% cd ~\samples
370 D% rcode 1; rcode 2
1
2
371 D% calc status
2

It’s also possible to describe a conditional serial relationship. If statements are joined by “&&”, the second one is executed only if the return code from the first one is 0, i.e., if the first statement succeeds. If statements are joined by “||”, the second is executed only if the first one fails, i.e., returns a non-zero return code.


372 D% rcode 0 || rcode 1
0
373 D% rcode 1 || rcode 2
1
2
374 D% rcode 0 && rcode 1
0
1
375 D% rcode 1 && rcode 2
1
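
The short-circuit rule behind these results can also be modelled outside the shell. The following is a minimal Python sketch, not the shell's implementation: rcode here is a stand-in for the sample utility that records the value it would print, and and_then and or_else are hypothetical helper names for “&&” and “||”. The status returned for the combined statement is an assumption of the model.

```python
def rcode(value, printed):
    """Stand-in for the rcode sample: record the value, then return it as the status."""
    printed.append(value)
    return value


def and_then(first, second):
    """Model of `first && second`: run second only if first returns 0."""
    status = first()
    return second() if status == 0 else status


def or_else(first, second):
    """Model of `first || second`: run second only if first returns non-zero."""
    status = first()
    return second() if status != 0 else status


if __name__ == "__main__":
    # Replay the four combinations from the transcript above.
    cases = [(or_else, 0, 1), (or_else, 1, 2), (and_then, 0, 1), (and_then, 1, 2)]
    for connector, a, b in cases:
        printed = []
        connector(lambda: rcode(a, printed), lambda: rcode(b, printed))
        print(connector.__name__, a, b, "->", printed)
```

In each case the list of recorded values corresponds to the lines rcode printed in the transcript.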


Statements and Statement Lists

I/O redirectors and statement connectors are recognized according to a precedence. Just as in expressions, where “*” is done before “+”, statements are parsed so that some things are done before others. I/O redirection comes before piping which comes before conditional execution which comes before serializing with semicolons. For example:


376 D% echo hello; echo world | wc
hello
        1        1        7

The shell makes a special distinction between individual statements, no matter how complex, and lists of statements typed on separate lines or separated by semicolons.

Here’s an example using the time command, which runs a statement and prints out the hours, minutes and seconds it took. time expects a single statement as an operand; if you type a semicolon, the time command (together with its operand) becomes just one statement in the list.


377 D% time echo hello world | wc
        1        2       13
0:00:00.50
378 D% time echo hello; echo world
hello
0:00:00.00
world


Parentheses

There are two ways to group a list of statements together to make them act like a single statement. The simplest way is with parentheses, which work the way they would in an expression: even if the operators inside the parentheses are of lower precedence, they’re done first.


379 D% (echo hello; echo world) | wc
        2        2       14
380 D% time (echo hello; echo world)
hello
world
0:00:00.00

A parenthesized group gets its own copy of the current directory and disk. This makes it convenient to change directories inside the group and go do something without having to change back afterward.


381 D% cd
d:\Nicki\samples
382 D% (cd ..; cd)
d:\Nicki
383 D% cd
d:\Nicki\samples

The actual implementation uses the directory stack mechanism: at entry to the group, the current directory is pushed onto the directory stack and at exit, the top entry is popped.


384 D% dirs
d:\Nicki\samples
385 D% (dirs )
d:\Nicki\samples
d:\Nicki\samples
386 D% dirs
d:\Nicki\samples


Control Structures

The more general way of connecting statements together is with control structures, which provide ways of describing conditional or iterative execution or even (with procedures) adding new vocabulary to the language. You can use a control structure anywhere a statement is allowed.

The language is completely recursive: control structures can be nested inside control structures, etc. A statement can be arbitrarily complex. Here’s an example timing a statement that turns out to be a for loop piped to a wc and inside the for loop ...


387 D% time for i = 1 to 3 do
388 D? time echo hello world | wc
389 D?      end | wc
        6       12      126
0:00:01.03


If Statement

The if statement comes in two forms. The short form is convenient if the choice is only between executing and not executing a single statement, which appears on the same line.


390 D% if (5 == 2 + 3) echo yes
yes
391 D% if (5 == 10) echo really
392 D% _

The longer form provides the more traditional if-then-else structure. Indentation is a matter of choice; it’s used in these examples merely to improve readability.


392 D% if (5 == 10) then
393 D?    echo 5 == 10
394 D? else
395 D?    echo 5 is not 10
396 D? end
5 is not 10
397 D% _


Switch Statement

The switch statement works by attempting to pattern match the switch value against a series of alternative cases. The switch and case values can all be arbitrary expressions. If any pattern match succeeds, execution begins with the next statement following and continues, skipping over any interspersed case clauses until either the end of the switch block or a break statement is reached.


397 D% switch ("hello world")
398 D?    case 5:
399 D?       echo hit 5
400 D?    case "h*":
401 D?       echo hit "h*"
402 D?    case "x*":
403 D?       echo hit "x*"
404 D?       break
405 D?    case 43.2:
406 D?       echo hit 43.2
407 D?    default:
408 D?       echo did not hit
409 D? end
hit h*
hit x*
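
The fall-through behavior can be modelled in a few lines. This is a hedged Python sketch of the rule as described above, not the shell's own code: run_switch and the tuple encoding of the switch body are hypothetical, only echo statements are modelled, and Python's fnmatchcase stands in for the shell's pattern matcher (whose case-sensitivity rules may differ).

```python
from fnmatch import fnmatchcase


def run_switch(value, body):
    """Return the echoed lines for a switch body given as a list of tuples.

    Each item is ("case", pattern), ("default",), ("echo", text) or ("break",).
    """
    # Find the first case whose pattern matches the switch value.
    start = None
    for index, item in enumerate(body):
        if item[0] == "case" and fnmatchcase(str(value), str(item[1])):
            start = index + 1
            break
    # With no matching case, fall back to the default clause if there is one.
    if start is None:
        for index, item in enumerate(body):
            if item[0] == "default":
                start = index + 1
                break
    if start is None:
        return []
    echoed = []
    for item in body[start:]:
        if item[0] == "break":
            break
        if item[0] == "echo":
            echoed.append(item[1])
        # Interspersed case and default clauses are skipped over.
    return echoed


BODY = [
    ("case", 5), ("echo", "hit 5"),
    ("case", "h*"), ("echo", "hit h*"),
    ("case", "x*"), ("echo", "hit x*"), ("break",),
    ("case", 43.2), ("echo", "hit 43.2"),
    ("default",), ("echo", "did not hit"),
]

if __name__ == "__main__":
    print(run_switch("hello world", BODY))
```

For the value "hello world", the model starts at the “h*” case, falls through the “x*” case and stops at the break, as in the transcript.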

The break statement used here causes execution to “break out of” the innermost control structure. If you’re nested several layers deep into control structures and want to break out of a higher level structure you can label the higher level structure and specify that name on the break statement.


Foreach Statement

The foreach statement is designed for iterating over a series of words. In this example, i iterates over the list of all the files in the samples directory. Each one, in turn, is tested to see if it’s executable (i.e., has a .csh, .cmd, .bat, .exe or .com extension or is a valid binary executable).


410 D% cd ~\samples
411 D% ls
args.c       dumpenv.c    finance.csh  myecho.exe   readme
args.exe     dumpenv.exe  makecpgm.csh rcode.c
bits.csh     factor.csh   myecho.c     rcode.exe
412 D% foreach i (*)
413 D?    if (-x $i) echo $i is executable
414 D? end
args.exe is executable
bits.csh is executable
dumpenv.exe is executable
factor.csh is executable
finance.csh is executable
makecpgm.csh is executable
myecho.exe is executable
rcode.exe is executable


For Statement

The for statement provides more traditional iteration over numerical values. If you specify a range (e.g., “1 to 3”) but don’t specify the increment, 1 is assumed. Although this example shows iteration over integer values, floating point values are equally acceptable.


415 D% for i = 1 to 3 do
416 D?    echo $i
417 D? end
1
2
3

You can also iterate over a list of ranges or individual values. The to and by clauses may be specified in either order.


418 D% for i = 1, 4, 7, 12, -4 to 6 by 3 do
419 D?    echo $i
420 D? end
1
4
7
12
-4
-1
2
5
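
The way a list of values and ranges expands can be sketched as follows. This is an illustrative Python model only: expand is a hypothetical helper, ranges are written as (start, stop) or (start, stop, step) tuples, and only positive increments are modelled because the text above does not say how a negative increment behaves.

```python
def expand(items):
    """Expand a for-style list of single values and ranges into the values visited."""
    values = []
    for item in items:
        if isinstance(item, tuple):
            start, stop, *rest = item
            step = rest[0] if rest else 1  # the increment defaults to 1
            if step <= 0:
                raise ValueError("this sketch only models positive increments")
            current = start
            while current <= stop:
                values.append(current)
                current += step
        else:
            values.append(item)
    return values


if __name__ == "__main__":
    # Corresponds to: for i = 1, 4, 7, 12, -4 to 6 by 3 do
    print(expand([1, 4, 7, 12, (-4, 6, 3)]))
```

The range “-4 to 6 by 3” stops at 5 because the next value, 8, would pass the upper bound.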


While Statement

The while statement works in the traditional manner, iterating so long as the while condition is true. This example keeps popping up through the various levels of parent directories until it reaches the root. fullpath is one of the built-in procedures; it returns the fully-qualified pathname of its argument. Notice that fullpath is invoked in three different ways: on line 421, as if it were a command, on 422 in more conventional procedure syntax and on 423, where it’s substituted in as if it were a variable.


421 D% fullpath .
d:\Nicki\samples
422 D% while (fullpath(".") !~ "[a-zA-Z]:\")
423 D?    echo $fullpath(".")
424 D?    cd ..
425 D? end
d:\Nicki\samples
d:\Nicki
426 D% cd
d:\


Repeat Statement

The repeat statement has two forms. In the short form, a numeric constant (not an expression) specifies the number of times to execute the statement following on the same line.


427 D% repeat 4 echo do this again
do this again
do this again
do this again
do this again

In the long form, repeat provides the more conventional repeat structure, iterating until some exit condition is satisfied.


428 D% calc i = 1
1
429 D% repeat
430 D?    calc i++
431 D? until (i > 5)
1
2
3
4
5


Procedures

Procedures, as in any high-level language, are a convenient way to package a series of statements together as a single operation. Once you’ve defined a procedure, you can invoke it simply as if it were a new command.


432 D% proc hello()
433 D?    echo hello world
434 D? end
435 D% hello
hello world

The proc statement can also be used to ask what procedures are already defined or what arguments a particular procedure takes:


436 D% proc hello
hello        ( )
437 D% proc | mi
abs          ( x )
acos         ( x )
asin         ( x )
:
:
samepath     ( a, b )
sin          ( x )
sinh         ( x )
--- more --- (Press H for Help)

You can explicitly discard a definition with unproc; otherwise the shell remembers any procedure you tell it until you exit the shell or give it a new definition.


438 D% unproc hello
439 D% hello
csh:  Couldn't find an executable file named 'hello'.

When you give the shell a procedure definition, the shell compiles it into an internal form so that the next time you refer to it, it’ll save the reparsing time and run much faster. As an example, unproc the whereis procedure to make the shell reload the definition from the .csh file and see what that does to the execution time:


440 D% unproc whereis
441 D% time whereis frizzle
d:\Nicki\bin\Frizzle.exe
0:00:02.15
442 D% !!
time whereis frizzle
d:\Nicki\bin\Frizzle.exe
0:00:01.28

The namespace for procedures is shared among all the threads: if one thread creates a new procedure, it becomes usable immediately by all the other threads.


Arguments

You can write a procedure so it expects arguments, just as you would in any other high level language. Argument names are somewhat like local variables: their initial values are set at entry to a procedure, hiding any previous definition; they go away as soon as you exit the procedure code. Here’s a simple example which compares the timestamps on two files.


443 D% proc comparedates(a, b)
444 D?    if (`newer $a $b`) then
445 D?       echo $a is newer than $b
446 D?    else
447 D?       if (samepath(a, b)) then
448 D?          echo $a and $b are the same file!
449 D?       else
450 D?          echo $a is older than $b
451 D?       end
452 D?    end
453 D? end
454 D% comparedates `whereis Frizzle`
d:\Nicki\bin\Frizzle.exe is newer than d:\Nicki\LastRTM\Frizzle.com
455 D% _

When you pass arguments to a procedure on the command line, the individual argument words are paired up, one-by-one, with the argument names you gave. If the shell runs out of names before it runs out of words, the last named argument gets all the remaining words:


455 D% proc xx(a, b)
456 D?   echo $#a $a
457 D?   echo $#b $b
458 D? end
459 D% xx now is the time
1 now
3 is the time

If you pass arguments to a procedure that doesn’t take any, they’re evaluated but quietly ignored.

If a procedure does take an argument, it always gets some value, even if it’s zero words long. So if you want to know if you got passed a value, just count the number of words:


460 D% proc xx(a)
461 D?    echo $#a ">>$a<<"
462 D?    if (a == "") echo null argument!
463 D? end
464 D% xx
0 >><<
null argument!
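
The pairing rules above (one word per name, the last name taking whatever is left, and every name getting a value even if it is zero words long) can be summarized in a small model. This is a Python sketch under those stated rules; bind_args is a hypothetical name and nothing here is taken from the shell's source.

```python
def bind_args(names, words):
    """Pair command-line words with argument names, returning name -> list of words."""
    bound = {name: [] for name in names}
    if not names:
        return bound  # a procedure with no argument names ignores the words
    # Every name except the last takes at most one word.
    for name, word in zip(names[:-1], words):
        bound[name] = [word]
    # The last name takes all the remaining words, possibly none.
    bound[names[-1]] = list(words[len(names) - 1:])
    return bound


if __name__ == "__main__":
    print(bind_args(["a", "b"], "now is the time".split()))
    print(bind_args(["a"], []))
```

The word counts in the model match the $#a and $#b values printed by the xx examples.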

In a more serious vein, here’s a simple procedure definition I use all the time (I have it in my startup.csh file) to implement a real quick and dirty (but very easy to use!) personal phone index:


465 D% proc ppi(name)
466 D?    grep -i "$name" h:\phone
467 D? end
468 D% ppi hamilton
Hamilton Laboratories  425-497-0102  Fax: 425-497-8336

As you add lines to your \phone file, you merely add any interesting search phrases or other tidbits onto the same line with the person’s name. Totally free format. Add anything you like and search on anything you like and it’s fast.


Return Values

Procedures are also important in expressions, where it’s generally useful to think of the procedure as returning a value, just as it might in any other language. The type and value of what you choose to return is arbitrary. Here’s a purely mathematical example from finance.csh in the samples directory:


469 D% proc FV_PresentAmount(i, n)
470 D?    # Calculate the multiplier to convert $1 now to a
471 D?    #    future value, given interest rate i
472 D?    return (1 + i/100)**n
473 D? end
474 D% # Calculate the future value of $500 invested
475 D% # for 10 years at 8% interest.
476 D% calc 500*FV_PresentAmount(8, 10)
1079.462499

If you call a procedure that returns a value as if it were a command, whatever it returns is printed:


477 D% FV_PresentAmount 8 10
2.158925
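
Both printed figures are consistent with ordinary compounding, that is, a multiplier of (1 + i/100)**n. As an independent check of the arithmetic, here is a short Python sketch; fv_multiplier is a hypothetical name, and the six-decimal formatting is chosen only to match the figures shown above.

```python
def fv_multiplier(rate_percent, years):
    """Multiplier that converts $1 now to its future value with yearly compounding."""
    return (1 + rate_percent / 100) ** years


if __name__ == "__main__":
    print(f"{fv_multiplier(8, 10):.6f}")        # multiplier for 8% over 10 years
    print(f"{500 * fv_multiplier(8, 10):.6f}")  # future value of $500
```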


Recursion

A procedure can call other procedures or even itself. When a procedure calls itself, it’s called recursion. Typical uses of recursion are in cases where the problem itself is recursive, or self-replicating. For example, here’s a procedure to walk down two directory trees A and B that are thought to be related and list any non-hidden files in A that are not in B. (If you set nonohidden = 1, it’ll compare hidden files also.)


478 D% proc comparetrees(a, b)
479 D?    local i, f
480 D?    foreach i ($a\*)
481 D?       @ f = $i:t
482 D?       if (! -e $b\$f) then
483 D?          echo $b\$f is missing
484 D?       else
485 D?          if (-d $i) comparetrees $i $b\$f
486 D?       end
487 D?    end
488 D? end
489 D% comparetrees c:\src\projectx a:\src

Notice that i and f were declared as local variables. If the variables were simply set variables, one instance of them would be shared by all the levels of recursion. In this particular example, that would still have worked, but only because each level calls the next only after anything involving f or i has been evaluated; it wouldn’t matter if f or i was trampled by the next call. Here’s an example where obviously that would not be true: a clumsy attempt at a “post-order” traversal of a directory tree:


490 D% proc traverse(a)   # Don't do it this way
491 D?    foreach i ($a\*)
492 D?      if (-d $i) traverse $i
493 D?      echo $i
494 D?    end
495 D? end
496 D% traverse . | more

If you carefully examine the output of this traverse, you’ll see that subdirectories don’t get listed properly: instead of being listed by themselves, the name of their last child is listed twice. For a correct result, try it again with i defined as a local variable. (Use the key to help you quickly re-enter the lines that stay the same.)
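
The effect of sharing the loop variable across levels of recursion is not specific to this shell. The following Python sketch reproduces it on a small in-memory tree; TREE, traverse_shared and traverse_local are hypothetical names, a nested dict stands for a directory and None for a file, and the model prints bare names rather than full paths.

```python
# A directory "a" containing files "x" and "y", followed by a file "b".
TREE = {"a": {"x": None, "y": None}, "b": None}


def traverse_shared(node, shared, out):
    """Post-order walk where every level of recursion shares one loop variable."""
    for name in node:
        shared["i"] = name
        if isinstance(node[name], dict):
            traverse_shared(node[name], shared, out)  # tramples shared["i"]
        out.append(shared["i"])
    return out


def traverse_local(node, out):
    """The same walk with a loop variable that is local to each call."""
    for name in node:
        if isinstance(node[name], dict):
            traverse_local(node[name], out)
        out.append(name)
    return out


if __name__ == "__main__":
    print(traverse_shared(TREE, {}, []))
    print(traverse_local(TREE, []))
```

In the shared version the directory "a" is never listed; the name of its last child, "y", appears twice instead, which is the symptom described above.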


Calling a Procedure

As you may have spotted, there are two ways to invoke a procedure. Sometimes, the arguments are inside parentheses, separated by commas, and sometimes they’re not. What’s the difference?

The difference is whether the context is an expression or a command. As discussed when we first introduced expressions, the shell always begins to parse statements by first breaking them up into words. That’s fine for normal commands, e.g., running an external utility. And it works also when you want to use a procedure as if it were a command, just typing the name of the procedure followed by a list of arguments separated by spaces, e.g.,


497 D% proc power(a, b)
498 D?    return a**b
499 D? end
500 D% power 2 3
8
501 D% _

But this style of parsing wouldn’t be very suitable in those instances where the point is to do some kind of calculation or expression evaluation. So when the shell encounters something that normally takes an expression, e.g., following the calc keyword, or inside the test in an if statement, it shifts to a different style of parsing, further breaking up the words into tokens, so that “*” isn’t misunderstood as a wildcard, so we don’t need to type spaces around all the operators, so we can type variable names without having to put a “$” in front of them and so on. All of this is so that the rules for typing an expression can bear some resemblance to those followed by other programming languages like C, FORTRAN, Pascal, etc.

When we call a procedure from within an expression, all the same considerations still apply. We want it to act pretty much like any other high level language. We want to be able to pass it arbitrarily complex expressions as arguments. We want to be able to take the value it returns and use that value as a term in still other expressions.

So there’s a real problem: to call a procedure from within an expression and pass other expressions as arguments, we need a way of separating one argument from the next (obviously, it can’t be just a space as it would be when the procedure is used as if it were a command) and for separating the whole procedure call and its arguments from the rest of the expression. That’s why the common high-level language convention of separating arguments by commas and putting parentheses around the whole list is used. Here’s an example of what that looks like:


501 D% calc 5.5 + power(2, 3)*9
77.500000

If you try using a procedure as a command but accidentally type the argument list with parentheses, it’s an error:


502 D% power(2, 3)
csh(line 498):  Couldn't evaluate expression operands as numeric
as required by the expression operator.
> in power( "(", "2,", "3", ")" ) defined at line 497
< called from line 502

This is an error because, since it was typed as a command, the shell took the words following the word power as literal arguments. It couldn’t tell you meant this as an expression. Let’s redefine that procedure, putting some echo statements in there so we can see what happened:


503 D% proc power(a, b)
504 D?   echo a is $a
505 D?   echo b is $b
506 D?   return a**b
507 D? end
508 D% power(2, 3)
a is (
b is 2, 3 )
csh(line 506):  Couldn't evaluate expression operands as numeric
as required by the expression operator.
> in power( "(", "2,", "3", ")" ) defined at line 503
< called from line 508

As you can see, the expression “a**b” failed to evaluate properly because a was set to the first argument word, “(”, and b was set to a string concatenation of all the rest of the words. Neither was a number. If you want to call a procedure and substitute the value back onto the command line even when the context is not an expression, it can be done, however. One way is with command substitution:


509 D% echo `power 2 3`
a is 2 b is 3 8

This is a bit expensive, though, because the shell will have to create a new thread to run the power procedure and set up a pipe to read the result. And as you see, if the procedure also writes to stdout, you’ll pick up that text also, probably unintentionally. Another, better way, is to use a dollar sign to introduce the substitution just as if it was a variable substitution:


510 D% echo $power(2, 3)
a is 2
b is 3
8

Notice that when you use the dollar sign-style procedure reference, the rest of the syntax is as if the procedure had been called from within an expression. The arguments do need to be within parentheses and they do need to be separated by commas. The reason is just the same one as for why a procedure call in an expression has to be done this way: without the parentheses, there’d be no way to tell where the arguments ended. A nice benefit is that in the argument list, we get to use the full expression grammar:


511 D% echo $power(2, 3*sin(1/2))
a is 2
b is 1.438277
2.709970


Shell Scripts

Scripts are a final way of bundling up a series of statements to be called up and executed as a single command. To create a script, create a file with a .csh extension:


512 D% cat >trythis.csh
echo hello from trythis
^Z
513 D% trythis
hello from trythis

When you tell the shell to run a script, it first creates a new thread to run it. This is partly a holdover from the original UNIX language definition, partly a response to the fact that Windows provides threads but not a fork mechanism, and partly due to a genuine need to inexpensively separate some of the script’s environment from that of its caller. (The next chapter has a longer discussion of threads.)


Shell Script Arguments

Arguments to a shell script are passed to it as the argv variable. argv will be a list of any words that appeared on the command line following the name of the shell script. (You can access the name of the script as the scriptname variable.) You can access argv like any other variable:


514 D% cat >tryargv.csh
echo $#argv $argv
^Z
515 D% tryargv hello how are you
4 hello how are you

There are also some shorthand forms for getting individual words of argv. $0 through $9 are the same as $argv[0] through $argv[9]. (Remember that unless you have nullwords set, subscripting errors will be caught.)


ignorestatus

If you write a script with serially connected statements the only thing that would cause the shell to quit before it gets to the end would be an explicit failure: an application name that couldn’t be found, a child process that terminated with a segmentation fault, or something else of an equally serious nature. Often in a script, that’s not what you want: you’ve written the script with the expectation that everything will work (as you planned) from one step to the next. If something is wrong, you’d like the script to quit as soon as possible, before any damage is done.

The way you do this is by setting ignorestatus = 0, which means you do not want to ignore the status codes coming back to this thread from its children. Here’s an example in the main thread:


516 D% set ignorestatus = 0
517 D% rcode 10
10
csh:  The child process running 'rcode' exited with a non-zero status = 10.

In the main thread, the shell will keep on going and prompt for the next command because interactively that’s most sensible. The shell knows to do this because ignoreerrors = 1. But in a script, errors cause the shell to quit:


518 D% cat >trythis.csh
calc ignoreerrors
set ignorestatus = 0
rcode 10
echo doesn^'t print
^Z
519 D% trythis
0
10
csh(d:\Nicki\trythis.csh:line 3): The child process running
'rcode' exited with a non-zero status = 10.
> in d:\Nicki\trythis.csh
< called from line 519
csh:  The csh script file 'd:\Nicki\samples\trythis.csh'
exited with a non-zero status = 10.

Notice that in this case we got two messages, one from the thread executing the script and one from the main thread, reporting what the script returned. Let’s return to the normal mode of ignoring status:


520 D% set ignorestatus = 1


source statement

The examples so far have shown how a script is normally run somewhat isolated in a separate thread. It is also possible to run a script in your current thread using the source statement. You might want to do this if you wanted the script to change your current thread’s private variables or its current directories or disk. Here’s an example showing how a sourced script runs in the same thread:


521 D% cat >trythis.csh
echo argv = $argv, threadid = $threadid
^Z
522 D% echo $threadid
6
523 D% trythis hello world
argv = hello world, threadid = 7
524 D% source trythis hello world
argv = hello world, threadid = 6
526 D% _

Notice how the argv argument vector is set up the same in either case. Also, notice that the statement number skipped by one. When you source a script, the effect is precisely as if you typed those lines in directly to the shell. The lines read by source are even entered into the history list:


526 D% h 5
    522  echo $threadid
    523  trythis hello world
    524  source trythis hello world
    525  echo argv = $argv, threadid = $threadid
    526  h 5


Caution: Labels and Gotos

We haven’t mentioned labels and gotos yet but it probably isn’t a surprise that the C shell allows them. Indeed:


527 D% cat >trythis.csh
goto next
echo this does not print
next: echo this prints
^Z
528 D% trythis
this prints

If you want to use gotos to labels, you should be aware that forward references can be a little trickier than in a more conventional compiled language. The C shell allows you to redefine a label anytime you like. But if you type a goto that refers to a previously defined label, the shell has no way of knowing that you intend to redefine it up ahead. You can keep running the last example over and over this way with exactly the same result: because a new thread is started each time with no prior definition of next, the shell knows it must be a forward reference. But imagine how repeatedly sourcing this script would fail in an infinite loop:


% source trythis
this prints
% source trythis
this prints
this prints
this prints
this prints
this prints
:

(Beware of actually trying this: you may find it difficult to interrupt out of it.)

The reason sourcing the script a second time turns into an infinite loop is that the label next is already defined after the first run. The second time, when the goto is read from the script, the history list would look something like this:


source trythis
goto next
echo this does not print
next: echo this prints
source trythis
goto next

What particularly gets the shell into a muddle is the way this recurses indefinitely: each time through the loop, it recurses through another level of sourcing. Ultimately, it runs out of stack space and fails. This is not a nice way to treat the shell!

In general, it’s hard to recommend gotos in any programming language nowadays; in a script you intend to run using source, they can be particularly nasty.

The shell does automatically age labels and throw them away after a while even if they haven’t been redefined. When it discards a label, it also discards any compiled statements it’s been holding onto that could have been executed only by a goto to that label. The cutoff point where the shell begins to discard labels is set by the gotowindow variable. Let’s now clean up after ourselves and move along:


529 D% rm trythis.csh


Interrupts

Normally, when you type Ctrl-C, you interrupt the foreground activity. But what if you were in the midst of a complex script and needed to do some kind of cleanup before you exited? What if you wanted to be sure you had a chance to delete any temporary files you might have littered around?

The solution is the onintr statement, which allows you to define the action to be taken when an interrupt is received. It causes whatever’s running to be interrupted all the way back up to the block in which the onintr routine was defined and for the interrupt routine to be run in that current thread. Within that interrupt routine, you could, for example, remove all your temporary files and goto the end of the script or return a special value from a procedure or whatever else might be appropriate.


530 D% onintr echo hello
531 D% for i = 1 to 5 do
532 D?    echo $i
533 D?    sleep 1
534 D? end
1
^C
hello

Here’s another example, returning from a procedure. Note how the value returned (and printed) is the one produced by the onintr statement.


535 D% proc foobar()
536 D?    onintr return 5
537 D?    for i = 1 to 5 do
538 D?        echo $i
539 D?        sleep 1
540 D?    end
541 D?    return 2
542 D? end
543 D% foobar
1
^C
5

When execution leaves the block in which an onintr is defined, the previous onintr (if any) again takes effect. Note that a null onintr routine does not mean that interrupts are ignored, merely that after processing bubbles back up to the level where that onintr was defined, execution continues with the next statement. In the next example, the Ctrl-C arrives while execution is stuck in the infinite loop inside bar. Notice that the “onintr goto xx” causes a branch to the xx in the block where the onintr was defined, not the xx in the block where execution was going on. Also, notice that once both procedures have been exited, we’re back to the same onintr routine we defined a few statements earlier.


544 D% proc foo()
545 D?    onintr goto xx
546 D?    bar
547 D?   xx:  echo this is foo
548 D? end
549 D% proc bar()
550 D?    while (1)   # Deliberately infinite loop
551 D?    end
552 D?   xx:  echo this is bar
553 D? end
554 D% foo
^C
this is foo
555 D% ^C
hello
555 D% _


Masking Interrupts

In cases where you’d like to simply turn off interrupts or defer processing them, use the irqmask variable. By default, it’s set to 0, meaning interrupts will be accepted immediately. Setting it to 1 means interrupts will be deferred until the mask is cleared again. Setting it to 2 means interrupts will be totally ignored.

irqmask is a per-thread variable, meaning each thread can independently decide how it will respond to interrupts. Each new thread always starts out with irqmask = 0 (interrupts enabled).
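
The three mask settings can be pictured with a small state model. This Python sketch is an assumption-laden illustration, not the shell's mechanism: InterruptModel and its methods are hypothetical, it assumes that several interrupts deferred under mask 1 are delivered as a single interrupt when the mask returns to 0, and it does not model what happens to a deferred interrupt if the mask changes from 1 to 2.

```python
class InterruptModel:
    """Toy model of one thread's irqmask: 0 = accept, 1 = defer, 2 = ignore."""

    def __init__(self):
        self.irqmask = 0      # each new thread starts with interrupts enabled
        self.pending = False  # an interrupt arrived while the mask was 1
        self.handled = 0      # number of interrupts actually delivered

    def interrupt(self):
        """Simulate a Ctrl-C arriving."""
        if self.irqmask == 0:
            self.handled += 1
        elif self.irqmask == 1:
            self.pending = True
        # With irqmask == 2 the interrupt is dropped entirely.

    def set_irqmask(self, value):
        """Change the mask; clearing it delivers a deferred interrupt."""
        if value not in (0, 1, 2):
            raise ValueError("irqmask must be 0, 1 or 2")
        self.irqmask = value
        if value == 0 and self.pending:
            self.pending = False
            self.handled += 1


if __name__ == "__main__":
    thread = InterruptModel()
    thread.set_irqmask(1)
    thread.interrupt()      # deferred, nothing delivered yet
    thread.set_irqmask(0)   # mask cleared, deferred interrupt delivered
    print(thread.handled)
```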




Copyright © 1988-2003 by Hamilton Laboratories. All rights reserved.