
Wildcarding

The notion of wildcarding is pretty simple: the user gives just a few characters describing the filename he’s looking for and the system fills in the rest. With “vanilla” Windows, wildcarding is the responsibility of each application, based on the command-line arguments it’s given. Typically, the application designer fulfills this by linking in a library routine that does simple-minded, half-hearted wildcarding.

Hamilton C shell does the wildcarding before invoking the application. The shell’s wildcarding includes five components: home directory expansion, wildcarding characters, ranges, alternation and indefinite directories. A powerful recursive match algorithm is employed to guarantee a sensible result no matter how complex the pattern.


Home Directory Expansion

The tilde character, “~”, is recognized as shorthand for the home directory. In the simplest form, we can use it just by itself:


199 D% echo $home
d:\Nicki
200 D% cd ~
201 D% cd
d:\Nicki

There’s also shorthand for children or siblings of the home directory:


202 D% cd ~\samples
203 D% cd
d:\Nicki\Samples
204 D% cd ~Jeff
205 D% cd
d:\Jeff
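
The expansion rule behind these examples can be sketched in a few lines of Python. This is only an illustration of the behavior shown above, not the shell's own code; expand_tilde is a hypothetical helper, and the home directory is passed in explicitly rather than read from $home.

```python
import ntpath  # Windows path rules, usable on any platform


def expand_tilde(word, home):
    """Expand a leading "~" the way the examples above behave (hypothetical helper)."""
    if not word.startswith("~"):
        return word
    rest = word[1:]
    if rest == "" or rest[0] in "\\/":
        # "~" or "~\child": the home directory itself, or a child of it.
        return home + rest
    # "~name": a sibling of the home directory, i.e. name under home's parent.
    return ntpath.join(ntpath.dirname(home), rest)


print(expand_tilde("~", r"d:\Nicki"))           # d:\Nicki
print(expand_tilde(r"~\samples", r"d:\Nicki"))  # d:\Nicki\samples
print(expand_tilde("~Jeff", r"d:\Nicki"))       # d:\Jeff
```

Unlike the shell, the sketch never looks at the disk, so it cannot report the directory's actual capitalization the way cd does.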


Wildcard Characters

The wildcard characters, “*” and “?”, provide shorthand for “match any string” and “match any single character,” respectively.

Suppose the home directory contained the following contents:


206 D% cd ~
207 D% ls
bcs         mandel      sh          ex.rc       release.csh
bix         mba         testcode    icon.ico    ring.ico
channel.one online      util        login.csh   snapshot.csh
dial        postscpt    word        mail        startup.csh
excel       regressn    backup.csh  Ozzie.jpg   vi.ini
games       resume      brite.csh   popup.txt
icon        samples     class.txt   prime.c

The following example shows the use of “?” to match any single character. Wildcard results are always shown alphabetically in lower case. No distinction is made between directories and files.


208 D% echo ????
dial icon mail util word
209 D% echo b??
bcs bix

The “*” can match zero or more arbitrary characters except “:” or “\”; in contrast to DOS-style wildcarding, “*” can match “.”. If there are ordinary characters in the pattern, they must also be matched.


210 D% echo *mp*e*
samples
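
To make the rules for “?” and “*” concrete, here is a minimal recursive matcher in Python. It's a sketch under stated assumptions, not the shell's actual algorithm: wild_match is a hypothetical name, matching is assumed to ignore case, and “?” is assumed to refuse “:” and “\” just as “*” does.

```python
def wild_match(pattern, name):
    """Recursive match for "?" and "*" only (illustrative, not the shell's code)."""
    if not pattern:
        return not name
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # Try every possible length for "*", never consuming ":" or "\".
        for i in range(len(name) + 1):
            if wild_match(rest, name[i:]):
                return True
            if i < len(name) and name[i] in ":\\":
                break
        return False
    if not name:
        return False
    if head == "?":
        # Assumption: "?" obeys the same ":" and "\" restriction as "*".
        return name[0] not in ":\\" and wild_match(rest, name[1:])
    # Ordinary characters must match; case is ignored (assumption).
    return head.lower() == name[0].lower() and wild_match(rest, name[1:])


names = ["bcs", "bix", "dial", "icon", "mail", "samples", "util", "vi.ini", "word"]
print([n for n in names if wild_match("????", n)])    # ['dial', 'icon', 'mail', 'util', 'word']
print([n for n in names if wild_match("*mp*e*", n)])  # ['samples']
print(wild_match("*", "vi.ini"))                      # True: "*" can match "."
```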

Because the wildcarding is done before the command is invoked (without the command even being aware), wildcarding can even be done on a cd command:


211 D% cd !$
cd *mp*e*
212 D% cd
d:\Nicki\samples

Wildcarding is most emphatically not restricted to matches only against a single directory level. Here’s an example that wildcards across all the subdirectories, looking for .c files that begin with “a”.


213 D% cd ..
214 D% echo *\a*.c
samples\args.c sh\allocate.c

Wildcarding can even be done against driveletters. For example:


215 D% echo *:\*\q*
i:\mail\quotes.doc i:\tmp\query.out j:\nicki\quantity.disc

When wildcarding against driveletters, the shell restricts the set of drives it will search down to just those specified by the DRIVEMASK environment variable. If you don’t specify a DRIVEMASK, the default is all drives except the floppies a: and b:. The search is restricted so you don’t waste time trying to access slow removable media that may not even be ready.


Ranges

A range describes a set of characters, any one of which will be matched. It’s specified as a list of acceptable characters inside “[...]” brackets. The range “[be]” means either “b” or “e”; “[b-e]” is shorthand for any character in the sequence “b” through “e”. Within the brackets, any number of hyphenated sequences and single characters can be pasted one after the other in any order. For example, “[a-cu-zgkmp]” is a perfectly legal range. Here are a couple of examples. Notice that ranges can also be used with driveletters.


216 D% echo [be]*
backup.csh bcs bix brite.csh ex.rc excel
217 D% echo [d-g]:\[s-t]*
d:\taxes d:\tmp e:\spool e:\startup.cmd e:\temp e:\toolkit.sys
f:\swap f:\tmp f:\toys g:\skip g:\temp g:\tmp

An exclusion range is written as a set of characters inside the brackets that starts with a circumflex. It’ll match any single character not in the range.


218 D% echo [^a-t]*
util vi.ini word
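
The way the text between the brackets is read can be sketched as follows. range_matches is a hypothetical helper that takes the bracket contents without the brackets themselves; ignoring case is an assumption, and this is not the shell's own implementation.

```python
def range_matches(spec, ch):
    """Test one character against the text between "[" and "]" (hypothetical helper)."""
    negate = spec.startswith("^")
    if negate:
        spec = spec[1:]
    ch = ch.lower()
    found = False
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "-":
            # Hyphenated sequence such as "b-e".
            found = found or spec[i].lower() <= ch <= spec[i + 2].lower()
            i += 3
        else:
            # Single character.
            found = found or spec[i].lower() == ch
            i += 1
    # An exclusion range inverts the result.
    return found != negate


names = ["backup.csh", "bcs", "excel", "games", "util", "vi.ini", "word"]
print([n for n in names if range_matches("be", n[0])])    # ['backup.csh', 'bcs', 'excel']
print([n for n in names if range_matches("^a-t", n[0])])  # ['util', 'vi.ini', 'word']
```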


Alternation

Alternation, specified with “{...}” braces, is a shorthand way of specifying that all the combinations of frontparts and backparts should be generated. There isn’t any requirement that the filenames constructed actually exist.


219 D% echo {zork,gadzooks}.csh
zork.csh gadzooks.csh
220 D% echo {a,b}{c,d}{e,f}
ace acf ade adf bce bcf bde bdf
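
Since alternation is pure text generation, it's easy to model. The Python sketch below expands the first brace group and recurses on each result; expand_braces is a hypothetical name, and nested braces are deliberately not handled.

```python
def expand_braces(word):
    """Expand the first "{...}" group, then recurse on each result (no nesting)."""
    start = word.find("{")
    end = word.find("}", start + 1)
    if start == -1 or end == -1:
        return [word]
    front, back = word[:start], word[end + 1:]
    results = []
    for choice in word[start + 1:end].split(","):
        # Every frontpart is combined with every backpart, in the order written.
        results.extend(expand_braces(front + choice + back))
    return results


print(expand_braces("{zork,gadzooks}.csh"))  # ['zork.csh', 'gadzooks.csh']
print(expand_braces("{a,b}{c,d}{e,f}"))
# ['ace', 'acf', 'ade', 'adf', 'bce', 'bcf', 'bde', 'bdf']
```

Note that nothing here consults the filesystem, which is why the generated names need not exist, and that the results keep the order in which the alternatives were written rather than being sorted.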

Alternation can be combined arbitrarily with the other wildcard constructs:


221 D% echo {[bc],*r}*i*
bix brite.csh brite.csh ring.ico


Indefinite Directories

The ellipsis, “...”, is an indefinite directory wildcard. It’ll match zero or more arbitrary directory levels -- whatever it takes to make the rest of the wildcard match. To be recognized as a wildcard, the context must indicate it’s really a filename, i.e., it must be preceded by “\”, “/”, “~” or “:” or followed by “\” or “/”. For example, to find every file whose name begins with “xm” anywhere on the d: drive, one might type:


222 D% ls d:\...\xm*
d:\Nicki\book\XmlConcepts.doc

As with all the wildcard constructs, the indefinite directory construct can be used completely arbitrarily. It can even be used several times in the same wildcard. But do notice if you do that, there is a possibility of getting the same file listed more than once:


223 D% ls f:\...\a*\...\s*
f:\u\Andrew\Accounts\Secret    f:\u\Andrew\Accounts\Secret

This can happen if there’s more than one possible way to match the same pathname. In this example, the “a*” part could match either “Andrew” or “Accounts”, with the first “...” matching either “u\Andrew” or “u” and the second “...” matching either “Accounts” or just zero levels.
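
The duplicate can be reproduced by counting how many ways the pattern lines up with the pathname. In this Python sketch the path and pattern are already split into components, count_matches is a hypothetical helper, and the standard library's fnmatchcase stands in for the shell's per-component matching (case is assumed to be ignored).

```python
from fnmatch import fnmatchcase


def count_matches(pattern, path):
    """Count the ways pattern components can line up with path components."""
    if not pattern:
        return 1 if not path else 0
    head, rest = pattern[0], pattern[1:]
    if head == "...":
        # "..." may absorb zero or more leading directory levels.
        return sum(count_matches(rest, path[i:]) for i in range(len(path) + 1))
    if path and fnmatchcase(path[0].lower(), head.lower()):
        return count_matches(rest, path[1:])
    return 0


path = ["u", "Andrew", "Accounts", "Secret"]
print(count_matches(["...", "a*", "...", "s*"], path))  # 2
print(count_matches(["...", "s*"], path))               # 1
```

Two alignments mean the same file is reported twice; with only one “...” in the pattern there is a single alignment and a single listing.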


Match Failures

When you specify a sequence of wildcard patterns and none of them match, it’s normally treated as an error. In this example, the first command causes an error because there’s no file or directory name with a “z” in it. The second command executes without error because, out of the sequence of patterns, there’s at least one match.


224 D% echo *z*
csh:  Wildcarding failed to produce any matches.  To suppress
this error, set nonomatch = 1 (pass through) or 2 (discard).
225 D% echo *z* sa*
samples

In this context, the fact that alternation caused something to be generated is not the same as a match. In the next example, “{zork,gadzooks,*z*}.csh” is the same as “zork.csh gadzooks.csh *z*.csh”; only the last element involves any matching, and it fails.


226 D% echo {zork,gadzooks,*z*}.csh
csh:  Wildcarding failed to produce any matches.  To suppress
this error, set nonomatch = 1 (pass through) or 2 (discard).

The nonomatch variable lets you control how a wildcard failure is treated. It works just the way nonovar works when you reference a non-existent variable.


227 D% set nonomatch = 1
228 D% echo *z*
*z*
229 D% !s:s/1/2/
set nonomatch = 2
230 D% !e
echo *z*

231 D% !s:s/2/0/
set nonomatch = 0
232 D% !e
echo *z*
csh:  Wildcarding failed to produce any matches.  To suppress
this error, set nonomatch = 1 (pass through) or 2 (discard).
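
The three settings amount to a small policy decision made after all the patterns have been tried. The Python sketch below models only what the transcripts above show; expand_all and NoMatchError are hypothetical names, the candidate names are passed in rather than read from disk, and fnmatchcase stands in for the shell's matcher.

```python
from fnmatch import fnmatchcase


class NoMatchError(Exception):
    """Raised when no pattern in the sequence matches (hypothetical name)."""


def expand_all(patterns, candidates, nonomatch=0):
    """Expand a sequence of wildcard patterns under a given nonomatch setting."""
    matches = []
    for pattern in patterns:
        matches.extend(sorted(
            n.lower() for n in candidates
            if fnmatchcase(n.lower(), pattern.lower())))
    if matches:
        # At least one pattern matched, so the sequence as a whole succeeds.
        return matches
    if nonomatch == 1:
        return list(patterns)  # pass the patterns through unchanged
    if nonomatch == 2:
        return []              # discard them
    raise NoMatchError("Wildcarding failed to produce any matches.")


names = ["mail", "samples", "sh"]
print(expand_all(["*z*", "sa*"], names))        # ['samples']
print(expand_all(["*z*"], names, nonomatch=1))  # ['*z*']
print(expand_all(["*z*"], names, nonomatch=2))  # []
```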


Caution: The copy, xcopy, rename and del commands

Hamilton C shell expands out wildcards before it invokes the application you name. This is not what the copy, xcopy, rename, and del commands expect! Suppose there are two files, file.a and file.b on your diskette a:, that you wanted to copy to your current drive. Under cmd.exe, it would be natural to type:


[D:\NICKI] xcopy.exe a:*.*
Source files are being read...

A:FILE.A
A:FILE.B

2 file(s) copied.

The destination is implicit. xcopy understands the wildcarding to mean “copy everything on drive a: to the current disk and directory.” That is not what would happen under the C shell! Because the wildcard would be expanded first, it would act instead as if you had typed:


[D:\NICKI] xcopy.exe a:file.a a:file.b
Source files are being read...

A:FILE.A

1 file(s) copied.

Do you see what happens? If wildcarding is done first, the xcopy command sees just the two filenames and figures you mean to copy one right over the other. file.b is lost! For this reason, the normal startup.csh file contains some carefully constructed aliases and procedures to intercept the copy, xcopy, rename and del commands:


proc safecopy(files)
   cmd /c copy $files; @ nowild = s; unlocal s
end
alias copy   (local s; @ s = nowild; @ nowild = 1; safecopy)

proc safexcopy(files)
   xcopy.exe $files; @ nowild = s; unlocal s
end
alias xcopy  (local s; @ s = nowild; @ nowild = 1; safexcopy)

proc saferename(files)
   cmd /c rename $files; @ nowild = s; unlocal s
end
alias rename (local s; @ s = nowild; @ nowild = 1; saferename)
alias ren    rename

proc safedel(files)
   cmd /c del $files; @ nowild = s; unlocal s
end
alias del      (local s; @ s = nowild; @ nowild = 1; safedel)
alias erase    del

This works by saving the current value of nowild (which tells whether wildcarding should be turned off), turning off wildcarding, invoking the copy, xcopy, rename or del command, then restoring the wildcarding state. s is a temporary variable that gets discarded after it’s been used.
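
For readers more at home in a general-purpose language, the same save, disable, restore pattern can be sketched in Python. Here shell_state is a hypothetical stand-in for the shell's variables and wildcarding_disabled is a hypothetical name; nothing below is part of Hamilton C shell.

```python
from contextlib import contextmanager

# Hypothetical stand-in for the shell's nowild variable (0 = wildcarding on).
shell_state = {"nowild": 0}


@contextmanager
def wildcarding_disabled(state):
    """Save nowild, turn wildcarding off, and restore the saved value afterwards."""
    saved = state["nowild"]      # like: @ s = nowild
    state["nowild"] = 1          # like: @ nowild = 1
    try:
        yield
    finally:
        state["nowild"] = saved  # like: @ nowild = s


with wildcarding_disabled(shell_state):
    # A command that does its own wildcarding would be invoked here.
    print(shell_state["nowild"])  # 1
print(shell_state["nowild"])      # 0
```

Restoring the saved value, rather than simply setting nowild back to 0, means the technique still works for a user who had already turned wildcarding off.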

Be sure to always invoke copy, xcopy, rename and del via these aliases. If you encounter other applications that really must do their own wildcarding, use this same technique with them.




Copyright © 1988-2003 by Hamilton Laboratories. All rights reserved.