catm (Concatenate, merge, and expand files)

Usage:
catm [-f fname] [-s char] [-e end-of-page-text] listfile file2 file3 >tempfile
or
catm [-f fname] [-s char] [-e end-of-page-text] listfile -o file-once -r file3 file4  >tempfile
or
catm [-f fname] [-s char] [-e end-of-page-text] listfile -o file-once -r file3 file4  -o file2-once >tempfile

"catm" reads the first file ("listfile"), which is the field values file, and repeatedly copies the remaining files to standard output, performing substitutions based on the information read from it. The field values file contains the names of "fields" and sets of values for them. The files being copied are searched for references to field names in <> brackets, and each such reference is replaced by the current value of that field as determined from the field values file.

This output is most commonly redirected to a file or a pipe.

The "-f" option, which must precede the "listfile", indicates that the field-names are defined in a separate file. This allows the "listfile" to contain only field values, without needing the field names to be defined at the top with the first set of fields. This file can also contain the "END=" directive described below, further enabling field content to be completely separated from "catm" processing instructions. Note that "END" is a reserved field name.

The "-s" option allows the field separator character to be changed from the Tab character. This has an effect only if the field values file is organized with multiple field values per line.

The "-e" option allows the value of an "END=" directive to be specified on the command-line. If field names are given as the first line of "listfile", then this option removes any need for a separate file containing this directive. If the END text contains blanks, it will need to be given in quotes. The default end text could be given with
      -e "%NP%%PS%\nQSdict begin Init ToTop end PE\n"

While copying the files:

  1. If the text string "%include file-name%" is encountered, the file named will be included in the output at that point. If the text string %DATE% is encountered, the current date will be substituted. (Include processing).
  2. If the text string "<field-name>" is encountered, then it will be replaced by the current value associated with the name field-name found from the first file (listfile) in the above examples. (Field processing).
  3. If the text %DM,name:string:% is encountered, it will be taken as defining a macro. Wherever %M,name% is subsequently found, the text in string will be substituted. (Macro processing).

COMMAND ARGUMENT PROCESSING:
The list of arguments is interpreted as filenames, except for the flags "-o" meaning output once, and "-r" meaning output repeatedly.

The first file is read and interpreted as a set of field definitions that will be applied to other files (the ones to be repeatedly output).

If a "-o" is encountered after the first file, it indicates that the following files are to be output directly with only "include" processing and macro processing being performed.

If a "-r" is encountered after the first file, it indicates that the following files are to be output repeatedly, in turn using each set of field values from the first file. That is, a set of field values is found from the first file. All files following the "-r" are output using this set of field values. Then the next set of field values is read and the process repeated. On each occasion, full "include" and macro processing is performed.

If neither "-o" nor "-r" is present, all files after the first are repeatedly output with field, macro and include processing.
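
As a rough illustration of the rules above, the following Python sketch groups the file arguments in the order in which they would be output. It is not taken from the catm source: the function name split_arguments and the (mode, files) grouping are hypothetical, and it assumes that every "-f", "-s" and "-e" option precedes the list file, as in the usage lines.

```python
def split_arguments(args):
    """Split a catm-style argument list (without the program name).

    Returns (options, listfile, groups), where groups is an ordered list
    of (mode, [filenames]) pairs and mode is "once" or "repeat".
    Hypothetical helper for illustration only.
    """
    options = {}
    i = 0
    # Assumption: options that take a value all come before the list file.
    while i < len(args) and args[i] in ("-f", "-s", "-e"):
        options[args[i]] = args[i + 1]
        i += 2
    listfile = args[i]

    groups = []
    mode = "repeat"  # with no flag, files are output repeatedly
    for arg in args[i + 1:]:
        if arg == "-o":
            mode = "once"
        elif arg == "-r":
            mode = "repeat"
        elif groups and groups[-1][0] == mode:
            groups[-1][1].append(arg)
        else:
            groups.append((mode, [arg]))
    return options, listfile, groups


if __name__ == "__main__":
    print(split_arguments(["fields3", "-o", "Qs", "styles", "-r", "letter"]))
```

Keeping the groups in order matters because a "-o" group may appear both before and after a "-r" group, as in the third usage line.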

FIELD PROCESSING
Files for repeated processing are read in turn and copied to standard output.

If a string <text> is encountered in the file, where text is a field-name defined in the first file or in a "-f" field-name file, the value of the field is substituted for the string.

This process of substituting and copying all the remaining files is repeated for each of the groups of field definitions in the first file.
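
The repetition described above can be pictured with a short Python sketch. It is only an approximation under stated assumptions: the function name merge is hypothetical, the templates are passed in as strings rather than read from files, and the end-of-page text is assumed to be written after every copy, including the last one.

```python
# Default end text as quoted for the "-e" option in this document.
DEFAULT_END = "%NP%%PS%\nQSdict begin Init ToTop end PE\n"


def merge(field_sets, templates, end_text=DEFAULT_END):
    """Copy every template once per set of field values.

    field_sets: list of dicts mapping field name -> value.
    templates:  list of strings (the files to be output repeatedly).
    Hypothetical helper; macro and include processing are not modelled.
    """
    pieces = []
    for values in field_sets:
        for template in templates:
            text = template
            for name, value in values.items():
                text = text.replace("<" + name + ">", value)
            pieces.append(text)
        # Assumption: the end text follows each complete copy.
        pieces.append(end_text)
    return "".join(pieces)


if __name__ == "__main__":
    people = [{"PERS": "Henry"}, {"PERS": "Rosemary"}]
    print(merge(people, ["Dear <PERS>\n"]))
```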

The simplest way to set up the field values file is to provide the names of the fields, Tab separated, on the first line. Each subsequent line then contains the values of the fields for one version of the document being generated.

If the "-f" option is used, then the file specified after "-f" will be read to find the field names, rather than expecting them at the head of the field values file. The field names will be separated by tabs or new-line characters. There may also be an END= directive. The field values file will then contain only values for the fields, tab separated, in the same order as the field names.
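
A field-names file of this kind could be read along the following lines. This Python sketch is illustrative only: the function name read_field_names is hypothetical, and it assumes that an END= directive occupies a line of its own.

```python
def read_field_names(text):
    """Return (names, end_text) from the contents of a field-names file.

    Names may be separated by tabs or new-lines. end_text is None when
    no END= directive is present. Hypothetical helper for illustration.
    """
    names = []
    end_text = None
    for line in text.splitlines():
        # Assumption: the END= directive is on a line by itself.
        if line.startswith("END="):
            end_text = line[len("END="):]
            continue
        for part in line.split("\t"):
            if part.strip():
                names.append(part.strip())
    return names, end_text


if __name__ == "__main__":
    sample = "NAME\tPERS\nADDR1\nADDR2\nADDR3\nEND=%L,.6%%P%\n"
    print(read_field_names(sample))
```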

When a reference to a <fieldname> is found in a document to be output, the current replacement text is treated as if it is part of the input document. As this replacement text is output, it is scanned for any macro references, "include" directives, or other field references which are processed as they are encountered.
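
To make the rescanning of replacement text concrete, here is a small Python sketch. It is not the catm implementation: read_field_sets and substitute_fields are hypothetical names, surrounding blanks are stripped from names and values as an assumption, and the limit on passes is added here purely to keep the sketch from looping forever.

```python
import re

FIELD = re.compile(r"<([^<>\s]+)>")


def read_field_sets(text, separator="\t"):
    """Parse a field values file whose first line holds the field names."""
    lines = [line for line in text.splitlines() if line.strip()]
    names = [name.strip() for name in lines[0].split(separator)]
    sets = []
    for line in lines[1:]:
        values = [value.strip() for value in line.split(separator)]
        values += [""] * (len(names) - len(values))  # pad short lines
        sets.append(dict(zip(names, values)))
    return sets


def substitute_fields(text, values, max_passes=10):
    """Replace <name> references, rescanning the replacement text.

    References to unknown names are left unchanged.
    """
    for _ in range(max_passes):
        new_text = FIELD.sub(lambda m: values.get(m.group(1), m.group(0)), text)
        if new_text == text:
            break
        text = new_text
    return text


if __name__ == "__main__":
    sets = read_field_sets("NAME\tPERS\nMr H. Bloggs\tHenry\n")
    print(substitute_fields("<NAME>\n\nDear <PERS>\n", sets[0]))
```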

INCLUDE PROCESSING
If the string %include name% is found at the beginning of a line,
  1. the name is interpreted as a file name;
  2. reading of the original file is suspended, and
  3. the new file is inserted in the output stream.
A file being included can itself contain %include newname% instructions, and such new files will be inserted appropriately. Calls to include files can be nested to any depth. (Including a file recursively will cause an infinite loop.)

A file-name is interpreted relative to the file from which it was called. Names beginning with '..' are interpreted as being in a higher level directory.

A string %DATE% within a file will be replaced by the current local date.
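
The path resolution and nesting described above can be sketched in Python as follows. This is an illustration under assumptions, not catm itself: expand_includes is a hypothetical name, the directive is recognised only at the start of a line, the date is written in ISO format (the actual format is not stated here), and the depth limit is an addition that stops a recursive include instead of looping forever.

```python
import os
from datetime import date


def expand_includes(path, depth=0, max_depth=20):
    """Return the text of the file at path with %include name% lines expanded.

    Included names are resolved relative to the directory of the file
    that contains the directive. Hypothetical helper for illustration.
    """
    if depth > max_depth:
        raise RecursionError("includes nested too deeply: " + path)
    base = os.path.dirname(path)
    pieces = []
    with open(path) as handle:
        for line in handle:
            stripped = line.rstrip("\n")
            if stripped.startswith("%include ") and stripped.endswith("%"):
                name = stripped[len("%include "):-1].strip()
                pieces.append(
                    expand_includes(os.path.join(base, name), depth + 1, max_depth)
                )
            else:
                # Assumption: ISO date format, e.g. 2024-01-31.
                pieces.append(line.replace("%DATE%", date.today().isoformat()))
    return "".join(pieces)
```

Because each nested call computes its own base directory, a name such as ../pscode/letter.hd.qs is looked up relative to the including file rather than the current working directory.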

MACRO PROCESSING
Within a file that is to be copied to output, macros can be defined using the form %DM,name:string:%, and such strings will not immediately be copied to the output. These set up text substitutions that will apply to the remainder of the files as they are copied to output.

Such macros are invoked with %M,name%, which is replaced by the string in the definition. Invocation is recursive: the replacement text is not simply copied directly to standard output, but is instead processed as part of the input stream. The replacement text may therefore contain other macro invocations (%M,othername%), new macro definitions, or %include directives. (A macro cannot be redefined within its own body; the simple syntax does not allow colons to appear in the 'string' part of a definition.)
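
The definition and recursive invocation of macros can be illustrated with the Python sketch below. It is a simplified model, not the catm code: expand_macros is a hypothetical name, an invocation of an undefined macro is left unchanged (an assumption), include directives are not handled, and the expansion limit is added only to guard the sketch against a macro that invokes itself.

```python
import re

# Either a definition %DM,name:string:% or an invocation %M,name%.
TOKEN = re.compile(r"%DM,([^:%\s]+):([^:]*):%|%M,([^%]+)%")


def expand_macros(text, macros=None, max_expansions=1000):
    """Remove macro definitions from text and expand invocations.

    Replacement text is pushed back onto the input and rescanned, so a
    macro body may invoke other macros. Hypothetical helper.
    """
    macros = {} if macros is None else macros
    out = []
    pos = 0
    expansions = 0
    while True:
        match = TOKEN.search(text, pos)
        if match is None:
            out.append(text[pos:])
            return "".join(out)
        out.append(text[pos:match.start()])
        if match.group(1) is not None:
            # Definition: remember it and copy nothing to the output.
            macros[match.group(1)] = match.group(2)
            pos = match.end()
        elif match.group(3) in macros:
            expansions += 1
            if expansions > max_expansions:
                raise RecursionError("macro expansion limit reached")
            # Invocation: treat the body as if it were the next input.
            text = macros[match.group(3)] + text[match.end():]
            pos = 0
        else:
            # Assumption: unknown macros are copied through unchanged.
            out.append(match.group(0))
            pos = match.end()


if __name__ == "__main__":
    styles = "%DM,Remote:%P%If you would like accommodation arranged...:%%DM,Local::%"
    print(expand_macros(styles + "See you there.%M,Remote%"))
```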

EXAMPLE 1:
The simplest way to perform a mail merge is to ignore the "field-names" file and provide a single file containing a single line with the names of the fields followed by lines giving the field values. This file, called "fields" (say), might look like:
    NAME \t PERS \t ADDR1 \t ADDR2 \t ADDR3
    Mr H. Bloggs  \t Henry     \t 360 Hindmarsh Dr  \t Phillip, ACT 2616 \t Australia
    Ms R. McManus \t Rosemary  \t 19 McWilliam Cres \t Florey, WA 6517   \t 
    Mr GA Spence  \t Mr Spence \t 40 Plenty Ave     \t Lower Hutt        \t New Zealand 
    Mrs P Thomas  \t Penni     \t Computer Science  \t ADFA         \t Campbell, ACT 2600
("\t" means the Tab character.)

The second file (letter) might look like

    %include ../pscode/letter.hd.qs%
    %DATE%
      
    <NAME>
    <ADDR1>
    <ADDR2>
    <ADDR3>

    Dear <PERS>

    %FI%We are having a reunion of all students and staff who ... 

The Unix file "letter.hd.qs" will be found in a directory called "pscode" which is a sister directory to the directory that the file "letter" is in.

This letter could now be output with the Unix command

       
         catm fields -o Qs -r letter  |  lpr -Plaser

In this case, the output generated by this catm command will be:


         Quikscript file
         letter.hd.qs file
         today's date
      
         Mr H. Bloggs
         360 Hindmarsh Dr
         Phillip, ACT 2616
         Australia

         Dear Henry

         %FI%We are having a reunion of all students and staff who ...
         %NP%%PS%
         QSdict begin Init ToTop end PE
         letter.hd.qs file
         today's date
      
         Ms R. McManus
         19 McWilliam Cres
         Florey, WA 6517
         

         Dear Rosemary

         %FI%We are having a reunion of all students and staff who ...
         %NP%%PS%
         QSdict begin Init ToTop end PE
         letter.hd.qs file
         today's date
      
         Mr GA Spence
         40 Plenty Ave
         . . .

EXAMPLE 2:
The names of the fields (i.e. the first line in the example above) could be given in a separate file, called "fnames".

This letter could now be output with the Unix command

         catm -f fnames fields -o Qs -r letter  |  lpr -Plaser

EXAMPLE 3:
In this more challenging example, an invitation is to be sent to many people, some local, and some remote. The letter will contain name, address and personal identification as in the first example, but will also contain some tailoring of the letter content depending on where the invitee lives.

The field values file, "fields3", could be set up as
    NAME \t PERS \t ADDR1 \t ADDR2 \t ADDR3 \t WHERE
    Mr H. Bloggs \t Henry \t 360 Hindmarsh Dr \t Phillip, ACT 2616 \t Australia \t %M,Local%
    Ms R. McManus \t Rosemary \t 19 McWilliam Cres \t Florey, WA 6517 \t \t %M,Remote%
    Mr GA Spence \t Mr Spence \t 40 Plenty Ave \t Lower Hutt \t New Zealand  \t %M,Remote%
    Mrs P Thomas \t Penni \t Dept Computer Science \t ADFA \t Campbell, ACT 2600  \t %M,Local% 

A second file, possibly called styles, would be set up to establish style features that were to apply to all of the letters. It might look like:


    %DM,Remote:%P%If you would like accommodation arranged...:%
    %DM,Local::%
    %include dijkstra.qs%
    %FN,Uft% 

This would be included in the output once, after Qs, and before the letters. It defines the text to substitute for the <WHERE> field, and sets up the font to be used in the letters. The text following DM,Remote must all be on the one line. In this case, no alternative text is provided if the person lives locally; their letter will be shorter.

The letter would commence in the same way as the previous example, but would have <WHERE> at the end of a paragraph to pull in the text defined in the second file.

The command to print the letters would then be:

       
         catm fields3 -o Qs styles -r letter  |  lpr -Plaser

If the volume of text in Remote was too great to include on one line (or exceeded the input buffer size of 1024 characters), then Remote could be defined to instead include text from another file:

       
         %DM,Remote:%include accom.txt%:%

EXAMPLE 4:
To put addresses on the envelopes, we might want to use label stationery, which has three columns of eight rows of labels per sheet. We could use the same field values file as previously. We do not want to go to a new page after each person, so we would need to change the default end-of-page processing. We could provide a field-names file containing the line
         END=%L,.6%%P%
or we could more simply specify this on the command line with
         -e %L,.6%%P%
Some trials may be needed to get the spacing correct between labels, and a new-paragraph instruction will be useful to force a new column when we are near the bottom of the page.

The printer we are using might allow text to be placed quite close to the top, left and right margins, but it might be limited at the bottom of the page because of the paper-feed mechanism. We would set up a page style that allowed for this:

         %PM,0,0,5,36%%PS /ColGap 0 store%%NC,3%%NF%%TB,10%%\%

This sets up the page margins (%PM,0,0,5,36%), removes any gap between columns (%PS /ColGap 0 store%), makes three columns per page (%NC,3%), outputs each input line as it is without filling the previous line (%NF%), sets the left margin 10mm from the left (%TB,10%), and suppresses the new output line that would otherwise be started when the next input line is read (%\%).

Then the document to be repeated (called "labels") might look like:

         <NAME>
         <ADDR1>
         <ADDR2>
         <ADDR3>

and the command to print the labels could be:
       
         catm -e "%L,.6%%P%" fields -o Qs -r labels  |  lpr -Plaser

EXAMPLE 5:
An alternative format for the fields file could be adopted. There is one line per field, and the lines for the first letter must be written as name=value so that they define the names to be used in this and in subsequent letters.

The field values file (fields2) might look like
         NAME=Mr H. Bloggs
         PERS=Henry
         ADDR1=360 Hindmarsh Dr
         ADDR2=Phillip, ACT 2616
         ADDR3=Australia
         NEXT
         Ms R. McManus
         Rosemary
         19 McWilliam Cres
         Florey, WA 6517

         Mr GA Spence
         Mr Spence
         40 Plenty Ave
         Lower Hutt
         New Zealand 
         Mrs P Thomas
         Penni
         Dept Computer Science
         ADFA
         Campbell, ACT 2600 

The letter would be the same as in Example 1.

The output pages could be generated with Qs layout using the Unix command

       
         catm fields2 -o Qs -r letter  |  lpr -Plaser
so that the Quikscript file (Qs) will be copied once at the start, followed by four copies of "letter", each tailored to a name and address given in the "fields2" file.