
The File Preprocessor



Main Features


The preprocessor allows you to declare variables and evaluate expressions. It also possesses some programming language capability, with branching control to skip or loop over a selected block of lines.

The preprocessor is built into the source code, rdfiln.f. Comments at the beginning of rdfiln.f document the directives it can process. Here we use rdfiln as a name for the preprocessor.

Curly brackets contain expressions

rdfiln treats curly brackets {…} specially and replaces their contents with something else. Typically {…} contains an algebraic expression, which is evaluated as a binary number and rendered back as an ASCII representation of the number.

Thus the line

    talk {4/2} me

becomes

    talk 2 me

{…} can contain other kinds of syntax as well; see below.

Note: The substitution {...} → string may increase the length of the line. If the modified line exceeds the maximum length, it is truncated. This maximum length is controlled by the parameter recln0 in the main program. Source codes are distributed with recln0=120.

Variables

The preprocessor permits three kinds of variables: floating point scalar, floating-point vector, and strings. They can be declared with preprocessor directives. Scalar and character variables can also be declared on the command-line using, e.g.
-vsnam=expr  or  -vcnam=string.

  • Separate symbol tables are maintained for each of the three kinds of variables.
  • Scalar variables and vector elements can be used in algebraic expressions.
  • Character variables can be used in string expressions (see below).

Note: As rdfiln parses a file, it may create new variables, thus enlarging the symbols table. Variables allocated this way are temporary, however. When rdfiln has finished with whatever it is reading, it destroys variables created by preprocessor directives. You can preserve variables for future use with the % save directive.

Branching and looping constructs

You can conditionally read certain lines of a file, or loop over lines multiple times.

Expression Substitution

Enclosing a string in curly brackets, viz {strn}, instructs the preprocessor to parse the contents of  {strn}  and substitute it with something else.

Note: To suppress expression substitution, prepend {strn} with a backslash, viz \{strn}. The preprocessor will remove the backslash but leave {strn} unaltered.

strn must take one of the following syntactical forms. If rdfiln cannot match the first form, it tries the second, and so on. (It is an error if no form can be matched.) These four forms are as follows, arranged by the precedence they take in parsing:

  • (string substitution):   strn is the name of a character variable. The value of the variable is substituted.
    The variable may be followed by a qualification (see 1a and 1b below).
  • (conditional substitution):   strn begins with a “?”. An expression is evaluated, which determines what string is substituted. See 2 below.
  • (vector substitution):   strn is the name of a vector. The result is the contents of the vector. See 3 below.
  • (expression substitution):   strn is an algebraic expression. Expressions use C-like syntax. See 4 below.

In more detail, the four rules are as follows:

  1. (string substitution) strn consists of (or begins with) a character variable, say mychar.

    a. strn is a character variable. rdfiln replaces {mychar} with contents of mychar.
    Example: If mychar=’foo bar’,    {mychar} → foo bar.

    b. strn is a character variable followed by a qualifier (…), which must be one of the following:

    • (integer1,integer2) (substring of  strn). {mychar(n1,n2)} is replaced by the (n1:n2) substring of {mychar}.
      Example: If mychar=”foo bar”, {mychar(2,3)} → oo.

    • (charlst,n) (index to charlst). {mychar(charlst,n)} returns an index into {mychar}. {mychar} is searched for characters in charlst, and the index of the nth occurrence is returned; charlst is a sequence of characters.
      n is optional: if omitted, the preprocessor uses n=1.
      Example: let mychar=”foo bar” and charlst=’abc’. Note that “foo bar” contains characters ‘b’ and ‘a’.
      {mychar(‘abc’,2)} → 6, because the 6th character is the second occurrence of a character in [abc].

    • (:e) returns an index marking last nonblank character in {mychar}.
      Example: If mychar=’foo bar’,   {mychar(:e)} → 7.

    • (/’strn1’/’strn2’/,n1,n2) substitutes strn2 for strn1.
      Substitutions are made for the n1th to n2th occurrence of strn1.
      Example: If mychar=”foo bar”, then   {mychar(/’foo’/’boo’/)} → “boo bar”.
      n1 and n2 are optional, as are the quotation marks.

  2. (conditional substitution) strn takes the form {?~expr~strn1~strn2}   (Note: the ‘~’ can be any character).
    expr is an algebraic expression; strn1 and strn2 are strings. rdfiln returns either strn1 or strn2, depending on the result of expr.
    If expr evaluates to nonzero, {…} is replaced by strn1; otherwise {…} is replaced by strn2.
    Example:   {?~(n<2)~n is less than 2~n is at least 2} :
    {…}   becomes   “n is less than 2“   if n<2; otherwise it becomes   “n is at least 2”  

  3. (vector substitution) strn is the name of a vector variable, say myvec. rdfiln replaces {myvec} with a sequence of numbers separated by one space, which are the contents of myvec.
    Example : suppose myvec has been declared as a 5-element quantity in the following way:
       % vec myvec[5] 6-1 6-2 5-2 5-3 4-3
    {myvec}  will be turned into 5 4 3 2 1
    A single element of a vector acts like a scalar. Thus  {3*myvec(2)-2}  becomes 10.

  4. (expression substitution) strn is an algebraic expression composed of numbers combined with unary and binary operators. The syntax is very similar to the C programming language. rdfiln parses  strn  to obtain a binary number, renders the result in ASCII form, and substitutes the result.

    Note: strn  may consist of a sequence of expressions, separated by commas. rdfiln returns the value of the last expression. A variable should be assigned to each intermediate expression. Assignment may be simple (=) or involve an arithmetic operation.
    Examples:

      {x=3}               ←  assigns 3 to x and returns '3'
      {x=3,y=4}           ←  assigns 3 to x and 4 to y, and returns '4'
      {x=3,y=4,x*=y}      ←  assigns 3*4 to x and 4 to y, and returns '12'
      {x=3,y=4,x*=y,x*2}  ←  assigns 3*4 to x and 4 to y, and returns '24'
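
The comma-separated form can be modelled in a few lines of Python. This is only a sketch of the rule stated above (evaluate each expression from left to right and return the value of the last one), not rdfiln's parser. The function name `evaluate` is hypothetical, Python's `eval` stands in for the expression evaluator, and only the operators `=`, `*=`, `/=`, `+=` and `-=` are handled.

```python
import operator
import re

# name, optional arithmetic operator, '=' (but not '=='), right-hand side
ASSIGN = re.compile(r"^([A-Za-z_]\w*)([*/+-]?)=(?!=)(.*)$")
OPS = {"*": operator.mul, "/": operator.truediv,
       "+": operator.add, "-": operator.sub}


def evaluate(strn, table):
    """Evaluate comma-separated expressions; return the value of the last one."""
    result = None
    for stmt in strn.split(","):
        match = ASSIGN.match(stmt)
        if match:
            name, op, rhs = match.groups()
            value = eval(rhs, {"__builtins__": {}}, table)
            if op:                       # e.g. x*=y means x = x*y
                value = OPS[op](table[name], value)
            table[name] = value
            result = value
        else:                            # plain expression, no assignment
            result = eval(stmt, {"__builtins__": {}}, table)
    return result


table = {}
print(evaluate("x=3,y=4,x*=y,x*2", table))   # value of the last expression
print(table)                                  # variables left in the table
```

The dictionary `table` plays the role of the scalar symbols table, so assignments made in one call remain visible to later ones.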
    

Further properties of curly brackets

Brackets may be nested. rdfiln will work recursively through deeper levels of bracketing, substituting {..} at each level with a result before returning to the higher level.
Example: Suppose  {foo}  evaluates to 2. Then:

  {my{foo}bar}

will be transformed into

  {my2bar}

and finally the result of  {my2bar}  evaluated.

If rdfiln cannot evaluate {my2bar} it will abort with a message similar to this one:

 rdfile: bad expression in line
 {  ...  my2bar}

Note: there is a syntactical difference between {expr} and the value of expr itself, because {expr} returns an ASCII representation of expr, and precision is lost. Thus  {pi-3}  is replaced by .141592654
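
The loss of precision can be illustrated with a short Python sketch. It assumes that the number is rendered to nine significant digits with a leading zero dropped; that choice reproduces the examples on this page but is an assumption, not a rule taken from rdfiln.f. The function name `render` is hypothetical.

```python
import math


def render(value):
    """Render a float as text with 9 significant digits (assumed format)."""
    text = format(value, ".9g")
    if text.startswith("0."):        # .25 rather than 0.25
        text = text[1:]
    elif text.startswith("-0."):
        text = "-" + text[2:]
    return text


exact = math.pi - 3
shown = render(exact)
print(shown)                   # the text that replaces {pi-3}
print(float(shown) == exact)   # False: digits were lost in the rendering
```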

Syntax of Algebraic Expressions

The general syntax for an expression is a sequence of one or more expressions of the form

   {name=expr[,name=expr...]}

Commas separate declarations. Arithmetic assignment operators can be used in place of simple assignment (=), for example  {x=3,y=4,x*=y,x*2}. The final expression may (and typically does) consist of an expression only, omitting  name=.

Note: expr may not contain any whitespace.

expr has a syntax very similar to C. It is composed of numbers, scalar variables, elements of vector variables, and macros, combined with unary and binary operators.

    Unary operators take first precedence:
    1.   - arithmetic negative
         ~ logical negative (.not.)
           functions abs(), exp(), log(), sin(), asin(), sinh(), cos(), acos()
                     cosh(), tan(), atan(), tanh(), flor(), ceil(), erfc(), sqrt()
           Note: flor() rounds to the next lowest integer; ceil() rounds up.

    The remaining operators are binary, listed here in order of precedence with associativity
    2.   ^  (exponentiation)
    3.   *  (times), / (divide), % (modulus)
    4.   +  (add), - (subtract)
    5.   <  (.lt.); > (.gt.); = (.eq.); <> (.ne.); <= (.le.);  >= (.ge.)
    6.   &  (.and.)
    7.   |  (.or.)
    8&9  ?: conditional operators, used as: test?expr1:expr2
    10&11 () parentheses

The  ?:  pair of operators follow a C-like syntax: test, expr1, and expr2 are all algebraic expressions.
If test is nonzero, expr1 is evaluated and becomes the result. Otherwise expr2 is evaluated and becomes the result.

Assignment Operators

The following are the allowed assignment operators:

  assignment-op         function
    '='            simple assignment
    '*='           replace 'var' by var*expr
    '/='           replace 'var' by var/expr
    '+='           replace 'var' by var+expr
    '-='           replace 'var' by var-expr
    '^='           replace 'var' by var^expr

Examples of expressions

Suppose that the variables table looks like:

   Var       Name                 Val
    1        t                   1.0000
    2        f                  0.00000
    3        pi                  3.1416
    4        a                   2.0000
 ...
   Vec       Name            Size   Val[1..n]
    1        firstnums          5    1.0000        5.0000
    2        nextnums           5    6.0000        10.000
 ...
     char symbol                     value
    1 c                               half
    2 a                               whole
    3 blank

Note: You can print out the current variables table with the % show directive. As described in more detail below, such a variables table can be created with the following directives:

 % const a=2
 % char c half a whole blank " "
 % vec firstnums[5] 1 2 3 4 5
 % vec nextnums[5] 6 7 8 9 10

Then the line

  {c} of the {a} {pi} is {pi/2}

is turned into the following;

  half of the whole 3.14159265 is 1.57079633

whereas the line

  one quarter is {1/(nextnums(4)-5)}

becomes

  one quarter is .25

Character Substrings Example:

 % char c half a whole
  To {c(1,3)}ve a cave is to make a {a(2,5)}!

becomes

  To halve a cave is to make a hole!
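
The string qualifiers use 1-based, inclusive indices, which is easy to get wrong when coming from a 0-based language. The Python sketch below models three of them under that reading. The function names are hypothetical, and the value returned when no character matches is an assumption, since this page does not specify it.

```python
def substring(value, n1, n2):
    """{mychar(n1,n2)}: characters n1..n2, counted from 1, inclusive."""
    return value[n1 - 1:n2]


def char_index(value, charlst, n=1):
    """{mychar(charlst,n)}: 1-based index of the nth character found in charlst."""
    seen = 0
    for position, ch in enumerate(value, start=1):
        if ch in charlst:
            seen += 1
            if seen == n:
                return position
    return 0  # assumption: behaviour without a match is not documented here


def last_nonblank(value):
    """{mychar(:e)}: 1-based index of the last nonblank character."""
    return len(value.rstrip(" "))


c, a = "half", "whole"
print(f"To {substring(c, 1, 3)}ve a cave is to make a {substring(a, 2, 5)}!")
```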

Vector Substitution Example:

  {firstnums}, I caught a hare alive, {nextnums} I let him go again ...

becomes

  1 2 3 4 5, I caught a hare alive, 6 7 8 9 10  I let him go again ...

Nesting Example: The following illustrates nesting to three levels. The innermost block is substituted first. Beginning with

    % const xx{1{2+{3+4}1}} = 2

substitution takes place in three passes:

    % const xx{1{2+71}} = 2
    % const xx{173} = 2
    % const xx173 = 2
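
The innermost-first order can be reproduced with a small Python sketch. It is a model of the passes listed above, not rdfiln's implementation: a regular expression finds a bracket pair that contains no other brackets, and Python's `eval` stands in for the expression evaluator. The names `INNERMOST` and `expand` are hypothetical.

```python
import re

# A bracket pair whose contents hold no further brackets, i.e. an innermost one
INNERMOST = re.compile(r"\{([^{}]*)\}")


def expand(line):
    """Return the line after each substitution pass, innermost bracket first."""
    passes = []
    while INNERMOST.search(line):
        line = INNERMOST.sub(
            lambda m: str(eval(m.group(1), {"__builtins__": {}})),
            line,
            count=1,
        )
        passes.append(line)
    return passes


for step in expand("% const xx{1{2+{3+4}1}} = 2"):
    print(step)
```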

Example of {?~expr~strn1~strn2} syntax:

    MODE={?~k~B~C}3

evaluates to, if k is nonzero:

    MODE=B3

or, if k is zero:

    MODE=C3

Note: the scalar variables table is always initialized with predefined variables t=1, f=0, and pi=π. It is STRONGLY ADVISED that you never alter any of these variables.

Preprocessor Directives

  • Lines beginning with  % keyword  are interpreted as preprocessor directives. Such lines are not part of the post-processed input.
  • Lines which begin with  #  are comment lines and are ignored. (More generally, text following a # in any line is ignored).

Recognized keywords are

   const cconst cvar udef var vec                        ← allocate and assign numerical variables
   char char0 cchar getenv vfind                         ← allocate and assign character variables
   if ifdef ifndef iffile else elseif elseifd endif      ← branching construct
   while repeat end   exit                               ← looping and terminating constructs
   echo include includo macro save show stop trace udef  ← miscellaneous

Variable declarations and assignments

  Keywords :   const cconst cvar udef var vec char char0 cchar getenv vfind

  1. const  and  var  load or alter the variables table. Example:
    % const  myvar=expr 

    does two things:

    • adds myvar to the scalar variables symbols table if it is not there already. const and var are equivalent in this respect.
    • assigns the result of expr to it, if either
      • you use the var directive or
      • you use the const directive and the variable had not yet been created.

    In other words, if  myvar  already exists prior to the directive,  const  will not alter its value but  var  will. Thus the lines

    % const a=2
    % const a=3
    

    incorporate a into the symbols table with value 2, while

    % const a=2
    % var a=3
    

    do the same but assign 3 to a.

    Note: if myvar exists, you can multiply, divide, add to, subtract from, or exponentiate it with expr, using one of the following C-like forms:

       myvar*=expr  myvar/=expr  myvar+=expr  myvar-=expr  myvar^=expr
    

    These operators modify  myvar  for both  const  and  var  directives.

  2. cconst  and  cvar  conditionally load or alter the variables table. Example:
    % cconst test-expr myvar=expr 

    test-expr  is an algebraic expression (e.g., i==3) that evaluates to zero or nonzero.
    If test-expr evaluates to nonzero, the remainder of the directive proceeds as  const  or  var  do.
    Otherwise, no further action is taken.

      Example: the input segment

    % const a=2 b=3 c=4 d=5
    A={a} B={b} C={c} D={d}
    % const a=3
    % var d=-1
    % const b*=2 c+=3
    A={a} B={b} C={c} D={d}
    % cconst b==6  b+=3 c-=3
    A={a} B={b} C={c} D={d}
    % cconst b==6  b+=3 c-=3
    A={a} B={b} C={c} D={d}
    

    generates four lines:

    A=2 B=3 C=4 D=5
    A=2 B=6 C=7 D=-1
    A=2 B=9 C=4 D=-1
    A=2 B=9 C=4 D=-1
    

    a is unchanged from its initial assignment while d changes.

    Compare the two  cconst  directives. b and c are altered in the first instance, since the condition  b==6  evaluates to 1, while they do not change in the second instance, since now  b==6  evaluates to zero.

  3. char loads or alters the character table. Example:

    % char  c half     a whole      blank
    

    loads the character table as follows:

     char symbol                     value
    1 c                               half
    2 a                               whole
    3 blank
    

    The last declaration can omit an associated string, in which case its value is a blank, as  blank  is in this case.

    Note: Re-declaration of any previously defined variable will change the contents of the variable.

  4. char0  is the same as  char , except re-assignment of an existing variable is ignored. Thus  char0  is to  const  as  char  is to  var .

  5. cchar  is similar to  char   but tests are made to enable different strings to be loaded depending on the results of the tests. The syntax is
    % cchar nam  expr1 str1 expr2 str2 ... 

    nam   is the name of the character variable; expr1 expr2  etc are algebraic expressions.
    nam   takes the value  str1   if  expr1   evaluates to nonzero, the value  str2   if  expr2   evaluates to nonzero, etc.

  6. getenv  has a function similar to char , only the contents of the variable are read from the unix environment variables table. Thus
    % getenv myhome HOME 

    puts the string of your home directory into variable myhome.

  7. vec  loads or alters elements in the table of vector variables.

    % vec v[n]                      ← creates a vector variable of length n
    % vec v[n] n1 n2 n3 ...         ← does the same, also setting the first elements
    

    Once v  has been declared, individual elements of  v   may be set with the following syntax

    % vec v(i) n                    ← assigns n to v(i)
    % vec v(i1:in)  n1 n2 ... nn    ← assigns range of elements i1..in to n1 n2 ... nn

    There must be exactly in−i1+1  elements n1 … nn .

    Note: if v  is already declared, it is an error to re-declare it.

  8. vfind  finds which element in a vector matches a specified value. The syntax is
    % vfind v(i1:i2)  svar  match-value 

    svar  is a scalar variable and match-value  a number or expression. Elements  v(i1:i2)   are searched.  svar   is assigned the index i of the first element for which  v(i)=match-value . If no match is found, svar  is set to zero.
       Example:

    % vec  a[3] 101 2002 30003
    % vfind a(1:3) k 2002           ← sets k=2
    % vfind a(1:3) k 10             ← sets k=0
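
The difference between const, var and cconst described earlier in this section can be modelled with a small Python sketch. It is a simplified model, not the preprocessor's code: the function names are hypothetical, values are plain numbers rather than expressions, and the cconst test only handles the form name==value. It replays the four-line example given for cconst.

```python
import operator

OPS = {"*=": operator.mul, "/=": operator.truediv, "+=": operator.add,
       "-=": operator.sub, "^=": operator.pow}


def assign(table, stmt, overwrite):
    """Apply one 'name=value' or 'name op= value' statement to the table."""
    for op, fn in OPS.items():
        if op in stmt:                      # op-assignment always modifies
            name, value = stmt.split(op)
            table[name] = fn(table[name], float(value))
            return
    name, value = stmt.split("=")
    if overwrite or name not in table:      # const keeps an existing value
        table[name] = float(value)


def const(table, *stmts):
    for stmt in stmts:
        assign(table, stmt, overwrite=False)


def var(table, *stmts):
    for stmt in stmts:
        assign(table, stmt, overwrite=True)


def cconst(table, test, *stmts):
    name, value = test.split("==")          # only 'name==value' tests here
    if table[name] == float(value):
        const(table, *stmts)


def show(table):
    return " ".join(f"{k.upper()}={table[k]:g}" for k in "abcd")


table, lines = {}, []
const(table, "a=2", "b=3", "c=4", "d=5"); lines.append(show(table))
const(table, "a=3"); var(table, "d=-1"); const(table, "b*=2", "c+=3")
lines.append(show(table))
cconst(table, "b==6", "b+=3", "c-=3"); lines.append(show(table))
cconst(table, "b==6", "b+=3", "c-=3"); lines.append(show(table))
print("\n".join(lines))
```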
    

Branching constructs

  Keywords :   if ifdef ifndef iffile else elseif elseifd endif

Branching constructs have a function similar to the C constructs.

  1. if expr,  elseif expr,  else,  and  endif  are conditional read blocks. Lines between these directives are read or not, depending on the value of  expr. Example:

    % if Quartz
     is clear
    % elseif Ag
     is bright
    % else
     neither is right
    % endif
    

    generates this line if  Quartz  evaluates to nonzero:
         is clear
    otherwise this line if  Ag  evaluates to nonzero:
         is bright
    and otherwise
         neither is right

  2. ifdef  is similar to if , but has a more general idea of what constitutes an expression.

    • if expr   requires that expr  be a valid expression, while  ifdef expr  evaluates expr  as false if it is invalid (e.g. it contains an undefined variable).
    • expr   can be an algebraic expression, or a sequence of expressions separated by  &   or   |   (AND or OR binary operators), viz:
      % ifdef expr1 | expr2 | expr3 ...   
      If any of  expr1,    expr2,   ... evaluate to nonzero, the result is nonzero, whether or not preceding expressions are valid.
      Note  the syntactical significance of the spaces.  expr1|expr2   cannot be evaluated unless both  expr1   and  expr2   are valid expressions, while  expr1  |  expr2   may be nonzero if either is valid.
    • ifdef   allows a limited use of character variables in expressions. Either of the following are permissible expressions:
         char-variable            ← T if char-variable exists, otherwise F
         char-variable=='string'  ← T if char-variable has the value string 
      Example:
       % ifdef  x1==2 & atom=='Mg' | x1==1  
      is nonzero if scalar  x1   is 2 and character variable  atom   is equal to "Mg", or if scalar  x1   is 1. Note:  binary operators  &   and   |   are evaluated left to right:  &   does not take precedence over  |.

  3. elseifd   is to  elseif   as  ifdef   is to  if.

  4. ifndef expr  … is the mirror image of  ifdef expr. Lines following this construct are read only if  expr   evaluates to 0.

  5. iffile filename   is a construct analogous to  %if   or  %ifdef   for conditional reading of input lines.
    The test condition is set not by an expression, but whether file  filename   exists or not.
    Note:  if,    ifdef,   and  ifndef   constructs may be nested to a depth of mxlev. The codes are distributed with mxlev=6.

Looping constructs

  Keywords :   while repeat end

  1. while  and  end  mark the beginning and end of a looping construct. Lines inside the loop are repeatedly read until a test expression evaluates to 0.
    % while [expr1 expr2 ...] test-expr ← skip to `% end' if test-expr is 0
    ...                    ← these lines become part of the input while test-expr is nonzero
    % end                  ← return to the `% while' directive until test-expr is 0

    The (optional) expressions [expr1 expr2]  follow the rules of the  const   directive:

    • Each of expr1, expr2, … takes the form  nam=expr   or  nam op= expr.
    • A simple assignment  nam=expr   has effect only when  nam   has not yet been loaded into the variables table. Thus it has effect on the first pass through the  while   loop (provided  nam   isn’t declared yet) but not subsequent passes.

    These rules make it very convenient to construct loops, as the following example shows.

    % udef -f db                   ← removes db from symbols table, if it already exists
    % while db=-1 db+=2 db<=3      ← db is initialized to -1 only once
    this is db={db}                ← the body of the loop that becomes the input
    % end                          ← return file pointer to %while until test db<=3 is 0

    generates

    this is db=1
    this is db=3
    

    Pass 1:  db  is created and assigned the value −1, then incremented to 1. Condition  db<=3   evaluates to 1 and the loop proceeds.
    Pass 2:  db  already exists so  db=-1  has no effect.  db+=2  increments  db   to 3.
    Pass 3:  db   increments to 5 causing the condition  db<=3   to become 0. The loop terminates.

  2. % repeat   …  % end   is another looping construct with the syntax
    % repeat varnam list
     ...                             ← lines parsed for each element in list
    % end
    

    As with the  while   construct, multiple passes are made through the input lines.  list   generates a sequence of integers (see the integer list syntax manual). For each member of the sequence,  varnam   takes its value and the body of the loop is passed through.  list   can be just an integer (e.g.  7 ) or define a more complex sequence, e.g.  1:3,6,2   generates the sequence  1 2 3 6 2.

    Example: nested  while  and  repeat  loops

    % const nm=-3 nn=4
    % while db=-1 db+=2 db<=3
    % repeat k= 2,7
    this is k={k} and db={db}
    {db+k+nn+nm} is db + k + nn+nm, where nn+nm={nn+nm}
    % end (loop over k)
    % end (loop over db)
    

    The nested loops are expanded into:

    this is k=2 and db=1
    4 is db + k + nn+nm, where nn+nm=1
    this is k=7 and db=1
    9 is db + k + nn+nm, where nn+nm=1
    this is k=2 and db=3
    6 is db + k + nn+nm, where nn+nm=1
    this is k=7 and db=3
    11 is db + k + nn+nm, where nn+nm=1
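
The nested-loop example above can be mirrored in Python to make the pass-by-pass behaviour explicit. This is a sketch of the semantics described in this section, with a dictionary standing in for the scalar symbols table; the function name `expand_loops` is hypothetical.

```python
def expand_loops():
    """Mirror the nested while/repeat example; return the generated lines."""
    table = {"nm": -3, "nn": 4}              # % const nm=-3 nn=4
    lines = []
    while True:
        # % while db=-1 db+=2 db<=3
        table.setdefault("db", -1)           # simple assignment: first pass only
        table["db"] += 2                     # op-assignment: every pass
        if not table["db"] <= 3:             # test-expr is 0: skip to % end
            break
        for k in (2, 7):                     # % repeat k= 2,7
            table["k"] = k
            db, nn, nm = table["db"], table["nn"], table["nm"]
            lines.append(f"this is k={k} and db={db}")
            lines.append(
                f"{db + k + nn + nm} is db + k + nn+nm, where nn+nm={nn + nm}"
            )
    return lines


print("\n".join(expand_loops()))
```

Using `setdefault` for `db=-1` captures the rule that a simple assignment in a while directive takes effect only when the variable does not yet exist.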
    

Other directives

  Keywords :   echo exit include includo macro save show stop trace udef

  1. echo contents   echoes  contents   to standard output.
    Example :
    % echo hello world
    

    prints

    #rf    line-no: hello world
    

    line-no   is the current line number.

  2. exit [expr]   causes the program to stop parsing the input file, as though it encountered an end-of-file.
    • If  expr   evaluates to nonzero, or if it is omitted, parsing ends.
    • If  expr   evaluates to 0 the directive has no effect.

    Note:  compare to the  stop  directive.

  3. include filename  causes rdfiln to include the contents of file  filename  in the input.
    • If  filename  exists, rdfiln opens it and the file pointer is transferred to this file until no further lines are to be read. At that point the file pointer returns to the original file.
    • If  filename  does not exist, the directive has no effect.

    Notes:  %include  may be nested to a depth of 10.  Looping and branching constructs must reside in the same file.

  4. includo filename  is identical to  include , except that rdfiln aborts if  filename  does not exist.

  5. macro name(arg1,arg2,..) expr  defines a macro. arg1,arg2,…  are substituted into expr  before it is evaluated. Example :
    % macro xp(x1,x2,x3,x4) x1+2*x2+3*x3+4*x4
    The result of xp(1,2,3,4) is {xp(1,2,3,4)}
    

    generates

    The result of xp(1,2,3,4) is 30
    

    Note:  macros are not quite identical to function declarations. The following lines illustrate this:

    % macro xp(x1,x2,x3,x4) x1+2*x2+3*x3+4*x4
    The result of xp(1,2,3,4) is {xp(1,2,3,4)}
    The result of xp(1,2,3,3+1) is {xp(1,2,3,3+1)}
    The result of xp(1,2,3,(3+1)) is {xp(1,2,3,(3+1))}
    

    generates

    The result of xp(1,2,3,4) is 30
    The result of xp(1,2,3,3+1) is 27
    The result of xp(1,2,3,(3+1)) is 30
    

    macro  merely substitutes 1,2,3,… for  x1,x2,x3,x4  in  expr  as follows:

    1+2*2+3*3+4*4            ← xp(1,2,3,4)
    1+2*2+3*3+4*3+1          ← xp(1,2,3,3+1)
    1+2*2+3*3+4*(3+1)        ← xp(1,2,3,(3+1))
    

    Operator order matters, so  4  and  3+1  behave differently. By enclosing the fourth argument in parentheses, operator precedence is maintained.

  6. save  preserves variables after the preprocessor exits. The syntax is:
    % save                   ← preserves all variables defined to this point
    % save name [name2 ...]  ← saves only variables named
    

    Only variables in the scalar symbols table are saved.

  7. show …  prints various things to standard output:
    % show vars              ← prints out the state of the variables table
    % show lines             ← echoes each line generated to the screen until:
    % show stop              ← is encountered
    

    Note:  because the vector variables can have arbitrary length,  show  prints only the size of the vector and the first and last entries.

  8. stop [expr msg]  : causes the program to stop execution.
    • If  expr  evaluates to nonzero, or if it is omitted, program stops ( msg , if present, is printed to standard output before aborting).
    • If  expr  evaluates to 0 the directive has no effect.

    Note:  compare to the  exit  directive.

  9. trace  turns on debugging printout. rdfiln prints to standard output information about what it is doing.
    • trace 0 turns the tracing off
    • trace 1 turns the tracing on at the lowest level.
      rdfiln traces directives having to do with execution flow (if-else-endif, repeat/while-end).
    • trace 2 prints some information about most directives.
    • trace 4 is the most verbose
    • trace (no argument) toggles whether it is on or off.

  10. udef [−f]  name [name2 …]  removes one or more variables from the symbols table. If  −f  is omitted, rdfiln aborts with an error if you try to remove a nonexistent variable. If  −f  is included, removing a nonexistent variable does not generate an error. Only scalar and character variables may be deleted.
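
The textual nature of macro expansion, described for the macro directive above, can be demonstrated with a short Python sketch. It models only the substitute-then-evaluate behaviour shown in the xp example. The function name `expand_macro` is hypothetical, the arguments are passed as a list of strings rather than parsed from a call, and Python's `eval` stands in for the expression evaluator.

```python
import re


def expand_macro(params, body, args):
    """Textually substitute argument strings for parameter names in body."""
    mapping = dict(zip(params, args))
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], body)


PARAMS = ["x1", "x2", "x3", "x4"]
BODY = "x1+2*x2+3*x3+4*x4"          # % macro xp(x1,x2,x3,x4) x1+2*x2+3*x3+4*x4

for args in (["1", "2", "3", "4"],
             ["1", "2", "3", "3+1"],
             ["1", "2", "3", "(3+1)"]):
    text = expand_macro(PARAMS, BODY, args)
    print(text, "=", eval(text, {"__builtins__": {}}))
```

Because the argument `3+1` is pasted in as text, the multiplication by 4 binds only to the 3, which is why the unparenthesized call gives a different result.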

Source codes

Source codes the preprocessor uses are found in the  slatsm  directory:

   rdfiln.f  The source code for the preprocessor. Subroutine rdfile, which parses
             an entire file and returns a preprocessed one, is also found in rdfiln.f.
             The key subroutine is rdfiln, which parses one line of a file.
   symvar.f  Maintains the table of variables for floating point scalars.
   symvec.f  Maintains the table of vector variables.
   a2bin.f   Evaluates ASCII representations of algebraic expressions using a C-like
             syntax, converting the result into a binary number. Expressions may
             include variables and vector elements.
   bin2a.f   Converts a binary number into a character string (inverse function to a2bin.f).
   mkilst.f  Generates a list of integers for looping constructs
             (see the integer list syntax manual).

rdfiln also maintains a table of character variables. It is kept in the character array ctbl, and is passed as an argument to rdfiln.

Note:  the ASCII representation of a floating-point expression is represented to 8 or 9 decimal places; thus it has less precision than the binary form. For example, ‘{1.2345678987654e-8}’  is turned into 1.2345679e-8.