Rpcalc Lexer
Previous: <Rpcalc Rules=>RpcalcRulf> * Next: <Rpcalc Main=>RpcalcMaio> * Up: <RPN Calc=>RPNCalc>

#Wrap on
{fH4}The {fCode}rpcalc{f} Lexical Analyzer{f}

The lexical analyzer's job is low-level parsing: converting characters or
sequences of characters into tokens.  The Bison parser gets its tokens by
calling the lexical analyzer.  \*Note <Lexical=>Lexical>: The Lexical Analyzer Function {fCode}yylex{f}.

Only a simple lexical analyzer is needed for the RPN calculator.  This
lexical analyzer skips blanks and tabs, then reads in numbers as
{fCode}double{f} and returns them as {fCode}NUM{f} tokens.  Any other character
that isn't part of a number is a separate token.  Note that the token code
for such a single-character token is the character itself.

The return value of the lexical analyzer function is a numeric code which
represents a token type.  The same text used in Bison rules to stand for
this token type is also a C expression for the numeric code for the type.
This works in two ways.  If the token type is a character literal, then its
numeric code is the ASCII code for that character; you can use the same
character literal in the lexical analyzer to express the number.  If the
token type is an identifier, that identifier is defined by Bison as a C
macro whose definition is the appropriate number.  In this example,
therefore, {fCode}NUM{f} becomes a macro for {fCode}yylex{f} to use.
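
As a rough sketch of the two cases, the fragment below treats character literals and a {fCode}NUM{f} macro as interchangeable integer codes.  The value 258 and the helper {fCode}describe_token{f} are assumptions made for illustration only; in a real parser Bison chooses the number for {fCode}NUM{f} and emits the macro itself.

```c
/* Hypothetical stand-in for the macro Bison would define for NUM.
   The real value is chosen by Bison; 258 is only an assumption.  */
#define NUM 258

/* Illustrative helper (not part of Bison): map a token code to a
   short description.  Character literals and NUM are both plain
   integer codes, so they can share one switch statement.  */
const char *
describe_token (int code)
{
  switch (code)
    {
    case 0:
      return "end of input";
    case NUM:
      return "NUM";
    case '+':                 /* the literal is its own token code */
      return "plus";
    case '\n':
      return "newline";
    default:
      return "other character";
    }
}
```

Because {fCode}NUM{f} lies outside the range of single-character codes in this sketch, it cannot collide with any character literal.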

The semantic value of the token (if it has one) is stored into the global
variable {fCode}yylval{f}, which is where the Bison parser will look for it.
(The C data type of {fCode}yylval{f} is {fCode}YYSTYPE{f}, which was defined
at the beginning of the grammar; \*Note <Rpcalc Decls=>RpcalcDecm>: Declarations for {fCode}rpcalc{f}.)
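
The hand-off through {fCode}yylval{f} can be sketched in isolation as follows.  Here {fCode}YYSTYPE{f} is defined as {fCode}double{f} to match the {fCode}rpcalc{f} declarations, while the value 258 for {fCode}NUM{f}, the local definition of {fCode}yylval{f}, and the helper {fCode}return_number{f} are assumptions for the sake of a standalone example; in a real parser the Bison output supplies the first two.

```c
#define YYSTYPE double   /* as in the rpcalc declarations */
#define NUM 258          /* hypothetical; Bison assigns the real value */

/* Defined here only so the sketch stands alone; normally the
   Bison-generated parser provides this variable.  */
YYSTYPE yylval;

/* Illustrative helper: store the semantic value where the parser
   will look for it, then return the token type code.  */
int
return_number (double value)
{
  yylval = value;
  return NUM;
}
```

The token type travels in the return value and the semantic value travels in the global variable, so the two are always set together.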

When end of file is encountered, the lexical analyzer returns a token
type code of zero.  (Bison recognizes any nonpositive value as indicating
the end of the input.)
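
To illustrate the end-of-input convention, the sketch below drives a lexer until it returns a nonpositive value.  Both {fCode}stub_lexer{f} and {fCode}count_tokens{f} are hypothetical helpers written for this example; they do not show how the generated parser is implemented, only the contract it relies on.

```c
/* Hypothetical stub lexer: hands out the characters of a string one
   at a time and returns 0 once the string is exhausted.  */
static const char *stub_text = "";

void
stub_set_text (const char *text)
{
  stub_text = text;
}

int
stub_lexer (void)
{
  if (*stub_text == '\0')
    return 0;                         /* end of input */
  return (unsigned char) *stub_text++;
}

/* Call the lexer until it reports a nonpositive code, counting the
   tokens seen before that point.  */
int
count_tokens (int (*lexer) (void))
{
  int count = 0;

  while (lexer () > 0)
    count++;
  return count;
}
```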

Here is the code for the lexical analyzer:

#Wrap off
#fCode
\/\* Lexical analyzer stores a double floating point
   number in yylval and returns the token NUM, or returns
   the ASCII character read if not a number.  Skips all
   blanks and tabs, returns 0 for EOF. \*\/

\#include <ctype.h>
\#include <stdio.h>

int
yylex (void)
\{
  int c;

  \/\* skip white space  \*\/
  while ((c = getchar ()) == ' ' || c == '\\t')  
    ;
  \/\* process numbers   \*\/
  if (c == '.' || isdigit (c))                
    \{
      ungetc (c, stdin);
      scanf ("%lf", &yylval);
      return NUM;
    \}
  \/\* return end-of-file  \*\/
  if (c == EOF)                            
    return 0;
  \/\* return single chars \*\/
  return c;                                
\}
#f
#Wrap on
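
For experimenting with the same logic away from standard input, here is a hedged variant that scans a string and uses {fCode}strtod{f} in place of {fCode}scanf{f}.  The names {fCode}lex_set_input{f} and {fCode}yylex_string{f}, the value 258 for {fCode}NUM{f}, and the local definition of {fCode}yylval{f} are all assumptions of this sketch and are not part of {fCode}rpcalc{f}.

```c
#include <ctype.h>
#include <stdlib.h>

#define YYSTYPE double   /* matches the rpcalc declarations */
#define NUM 258          /* hypothetical; Bison picks the real value */

YYSTYPE yylval;          /* normally supplied by the Bison parser */

/* Hypothetical cursor into the text being scanned.  */
static const char *lex_input = "";

void
lex_set_input (const char *text)
{
  lex_input = text;
}

int
yylex_string (void)
{
  /* skip blanks and tabs */
  while (*lex_input == ' ' || *lex_input == '\t')
    lex_input++;

  /* the end of the string plays the role of end-of-file */
  if (*lex_input == '\0')
    return 0;

  /* process numbers */
  if (*lex_input == '.' || isdigit ((unsigned char) *lex_input))
    {
      char *end;
      double value = strtod (lex_input, &end);

      if (end != lex_input)   /* at least one character was converted */
        {
          yylval = value;
          lex_input = end;
          return NUM;
        }
      /* a lone '.' is not a number; return it as a character below */
    }

  /* return single chars */
  return (unsigned char) *lex_input++;
}
```

After {fCode}lex_set_input ("3.5 4 +\\n"){f}, successive calls would return {fCode}NUM{f} (with 3.5 in {fCode}yylval{f}), {fCode}NUM{f} (with 4), {fCode}'+'{f}, the newline character, and finally 0.  Unlike the {fCode}scanf{f} call, this variant checks whether a number was actually converted before returning {fCode}NUM{f}.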

