<HTML>
<HEAD>
<TITLE> Rlab2 Reference Manual: User Functions</TITLE>
</HEAD>
<BODY>
<A HREF="rlab-ref-5.html"><IMG SRC="prev.gif" ALT="Previous"></A>
<A HREF="rlab-ref-7.html"><IMG SRC="next.gif" ALT="Next"></A>
<A HREF="rlab-ref.html#toc6"><IMG SRC="toc.gif" ALT="Contents"></A>
<HR>
<H2><A NAME="s6">6. User Functions</A></H2>



<P>Functions are almost first class variables in Rlab. Functions are
stored as ordinary variables in the symbol-table; this explains the
unusual syntax for creating and storing a function. User written
functions are a key feature in any high level language. Mastering
their usage is key to getting the most out of Rlab.</P>
<P>User functions are created within the Rlab programming
environment... that is the major distinction between user and
builtin functions. Otherwise, there is no difference in the usage of
builtin and user-written functions.</P>
<P>The argument passing and function scoping rules are fairly
straightforward and simple. The primary objective is ease of use,
and safety, without compromising capability or efficiency.</P>
<P>The two inter-related concepts important to understanding functions
are <EM>execution-environments</EM>, and <EM>variable-scopes</EM>.  An
<EM>execution-environment</EM> consists of the variables available to
the currently executing statements. A variable's scope refers to the
environment that variable is bound to. For instance statements
executed from the command line are always executed in the
global-environment. The global environment consists of all variables
and functions created within that environment. Thus, a statement
executed in the global environment only has access to global
variables.</P>
<P>All function arguments, and variables created within functions are
visible only within the function's environment, or
<EM>local-scope</EM>. Conversely, function statements do not have
access to variables in the global-scope (unless certain steps are
taken).</P>
<P>Furthermore, function arguments can be considered local variables
(in simplistic sense). Any modifications to a function's arguments
are only visible within the function's environment. This is commonly
referred to as "pass by value" (as opposed to "pass by
reference"). Pass by value means that a variable's value (as opposed
to the variable itself) is passed to the function's
environment. Thus, when a function argument is modified, only a copy
of its value is affected, and not the original variable.</P>
<P>This separation of environments, or scopes, ensures a "safe"
programming environment. Function writers will not accidentally
clobber a global variable with a function statement, and users will
not accidentally impair the operation of a function with global
statements. </P>


<H2><A NAME="ss6.1">6.1 Function Syntax</A></H2>


<P>The function definition syntax is:</P>
<P>
<BLOCKQUOTE>
function ( <EM>arg1</EM> , <EM>arg2</EM> , ... <EM>argN</EM> )
{
<EM>statements</EM>
}
</BLOCKQUOTE>
</P>
<P>Of course, the function must be assigned to a variable in order for
it to be usable:</P>
<P>
<BLOCKQUOTE>
<EM>var</EM> = function ( <EM>arg1</EM> , <EM>arg2</EM> , ... <EM>argN</EM> )
{
<EM>statements</EM>
}
</BLOCKQUOTE>
</P>
<P>The number of arguments, and their names are arbitrary (actually
there is a 32 argument limit. If anyone ever hits that limit, it can
be easily increased). The function can be used with fewer arguments
than specified, but not more.</P>
<P>The function <EM>statements</EM> are any valid Rlab statements, with
the exception of other function definitions. The optional
<EM>return-statement</EM> is peculiar to functions, and allows the
program to return data to the calling environment.</P>



<H2><A NAME="ss6.2">6.2 Using Functions</A></H2>


<P>Every invocation of a function returns a value. Normally the user
determines the return-value of a function. However, if a return
statement does not exist, the function return value will default to
zero. Since function invocations are expressions, many syntactical
shortcuts can be taken. For example, if a function returns a list,
like the <CODE>eig</CODE> function:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; eig(a)
   val          vec          
&gt; eig(a).val
   -0.677      0.469       1.44  
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P><CODE>eig</CODE> returns a list containing the elements <CODE>val</CODE> and
<CODE>vec</CODE>; if we just want to get at one of the list elements, such
as <CODE>val</CODE>, we can extract it directly. The same sort of shortcut
can be used when a function returns matrices:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; size(rand(20,30))
       20         30  
&gt; size(rand(20,30))[2]
       30  
</PRE>
</CODE></BLOCKQUOTE>
</P>




<H2><A NAME="ss6.3">6.3 Function Arguments</A></H2>


<P>As mentioned earlier, functions can have an arbitrary number of
arguments. When program execution is passed to a user-function, all
or some, or none of the arguments can be specified. Arguments that
are not specified are replaced with UNDEFINED objects. How well the
user-functions behaves when invoked with varying argument lists is
entirely up to the user/programmer. If you don't want to specify the
last N arguments to a function, just truncate the argument list. If
you don't want to specify arguments that are in the beginning, or
middle of the complete list of arguments, use commas. For example, a
user-function defined like:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; x = function ( arg1 , arg2 , arg3 , arg4 )
  {
    if (exist (arg1)) { arg1 }
    if (exist (arg2)) { arg2 }
    if (exist (arg3)) { arg3 }
    if (exist (arg4)) { arg4 }
  }
        &lt;user-function&gt;
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Can be used as follows:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; x ( );
&gt; x ( pi );
     3.14  
&gt; x ( pi, 2*pi );
     3.14  
     6.28  
&gt; x ( , , , 4*pi );
     12.6  
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>In the first instance, no arguments are specified, and the function
does nothing. In the second instance, only the first argument is
specified, and it is echoed to the screen. In the third instance,
the first and second arguments are specified, and are properly
echoed. In the last case, only the fourth argument is specified.</P>
<P>Each function has a local variable named <CODE>nargs</CODE>
available. <CODE>nargs</CODE> is automatically set to the number of
arguments passed to a function for each invocation. For example:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; x = function ( a, b, c, d, e, f, g ) { return nargs; }
        &lt;user-function&gt;
&gt; x(1,2,3)
        3
&gt; x(1,,2)
        3
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This simple one-line function demonstrates how the <CODE>nargs</CODE>
variable is initialized. Notice that while nargs correctly reports
the number of arguments in the first instance, the results from the
second trial may seem a little confusing. The variable <CODE>nargs</CODE>
is intended to provide programmers with the number of arguments
passed to a user-function, and to provide some degree of
compatibility with other popular very high-level languages. However,
the existence of the variable <CODE>nargs</CODE> and the ability to
<EM>skip</EM> arguments conflict and make the meaning of <CODE>nargs</CODE>
ambiguous. The bottom line is: <CODE>nargs</CODE> has the value of the
argument number of the last <EM>specified</EM> argument.</P>
<P>A better method of determining which arguments were explicitly
specified, and where were not is the use the builtin function
<CODE>exist</CODE>. For example:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; x = function ( a , b, c, d, e, f, g )
  {
    if (exist (a)) { &quot;1st argument was specified&quot; }
    if (!exist (a)) { &quot;1st argument was not specified&quot; }
  }
        &lt;user-function&gt;
&gt; x(2);
1st argument was specified  
&gt; x(,3);
1st argument was not specified  
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This method is often used to decide if an argument's default value
needs to be used. Another example:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; # compute a partial sum of a vector...
&gt; psum = function ( v , n )
  {
    if (!exist (n)) { n = length (v); }
    return sum (v[1:n]);
  }
        &lt;user-function&gt;
&gt; v = rand(1024,1);
&gt; psum(v)
      497  
&gt; psum(v,5)
     2.99  
</PRE>
</CODE></BLOCKQUOTE>
</P>



<H2><A NAME="ss6.4">6.4 Function Variable Scoping Rules</A></H2>


<P>Although the subject of function variable scoping has already been
addressed, it deserves more attention. Rlab utilizes a pass by value
scheme for all function arguments. In many interpretive languages,
this means a copy of the function arguments is made prior to
function execution. This practice can be very time and memory
inefficient, especially when large matrices or list-structures are
function arguments. Rlab does <EM>not</EM> copy function arguments,
unless the function modifies the arguments.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; x = function ( a ) { global(A) a = 2*a; A ? return a; }
        &lt;user-function&gt;
&gt; A = 2*pi
     6.28  
&gt; x(A)
     6.28  
     12.6  
&gt; A
     6.28  
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>In the above example the function <CODE>x</CODE> modifies its argument
<CODE>a</CODE>. Before modifying its argument, the function prints the
value of the global variable <CODE>A</CODE>. <CODE>A</CODE> is assigned the
value of <CODE>2*pi</CODE>, and <CODE>x</CODE> is called with argument
<CODE>A</CODE>. After <CODE>x</CODE> modifies its argument, <CODE>A</CODE> is
printed; note that the value of <CODE>A</CODE> is not changed. Printing
the value of <CODE>A</CODE> afterwards verifies that its value has not
changed. </P>



<H2><A NAME="ss6.5">6.5 Function Recursion</A></H2>


<P>Functions can be used recursively. Each invocation of a function
creates a unique environment. Probably the most popular example for
displaying this type of behavior is the factorial function:</P>
<P>
<HR>
<PRE>
//
// Fac.r (the slowest fac(), but the most interesting)
//

fac = function(a) 
{
  if(a &lt;= 1) 
  {
      return 1;
  else
      return a*$self(a-1);
  }
};
</PRE>
<HR>
</P>
<P>The special syntax: <CODE>$self</CODE> can be used within recursive
functions to protect against the possibility of the function being
renamed. </P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; fac(4)
       24  
</PRE>
</CODE></BLOCKQUOTE>
</P>



<H2><A NAME="ss6.6">6.6 Loading Functions and Function Dependencies</A></H2>


<P>Functions definitions are treated just like any other valid
statements. Since function definitions are mostly stored in files
for re-use, you will need to know how to load the program statements
within a file into Rlab's workspace. There are three main methods
for performing this task:</P>
<P>
<UL>
<LI> The <CODE>load</CODE> function reads the argument (a string), and
reads that file, interpreting the file's contents as Rlab
program(s). The file is read, and interpreted, regardless of
file modification or access times.
</LI>
<LI> the <CODE>rfile</CODE> statement is a short-cut to the <CODE>load</CODE>
statement. Additionally, <CODE>rfile</CODE> searches a pre-defined
path of directories to files (ending in <CODE>.r</CODE>) to
load. The file is read, and interpreted, regardless of file
modification or access times.
</LI>
<LI> The <CODE>require</CODE> statement, is similar to the <CODE>rfile</CODE>
statement, except it only loads the specified file
once. <CODE>require</CODE> looks for a user-function with the same
name as the argument to <CODE>require</CODE>, loading the file if,
and only if, a user-function of the same name cannot be
found. File access and/or modification times are not
considered. </LI>
</UL>
</P>

<H3>How a function is Loaded</H3>


<P>Some discussion of the "behind the scenes work" that occurs when a
function is loaded is warranted. Although not absolutely necessary,
understanding this process will help the reader write more
complicated programs efficiently.</P>
<P>
<UL>
<LI> The file is opened, and its contents are read.
</LI>
<LI> As the program statements are read, they are compiled and
executed. Thus, as Rlab encounters global program statements,
they are executed. Function definitions (assignments) are
global program statements. So... function definitions get
compiled, and stored as they are read.
</LI>
<LI> Variable (this includes) functions references are resolved
at compile-time. A variable can be UNDEFINED as long as it is
resolved before statement execution. Thus a function can
reference other functions, that have not been previously
defined; as long as these references are resolved before the
function is executed.
</LI>
<LI> When the file's end mark (EOF) is encountered, the file is
closed, and execution returns to its previous environment. </LI>
</UL>
</P>
<P>Some examples will make the previous remarks more meaningful.</P>

<H3>Example-1</H3>


<P>The following file contains a trivial function
definition/assignment. Note that the variable (function) <CODE>y</CODE> is
undefined.</P>
<P>
<HR>
<PRE>
x = function ( A )
{
  return y(A);
};
</PRE>
<HR>
</P>
<P>Now, we will load the file (<CODE>ex-1.r</CODE>). Note that there are no
apparent problems until the function is actually executed.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; load(&quot;./ex-1.r&quot;);
&gt; x(3)
cannot use an undefined variable as a function
ERROR: ../rlab: function undefined
        near line 3, file: ./ex-1.r
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Now, we can define <CODE>y</CODE> to be a function, and <CODE>x</CODE> will
execute as expected.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; y = function ( A ) { return 3*A; };
&gt; x(3)
        9  
</PRE>
</CODE></BLOCKQUOTE>
</P>

<H3>Example-2</H3>


<P>Understanding how files are loaded will help the efficiency of your
functions. For example, functions are often written which depend
upon other functions. Resolving the dependencies within the file
that contains the function is common, for example, lets use the
previous file (<CODE>ex-1.r</CODE>). We know that the function <CODE>x</CODE>
depends upon the function <CODE>y</CODE>, so we will put a <CODE>rfile</CODE>
statement within the file <CODE>ex-1.r</CODE>, which we will now call
<CODE>ex-2.r</CODE>. </P>
<P>
<HR>
<PRE>
rfile y

x = function ( A )
{
  return y(A);
};
</PRE>
<HR>
</P>
<P>Note that the <CODE>rfile</CODE> statement is exterior to the function; it
is a global program statement. Thus, the file <CODE>y.r</CODE> will get
loaded once, and only once when the file <CODE>ex-2.r</CODE> is loaded.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
&gt; rfile ex-2
&gt; x(4)
       12  
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>We could have put the <CODE>rfile</CODE> or <CODE>load</CODE> statement within
the function definition. But, then the file <CODE>y.r</CODE> would get
loaded every time the function <CODE>x</CODE> was executed! Performance
would be worse than if the <CODE>rfile</CODE> statement was a global
statement, especially if the function was used within a for-loop!</P>



<H3>Restrictions</H3>


<P>Although there has been no reason for the user to infer that there
are limitations on the number of functions per file, or the way
function comments must be formatted, the fact that there are no
restrictions in these matters must be explicitly stated because
other interpreted languages often make such restrictions.</P>
<P>
<UL>
<LI> There are no restrictions on the number of function definitions
within a single file.
</LI>
<LI> There are no restrictions on how a functions comments must be
formatted to work with the online help system. When the help
command is used, the <EM>entire</EM> contents of the named file
are displayed on the screen (usually via a pager). The user
can view comments at the top of the file, or look at the
entire function.
</LI>
<LI> There are no restrictions on the type of programs stored in
files. Simple user-functions can be stored, as well as global
program statements, or any mixture of the two.
</LI>
</UL>
</P>




<H2><A NAME="ss6.7">6.7 Static Variables</A></H2>


<P>Static variables, or file-static variables are variables that are
only visible within the file in which they are declared. The syntax
for the static-declaration is:</P>
<P>
<BLOCKQUOTE>
static ( <EM>var1</EM>, <EM>var2</EM>, ... <EM>varN</EM> )
</BLOCKQUOTE>
</P>
<P>File-static variables are accessible by all program statements after
the static declaration. All functions have access to file-static
variables as if they were local variables, no special declarations
are necessary.</P>
<P>Since functions are variables, functions can be hidden within a file
with the static declaration. These functions are only accessible to
program statements within that particular file. This technique is
very useful for writing function toolboxes or libraries, since it
eliminates the problem of global variables accidentally colliding
with toolbox variables.</P>



<H2><A NAME="ss6.8">6.8 Help for User Functions</A></H2>


<P>Although help for user-functions has already been eluded to, it is
worth covering in detail. By convention, all the documentation
necessary to use a function, or a library of functions is included
within the same file, in the form of comments. This practice is very
convenient since your function is never without
documentation. Rlab's online help system is very simple. when a user
types <CODE>help eig</CODE> (for example) Rlab looks for a help file named
<CODE>eig</CODE> in the designated help directory. If the file <CODE>eig</CODE>
cannot be found, the rfile search path is checked for files named
<CODE>eig.r</CODE>. If a file named <CODE>eig.r</CODE> is found, its entire
contents are displayed in the terminal window. Normally some sort of
paging program is used, such as <CODE>more(1)</CODE>, which allows the
user to terminate viewing the file at any time.</P>




<HR>
<A HREF="rlab-ref-5.html"><IMG SRC="prev.gif" ALT="Previous"></A>
<A HREF="rlab-ref-7.html"><IMG SRC="next.gif" ALT="Next"></A>
<A HREF="rlab-ref.html#toc6"><IMG SRC="toc.gif" ALT="Contents"></A>
</BODY>
</HTML>
