to:elamr@paradise.net.nz,parabola@paradise.net.nz
subject:MegaByte/Explanatn of Linear Regression PlotY-x

13.10.01   09:40
my file  > donmcdonald pgms programsPd statistics.SPRegr
// ILinRegTx.

Dear Richard Elam,
I hope you are well again soon.

Below is my explanation of the Linear Regression program
PLOTY-X, for Megabyte.
Alternatively, just include my graphs and newspaper letter
(sent by separate email or snail mail).

Thank you

Regards Don McDonald.

I hope your suitable address is in Phone Book.--no!!
elam rc & pm  76  karimu st stkvly  938  6891.
overlooked in 2001/02 book!!!

Posted 13-10-01   12 noon     v/  v/
Should be delivered Mon. 15 Oct., but that doesn't give much time
before the October 2001 meeting (3rd Wednesday).

389  6820.

    1992, TUES 7 April.  REF:
    DSMCDON3/92. LINREG, 1050-1350, 
    (596 words.) 16.5.92, 2255-2315,  
    20.5.92, 1750, (748 words.)

    (Richard Elam NZPCA 13-10-01.)***  
             copied to DSMCDON3/92 
    disk. 6.6.92.  
      
    to space out text and re-save.      
       (779 words.)  
     Also spool to SeriCalc disk, as 
    "LinRegTEXT",  4.6.93.  
       
    Editor, Beeblet,   
    BBC/Acorn Computer User Group NZ 
    INC Pte Box 9592, Te Aro,  
     WGTN.   (Mr Peter Browne )  
      
      
    Linear Regression plot of 'Y' vs. 
    'X'.  
      
    by Don S McDonald, 63/ 3 
    Hutchison Rd, WELLINGTON 2,  
     Tel. 0-4- 389.6820,  N.Z.  
      
      
      
    BBC BASIC 2 Program "PLOTY-X", by D 
    S  McDonald, Wellington, can plot  
    any variable versus any variable on 
    a fully labelled X-Y graph, fit  
    the best ("least squares") straight 
    line to the plot, and print the  
    regression equation, etc.  
      
      
    Example.  The following are 
    hypothetical data for physical  
    characteristics of children, age 1 
    to 8 years.  
      
    AGE      HEIGHT    WEIGHT  
      
    1        1.6       1  
    2        2.2       1.5  
    3        2.5       2  
    4        3.0       2.8  
    5        U         3.6  
    6        3.7       4.4  
    7        3.9       U  
    8        4.2       5  
      
    There are 8 rows of data (sets, or 
    observations).  The column  
    variables are: Age, Height, Weight. 

     
      
      The first column, "AGE", is a 
    sequence, starting at 1 and  
    increasing by 1 each term, for a 
    total of 8 terms.  
      
      There are two incomplete 
    observations.  Height corresponding 
    to age 5 was not obtained.  Weight 
    corresponding to age 7 is also 
    missing.  
      
    Height can be plotted against Age, 
    or Weight against Height, or Weight
    against Age, etc.  
      
    Program "PLOTY-X" can do all the 
    above things and more.  It can  
    evaluate a variable name as a 
    function.  E.g.  Plot  "I*I" 
    against "X", a quadratic curve.  
      
                                        
      
    These are the features of Program 
    PLOTY-X.  
      
      
    160 lines of BBC BASIC 2 (now 
    BASIC V).  No lines of assembler 
    or machine code.  
      
    BBC Model B microcomputer with 32 
    kilobytes RAM (originally; now 
    RISC OS).  
      
    Disc Graphic dump utility, *SDUMP.  
     (NOW *SCREENSAVE) 
      
    Program execution does NOT write to 
    disk.  
      
    Graph Title: Plot (Yname) vs 
    (Xname), PGM PLOTY-X.  
      
    Named and Scaled Axes.  
      
    MODE 4 Graphic.  
      
    Full range of REALs, +/- 2^127.  
      
    (Rounded numbers on axes.  YES !) 
      
    4 x 4 Grid.  

    Graph window, +- Xmin, Xmax,  Ymin, 
    Ymax.  
      
    Scatter-plot symbol, "o"  
      
    Fit least squares regression line - 
    Optional.  (Minimise the sum of 
    squared deviations from the line, 
    SUM (y-y')^2, where y' is the 
    predicted value.)  
    Cross-product, XPROD matrix.  
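
    The least squares fit can be
    sketched in a few lines.  This is
    Python, not the BBC BASIC of
    PLOTY-X, and the name fit_line is
    hypothetical.  The sums used are
    the usual sums of squares and
    cross-products; whether they match
    the layout of the program's XPROD
    matrix is an assumption.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    sx = sum(xs)                                  # sum of x
    sy = sum(ys)                                  # sum of y
    sxx = sum(x * x for x in xs)                  # sum of x squared
    sxy = sum(x * y for x, y in zip(xs, ys))      # sum of cross-products
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are equal; slope is undefined")
    b = (n * sxy - sx * sy) / denom               # slope
    a = (sy - b * sx) / n                         # intercept at x = 0
    return a, b


# Height against Age from the example table; age 5 (height missing) is omitted.
age = [1, 2, 3, 4, 6, 7, 8]
height = [1.6, 2.2, 2.5, 3.0, 3.7, 3.9, 4.2]

a, b = fit_line(age, height)
residuals = [y - (a + b * x) for x, y in zip(age, height)]  # actual - predicted
print(f"Height = {a:.3f} + {b:.3f} * Age")
```

    A least squares line with an
    intercept always leaves residuals
    that sum to zero (within rounding),
    which is a handy check on the
    arithmetic.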
      
    Print Rows:  X, Y, Y-predicted, 
    Residual = ACTUAL - PREDICTED, 
    (Cumulative ??)  
      
    Define arithmetic sequence in 
    column 0.  Variable name (X), Start 
    value, Step, Length.  E.g. Year, 
    1975, step 5, for total 6 terms.  
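
    The Year example works out as
    follows.  A minimal Python sketch;
    the function name is hypothetical
    and is not taken from PLOTY-X.

```python
def arithmetic_sequence(start, step, length):
    """Return `length` terms, beginning at `start` and rising by `step`."""
    return [start + step * k for k in range(length)]


years = arithmetic_sequence(1975, 5, 6)
print(years)  # [1975, 1980, 1985, 1990, 1995, 2000]
```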
      
    Correlation and simple regression 
    equation in intercept-slope form:  
        Y = A + B.X  
    where A is the intercept (the value 
    of Y at X = zero) and B is the 
    slope.  
    Y0, Y1: left and right intercepts, 
    where the line meets the margins of 
    the plot.  
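
    A Python sketch of the correlation
    coefficient and of the two margin
    values.  It assumes Y0 and Y1 mean
    the height of the fitted line at
    the left and right edges of the
    plot (X = Xmin and X = Xmax); the
    function names are hypothetical.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r between two equal-length columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a column with no variation has no correlation")
    return sxy / math.sqrt(sxx * syy)


def margin_values(a, b, xmin, xmax):
    """Height of the line Y = a + b*X at the left and right plot margins."""
    return a + b * xmin, a + b * xmax


r = correlation([1, 2, 3, 4], [2, 4, 6, 8])   # points exactly on a rising line
y0, y1 = margin_values(1, 2, 0, 4)            # line Y = 1 + 2X over 0..4
print(round(r, 6), y0, y1)  # 1.0 1 9
```

    Drawing the regression line then
    needs only one straight stroke from
    (Xmin, Y0) to (Xmax, Y1).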
      
    Print column statistics: 
    Minimum, Maximum, Range, Total, 
    Mean, No. of observations, No. 
    missing / undefined, Standard 
    deviation.  
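
    The column statistics could be
    computed along these lines.  This
    Python sketch marks a missing value
    with None and uses the sample
    standard deviation (n - 1 divisor);
    both choices are assumptions, as
    the text does not say which
    divisor PLOTY-X uses.

```python
import statistics


def column_stats(values):
    """Summarise one column; None marks a missing ("U") entry."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no defined values")
    total = sum(present)
    return {
        "min": min(present),
        "max": max(present),
        "range": max(present) - min(present),
        "total": total,
        "mean": total / len(present),
        "n": len(present),
        "missing": len(values) - len(present),
        # Sample standard deviation (n - 1 divisor): an assumption.
        "sd": statistics.stdev(present) if len(present) > 1 else 0.0,
    }


# Weight column from the example table; age 7 is missing.
weight = [1, 1.5, 2, 2.8, 3.6, 4.4, None, 5]
stats = column_stats(weight)
print(stats["n"], stats["missing"], stats["min"], stats["max"])  # 7 1 1 5
```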
      
    Dump Plot.  Requires the *SDUMP 
    program.  (On the Archimedes, use 
    *SCREENSAVE.)  
      
    Data - Column no., variable name, 
    values, U.ndefined,  
       data terminator "*" - stored in 
    program.  
      
    Array,  X(row  I, column no. A%)  
    Printout data  
      
    Any number of observations (rows of 
    data.)  
      
    Maximum of 10 column variables, 
    selectable by "Hit Key" 0..9 - more 
    columns if you change to :  INPUT 
    "Enter column no.";   (Yes.) 
      
    Omit missing data ("U"), identified 
    by equality with a specific, 
    'improbable' real value supplied by 
    the user.  E.g. 0 = Undefined, or 
    -1E30 = Undefined, etc.  
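
    The idea of an 'improbable' marker
    value can be sketched in Python
    with the example table.  The names
    UNDEFINED and complete_pairs are
    hypothetical; -1E30 is one of the
    marker values suggested above.

```python
UNDEFINED = -1e30  # improbable value standing in for "U"

age = [1, 2, 3, 4, 5, 6, 7, 8]
height = [1.6, 2.2, 2.5, 3.0, UNDEFINED, 3.7, 3.9, 4.2]
weight = [1, 1.5, 2, 2.8, 3.6, 4.4, UNDEFINED, 5]


def complete_pairs(xs, ys, undefined=UNDEFINED):
    """Keep only the rows where both X and Y are defined."""
    return [(x, y) for x, y in zip(xs, ys)
            if x != undefined and y != undefined]


print(len(complete_pairs(age, height)))     # 7  (age 5 dropped)
print(len(complete_pairs(height, weight)))  # 6  (ages 5 and 7 dropped)

# With 0 as the marker, a genuine zero reading is discarded as well.
print(complete_pairs([0, 1], [5, 6], undefined=0))  # [(1, 6)]
```

    The last line shows why the marker
    should be a value that cannot occur
    in the real data.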

      
    Select 'X vs X', or  'Y vs X' by 
    column nos.  
    Repeat.  
      
    Fast - less than 1 minute for 30 
    observations.  (BBC Model B.)  
      
    Calculate / Plot function of I.  
    DATA - negative column no., 
    variable definition.  
      
    Auto-correlation  (xx ???)  
      
      
      
    Procedures.  
      
      
    PROCSEQUENCE, generate arithmetic 
    sequence variable in column 0, e.g. 
    Year.  1980, 1982, 1984, 1986, 
    1988, 1990.  
      
    PROCDATENTRY, read data per DATA 
    statements.  
      
    PROCMINMAX, display variable column 
    nos., names, Min, Max, Range.  
      Select Y-variable,  X-variable.  
      
    PROCSCATTERPLOT,  Plot X-Y, title, 
    Axes, Grid.  
      
    PROCREGRESSION(PRT),   
    PRT=0, calculate regression line 
    and superimpose on scatter-plot.  
    PRT=-1, printout regression 
    equation, standard deviations, 
    statistics.  
      
    FNX(I) = EVAL VAR$(A%),  evaluate 
    variable name as function value of 
    I  
    - row/observation no.  E.g. I^2, 
    quadratic.  
    ----  
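
    The FNX idea, evaluating a variable
    definition such as "I*I" for each
    row number I, has a rough Python
    counterpart.  This is a sketch
    only: make_fnx is a hypothetical
    name, "^" is rewritten to Python's
    "**", and only SQR and LN are made
    available.  Python's eval, like
    BASIC's EVAL, should only be given
    expressions you trust.

```python
import math


def make_fnx(expression):
    """Return a function of I that evaluates `expression`, e.g. "I*I"."""
    # BBC BASIC writes powers as I^2; Python writes I**2.
    code = compile(expression.replace("^", "**"), "<variable definition>", "eval")
    # Empty builtins plus two maths functions under their BASIC names.
    allowed = {"__builtins__": {}, "SQR": math.sqrt, "LN": math.log}

    def fnx(i):
        return eval(code, allowed, {"I": i})

    return fnx


square = make_fnx("I^2")
print([square(i) for i in range(1, 6)])  # [1, 4, 9, 16, 25]
```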
      
      
    Cautions.  
      
      Data in the program should be 
    reasonably consistent with the 
    number of rows / columns entered by 
    keyboard input.  N.B. Keyboard 
    input sets the DIMension of the 
    array, i.e. DIM X(rows, cols) 
    should accommodate the data 
    specified within the program.  
      
    Unwanted data is easily cleared by 
    deleting appropriate DATA  
    statement(s).  
      
    If the Data array is "zeroised", 
    then zero ("0") is not 
    generally/always considered to 
    equal "U-ndefined."  
      
      
    The Author, Don S  McDonald, has a 
    BBC-B microcomputer with 32 
    kilobytes RAM, the *BASIC2 
    programming language and *IWORD 
    word-processor in ROM, *DFS (Disc 
    Filing System), and a 5.25 in DSDD 
    floppy disk drive.  
      
    The Author also has an Acorn 
    Archimedes A4000 computer (Easter 
    93).  
      
      
      
                                        
    References.    
      
      
    F J Andrew, "Generalised Plot 
    program", Beeblet 92/3.  (John 
    Andrew's program does not claim the 
    features offered by the present 
    program by D S  McDonald.)  
      
    IBM Compatible software - Borland's 
    Quattro-Pro. (Latest publicity  
    mentions "(with) Regression.")  
      
    Vogel Computing, ICES - STATS.  
    Statistics / data manipulation /  
    graphing package.   (Works 
    Division, Government, Wellington, 
    NZ.) 
      
    S.A.S. - Statistical analysis 
    system.  Originally main-frame 
    computer data analysis package.  
      
    ==================       
    oooooooooooo          
