1992, TUES 7 April.  REF: DSMCDON3/92. LINREG, 1050-1350, (596 words.) 
16.5.92, 2255-2315,  20.5.92, 1750, (748 words.) 
 
         copied to DSMCDON3/92 disk. 6.6.92. 
 
to space out text and re-save.         (779 words.) 
 Also spool to SeriCalc disk, as "LinRegTEXT",  4.6.93. 
  
Editor, Beeblet,  
BBC/Acorn Computer User Group NZ INC 
Pte Box 9592, Te Aro, 
 WGTN.   (Mr Peter Browne ) 
 
 
Linear Regression plot of 'Y' vs. 'X'. 
 
by Don S McDonald, 63/ 5 Hut.chi.son Rd, WELLINGTON 2, 
 Tel. 0-4- 389.6820,  N.Z. 
 
 
 
BBC BASIC 2 Program "PLOTY-X", by D S  McDonald, Wellington, can plot 
any variable versus any variable on a fully labelled X-Y graph, fit 
the best ("least squares") straight line to the plot, and print the 
regression equation, etc. 
 
 
Example.  The following are hypothetical data for physical 
characteristics of children, age 1 to 8 years. 
 
AGE      HEIGHT    WEIGHT 
 
1        1.6       1 
2        2.2       1.5 
3        2.5       2 
4        3.0       2.8 
5        U         3.6 
6        3.7       4.4 
7        3.9       U 
8        4.2       5 
 
(U = undefined, i.e. missing.) 
 
There are 8 rows of data (sets, or observations).  The column 
variables are Age, Height and Weight. 
 
  The first column, "AGE", is a sequence, starting at 1 and 
increasing by 1 each term, for a total of 8 terms. 
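 
A sequence column of this kind is fully described by its start value, 
step and length.  The following is a minimal sketch in Python (not the 
BBC BASIC of PLOTY-X); the function name arithmetic_sequence is 
hypothetical and is used here only for illustration. 

```python
def arithmetic_sequence(start, step, length):
    """Return `length` terms beginning at `start`, each `step` above the last."""
    return [start + step * k for k in range(length)]


# The AGE column of the example: start 1, step 1, 8 terms.
age = arithmetic_sequence(1, 1, 8)
print(age)
```

The same call with start 1975, step 5 and length 6 would give a 
five-yearly "Year" column. 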
 
  There are two incomplete observations.  Height corresponding to age 
5 was not obtained.  Weight corresponding to age 7 is also missing. 
 
Height can be plotted against Age, or Weight against Height, or Weight 
against Age, etc. 
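 
Because of the two incomplete observations, each choice of Y and X 
leaves a different number of usable rows.  The sketch below, in Python 
rather than BBC BASIC, shows one way to keep only the complete pairs. 
It assumes that a missing value is represented by None; the name 
complete_pairs is hypothetical. 

```python
# None stands in for the missing entries of the example table.
AGE = [1, 2, 3, 4, 5, 6, 7, 8]
HEIGHT = [1.6, 2.2, 2.5, 3.0, None, 3.7, 3.9, 4.2]
WEIGHT = [1, 1.5, 2, 2.8, 3.6, 4.4, None, 5]


def complete_pairs(xs, ys):
    """Keep only the rows in which both the X and the Y value are defined."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]


height_vs_age = complete_pairs(AGE, HEIGHT)        # row 5 is dropped
weight_vs_height = complete_pairs(HEIGHT, WEIGHT)  # rows 5 and 7 are dropped
print(len(height_vs_age), len(weight_vs_height))
```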
 
Program "PLOTY-X" can do all the above things and more.  It can 
also treat a variable name as a function of the row number I and 
evaluate it.  E.g. plotting "I*I" against "X" gives a quadratic curve. 
 
 
These are the features of Program PLOTY-X. 
 
 
160 lines of BBC BASIC 2 (now BASIC V).  No Assembler / machine code. 
 
BBC Model B microcomputer with 32 kilobytes RAM (originally; now RISC OS). 
 
Disc graphic dump utility, *SDUMP (now *SCREENSAVE). 
 
Program execution does NOT write to disk. 
 
Graph Title: Plot (Yname) vs (Xname), PGM PLOTY-X. 
 
Named and Scaled Axes. 
 
MODE 4 Graphic. 
 
Full range of REALs, +/- 2^127. 
 
Rounded numbers on axes: yes. 
 
4 x 4 Grid. 
Graph window, +- Xmin, Xmax,  Ymin, Ymax. 
 
Scatter-plot symbol, "o" 
 
Fit least-squares regression line (optional): minimise the sum of 
squared deviations from the line, SUM (y-y')^2.  Uses the 
cross-product (XPROD) matrix. 
 
Print Rows:  X, Y, Y-predicted, Residual = ACTUAL - PREDICTED, (Cumulative ??) 
 
Define arithmetic sequence in column 0.  Variable name (X), Start 
value, Step, Length.  E.g. Year, 1975, step 5, for a total of 6 terms. 
 
Correlation and simple regression equation, in intercept-slope form: 
    Y = A + B.X 
where A is the intercept at X = 0 and B is the slope. 
Y0, Y1: left and right intercepts (at the margins of the plot). 
 
Print out column statistics: Minimum, Maximum, Range, Total, Mean, 
   No. of observations, No. missing / undefined, Standard deviation. 
 
Dump plot.  Requires the *SDUMP program; on the Archimedes, *SCREENSAVE. 
 
Data (column no., variable name, values, "U" for undefined, 
   data terminator "*") are stored in the program. 
 
Array,  X(row  I, column no. A%) 
Printout data 
 
Any number of observations (rows of data.) 
 
Maximum of 10 column variables, selectable by "Hit Key" 0..9 - more 
    columns if you change to :  INPUT "Enter column no.";   (Yes.)
 
Omit missing data ("U"), identified by equality with an 'improbable' 
    real value supplied by the user.  E.g. 0 = Undefined, 
    or -1E30 = Undefined, etc. 
 
Select 'X vs X', or  'Y vs X' by column nos. 
Repeat. 
 
Fast - less than 1 minute for 30 observations (BBC Model B). 
 
Calculate / plot a function of I.  DATA: negative column no., variable 
         definition. 
 
Auto-correlation  (xx ???) 
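 
The least-squares fit, correlation, printed rows and margin 
intercepts listed among the features above can be sketched as follows. 
This is Python, not the BBC BASIC of PLOTY-X, and it is an illustration 
under assumptions rather than the program's own code: the name 
fit_line is hypothetical, Y0 and Y1 are taken to be the fitted values 
at the left and right edges of the graph window (Xmin and Xmax), and 
the data are the Height vs Age pairs of the example with the incomplete 
row omitted. 

```python
import math


def fit_line(pairs):
    """Least-squares fit of Y = A + B*X.  Returns (A, B, r)."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    syy = sum(y * y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs)
    # Sums of squares and cross-products, corrected for the means.
    cxx = sxx - sx * sx / n
    cyy = syy - sy * sy / n
    cxy = sxy - sx * sy / n
    b = cxy / cxx                   # slope
    a = (sy - b * sx) / n           # intercept at X = 0
    r = cxy / math.sqrt(cxx * cyy)  # correlation coefficient
    return a, b, r


# Height vs Age, complete observations only.
pairs = [(1, 1.6), (2, 2.2), (3, 2.5), (4, 3.0), (6, 3.7), (7, 3.9), (8, 4.2)]
a, b, r = fit_line(pairs)
print(f"Y = {a:.3f} + {b:.3f}.X    r = {r:.3f}")

# Rows: X, Y, Y-predicted, Residual = actual - predicted.
rows = [(x, y, a + b * x, y - (a + b * x)) for x, y in pairs]
for x, y, predicted, residual in rows:
    print(f"{x:4} {y:6.2f} {predicted:8.3f} {residual:8.3f}")

# Assumed meaning of Y0, Y1: the line evaluated at the window margins.
xmin, xmax = 1, 8
y0, y1 = a + b * xmin, a + b * xmax
print(f"Y0 = {y0:.3f}   Y1 = {y1:.3f}")
```

A useful check on any such fit is that the residuals sum to zero, 
apart from rounding error. 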
 
 
 
Procedures. 
 
 
PROCSEQUENCE, generate arithmetic sequence variable in column 0, e.g. 
Year.  1980, 1982, 1984, 1986, 1988, 1990. 
 
PROCDATENTRY, read data from DATA statements. 
 
PROCMINMAX, display variable column nos., names, Min, Max, Range. 
  Select Y-variable,  X-variable. 
 
PROCSCATTERPLOT,  Plot X-Y, title, Axes, Grid. 
 
PROCREGRESSION(PRT),  
PRT=0, calculate regression line and superimpose on scatter-plot. 
PRT=-1, printout regression equation, standard deviations, statistics. 
 
FNX(I) = EVAL VAR$(A%),  evaluate variable name as function value of I 
- row/observation no.  E.g. I^2, quadratic. 
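 
The idea behind FNX, evaluating a stored variable definition as a 
function of the row number I, can be imitated in Python as sketched 
below.  This is not the BBC BASIC of PLOTY-X; the name make_fnx is 
hypothetical.  Note one difference of notation: BBC BASIC writes a 
power as I^2, whereas Python writes I**2 (in Python, ^ is a different 
operator). 

```python
def make_fnx(expression):
    """Return a function of the row number I that evaluates `expression`."""
    code = compile(expression, "<variable definition>", "eval")

    def fnx(i):
        # Only the name I is visible to the expression; builtins are withheld.
        return eval(code, {"__builtins__": {}}, {"I": i})

    return fnx


square = make_fnx("I*I")
column = [square(i) for i in range(1, 9)]
print(column)
```

As with any use of eval, the definitions should come from the 
program's own data, not from an untrusted source. 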
---- 
 
 
Cautions. 
 
  Data in the program should be reasonably consistent with the number 
of rows / columns entered by keyboard input.  N.B. keyboard input sets 
the DIMension of the array, i.e. DIM X(rows, cols) must be large 
enough to accommodate the data specified within the program. 
 
Unwanted data is easily cleared by deleting appropriate DATA 
statement(s). 
 
If the data array is "zeroised", then zero, "0", is not always 
considered to equal "U" (undefined). 
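 
The choice of the "undefined" value matters for the column statistics. 
The Python sketch below (not the BBC BASIC of PLOTY-X) skips any entry 
equal to a user-supplied sentinel.  The names UNDEFINED and 
column_stats are hypothetical, and the use of the sample standard 
deviation (divisor n-1) is an assumption, since the article does not 
say which form is used. 

```python
import math

UNDEFINED = -1e30  # hypothetical 'improbable' value marking missing data


def column_stats(values, undefined=UNDEFINED):
    """Summarise one column, skipping entries equal to the sentinel."""
    present = [v for v in values if v != undefined]
    n = len(present)
    if n == 0:
        raise ValueError("column has no defined values")
    total = sum(present)
    mean = total / n
    # Sample standard deviation (n - 1 divisor): an assumption.
    sd = math.sqrt(sum((v - mean) ** 2 for v in present) / (n - 1)) if n > 1 else 0.0
    return {
        "min": min(present),
        "max": max(present),
        "range": max(present) - min(present),
        "total": total,
        "mean": mean,
        "n": n,
        "missing": len(values) - n,
        "sd": sd,
    }


height = [1.6, 2.2, 2.5, 3.0, UNDEFINED, 3.7, 3.9, 4.2]
stats = column_stats(height)
print(stats["n"], stats["missing"], stats["min"], stats["max"])

# With 0 as the sentinel, a genuine zero reading is counted as missing.
with_zero = column_stats([0, 1, 2], undefined=0)
print(with_zero["n"], with_zero["missing"])
```

In this sketch a sentinel of 0 silently discards a real observation of 
zero, which is why an improbable value such as -1E30 is the safer 
choice when zero can occur in the data. 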
 
 
The Author, Don S McDonald, has a BBC-B microcomputer with 32 
kilobytes RAM, the *BASIC2 programming language, the *IWORD 
word-processor in ROM, *DFS (Disc Filing System) and a 5.25 in DSDD 
floppy disk drive. 
 
The Author has an Acorn Archimedes A4000 computer (Easter 93). 
 
 
 
References.   
 
 
F J Andrew, "Generalised Plot program", Beeblet 92/3.  (John Andrew's 
program does not claim the features offered by the present program by 
D S  McDonald.) 
 
IBM Compatible software - Borland's Quattro-Pro. (Latest publicity 
mentions "(with) Regression.") 
 
Vogel Computing, ICES - STATS.  Statistics / data manipulation / 
graphing package.   (Works Division, Government, Wellington, NZ.)
 
S.A.S. - Statistical analysis system.  Originally main-frame computer 
data analysis package. 
 
==================       oooooooooooo          =============== 
