                             TEXBASE

                         (Paul Pibworth)

STATEMENT
These programs, with accompanying articles, have been published 
in Beebug magazine, issues Sept to Nov 1991. They are not public 
domain, and remain my copyright. See note at end re Archimedes.


INTRODUCTION
How do you file those handy computer hints and tips? How do you 
remember the gist of a particular review?  Having typed in that 
new recipe that uses spinach, turkey, olives, and custard 
(urgh!), can you locate and retrieve it from up to 250 other 
exotic recipes? This program, or rather suite of programs, 
assumes that you are already familiar with both databases 
and word processors. However, can your information system 
store and retrieve plain text such as quotations, reviews, 
recipes, or examination questions in the way that a data base 
handles records? Or, put another way, can your data base handle 
irregular pieces of text?

TEXBASE is a suite of programs that might just fit your needs. 
The system has been written for a BBC B, and makes use of 
a WYSIWYG screen in Mode 3. This means that the memory space is 
limited, making it necessary to work within certain constraints. 
The main constraint is to have separate programs to handle 
different processes.


The basis of the system is one large random access file (text) in 
which the entries are stored. Each entry is called a page, and 
can be up to 50 lines of text. I will describe the format of this 
DATA file later.

     The files on this disc include
     1    !Boot   -   this will load the menu program.
     2    MENU    -   this calls Enter, Search and Delete.
     3    ENTER   -   text entering program.    
     4    SEARCH  -   search program.
     5    DELETE  -   to delete an entry.
     6    KS      -   a Keystrip in ASCII
     7    PRINTKS -   Basic prog, reads KS and prints a condensed 
          version.
     8    text    -   a mini sample data file. 
     9    README  -   this blurb.
    10    BUILD   -   to create a data file.

I would suggest that you copy the first five files to a clean 
disc, and then use BUILD to create your data file (text).

                       ABOUT THE PROGRAMS
                       ------------------
MENU.
The purpose of the Menu is to set some of the conditions. One of 
the conditions is that the programs run at PAGE &1400 in order to 
recover some memory space. When RUN, the Menu provides a simple 
title screen, and gives four choices, Enter, Search, Delete, and 
Quit.
Lines 370-410 modify the use of the function keys, and cursor 
keys. For example, *FX225,239 sets the function keys to return 
ASCII values of 239+ when pressed alone, and *FX226,140 sets them 
to return values of 140+ when used in conjunction with Shift. You 
are referred to the User Guide for further information.
Line 420 is used to reprogram some of the displayed characters, 
(420-425) which are used to give an on-screen indication of 
embedded printer codes.
The program also resets some of the *FX modifications if Quit is 
chosen.

ENTER
This is the program for entering text. It is well structured, and 
quite easy to follow, apart from the multi-statement lines (these 
are to save on memory). Note that in some cases, lines must be 
split if there is a condition introduced into the statement, 
("IF...").  The main body of the program is very short, lines 60-
110, and is within a REPEAT-UNTIL loop. Key presses are detected, 
and action is taken from within this loop.

THE DATA file, text
You must remember that each entry is stored inside the random 
access file, "text". Each entry, or page, begins with 154 bytes 
(the space of two stored lines), which contain the page number, 
the number of lines in that piece of stored text, the keywords 
(as one string, starting and ending with a slash), and the 
title. Each line on the screen is held as one element of a 
string array, 75 characters long.
When the entry is saved, each line takes up 77 characters on the 
disc. The length of each entry, or page, is thus a multiple of 
77. (Any blank lines that remain on the screen after your last 
entered line are not stored.) 

The first two sectors of the data file ("text") are used as an 
index, where byte 0 is the number of pages that have been saved. 
The modified address of each page is stored as a 2 byte number. 
Bytes 1 and 2 give the address of page 2, bytes 3 and 4 give page 
3 etc. Page one starts right after the index, so the address does 
not need to be stored. To find the exact address, each 2 byte 
number is then multiplied by 77, and &200 is added. Thus, if a 
page number is known, its address can be found, and loaded by 
using the OPENUP command, and then PTR#Z%=address. Conversely, by 
stepping through the addresses, each entry can be located and the 
keyword string loaded and compared, until a match is found. When 
creating a new entry, the address of the last page is used to 
find that last page, read its length, and thus put the new entry 
on the end. The index bytes are then updated.  
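
As an illustration only, the address calculation described above 
can be sketched in Python (the programs themselves are in BBC 
BASIC). The function name and constants are my own labels, and 
the byte order of the 2 byte numbers is an assumption (low byte 
first), as it is not stated in these notes.

```python
INDEX_SIZE = 0x200   # the first two 256-byte sectors hold the index
LINE_SIZE = 77       # bytes taken by each stored line on the disc


def page_address(index, page):
    """Return the byte offset of a page (numbered from 1) in "text"."""
    count = index[0]                  # byte 0: number of pages saved
    if not 1 <= page <= count:
        raise ValueError("page %d is not in 1..%d" % (page, count))
    if page == 1:
        return INDEX_SIZE             # page 1 follows the index directly
    pos = 2 * page - 3                # page 2 -> bytes 1-2, page 3 -> bytes 3-4
    # ASSUMPTION: low byte first; swap the two terms if it is the reverse.
    units = index[pos] | (index[pos + 1] << 8)
    return units * LINE_SIZE + INDEX_SIZE


# A made-up index: page 1 has 10 lines (12 units with its header),
# page 2 has 5 lines (7 units), so page 3 starts 19 units in.
index = bytearray(INDEX_SIZE)
index[0] = 3
index[1] = 12
index[3] = 19
print(page_address(index, 2))   # 12 * 77 + 512 = 1436
print(page_address(index, 3))   # 19 * 77 + 512 = 1975
```

The value returned is what would be given to PTR# after the file 
has been opened with OPENUP.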

When using the ENTER program, the next available position is
calculated prior to displaying the blank screen ready to receive 
your entry. Originally, I wanted the number of entries (pages) to 
be unlimited, like files in the ADFS system. An index can speed 
up search, but it limits the entries. Two hundred and fifty 
ten-line entries would need (250x(10+2)x77)+512 bytes = 231512 
bytes. I compared this to the 102400 bytes on a 40 track DFS 
disc, and decided to compromise. Thus there is a limit of 250 
pages per disc. However, one could modify the program to have a 
choice of two or more data files!
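
The arithmetic above can be checked with a few lines of Python. 
This is only a sketch: the function names are mine, and the 
"pages that fit" figure ignores the disc catalogue and the 
program files, so the real number would be lower.

```python
LINE_SIZE = 77        # bytes per stored line
HEADER_LINES = 2      # the 154-byte page header is 2 x 77 bytes
INDEX_SIZE = 512      # two sectors of index at the start of "text"
DFS_40_TRACK = 40 * 10 * 256   # 40 tracks x 10 sectors x 256 bytes


def data_file_size(pages, lines_per_page):
    """Bytes needed for a data file of equal-length pages."""
    return pages * (lines_per_page + HEADER_LINES) * LINE_SIZE + INDEX_SIZE


def pages_that_fit(capacity, lines_per_page):
    """Whole pages of the given length that fit in `capacity` bytes."""
    page_size = (lines_per_page + HEADER_LINES) * LINE_SIZE
    return (capacity - INDEX_SIZE) // page_size


print(data_file_size(250, 10))           # 231512
print(DFS_40_TRACK)                      # 102400
print(pages_that_fit(DFS_40_TRACK, 10))  # 110
```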

SEARCH
Like ENTER, it has to be run from MENU, which sets up the VDU23 
defined characters. Also like ENTER, it is structured, and runs 
from a small loop so you should be able to follow what happens. 

DELETE
Its purpose is to remove one page (at a time) from the data file. 
It is similar to a *COMPACT, in that all the rest of the data is 
shunted down inside "text". (Remember, "text" is a large random 
access file, created empty, and then used to store the entries as 
"pages"). There is a major difference, however, between this and 
*COMPACT. The latter works on sectors, since each new file in DFS 
begins a new sector.

You will remember that new pages in "text" begin at 77 byte 
intervals. If a page is deleted, it can be regarded as a gap 
within "text", which must be filled. A block of data from the end 
of this gap is read into a buffer, and then saved back to the 
disc at a lower address on the disc, thus filling the gap. This 
creates a new gap, so the process is repeated until the end of 
the file is reached, or the last page within the file is reached. 
I suggest making a backup of the data first, in case something 
goes wrong. If the file is quite full, and it is an early page, 
then it will take some time. This is because each page has its 
page number stored, besides its text. Each page has to be 
renumbered, and the index has to be modified.
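
The gap-filling loop can be sketched as follows. This is a Python 
illustration of the idea, not a translation of DELETE: the 
function and parameter names are hypothetical, the 5000-byte 
default buffer is an assumption, and the renumbering of pages and 
the index update are left out.

```python
import io


def close_gap(f, gap_start, gap_end, data_end, buf_size=5000):
    """Shunt the bytes in [gap_end, data_end) down to gap_start.

    The data is moved one buffer-load at a time, working upwards
    through the file, so a block is never overwritten before it
    has been read. Returns the new end of the data.
    """
    src, dst = gap_end, gap_start
    while src < data_end:
        n = min(buf_size, data_end - src)
        f.seek(src)
        block = f.read(n)      # read a block from just above the gap
        f.seek(dst)
        f.write(block)         # save it back at the lower address
        src += n
        dst += n               # the gap has now moved up the file
    return dst


# Small demonstration on an in-memory "file": remove the B's.
demo = io.BytesIO(b"AAAABBBBCCCCDD")
new_end = close_gap(demo, 4, 8, 14, buf_size=3)
print(demo.getvalue()[:new_end])   # b'AAAACCCCDD'
```

A larger buffer means fewer passes of the loop, which is why 
increasing the buffer size can speed up a deletion.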

The process works in MODE 7, to allow more memory. It is 
necessary to know the page number to be deleted, as there is no 
keyword search facility. The first few lines of the page are 
displayed for confirmation. However, because MODE 7 uses a 40 
character screen, some words will be split. This is not 
important.

You could also have a larger buffer at line 50, e.g. DIM store% 
10000. In this case, change the 4999 at lines 720 and 730 to 
9999.

                       USING THE PROGRAMS
                       ------------------

ENTER - To enter text
In essence, the ENTER program is a simple word processor, driven 
from a very brief menu! The brevity is due to the memory 
limitation. However, this does not make it hard to follow. It 
gives five choices:

                         1 INPUT
                         2 TITLE
                         3 SAVE
                         4 RESET
                         5 EXIT

INPUT takes you to the screen, for you to enter text. Pressing 
Escape at any time will return you to this menu. The input screen 
is WYSIWYG, and is function key driven. At this stage you may 
find it more convenient to refer to the key strip (see 
diagram 1). The facilities included are:
(a) Insert on/off
(b) Printer codes for italic, underline, sub-script, super-script, 
and bold. 
(c) Some rearranging of the text.
(d) Loading an ASCII text file.
(e) Marking key words.

Most of these will be obvious from the key strip, but a few may 
need further comments. Sub/super script act on one character 
only, i.e. they self cancel; the others require a cancel code, 
which is then shown on the screen. The block-text commands 
obviously need parameters. On issuing these commands, questions 
appear on the ruler, such as "from what line?". The cursor keys 
are modified using Ctrl and Shift. You can set up to three tab 
positions using Shift/Tab, and cancel them with Ctrl/Tab.
The header information shows five states. The "Page=", on the 
right, is the current page, or entry, on which you are working. 
The "Bal" is the number of available lines of text, still unused 
in the data file. This is a guide, as will be explained later, 
but with say Bal=200, you could expect to input another 20 short 
entries. "Title" is an optional entry which is entered at the 
menu, and then displayed in place of the row of hash characters. 
Finally, the heading "KEYWDS". This is the basis of the system: 
text cannot be saved until you have chosen at least one keyword! 
Keywords are words in the text that you will use as the basis of 
the search. They are chosen and stored separately before the bulk 
of the text. To choose your keywords, place the cursor under the 
first letter of each word in the text that you wish to use and 
press Shift/f8. Only alphabetic characters are recognised, and 
they are converted to upper case. You can enter as many keywords 
as you like, space permitting! If the space runs out, the first 
one in is lost! Try it and see. To cancel all the keywords, just 
press Shift/f9.

The facilities NOT included are word wrap, right justify and 
reformat. The use of printer codes also takes up actual space, 
which is then ignored on printout (later).

One final facility that must be mentioned is EDIT (Ctrl/f7). This 
allows a previously saved entry to be modified. The page number 
must be known, and that page is then displayed. The program will 
then resave that page from the MENU, overwriting the original. It 
is not possible to extend or enlarge the entry, as the number of 
lines saved will be the same as that loaded. It is anticipated 
that changes are cosmetic, such as spellings. In the case of a 
major review, the page can be saved separately, and then 
reloaded.

What if there is a need to extend? In this case, the page 
must be saved (i.e. copied), possibly on another disc. Then the 
page must be deleted, using the DELETE option from the MAIN MENU. 
Finally, the page can be reloaded (Load ASCII file), edited, and 
saved. It would be necessary to note the title and keywords, as 
these are not saved with the text. This will become clearer 
when you read about the SEARCH program, below.

Returning to the MENU, apart from SAVE and EXIT, there is one 
other choice, RESET. This is an important choice. Each time the 
program runs, the program calculates the next available space 
within the DATA file. This is not updated following a SAVE 
action, unless RESET is pressed. This is not an oversight; it 
allows one to modify, and even extend, what has just been saved. 
On pressing RESET, the program examines the disc, and resets the 
variables for the next entry.


SEARCH - to search the data
When SEARCH is run, a brief menu is presented, with only five 
choices. I will describe the function of each choice in turn.

                         1 ENTER KEY WORDS
                         2 LOGIC = OR (Press to change)
                         3 SEARCH
                         4 START AT PAGE
                         5 EXIT

Choice 1, ENTER KEY WORDS
You will remember from the ENTER section that words in the text 
could be chosen as key words. Pressing 1 allows you to enter up 
to six keywords. These are the basis of the search. If none are 
chosen, then a wildcard is used, and every page is displayed. You 
can change, delete, or modify your list of key words from within 
choice 1. The program works by using the INSTR command. This 
means that a small "search word" could be found within larger 
keywords, e.g. using RUM would display a page which had CRUMB or 
DRUM. This can be prevented by preceding the search word with a 
slash, e.g. /RUM. A slash can also be included as a terminator: 
DRUM/ would not find DRUMS, although the plain search word DRUM 
would.

Choice 2, LOGIC is between OR and AND. In the latter case, each 
of the key words must be matched. Pressing 2 toggles between 
these two states.
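
The matching rules for choices 1 and 2 can be sketched in Python, 
with the "in" operator standing in for INSTR. This is an 
illustration under assumptions, not the SEARCH code: the function 
name is hypothetical, and I have assumed that the stored keyword 
string looks like /CRUMB/DRUM/, with a single slash between 
keywords.

```python
def page_matches(keyword_string, search_words, logic="OR"):
    """Test one page's keyword string against the search words.

    keyword_string -- e.g. "/CRUMB/DRUM/" (layout is an assumption)
    search_words   -- up to six words, each matched as a substring
    logic          -- "OR": any word may match; "AND": all must match
    """
    if not search_words:
        return True      # no search words: wildcard, every page matches
    hits = [word in keyword_string for word in search_words]
    return all(hits) if logic == "AND" else any(hits)


print(page_matches("/CRUMB/DRUM/", ["RUM"]))     # True  (inside CRUMB)
print(page_matches("/CRUMB/DRUM/", ["/RUM"]))    # False (no keyword RUM...)
print(page_matches("/DRUMS/", ["DRUM/"]))        # False (terminator)
print(page_matches("/DRUMS/", ["DRUM"]))         # True
print(page_matches("/SPINACH/TURKEY/", ["SPINACH", "OLIVES"], "AND"))  # False
print(page_matches("/SPINACH/TURKEY/", ["SPINACH", "OLIVES"], "OR"))   # True
```

Because every keyword is bounded by slashes, a search word written 
as /RUM/ would match only the exact keyword RUM.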

Choice 4, START AT PAGE enables you to start the search some way 
through the file, if you know for example what page the data is 
actually in! This may seem to nullify the whole point, but on 
reflection, you may just happen to know it if returning to the 
same information, or if you have a separate index! (More later)

Now we come to choice 3, SEARCH. This choice looks at each group 
of keywords in turn, until a match is found. If this occurs, (and 
it may take time, be patient, other 8 bit machines would take 
much longer!) then the page is displayed. The cursor keys with 
SHIFT, and CTRL, can move the page up and down. When each page is 
displayed, you have four choices, as shown at the base of the 
screen. N (next) will continue the search, and E (exit) will 
return you to the menu. P will allow you to print out all or some 
of the page; you enter the required answers in response to the 
questions. The questions appear on the ruler, near the top of the 
screen. Printing makes use of the procedure PROCcode. This has 
the printing codes in it that match my printer. You will need to 
change these if necessary, but I think they are standard. The 
only problem is if you have two printers. In this case, you would 
need two similar procedures, with a means of choosing which to 
use.

Save (S) enables you to save the page, as an ASCII file. If your 
disc is full (i.e. you have the maximum sized data file 
possible), you will have to change discs. The program assumes 
this, just answer Y, and ignore it if you do have room on the 
disc. Each time you save such text, it is given the filename 
"savedtx". You then need to *COPY this or *RENAME it at your 
convenience, or you will overwrite it.


SUMMARY
So there it is, a text data base. As I write this, further 
suggestions come to mind. There could be a utility to print out 
all the page numbers, titles, and keywords. Another possibility 
is a program to copy a page or pages from "text" on one disc to 
"text" on another disc. And if a page can be deleted, can one be 
inserted? There may be other ideas.

Version 1.0 

Archimedes version
  
This is virtually the BBC edition, just repackaged. The notes 
above were written for the BBC version. Since the programs have not
been altered significantly, the notes are virtually unchanged.
The program is not multitasking in this edition, but does run from
inside an application. 
Use is made of system variables to handle the filing, so it does
not need to be in the root directory.
I have begun looking at the possibility of a multitasking/windows
version. 

Paul Pibworth
2 Pine Tree Drive
Hucclecote
Gloucester GL3 3AJ
