ReadMe file for NiallII version 2.00 (07-Mar-1992)
by Stuart McDonald

This application is a word-learning parser.
It is named after a similar program on the Amiga,
although the author is different and the algorithms
are my own.

Niall is not yet multi-tasking; the full, multi-
tasking version will not be out for some time.
A keyboard environment is supported, and help is
available by typing >help at the Niall prompt.

Niall works by taking the input text from the user.
It then prepares and analyses the text in various
ways:

First it removes any punctuation.  It removes
apostrophes and replaces all other punctuation with
spaces.  It then shifts the text entirely into
lowercase.  The next routine separates the words in
the text and then the parsing routines begin.
The words are checked one by one.  If a word is not
already stored in the vocabulary (which is empty to
begin with) then it is added to the vocabulary as a
string.  The program then reads the words
immediately preceding and succeeding the word in
the input text.  If either of these words has not
been seen before in this position relative to the
word then it is stored using a variable (in ARM
Niall I, it was stored as a string which is much
slower and far less memory-efficient) which
indicates its position relative to the root word.
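
The steps above can be sketched in Python.  This is
only an illustration under assumptions, not Niall's
own code: the names prepare and learn, and the use
of sets of vocabulary indices for the links, are
invented here.  It also registers every word before
linking them, which is a simplification.

```python
import string

def prepare(text):
    """Drop apostrophes, blank other punctuation, lowercase, split."""
    cleaned = []
    for ch in text:
        if ch == "'":
            continue                 # apostrophes are removed outright
        elif ch in string.punctuation:
            cleaned.append(" ")      # other punctuation becomes a space
        else:
            cleaned.append(ch)
    return "".join(cleaned).lower().split()

def learn(words, vocab, index, before, after):
    """Add unseen words to vocab and record neighbour links.

    vocab  : list of word strings (each word stored once)
    index  : dict mapping a word to its position in vocab
    before : before[i] is the set of indices seen just before word i
    after  : after[i] is the set of indices seen just after word i
    (All four structures are assumptions made for this sketch.)
    """
    ids = []
    for word in words:
        if word not in index:
            index[word] = len(vocab)
            vocab.append(word)
            before.append(set())
            after.append(set())
        ids.append(index[word])
    for pos, wid in enumerate(ids):
        if pos > 0:
            before[wid].add(ids[pos - 1])   # a set ignores repeats
        if pos + 1 < len(ids):
            after[wid].add(ids[pos + 1])
```

For instance, prepare("Do you like that?") gives the
words do, you, like and that.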

When every word in the input has been processed in
this way, the system makes a sentence of its own.
It picks a root word from the vocabulary and then
extends the sentence by picking any word which can
adjoin the root word, and placing it either before
or after the root word, depending on the nature of
the link.  This word is then used as the new root,
and the process continues until there are no more
links or the sentence is long enough.
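
A sketch of this reply step in Python follows.  It
is only a guess at the shape of the routine:
make_sentence, max_words and the before/after tables
of index sets are hypothetical names, and the real
program may choose its root and its links
differently.

```python
import random

def make_sentence(vocab, before, after, max_words=12, rng=random):
    """Build a sentence by hopping from root word to linked word.

    before[i] / after[i] are sets of vocab indices that may stand
    just before / just after word i (assumed layout for this sketch).
    """
    if not vocab:
        return ""
    root = rng.randrange(len(vocab))     # pick a starting root word
    sentence = [root]
    pos = 0                              # where the root sits in sentence
    while len(sentence) < max_words:     # stop when "long enough"
        choices = [("before", w) for w in sorted(before[root])]
        choices += [("after", w) for w in sorted(after[root])]
        if not choices:                  # stop when no links remain
            break
        side, nxt = rng.choice(choices)
        if side == "before":
            sentence.insert(pos, nxt)    # new word lands just before root
        else:
            sentence.insert(pos + 1, nxt)
            pos += 1                     # new word lands just after root
        root = nxt                       # the new word becomes the root
    return " ".join(vocab[i] for i in sentence)
```

Passing rng=random.Random(0) (or any other seed)
makes the output repeatable, which is handy when
experimenting.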

An example will help to clarify this:  Imagine that
the vocabulary is blank.  Entering the text

I own three camels; they're called John, Eric and
Mr. Explanation.  Do you like that?

will cause the following to happen:

i) processing

I own three camels theyre called John Eric and Mr
Explanation Do you like that

i own three camels theyre called john eric and mr
explanation do you like that

i/own/three/camels/theyre/called/john/eric/and/mr/
explanation/do/you/like/that

ii) learning

The system learns each of these words since none of
them are already known.  It also makes the links, e.g.

"i" can be followed by "own"
"own" can be preceded by "i" and can be followed by
"three", etc.

iii) reply

Since the system only has (at most) one link in each
direction for each word at present, the output will
probably be the same as the input.  Future input
sentences will allow it to make new links; for
example, now entering

I do believe that you just repeated what I said!

could cause the new output

mr explanation do believe that you like that you
like

because the words now have several links.
Essentially, the larger the vocabulary, the weirder
and more complex the output.  Niall has told me such
pearls of wisdom as:

king charles darwin discovered to my surprise that
science tests every day and soon,

brain surgeons like you cause thermonuclear war and
chess disasters in rainbows,

everyone smells of words,

and, perhaps most bizarrely:

you chase people become heated especially when
steamed.

At the end of a session, save the vocabulary if you
want it to grow.  I've released this new version
because it uses a new, more compact filetype.  This
new format can store far more links in about the
same size of file, because rather than storing the
linked words as strings, it stores their indices in
the string array.  Hence every word is stored only
once as a string, and so the system is faster and
smaller.  I think the code is also more elegant.
Anyway, the multitasking version (which will include
some nice features I'm dreaming up now) will use
this new format, but by the time it's released some
people will have HUGE vocabulary files and
converting them will take a long time.  So you can
convert them now, continue to use them in Niall II,
and then get the best of both worlds when NiallIII
is released.
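
To illustrate why indices are more compact than
strings, here is a small Python sketch that turns
string links into index links.  The name to_indexed
and the in-memory layout are assumptions made for
illustration; the actual file layout is not
described in this ReadMe.

```python
def to_indexed(string_links):
    """Convert {word: [following words]} into (vocab, links).

    vocab is a list holding each word exactly once; links[i] lists
    the vocab indices that may follow word i.  Hypothetical layout.
    """
    vocab, index, links = [], {}, []

    def intern(word):
        # Store the string only the first time it is seen.
        if word not in index:
            index[word] = len(vocab)
            vocab.append(word)
            links.append([])
        return index[word]

    for word, followers in string_links.items():
        wid = intern(word)
        for follower in followers:
            links[wid].append(intern(follower))
    return vocab, links

old_style = {"i": ["own", "do"], "own": ["three"], "do": ["you"]}
vocab, links = to_indexed(old_style)
print(vocab)   # ['i', 'own', 'do', 'three', 'you']
print(links)   # [[1, 2], [3], [4], [], []]
```

Each word appears once in vocab however many links
mention it, and a link costs one small number rather
than a whole string.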

NiallII has an error handler which is very, very
useful: I've lost countless files because of full
discs.  It also looks nicer and uses the OS SWI
PrettyPrint rather than my own routine for printing.
To use multiple vocab files you can store them in a
directory somewhere, and copy the one you want into
the Niall directory when you want to.  To make a
blank vocabulary file, just create a file called
Vocabulary in the Niall directory.  It should be
totally empty, so use Edit.  Don't enter anything,
even spaces: just save the file.  The filetype
doesn't matter:  Niall sets it when it saves the
vocabulary.

If you mistype a word in Niall, use >correct.  This
can also delete words you don't like.  Note that if
you delete a word, it's sensible to compact the
vocabulary when you've got the time, as otherwise
the system grinds slowly to a halt.
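
One way to picture compaction is the Python sketch
below.  It assumes (this is not taken from Niall
itself) that a deleted word leaves a None gap in the
word list, and that links are lists of word indices;
compact is a hypothetical name.

```python
def compact(vocab, links):
    """Remove None gaps from vocab and renumber the links to match."""
    remap = {}          # old index -> new index, surviving words only
    new_vocab = []
    for old, word in enumerate(vocab):
        if word is not None:
            remap[old] = len(new_vocab)
            new_vocab.append(word)
    new_links = [
        # Links to deleted words are dropped; the rest are renumbered.
        [remap[j] for j in links[old] if j in remap]
        for old in range(len(vocab))
        if vocab[old] is not None
    ]
    return new_vocab, new_links

vocab = ["i", None, "three"]        # word 1 has been deleted
links = [[1, 2], [], [0]]
print(compact(vocab, links))        # (['i', 'three'], [[1], [0]])
```

Without this step every lookup has to skip over the
gaps, which is one plausible reason for a slowdown.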

If you want to convert an old file, put it on the
RAM disc under the name ToDo.  Make sure that the
RAM disc has enough room for three copies of the
file and then there should be no problems.  The
conversion is slow but it does do wonders to the
files.  I examined my old NiallI file and found it
to be full of holes.  The convertor mopped these
errors up and NiallII uses all of the words, which
NiallI couldn't because the file was full of errors.
You'll end up with a file called Done.  You can then
load Done into NiallII by renaming it Vocabulary and
storing it in the application.  Convert itself is
inside !NiallII.  Just run it, first checking that
ToDo is on a large RAM disc, and that there's enough
memory to run Convert: it needs a few hundred
kilobytes.

Remember that NiallI and II use the same filetype,
so don't mix them up.  The formats are incompatible,
so don't try to swap the files.  The best thing to
do is to store NiallI somewhere where its !Boot file
can't run by mistake.  I suggest storing it in a
directory called OldNiall, inside NiallII, and never
running it again.  NiallII doesn't support running
the files because the new format means that you'd
have to load the entire file into memory before you
can see the links.

Known Bugs / Problems :

1) As I have said, the delete routine leaves gaps in
   the file, so compact afterwards if possible.

2) The interface is not very friendly, but pity
   those poor PC owners!  DOS is even less
   hospitable.

3) The method for loading other vocabulary files
   is a little fiddly.

Niall III will be very nice.  It'll be ARM coded,
and the biggest change to the filetype is likely to
be cosmetic.  It'll detect NiallII (but not NiallI)
format files and convert them very quickly.
It will multitask, load and save files easily, and
have a built-in browser which would allow the user
to see the links between words.

Until then, though, please enjoy NiallII, and remember
that a little weirdness keeps us all sane.

If you have any comments/suggestions/tips/
improvements, please write to me at the
following address:

Stuart McDonald
"Maple House"
1, Bursledon Road
Hedge End
Southampton
SO3 4BP
ENGLAND

This program is PD; see the Legal file for
more information on that side of things.

A detailed history of the program can be
found at the beginning of the RunImage
listing.




  As they always say, if this scrolled by too quickly for
  you, load Edit and try again.  Or, I suppose, you could
  learn to read very, very quickly.



