
  ----------    Yet Another Locater  -----------

              2018/04/27, 2021/02/28

YAL finds, within a file or a directory of files, the lines that meet a set of
criteria, positive and negative.


Use
===
Double-click !YAL to put its icon on the iconbar.

To tell YAL what to search for, click Select on the iconbar icon.
This opens a directory called Locate containing a text file
called Search. This you need to edit in a text-editor,
so that each line determines a search-criterion (see below).

After that, drag the file or directory to be searched onto the iconbar icon.
YAL will report its findings in a taskwindow, giving the line numbers and lines
that match the criteria (and the filename in the case of a directory).
The same search criteria will be used for subsequent drags of files or
directories until you edit the Search file.

Clicking Adjust on the iconbar icon opens the configuration file in a
text-editor. It sets three values:

validtype :  a list determining the filetypes allowed in the search.
            The default is just textfiles.
validleaf :  a pattern that the names of files to be searched must match.
            The default pattern matches anything.
recurse :    a Boolean value determining whether the search recurses down
            subdirectories.
            The default value is true.

These values can be edited, and the configuration file saved.

Search files
============
Each line of the search file adds one criterion. A criterion is negative if its
line starts with the symbol ~, which means that lines satisfying it are
rejected. Only lines satisfying all the positive criteria and none of the
negative ones are displayed. The criteria may appear in any order and are of
two kinds: text or pattern. The syntax of patterns is explained below.

A positive textual criterion is just literal text, not starting with ~, on a single
line. The criterion is satisfied if the text appears as a substring anywhere in
the line. A line may be reported more than once if the text appears at several
positions in it.
A negative textual criterion is a line starting with ~ followed immediately
by a blank space. The remainder of the line, including any blank spaces,
is the text of the criterion. Lines will not be reported if they contain that
remainder as a substring. Be careful about blank spaces, trailing ones included.

A searchfile

Robert
~ Lady Jane

will report all lines mentioning Robert somewhere so long as they do not mention Lady
Jane anywhere.
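
The rule for textual criteria can be sketched in a few lines of Python. This is
only an illustration of the rule as described above, not YAL's own code; the
function name matching_lines and the sample lines are made up for the example.

```python
def matching_lines(lines, positive, negative):
    """Return (line_number, line) pairs for lines that contain every
    positive text and none of the negative texts, as plain substrings."""
    hits = []
    for number, line in enumerate(lines, start=1):
        has_all_positive = all(text in line for text in positive)
        has_any_negative = any(text in line for text in negative)
        if has_all_positive and not has_any_negative:
            hits.append((number, line))
    return hits


# Hypothetical sample text, searched with the criteria "Robert" and "~ Lady Jane".
sample = [
    "Robert called on Tuesday.",
    "Robert and Lady Jane dined together.",
    "Lady Jane left early.",
]
print(matching_lines(sample, positive=["Robert"], negative=["Lady Jane"]))
```

Only the first sample line is reported: the second mentions Lady Jane and the
third does not mention Robert.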

Pattern criteria begin the line with == for positive criteria and ~=
for negative. What follows must be a pattern, preceded or followed by any number of blank
spaces. Blank spaces cannot occur in patterns, so in this case they are simply ignored.


A searchfile containing

==          ^%s*Dear%s+Sir
==          appalling%.$
~=           Sir%s+James
~=           my%s+wife

for example, reports any line that begins (after optional leading spaces) with
the phrase Dear Sir and ends with "appalling." provided that it contains neither
the phrase Sir James nor the phrase my wife.
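
The four kinds of line described in this section can be told apart by their
first two characters. The Python sketch below shows one way to do that; it is
an assumption about how such a file might be read, not YAL's actual parser, and
the name parse_criterion is invented. The description does not say what happens
to a line that starts with ~ followed by something other than a blank space or
=, so the sketch simply rejects it.

```python
def parse_criterion(line):
    """Classify one search-file line as (kind, is_positive, body).

    kind is "pattern" or "text"; is_positive is False for criteria
    introduced by ~. The line is assumed to have no trailing newline.
    """
    if line.startswith("=="):
        # Blank spaces around a pattern are ignored.
        return ("pattern", True, line[2:].strip(" "))
    if line.startswith("~="):
        return ("pattern", False, line[2:].strip(" "))
    if line.startswith("~ "):
        # Everything after "~ " counts, blank spaces included.
        return ("text", False, line[2:])
    if line.startswith("~"):
        # Not covered by the description; rejecting it is an assumption.
        raise ValueError("unrecognised criterion: " + line)
    return ("text", True, line)


for example in ["Robert", "~ Lady Jane", "==    ^%s*Dear%s+Sir", "~=    my%s+wife"]:
    print(parse_criterion(example))
```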

You can, of course, save your searchfiles for later use.

Why another locater?
====================
I have quite a few locater applications, for searching through multiple files.
Some work well, others not at all, perhaps because they need recompiling.
Most are a lot slicker than YAL, with nice dialogues for the settings, some with
throwback. None, however, offer the pattern-syntax that I am used to. Apart from
the pattern-syntax YAL is a simple, almost minimal, application.

Most people probably only use word search with textual criteria. Until you are
used to them, patterns may seem bothersome - another thing to be learned. But
they are much more powerful, and they can repay the effort. In fact YAL translates
textual criteria into pattern criteria.

Pattern-syntax
==============
A pattern can be simply a word, but not a phrase, because blank spaces
are not allowed in patterns. If you wanted to search for the phrase

                 did you say "another $2"?

you would use the pattern

                 did%syou%ssay%s"another%s%$2"%?

The expression %s is the pattern for a blank space (or any other space character).
The so-called magic characters,

                          ^ $ ( ) % . [ ] * + - ?

must be preceded by a percent sign (%) to match themselves in plain text. This is so
that magic characters not preceded by % can have special meanings within patterns.
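
Turning a literal phrase into a pattern by hand is mechanical, as the following
Python sketch shows. It is an illustration only: the name text_to_pattern is
invented, and whether YAL translates textual criteria in exactly this way is
not stated in this document. Note that %s accepts any space character, so the
resulting pattern is slightly more permissive than the literal phrase.

```python
# The magic characters listed above.
MAGIC = "^$()%.[]*+-?"


def text_to_pattern(text):
    """Escape magic characters with % and replace each blank space by %s."""
    pieces = []
    for ch in text:
        if ch == " ":
            pieces.append("%s")
        elif ch in MAGIC:
            pieces.append("%" + ch)
        else:
            pieces.append(ch)
    return "".join(pieces)


print(text_to_pattern('did you say "another $2"?'))
```

Applied to the phrase above, this prints the same pattern that was written out
by hand.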

^ denotes the beginning of a line.
$ denotes the end of a line.

So to search for a line starting

                          From:

as one might find in an email, you would use the pattern

                         ^From:

The pattern . matches any single character (that includes a blank space character).

A character-class is a subset of the 256 possible character codes (ASCII
covers only the first 128 of them). Here is a list of standard ones:

%a   all letters
     can also be written [A-Za-z]
%A   all non-letters
%c   all control characters
%C   all non-control characters
%d   all digits
     can also be written [0-9]
%D   all non-digits
%g   all printable characters except a space
%G   a space or any non-printable character
%l   all lower-case letters
     can also be written [a-z]
%L   anything not a lower-case letter
%p   all punctuation characters
%P   all non-punctuation characters
%s   all space characters
%S   all non-space characters
%u   all upper-case letters
     can also be written [A-Z]
%U   anything not an upper-case letter
%w   all alphanumeric characters
     can also be written [0-9A-Za-z]
%W   anything not an alphanumeric character
%x   any hexadecimal digit
     can also be written [0-9A-Fa-f]
%X   anything not a hexadecimal digit
%z   ASCII nul

Any non-alphanumeric character preceded by a percent sign just represents itself.

Square brackets [...], within which the first character is not ^, represent
the character-class which is the union of the classes and characters it contains.
The complementary character-class is denoted by [^...].
So, for example, %S is the same as [^%s].

A character-class or character can be qualified by one of the following
postfix operators to give a pattern

*   matches zero or more repetitions of characters in the class, greedily,
    i.e. the longest sequence it can.
+   matches one or more repetitions of characters in the class, greedily,
    i.e. the longest sequence it can.
-   matches the shortest sequence it can of zero or more repetitions of
    characters in the class.
?   matches at most one character in the class.

Thus

              tooth%s?some

matches only

              toothsome

and

             tooth some
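
Readers who know regular expressions may find a comparison helpful. The Python
sketch below uses Python's re module, which is NOT the pattern syntax YAL uses;
the regular expressions are only rough equivalents, chosen for this example, of
the patterns tooth%s?some, <.*> and <.->. The test strings are made up.

```python
import re

# Rough regular-expression counterpart of the pattern tooth%s?some
optional_space = re.compile(r"tooth\s?some")
for candidate in ["toothsome", "tooth some", "tooth  some"]:
    print(repr(candidate), bool(optional_space.search(candidate)))

# Greedy * takes the longest run it can; the lazy form *? plays the role
# of the - operator and takes the shortest.
greedy = re.search(r"<.*>", "<a><b>").group()
lazy = re.search(r"<.*?>", "<a><b>").group()
print(greedy, lazy)
```

The third candidate, with two blank spaces, is not matched, because ? allows
at most one character from the class.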

A pattern can contain sub-patterns enclosed in parentheses (round brackets), which
denote captures. The substrings matched by the captures can be referred to later
in the same pattern by the expressions %1, %2, ... %9.
Captures are numbered according to the position of their opening parenthesis.
As a special case the empty capture () yields the current position in the string
being searched, a number.

For example

                   (['"])quoted%1

matches

                     'quoted'

or

                     "quoted"
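
The effect of %1 is that the closing quote must be the same character as the
opening one. The Python sketch below shows the same idea with Python's re
module, where the back-reference is written \1. This is an analogy in a
different pattern language, not YAL's syntax, and the test strings are invented.

```python
import re

# Counterpart of the pattern (['"])quoted%1 in Python's regex syntax.
same_quotes = re.compile(r"""(['"])quoted\1""")

for candidate in ["'quoted'", '"quoted"', "'quoted\""]:
    print(candidate, bool(same_quotes.search(candidate)))
```

The last candidate opens with a single quote and closes with a double quote,
so it is not matched.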


The pattern denoted by %f followed by a character-class matches an empty string
at a frontier position, i.e. a position where the next character belongs to the
class but the previous one does not.
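
To make the idea of a frontier concrete, here is a small Python sketch that
lists the frontier positions in a string for a given class. It is an
illustration, not YAL's code; the name frontier_positions is invented, and the
class is passed in as an ordinary Python function. The sketch follows the Lua
reference manual in treating the beginning and the end of the string as if
they were the nul character.

```python
def frontier_positions(text, in_class):
    """Return the 0-based positions in text where the next character is in
    the class and the previous one is not."""
    # Pad both ends with nul, as the Lua manual describes for %f.
    padded = "\0" + text + "\0"
    positions = []
    for i in range(len(text) + 1):
        previous_char = padded[i]
        next_char = padded[i + 1]
        if in_class(next_char) and not in_class(previous_char):
            positions.append(i)
    return positions


# Starts of runs of letters, roughly what %f[%a] would find.
print(frontier_positions("THE (quick) fox", str.isalpha))
```

With the class of letters, the frontiers fall at the start of each word.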

The pattern denoted by %b followed by two distinct characters, say x and y, matches
balanced substrings starting with x and ending with y. So %b() matches balanced
parentheses, %b[] balanced square brackets, %b{} balanced braces, and so on.
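
Balanced matching amounts to keeping a count of how many opening characters
are still unclosed. The Python sketch below shows that counting for a single
starting position. It is an illustration of the idea only, not YAL's code, and
the name match_balanced and the sample line are invented.

```python
def match_balanced(text, start, opener, closer):
    """Return the index just past the balanced substring that begins at
    text[start], or None if there is no such substring."""
    if start >= len(text) or text[start] != opener:
        return None
    depth = 1  # the opener at text[start] is still unclosed
    for index in range(start + 1, len(text)):
        ch = text[index]
        if ch == closer:
            depth -= 1
            if depth == 0:
                return index + 1
        elif ch == opener:
            depth += 1
    return None  # ran out of text before everything was closed


line = "f(a, (b + c)) + g(d)"
end = match_balanced(line, 1, "(", ")")
print(line[1:end])
```

Starting at the first parenthesis of the sample line, the balanced substring
runs up to the parenthesis that closes it, inner pair included.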

The patterns expressible here are those which can be matched without backtracking.
This is the syntax used by Lua's standard string library.

References:
Programming in Lua by Roberto Ierusalimschy
http://www.lua.org/docs.html

Limitations
===========
YAL only deals with text line by line. It cannot search for patterns that occur over
multiple lines, nor can it use backtracking. Those would require the use of more
sophisticated pattern-matching, say parsing expression grammars. It leaves the
searched texts strictly alone, and does no replacement.

G.C.Wraith
