Notes on RISC OS Filing Systems - and splitting FileCore


History:

  0.01  10-Jan-94  Started from notes by JRo
        21-Jan-94  Sections describing current situation added
  0.02  25-Jan-94  Sent to JRo, Jre, KW for info/comment
        26-Jan-94  Sections on device driver interface and read-ahead/
                    write-behind added
                   Notes on JRo's proposed new architecture added
  0.03             Path name validation procedure corrected
                   Note extra parameter to ReadContigSpace
        31-Jan-94  Note added about how FileCore ensures at most one transfer
                    is in progress at a time
        01-Feb-94  New section 7 on image stamps
        10-Feb-94  Minor clarification on image stamping


1  Host and Image Filing Systems


Every Filing System (FS) can be thought of as a basic implementation of files
and directories for FileSwitch (FSw), the module which implements
application-level filing system operations:

                    ---------------
                   |    Client     |
                    ---------------
                          |
                          |
                          |
                    ---------------
                   |  FileSwitch   |
                    ---------------
                          |
                          |  create, delete, open, close, read, write, etc.
                          |
                    ---------------
                   | Filing System |
                    ---------------

Details of the functions which this interface may/must support are defined
in the PRM chapter 44 (Writing a file system): they are the FSEntry and
ImageEntry functions. In summary, these functions include facilities to:

  - create/delete files
  - rename files/directories
  - extend/truncate files
  - open/close files
  - read/write blocks* (in open files)
  - read/write information (in directories)

   * When FSw asks a FS to open a file, the FS returns the file's "block
     size"; this must be a power of 2 (min. 64, max. 1024), and is used
     by FSw as its buffer size. FSw guarantees always to read/write data
     in block-sized units. The block size will, typically, be the sector
     size or allocation unit size.
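
The block-size rule above can be checked mechanically. The following is a
minimal sketch only: the helper names is_valid_block_size and
block_aligned_span are hypothetical and are not part of any RISC OS
interface. The second helper shows what "block-sized units" means for an
arbitrary byte range.

```python
def is_valid_block_size(size: int) -> bool:
    """True if size is a power of 2 between 64 and 1024 inclusive."""
    # A power of 2 has exactly one bit set, so size & (size - 1) is zero.
    return 64 <= size <= 1024 and (size & (size - 1)) == 0


def block_aligned_span(offset: int, length: int, block_size: int) -> tuple:
    """Widen (offset, length) to the whole blocks that cover it."""
    start = offset - (offset % block_size)      # round down to a boundary
    end = offset + length
    end += (-end) % block_size                  # round up to a boundary
    return start, end - start
```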


RISC OS 3 supports two kinds of FS: "Host Filing Systems" (HFS) and "Image
Filing Systems" (IFS).

The way in which a HFS implements files and directories is entirely its own
affair. For example, FileCore has a low-level interface to "device drivers"
which themselves implement storage in a variety of ways (ADFS - floppy and
hard discs, SCSI, RAM discs), whereas NetFS interfaces to network filing
system protocols. Common to all HFS is the concept of "disc name", which is
normally used to identify a particular external storage item.

                    ---------------
                   |    Client     |
                    ---------------
                          |
                          |
                          |
                    ---------------
                   |  FileSwitch   |
                    ---------------
                          |
                          |  create, delete, open, close, read, write, etc.
                          |
                    ---------------
                   |   FileCore    |
                    ---------------
                          |
                          |  read, write sectors etc.
                          |
                    ---------------
                   |     ADFS      |
                    ---------------
                          |
                          |
                          |
                    ---------------
                   |  Floppy disc  |
                    ---------------
                         
On the other hand, IFSs implement files and directories on top of "images",
which are themselves files supported by an "underlying" FS. For example, an
ADFS disc might include a file of type DOSDisc which is an image managed by
DOSFS. Thus there is a well-defined interface from an IFS downwards - it
simply calls FSw to read/write bytes* from the image file. Just as each HFS
may know of a number of different discs at the same time (each identified by
a different disc name), so each IFS may know of a number of different images,
each identified by a different FSw file handle.

   * And "flush to storage medium"; it's also possible to imagine that other
     calls might be made - for example, a compression FS might wish to extend
     and truncate the image file.


                    ---------------
                   |    Client     |
                    ---------------
                          |
                          |
                          |
                    ---------------
                   |  FileSwitch   |
                    ---------------
                          |
                          |  create, delete, open, close, read, write, etc.
                          |
                    ---------------
                   |     IFS       |
                    ---------------
                          |
                          |  read, write
                          |
                    ---------------
                   |  FileSwitch   |
                    ---------------
                          |
                          |
                         ...

One way to look at the difference between HFSs and IFSs is that a HFS does
I/O directly through device drivers, whereas an IFS does all of its I/O
through FSw.



2  Data structures


Before we consider how FSw directs client operations through the appropriate
layers of software, this section describes the information held by FSw and
by the FSs about open files and images.

The description is intended to be neither complete nor precise - but should
be sufficient to understand how FSw, IFSs and HFSs interact.


FSw maintains two lists of control blocks:

  - Each open file is represented by a Stream Control Block (SCB)

  - Each registered filing system is represented by a Filing System Control
    Block (FSCB)

FSw also maintains a table mapping filing system names to FSCBs, so that it
knows which filing system to call when presented with a filing system name
at the head of a full path name. 


Each FSCB includes the following information:

    Entry_Points:
      Entry points (FSEntry_... for a HFS, ImageEntry_... for an IFS)

  and, for a HFS only:

    Image_List:
      List of SCBs of all image files open on this HFS (these image files
      may be of different image types)

  and, for an IFS only:

    Image_Type:
      Filetype for image files of this kind (eg DOSDisc for DOSFS images)


Each SCB includes the following information:

    File_Name:
      File name (as a full path name from root of immediately enclosing file
      system)

    FSw_File_Handle:
      FSw file handle (as returned to the client)

    FS_FSCB:
      Pointer to FSCB for this file (if the file is an image file, this
      identifies the *underlying* FS that is responsible for I/O on this
      file)

    FS_File_Handle:
      FS file handle (as supplied by the (underlying) FS when the file was
      opened)

  and, for an image file only:

    Image_Handler:
      Pointer to FSCB for the IFS responsible for managing the contents of
      this image

    Image_Handle:
      IFS image handle (as supplied by the IFS when this file was registered
      as an image file by FSw)

    Image_List:
      List of SCBs of all image files open on this IFS (for nested images -
      which may be of different image types; cf FSCB's Image_List field)

  and, for any file which is inside an image:

    Image_SCB:
      Pointer to SCB for image file containing this file


Each IFS retains information about each open file (ISCB) and about each
open image (IICB).

The ISCB includes:

    FS_File_Handle:
      IFS file handle

    Image_IICB:    
      Pointer to IICB for the image containing this file

The IICB includes:

    Image_Handle:
      IFS image handle

    FSw_File_Handle:
      FSw file handle for the image file


As an example to clarify the various "handles", consider the case of an open
file called FRED within a DOSFS image file ("PC partition") called PC on an
ADFS hard disc:

    ADFS::HD4.$.PC.FRED

  FRED is a file managed by DOSFS, and so has a FSw file handle (by which the
  client refers to the file), and a DOSFS file handle (which DOSFS uses
  internally).

  :HD4.$.PC is a file managed by FileCore%ADFS (FC), and so has a FSw file
  handle (by which the client refers to the file), and a FC file handle
  (which FC uses internally).

  :HD4.$.PC is also an image file for the DOSFS IFS, and so has a DOSFS image
  handle (which DOSFS uses internally).


As an example of the various control blocks and the links between them,
consider the following file hierarchy on ADFS::HD4:

                                      $
                                      |
                           ----------------------
                          |                      |
                         PC1                    Mac
                          |                      |
                         DrC                   file3
                          |
                 -------------------
                |                   |
               PC2                file1
                |
              FDimg
                |
              file2

   $.PC1.DrC is a file of type DOSDisc, and so is an image file which is
   managed by the IFS DOSFS.

   file1 is a file within this DOSFS image, and PC2 is a directory.

   PC2.FDimg is a further file within the DOSFS image which is itself of type
   DOSDisc, and so is a further DOSFS image.

   file2 is a file within this nested DOSFS image.

   $.Mac is a file of (hypothetical) type MACDisc, and so is an image file
   which is managed by the IFS MACFS.

   file3 is a file inside this MACFS image.


FSw contains three FSCBs and six SCBs:

                       fscb1          fscb2          fscb3

filing system             FC          DOSFS          MACFS
Image_Type                 -        DOSDisc        MACDisc
Image_List       {scb4,scb5}           -              -


                  scb1       scb2       scb3       scb4       scb5       scb6

File_Name        file1      file2      file3        (1)        (2)  PC2.FDimg
FSw_File_Handle     11         12         13         14         15         16
FS_FSCB          fscb2      fscb2      fscb3      fscb1      fscb1      fscb2
FS_File_Handle     201        202        301        101        102        203
Image_Handler       -          -          -       fscb2      fscb3      fscb2
Image_Handle        -          -          -        2001       3001       2002
Image_List          -          -          -      {scb6}         {}         {}
Image_SCB         scb4       scb6       scb5         -          -        scb4

                                                    (1)  :HD4.$.PC1.DrC
                                                    (2)  :HD4.$.Mac

DOSFS contains two IICBs and three ISCBs:

                      iicb1        iicb2

Image_Handle           2001         2002
FSw_File_Handle          14           16

                      iscb1        iscb2        iscb3

FS_File_Handle          201          202          203
Image_IICB            iicb1        iicb2        iicb1


MACFS contains one IICB and one ISCB:

                      iicb1

Image_Handle           3001
FSw_File_Handle          15

                      iscb1

FS_File_Handle          301
Image_IICB            iicb1
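
The FSw tables above can be modelled directly, which makes the links between
control blocks easier to follow. This is a sketch under stated assumptions:
the Python class and field names simply mirror the field names used in these
notes, the real control blocks are not laid out like this, and full_name is
a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class FSCB:
    name: str
    image_type: Optional[str] = None                        # IFS only
    image_list: List["SCB"] = field(default_factory=list)   # HFS only


@dataclass(eq=False)
class SCB:
    file_name: str
    fsw_file_handle: int
    fs_fscb: FSCB
    fs_file_handle: int
    image_handler: Optional[FSCB] = None    # image files only
    image_handle: Optional[int] = None      # image files only
    image_list: List["SCB"] = field(default_factory=list)
    image_scb: Optional["SCB"] = None       # files inside an image only


def full_name(scb: Optional[SCB]) -> str:
    """Join file names up the Image_SCB chain to the root of the HFS."""
    parts = []
    while scb is not None:
        parts.append(scb.file_name)
        scb = scb.image_scb
    return ".".join(reversed(parts))


fc = FSCB("FC")
dosfs = FSCB("DOSFS", image_type="DOSDisc")
macfs = FSCB("MACFS", image_type="MACDisc")

scb4 = SCB(":HD4.$.PC1.DrC", 14, fc, 101, image_handler=dosfs, image_handle=2001)
scb5 = SCB(":HD4.$.Mac", 15, fc, 102, image_handler=macfs, image_handle=3001)
scb6 = SCB("PC2.FDimg", 16, dosfs, 203, image_handler=dosfs,
           image_handle=2002, image_scb=scb4)
scb1 = SCB("file1", 11, dosfs, 201, image_scb=scb4)
scb2 = SCB("file2", 12, dosfs, 202, image_scb=scb6)
scb3 = SCB("file3", 13, macfs, 301, image_scb=scb5)

fc.image_list = [scb4, scb5]
scb4.image_list = [scb6]

# Invariant implied by the tables: a file inside an image is served by the
# IFS that handles that image.
for scb in (scb1, scb2, scb3, scb6):
    assert scb.fs_fscb is scb.image_scb.image_handler
```

Note that scb6 appears in both roles: it is a file served by DOSFS (inside
the image scb4) and is itself an image handled by DOSFS.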



3  Opening a file and canonicalisation


Before* requesting a FS to open a file, FSw first "canonicalises" the file's
name. This is a complex process which includes operations such as:

  - substitute CSD or CLD etc. if necessary (eg if incomplete path name)
  - determine disc name (eg if drive number quoted)
  - resolve wild cards

[*To be more precise, the process of canonicalisation may be interleaved with
 opening, since image files may need to be opened before files within them
 can be accessed.]

The result of canonicalisation is an object name which is:

  - a complete path name from the root of a HFS
  - the disc name is in a standard representation
  - the object is known to exist


As an example, consider a request to open ":0.$.PC1.DrC.PC2.FDimg.file2"
when only file1 is, or has been, open (in other words, the image file
$.PC1.DrC is open, but the nested image file PC2.FDimg is not).

  FSw first identifies which FS is involved by looking up the value of the
  currently selected filing system; we suppose this is ADFS.

  Next, FSw calls FSEntry_Func(CanonicaliseSpecialAndDisc, ":0") in FC to
  determine the standard name of the disc in drive 0; suppose this is ":HD4".

  FSw now examines each SCB in the Image_List held in FC's FSCB, looking to
  see if there is an image file whose file name matches a prefix of the
  expanded name ":HD4.$.PC1.DrC.PC2.FDimg.file2". In this case, the image
  list contains just a single entry - scb4 - with matching file name
  ":HD4.$.PC1.DrC"; FSw now knows that it must look for a file
  "PC2.FDimg.file2" inside this image.

  There may be a nested image within this path name, and so FSw now examines
  any entries in the Image_List attached to scb4; there are none.

  Next, FSw calls ImageEntry_File(ReadCatalogueInfo, "PC2.FDimg.file2", 2001)
  in DOSFS to see if this object exists: FSw knows that it is DOSFS that
  must be called by looking at the Image_Handler field in scb4, and picks up
  the image handle - 2001 - from the Image_Handle field.

  DOSFS says that this object does not exist, so FSw now makes a similar call
  to find out about "PC2.FDimg".

  DOSFS says that "PC2.FDimg" is a file of type DOSDisc. FSw expected a
  directory, so looks up the file type in its FSCB list, and locates fscb2
  which identifies the IFS DOSFS.

  FSw allocates a new SCB ready for "PC2.FDimg", filling in all fields except
  FS_File_Handle and Image_Handle (see scb6).

  FSw calls ImageEntry_Open("PC2.FDimg", 2001, 16) to open the image as a
  file, and stores the returned value as the FS_File_Handle for scb6.

  The open image file must now be registered with its IFS; FSw calls
  ImageEntry_Func(NotifyNewImage, 16) to inform DOSFS that it now has a new
  image to handle; DOSFS allocates a new IICB (see iicb2) and returns 2002
  as the Image_Handle to be stored in scb6.

  FSw adds scb6 to the Image_List for scb4.

  FSw is at last able to enquire about file2 itself, and calls
  ImageEntry_File(ReadCatalogueInfo, "file2", 2002) in DOSFS; DOSFS reports
  that this object exists and is a file.

Canonicalisation is now complete - and so FSw can at last complete the
business of opening the client's file:

  FSw allocates a new SCB ready for "file2", filling in all the necessary
  fields except the FS_File_Handle (see scb2).

  Finally, FSw calls ImageEntry_Open("file2", 2002, 12) to open the file,
  and stores the returned FS_File_Handle in scb2.
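
The search through the Image_Lists in the walkthrough above can be sketched
as a recursive prefix match. This is an illustration only: the function name
find_enclosing_image is hypothetical, the SCB class carries just the two
fields the search needs, and names are compared case-sensitively for
simplicity (RISC OS file names are case-insensitive).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class SCB:
    file_name: str                      # relative to the enclosing FS/image
    image_list: List["SCB"] = field(default_factory=list)


def find_enclosing_image(path: str,
                         image_list: List[SCB]) -> Tuple[Optional[SCB], str]:
    """Return (innermost open image containing path, remaining path).

    Returns (None, path) if no open image matches a prefix of path.
    """
    for scb in image_list:
        prefix = scb.file_name + "."    # match whole path components only
        if path.startswith(prefix):
            rest = path[len(prefix):]
            # There may be a nested image within the remaining path.
            inner, inner_rest = find_enclosing_image(rest, scb.image_list)
            if inner is not None:
                return inner, inner_rest
            return scb, rest
    return None, path


scb4 = SCB(":HD4.$.PC1.DrC")
fc_images = [scb4, SCB(":HD4.$.Mac")]
target = ":HD4.$.PC1.DrC.PC2.FDimg.file2"

# Only the outer image is open: the search stops at scb4.
before = find_enclosing_image(target, fc_images)

# Once the nested image has been opened and linked in, the search
# reaches it and only "file2" remains to be looked up.
scb6 = SCB("PC2.FDimg")
scb4.image_list.append(scb6)
after = find_enclosing_image(target, fc_images)
```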



4  Reading data from a file


For the sake of simplicity, we first suppose that the client's request is
to read one block of data - starting at a block boundary - from the file;
we also assume that the FC disc and both DOSFS images all share the same
block size. We do not need to consider buffering under these circumstances,
since the data can be transferred direct from disc to the client's buffer.

  Client calls:
         OS_GBPB(4, 12, buff, 512)
  to read 512 bytes from the current position in the file whose file handle
  is 12, to a buffer at address buff.

  FSw locates the SCB for the file with handle 12 as scb2. The FS_FSCB field
  identifies DOSFS as the filing system to call to read data from this file,
  and the FS_File_Handle field contains the handle by which that filing
  system knows this file.

  FSw calls:
         ImageEntry_GetBytes(202, buff, 512, 2048*)
  in DOSFS to read 512 bytes starting at offset 2048* from the file whose
  handle is 202 into the buffer at address buff.
                            [* Suppose the current file pointer is 2048]

  DOSFS locates the file's ISCB - iscb2 - and from that identifies the IICB
  for the image in which the file lies - iicb2; this in turn yields FSw's
  handle for the image file - 16. Using information about the layout of
  structures in the DOS image, DOSFS calculates that the desired block starts
  at location 4096 in the image file PC2.FDimg, and calls FSw to read those
  bytes from the image file into the buffer:
        OS_GBPB(3, 16, buff, 512, 4096)

  FSw locates the corresponding SCB - scb6 - and as before determines the
  filing system and FS_File_Handle to use; the call to DOSFS this time is:
        ImageEntry_GetBytes(203, buff, 512, 4096)

  DOSFS determines that these bytes start at offset 12288 in the image file
  :HD4.$.PC1.DrC and calls FSw again as:
        OS_GBPB(3, 14, buff, 512, 12288)

  FSw locates the file's SCB - scb4 - and identifies the FS as the HFS FC,
  and the handle as 101; the call to FC is:
        FSEntry_GetBytes(101, buff, 512, 12288)

  FC works out that this block is at location 1048576 on the disc, and, by
  means of the ADFS device drivers, reads bytes 1048576 to 1049087 inclusive
  from disc :HD4 to the buffer at address buff.

  The call stack unstacks.
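
The chain of calls above is a recursion through FSw, one level per image.
The sketch below reproduces the offsets in the walkthrough. It assumes, for
illustration only, that every object is stored contiguously, so that each
layer's translation is a simple base offset; the base values in LAYOUT are
hypothetical numbers chosen to match the example, and the function read is
not a real interface.

```python
from typing import Dict, List, Optional, Tuple

# FSw handle -> (FSw handle of the enclosing image file, base offset).
# None as the enclosing handle means "on the raw disc" (the HFS case).
LAYOUT: Dict[int, Tuple[Optional[int], int]] = {
    12: (16, 2048),        # file2 inside PC2.FDimg
    16: (14, 8192),        # PC2.FDimg inside :HD4.$.PC1.DrC
    14: (None, 1036288),   # :HD4.$.PC1.DrC on disc :HD4
}


def read(handle: int, offset: int, length: int,
         trace: List[str]) -> Tuple[int, int]:
    """Follow a read down through the layers; return (disc address, length)."""
    trace.append(f"FSw handle {handle}: offset {offset}, {length} bytes")
    parent, base = LAYOUT[handle]
    if parent is None:
        # Host filing system: goes to the device driver, not back to FSw.
        return base + offset, length
    # Image filing system: translates the offset and calls FSw again.
    return read(parent, base + offset, length, trace)


trace: List[str] = []
disc_address, count = read(12, 2048, 512, trace)
```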


In reality, FSw maintains a single block buffer for each open file whenever
this proves to be necessary. This means that a client request to read some
arbitrary number of bytes may be met - at least partially - by transferring
data directly from the file's buffer within FSw.

When the buffer needs refilling, FSw will make a call for exactly one block
of data from an offset that is on a block boundary, and - provided that all
block sizes are identical - the process described above will take place. This
means that no further buffering of data from nested image files takes place.

Thus in summary, file buffering within FSw will take place only at the "top"
level when transferring data from/to a file inside a (possibly nested) image
file.
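
The effect of the single block buffer can be sketched as follows. This is
not FSw's actual code: BufferedStream and fetch_block are hypothetical
names, and the sketch covers reads only. The point it illustrates is that
every refill asks the layer below for exactly one block at a block-aligned
offset, whatever the client asked for.

```python
from typing import Callable, List, Optional


class BufferedStream:
    """One block-sized buffer in front of a block-fetching function."""

    def __init__(self, fetch_block: Callable[[int], bytes], block_size: int):
        self.fetch_block = fetch_block      # called with a block-aligned offset
        self.block_size = block_size
        self.buf_offset: Optional[int] = None
        self.buf = b""

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            block_start = offset - offset % self.block_size
            if self.buf_offset != block_start:
                # Refill: exactly one block, starting on a block boundary.
                self.buf = self.fetch_block(block_start)
                self.buf_offset = block_start
            skip = offset - block_start
            chunk = self.buf[skip:skip + length]
            if not chunk:
                break                       # ran past the end of the data
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
        return bytes(out)


data = bytes(range(256)) * 8                # 2048 bytes of sample content
fetched: List[int] = []


def fetch(block_start: int) -> bytes:
    fetched.append(block_start)
    return data[block_start:block_start + 512]


stream = BufferedStream(fetch, 512)
first = stream.read(500, 100)               # straddles two blocks
second = stream.read(600, 50)               # served from the buffer
```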



5  Whole disc images


The descriptions above explain how images which are files inside some other
filing system are handled; what happens if the image occupies an entire
physical disc?

In this case, the disc will be recognised as belonging to an IFS during the
disc identification process (see section 6), and the HFS responsible for
reading data from it will note this in its own internal tables.

Whenever any request is made to read catalogue information about the root
directory on the disc, the HFS will respond that it is a file of the
appropriate IFS type - which FSw will then open as an image file, and
register with the IFS in the usual way.

For example, the SCB for a DOS floppy disc called "MYDOS" might look like
this:

                     scb1

File_Name        :MYDOS.$
FSw_File_Handle        17
FS_FSCB             fscb1
FS_File_Handle        201
Image_Handler       fscb2
Image_Handle         2001
Image_List             {}
Image_SCB             -

                       fscb1          fscb2

filing system             FC          DOSFS
Image_Type                 -        DOSDisc
Image_List            {scb1}           -

After this, for example, OS_Find(64, "ADFS::MYDOS.$") - the "open file" call -
will open the floppy disc as a file of size 720K, allowing bytes to be
transferred directly to/from the raw disc.



6  What happens when a new disc is mounted


User actions:

  Insert floppy into disc drive
  Click on floppy disc icon


What happens inside (note that indentation indicates call nesting):

  Filer:  Calls FSw to read contents of directory "ADFS::0.$"

  FSw:      Calls FC to canonicalise disc name ":0"

  FC:         Calls PollChange(0) to see if the disc in the drive has changed
              since it was last paying attention.

  ADFS:         Returns note that disc in drive 0 has been changed

 FC now knows that a new disc may have been inserted into the drive, and
 must determine what kind of disc it is, and who is responsible for it; this
 is called the "identify disc" sequence:

  FC:         Calls MiscOp_Mount(0) to obtain basic information about the
              disc's physical format.

  ADFS:         Identifies the disc's basic format and fills in the supplied
                disc record noting: density, sectors, sides, heads, etc.

  FC:         Issues Service_IdentifyDisc(0, ADFS_DiscOp,
                    sector cache,  /* to avoid each filing system re-reading,
                                      say, sector 0 */
                    disc record)

              Each FS module will receive this Service_Call in turn, passing
              it on to the next unless it recognises the disc as one that it
              knows about.

              In this example we will assume that there are only two FSs
              present - the IFS DOSFS, and the HFS FC.

  DOSFS:        Looks at disc record, but sees that it cannot be a DOS disc

  FC:           Looks at disc record, and recognises it as, say, E format;
                can now add more information to the disc record, such as
                "interleaved sides".

                Reads more sectors (eg map block and root directory), and
                completes a "sanity check" on the disc's contents.

                Adds more information about the disc to the disc record:
                  - disc name
                  - image type: <EFormatFileCoreFloppy> in this case
                  - disc sequence number

                Claims the service call

                [ Note that "image type" is also known as "disc type", and
                  "disc sequence number" as "image stamp", "disc id" or
                  "disc cycle id" ]

  FC:         The service call has been claimed, which means that the disc
              has been identified; had it not been, an error would be raised.

              The new disc's identity is compared against a table of (max. 8)
              known discs; this comparison includes both disc name and disc
              sequence number.

              Here we assume that the disc is not already known, so a new
              record is added to the table and filled in with information
              from the disc record.

              Finally the drive and disc are cross-referenced: ie the drive
              record says what disc is inserted, and the disc record says
              what drive it's in.

              Disc identification - and hence canonicalisation of the name
              ":0" - is complete, and the disc's name - say ":MyFloppy" - is
              returned to FSw from the disc record.

  FSw:      Calls FC to find out what kind of object ":MyFloppy.$" is

  FC:         Returns "is a directory" to FSw

  FSw:      Calls FC to read the contents of the directory ":MyFloppy.$"

  FC:         Calls ADFS as necessary to read sectors from the disc in order
              to assemble the requested directory entries.

  FSw:      Formats and returns list of directory entries to the Filer.

  Filer:  Displays the directory entries in a viewer.
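
The "identify disc" part of this sequence amounts to offering the disc
record to each module in turn, then looking the result up in the table of
known discs. A minimal sketch, with several assumptions: the disc record is
modelled as a dictionary, the "format" key and the sequence number 7 are
invented for the example, and the behaviour when the table is full is not
described in these notes, so the sketch simply raises an error.

```python
from typing import Callable, Dict, List

DiscRecord = Dict[str, object]
Identifier = Callable[[DiscRecord], bool]    # returns True to claim the call

MAX_KNOWN_DISCS = 8


def identify_disc(record: DiscRecord, modules: List[Identifier]) -> DiscRecord:
    """Offer the record to each module in turn until one claims it."""
    for module in modules:
        if module(record):
            return record
    raise LookupError("disc not understood")


def dosfs(record: DiscRecord) -> bool:
    return False                             # not a DOS disc in this example


def filecore(record: DiscRecord) -> bool:
    if record.get("format") != "E":
        return False
    record.update(disc_name="MyFloppy", image_type="EFormatFileCoreFloppy",
                  sequence_number=7)
    return True                              # claims the service call


def note_disc(known: List[DiscRecord], drive: int,
              record: DiscRecord) -> DiscRecord:
    """Find or add the disc in the known-disc table; link it to its drive."""
    key = (record["disc_name"], record["sequence_number"])
    for disc in known:
        if (disc["disc_name"], disc["sequence_number"]) == key:
            break                            # already known
    else:
        if len(known) >= MAX_KNOWN_DISCS:
            raise RuntimeError("known-disc table full")
        disc = dict(record)
        known.append(disc)
    disc["drive"] = drive
    return disc


record = identify_disc({"format": "E"}, [dosfs, filecore])
known: List[DiscRecord] = []
entry = note_disc(known, 0, record)
```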



7  Image stamping


The idea of the "image stamp" is to make it possible to distinguish between
two discs with the same name - these may be physically different discs, or
the same disc at different times.

For example, consider the case where disc Floppy1 is removed from my computer
and placed in yours; you delete file $.fred from it, and return it to me. I
insert it into my computer and click on the floppy disc icon to see what's
on it.

Without the image stamp, the disc would appear identical to its previous
incarnation, and so cached directory information would be displayed. But if
we suppose the image stamp was updated when $.fred was deleted, then FC will
not recognise the disc as one that it knows about and so will re-identify it
before reading new directory information.


For this scheme to work, FC has to make sure that the image stamp is updated
as soon as the first update is made to the image following its
identification or re-identification*: this is sufficient to ensure that no
confusion can arise if it is removed and used in other machines. 

 [* Suppose I put Floppy1 into my machine for the first time, and suppose
    its stamp value is 35; I add a file, and the stamp value is now 36. You
    take the disc, and read some items from it, and then give it back to me.
    Its stamp value is still 36, and so my machine identifies it as the same
    disc as before; however, its stamp value must be changed - to 37, say -
    as soon as any further update is made, since otherwise your machine will
    erroneously recognise it as unchanged should you take it away again to
    read some more files. ]
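
The scenario in the note above can be played through in a few lines. This is
a sketch under simplifying assumptions: each machine has a single drive, the
stamp is a plain incrementing number, and the class and method names are
hypothetical.

```python
class Disc:
    """A disc carrying its own image stamp."""

    def __init__(self, name: str, stamp: int):
        self.name = name
        self.stamp = stamp


class Machine:
    """Tracks discs by (name, stamp) and restamps on the first update."""

    def __init__(self):
        self.known = set()          # (name, stamp) pairs seen by this machine
        self.needs_restamp = False

    def insert(self, disc: Disc) -> bool:
        """Identify the disc; True means cached information is still valid."""
        recognised = (disc.name, disc.stamp) in self.known
        self.known.add((disc.name, disc.stamp))
        self.needs_restamp = True   # after identification or re-identification
        return recognised

    def update(self, disc: Disc) -> None:
        """Modify the disc, changing its stamp on the first update only."""
        if self.needs_restamp:
            disc.stamp += 1
            self.known.add((disc.name, disc.stamp))
            self.needs_restamp = False


mine, yours = Machine(), Machine()
floppy = Disc("Floppy1", 35)

seen_1 = mine.insert(floppy)        # first time: not recognised
mine.update(floppy)                 # stamp becomes 36
mine.update(floppy)                 # further updates leave it at 36
seen_2 = yours.insert(floppy)       # you only read from it
seen_3 = mine.insert(floppy)        # back with me, stamp still 36
mine.update(floppy)                 # first update since then: stamp 37
seen_4 = yours.insert(floppy)       # your cached view of stamp 36 is stale
```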

Now if the disc is a FC disc, FC can - and does - manipulate the image stamp
directly (and it is then usually called the "disc cycle id"). But if the disc
instead belongs to an IFS, then FC must call the IFS to request it to update
the image stamp when the next change is made to the image contents. When
this happens, the IFS must also inform FC what value the new image stamp has,
so that it knows how to recognise the disc in future. The whole process is
illustrated by the following example:

  User inserts a floppy disc called DOSFLOP into drive 0, which is identified
  as a DOSFS disc.

  FC notes that drive 0 contains a disc called DOSFLOP with image stamp value
  48, say [these values come from the disc record that is filled in by the
  IFS DOSFS when it recognises the disc].

  FC calls FSw [OS_FSControl 51] to ask that the image stamp be updated no
  later than the next update to the image.**

  FSw calls DOSFS [ImageEntry_Func 32] to ask that the image file's image
  stamp be updated no later than the next update to the image.**

  When the image is next updated, DOSFS chooses/calculates a new image stamp
  value - say 72 - and stores it in the image; it then calls FSw [OS_Args 8]
  to inform the underlying FS of the new value.

  FSw sees that the underlying FS is a HFS*, and calls FC [FSEntry_Args 10]
  to inform FC of the new image stamp value.

  FC updates its internal table showing which drives contain which discs to
  show that drive 0 contains DOSFLOP with image stamp value 72.

   [* If the underlying FS was another IFS, FSw ignores the OS_Args 8 call:
      there is no corresponding ImageEntry_Args 8 ]

   [** FC does this only on *re*identification; if it is the *first* time
       that the disc has been recognised, FSw will register the image with
       DOSFS by calling ImageEntry_Func 21 - and DOSFS will itself note that
       the image's stamp must be updated no later than the next update to
       the image. ]
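
The round trip between FC, FSw and DOSFS in the example above can be
sketched as three cooperating objects. The method names echo the call
numbers quoted in the text, but their signatures are invented for the
sketch: in particular a drive number stands in for the file handles that
the real calls would carry, and the new stamp value is passed in rather
than calculated.

```python
class FC:
    """Host filing system side: remembers which disc is in which drive."""

    def __init__(self):
        self.drives = {}                    # drive -> [disc name, stamp]

    def note_disc(self, drive, name, stamp):
        self.drives[drive] = [name, stamp]

    def fsentry_args_10(self, drive, new_stamp):
        self.drives[drive][1] = new_stamp   # learn the new stamp value


class FSw:
    """Routes the calls between the two filing systems."""

    def __init__(self, fc):
        self.fc = fc
        self.ifs = None

    def os_fscontrol_51(self):
        self.ifs.imageentry_func_32()

    def os_args_8(self, drive, new_stamp):
        # The underlying FS is a HFS here, so the call is passed on.
        self.fc.fsentry_args_10(drive, new_stamp)


class DOSFS:
    """Image filing system side: owns the stamp stored in the image."""

    def __init__(self, fsw, drive, stamp):
        self.fsw, self.drive, self.stamp = fsw, drive, stamp
        self.stamp_pending = False

    def imageentry_func_32(self):
        self.stamp_pending = True           # restamp at the next update

    def update_image(self, new_stamp):
        if self.stamp_pending:
            self.stamp = new_stamp
            self.stamp_pending = False
            self.fsw.os_args_8(self.drive, new_stamp)


fc = FC()
fsw = FSw(fc)
dos = DOSFS(fsw, drive=0, stamp=48)
fsw.ifs = dos

fc.note_disc(0, "DOSFLOP", 48)      # disc re-identified with stamp 48
fsw.os_fscontrol_51()               # FC asks for a restamp at next update
dos.update_image(72)                # next update: DOSFS restamps, tells FSw
```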


If the IFS has a free hand to implement the image stamp field, then a simple
sequence number - cyclically incremented each time a new value is required -
suffices. But if the IFS is actually interpreting an existing format (as
DOSFS does), then this may not be possible, and some suitable pre-existing
field must be used.

This means that the IFS may change the image stamp more frequently than the
underlying FS requires, and so may need to make "stand-alone" calls of
OS_Args 8 that are not a direct consequence of the interplay between FC and
the IFS described above.


We have so far assumed that the image handled by the IFS is a complete
(removable) disc; what happens if the image is simply a file inside another
FS? In this case, the image cannot be independently moved - it is sufficient
to recognise that the disc containing the image has changed. So in this case,
the concept of image stamp is redundant:

   - FSw can afford to ignore calls of OS_Args 8
   - Calls of OS_FSControl 51 should not arise (and so can be faulted)


Finally, there is a special requirement for "back-up" utilities which may
wish to ensure that a backed-up copy of a disc is distinguishable from the
original by giving it a new image stamp value. This is done by calling FSw
directly to update the image stamp immediately - a special option of
OS_FSControl 51.

If the backed-up disc is one managed by an IFS, FSw will then call
ImageEntry_Func 32 to update the image stamp immediately, and a sequence
similar to the one described before takes place.

If the backed-up disc is one managed by a HFS, then FSw calls FSEntry_Func 32 which
must update the image stamp directly. [Note that if FC is split into FC-S
and FC-D (see section 10 below) then this call will no longer be required.]



8  Device driver interface, including background transfers


ADFS_DiscOp provides three different facilities for data transfer:

  a) foreground transfer from/to a single buffer
  b) foreground transfer from/to a number of buffers
  c) part background transfer from/to a number of buffers

In each case, the transfer is of a contiguous sequence of bytes from a single
start disc address, which must be a multiple of the sector size (ie it's on a
sector boundary).


When a device driver (DD) "registers" with FileCore - by calling
FileCore_Create(..) to create a new instantiation of the FileCore module -
FC returns the addresses of three "call back" routines for the DD to use:

  - FloppyTransferDone: called when a floppy background transfer completes
  - WinnieTransferDone: called when a hard disc background transfer completes
  - ReleaseFIQs: called to release FIQ ownership


The device driver needs to use FIQs, and FileCore (FC) claims and releases
FIQs on its behalf, as follows:

  - FC ensures FIQs are claimed before calling the DD to transfer data
  - The DD must call back to FC (ReleaseFIQs) as soon as the transfer is
     complete (this applies to foreground as well as to background transfers)


Foreground transfers are straightforward; type (a) caters for transfers of an
arbitrary number of bytes into a single buffer, whereas type (b) allows the
data to be scattered over a number of buffers - all but the last of which
must be exact multiples of the sector size. The total number of bytes to be
transferred is given in R4, and the "scatter list" consists of an array of
address/length pairs addressed by R3:

            -------------------
    R3 ->  |  addr1  |  len1   |
           |-------------------|
           |  addr2  |  len2   |
           |-------------------|
           |       .....       |
           |                   |
           |-------------------|
           |  addrn  |  lenn   |
            -------------------

    where  len1 + len2 + ... + lenn >= R4

When the transfer is complete, R3 will point to the first entry which was not
completed, and its address and length will be updated appropriately to
reflect the transfer that has taken place.
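
The bookkeeping for a type (b) transfer can be sketched as below. This is
an illustration, not the driver's code: memory is modelled as a bytearray,
the "addresses" are indexes into it, and the returned index plays the part
of R3 on exit. The sketch also updates the entries it has completed, which
is an assumption; the text above only states what happens to the first
uncompleted entry.

```python
from typing import List


def scatter_read(data: bytes, scatter: List[List[int]], total: int,
                 memory: bytearray) -> int:
    """Copy `total` bytes of `data` into the buffers named by `scatter`.

    Each entry is a mutable [addr, length] pair. Returns the index of the
    first entry that was not completed.
    """
    if len(data) < total:
        raise ValueError("not enough source data")
    index = 0
    done = 0
    while done < total and index < len(scatter):
        addr, length = scatter[index]
        n = min(length, total - done)
        memory[addr:addr + n] = data[done:done + n]
        scatter[index] = [addr + n, length - n]   # reflect progress in place
        done += n
        if scatter[index][1] == 0:
            index += 1                            # entry used up, move on
    return index


memory = bytearray(4096)
data = bytes(i % 251 for i in range(1200))
scatter = [[0, 512], [1024, 512], [2048, 512]]

# 1200 bytes fill the first two buffers and 176 bytes of the third.
next_entry = scatter_read(data, scatter, 1200, memory)
```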


The scatter list for a background transfer has the following form:

            -------------------
      -8   |  error  | status  |
           |-------------------|
       0   |  addr1  |  len1   |
           |-------------------|
       8   |  addr2  |  len2   |
           |-------------------|
           |       .....       |
           |                   |
           |-------------------|
           |  addrn  |  lenn   |
           |-------------------|
       N   |    -N   |  <n/a>  |
            -------------------

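It is treated as an infinite list, with addrn/lenn immediately followed by
addr1/len1 again; the final entry, whose first word holds the negative offset
-N, appears to be what directs the DD back to the first pair. The transfer
completes as soon as an addr/len pair is encountered with a zero length.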

On entry, R4 specifies the number of bytes to be transferred in the
foreground, and R3 points to an entry in the scatter list (but not
necessarily to the first addr/len pair); both R4 and all lengths must be
exact multiples of the sector size.

The DD first transfers R4 bytes, and then sets up an interrupt-driven
background process to continue data transfers before returning to FC. The DD
updates the relevant scatter list entry (increment addr, decrement len) as
each sector is successfully transferred.

The error and status words at the head of the scatter list should be cleared
on entry; they are set by the DD as necessary when the transfer is over. The
status word currently has two bits:

  - process active
  - process can be extended

It seems that these two bits are always either both zero or both one (?).


Most importantly, at most one transfer can be in progress at any one time:
the DD does not "queue" transfer requests, nor "suspend" any on-going
background transfer in order to initiate a foreground transfer.

In fact, an attempt to initiate a new transfer before the previous one has
completed will quite likely crash ADFS_DiscOp - it is FC that makes sure that
only one transfer is in progress at a time. To be precise, FC makes sure that
there is - at any one time - at most:

  - one transfer on drives 0 to 3  (floppies)
  - one transfer on drives 4 to 7  (hard discs)

This means that drivers which *can* handle multiple transfers (such as SCSI,
where one transfer per LUN is feasible) are not optimally utilised.
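
The rule itself is simple to state in code. The sketch below is not FC's
mechanism (which is not described here); TransferGate and drive_group are
hypothetical names used only to illustrate the "one transfer per group of
drives" constraint.

```python
def drive_group(drive):
    """Map a drive number to the group that shares a single transfer."""
    if not 0 <= drive <= 7:
        raise ValueError("drive number must be in the range 0 to 7")
    return "floppy" if drive <= 3 else "hard"


class TransferGate:
    """Allow at most one transfer in progress per drive group."""

    def __init__(self):
        self.busy = set()

    def try_start(self, drive):
        group = drive_group(drive)
        if group in self.busy:
            return False          # a transfer is already in progress
        self.busy.add(group)
        return True

    def finish(self, drive):
        self.busy.discard(drive_group(drive))


gate = TransferGate()
results = [gate.try_start(0), gate.try_start(2),
           gate.try_start(4), gate.try_start(5)]
gate.finish(0)
results.append(gate.try_start(2))
```

A transfer on drive 2 is refused while drive 0 is busy, although the two
drives could in principle be served independently; this is the lost
opportunity noted above for drivers such as SCSI.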



9  Read-ahead and write-behind


"Read-ahead" is what happens when FC decides to read more data from a file
than its client has asked for.

"Write-behind" is what happens when FC decides to write client data to a file
in the background.

Both of these activities require background transfers from/to buffers which
are taken from a pool allocated for this purpose and managed by FC. The size
of this pool is determined by the configuration option "ADFSBuffers".

FC is responsible for deciding whether to read-ahead or write-behind in
response to a particular client request. The precise algorithms used to make
this decision are hidden inside the code, but clearly issues such as how
many sequential reads have been made recently, and how much data is to be
written, must be taken into account. Since at most one background operation
can be in progress at any one time, whether a background operation is already
in progress also matters: such an operation could be stopped prematurely, if
appropriate, in order to start a "more useful" one. (Does FC do this?)


Note that any data that has been written-behind or read-ahead will always
have suffered an extra RAM-to-RAM copy. For example, suppose a client makes
a request to read a single sector into a buffer: if the sector has not been
read-ahead, the transfer can be made directly from disc into the client's
buffer; otherwise, it will be a copy from an FC buffer into the client's
buffer. This is true even if the "client" is, in fact, FSw - which may be
simply maintaining a file buffer on behalf of the true client who is reading
in non-sector-sized chunks.



10 Proposed new architecture for FileCore


This proposal (due to Jonathan Roach) is to re-organise the architecture of
the present FSw/FC interface as a prelude to introducing NewCore.


The main idea is to split FileCore into two components:

  FileCoreDiscs (FC-D) - which will be responsible for handling whole discs
    as files; and

  FileCoreStructures (FC-S) - which will be responsible for managing the
    contents of an image file as a FileCore filing system.

In other words, FC-S will behave as an image filing system, and FC-D as a
(very simple) host filing system.


Features provided by FC-D will include:

  - read/write blocks from the image file
  - background read/write blocks

and both of these will need to support I/O on very large files (of, say, at
least 1 Tbyte).

[ Note that if we address by block number instead of by bytes, 2^32 blocks of
  size 512 bytes is 2 Tbytes - so maybe a 32-bit parameter is still adequate.]
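
The arithmetic behind this note, assuming binary units (1 Tbyte = 2^40
bytes), can be checked as follows; the variable names are illustrative only.

```python
BLOCK_SIZE = 512       # bytes per block, as in the note above
ADDRESS_BITS = 32      # width of the offset parameter

GBYTE = 2 ** 30
TBYTE = 2 ** 40

# Largest image addressable when the parameter counts bytes.
byte_addressed_limit = 2 ** ADDRESS_BITS

# Largest image addressable when the parameter counts 512-byte blocks.
block_addressed_limit = (2 ** ADDRESS_BITS) * BLOCK_SIZE

print(byte_addressed_limit // GBYTE, "Gbytes when addressing bytes")
print(block_addressed_limit // TBYTE, "Tbytes when addressing blocks")
```

So a 32-bit block number covers the 1 Tbyte target with a factor of two to
spare, whereas a 32-bit byte offset stops at 4 Gbytes.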

These interfaces will have to be reflected back through FSw, so that FC-S
can manage structures on such large image files.


FSw should also take over responsibility for all file I/O including
read-ahead/write-behind buffering; this is made possible by the following
new entry-point which should be supported by FC-S (and all other
participating IFSs such as DOSFS):

  ImageEntry_Args(ReadContigSpace, handle, file_offset,
                                          &image_offset, &contig_blocks)

   - locates the position of block 'file_offset' of the open file 'handle'
     inside its image file; this position is returned in 'image_offset', and
     'contig_blocks' is set to the number of blocks of the file which are
     contiguous at that point

As an example of the use of this function, consider the case where a client
application calls FSw to read some data from an open file; we suppose the
file is a DOSFS file inside a DOSFS image that is itself a FC-S file, and is
called ":HD4.$.PC.F".


  FSw first looks to see if the data is already inside a file buffer for
  the file :HD4.$.PC.F; we assume not.

  FSw now works out which block(s) it needs to read in order to satisfy the
  request; for simplicity we assume that a single block is sufficient, and
  that this is block 56. FSw allocates a file buffer to hold this block,
  and - we shall assume - decides to read-ahead on this file as well.

  FSw calls ImageEntry_Args(ReadContigSpace, F_hdl, 56) in DOSFS to find out
  how many blocks are available to read contiguously; DOSFS says that block
  56 of file F can be found at block 83 in the image file, which is itself
  immediately followed by 7 more blocks of F:

    file F:     56  57  58  59  60  61  62  63  64  65  66  67 ...

    file PC:    83  84  85  86  87  88  89  90 120 121 122 123 ...

  FSw now calls ImageEntry_Args(ReadContigSpace, PC_hdl, 83) in FC-S to see
  whether this part of the image file is contiguous, and, if so, how much is
  available; FC-S says there are just 3 blocks starting at 22:

    file PC:    83  84  85  86  87  88  89  90 ...

    file $:     22  23  24  44  45  46  47  48 ...

  FSw notes that file $ is managed by a HFS*, and does not pursue its queries
  further; it now knows that the next three blocks of file F are contiguously
  placed on the HFS upon which the file is stored.

  [* or:  FSw notes that FC-D does not support the ReadContigSpace call ...
     or:  FSw calls FSEntry_Args(ReadContigSpace, $_hdl, 22) in FC-D to see
           if the HFS thinks it's worth reading these blocks in one go ...]

  FSw must read the first block directly, and decides to read-ahead the next
  two blocks; two read-ahead buffers are assigned (for blocks 23 and 24), and
  a suitable scatter list constructed for the foreground/background DiscOp to
  do this. As soon as the DD returns to FSw, FSw returns to the caller.
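
The chain of ReadContigSpace calls in this walk-through can be sketched as
below. Each IFS is modelled as a table of extents taken from the two diagrams
above; the table layout, the function names and the lengths of the second
extents (only as many blocks as the diagrams show) are assumptions made for
illustration.

```python
def read_contig_space(extents, file_offset):
    """Model of ReadContigSpace for one file.

    extents is a list of (file_start, image_start, nblocks) runs.
    Returns (image_offset, contig_blocks) for block file_offset.
    """
    for file_start, image_start, nblocks in extents:
        if file_start <= file_offset < file_start + nblocks:
            delta = file_offset - file_start
            return image_start + delta, nblocks - delta
    raise ValueError("block is not allocated")


def resolve(stack, block):
    """Follow a block down through a stack of IFSs, outermost file first.

    The contiguous run that survives is the shortest one reported by any
    layer, since a break at any level breaks the run on disc.
    """
    run = None
    for extents in stack:
        block, contig = read_contig_space(extents, block)
        run = contig if run is None else min(run, contig)
    return block, run


f_in_pc = [(56, 83, 8), (64, 120, 4)]    # file F inside image PC (DOSFS)
pc_in_root = [(83, 22, 3), (86, 44, 5)]  # file PC inside $ (FC-S)

first = resolve([f_in_pc, pc_in_root], 56)
second = resolve([f_in_pc, pc_in_root], 57)
```

For block 56 this gives block 22 with a run of 3, matching the text: one
block to read directly and two to read ahead. For block 57 it gives block 23
with a run of 2.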


  A short time later, the client application asks for some more data from F;
  we assume this is a sequential request.

  FSw determines which blocks it needs in order to meet the client's request;
  let's suppose it's just block 57.

  As before, FSw makes a series of calls to the FSs involved in implementing
  the file F (DOSFS, FC-S - and, maybe, FC-D) to determine the current
  location of block 57 of file F on the disc and its contiguity. This is
  necessary because the file may have been moved (by DOSFS or FC-S) since it
  was last accessed. Suppose the result is that block 57 is still at block
  23; then FSw can return data directly from the read-ahead buffer (perhaps
  after waiting a short time for the read to complete). If not, a new read
  must be initiated - after aborting the previous one if it is still in
  progress.


This scheme works provided that all of the IFSs in the "stack" that supports
a particular file are "data-transparent": that is, no IFS in the stack alters
a file's contents. This means that IFSs which encrypt or compress cannot be
processed in this way; instead:

  FSw notes that the IFS does not support the ReadContigSpace call, and so
  just reads each block from it as necessary.

  The IFS, on the other hand, makes requests to FSw to read data from the
  underlying image file - and these will be processed in exactly the same
  way as top-level client requests.

This means that access to encrypted files, say, may still benefit from
read-ahead on the underlying image.


Another complication arises if the IFS stack includes IFSs with different
natural block sizes; this can probably be handled by making the
ReadContigSpace interface a little more complicated.
