Proposal to allow specification of allocation strategies

Jonathan Poole


Russel's original mail:

To: A.McEwan@lpac.ac.uk, J.Poole@cs.ucl.ac.uk, G.Roberts@cs.ucl.ac.uk
Subject: A thought on loading.
Phone: +44 (0)171 380 7293
Fax: +44 (0)171 387 1397
Date: Sun, 16 Apr 95 10:44:32 +0100
From: Russel Winder <R.Winder@cs.ucl.ac.uk>


I think the contents of this file probably need adding into the "bugs"
or "todo" database.

Consider the following scenario.  I want to write a UC++ program
(perhaps the Sieve) that executes locally but puts all the prime
number objects on the DEC mpp.  This raises a number of issues:

1. It would be sensible to be able to force loading of the prime
number objects onto the DEC mpp in the UC++ source.  Mian's idea was
to put the Internet name string into the file.  This is a bit
non-portable but does work.  The alternative, relying on the UCConfig
file and on clauses, is more portable but can break through user
error in either the on clauses or the UCConfig file (the need to keep
the two synchronised is dangerous).

2. It should be possible to mark machines specified in the UCConfig
file as not usable as part of the round robin allocation strategy.  We
only want certain objects (those with an on clause) to be loaded on
certain machines.  In the scenario above we don't want general objects
loaded on the DEC mpp.

3. We need to be able to amend allocation strategies.  In fact we
need a mechanism for user-defined allocation strategies.  There are
two routes here:  UC++ defines a set of allocation strategy options
and provides a (perhaps file-based) mechanism for selecting between
them; or the user has to program the allocation in their UC++ code.
The latter requires the user to be able to find the number of real machines
available to them and also their type (remember Terry's discussion of
graphics boxes and the requirement for group loading -- a set of boxes
offering the same services for a set of objects).

There was a fourth but I can't remember it.

1 is an issue of how the system is to be used but is not critical
immediately.  3 is probably important but (apart from ensuring that
the library can tell the user program how many real processors there
are) is probably not critical immediately.  2 is something I had
overlooked until now (even if someone had already mentioned it) and is
I think crucial now.
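
Point 2 can be illustrated with a minimal sketch in standard C++ (not
UC++).  It assumes each machine read from the configuration carries a
flag saying whether the round robin strategy may use it; the names
Machine, general_use and RoundRobinAllocator are hypothetical and do not
come from the UC++ library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one configured machine.
struct Machine {
    std::string host;
    bool general_use;  // false: reachable only through an explicit on clause
};

// Hypothetical round robin allocator that skips reserved machines.
class RoundRobinAllocator {
public:
    explicit RoundRobinAllocator(std::vector<Machine> machines)
        : machines_(std::move(machines)) {}

    // Index of the next general-use machine, or -1 if there is none.
    int next() {
        for (std::size_t tried = 0; tried < machines_.size(); ++tried) {
            std::size_t candidate = cursor_;
            cursor_ = (cursor_ + 1) % machines_.size();
            if (machines_[candidate].general_use) {
                return static_cast<int>(candidate);
            }
        }
        return -1;  // empty configuration, or every machine is reserved
    }

    const Machine& at(std::size_t i) const {
        assert(i < machines_.size());
        return machines_[i];
    }

private:
    std::vector<Machine> machines_;
    std::size_t cursor_ = 0;
};
```

With the DEC mpp entered as a reserved machine, objects created without
an on clause would cycle over the remaining machines only.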

Russel.

Jonathan's reply:

To: Russel Winder <R.Winder@cs.ucl.ac.uk>
cc: A.McEwan@lpac.ac.uk, J.Poole@cs.ucl.ac.uk, G.Roberts@cs.ucl.ac.uk
Subject: Re: A thought on loading.
In-reply-to: Your message of "Sun, 16 Apr 95 10:44:32 BST."
Date: Tue, 18 Apr 95 12:13:55 +0100
From: Jonathan Poole <J.Poole@cs.ucl.ac.uk>

Russel,

Re: your message about loading, and mapping the machines specified in the on 
clause to real machines.  I have given thought to this, but have not put 
forward any specific points as it has not arisen.  My view is that we need at
least one level of indirection between compilation and running: we want a
single set of executables that will run on different configurations.

At present each object is sent to a particular machine, though the particular 
mapping of virtual machine number to machine is set only at runtime, based on 
the UCConfig file info.  This latter flexibility is very important, as we want
the programmer to be able to tweak the particular configuration at runtime.

At present the mapping is many-to-one: many objects can be put on one
machine, but we can't do many-to-many.  I think the argument to the
on-clause should be not a machine number but a group number: thus we might have

        C* c = activenew C on FARM;
        W* w = activenew W on MASPAR;
        
where FARM might be a group of machines with different addresses.  In the 
UCConfig file we would have

FARM x.cs /cs/research/..../ file1.exe
FARM y.cs /cs/.....        / file2.exe
FARM ....
FARM
MASPAR jupiter.lpac.ac.uk /usr/maspar/uc++/ maspar.exe
GRAPHICS ...
GRAPHICS ...
GRAPHICS ...


and so on.  Of course these group ids might be numbers or strings rather than
symbolic names.  We might have a group called DEFAULT that is used for
machines that are not given another group name, and for objects not given an
explicit on clause (though in practice I don't believe it would ever really be
useful to be able to omit the on clause).  We might also have other
predefined machine names such as CONSOLE, EXCEPTIONHANDLER and so on.

A particular machine could also be part of more than one group presumably.
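
The group scheme above can be sketched in standard C++ (not UC++) as
follows.  The sketch assumes every usable line has exactly four
whitespace-separated fields, "GROUP host directory executable"; any other
line, such as a bare group name, is skipped.  MachineEntry, GroupTable,
parse_groups and pick are hypothetical names, and cycling through a group
is only one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One machine entry; the field names are hypothetical.
struct MachineEntry {
    std::string host;
    std::string directory;
    std::string executable;
};

// Group name -> machines in that group.  The same host may be listed
// under several groups, so a machine can belong to more than one group.
using GroupTable = std::map<std::string, std::vector<MachineEntry>>;

// Parse lines of the assumed form "GROUP host directory executable".
// Lines without exactly four fields are ignored.
GroupTable parse_groups(std::istream& in) {
    GroupTable groups;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<std::string> tokens;
        std::string token;
        while (fields >> token) tokens.push_back(token);
        if (tokens.size() != 4) continue;
        groups[tokens[0]].push_back(
            MachineEntry{tokens[1], tokens[2], tokens[3]});
    }
    return groups;
}

// Resolve a group name to one machine, cycling through the group's
// members.  `counter` keeps the per-group allocation count between calls.
// Returns nullptr for an unknown or empty group.
const MachineEntry* pick(const GroupTable& groups, const std::string& group,
                         std::map<std::string, std::size_t>& counter) {
    GroupTable::const_iterator it = groups.find(group);
    if (it == groups.end() || it->second.empty()) return nullptr;
    std::size_t index = counter[group]++ % it->second.size();
    return &it->second[index];
}
```

Because the table is built at run time from the configuration text, the
same executables could be pointed at a different set of machines just by
editing the file, which is the indirection argued for above.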

This idea is not completely thought out yet, but I believe it is completely 
general, and answers all the points in RW's mail. If it could be combined with 
the "parallel slackness" that can come from lightweight objects, I believe it 
would allow a complete separation between compilation of code and parallel 
configuration.

A possible extension would be to allow wildcarding, so we could have a 
hierarchy of machines such as 
        
FARM:GROUPA:MACHINE1 blah.blah.blah /...
FARM:GROUPA:MACHINE2 blah.blah.blah /...

etc, so we could say things like activenew on FARM:*:* or on FARM:GROUPA:*,
giving finer control of which objects go on the same machine.  This would allow
us to express ideas like "these objects can be spread out as much as possible,
but if they are clustered together, it is better to put this subset on one
machine as there is more communication between them", and so on.  I'm not
sure that this extension is necessary at this stage, however, and it would 
seem to be easy enough to extend to this later if the need is found.
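
A minimal sketch of how such wildcard names might be matched, in standard
C++ (not UC++).  It assumes that names are split on ':', that "*" stands
for exactly one component, and that a pattern must have the same number of
components as the name; split_name and matches are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split "FARM:GROUPA:MACHINE1" into {"FARM", "GROUPA", "MACHINE1"}.
std::vector<std::string> split_name(const std::string& name) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        std::size_t colon = name.find(':', start);
        if (colon == std::string::npos) {
            parts.push_back(name.substr(start));
            return parts;
        }
        parts.push_back(name.substr(start, colon - start));
        start = colon + 1;
    }
}

// True if `pattern` matches `name` component by component.
// A "*" component matches any single component of the name.
bool matches(const std::string& pattern, const std::string& name) {
    const std::vector<std::string> p = split_name(pattern);
    const std::vector<std::string> n = split_name(name);
    if (p.size() != n.size()) return false;
    for (std::size_t i = 0; i < p.size(); ++i) {
        if (p[i] != "*" && p[i] != n[i]) return false;
    }
    return true;
}
```

Under these assumptions FARM:*:* selects every machine in the farm, while
FARM:GROUPA:* selects only the machines in GROUPA.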

Jonathan



Jonathan Poole
Wed Aug 2 17:54:26 BST 1995