[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re. new process synchronisation primitives ...



Hi ev'rybody,

Shame about the T9000.

Now, for something much more fun ... Rick wrote:

> {{{  Abstract Data Types
>  
> <snip>
>  
> >Although users have visibility of the data structures used for each primitive,
> >advantage must *not* be taken of this visibility.  Components of the data
> >structures must *not* be accessed directly by user programs.  Instances of
> >the primitives may *only* be operated on by calling the PROCs and FUNCTIONs
> >provided.
> 
> Maybe there's scope here for syntax to declare a DATA TYPE without
> defining it. This is probably tricky ground, but the advantage of
> declaring a DATA TYPE in a #INCLUDE file and defining it separately in a
> #USE module would be to give private scope to the elements within the
> DATA TYPE. Getting dangerously 
> O.O. ...    :-)

occam3 supports proper ADTs ... see page 98 of the draft reference manual.
I think it goes like this (I don't have the manual in front of me!):

--------------------------------------------------------------------------------

  LIBRARY            -- separatly compiled unit (should have a name!)

    EXPORT

      DATA TYPE NAME SEMAPHORE:

      PROC initialise.semaphore (SEMAPHORE s, VAL INT count):
      PROC claim.semaphore (SEMAPHORE s):
      PROC release.semaphore (SEMAPHORE s):
    
    FROM

      ...  body (full implementations of the above)
  
  :

--------------------------------------------------------------------------------

> {{{  RESOURCEs
>  
> {{{  Differences between this prototype and T9000 RESOURCEs
>  
> <snip>
> 
> >However, the T9000 implementation is crucially different.  Only the server
> >is aware of the sharing and it's this that enables the direct implementation
> >of a distributed shared channel.
> 
> N.B. the CLAIM is still important - it is the means by which the
> compiler ensures correctness of the program.

Definitely!  occam3 clients of a SHARED channel need to know it's shared and
so does the server end.  The T9000 RESOURCE implementation does not need the
clients to know and the SEMAPHORE implementation does not need the server to
know -- but, whichever implementation is used, the compiler and language still
make both ends know (for reasons of being able to check security).

> {{{  RESOURCEs versus SEMAPHOREs for SHARED channels
> 
> <snip>
> 
> >  o The SHARED channel isn't actually shared!  We still have to define and
> >    wire up a private channel between each client and the server.
>  
> Isn't this bound to be the case for a distributed-memory implementation?

No.  The SEMAPHORE implementation of a distributed-memory SHARED channel only
requires one virtual channel (or channel-pair) from each processor containing
a client to the processor containig the server.  Multiple clients in the same
processor share that single virtual channel (or channel-pair) using a local
SEMAPHORE.  The RESOURCE implementation requires a virtual channel (or
channel-pair) from *every* client to the server.

I will write up the SEMAPHORE-distributed SHARED channel real soon now (:-).
We will support this under KRoC for workstation clusters (:-).

Note that for a (virtual) shared-memory multi-processor, the simple scheme
with SEMAPHOREs will work directly (modulo the same modifications we will
have to make to CHANs to enable them to work properly in such an environment).

> It seems to me that the advantages/disadvantages of each balance out
> fairly evenly. SEMAPHOREs might be considered a desirable addition to
> occam if it were to support efficient shared memory accessing, but this
> may well be without the normal provability benefits of occam programs.

I'm not proposing ever making SEMAPHOREs visible at the language level --
well, probably not!  Direct use of SEMAPHOREs enable their misuse.  But
they seem to be very useful to implement higher-level ideas (like SHARED
channels) that can be made secure.

> {{{  Barrier synchronisation
> 
> <snip>
> 
> >Since this is occam, we are not restricted to the replicated PAR of SPMD.  For,
> >example, the following system has three different processes synchronising on
> >the named barrier:
> > 
> >  ...  shared global data
> >  PAR EVENT e
> >    ...  process A
> >    ...  process B
> >    ...  process C
> 
> I think it needs stating explicitly that A, B, and C must all contain
> 'SYNC e', a requirement which the compiler will check for.

Unless we start inventing schemes for expressing cyclic processes (such as
the occam3 SERVER declaration),  I don't think we can enforce such a rule.
For instance, it would make unrolling a loop (with the requisite one SYNC)
into one with ten SYNCs illegal ... this is such a natural transformation
that we can't ban it.  This answers your later question:

> Are multiple SYNCs meaningful? For example
> 
>     PAR EVENT e
>       SEQ
>         ...
>         SYNC e
>         ...
>         SYNC e
>       SEQ
>         ...
>         SYNC e
>         ...
>         SYNC e

with a "yes".

> What if the different processes have differing numbers of SYNCs?

Again, if we allow arbitrary processes under a "PAR EVENT e", this must also
be allowed.  It's the programmer's responsibility to understand about the
steps between SYNCs ... this is a dynamic property that need not be reflected
by syntactic structure.  Of course, we may want to fix the steps-n-SYNCs
with some syntactic structure, but that will require some *new* syntax
(a bit like the SERVER declarations of occam3).

> To what extent does the proposed EVENT/SYNC mechanism fulfil the
> requirements of BSP?

BSP just uses a barrier synchronisation, so these EVENTs and SYNCs should
be fine for them.  With our ability to control multiple process groups
under different barriers and our toleration of the numbers of processes
in each group varying, EVENTs are much more flexible than standard BSP
synchronisation.  But I hear that BSP is graduating towards allowing multiple
process groups at least ... EVENTs are slightly ahead of them!

But we don't have the BSP shared-memory read/write model, where you read/write
shared-memory to/from local memory and the read/writes don't take effect
until after the next barrier.  We could introduce read/write primitives
that work like that ... do we want to?  It would give us BSP-programming
capability from occam ... ?!!

> >Such conclusions give no optimism for a sound engineering basis for parallel
> >computing and raise fundamental questions as to its viability.  Fortunately,
> >the scientific insights of occam and CSP, along with those of BSP, are
> >becoming available to a wider audience.  Unfortunately, resistance to the
> >concept of sound engineering is hard to underestimate in today's mainstream
> >computer culture.
> 
> ...
> 
> You probably mean 'overstate' rather than 'underestimate'. 

I think you're right!  Natural lanhuage is tricky stuff ...

> It might be worth commenting that
> 
>         PAR i = 0 STEP stride FOR n
>           A[i] := A[i] + A[i + (stride >> 1)]
> 
> can be expressed as 
> 
>         PAR j = 0 FOR n
>           VAL i IS j * stride:
>           A[i] := A[i] + A[i + (stride >> 1)]
> 
> at additional cost. STEP is worth adding to the language if the
> arithmetic involved is substantially cheaper than the multiplication
> needed otherwise. It should of course be added to SEQ, IF and ALT
> replicators too. Also to be considered: can the stride have a negative
> value?

Yes, yes and yes.  In general:

  <replicator> i = <start-exp> STEP <step-exp> FOR <for-exp>
    ...  body

maps to:

  <replicator> i = <start-exp> FOR <for-exp>
    VAL INT i IS <start-exp> + (i * <step-exp>):
    ...  body

but the formal STEP can be compiled into one addition per loop-end.  Of course,
without a STEP (or with a constant <step-exp> of 1), exactly the same code
as at present would be generated.

It doesn't matter a hoot if <step-exp> is negative.  The <for-exp> says how
many times we iterate and this is independent of <start-exp> and <step-exp>.
We don't have the mental (and semantic) gymnastics of Pascal or C.  The
replicator defines an arithmetic sequence of control values:  <start-exp>
gives the start value, <step-exp> gives the interval and <for-exp> gives
the length.  Beautiful!

> At first I thought that EVENTs and BUCKETs were essentially the same.

Their implementations are almost identical, but their semantics are very
different.  EVENTs are deterministic, BUCKETs are not.

> Now perhaps I think BUCKETs are more like an enhancement of suspend and
> resume in the posix thread sense.  ...

I don't know posix -- guess I should!  However, BUCKETs are a bit like the
the `synchronized' methods `wait' and `notifyAll' of Java.  To fall into
a bucket, in Java you `synchronize' on a raw OBJECT (which is the bucket)
and then `wait'.  That puts you on the wait-queue for the object, along
with any other fallers.  To flush the bucket, the flushing process also
`synchronizes' on the same raw OBJECT and executes the `notifyAll'.  That
relases all the fallers from the wait-queue.  However, they then go back
into the main queue for the OBJECT and, when they reach the front, just
exit their original (falling) `synchronized' method.

So, to flush the Java bucket, you get on the queue (behind other fallers)
and kick the bucket over once you're in.  To fall into a bucket, you go
through 3 queues: the main-queue twice and the wait-queue once.  The
implementation code is short but the execution is long.  Flushing is
linear in the number of processes in the bucket.

> ................................  Does that make them dangerous to use?
> What happens if A flushes an empty bucket then B falls into the bucket
> (i.e. is there a risk of race hazards)?

I'll answer this later!  Off to lunch now ...

Peter.