[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Oh, Java!



//{{{}}}

Dear All,

I had a nasty Java shock this morning ... but it sorted itself out by
lunchtime ... maybe.  Please could you help by compiling and running
the enclosed Java program on any Java systems to which you have access?
Please mail me the output and say what platform and Java system you used.

The problem I had was some grossly unfair scheduling of some JavaPP
processes.  The current implementation of JavaPP shared channels
(multi-writer/multi-reader) relies on the threads kernel underlying
the JVM to do just one sensible thing: queue any threads trying to
synchronise on an object when another thread holds the monitor for
that object.  So, when the monitor is released, the thread that was
waiting the longest to synchronise gets in next.

If we don't have this property, we have no guarantee that a thread
will ever be able to synchronise with an object (even when deadlock
doesn't happen).  This is thread starvation.

The following program tests the mechanism supported by your Java system.
Copy it into a single file -- say TestQueue.java:

  //{{{  TestQueue.java
  
  //{{{  class X_class {
  class X_class {
    
    int count = 0;

    //{{{  public synchronized void sync (int id) {
    public synchronized void sync (int id) {
      count++;
      System.out.println ("(" + System.currentTimeMillis() + ")" +
                          " Phil " + id + " : syncing ... " + count);
    }
    //}}}
  
    //{{{  public synchronized void hold (int n) {
    public synchronized void hold (int n) {
      int seconds = 1000;
      System.out.println ("(" + System.currentTimeMillis() + ")" +
                          " Holder : HOLDING for " + n + " seconds ... " + count);
      //{{{  wait for n seconds
      try {Thread.sleep (n*seconds);} catch (InterruptedException e) {}
      //}}}
    }
    //}}}
  
  }
  //}}}
  
  //{{{  class Phil extends Thread {
  class Phil extends Thread {
    //{{{  
    
    //{{{  parameters
    private int id;
    private X_class X;
    //}}}
    
    //{{{  constructor
    public Phil (int id, X_class X) {
      this.id = id;
      this.X = X;
      start ();
    }
    //}}}
    
    //{{{  run
    public void run () {
      int seconds = 1000;
      //{{{  wait for id seconds
      try {sleep (id*seconds);} catch (InterruptedException e) {}
      //}}}
      //{{{  say want to sync
      System.out.println ("(" + System.currentTimeMillis() + ")" +
                          " Phil " + id + " : want to sync ... ");
      //}}}
      X.sync (id);
    }
    //}}}
    
    //}}}
  }
  //}}}
  
  //{{{  class Holder extends Thread {
  class Holder extends Thread {
    //{{{  
    
    //{{{  parameters
    private X_class X;
    //}}}
    
    //{{{  constructor
    public Holder (X_class X) {
      this.X = X;
      start ();
    }
    //}}}
    
    //{{{  run
    public void run () {
      int hold_period = 15;
      //{{{  say want to hold
      System.out.println ("(" + System.currentTimeMillis() + ")" +
                          " Holder : want to hold ... ");
      //}}}
      X.hold (hold_period);
    }
    //}}}
    
    //}}}
  }
  //}}}
  
  //{{{  class College {
  class College {
    //{{{  
    
    //{{{  main
    public static void main (String argv[]) {
      //{{{  
      
      int n_philosophers = 10;
      
      X_class X = new X_class ();
      
      Holder holder = new Holder (X);
      
      Phil[] phil = new Phil[n_philosophers];
      
      for (int i = 0; i < n_philosophers; i++) {
        phil[i] = new Phil (i + 1, X);
      }
      
      //}}}
    }
    //}}}
    
    //}}}
  }
  //}}}
  
  //}}}

Compile and run it (e.g. javac TestQueue.java; java College).

Do you get the following output?

  //{{{  from JDK 1.1.3 on SPARC/Solaris
  ash% javac TestQueue.java
  ash% java College
  (870279910759) Holder : want to hold ...
  (870279910869) Holder : HOLDING for 15 seconds ... 0
  (870279911875) Phil 1 : want to sync ...
  (870279912876) Phil 2 : want to sync ...
  (870279913876) Phil 3 : want to sync ...
  (870279914876) Phil 4 : want to sync ...
  (870279915876) Phil 5 : want to sync ...
  (870279916876) Phil 6 : want to sync ...
  (870279917876) Phil 7 : want to sync ...
  (870279918877) Phil 8 : want to sync ...
  (870279919877) Phil 9 : want to sync ...
  (870279920877) Phil 10 : want to sync ...
  (870279925878) Phil 1 : syncing ... 1
  (870279925880) Phil 2 : syncing ... 2
  (870279925881) Phil 3 : syncing ... 3
  (870279925882) Phil 4 : syncing ... 4
  (870279925883) Phil 5 : syncing ... 5
  (870279925885) Phil 6 : syncing ... 6
  (870279925886) Phil 7 : syncing ... 7
  (870279925887) Phil 8 : syncing ... 8
  (870279925889) Phil 9 : syncing ... 9
  (870279925890) Phil 10 : syncing ... 10
  ash%
  //}}}

If so, all is well.  The numbers down the left are timestamps in milliseconds.
At second 0, Holder grabs the X monitor for 15 seconds.  At seconds 1 through
10, each of the philosophers arrive wanting to sync on X.  At second 15, they
all make it -- in the order in which they made their calls.

Now Java doesn't specify that synchronising calls are queued.  Thank goodness
that they implemented them that way in the latest JDK (at least on our
Solaris machine).

I *was* working on a SunOS machine that had an earlier JDK (1.0.?) installed.
This is the output it produced from the same program:

  //{{{  from JDK 1.0.? on SPARC/SunOS
  mint.ukc.ac.uk% javac TestQueue.java
  mint.ukc.ac.uk% java College
  (870261025501) Holder : want to hold ...
  (870261025601) Holder : HOLDING for 15 seconds ... 0
  (870261026611) Phil 1 : want to sync ...
  (870261027612) Phil 2 : want to sync ...
  (870261028612) Phil 3 : want to sync ...
  (870261029612) Phil 4 : want to sync ...
  (870261030612) Phil 5 : want to sync ...
  (870261031612) Phil 6 : want to sync ...
  (870261032613) Phil 7 : want to sync ...
  (870261033613) Phil 8 : want to sync ...
  (870261034613) Phil 9 : want to sync ...
  (870261035613) Phil 10 : want to sync ...
  (870261040616) Phil 10 : syncing ... 1
  (870261040620) Phil 9 : syncing ... 2
  (870261040623) Phil 8 : syncing ... 3
  (870261040626) Phil 7 : syncing ... 4
  (870261040629) Phil 6 : syncing ... 5
  (870261040631) Phil 5 : syncing ... 6
  (870261040634) Phil 4 : syncing ... 7
  (870261040637) Phil 3 : syncing ... 8
  (870261040640) Phil 2 : syncing ... 9
  (870261040643) Phil 1 : syncing ... 10
  mint.ukc.ac.uk%
  //}}}

where it is apparent that synchronising calls are stacked!  Someone mailed
me once to say that some implementors wanted to do that, but I thought that
was a bit far fetched.  The argument was that taking the *latest* call,
when a monitor got released, would lead to more efficient computation ...
because the latest caller would have the greatest chance of still having its
data in cache!

But that's such a silly scheme as there can be no guarantee that a thread
making a synchronised call will *ever* get serviced.  It's trivial to dream
up an example where a calling thread is infinitely overtaken by two later
calling threads.  It's like work stacked up on your desk.  The stuff on the
bottom, that came in ages ago, never gets done because it's always covered
up by later arriving work!  I know that leads to tears ...

Anyway, I'm curious as to what happens in PC/Mac/... environments, so would be
very grateful to be told.  Also, if anyone knows of any rationale from Sun
as to why they made this (very sensible) change in JDK 1.1.3 and whether
they are they going to mandate it, please say!  If the Java threads library
ever went back to stacking those calls, starvation would threaten everyone
-- it's not a problem peculiar to JavaPP.

Note that the current implementation of JavaPP doesn't care how the Java
`wait-sets' are managed (i.e. do threads `wait' in a queue or stack or ... ?).
It doesn't care because it never has more than one thread in any wait-set.

So ... I think we should guarantee that JavaPP processes queue up in a FIFO
(or, at least, a *weak* FIFO) when waiting to read or write on a shared
channel.  This looks pretty basic to me!

It ought to be basic for Java itself.  I'm told that a reason FIFOs may not
have been mandated is the difficulty of managing them across multiprocessors.
But, weak FIFOs (e.g. a queue of queues) can be so managed and cheaply --
for example, see the Distributed SHARED Channel in KRoC occam:

  http://www.hensa.ac.uk/parallel/occam/projects/occam-for-all/dsc/

Many thanks for your help,

Peter Welch.