We examine the issues in building a multi-paradigm runtime that supports three execution styles: dynamic threading, MPI, and coarse-grain functional parallelism. Microsoft CCR supports efficient thread management for handlers (continuations) spawned in response to messages posted to ports. Microsoft DSS is a lightweight, service-oriented application model that sits on top of CCR for creating coarse-grain applications in a distributed environment. The MPI benchmarks include MPICH2 and Nemesis, mpiJava, and MPJE. Tests are implemented in C#, C++, and Java and run on multicore systems under a variety of operating systems (Windows XP, Windows Vista, Red Hat, and Fedora).



  Cache Line Interference
  • Implementations of our clustering algorithm showed large performance fluctuations due to cache line interference (false sharing)
  • One thread runs on each core; each thread calculates a sum of the same complexity and stores its result in a common array A, with different cores using different array locations
  • Thread i stores its sum in A(i): this is separation 1, with no memory access interference but with cache line interference
  • Thread i stores its sum in A(X*i): this is separation X
  • Serious degradation if X < 8 (64 bytes) with Windows
    • Note that A is an array of doubles (8 bytes each)
    • Less interference effect with Linux, especially Red Hat
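
The separation arithmetic above can be illustrated with a small sketch. It only computes which threads would write into the same cache line; it does not measure performance. The 64-byte line size and 8-byte doubles come from the bullets above, while the function names and the assumption that A starts on a cache line boundary are hypothetical choices made for illustration.

```python
CACHE_LINE_BYTES = 64  # line size quoted above (X < 8 doubles = 64 bytes)
DOUBLE_BYTES = 8       # A holds doubles, 8 bytes each


def cache_line_of(thread_index, separation):
    """Cache line number holding A(separation * thread_index).

    Assumes (for illustration) that A starts on a cache line boundary.
    """
    byte_offset = separation * thread_index * DOUBLE_BYTES
    return byte_offset // CACHE_LINE_BYTES


def threads_sharing_lines(num_threads, separation):
    """Return groups of thread indices whose result slots share a cache line."""
    lines = {}
    for i in range(num_threads):
        lines.setdefault(cache_line_of(i, separation), []).append(i)
    # Only groups with more than one thread can suffer false sharing.
    return [group for group in lines.values() if len(group) > 1]


if __name__ == "__main__":
    for x in (1, 4, 8):
        print("separation", x, "->", threads_sharing_lines(8, x))
```

Under these assumptions, with 8 threads, separation 1 places every result slot in one cache line, separation 4 pairs the threads up two per line, and separation 8 gives each thread its own line, which matches the X < 8 threshold noted above.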


Runtime System Used

  • micro-parallelism
    • Microsoft CCR (Concurrency and Coordination Runtime)
      • supports both MPI rendezvous and dynamic (spawned) threading style of parallelism
      • has fewer primitives than MPI but can implement MPI collectives with low latency threads
      • http://msdn.microsoft.com/robotics/
    • MPI.Net
  • macro-parallelism (inter-service communication)
    • Microsoft DSS (Decentralized Software Services), built on top of CCR, for the service model
    • Mash up
    • Workflow (Grid)
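
The dynamic (spawned) threading style described for CCR, where a handler runs in response to each message posted to a port, can be sketched as a toy analogue. This is not CCR's API: the `Port` class, `partial_sum` handler, and `run_demo` function are hypothetical names, and Python's standard thread pool stands in for the runtime's thread management.

```python
from concurrent.futures import ThreadPoolExecutor


class Port:
    """Hypothetical stand-in for a message port (not the CCR Port type).

    Every posted message spawns the registered handler on a thread pool.
    """

    def __init__(self, executor, handler):
        self._executor = executor
        self._handler = handler
        self._futures = []

    def post(self, message):
        # Spawn one handler invocation per message.
        self._futures.append(self._executor.submit(self._handler, message))

    def results(self):
        # Wait for all spawned handlers; acts as a simple synchronization point.
        return [future.result() for future in self._futures]


def partial_sum(chunk):
    """Handler: compute the sum of one chunk of the data."""
    return sum(chunk)


def run_demo():
    data = list(range(100))
    chunks = [data[i:i + 25] for i in range(0, len(data), 25)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        port = Port(pool, partial_sum)
        for chunk in chunks:
            port.post(chunk)
        return sum(port.results())


if __name__ == "__main__":
    print(run_demo())
```

Waiting on all handler results before combining them loosely mirrors the rendezvous style, while posting messages without waiting corresponds to the spawned style; a real runtime would offer richer coordination primitives than this sketch.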