Using the Parallel Computing Toolbox on the cluster is likely to cause you more trouble than it is worth. It is particularly prone to odd task failures (which Mathworks has been unable to track down for us) and has no facility for even detecting such failures, much less recovering from them.

For alternative approaches, including hand-holding and algorithm and numerical expertise, please contact Professor Alan Edelman (edelman AT mit.edu) to participate in our open source project, with performance and ease of use guaranteed to please.

The toolbox does have some interesting options for local use on modern multicore systems, though.

Distributed Computing with MATLAB

Mathworks introduced a new feature as of Release 14, service pack 3, to the MATLAB product line: the Distributed Computing Toolbox and Distributed Computing Engine. Together, these products allow users to run MATLAB code across multiple parallel hosts.

Matlab DCT and the CSAIL Cluster Quickstart

I really should be more verbose later, but something is better than nothing, right?

First you will need to be on a cluster system as described in CondorIntro. The Distributed Computing Toolbox (DCT) handles the job submissions, so you don't need to worry about those details. You do need to have your working directory set to an NFS path, for example /data/scratch/$USERNAME.

Before you use DCT for the first time, you will need to configure some MATLAB preferences through the GUI. This is a bit silly, but as of 2008a there is no longer a way for us to define defaults for this, so it must be done at the user end. From the menus select Parallel -> Manage Configurations, then double-click the "generic" configuration. You will need to set four values:

  1. ClusterMatlabRoot /afs/csail.mit.edu/system/amd64_linux26/matlab/latest/
  2. SubmitFcn @submitfcn
  3. ClusterOsType unix
  4. HasSharedFilesystem true

You may optionally set DataLocation as well. If you do, it must be on NFS so that the workers can write there (/data/scratch/ is a good place, or your group's dedicated NFS space if it has any).
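For reference, the same values can probably also be set on a scheduler object from code for the current session. The following is only a sketch under assumptions: it uses the findResource API of that MATLAB era, assumes the site-provided submitfcn is on the MATLAB path, and uses a hypothetical DataLocation directory built from the USER environment variable. It has not been verified on the CSAIL cluster and does not replace the GUI step above.

```matlab
% Sketch only: set the generic-scheduler properties programmatically.
% Assumes the R2008a-era Distributed Computing Toolbox API.
sched = findResource('scheduler', 'type', 'generic');

% The four required values listed above.
set(sched, 'ClusterMatlabRoot', '/afs/csail.mit.edu/system/amd64_linux26/matlab/latest/');
set(sched, 'SubmitFcn', @submitfcn);      % site-provided submit function
set(sched, 'ClusterOsType', 'unix');
set(sched, 'HasSharedFilesystem', true);

% Optional: DataLocation must be on NFS so the workers can write there.
% The directory name below is hypothetical and must already exist.
set(sched, 'DataLocation', fullfile('/data/scratch', getenv('USER'), 'dct'));
```

Settings made this way apply only to the sched object in the current MATLAB session.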

screen shot of DCT config window

For Condor submission we use the "generic" scheduler, so here's a quick and dirty M-file that calls system('hostname') on a number of systems:

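% Find the "generic" scheduler and apply the configuration set up above.
jm = findResource('scheduler','configuration','generic');
set(jm,'configuration','generic');
job = createJob(jm);
% Five tasks, each running system('hostname') with 2 outputs (status, output).
for i=1:5
    createTask(job, @system, 2, {'hostname'});
end
submit(job);
waitForState(job);   % block until the job finishes
results = getAllOutputArguments(job);
results{:}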

My Matlab-Fu is weak; hopefully you can deduce how to wrap more interesting calculations in there. Or, even better, someone doing real work with it might give a better overview...

-- JonProulx - 18 Dec 2008
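As one possible way to wrap a more interesting calculation into the hostname example above, the following sketch gives each task its own input and collects numeric results. It assumes the same "generic" configuration and the same era of the API used on this page. The choice of @sum and the variable names are purely illustrative, and the sketch has not been run on the CSAIL cluster.

```matlab
% Sketch only: one task per input, each returning a single numeric output.
jm = findResource('scheduler', 'configuration', 'generic');
set(jm, 'configuration', 'generic');
job = createJob(jm);

for n = 1:5
    % Task n computes sum(1:n); the 1 is the number of outputs requested.
    createTask(job, @sum, 1, {1:n});
end

submit(job);
waitForState(job);                      % block until the job finishes

% One row per task, one column per requested output.
results = getAllOutputArguments(job);
totals = cell2mat(results);             % valid only if every task returned a value
disp(totals)
```

If every task succeeds, totals should be the column vector [1; 3; 6; 10; 15]. Given the task failures described at the top of this page, it is worth checking results for empty cells before relying on totals.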

NOTE: The attached Mathworks document describes use of the "parfor" command. This works wonderfully on local machines, but currently NOT on the CSAIL cluster. -HGM 20 Jan 2011

Your MATLABPATH will not be transferred to the clients automatically! In your MATLAB code, before submitting the job, use:
set(job,'PathDependencies',{'$DIR1','$DIR2',...,'$DIRN'});

-- mmt@csail.mit.edu - 24 Jan 2009

Note On Exceptions

Each job you create will generate a directory named JobN/ (N increments as you create more jobs). In that directory you will find files named TaskM.{out.mat,err,log,out} (one set of files per task, M being the task number).

If an error in your code triggers a Matlab Exception rather than a fatal error, you will not see any information in TaskM.err or TaskM.log. Instead, it will be stored in TaskM.out.mat. Use Matlab's load to view the exception.

-- mmt@csail.mit.edu - 24 Jan 2009
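A minimal sketch of inspecting such a file from the MATLAB prompt follows. The job and task numbers (Job1, Task1) are hypothetical, and the names of the variables stored in the file are not documented on this page, so the sketch lists them rather than assuming any.

```matlab
% Sketch only: look inside a task's output file for a stored exception.
% Job1 and Task1 are hypothetical; substitute your own JobN and TaskM.
outFile = fullfile('Job1', 'Task1.out.mat');

if exist(outFile, 'file')
    whos('-file', outFile)      % list the variables stored in the file
    s = load(outFile);          % load into a struct, not the workspace
    disp(fieldnames(s))         % inspect these fields for the error details
else
    fprintf('%s not found; run this from the directory that contains JobN/.\n', outFile);
end
```

Loading into a struct keeps the stored variables from overwriting anything in your current workspace.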

Topic attachments
Attachment | Size | Date | Who | Comment
MIT-CSAIL-Parallel.pdf | 1073.2 K | 15 Jul 2009 - 15:57 | JonProulx | Slides (only) from Mathworks talk on Parallel Computing Toolkit
MathworksParallelPresentation.tgz | 812.8 K | 15 Jul 2009 - 15:56 | JonProulx | Examples and Slides from Mathworks talk on Parallel Computing Toolkit
Topic revision: 25 Aug 2017, JasonDorfman
 
