Running a simple parallel code on one node using the toolbox
Example 1 (func_local): In this example, a simple code uses multiple workers on a single node to compute a sum in parallel. Here is the sample code:
parallel_sum.m:

    function s = parallel_sum(N)
        s = 0;
        % s is a parfor reduction variable; each worker accumulates a partial sum
        parfor i = 1:N
            s = s + i;
        end
        fprintf('Sum of numbers from 1 to %d is %d.\n', N, s);
    end
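The result of parallel_sum can be checked independently of MATLAB with the closed form N(N+1)/2. The following is a minimal bash sketch of that check; the variable names N and EXPECTED are illustrative and not part of the sample scripts.

```shell
#!/bin/bash
# Closed-form sum of the integers 1..N, for comparison with the job output.
N=100
EXPECTED=$((N * (N + 1) / 2))

echo "Sum of numbers from 1 to $N is $EXPECTED."
```

For N=100, as used in the job scripts on this page, the expected sum is 5050.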
This function is executed by MATLAB code that reads in a few parameters from your user environment. For single-node parallel jobs, this example uses the “local” cluster profile that is available by default when using the toolbox.
Do not run the local toolbox profile on login nodes; excessive CPU and/or memory use will result in your script being terminated. Instead, use a PBS batch job as in the following example to run a single-node cluster on a compute node. This PBS job also specifies the number of workers, which corresponds to the number of CPUs requested in the job script.
submit_local.pbs:

    #!/bin/bash
    #PBS -N matlab_pct
    #PBS -A <PROJECT>
    #PBS -l walltime=05:00
    #PBS -q casper
    #PBS -j oe
    #PBS -o local.log
    #PBS -l select=1:ncpus=4:mpiprocs=4:mem=10GB

    # The following script is not distributed; it uses threads
    # and so this PBS job should only ever request a single node

    module load matlab

    # Derive the number of workers to use in the toolbox run script
    export NUMWORKERS=$(wc -l $PBS_NODEFILE | cut -d' ' -f1)

    SECONDS=0

    matlab -nodesktop -nosplash << EOF
    % Start local cluster and submit job with custom number of workers
    c = parcluster('local')
    j = c.batch(@parallel_sum, 1, {100}, 'pool', $((NUMWORKERS - 1)));

    % Wait for the job to finish, then get output
    wait(j);
    diary(j);
    exit;
    EOF

    echo "Time elapsed = $SECONDS s"
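The worker arithmetic in submit_local.pbs can be tried outside MATLAB. The following is a minimal bash sketch, assuming PBS writes one line per requested MPI slot to $PBS_NODEFILE (four lines for mpiprocs=4). The mock node file, the host name casper-node, and the variable POOLSIZE are hypothetical stand-ins so the snippet also runs outside a job. One is subtracted because a batch job with a pool uses one worker to run the function itself in addition to the pool workers.

```shell
#!/bin/bash
# Assumption: PBS lists one line per requested MPI slot in $PBS_NODEFILE.
# Outside a job the variable is unset, so build a mock 4-line file
# (the host name "casper-node" is a hypothetical placeholder).
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    for slot in 1 2 3 4; do
        echo "casper-node"
    done > "$PBS_NODEFILE"
fi

# Count the slots; the arithmetic expansion strips any padding from wc.
NUMWORKERS=$(( $(wc -l < "$PBS_NODEFILE") ))

# One worker runs the batch function itself, so the pool gets the rest.
POOLSIZE=$((NUMWORKERS - 1))

echo "slots=$NUMWORKERS pool=$POOLSIZE"
```

With the mock file this prints slots=4 pool=3, so four requested CPUs yield a pool of three workers plus the worker that runs the function.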
MPS cluster profiles
When using the Parallel Computing Toolbox (PCT), you are expected to create and use cluster profiles that manage either node-local tasks or batch-scheduler tasks. While there is a preconfigured profile for single-node use (local), you will need to do some setup before you can use the MATLAB Parallel Server (MPS). CISL provides a distributed cluster profile for PBS on both Casper and Cheyenne for all versions of MATLAB starting with R2020a.
You can import an existing cluster profile using the wizard in the graphical interface, or you can do it programmatically as follows. The sample distributed script in the following section already includes the MPS cluster profile setup, so if you use that script you can skip the commands in this section.
At the MATLAB command line, enter the following line to import the MPS profile:
ncar_mps = parallel.importProfile('/glade/u/apps/opt/matlab/parallel/ncar_mps.mlsettings');
You need to import the profile only once; MATLAB will remember it in future sessions. If you anticipate using the parallel server profile frequently, you may want to make it your default parallel profile as shown here:
parallel.defaultClusterProfile('ncar_mps');
Using the MATLAB parallel server (MPS) to span multiple nodes
The configuration above will limit your job to the number of CPUs on a single node; on Casper and Cheyenne this means 36 workers, or 72 if you use hyperthreads. However, you can use the parallel server to span multiple nodes. When using MPS, MATLAB itself will submit a job to the batch scheduler and use an internal MPI library to enable communication between remote workers.
Example 2 (func_mps): As in Example 1, MATLAB code sets up the parallel cluster. In this example, the MATLAB code is embedded in a bash driver script:
submit_server.sh:

    #!/bin/bash

    # This script doesn't need to run on a batch node... we can simply submit
    # the parallel job by running this script on the login node

    module rm ncarenv
    module load matlab

    mkdir -p output

    # Job parameters
    MPSNODES=2
    MPSTASKS=4
    MPSACCOUNT=<PROJECT>
    MPSQUEUE=casper@casper-pbs
    MPSWALLTIME=300

    SECONDS=0

    matlab -nodesktop -nosplash << EOF
    % Add cluster profile if not already present
    if ~any(strcmp(parallel.clusterProfiles, 'ncar_mps'))
        ncar_mps = parallel.importProfile('/glade/u/apps/opt/matlab/parallel/ncar_mps.mlsettings');
    end

    % Start PBS cluster and submit job with custom number of workers
    c = parcluster('ncar_mps');

    % Matlab workers will equal nodes * tasks-per-node - 1
    jNodes = '$MPSNODES';
    jTasks = '$MPSTASKS';
    jWorkers = str2num(jNodes) * str2num(jTasks) - 1;

    c.ClusterMatlabRoot = getenv('NCAR_ROOT_MATLAB');
    c.ResourceTemplate = append('-l select=', jNodes, ':ncpus=', jTasks, ':mpiprocs=', jTasks);
    c.SubmitArguments = append('-A $MPSACCOUNT -q $MPSQUEUE -l walltime=$MPSWALLTIME');
    c.JobStorageLocation = append(getenv('PWD'), '/output');

    % Output cluster settings
    c

    % Submit job to batch scheduler (PBS)
    j = batch(c, @parallel_sum, 1, {100}, 'pool', jWorkers);

    % Wait for job to finish and get output
    wait(j);
    diary(j);
    exit;
    EOF

    echo "Time elapsed = $SECONDS s"
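Before submitting, it can help to preview what the job parameters in submit_server.sh expand to. The following is a minimal bash sketch that mirrors the script's arithmetic and string assembly in plain shell. The variable names MPSWORKERS and RESOURCES are illustrative and not part of the original script, which builds the same values inside MATLAB.

```shell
#!/bin/bash
# Job parameters, matching the values used in submit_server.sh.
MPSNODES=2
MPSTASKS=4

# Pool size: every requested slot except the one that runs the batch function.
MPSWORKERS=$((MPSNODES * MPSTASKS - 1))

# The PBS resource request that the script assigns to c.ResourceTemplate.
RESOURCES="-l select=${MPSNODES}:ncpus=${MPSTASKS}:mpiprocs=${MPSTASKS}"

echo "pool workers: $MPSWORKERS"
echo "resource request: $RESOURCES"
```

With two nodes and four tasks per node, this reports seven pool workers and the request -l select=2:ncpus=4:mpiprocs=4. Changing MPSNODES or MPSTASKS shows how the pool size scales before any job reaches the scheduler.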
Sample PCT scripts
Including the scripts shown above, there are four sets of example scripts that you can use, modify, and extend to fit your purposes. All four examples can be copied from /glade/u/apps/opt/matlab/parallel/examples.
- func_local - Run a specified function using a pool of local (single-node) workers. This is Example 1 above.
- func_mps - Run a specified function using a pool of workers distributed across multiple nodes. This is Example 2 above.
- multi_script_mps - Run a specified collection of MATLAB functions in separate script files using a pool of workers. Configured for MPS use but can be modified to use the local profile.
- spmd_mps - Run a single MATLAB function in a specified script using many input parameters.
The last two examples are functionally similar to a command-file PBS job, but with the licensing benefits of using MPS.