Using MATLAB on Quest
Setup and reference instructions for running MATLAB on Quest.
When running MATLAB on Quest, you have the option to make explicit use of parallelization. However, even if you aren't explicitly parallelizing your code, you should be aware of MATLAB's use of multithreading. This is discussed first below. Then there are instructions for setting up MATLAB to run code in parallel. How you use MATLAB's parallelization will depend on whether you are running on a single node, or on multiple nodes.
Compute nodes on Quest have at least 20 cores, and nodes in some partitions (collections of compute nodes) have more than 20. See Quest Technical Specifications.
If you want to use more than 20 cores per node, you can request nodes in a specific partition of Quest with the Moab option:
#MSUB -l partition=quest6
MATLAB has built-in multithreading for some linear algebra and numerical functions. By default, MATLAB will try to use all of the cores on a machine to perform these computations. However, if a job you've submitted to Quest uses more cores than were requested, the job will be cancelled. To avoid this situation, you can start MATLAB with the -singleCompThread option to restrict a MATLAB process to a single core:
matlab -singleCompThread
This is the recommended way to run MATLAB on Quest.
However, if you want MATLAB to be able to use multiple cores for these calculations, then you can omit the -singleCompThread option when starting MATLAB and request an entire node for your job. Use the options below in your submission script or msub command:
#MSUB -l nodes=1:ppn=<numberofcores>
#MSUB -l partition=<partitionName>
For example:
#MSUB -l nodes=1:ppn=20
#MSUB -l partition=quest4
It is strongly recommended not to mix multithreading and explicit parallelization within MATLAB to avoid your process either running at reduced efficiency or being cancelled for exceeding allotted resources.
Note: it may be possible to use multithreading with less than a full node with the maxNumCompThreads option in your MATLAB code, but some libraries may not adhere to the limits set with this option, and it may not be respected by MATLAB in the future.
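The rule described above (use -singleCompThread unless the job owns the whole node) can be sketched as a small shell helper. The function name matlab_thread_flag and its two arguments are hypothetical and not part of MATLAB or Quest; the sketch assumes you know both how many cores you requested and how many cores the node has.

```shell
#!/bin/bash
# Hypothetical helper: print -singleCompThread unless the job owns the whole node.
# $1 = cores requested for the job (ppn), $2 = total cores on the node.
matlab_thread_flag() {
    local requested=$1 node_cores=$2
    if [ "$requested" -lt "$node_cores" ]; then
        echo "-singleCompThread"
    fi
}

# Example: build the command line for a 4-core request on a 20-core node.
# The command is only printed here, not executed.
echo "matlab $(matlab_thread_flag 4 20) -nosplash -nodesktop"
```

For a full-node request (for example 20 of 20 cores) the helper prints nothing, so MATLAB starts with multithreading enabled.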
Cross-node Parallel MATLAB Jobs on Quest
To run parallel jobs on Quest that use cores across multiple nodes, you need to create a parallel profile.
Log in to Quest with X-forwarding enabled (use the -X option when connecting via ssh or connect with FastX to have X-forwarding enabled by default).
Launch MATLAB on the login node you land on from your home directory:
module load matlab/r2016a
matlab
When MATLAB opens, select the Parallel menu button:
This opens the Cluster Profile Manager. If a menu appears instead, choose the option that opens the Cluster Profile Manager.
Create a Validation Profile
First, we'll make a test profile to validate our settings.
Click on the Add button in the upper left, then choose Custom and then Torque.
If you get a message about needing the Distributed Computing Toolbox, that's ok. It's available on Quest.
This will create a new profile called TorqueProfile1 which will show up in the Cluster Profile list on the left of the Cluster Profile Manager. Double click on the name to change it. Call this profile something like "multinode-quest-validate."
Then, with this new profile selected from the Cluster Profile list, click the Edit button in the bottom right of the Cluster Profile Manager.
Edit several of the fields:
In the top option block, set "Number of workers available to cluster Numworkers" to 4. This is the number of cores/workers that we want to use. We'll run our test using 4 cores.
In the top option block, set "Additional command line arguments for job submission SubmitArguments" to
-A <allocationID> -q <queueName>
where <allocationID> is the name of your allocation and <queueName> is the name of the queue. The example in the image below uses
-A a9009 -q short
but you need to use your own allocation ID instead. If you want to set other options, such as joining the output and error files or setting email notification preferences, you can do that here as well. For example:
-A a9009 -q short -j oe -m abe -M firstname.lastname@example.org
Scroll down to the bottom option block Additional Torque Properties. Set "Resource list parameter. Use ^N^ for number of tasks in a parallel job. ResourceTemplate" to
These are our job parameters.
- nodes=2:ppn=2 says to use 2 nodes and 2 cores on each of those nodes. This totals to the 4 cores we specified we wanted to use above.
- walltime=00:10:00 specifies a 10 minute job.
- gres=mdcsw:4 reserves 4 matlab licenses (matching the 4 cores/workers we're using) before the job starts. This makes sure that the licenses will be available once the job runs.
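The arithmetic behind these parameters can be sketched in shell. The helper resource_template is hypothetical, and the comma-separated "-l" form it prints is an assumption about the resource list syntax; use whatever format your profile actually expects. What it illustrates is that the license count is derived from nodes times cores per node.

```shell
#!/bin/bash
# Hypothetical helper: assemble a resource list from nodes, cores per node, and walltime.
# The comma-separated "-l" form is an assumption; match the format your profile uses.
resource_template() {
    local nodes=$1 ppn=$2 walltime=$3
    local workers=$(( nodes * ppn ))   # total workers = nodes x cores per node
    echo "-l nodes=${nodes}:ppn=${ppn},walltime=${walltime},gres=mdcsw:${workers}"
}

# Example: the validation settings described above (2 nodes, 2 cores each, 10 minutes).
resource_template 2 2 00:10:00
```

Deriving the mdcsw value from the node and core counts, rather than typing it separately, keeps the license request in step with the worker count.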
In the bottom option block, set "Remote shell command to call on UNIX when running communicating jobs RshCommand" to ssh.
Set "Remote copy command for non-shared file systems RcpCommand" to scp.
Click on Done in the bottom right. Then click the Validate button at the top of the window (with a green check mark) to start the validation tests. All of them should pass. If something fails, then recheck that you entered the above parameters correctly. If the parameters are correct (in particular, that you entered your allocation name and not the example allocation above), but the validation still hasn't passed, contact email@example.com for assistance.
Great. Everything works. Now we'll make a real profile that you'll use to run your job.
Create a Job Profile
Right click on the name of the validation profile you just made (multinode-quest-validate) in the Cluster Profile list on the left, and choose duplicate from the menu. This will make a copy of the profile we just created. Rename this profile by double clicking on the name. Choose a name that's descriptive of the parameters for your job, perhaps something like multinode-quest-30cores.
With the new profile selected, click on the Edit button in the bottom right corner of the window. We're going to change some of the parameters we set in the validation profile to correspond to the values you need for your actual job.
In the top option block, set "Number of workers available to cluster Numworkers" to the total number of cores/workers you want to use.
In the top option block, change the -q option in "Additional command line arguments for job submission SubmitArguments" to the correct queue name, which depends on the walltime or special project. In the validation profile, we set the queue to short because the job was shorter than 4 hours (see Quest Queues [http://www.it.northwestern.edu/research/user-services/quest/queues.html] for more information). You can also change the allocation number here if needed.
Scroll down to the bottom option block Additional Torque Properties. Change "Resource list parameter. Use ^N^ for number of tasks in a parallel job. ResourceTemplate" to correspond to your job. If you use the format we used before:
then the number of nodes (2 here) times the number of cores per node ppn (2 here) should equal the "Number of workers available to cluster Numworkers" you set in the options above and the value you set for mdcsw (which is 4 in this example). mdcsw should always equal the "Number of workers available to cluster Numworkers." Adjust the walltime parameter as appropriate for the maximum length of your job.
If you know how many cores you want to use, but you don't care how they are distributed across nodes, then you can use a command like:
which would request 30 cores spread across whatever nodes are necessary. We also set mdcsw equal to 30 to get the appropriate number of licenses, and you'd set "Number of workers available to cluster Numworkers" above to 30 as well. Adjust the walltime too.
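Because the mdcsw value must always equal the number of workers, a quick check before saving the profile can catch mismatches. The sketch below is hypothetical: the function licenses_match_workers is not part of MATLAB or Quest, and it assumes the resource string contains a gres=mdcsw:<n> entry.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the mdcsw license count in a resource
# string equals the worker count. Assumes the string contains "gres=mdcsw:<n>".
licenses_match_workers() {
    local template=$1 workers=$2
    local licenses
    licenses=$(echo "$template" | sed -n 's/.*gres=mdcsw:\([0-9][0-9]*\).*/\1/p')
    [ -n "$licenses" ] && [ "$licenses" -eq "$workers" ]
}

# Example with an illustrative resource string and 30 workers.
if licenses_match_workers "walltime=04:00:00,gres=mdcsw:30" 30; then
    echo "license count matches worker count"
else
    echo "mismatch: fix mdcsw or the worker count"
fi
```

The check fails both when the numbers differ and when the string has no mdcsw entry at all.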
When you're finished editing the parameters, click Done. The profile is now ready to use in your job script. You can exit MATLAB if you're done using it.
Submitting your Batch Job
Your MATLAB script file (myscript.m in this example) should include the following command:
parpool('profile-name-here')
where profile-name-here is the name of the parallel profile you created with your actual job parameters in it (the second one we made; in the example above, it was multinode-quest-30cores); the name of the profile is surrounded by single quotes. By default, MATLAB will then use the maximum number of cores you specified in the profile.
Submission Script
You will use MATLAB to submit your job to the cluster instead of using msub. Make a shell script (mymatlabjob.sh) like the following:
#!/bin/bash
module load matlab/r2016a
matlab -nosplash -nodesktop -singleCompThread -r <myscript> > <log> &
exit
The options given above to MATLAB are needed to run the batch job correctly.
- -singleCompThread tells MATLAB not to try to use more processors on the node than have been allocated to the job. Without this option, your job may get killed for using more resources than requested.
- <myscript> is the name of a MATLAB script .m file. Omit the .m file extension.
- <log> is the name of the log file you want to direct output to.
- & sends the MATLAB job (which is running on a login node) to the background, and exit then ends the script so you get the command prompt back.
Make this script executable (you only need to do this once per file):
chmod u+x mymatlabjob.sh
You execute this script as:
./mymatlabjob.sh
from a Quest login node (where you land when you log into Quest). This is a shell script you're executing to run MATLAB directly. Then MATLAB will request resources from Quest. You are not executing this script with Moab (msub).
When you submit a job this way, MATLAB may write a directory with a name like Job1 and several related files in the working directory. You can delete these after the job is done, but you should leave them in place while the job is running. Because of this behavior, you may want to submit your job from a directory where you're OK having these extra files written, and update paths in your script to point to the other locations where you keep your actual script files, so that everything stays separate.
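One way to keep these extra files separate is to launch from a dedicated run directory and tell MATLAB where the script lives. The sketch below is only an illustration: the helper names make_run_dir and matlab_r_arg, the directory names, and the paths are all hypothetical. addpath is a standard MATLAB function, but whether this layout suits your job is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: create a directory to hold MATLAB's Job1-style files
# and print its path. The base location and directory name are assumptions.
make_run_dir() {
    local base=$1 name=$2
    local dir="${base}/${name}"
    mkdir -p "$dir" && echo "$dir"
}

# Hypothetical helper: build the argument for "matlab -r" so that a script
# stored in another directory is found. addpath is a standard MATLAB function.
matlab_r_arg() {
    local script_dir=$1 script=$2
    echo "addpath('${script_dir}'); ${script}"
}

# Example with placeholder paths; nothing is launched here.
run_dir=$(make_run_dir "${TMPDIR:-/tmp}" "matlab-run-example")
echo "launch from: ${run_dir}"
echo "matlab -r \"$(matlab_r_arg "$HOME/matlab-code" myscript)\""
```

The printed command shows the shape of the -r argument; the other MATLAB options from the submission script would still be needed.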
Single Node Parallel MATLAB Jobs on Quest
If you want to use MATLAB's parallelization capabilities with a small number of cores, such that the job will fit on a single node (see limits above), then you do not need to create a parallel profile as we did above. You can use the default "local" profile.
In your MATLAB script (<matlabscript.m> in the example below), use
parpool('local', N)
where N is the number of cores you want to use.
Then create a Moab submission script for your job. There are more details at Submitting a Job on Quest, but a sample script myjob.sh looks like:
#!/bin/bash
#MSUB -A <allocationID>
#MSUB -l nodes=1:ppn=<N>
#MSUB -l walltime=<hh:mm:ss>
#MSUB -q <queueName>
#MSUB -N <jobName>
#MSUB -m abe
#MSUB -M <email>
#MSUB -j oe

## set your working directory
cd $PBS_O_WORKDIR

## job commands; <matlabscript> is your MATLAB .m file, specified without the .m extension
module load matlab/r2016a
matlab -nosplash -nodesktop -singleCompThread -r <matlabscript>
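If you submit many single-node jobs, the placeholders in the sample script can be filled in from a few parameters. The generator below is a hypothetical sketch: the function name moab_script and its argument order are not part of Moab or Quest, and it emits only a subset of the options from the sample script. The example values are placeholders.

```shell
#!/bin/bash
# Hypothetical generator: print a Moab script for a single-node MATLAB job.
# Arguments: allocation, queue, cores (N), walltime, MATLAB script name without .m
moab_script() {
    local alloc=$1 queue=$2 cores=$3 walltime=$4 mscript=$5
    cat <<EOF
#!/bin/bash
#MSUB -A ${alloc}
#MSUB -q ${queue}
#MSUB -l nodes=1:ppn=${cores}
#MSUB -l walltime=${walltime}
#MSUB -j oe
cd \$PBS_O_WORKDIR
module load matlab/r2016a
matlab -nosplash -nodesktop -singleCompThread -r ${mscript}
EOF
}

# Example with placeholder values; redirect the output to a file such as myjob.sh.
moab_script a9009 short 4 01:00:00 matlabscript
```

The ppn value passed here should match the N used in the MATLAB script, so that the pool does not ask for more cores than the job requested.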
Submit this script as a batch job using msub:
msub myjob.sh