Cheyenne and Casper users can run their own Singularity containers on those systems as described on this page. This documentation details which modules to load to support the container, which commands to run, and how to perform a simple test in the new Singularity container environment.
After completing a successful test, users should be able to use their Singularity container on the host system. Performance has not been tested.
Before you start, do the following to prepare:
Step-by-step instructions
Load the CISL Singularity module as shown.
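A minimal sketch of this step, assuming the module is named `singularity` (that name is an assumption; `module avail` lists what your system actually provides). It loads the module only where the `module` command exists, then reports whether the `singularity` executable is on the search path.

```shell
# Assumption: the CISL module is named "singularity".
module_name=singularity

# "module" is a shell function supplied by the site's environment-modules
# setup, so it is absent outside the HPC system.
if command -v module >/dev/null 2>&1; then
    module load "$module_name"
fi

# Report whether the executable is now on PATH.
if command -v singularity >/dev/null 2>&1; then
    singularity_status=available
else
    singularity_status=missing
fi
echo "singularity: $singularity_status"
```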
Create a sandbox. (Use your own container name and .sif file.) The following example generates a sandbox under a directory named container_name.
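As a sketch of the sandbox build, assuming an image file named `container_name.sif` in the current directory (both names are placeholders), the helper below checks its prerequisites before calling `singularity build --sandbox`. The function name `build_sandbox` is hypothetical.

```shell
# Placeholder names; substitute your own image and sandbox directory.
image=container_name.sif
sandbox=container_name

# build_sandbox IMAGE DIR: unpack IMAGE into a writable directory tree.
build_sandbox() {
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found; load the module first" >&2
        return 1
    fi
    if [ ! -f "$1" ]; then
        echo "image file not found: $1" >&2
        return 1
    fi
    singularity build --sandbox "$2" "$1"
}

build_sandbox "$image" "$sandbox" || echo "sandbox was not created"
```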
If you need to delete a sandbox directory, you will first need to set the proper permissions by executing the find command, then run the rm command.
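The two-stage removal can be sketched as follows. The permission mode (`u+rwx` on directories) is an assumption about what "proper permissions" means here, and `remove_sandbox` is a hypothetical helper name. The demonstration works on a throwaway tree that mimics a read-only sandbox rather than on a real container.

```shell
# remove_sandbox DIR: make every directory under DIR writable by its
# owner, then delete the whole tree.
remove_sandbox() {
    [ -d "$1" ] || { echo "not a directory: $1" >&2; return 1; }
    # Images often ship read-only directories, and rm cannot unlink
    # entries inside a directory it is not allowed to write to.
    find "$1" -type d -exec chmod u+rwx {} \;
    rm -rf "$1"
}

# Demonstration on a temporary tree with write permission stripped.
demo=$(mktemp -d)
mkdir -p "$demo/container_name/usr/lib"
touch "$demo/container_name/usr/lib/libexample.so"
chmod -R a-w "$demo/container_name"
remove_sandbox "$demo/container_name"
rmdir "$demo"
```

Running `chmod` through `find ... -exec ... \;` changes each directory before `find` descends into it, so nested read-only directories are handled from the top down.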
Load a shell in the container.
The default shell is bash. To invoke a different shell – tcsh, for example – specify it as follows:
In the container shell, compile your Hello, World code to run as a test.
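If you do not have a test program at hand, the sketch below writes a small MPI Hello, World source file and compiles it. It assumes the test is written in C and that the container provides the `mpicc` compiler wrapper; the file name `hello.c` and executable name `hello` are placeholders.

```shell
# Assumption: a C test program and an mpicc wrapper inside the container.
cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, World from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile only where an MPI compiler wrapper is present.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o hello hello.c
else
    echo "mpicc not found; run this inside the container shell" >&2
fi
```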
Run the executable in the container. The following example command will run four tasks on the same node.
Exit the container after the job runs.
For the next step, you will launch and run an MPI command on the host system, outside of the container. Before doing so, make sure the host and the container use the same MPI version. Mismatched MPI libraries or library versions will likely result in a fatal error reported as:
In contrast, a successful launch and run will produce the same output as when the executable ran in the container. Here is the MPI command for the next step:
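A sketch of such a host-side launch, under several assumptions: the sandbox directory is `container_name`, the test executable is `./hello`, and both the host and container `mpirun` accept `--version` (Open MPI and MPICH do; other MPI implementations may differ). It prints both MPI versions first so that a mismatch is visible before the launch.

```shell
# Placeholders: sandbox directory and test executable from earlier steps.
sandbox=container_name
exe=./hello
ntasks=4

if command -v mpirun >/dev/null 2>&1 \
        && command -v singularity >/dev/null 2>&1 \
        && [ -d "$sandbox" ]; then
    # Print both MPI versions so a mismatch is visible before launching.
    echo "host MPI:"
    mpirun --version
    echo "container MPI:"
    singularity exec "$sandbox" mpirun --version
    # The host mpirun starts the tasks; each task runs inside the container.
    mpirun -np "$ntasks" singularity exec "$sandbox" "$exe"
    launch_status=$?
else
    echo "skipped: need mpirun, singularity and the $sandbox directory" >&2
    launch_status=skipped
fi
```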
When you have confirmed that the output is the same as before, submit an interactive job to run on multiple nodes as in this example, using four nodes with two tasks on each node.
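The resource request for that job can be sketched as below, assuming PBS Pro `select` syntax. The queue name, project code, and walltime are hypothetical placeholders. Four nodes with two tasks each give eight MPI tasks in total.

```shell
# Hypothetical values: queue, project code and walltime are placeholders.
nodes=4
tasks_per_node=2
queue=regular
project=PROJECT_CODE
walltime=00:30:00

total_tasks=$((nodes * tasks_per_node))
select="select=${nodes}:ncpus=${tasks_per_node}:mpiprocs=${tasks_per_node}"
echo "requesting $total_tasks MPI tasks with -l $select"

# qsub -I opens an interactive session, so submit only from a terminal
# on a system that has PBS.
if command -v qsub >/dev/null 2>&1 && [ -t 0 ]; then
    qsub -I -l "$select" -l "walltime=$walltime" -q "$queue" -A "$project"
else
    echo "qsub not run (no PBS command or no terminal)" >&2
fi
```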
When the interactive job starts, confirm the Singularity executable is still available by using the which command.
Then execute the following command at the new prompt to run the same job on multiple nodes within the container. The output from this job (example shown) should be the same as for the previous job.