Common computational problems

Here are some computational problems commonly encountered on the compute servers and cluster, along with some solutions or workarounds.

  • Your cluster job won't start
    • Most biostat users and users with class-related accounts are not allowed to run jobs on the cluster; one solution for biostat users is to use the Biostatistics cluster.
    • Jobs in the high queue can use at most 12 cores.
    • Jobs in any queue can use at most 32 cores (unless run via the "mpi" parallel environment).
    • See also the next problem in this list.
  • Your cluster job is stuck at the top of the queue and other jobs start before it
    • Jobs requesting multiple cores may not start because not enough cores become available at the same time, while jobs requesting fewer cores start instead because they fit in the smaller number of cores that do become free. To remedy this, pass the "-R y" flag to qsub when submitting your job. This requests a 'reservation', so that the queueing software accumulates free cores for your job. Alternatively, see the answers to the problem listed just above.
  • R hangs when using profiling
    • This is likely a conflict between R's Rprof() and the threaded BLAS used by R for linear algebra. Solutions include (1) disabling threading (see here for more details) and (2) not using profiling.
  • Matlab won't use more than 12 cores at once
    • Our installation of Matlab makes use of the Parallel Computing Toolbox for parallelization. This toolbox restricts usage to no more than 12 cores. To use more, we would need to have the Matlab Distributed Computing Server set up. If this becomes a limitation in your work, you can let us know and we can talk about the options.
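
For the two cluster-job problems at the top of this list, the core limits and the reservation flag can be checked before submitting. The following is only a minimal sketch in Python, not a tool we provide: the function name build_qsub_command, the script name job.sh, the parallel environment name "smp", and the assumption that the high queue is selected with "-q high" are all illustrative and should be checked against the actual cluster configuration. Only the 12-core and 32-core limits, the "mpi" exception, and the "-R y" flag come from the notes above.

```python
import shlex

# Limits quoted from the list above.
HIGH_QUEUE_MAX_CORES = 12   # jobs in the high queue
ANY_QUEUE_MAX_CORES = 32    # jobs in any queue, unless run via the "mpi" parallel environment


def build_qsub_command(script, cores, queue=None, pe="smp", reserve=True):
    """Return a qsub argument list, or raise ValueError if a core limit is exceeded.

    Assumptions (not from the original notes): the parallel environment is
    called "smp" by default, and the high queue is requested as "-q high".
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if queue == "high" and cores > HIGH_QUEUE_MAX_CORES:
        raise ValueError(f"high queue allows at most {HIGH_QUEUE_MAX_CORES} cores")
    if pe != "mpi" and cores > ANY_QUEUE_MAX_CORES:
        raise ValueError(
            f"at most {ANY_QUEUE_MAX_CORES} cores unless using the mpi parallel environment"
        )

    cmd = ["qsub"]
    if queue is not None:
        cmd += ["-q", queue]
    if cores > 1:
        # Multi-core request: name the parallel environment and the core count.
        cmd += ["-pe", pe, str(cores)]
        if reserve:
            # Ask the scheduler to accumulate free cores for this job.
            cmd += ["-R", "y"]
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_qsub_command("job.sh", cores=8)))
```

The sketch only builds and prints the command, so nothing is submitted until you run the printed line yourself. A request that exceeds one of the limits raises an error up front rather than leaving a job that never starts.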