Run over your storage quota?

If you are experiencing:

  • Issues accessing or using Strudel,
  • Error messages like Disk quota exceeded.

then this could be the result of:

  • running over your storage quota in a key directory, OR
  • running out of disk space on a local disk like /tmp/.

If a key directory is full...

If one of your key directories is full, then software tools may break. This is especially true for your home directory. To resolve this:

  1. Connect to a login node. Either connect via SSH, or connect to Strudel and open a terminal.
  2. Verify you are over quota. See user_info; quota usage over 100% means you are over quota.
  3. Identify exactly which files/directories are using up your quota. See ncdu.
  4. Remove or move large files. See Common causes of disk filling up for advice here.
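If ncdu is not available in your session, plain du from coreutils gives a quick, non-interactive view of what is using space in your home directory. This is a sketch assuming a standard Linux login node; ncdu gives the same information interactively:

```shell
# Summarise the size of each top-level item in your home directory
# (including dotfiles/dotdirs), largest last.
du -sh ~/* ~/.[!.]* 2>/dev/null | sort -h | tail -n 10
```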
important

Your home quota will not be increased, so please follow the instructions below to clean up your home directory.

Common causes of disk filling up

Remember to first use ncdu to figure out if any of these examples apply to you! For solutions involving setting environment variables, consider placing these in your ~/.bashrc so these variables are automatically set every time you log in to M3. Note that cache files are generally safe to delete.

Cause: Conda
Detail: By default, Conda stores your environments and their packages in your home directory, which can easily fill your quota.
Solution: Configure Conda as in our guide. Then either move your old Conda environments to their new location, or simply delete all of your Conda environments and packages with:

rm -rf ~/.conda/envs
rm -rf ~/.conda/pkgs

Cause: Pip cache
Detail: By default, Python's pip installer places cache files in ~/.cache/pip/.
Solution: Set PIP_CACHE_DIR. For example:

PROJECT_ID=ab12
export PIP_CACHE_DIR="/scratch/$PROJECT_ID/$USER/pip-cache"

Cause: Apptainer cache
Detail: By default, Apptainer places cache files in ~/.apptainer/cache/.
Solution: Set APPTAINER_CACHEDIR. For example:

PROJECT_ID=ab12
export APPTAINER_CACHEDIR="/scratch/$PROJECT_ID/$USER/apptainer-cache"
mkdir -p "$APPTAINER_CACHEDIR"

Cause: VNC log files
Detail: Not common, but Strudel usage sometimes produces a very large log file in ~/.vnc/.
Solution: There is no way to prevent this, but it is rare. Simply delete the log file.

Cause: Your own large files
Detail: You may have put your own large data files in your home directory.
Solution: Move the files into a project or scratch directory instead.
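If you adopt more than one of the cache relocations above, they can live together as a single block in your ~/.bashrc. A sketch, where ab12 is a placeholder you would replace with your own project ID:

```shell
# ~/.bashrc snippet: keep large caches on scratch instead of home.
# ab12 is a placeholder; substitute your own project ID.
PROJECT_ID=ab12
export PIP_CACHE_DIR="/scratch/$PROJECT_ID/$USER/pip-cache"
export APPTAINER_CACHEDIR="/scratch/$PROJECT_ID/$USER/apptainer-cache"
mkdir -p "$PIP_CACHE_DIR" "$APPTAINER_CACHEDIR"
```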
XDG_CACHE_HOME

Some software will obey the XDG_CACHE_HOME environment variable. Setting it to a directory outside your home directory (for example, on scratch) moves that software's cache files out of your home quota.
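As with the caches above, XDG_CACHE_HOME can be pointed somewhere outside your home directory. A sketch using a /tmp path purely for illustration; on M3 you would more likely use a scratch path:

```shell
# Redirect XDG-aware caches out of the home directory.
# The /tmp path here is for illustration only; prefer a scratch path on M3.
export XDG_CACHE_HOME="/tmp/$USER/cache"
mkdir -p "$XDG_CACHE_HOME"
```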

If a local disk directory like /tmp/ is full...

Sometimes when you run a program, it will produce an error like:

OSError: [Errno 28] No space left on device

But when you check your storage quotas using user_info, you seem to be well under quota for all of your directories! In this scenario, it is almost always /tmp/ filling up. /tmp/ is unique to each node on M3 and is used for storing temporary files. The login nodes have relatively little space in /tmp/:

[lexg@m3-login3 ~]$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg00-root 46G 16G 28G 37% /

So if you are on a login node and have this issue, you should run your program in a Slurm job on a compute node. Then, /tmp/ should be bound to a much larger local disk (> 1 TB):

[lexg@m3s101 ~]$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p1 2.9T 7.1M 2.8T 1% /tmp
note

/tmp/ is shared by every user on a node, but it's very rare that it ever fills up on compute nodes. A user's files in /tmp/ are automatically deleted once their job terminates.

If somehow /tmp/ is still filling up even inside of a Slurm job, then you can try setting the TMPDIR environment variable to a scratch directory in your job script (or shell if using an interactive session). For example:

PROJECT_ID=ab12 # Change this to your own project ID
export TMPDIR="/scratch/$PROJECT_ID/$USER/tmp"
mkdir -p "$TMPDIR" # create the directory in case it doesn't already exist
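Many tools, Python's tempfile module among them, consult TMPDIR, so you can quickly confirm the redirect has taken effect. A sketch using a home-relative path for illustration; on M3 you would use a scratch path as above:

```shell
# Verify that TMPDIR is honoured: Python's tempfile reports the active temp dir.
export TMPDIR="$HOME/tmp-demo"   # illustrative path only
mkdir -p "$TMPDIR"               # tempfile only uses TMPDIR if it exists and is writable
python3 -c 'import tempfile; print(tempfile.gettempdir())'   # should print the TMPDIR path
```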