Changes to storage policies#
Research Computing is making policy changes regarding storage organization and retention. This is in response to the increasing needs of users, and represent best practice.
Policy updates
- User scratch directories (/scratch/u/...) will be deleted
- Files in scratch that have not been accessed or modified in the last 60 days will be removed daily
User scratch removal#
Research Computing will no longer provide user scratch directories. For reference, these directories have been located at /scratch/u/netid
. This is a response to user feedback about confusion over purpose of user vs. group scratch. Additionally, we have found that group scratch is more heavily utilized, and more actively utilized, i.e., less old data. We encourage all users to cleanup their user scratch data immediately. Research Computing will begin deleting these directories on March 6, 2024.
Scratch retention#
Research Computing has had documented guidance for use of scratch storage in the HPC environment since the start of operations for the HPC cluster. This included a data deletion policy for old files in the scratch file system. Historically, we have not enforced this policy in any automated way. In fact, we have only emailed our MCW users to manually cleanup their scratch space when storage ran out. This has led to issues with users storing old data in scratch that is not needed and takes up valuable space meant for new jobs.
To fix these issues, Research Computing will begin deleting old scratch data with an automated script. The script will daily scan all files and directories in your scratch space and move any objects that have not been modified or accessed in the last 60 days to a daily trash space. After 14 days, a daily trash space will expire and be deleted. This means that the scratch file system will be cleaned up every day, and you will have the opportunity to retrieve those old files for an additional 14 days.
This scratch retention process is effective March 6, 2024. For details, please review the scratch retention information.
Next steps for you#
Start cleaning up your scratch directories. If you have data in /scratch/u
, move it to your lab group space /group/pi_netid
, or /scratch/g/pi_netid
if you have more jobs to run.
We encourage all users to review the FAQ section below for answers to common questions. All further questions should be submitted to help-rcc@mcw.edu.
FAQ#
How do I know the age of my files?
To list files in a directory ordered by age from top to bottom:
ls -latr
To find all files in a directory that have not been modified in the last 60 days:
find /path/to/directory -mtime +60
To find all files in a directory that have not been accessed in the last 60 days:
find /path/to/directory -atime +60
Remember, the scratch file system is not for regular data storage. It is only to be used for storing data while a SLURM job is running. When the job finishes, clean it up. There should never be a time where old data exists in your scratch folder. In this way, you can avoid a situation where you do not know or remember the age of a file.
My PI does not have space in their group storage. Where do I store my user scratch data?
You should discuss this issue with your PI. Group storage is available with higher quota limits for $60/TB/year. Contact help-rcc@mcw.edu if you require assistance in discussing this issue.
Remember, it is best to avoid this situation. Do not use scratch storage as a free general data storage. Do not use the cluster to generate results if you are not willing to store that data either in group storage, or some other lab solution.
I am a PI. How do I know if my past/present lab members have data stored in scratch that could be deleted?
We suggest to ask those lab members directly. The best person to identify a piece of data is the person that generated it. If you cannot contact that lab member, contact help-rcc@mcw.edu to arrange a consultation.
I have data in scratch now. Will my data be deleted?
This depends on the age of your data. Any data older than 60 days will be moved to a trash folder for another 14 days. After that time, it will be deleted. Avoid that situation by cleaning up your scratch folder early and often.
Is my scratch data backed up before you delete it?
No, data in /scratch
is not backed up, replicated, snapshotted, or protected by any other means. Again, this is temporary space only for data being processed in a SLURM job. Do not leave your data in scratch where it may be deleted or lost. Clean it up early and often by transferring results to group storage.