+++
title = "Data Storage"
description = "How to work with and transfer data to/from HCC resources."
weight = "30"
+++

{{% panel theme="danger" header="**Sensitive and Protected Data**" %}}HCC currently has *no storage* that is suitable for **HIPAA** or other **PID** data sets.  Users are not permitted to store such data on HCC machines.{{% /panel %}}

All HCC machines have three separate areas for every user to store data,
each intended for a different purpose. In addition, we have a transfer
service that uses [Globus Connect]({{< relref "../data_transfer/globus_connect" >}}).

{{< figure src="/images/35325560.png" >}}

---
### Home Directory
{{% notice info %}}
You can access your home directory quickly using the `$HOME` environment
variable (e.g. `cd $HOME`).
{{% /notice %}}

Your home directory (i.e. `/home/[group]/[username]`) is meant for items
that take up relatively small amounts of space, such as source
code, program binaries, and configuration files. This space is
quota-limited to **20 GB per user**. The home directories are backed up
for the purposes of best-effort disaster recovery. This space is not
intended for I/O from active jobs.

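
To get a rough idea of how close a directory is to its quota, a sketch like the following may help. It is only an approximation: `usage_percent` is a hypothetical helper name, it assumes the quota is counted in GiB, and the file system's own quota accounting may differ from what `du` reports.

```shell
# Print how much of a quota a directory tree uses, as an integer percentage.
# usage_percent DIR QUOTA_GB
# Assumption: 1 GB is treated as 1024 * 1024 KiB; HCC's accounting may differ.
usage_percent() (
    dir=$1
    quota_gb=$2
    # du -sk prints the total size in 1 KiB blocks, followed by the path.
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    quota_kb=$((quota_gb * 1024 * 1024))
    echo $((used_kb * 100 / quota_kb))
)

# Example: compare $HOME against the 20 GB per-user quota.
# echo "Home usage: $(usage_percent "$HOME" 20)% of quota"
```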
---
### Common Directory
{{% notice info %}}
You can access your common directory quickly using the `$COMMON`
environment variable (e.g. `cd $COMMON`).
{{% /notice %}}

The common directory operates similarly to work and is mounted with
**read and write capability on the worker nodes of all HCC clusters**. This
means that any files stored in common can be accessed from both Crane and
Rhino, making this directory ideal for items that need to be
accessed from multiple clusters, such as reference databases and shared
data files.

{{% notice warning %}}
Common is not designed for heavy I/O usage. Please continue to use your
work directory for active job output to ensure the best performance of
your jobs.
{{% /notice %}}
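
One way to follow this advice is to read shared inputs from `$COMMON` while sending job output to a fresh directory under `$WORK`. The sketch below is illustrative only: `new_run_dir`, the `reference_db` path, and `my_program` are hypothetical names, not part of HCC's tooling.

```shell
# Create a fresh, timestamped directory under a base path and print its name.
# new_run_dir BASE NAME
new_run_dir() (
    base=$1
    name=$2
    # The timestamp plus the shell's PID keeps repeated runs from colliding.
    run_dir="$base/$name-$(date +%Y%m%d-%H%M%S)-$$"
    mkdir -p "$run_dir" && echo "$run_dir"
)

# Hypothetical usage inside a job script:
#   REF="$COMMON/reference_db"            # shared, read-mostly input
#   OUT=$(new_run_dir "$WORK" blast_run)  # heavy job output goes to /work
#   my_program --db "$REF" --out "$OUT"   # my_program is a placeholder
```

Keeping each run in its own directory also makes it easier to back up or delete finished results later.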

Quotas for common are **30 TB per group**, with larger quotas available
for purchase if needed. Files stored here are **not backed
up**, although they are **not subject to purge** at this time. Please
continue to back up your files to prevent irreparable data loss.

Additional information on using the common directories can be found in
the documentation on [Using the /common File System]({{< relref "using_the_common_file_system" >}}).

---
### High Performance Work Directory
{{% notice info %}}
You can access your work directory quickly using the `$WORK` environment
variable (e.g. `cd $WORK`).
{{% /notice %}}
{{% panel theme="danger" header="**File Loss**" %}}The `/work` directories are **not backed up**. Irreparable data loss is possible with a mis-typed command. See [Preventing File Loss]({{< relref "preventing_file_loss" >}}) for strategies to avoid this.{{% /panel %}}

Every user has a corresponding directory under `/work` using the same
naming convention as `/home` (i.e. `/work/[group]/[username]`). We
encourage all users to use this space for I/O to running jobs. This
directory can also be used when larger amounts of space are temporarily
needed. There is a **50 TB per group quota**; space in `/work` is shared
among all users. It should be treated as short-term scratch space, and
**is not backed up**. **Please use the `hcc-du` command to check your
own and your group's usage, and back up and clean up your files in `$WORK`
at reasonable intervals.**

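
As one possible way to back up a finished run before cleaning it out of `$WORK`, the sketch below packs a directory into a compressed archive and checks that the archive is readable. `archive_run` and the example paths are hypothetical, and where the archive should live is left to you. The function deliberately deletes nothing.

```shell
# Pack a directory into DEST_DIR/<name>.tar.gz and print the archive path.
# archive_run SRC_DIR DEST_DIR
archive_run() (
    src=$1
    dest_dir=$2
    name=$(basename "$src")
    archive="$dest_dir/$name.tar.gz"
    # -C switches to the parent directory so the archive stores relative paths.
    # The second tar call lists the archive to confirm it can be read back.
    mkdir -p "$dest_dir" &&
        tar -czf "$archive" -C "$(dirname "$src")" "$name" &&
        tar -tzf "$archive" >/dev/null &&
        echo "$archive"
)

# Hypothetical usage (paths are placeholders):
#   archive_run "$WORK/old_run" "/path/to/backup/location"
```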
---
### Purge Policy

HCC has a **purge policy on /work** for files that become dormant.
After **6 months (26 weeks) of inactivity on a file**, an automated
purge process will reclaim the space used by these dormant files. HCC
provides the **`hcc-purge`** utility to list both a summary and the
actual file paths of files that have been dormant for **24 weeks** and
are therefore candidates for purging. The list is regenerated twice a
week, on Mondays and Thursdays; the timestamp of the last scan
is included in the default summary output when calling `hcc-purge` with
no arguments. No output from `hcc-purge` indicates the last scan did
not find any dormant files. `hcc-purge -l` will use the `less` pager to
list the matching files for the user. The candidate list can also be
accessed at the following path: `/lustre/purge/current/${USER}.list`.

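
If you prefer to inspect the candidate list file directly, a small helper along these lines could summarize it. This is a sketch only: `purge_preview` is a hypothetical name, and it assumes the list holds one file path per line, which the text above does not confirm.

```shell
# Print the number of entries in a purge candidate list, then the first few.
# purge_preview LIST_FILE [MAX_LINES]
# Assumption: one file path per line.
purge_preview() (
    list=$1
    max=${2:-10}
    if [ -s "$list" ]; then
        echo "$(wc -l < "$list" | tr -d ' ') candidate file(s)"
        head -n "$max" "$list"
    else
        # A missing or empty list is treated as "nothing to purge".
        echo "no purge candidates listed"
    fi
)

# On an HCC cluster, using the path given above:
#   purge_preview "/lustre/purge/current/${USER}.list"
```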
{{% notice warning %}}
`/work` is intended for recent job output, not long-term storage. Evidence that a user is circumventing the purge policy will result in consequences, including account lockout.
{{% /notice %}}

If you have space requirements outside what is currently provided,
please
email <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> and
we will gladly discuss alternatives.

---
### [Attic]({{< relref "using_attic" >}})
Attic is a near-line archive available for purchase at HCC. Attic
provides large data storage that is designed to be more
reliable than `/work` and larger than `/home`. Attic is accessed
through [Globus Connect]({{< relref "../data_transfer/globus_connect" >}}).

More details on Attic can be found on HCC's
<a href="https://hcc.unl.edu/attic" class="external-link">Attic</a>
website.

### [Globus Connect]({{< relref "../data_transfer/globus_connect" >}})

For moving large amounts of data into or out of HCC resources, we
strongly encourage users to consider [Globus
Connect]({{< relref "../data_transfer/globus_connect" >}}).

---
### Using Box
You can use your [Box.com]({{< relref "integrating_box_with_hcc" >}}) account to download and
upload files from any of the HCC clusters.