diff --git a/content/handling_data/_index.md b/content/handling_data/_index.md
index 1b648b4155e1ae350ace2c08ba9673fa4bd9932b..4b16681a402d45389a8133ffeac1c40e75ad87df 100644
--- a/content/handling_data/_index.md
+++ b/content/handling_data/_index.md
@@ -7,124 +7,23 @@ weight = "30"
 {{% panel theme="danger" header="**Sensitive and Protected Data**" %}}HCC currently has *no storage* that is suitable for **HIPAA** or other **PID** data sets. Users are not permitted to store such data on HCC machines.{{% /panel %}}
 All HCC machines have three separate areas for every user to store data,
-each intended for a different purpose. In addition, we have a transfer
-service that utilizes [Globus Connect]({{< relref "data_transfer/globus_connect/" >}}).
+each intended for a different purpose. The three areas are `/common`, `/work`, and `/home`, each with different functions. `/home` is your home directory with a quota limit of **20GB** and is backed up for best-effort disaster recovery purposes. `/work` is the high performance, I/O focused directory for running jobs. `/work` has a **50TB per group quota**, is not backed up, and is subject to a [purge policy]({{<relref "data_storage/#purge-policy" >}}) of **6 months of inactivity on a file**. `/common` works similarly to `/work` and is mounted with read and write capabilities on all HCC clusters, meaning any files on `/common` can be accessed from all HCC clusters, unlike `/home` and `/work`, which are cluster dependent. More information on the three storage areas on HCC's clusters is available on the [Data Storage]({{<relref "data_storage">}}) page.
 {{< figure src="/images/35325560.png" height="500" class="img-border">}}
----
-### Home Directory
+HCC also offers a separate, near-line archive with space available for purchase called Attic. Attic provides reliable large data storage that is designed to be more reliable than `/work`, and larger than `/home`. More information on Attic and how to transfer data to and from Attic can be found on the [Using Attic]({{<relref "data_storage/using_attic">}}) page.
-{{% notice info %}}
-You can access your home directory quickly using the $HOME environmental
-variable (i.e. '`cd $HOME'`).
-{{% /notice %}}
-
-Your home directory (i.e. `/home/[group]/[username]`) is meant for items
-that take up relatively small amounts of space. For example: source
-code, program binaries, configuration files, etc. This space is
-quota-limited to **20GB per user**. The home directories are backed up
-for the purposes of best-effort disaster recovery. This space is not
-intended as an area for I/O to active jobs. **/home** is mounted
-**read-only** on cluster worker nodes to enforce this policy.
-
----
-### Common Directory
-
-{{% notice info %}}
-You can access your common directory quickly using the $COMMON
-environmental variable (i.e. '`cd $COMMON`')
-{{% /notice %}}
-
-The common directory operates similarly to work and is mounted with
-**read and write capability to worker nodes all HCC Clusters**. This
-means that any files stored in common can be accessed from Crane or Rhino,
-making this directory ideal for items that need to be
-accessed from multiple clusters such as reference databases and shared
-data files.
-
-{{% notice warning %}}
-Common is not designed for heavy I/O usage. Please continue to use your
-work directory for active job output to ensure the best performance of
-your jobs.
-{{% /notice %}}
-
-Quotas for common are **30 TB per group**, with larger quotas available
-for purchase if needed. However, files stored here will **not be backed
-up** and are **not subject to purge** at this time. Please continue to
-backup your files to prevent irreparable data loss.
-
-Additional information on using the common directories can be found in
-the documentation on [Using the /common File System]({{< relref "using_the_common_file_system" >}})
-
----
-### High Performance Work Directory
-
-{{% notice info %}}
-You can access your work directory quickly using the $WORK environmental
-variable (i.e. '`cd $WORK'`).
-{{% /notice %}}
-
-{{% panel theme="danger" header="**File Loss**" %}}The `/work` directories are **not backed up**. Irreparable data loss is possible with a mis-typed command. See [Preventing File Loss]({{< relref "preventing_file_loss" >}}) for strategies to avoid this.{{% /panel %}}
-
-Every user has a corresponding directory under /work using the same
-naming convention as `/home` (i.e. `/work/[group]/[username]`). We
-encourage all users to use this space for I/O to running jobs. This
-directory can also be used when larger amounts of space are temporarily
-needed. There is a **50TB per group quota**; space in /work is shared
-among all users. It should be treated as short-term scratch space, and
-**is not backed up**. **Please use the `hcc-du` command to check your
-own and your group's usage, and back up and clean up your files at
-reasonable intervals in $WORK.**
+You can also use your [UNL Box.com]({{< relref "integrating_box_with_hcc" >}}) account to download and
+upload files from any of the HCC clusters.
----
-### Purge Policy
+For moving general data into or out of HCC resources, users are encouraged to use [scp]({{<relref "data_transfer/scp" >}}) for command line transfers on Windows 10, MacOS, and Linux, or for graphical transfers, [WinSCP]({{<relref "data_transfer/winscp" >}}) for Windows, and [CyberDuck]({{<relref "data_transfer/cyberduck" >}}) for MacOS and Linux.
-HCC has a **purge policy on /work** for files that become dormant.
- After **6 months of inactivity on a file (26 weeks)**, an automated
-purge process will reclaim the used space of these dormant files. HCC
-provides the **`hcc-purge`** utility to list both the summary and the
-actual file paths of files that have been dormant for **24 weeks**.
- This list is periodically generated; the timestamp of the last search
-is included in the default summary output when calling `hcc-purge` with
-no arguments. No output from `hcc-purge` indicates the last scan did
-not find any dormant files. `hcc-purge -l` will use the less pager to
-list the matching files for the user. The candidate list can also be
-accessed at the following path:` /lustre/purge/current/${USER}.list`.
- This list is updated twice a week, on Mondays and Thursdays.
+For moving large amounts of data into or out of HCC resources, users are highly encouraged to consider using [Globus Connect]({{< relref "data_transfer/globus_connect/" >}}).
-{{% notice warning %}}
-`/work` is intended for recent job output and not long term storage. Evidence of circumventing the purge policy by users will result in consequences including account lockout.
-{{% /notice %}}
-If you have space requirements outside what is currently provided,
+If you have space requirements outside what is currently provided or any questions regarding moving data around, please
-email <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> and
-we will gladly discuss alternatives.
-
----
-### [Attic]({{< relref "using_attic" >}})
-
-Attic is a near line archive available for purchase at HCC. Attic
-provides reliable large data storage that is designed to be more
-reliable then `/work`, and larger than `/home`. Access to Attic is done
-through [Globus Connect]({{< relref "data_transfer/globus_connect/" >}}).
+email <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a>.
-More details on Attic can be found on HCC's
-<a href="https://hcc.unl.edu/attic" class="external-link">Attic</a>
-website.
----
-### [Globus Connect]({{< relref "data_transfer/globus_connect/" >}})
-
-For moving large amounts of data into or out of HCC resources, users are
-highly encouraged to consider using [Globus
-Connect]({{< relref "data_transfer/globus_connect/" >}}).
-
----
-### Using Box
-
-You can use your [UNL
-Box.com]({{< relref "integrating_box_with_hcc" >}}) account to download and
-upload files from any of the HCC clusters.