+++
title = "Handling Data"
description = "How to work with and transfer data to/from HCC resources."
weight = "30"
+++

{{% panel theme="danger" header="**Sensitive and Protected Data**" %}}HCC currently has *no storage* that is suitable for **HIPAA** or other **PID** data sets.  Users are not permitted to store such data on HCC machines.{{% /panel %}}

All HCC machines have three separate areas for every user to store data, each intended for a different purpose. In addition, we have a transfer service that utilizes [Globus Connect]({{< relref "data_transfer/globus_connect/" >}}).

{{< figure src="/images/35325560.png" height="500" class="img-border">}}
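
For quick orientation, each of the three areas has a corresponding environment variable ($HOME, $WORK, and $COMMON, described in the sections below). A minimal shell sketch to print where each one resolves for your account:

```bash
# Print the absolute path of each storage area for the current user
echo "Home:   $HOME"
echo "Work:   $WORK"
echo "Common: $COMMON"
```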

---
### Home Directory

{{% notice info %}}
You can access your home directory quickly using the $HOME environment variable (i.e. `cd $HOME`).
{{% /notice %}}

Your home directory (i.e. `/home/[group]/[username]`) is meant for items that take up relatively small amounts of space: for example, source code, program binaries, and configuration files. This space is quota-limited to **20GB per user**. The home directories are backed up for the purposes of best-effort disaster recovery. This space is not intended as an area for I/O to active jobs; **/home** is mounted **read-only** on cluster worker nodes to enforce this policy.
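
Because worker nodes mount `/home` read-only, a job can read files from it but all output must go to a writable area such as $WORK. A minimal sketch, with hypothetical directory and program names:

```bash
# Jobs may read from /home but cannot write there; send all output to /work
mkdir -p $WORK/my_project                # my_project is a placeholder name
cd $WORK/my_project
$HOME/my_project/my_program input.txt > results.out   # my_program is a placeholder binary
```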

---
### Common Directory

{{% notice info %}}
You can access your common directory quickly using the $COMMON environment variable (i.e. `cd $COMMON`).
{{% /notice %}}

The common directory operates similarly to work and is mounted with **read and write capability to worker nodes on all HCC clusters**. This means that any files stored in common can be accessed from Crane or Rhino, making this directory ideal for items that need to be accessed from multiple clusters, such as reference databases and shared data files.
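
For example, a reference data set staged once into $COMMON becomes readable by jobs on every cluster. A hypothetical sketch (the directory name is a placeholder):

```bash
# Stage a shared reference data set where all HCC clusters can reach it
mkdir -p $COMMON/reference_db
cp -r $WORK/reference_db/. $COMMON/reference_db/
```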

{{% notice warning %}}
Common is not designed for heavy I/O usage. Please continue to use your work directory for active job output to ensure the best performance of your jobs.
{{% /notice %}}

Quotas for common are **30 TB per group**, with larger quotas available for purchase if needed. However, files stored here are **not backed up** and are **not subject to purge** at this time. Please continue to back up your files to prevent irreparable data loss.

Additional information on using the common directories can be found in the documentation on [Using the /common File System]({{< relref "using_the_common_file_system" >}}).

---
### High Performance Work Directory

{{% notice info %}}
You can access your work directory quickly using the $WORK environment variable (i.e. `cd $WORK`).
{{% /notice %}}

{{% panel theme="danger" header="**File Loss**" %}}The `/work` directories are **not backed up**. Irreparable data loss is possible with a mis-typed command. See [Preventing File Loss]({{< relref "preventing_file_loss" >}}) for strategies to avoid this.{{% /panel %}}

Every user has a corresponding directory under `/work` using the same naming convention as `/home` (i.e. `/work/[group]/[username]`). We encourage all users to use this space for I/O to running jobs. This directory can also be used when larger amounts of space are temporarily needed. There is a **50TB per group quota**; space in /work is shared among all users. It should be treated as short-term scratch space and **is not backed up**. **Please use the `hcc-du` command to check your own and your group's usage, and back up and clean up your files in $WORK at reasonable intervals.**
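
A typical check-and-clean pass might look like the following. The `hcc-du` invocation is the command named above; `finished_run` is a hypothetical directory, and the archive should ultimately be moved somewhere backed up (e.g. Attic, described below):

```bash
# Report your own and your group's usage against the 50TB group quota
hcc-du

# Hypothetical cleanup: archive a completed run, then reclaim the space
tar -czf finished_run.tar.gz $WORK/finished_run/
rm -rf $WORK/finished_run/
```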

---
### Purge Policy

HCC has a **purge policy on /work** for files that become dormant. After **6 months of inactivity on a file (26 weeks)**, an automated purge process will reclaim the space used by these dormant files. HCC provides the **`hcc-purge`** utility to list both a summary and the actual paths of files that have been dormant for **24 weeks**. This list is periodically generated; the timestamp of the last search is included in the default summary output when calling `hcc-purge` with no arguments. No output from `hcc-purge` indicates the last scan did not find any dormant files. `hcc-purge -l` will use the `less` pager to list the matching files for the user. The candidate list can also be accessed at the following path: `/lustre/purge/current/${USER}.list`. This list is updated twice a week, on Mondays and Thursdays.
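
In practice, checking for purge candidates looks like this; all three commands come directly from the policy described above:

```bash
# Summary of dormant files; no output means the last scan found none
hcc-purge

# Page through the full list of matching files with less
hcc-purge -l

# The same candidate list, regenerated on Mondays and Thursdays
less /lustre/purge/current/${USER}.list
```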

{{% notice warning %}}
`/work` is intended for recent job output, not long-term storage. Evidence of users circumventing the purge policy will result in consequences, including account lockout.
{{% /notice %}}

If you have space requirements outside what is currently provided, please email <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> and we will gladly discuss alternatives.

---
### [Attic]({{< relref "using_attic" >}})

Attic is a near-line archive available for purchase at HCC. Attic provides large data storage that is designed to be more reliable than `/work` and larger than `/home`. Access to Attic is done through [Globus Connect]({{< relref "data_transfer/globus_connect/" >}}).

More details on Attic can be found on HCC's
<a href="https://hcc.unl.edu/attic" class="external-link">Attic</a>
website.

---
### [Globus Connect]({{< relref "data_transfer/globus_connect/" >}})

For moving large amounts of data into or out of HCC resources, users are highly encouraged to consider using [Globus Connect]({{< relref "data_transfer/globus_connect/" >}}).

---
### Using Box

You can use your [UNL Box.com]({{< relref "integrating_box_with_hcc" >}}) account to download and upload files from any of the HCC clusters.