Commit 3992f8a8 authored by Adam Caprez

Merge branch 'fix_handling_data' into 'master'

Update handling data articles

See merge request !32
parents 51eeee2c a06ce44e
......@@ -4,33 +4,20 @@ description = "How to work with and transfer data to/from HCC resources."
weight = "30"
+++
<span id="title-text"> HCC-DOCS : Handling Data </span>
=======================================================
Created by <span class="author"> Derek Weitzel</span>, last modified by
<span class="editor"> Carrie Brown</span> on Sep 18, 2018
<span
class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span>
HCC currently has no storage that is suitable for HIPAA or other PID
data sets. Users are not permitted to store such data on HCC machines.
{{% panel theme="danger" header="**Sensitive and Protected Data**" %}}HCC currently has *no storage* that is suitable for **HIPAA** or other **PID** data sets. Users are not permitted to store such data on HCC machines.{{% /panel %}}
All HCC machines have three separate areas for every user to store data,
each intended for a different purpose. In addition, we have a transfer
service that utilizes [Globus Connect](Globus-Connect_6357013.html).
<span
class="confluence-embedded-file-wrapper image-center-wrapper confluence-embedded-manual-size"><img src="assets/images/332256/35325560.png" class="confluence-embedded-image image-center" width="1000" /></span>
service that utilizes [Globus Connect]({{< relref "globus_connect" >}}).
{{< figure src="/images/35325560.png" >}}
Home Directory
--------------
<span
class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span>
---
### Home Directory
{{% notice info %}}
You can access your home directory quickly using the $HOME environment
variable (i.e. `cd $HOME`).
{{% /notice %}}
Your home directory (i.e. `/home/[group]/[username]`) is meant for items
that take up relatively small amounts of space. For example: source
......@@ -40,28 +27,25 @@ for the purposes of best-effort disaster recovery.  This space is not
intended as an area for I/O to active jobs. **/home** is mounted
**read-only** on cluster worker nodes to enforce this policy.
Common Directory
----------------
<span
class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span>
---
### Common Directory
{{% notice info %}}
You can access your common directory quickly using the $COMMON
environment variable (i.e. `cd $COMMON`).
{{% /notice %}}
The common directory operates similarly to work and is mounted with
**read and write capability on worker nodes across all HCC clusters**. This
means that any files stored in common can be accessed from Crane, Tusker
and Sandhills making this directory ideal for items that need to be
means that any files stored in common can be accessed from Crane and Tusker, making this directory ideal for items that need to be
accessed from multiple clusters such as reference databases and shared
data files.
<span
class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span>
{{% notice warning %}}
Common is not designed for heavy I/O usage. Please continue to use your
work directory for active job output to ensure the best performance of
your jobs.
{{% /notice %}}
Quotas for common are **30 TB per group**, with larger quotas available
for purchase if needed. However, files stored here will **not be backed
......@@ -69,23 +53,17 @@ up** and are **not subject to purge** at this time. Please continue to
back up your files to prevent irreparable data loss.
Additional information on using the common directories can be found in
the documentation on [Using the /common File System](30444241.html)
the documentation on [Using the /common File System]({{< relref "using_the_common_file_system" >}}).
High Performance Work Directory
-------------------------------
<span
class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span>
---
### High Performance Work Directory
{{% notice info %}}
You can access your work directory quickly using the $WORK environment
variable (i.e. `cd $WORK`).
{{% /notice %}}
<span
class="aui-icon aui-icon-small aui-iconfont-error confluence-information-macro-icon"></span>
The `/work` directories are **not backed up**. Irreparable data loss is
possible with a mis-typed command. See [Preventing File
Loss](Preventing-File-Loss_29065313.html) for strategies to avoid this.
{{% panel theme="danger" header="**File Loss**" %}}The `/work` directories are **not backed up**. Irreparable data loss is possible with a mis-typed command. See [Preventing File Loss]({{< relref "preventing_file_loss" >}}) for strategies to avoid this.{{% /panel %}}
Every user has a corresponding directory under /work using the same
naming convention as `/home` (i.e. `/work/[group]/[username]`). We
......@@ -93,11 +71,11 @@ encourage all users to use this space for I/O to running jobs.  This
directory can also be used when larger amounts of space are temporarily
needed. There is a **50TB per group quota**; space in /work is shared
among all users. It should be treated as short-term scratch space, and
**is not backed up**. <span style="color: rgb(255,0,0);"><span
style="color: rgb(0,0,0);">Please use the `hcc-du` command to check your
**is not backed up**. **Please use the `hcc-du` command to check your
own and your group's usage, and back up and clean up your files at
reasonable intervals in $WORK.</span></span>
reasonable intervals in $WORK.**
---
### Purge Policy
HCC has a **purge policy on /work** for files that become dormant.
......@@ -113,58 +91,39 @@ list the matching files for the user.  The candidate list can also be
accessed at the following path: `/lustre/purge/current/${USER}.list`.
This list is updated twice a week, on Mondays and Thursdays.
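As a rough illustration of working with the candidate list, the sketch below counts the entries in a list file. It assumes the list is plain text with one file path per line, which this page does not confirm, and `count_purge_candidates` is a hypothetical helper name, not an HCC-provided command.

```shell
# count_purge_candidates LIST_FILE
# Prints the number of non-empty lines in LIST_FILE, or 0 if the file
# is missing or unreadable.
# Assumption: the purge list holds one file path per line.
count_purge_candidates() {
    if [ ! -r "$1" ]; then
        echo 0
        return 0
    fi
    # grep -c prints the count of matching lines; '.' matches any
    # non-empty line. grep exits non-zero when the count is 0, so
    # "|| true" keeps the function's exit status clean.
    grep -c . "$1" || true
}

# Possible usage on a cluster (not run here):
#   count_purge_candidates "/lustre/purge/current/${USER}.list"
```

A non-zero count would suggest reviewing the listed files and copying anything you still need to longer-term storage.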
<span
class="aui-icon aui-icon-small aui-iconfont-error confluence-information-macro-icon"></span>
/work is intended for recent job output and not long term storage.
Evidence of circumventing the purge policy by users will result in
consequences including account lockout.
{{% notice warning %}}
`/work` is intended for recent job output and not long term storage. Evidence of circumventing the purge policy by users will result in consequences including account lockout.
{{% /notice %}}
If you have space requirements outside what is currently provided,
please
email <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> and
we will gladly discuss alternatives.
[Attic](Using-Attic_11635580.html)
----------------------------------
---
### [Attic]({{< relref "using_attic" >}})
Attic is a near-line archive available for purchase at HCC. Attic
provides large data storage that is designed to be more
reliable than `/work`, and larger than `/home`. Access to Attic is done
through [Globus Connect](Globus-Connect_6357013.html).
through [Globus Connect]({{< relref "globus_connect" >}}).
More details on Attic can be found on HCC's
<a href="https://hcc.unl.edu/attic" class="external-link">Attic</a>
website.
<span style="color: rgb(0,0,0);line-height: 1.4285715;font-size: 20.0px;">[Globus Connect](Globus-Connect_6357013.html)</span>
------------------------------------------------------------------------------------------------------------------------------
---
### [Globus Connect]({{< relref "globus_connect" >}})
For moving large amounts of data into or out of HCC resources, users are
highly encouraged to consider using [Globus
Connect](Globus-Connect_6357013.html).
Connect]({{< relref "globus_connect" >}}).
Using Box
---------
---
### Using Box
You can use your [UNL
Box.com](Integrating-Box-with-HCC_8192521.html) account to download and
Box.com]({{< relref "integrating_box_with_hcc" >}}) account to download and
upload files from any of the HCC clusters.
Attachments:
------------
<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
[HCCStorageOptions\_cb\_edits.pdf](attachments/332256/30444364.pdf)
(application/pdf)
<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
[HCCStorageOptions\_cb\_edits.png](attachments/332256/30444365.png)
(image/png)
<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
[StorageOptions.png](attachments/332256/35325560.png) (image/png)
1. [HCC-DOCS](index.html)
2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
3. [HCC Documentation](HCC-Documentation_332651.html)
4. [Handling Data](Handling-Data_332256.html)
+++
title = "Data for UNMC Users Only"
description= "Data storage options for UNMC users"
weight = 50
+++
<span id="title-text"> HCC-DOCS : Data for UNMC users only </span>
==================================================================
Created by <span class="author"> Mako Furukawa Furukawa</span>, last
modified on Apr 07, 2014
<span
class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span>
HCC currently has no storage that is suitable for HIPAA or other PID
{{% panel theme="danger" header="Sensitive and Protected Data" %}} HCC currently has no storage that is suitable for HIPAA or other PID
data sets. Users are not permitted to store such data on HCC machines.
Tusker and Crane have a special directory, only for UNMC users. Please
note that this filesystem is still not suitable for HIPAA or other PID
data sets.
{{% /panel %}}
Transferring files to this machine from UNMC.
---------------------------------------------
---
### Transferring files to this machine from UNMC.
You will need to email us
at <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> to
......@@ -28,8 +20,8 @@ gain access to this machine. Once you do, you can sftp to 10.14.250.1
and upload your files. Note that sftp is your only option. You may use
different sftp utilities depending on the platform you are logging in
from. Email us if you need help with this. Once you are logged in, you
should be at /volumes/UNMC1ZFS/\[group\]/\[username\], or
/home/\[group\]/\[username\]. Both are the same location and you will be
should be at `/volumes/UNMC1ZFS/[group]/[username]`, or
`/home/[group]/[username]`. Both are the same location and you will be
allowed to write files there.
For Windows, learn more about logging in and uploading files
......@@ -38,17 +30,14 @@ For Windows, learn more about logging in and uploading files
Using your uploaded files on Tusker or Crane.
---------------------------------------------
<span style="color: rgb(51,51,51);"><span
style="font-size: 14.0px;line-height: 1.4285715;">Using your
uploaded </span><span
style="font-size: 14.0px;line-height: 20.0px;">files</span><span
style="font-size: 14.0px;line-height: 1.4285715;"> is easy. Just go to
/shared/unmc1/\[group\]/\[username\] and your files will be in the same
Using your
uploaded files is easy. Just go to
`/shared/unmc1/[group]/[username]` and your files will be in the same
place. You may notice that the directory is not available at times. This
is because the unmc1 directory is automounted. This means, if you try to
go to the directory, it will show up. Just "cd" to
/shared/unmc1/\[group\]/\[username\] and all of the files will be
there.</span></span>
go to the directory, it will show up. Just "`cd`" to
`/shared/unmc1/[group]/[username]` and all of the files will be
there.
If you have space requirements outside what is currently provided,
please
......
1. [HCC-DOCS](index.html)
2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
3. [HCC Documentation](HCC-Documentation_332651.html)
4. [Handling Data](Handling-Data_332256.html)
<span id="title-text"> HCC-DOCS : High-Speed Data Transfers </span>
===================================================================
Created by <span class="author"> Emelie Harstad</span>, last modified by
<span class="editor"> Josh Samuelson</span> on May 17, 2018
Tusker, Crane and Sandhills each have a dedicated transfer server with
10 Gb/s connectivity (Sandhills currently limited to 1 Gb/s) that allows
+++
title = "High Speed Data Transfers"
description = "How to transfer files directly from the transfer servers"
weight = 10
+++
Crane, Tusker, and Attic each have a dedicated transfer server with
10 Gb/s connectivity that allows
for faster data transfers than the login nodes. With [Globus
Connect](https://hcc-docs.unl.edu/display/HCCDOC/Globus+Connect), users
Connect]({{< relref "globus_connect" >}}), users
can take advantage of this connection speed when making large/cumbersome
transfers.
<span style="line-height: 1.4285715;">
</span>
<span style="line-height: 1.4285715;">Those who prefer scp, sftp or
Those who prefer scp, sftp or
rsync clients can also benefit from this high-speed connectivity by
using these dedicated servers for data transfers:</span>
For Tusker transfers, use:
For Crane transfers, use:
using these dedicated servers for data transfers:
For Sandhills Transfers, use:
`tusker-xfer.unl.edu`
`crane-xfer.unl.edu`
`sandhills-xfer.unl.edu`
<span
class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span>
Cluster | Transfer server
----------|----------------------
Crane | `crane-xfer.unl.edu`
Tusker | `tusker-xfer.unl.edu`
Attic | `attic-xfer.unl.edu`
{{% notice info %}}
Because the transfer servers are login-disabled, third-party transfers
between `tusker-xfer,` `crane-xfer` and `sandhills-xfer` must be done
via [Globus
Connect](https://hcc-docs.unl.edu/display/HCCDOC/Globus+Connect).
between `crane-xfer`, `tusker-xfer`, and `attic-xfer` must be done via [Globus Connect]({{< relref "globus_connect" >}}).
{{% /notice %}}
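The table above can be turned into a small lookup helper for scripted transfers. The sketch below only prints an `rsync` command instead of running it. The function names `xfer_host` and `rsync_cmd`, the user `demo`, and the paths in the usage comment are illustrative assumptions.

```shell
# xfer_host CLUSTER
# Prints the dedicated transfer server for a cluster listed in the table.
xfer_host() {
    case "$1" in
        crane|tusker|attic) printf '%s-xfer.unl.edu\n' "$1" ;;
        *) echo "unknown cluster: $1" >&2; return 1 ;;
    esac
}

# rsync_cmd USER CLUSTER LOCAL_PATH REMOTE_PATH
# Prints (does not execute) an rsync command that copies LOCAL_PATH to
# REMOTE_PATH through the cluster's transfer server.
rsync_cmd() {
    host="$(xfer_host "$2")" || return 1
    printf 'rsync -av %s %s@%s:%s\n' "$3" "$1" "$host" "$4"
}

# Example (hypothetical user and paths):
#   rsync_cmd demo crane results/ /work/group/demo/results/
```

Printing the command first makes it easy to check the host and paths before starting a large transfer.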
1. [HCC-DOCS](index.html)
2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
3. [HCC Documentation](HCC-Documentation_332651.html)
4. [Handling Data](Handling-Data_332256.html)
<span id="title-text"> HCC-DOCS : Integrating Box with HCC </span>
==================================================================
Created by <span class="author"> Derek Weitzel</span>, last modified by
<span class="editor"> Adam Caprez</span> on Oct 11, 2018
+++
title = "Integrating Box with HCC"
description = "How to integrate Box with HCC"
weight = 30
+++
UNL has come to an arrangement
with <a href="https://www.box.com/" class="external-link">Box.com</a> to
......@@ -17,221 +12,81 @@ results when the job has completed.  Combined with
<a href="https://sites.box.com/sync4/" class="external-link">Box Sync</a>,
the uploaded files can be sync'd to your laptop or desktop upon job
completion. The upload and download speed of Box is about 20 to 30 MB/s
in good network traffic conditions. There are two programs that can be
used to transfer files to/from Box - `cadaver` or `lftp`. Instructions
are provided for both options
Step-by-step guide for Lftp
---------------------------
1. Create an external password for Box as described in steps 1 and 2
in the Cadaver instructions below.
2. Load the `lftp` module:
**Load the lftp module**
``` syntaxhighlighter-pre
module load lftp
```
3. Connect to Box using your full email as the username and external
password you created:
**Connect to Box**
``` syntaxhighlighter-pre
lftp -u <username>,<password> ftps://ftp.box.com
```
4. Test the connection by running the `ls` command. You should see a
listing of your Box files. Assuming it works, add a bookmark named
"box" to use when connecting later:
**Add lftp bookmark**
``` syntaxhighlighter-pre
lftp demo2@unl.edu@ftp.box.com:/> bookmark add box
```
5. Exit `lftp` by typing `quit`. To reconnect later, use bookmark
name:
**Connect using bookmark name**
``` syntaxhighlighter-pre
lftp box
```
6. To upload or download files, use the `get` and `put` commands. For
example:
**Transferring files**
``` syntaxhighlighter-pre
[demo@login.crane ~]$ lftp box
lftp demo2@unl.edu@ftp.box.com:/> put myfile.txt
lftp demo2@unl.edu@ftp.box.com:/> get my_other_file.txt
```
7. To download directories, use the `mirror` command. To upload
directories, use the `mirror` command with the `-R` option. For
example, to download a directory named `my_box_dir` to your current
directory:
**Download a directory from Box**
``` syntaxhighlighter-pre
[demo@login.crane ~]$ lftp box
lftp demo2@unl.edu@ftp.box.com:/> mirror my_box_dir
```
To upload a directory named `my_hcc_dir` to Box, use `mirror` with
the `-R` option:
**Upload a directory to Box**
``` syntaxhighlighter-pre
[demo@login.crane ~]$ lftp box
lftp demo2@unl.edu@ftp.box.com:/> mirror -R my_hcc_dir
```
8. Lftp also supports using scripts to transfer files. This can be
used to automatic downloading or uploading files during jobs. <span
style="color: rgb(0,0,0);">For example, create a file called
"transfer.sh" with the following lines:</span>
**transfer.sh**
``` syntaxhighlighter-pre
open box
get some_input_file.tar.gz
put my_output_file.tar.gz
```
To run this script, do:
**Run transfer.sh**
``` syntaxhighlighter-pre
module load lftp
lftp -f transfer.sh
```
Step-by-step guide for Cadaver
------------------------------
1. You need to create your UNL
<a href="http://Box.com" class="external-link">Box.com</a> account
<a href="http://box.unl.edu/" class="external-link">here</a>.
2. Since we are going to be using
<a href="https://en.wikipedia.org/wiki/WebDAV" class="external-link">webdav</a> protocol
to access your
<a href="http://Box.com" class="external-link">Box.com</a> storage,
you need to create an **External Password**. In the
<a href="http://Box.com" class="external-link">Box.com</a>
interface, you can create it
at **<a href="https://unl.app.box.com/settings" class="external-link">Account Settings</a>** &gt; **Create
External Password.
<span
class="confluence-embedded-file-wrapper confluence-embedded-manual-size"><img src="assets/images/8192521/8126683.png" class="confluence-embedded-image" width="747" height="185" /></span>**
3. Create a
<a href="http://www.mavetju.org/unix/netrc.php" class="external-link">.netrc</a>
file in order to automatically login to your box account without
typing the password. The file needs to be in your home directory,
ie `~/.netrc`. You can easily create this file using the nano text
editor by using the command:
``` syntaxhighlighter-pre
nano ~/.netrc
```
The file should contain the following lines:
``` syntaxhighlighter-pre
machine dav.box.com
login <box_username>@unl.edu
password <external_password>
```
Once you have typed or pasted these lines into the file, press
CTRL-X to exit. Follow the prompts to save the file as `.netrc`.
4. Be sure to have the correct permissions on the file. You can change
the permissions with the command:
``` syntaxhighlighter-pre
$ chmod 600 ~/.netrc
```
5. Try out the webdav client by issuing the command:
``` syntaxhighlighter-pre
$ cadaver https://dav.box.com/dav
```
It should give you a prompt like:
``` syntaxhighlighter-pre
dav:/dav/>
```
Within this prompt, you can view files and navigate through the file
system using the usual Bash commands **cd** and **ls**. To download
files from Box, use the command:
``` syntaxhighlighter-pre
get <filename>
```
Or, alternately, to upload files to your Box, use:
``` syntaxhighlighter-pre
put <filename>
```
To exit the prompt, press **ctrl-d**
6. Within a submit script, you can upload and download files by using
commands such as:
``` syntaxhighlighter-pre
#!/bin/sh
#SBATCH ...
....
cat << EOF | cadaver https://dav.box.com/dav
get inputfile.txt
EOF
cat << EOF | cadaver https://dav.box.com/dav
put outputfile.txt
EOF
```
7. The files should automatically appear in your Box account, and be
sync'd to your computer if you have the
<a href="https://sites.box.com/sync4/" class="external-link">sync client</a>
installed.
Related articles
----------------
- <span class="icon aui-icon aui-icon-small aui-iconfont-page-default"
title="Page">Page:</span>
[Integrating Box with HCC](/display/HCCDOC/Integrating+Box+with+HCC)
- <span class="icon aui-icon aui-icon-small aui-iconfont-page-default"
title="Page">Page:</span>
[Handling Data](/display/HCCDOC/Handling+Data)
Attachments:
------------
<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" /> [Screen
Shot 2014-08-14 at 4.55.18 PM.png](attachments/8192521/8126683.png)
(image/png)
in good network traffic conditions. Users can use a tool called lftp to transfer files between HCC clusters and their Box accounts.
---
### Step-by-step guide for Lftp
1. You need to create your UNL [Box.com](https://www.box.com/) account [here](https://box.unl.edu/).
2. Since we are going to be using [webdav](https://en.wikipedia.org/wiki/WebDAV) protocol to access your [Box.com](https://www.box.com/) storage, you need to create an **External Password**. In the [Box.com](https://www.box.com/) interface, you can create it at **[Account Settings](https://unl.app.box.com/settings) > Create External Password.**
{{< figure src="/images/box_create_external_password.png" class="img-border" >}}
3. After logging into the cluster of your choice, load the `lftp` module by entering the command below at the prompt:
{{% panel theme="info" header="Load the lftp module" %}}
{{< highlight bash >}}
module load lftp
{{< /highlight >}}
{{% /panel %}}
4. Connect to Box using your full email as the username and external password you created:
{{% panel theme="info" header="Connect to Box" %}}
{{< highlight bash >}}
lftp -u <username>,<password> ftps://ftp.box.com
{{< /highlight >}}
{{% /panel %}}
5. Test the connection by running the `ls` command. You should see a listing of your Box files. Assuming it works, add a bookmark named "box" to use when connecting later:
{{% panel theme="info" header="Add lftp bookmark" %}}
{{< highlight bash >}}
lftp demo2@unl.edu@ftp.box.com:/> bookmark add box
{{< /highlight >}}
{{% /panel %}}
6. Exit `lftp` by typing `quit`. To reconnect later, use the bookmark name:
{{% panel theme="info" header="Connect using bookmark name" %}}
{{< highlight bash >}}
lftp box
{{< /highlight >}}
{{% /panel %}}
7. To upload or download files, use the `get` and `put` commands. For example:
{{% panel theme="info" header="Transferring files" %}}
{{< highlight bash >}}
[demo2@login.crane ~]$ lftp box
lftp demo2@unl.edu@ftp.box.com:/> put myfile.txt
lftp demo2@unl.edu@ftp.box.com:/> get my_other_file.txt
{{< /highlight >}}
{{% /panel %}}
8. To download directories, use the `mirror` command. To upload directories, use the `mirror` command with the `-R` option. For example, to download a directory named `my_box_dir` to your current directory:
{{% panel theme="info" header="Download a directory from Box" %}}
{{< highlight bash >}}
[demo2@login.crane ~]$ lftp box
lftp demo2@unl.edu@ftp.box.com:/> mirror my_box_dir
{{< /highlight >}}
{{% /panel %}}
To upload a directory named `my_hcc_dir` to Box, use `mirror` with the `-R` option:
{{% panel theme="info" header="Upload a directory to Box" %}}
{{< highlight bash >}}
[demo2@login.crane ~]$ lftp box
lftp demo2@unl.edu@ftp.box.com:/> mirror -R my_hcc_dir
{{< /highlight >}}
{{% /panel %}}
9. Lftp also supports using scripts to transfer files. This can be used to automatically download or upload files during jobs. For example, create a file called "transfer.sh" with the following lines:
{{% panel theme="info" header="transfer.sh" %}}
{{< highlight bash >}}
open box
get some_input_file.tar.gz
put my_output_file.tar.gz
{{< /highlight >}}
{{% /panel %}}
To run this script, do:
{{% panel theme="info" header="Run transfer.sh" %}}
{{< highlight bash >}}
module load lftp
lftp -f transfer.sh
{{< /highlight >}}
{{% /panel %}}
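Within a job, the download usually happens before the computation and the upload after it, so separate lftp script files for each direction can be convenient. The sketch below generates such a file. `write_lftp_script` and the file names `fetch.lftp` and `upload.lftp` are hypothetical, and the sketch assumes the `box` bookmark from the steps above exists and that file names contain no spaces.

```shell
# write_lftp_script SCRIPT_FILE ACTION FILE_NAME
# Writes an lftp command file that opens the "box" bookmark and then
# runs ACTION on FILE_NAME. ACTION must be "get" (download from Box)
# or "put" (upload to Box).
write_lftp_script() {
    case "$2" in
        get|put) ;;
        *) echo "action must be get or put" >&2; return 1 ;;
    esac
    printf 'open box\n%s %s\n' "$2" "$3" > "$1"
}

# Possible use inside a submit script (not run here):
#   module load lftp
#   write_lftp_script fetch.lftp get some_input_file.tar.gz
#   lftp -f fetch.lftp
#   ... run the job ...
#   write_lftp_script upload.lftp put my_output_file.tar.gz
#   lftp -f upload.lftp
```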
1. [HCC-DOCS](index.html)
2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
3. [HCC Documentation](HCC-Documentation_332651.html)
4. [Handling Data](Handling-Data_332256.html)
<span id="title-text"> HCC-DOCS : Preventing File Loss </span>
==============================================================
Created by <span class="author"> Adam Caprez</span>, last modified by
<span class="editor"> Carrie Brown</span> on Feb 09, 2018
+++
title = "Preventing File Loss"
description = "How to prevent file loss on HCC clusters"
weight = 60
+++
Each research group is allocated 50TB of storage in `/work` on HCC
clusters. With over 400 active groups, HCC does not have the resources
......@@ -23,6 +18,7 @@ needs. For truly robust file backups, we recommend combining multiple
methods. For example, use Git regularly along with manual backups to an
external hard-drive at regular intervals such as monthly or biannually.
---
### 1. Use your local machine:
If you have sufficient hard drive space, regularly back up your `/work`
......@@ -30,11 +26,11 @@ directories to your personal computer. To avoid filling up your personal
hard-drives, consider using an external drive that can easily be placed
in a fireproof safe or at an off-site location for an extra level of
protection. To do this, you can either use [Globus
Connect](https://hcc-docs.unl.edu/display/HCCDOC/Globus+Connect) or an
Connect]({{< relref "globus_connect" >}}) or an
SCP client, such
as <a href="https://cyberduck.io/" class="external-link">Cyberduck</a> or <a href="https://winscp.net/eng/index.php" class="external-link">WinSCP</a>.
For help setting up an SCP client, check out our [Quick Start
Guides](https://hcc-docs.unl.edu/display/HCCDOC/Quick+Start+Guides).
Guides]({{< relref "/quickstarts" >}}).
For those worried about personal hard drive crashes, UNL
offers <a href="http://nsave.unl.edu/" class="external-link">the backup service NSave</a>.
......@@ -43,19 +39,20 @@ automatically backup selected files from their personal machine.
Benefits:
- Gives you full control over what is backed up and when.
- Gives you full control over what is backed up and when.
- Doesn't require the use of third party servers (when using SCP
clients).
- Take advantage of our high speed data transfers (10 Gb/s) when using
Globus Connect or [set up your SCP client to use our dedicated high
speed transfer
servers](https://hcc-docs.unl.edu/display/HCCDOC/High-Speed+Data+Transfers)
servers]({{< relref "high_speed_data_transfers" >}})
Limitations:
- The amount you can back up is limited by available hard-drive space.
- Manual backups of many files can be time consuming.
---
### 2. Use Git to preserve files and revision history:
Git is a revision control service which can be run locally or can be
......@@ -84,12 +81,13 @@ Limitations:
tracking files over 1GB in size can be time consuming and lead to
errors when using other repository hosts.
---
### 3. Use Attic:
HCC offers
long-term, <a href="https://en.wikipedia.org/wiki/Nearline_storage" class="external-link">near-line</a> data
storage
through [Attic](https://hcc-docs.unl.edu/display/HCCDOC/Using+Attic).
through [Attic]({{< relref "using_attic" >}}).
HCC users with an existing account
can <a href="http://hcc.unl.edu/attic" class="external-link">apply for an Attic account</a> for
a <a href="http://hcc.unl.edu/priority-access-pricing" class="external-link">small annual fee</a> that
......@@ -102,14 +100,15 @@ Benefits:
layer against file loss.
- No limits on individual or total file sizes.
- High speed data transfers between Attic and the clusters when using
Globus Connect and [HCC's high-speed data
servers](https://hcc-docs.unl.edu/display/HCCDOC/High-Speed+Data+Transfers).
[Globus Connect]({{< relref "globus_connect" >}}) and [HCC's high-speed data
servers]({{< relref "high_speed_data_transfers" >}}).
Limitations:
- Backups must be done manually which can be time consuming. Setting
up automated scripts can help speed up this process.
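One way such an automated script might start is by bundling a directory into a dated archive before it is copied to Attic. The sketch below is an assumption about how that could look, not an HCC-provided tool. `make_backup_archive` is a hypothetical helper, and the `backups/` destination in the comment is illustrative.

```shell
# make_backup_archive SOURCE_DIR DEST_DIR
# Creates DEST_DIR/<name>-YYYYMMDD.tar.gz containing SOURCE_DIR and
# prints the path of the new archive.
make_backup_archive() {
    src="${1%/}"                      # drop one trailing slash, if any
    name="$(basename "$src")"
    archive="$2/${name}-$(date +%Y%m%d).tar.gz"
    # -C switches to the parent directory so the archive stores
    # paths relative to it (e.g. "project/result.txt").
    tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
    printf '%s\n' "$archive"
}

# Possible follow-up (not run here; destination path is hypothetical):
#   archive="$(make_backup_archive "$WORK/project" "$WORK")"
#   rsync -av "$archive" <username>@attic.unl.edu:backups/
```

A single compressed archive is generally quicker to transfer than many small files, at the cost of having to unpack it to retrieve an individual file.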
---
### 4. Use a cloud-based service, such as Box:
Many of us are familiar with services such as Google Drive, Dropbox, Box
......@@ -117,8 +116,8 @@ and OneDrive. These cloud-based services provide a convenient portal for
accessing your files from any computer. NU offers OneDrive and Box
services to all students, staff and faculty. But did you know that you
can link your Box account to HCC’s clusters to provide quick and easy
access to files stored there? [Follow a few set-up
steps](https://hcc-docs.unl.edu/display/HCCDOC/Integrating+Box+with+HCC) and
access to files stored there? [Follow a few set-up
steps]({{< relref "integrating_box_with_hcc" >}}) and
you can add files to and access files stored in your Box account
directly from HCC clusters. Set up your submit scripts to automatically
upload results as they are generated or use it interactively to store
......@@ -137,7 +136,8 @@ Limitations:
- Box has individual file size limitations, larger files will need to
be backed up using an alternate method.
### 5. Copy important files to /home:
---
### 5. Copy important files to `/home`:
While `/work` files and directories are not backed up, files and
directories in `/home` are backed up on a daily basis. Due to the
......@@ -149,7 +149,7 @@ the cluster.
Benefits:
- No need to make manual backups. home files are automatically backed
- No need to make manual backups. `/home` files are automatically backed
up daily.
- Files in `/home` are not subject to the 6 month purge policy that
exists on `/work`.
......
1. [HCC-DOCS](index.html)
2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
3. [HCC Documentation](HCC-Documentation_332651.html)
4. [Handling Data](Handling-Data_332256.html)
<span id="title-text"> HCC-DOCS : Using Attic </span>
=====================================================
Created by <span class="author"> Zhongtao Zhang</span>, last modified by
<span class="editor"> Natasha Pavlovikj</span> on Oct 19, 2018
+++
title = "Using Attic"
description = "How to store data on Attic"
weight = 20
+++
For users who need long-term storage for large amounts of data, HCC
provides an economical solution called Attic. Attic is a reliable
<a href="https://en.wikipedia.org/wiki/Nearline_storage" class="external-link">near-line data archive</a> storage
system. The files in Attic can be accessed and shared from anywhere
using [Globus
Connect](https://hcc-docs.unl.edu/display/ADMIN/Globus+Connect+Usage),
Connect]({{< relref "globus_connect" >}}),
with a fast 10Gb/s link. Also, the data in Attic is backed up between
our Lincoln and Omaha facilities to ensure high availability and
disaster tolerance. The data and user activities on Attic are subject to
our
<a href="http://hcc.unl.edu/hcc-policies" class="external-link">HCC Policies</a>.
Accounts and Cost
-----------------
---
### Accounts and Cost
To use Attic you will first need an
<a href="https://hcc.unl.edu/new-user-request" class="external-link">HCC account</a>, and
then you may apply for an
<a href="http://hcc.unl.edu/attic" class="external-link">Attic account</a>.
then you may request an
<a href="http://hcc.unl.edu/attic" class="external-link">Attic allocation</a>.
We charge a small fee per TB per year, but it is cheaper than most
commercial cloud storage solutions. For the user application form and
cost, please see the
<a href="http://hcc.unl.edu/attic" class="external-link">HCC Attic page</a>.
Transfer Files Using Globus Connect
-----------------------------------
---
### Transfer Files Using Globus Connect
The easiest and fastest way to access Attic is via Globus. You can
transfer files between your computer, our clusters ($HOME and $WORK on
Crane, Tusker, and Sandhills), and Attic. Here is a detailed tutorial on
how to set up and use [Globus Connect](Globus-Connect_6357013.html). For
transfer files between your computer, our clusters ($HOME, $WORK, and $COMMON on
Crane and Tusker), and Attic. Here is a detailed tutorial on
how to set up and use [Globus Connect]({{< relref "globus_connect" >}}). For
Attic, use the Globus Endpoint **hcc\#attic**. Your Attic files are
located at `~`, which is a shortcut
for `/attic/<groupname>/<username>`.
......@@ -47,24 +42,22 @@ for `/attic/<groupname>/<username>`.
group, you should explicitly set the path to
`/attic/<supplementary_groupname>/`. If you don't do that, by
default the endpoint will try to place you in your primary group's Attic
path, which access will be denied if the primary group doesn't have
Attic allocation.*
Transfer Files Using SCP/SFTP/RSYNC
-----------------------------------
The transfer server for Attic storage is named `attic.unl.edu`.
path, to which access will be denied if the primary group doesn't have an Attic allocation.*
**SCP Example**
---
### Transfer Files Using SCP/SFTP/RSYNC
``` syntaxhighlighter-pre
$ scp /source/file <username>@attic.unl.edu:~/destination/file
```
The transfer server for Attic storage is `attic.unl.edu` (or `attic-xfer.unl.edu`).
**SFTP example**
{{% panel theme="info" header="SCP Example" %}}
{{< highlight bash >}}
scp /source/file <username>@attic.unl.edu:~/destination/file
{{< /highlight >}}
{{% /panel %}}
``` syntaxhighlighter-pre
$ sftp <username>@attic.unl.edu
{{% panel theme="info" header="SFTP Example" %}}
{{< highlight bash >}}
sftp <username>@attic.unl.edu
Password:
Duo two-factor login for <username>
Connected to attic.unl.edu.
......@@ -72,38 +65,35 @@ sftp> pwd
Remote working directory: /attic/<groupname>/<username>
sftp> put source/file destination/file
sftp> exit
```
{{< /highlight >}}
{{% /panel %}}
**RSYNC Example**
``` syntaxhighlighter-pre
{{% panel theme="info" header="RSYNC Example" %}}
{{< highlight bash >}}
# local to remote rsync command
$ rsync -avz /local/source/path <username>@attic.unl.edu:remote/destination/path
rsync -avz /local/source/path <username>@attic.unl.edu:remote/destination/path
# remote to local rsync command
$ rsync -avz <username>@attic.unl.edu:remote/source/path /local/destination/path
```
rsync -avz <username>@attic.unl.edu:remote/source/path /local/destination/path
{{< /highlight >}}
{{% /panel %}}
You can also access your data on Attic using our three [high-speed
transfer servers](High-Speed-Data-Transfers_6946857.html) if you prefer.
You can also access your data on Attic using our [high-speed
transfer servers]({{< relref "high_speed_data_transfers" >}}) if you prefer.
Simply use scp or sftp to connect to one of the transfer servers, and
your directory is mounted at `/attic/<groupname>/<username>`.
Check Attic Usage
-----------------
---
### Check Attic Usage
The usage and quota information for your group and the users in the
group are stored in a filed named "disk\_usage.txt" in your group's
directory (`/attic/<groupname>`). You can use either Globus Connect or
group are stored in a file named "disk\_usage.txt" in your group's
directory (`/attic/<groupname>`). You can use either [Globus Connect]({{< relref "globus_connect" >}}) or
scp to download it. Your usage and expiration are also shown in the web
interface (see below).
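
As a minimal sketch of the scp route described above, the helpers below build the path to the usage report and download it. The function names `attic_usage_path` and `fetch_attic_usage` are hypothetical and exist only for illustration; the host name and the `disk_usage.txt` location are taken from the text above.

```bash
#!/bin/bash

# Hypothetical helper: print the path of the usage report for a group.
attic_usage_path() {
    local group="$1"
    printf '/attic/%s/disk_usage.txt\n' "$group"
}

# Hypothetical helper: copy the group's usage report into the current
# directory. Expect the usual password and Duo prompts when it connects.
fetch_attic_usage() {
    local user="$1" group="$2"
    scp "${user}@attic.unl.edu:$(attic_usage_path "$group")" .
}
```

After sourcing these functions, a call such as `fetch_attic_usage <username> <groupname>` should leave a copy of `disk_usage.txt` in the current directory.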
Use the web interface
---------------------
---
### Use the web interface
For convenience, a web interface is also provided. Simply go to
<a href="https://attic.unl.edu" class="external-link">https://attic.unl.edu</a>
......
+++
title = "Using the /common File System"
description = "How to use HCC's /common file system"
weight = 70
+++
<span id="title-text"> HCC-DOCS : Using the /common File System </span>
=======================================================================
Created by <span class="author"> David Swanson</span>, last modified on
Sep 10, 2018
<span style="color: rgb(0,0,0);">Quick overview: </span>
--------------------------------------------------------
### Quick overview:
- Connected read/write to all HCC HPC cluster resources – you will see
the same files "in common" on any HCC cluster (i.e. crane, tusker,
sandhills)
the same files "in common" on any HCC cluster (i.e. Crane and Tusker).
- 30 TB Per-group quota at no charge – larger quota available for
$100/TB/year
- No backups are made! Don't be silly! Precious data should still be
......@@ -23,58 +16,53 @@ Sep 10, 2018
disk failure or user error, they won't be removed by the purge
scripts.
Accessing common
---
### Accessing common
<span
class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span>
Your /common directory can be accessed via the `$COMMON` environment
variable, i.e. `cd $COMMON.`
{{% notice info %}}
Your `/common` directory can be accessed via the `$COMMON` environment variable, i.e. `cd $COMMON`.
{{% /notice %}}
How should I use /common?
---
### How should I use `/common`?
- Store things that are routinely needed on multiple clusters
- /common is a network attached FS, so limit the number of files per
directory (1 million files in a directory is a very bad idea)
- I**f you are accessing /common for a job, you will need to add a
- Store things that are routinely needed on multiple clusters.
- `/common` is a network attached FS, so limit the number of files per
directory (1 million files in a directory is a very bad idea).
- **If you are accessing `/common` for a job, you will need to add a
line to your submission script!**
- We have each user check out a "license" to access /common for a
given job
- this allows us to know exactly who is accessing it, and for how
- We have each user check out a "license" to access `/common` for a
given job.
- This allows us to know exactly who is accessing it, and for how
long, in case of the need for a shut down so we can try to avoid
killing jobs whenever possible
- it also allows us to limit how many jobs can hammer this single
filesystem so it remains healthy and happy
<span style="color: rgb(0,0,0);">To gain access to the path on worker
nodes, a job must be </span><span style="color: rgb(0,0,0);">submitted
with the following SLURM directive:</span>
**SLURM Submit File**
``` syntaxhighlighter-pre
killing jobs whenever possible.
- It also allows us to limit how many jobs can hammer this single
filesystem so it remains healthy and happy.
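
To follow the advice above about limiting the number of files per directory, it can help to check how many entries a directory already holds before adding more. The snippet below is a small sketch; the helper name `count_entries` is hypothetical, and it assumes a `find` that supports `-mindepth` and `-maxdepth` (GNU and BSD `find` both do).

```bash
#!/bin/bash

# Hypothetical helper: count the entries directly inside a directory
# (files and subdirectories, not recursive). File names that contain
# newlines would be counted more than once.
count_entries() {
    local dir="$1"
    find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}
```

For example, `count_entries "$COMMON"` reports how many entries sit at the top level of your `/common` directory.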
To gain access to the path on worker
nodes, a job must be submitted
with the following SLURM directive:
{{% panel theme="info" header="**SLURM Submit File**" %}}
{{< highlight bash >}}
#SBATCH --licenses=common
```
{{< /highlight >}}
{{% /panel %}}
<span style="color: rgb(0,0,0);">If a job lacks the above Slurm
directive, /common will not be accessible from </span><span
style="color: rgb(0,0,0);">the</span><span
style="color: rgb(0,0,0);"> worker nodes. (<span
style="">Briefly</span>, this construct will allow us
If a job lacks the above SLURM directive, `/common` will not be accessible from the worker nodes. (Briefly, this construct will allow us
to quickly do maintenance on a single cluster without having to unmount
$COMMON </span><span style="color: rgb(0,0,0);">from</span><span
style="color: rgb(0,0,0);"> all HCC resources). </span>
`$COMMON` from all HCC resources).
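
For context, a complete submit file using this directive might look like the sketch below. Only the `--licenses=common` line comes from this page; the job name, the time limit, and the `common_status` helper are placeholders chosen for illustration.

```bash
#!/bin/bash
#SBATCH --job-name=common-example   # placeholder job name
#SBATCH --time=00:10:00             # placeholder walltime
#SBATCH --licenses=common           # check out the /common license for this job

# Hypothetical helper: report whether $COMMON is visible on this node.
common_status() {
    if [ -d "${COMMON:-}" ]; then
        echo "accessible: $COMMON"
    else
        echo "not accessible"
        return 1
    fi
}

# Print the status; if /common is missing, remind the user about the directive.
common_status || echo "Add '#SBATCH --licenses=common' to the submit file." >&2
```

Submitting this with `sbatch` should print the path of your `/common` directory in the job output when the license was granted.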
<span style="color: rgb(0,0,0);">What should I **not** do when using /common? </span>
-------------------------------------------------------------------------------------
---
### What should I **not** do when using `/common`?
- Don't use it for high I/O work flows, use /work for that – /common
should mostly be used to read largely static files or data
- Don't use it for high-I/O workflows; use `/work` for that. `/common`
should mostly be used to read largely static files or data.
- Do not expect your compiled program binaries to work everywhere!
/common is available on machines with different cpu architecture,
`/common` is available on machines with different CPU architectures,
different network connections, and so on. *caveat emptor!*
- Serial codes will not be optimized for all clusters
- Serial codes will not be optimized for all clusters.
- MPI codes, in particular, will likely not work unless recompiled
for each cluster
- if you use `module `things should be just fine!
for each cluster.
- If you use `module` things should be just fine!
static/images/35325560.png

990 KiB

static/images/box_create_external_password.png

6.63 KiB
