diff --git a/content/guides/handling_data/_index.md b/content/guides/handling_data/_index.md index f562d43d776835394afe586c8e04621bc9c87f6c..793dc09ab41daf7fc338fc72092f0e8973bbc8f8 100644 --- a/content/guides/handling_data/_index.md +++ b/content/guides/handling_data/_index.md @@ -4,33 +4,20 @@ description = "How to work with and transfer data to/from HCC resources." weight = "30" +++ -<span id="title-text"> HCC-DOCS : Handling Data </span> -======================================================= - -Created by <span class="author"> Derek Weitzel</span>, last modified by -<span class="editor"> Carrie Brown</span> on Sep 18, 2018 - -<span -class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span> - -HCC currently has no storage that is suitable for HIPAA or other PID -data sets. Users are not permitted to store such data on HCC machines. +{{% panel theme="danger" header="**Sensitive and Protected Data**" %}}HCC currently has *no storage* that is suitable for **HIPAA** or other **PID** data sets. Users are not permitted to store such data on HCC machines.{{% /panel %}} All HCC machines have three separate areas for every user to store data, each intended for a different purpose. In addition, we have a transfer -service that utilizes [Globus Connect](Globus-Connect_6357013.html). - -<span -class="confluence-embedded-file-wrapper image-center-wrapper confluence-embedded-manual-size"><img src="assets/images/332256/35325560.png" class="confluence-embedded-image image-center" width="1000" /></span> +service that utilizes [Globus Connect]({{< relref "globus_connect" >}}). +{{< figure src="/images/35325560.png" >}} -Home Directory --------------- - -<span -class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span> +--- +### Home Directory +{{% notice info %}} You can access your home directory quickly using the $HOME environmental variable (i.e. '`cd $HOME'`). +{{% /notice %}} Your home directory (i.e. 
`/home/[group]/[username]`) is meant for items that take up relatively small amounts of space. For example: source @@ -40,28 +27,25 @@ for the purposes of best-effort disaster recovery. This space is not intended as an area for I/O to active jobs. **/home** is mounted **read-only** on cluster worker nodes to enforce this policy. -Common Directory ----------------- - -<span -class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span> +--- +### Common Directory +{{% notice info %}} You can access your common directory quickly using the $COMMON environmental variable (i.e. '`cd $COMMON`') +{{% /notice %}} The common directory operates similarly to work and is mounted with **read and write capability to worker nodes all HCC Clusters**. This -means that any files stored in common can be accessed from Crane, Tusker -and Sandhills making this directory ideal for items that need to be +means that any files stored in common can be accessed from Crane and Tusker, making this directory ideal for items that need to be accessed from multiple clusters such as reference databases and shared data files. -<span -class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span> - +{{% notice warning %}} Common is not designed for heavy I/O usage. Please continue to use your work directory for active job output to ensure the best performance of your jobs. +{{% /notice %}} Quotas for common are **30 TB per group**, with larger quotas available for purchase if needed. However, files stored here will **not be backed @@ -69,23 +53,17 @@ up** and are **not subject to purge** at this time. Please continue to backup your files to prevent irreparable data loss. 
Additional information on using the common directories can be found in -the documentation on [Using the /common File System](30444241.html) +the documentation on [Using the /common File System]({{< relref "using_the_common_file_system" >}}) -High Performance Work Directory -------------------------------- - -<span -class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span> +--- +### High Performance Work Directory +{{% notice info %}} You can access your work directory quickly using the $WORK environmental variable (i.e. '`cd $WORK'`). +{{% /notice %}} -<span -class="aui-icon aui-icon-small aui-iconfont-error confluence-information-macro-icon"></span> - -The `/work` directories are **not backed up**. Irreparable data loss is -possible with a mis-typed command. See [Preventing File -Loss](Preventing-File-Loss_29065313.html) for strategies to avoid this. +{{% panel theme="danger" header="**File Loss**" %}}The `/work` directories are **not backed up**. Irreparable data loss is possible with a mis-typed command. See [Preventing File Loss]({{< relref "preventing_file_loss" >}}) for strategies to avoid this.{{% /panel %}} Every user has a corresponding directory under /work using the same naming convention as `/home` (i.e. `/work/[group]/[username]`). We @@ -93,11 +71,11 @@ encourage all users to use this space for I/O to running jobs. This directory can also be used when larger amounts of space are temporarily needed. There is a **50TB per group quota**; space in /work is shared among all users. It should be treated as short-term scratch space, and -**is not backed up**. <span style="color: rgb(255,0,0);"><span -style="color: rgb(0,0,0);">Please use the `hcc-du` command to check your +**is not backed up**. 
**Please use the `hcc-du` command to check your own and your group's usage, and back up and clean up your files at -reasonable intervals in $WORK.</span></span> +reasonable intervals in $WORK.** +--- ### Purge Policy HCC has a **purge policy on /work** for files that become dormant. @@ -113,58 +91,39 @@ list the matching files for the user. The candidate list can also be accessed at the following path:` /lustre/purge/current/${USER}.list`. This list is updated twice a week, on Mondays and Thursdays. -<span -class="aui-icon aui-icon-small aui-iconfont-error confluence-information-macro-icon"></span> - -/work is intended for recent job output and not long term storage. -Evidence of circumventing the purge policy by users will result in -consequences including account lockout. - - +{{% notice warning %}} +`/work` is intended for recent job output and not long term storage. Evidence of circumventing the purge policy by users will result in consequences including account lockout. +{{% /notice %}} If you have space requirements outside what is currently provided, please email <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> and we will gladly discuss alternatives. -[Attic](Using-Attic_11635580.html) ----------------------------------- +--- +### [Attic]({{< relref "using_attic" >}}) Attic is a near line archive available for purchase at HCC. Attic provides reliable large data storage that is designed to be more reliable then `/work`, and larger than `/home`. Access to Attic is done -through [Globus Connect](Globus-Connect_6357013.html). +through [Globus Connect]({{< relref "globus_connect" >}}). More details on Attic can be found on HCC's <a href="https://hcc.unl.edu/attic" class="external-link">Attic</a> website. 
-<span style="color: rgb(0,0,0);line-height: 1.4285715;font-size: 20.0px;">[Globus Connect](Globus-Connect_6357013.html)</span> ------------------------------------------------------------------------------------------------------------------------------- +--- +### [Globus Connect]({{< relref "globus_connect" >}}) For moving large amounts of data into or out of HCC resources, users are highly encouraged to consider using [Globus -Connect](Globus-Connect_6357013.html). +Connect]({{< relref "globus_connect" >}}). -Using Box ---------- +--- +### Using Box You can use your [UNL -Box.com](Integrating-Box-with-HCC_8192521.html) account to download and +Box.com]({{< relref "integrating_box_with_hcc" >}}) account to download and upload files from any of the HCC clusters. - - -Attachments: ------------- - -<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" /> -[HCCStorageOptions\_cb\_edits.pdf](attachments/332256/30444364.pdf) -(application/pdf) -<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" /> -[HCCStorageOptions\_cb\_edits.png](attachments/332256/30444365.png) -(image/png) -<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" /> -[StorageOptions.png](attachments/332256/35325560.png) (image/png) - diff --git a/content/guides/handling_data/data_for_unmc_users_only.md b/content/guides/handling_data/data_for_unmc_users_only.md index 667dab8bace3619b7b05ec187c0c8d855fc57c95..e22b259214d1b06f86427fcc7dfe6b3576b26605 100644 --- a/content/guides/handling_data/data_for_unmc_users_only.md +++ b/content/guides/handling_data/data_for_unmc_users_only.md @@ -1,26 +1,18 @@ -1. [HCC-DOCS](index.html) -2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html) -3. [HCC Documentation](HCC-Documentation_332651.html) -4. 
[Handling Data](Handling-Data_332256.html) ++++ +title = "Data for UNMC Users Only" +description= "Data storage options for UNMC users" +weight = 50 ++++ -<span id="title-text"> HCC-DOCS : Data for UNMC users only </span> -================================================================== - -Created by <span class="author"> Mako Furukawa Furukawa</span>, last -modified on Apr 07, 2014 - -<span -class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span> - -HCC currently has no storage that is suitable for HIPAA or other PID +{{% panel theme="danger" header="Sensitive and Protected Data" %}} HCC currently has no storage that is suitable for HIPAA or other PID data sets. Users are not permitted to store such data on HCC machines. - Tusker and Crane have a special directory, only for UNMC users. Please note that this filesystem is still not suitable for HIPAA or other PID data sets. +{{% /panel %}} -Transferring files to this machine from UNMC. ---------------------------------------------- +--- +### Transferring files to this machine from UNMC. You will need to email us at <a href="mailto:hcc-support@unl.edu" class="external-link">hcc-support@unl.edu</a> to @@ -28,8 +20,8 @@ gain access to this machine. Once you do, you can sftp to 10.14.250.1 and upload your files. Note that sftp is your only option. You may use different sftp utilities depending on your platform you are logging in from. Email us if you need help with this. Once you are logged in, you -should be at /volumes/UNMC1ZFS/\[group\]/\[username\], or -/home/\[group\]/\[username\]. Both are the same location and you will be +should be at `/volumes/UNMC1ZFS/[group]/[username]`, or +`/home/[group]/[username]`. Both are the same location and you will be allowed to write files there. For Windows, learn more about logging in and uploading files @@ -38,17 +30,14 @@ For Windows, learn more about logging in and uploading files Using your uploaded files on Tusker or Crane. 
--------------------------------------------- -<span style="color: rgb(51,51,51);"><span -style="font-size: 14.0px;line-height: 1.4285715;">Using your -uploaded </span><span -style="font-size: 14.0px;line-height: 20.0px;">files</span><span -style="font-size: 14.0px;line-height: 1.4285715;"> is easy. Just go to -/shared/unmc1/\[group\]/\[username\] and your files will be in the same +Using your +uploaded files is easy. Just go to +`/shared/unmc1/[group]/[username]` and your files will be in the same place. You may notice that the directory is not available at times. This is because the unmc1 directory is automounted. This means, if you try to -go to the directory, it will show up. Just "cd" to -/shared/unmc1/\[group\]/\[username\] and all of the files will be -there.</span></span> +go to the directory, it will show up. Just "`cd`" to +`/shared/unmc1/[group]/[username]` and all of the files will be +there. If you have space requirements outside what is currently provided, please diff --git a/content/guides/handling_data/high_speed_data_transfers.md b/content/guides/handling_data/high_speed_data_transfers.md index 7108206f30e8d1050b6d58d63d0162acb6cc3e20..95cce2a440ce9853445a855c682cc1c5f6b3f96d 100644 --- a/content/guides/handling_data/high_speed_data_transfers.md +++ b/content/guides/handling_data/high_speed_data_transfers.md @@ -1,46 +1,28 @@ -1. [HCC-DOCS](index.html) -2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html) -3. [HCC Documentation](HCC-Documentation_332651.html) -4. 
[Handling Data](Handling-Data_332256.html) - -<span id="title-text"> HCC-DOCS : High-Speed Data Transfers </span> -=================================================================== - -Created by <span class="author"> Emelie Harstad</span>, last modified by -<span class="editor"> Josh Samuelson</span> on May 17, 2018 - -Tusker, Crane and Sandhills each have a dedicated transfer server with -10 Gb/s connectivity (Sandhills currently limited to 1 Gb/s) that allows ++++ +title = "High Speed Data Transfers" +description = "How to transfer files directly from the transfer servers" +weight = 10 ++++ + +Crane, Tusker, and Attic each have a dedicated transfer server with +10 Gb/s connectivity that allows for faster data transfers than the login nodes. With [Globus -Connect](https://hcc-docs.unl.edu/display/HCCDOC/Globus+Connect), users +Connect]({{< relref "globus_connect" >}}), users can take advantage of this connection speed when making large/cumbersome transfers. -<span style="line-height: 1.4285715;"> -</span> - -<span style="line-height: 1.4285715;">Those who prefer scp, sftp or +Those who prefer scp, sftp or rsync clients can also benefit from this high-speed connectivity by -using these dedicated servers for data transfers:</span> - -For Tusker transfers, use: - -For Crane transfers, use: +using these dedicated servers for data transfers: -For Sandhills Transfers, use: - -`tusker-xfer.unl.edu` - -`crane-xfer.unl.edu` - -`sandhills-xfer.unl.edu` - -<span -class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span> +Cluster | Transfer server +----------|---------------------- +Crane | `crane-xfer.unl.edu` +Tusker | `tusker-xfer.unl.edu` +Attic | `attic-xfer.unl.edu` +{{% notice info %}} Because the transfer servers are login-disabled, third-party transfers -between `tusker-xfer,` `crane-xfer` and `sandhills-xfer` must be done -via [Globus -Connect](https://hcc-docs.unl.edu/display/HCCDOC/Globus+Connect). 
- +between `crane-xfer`, `tusker-xfer`, and `attic-xfer` must be done via [Globus Connect]({{< relref "globus_connect" >}}). +{{% /notice %}} diff --git a/content/guides/handling_data/integrating_box_with_hcc.md b/content/guides/handling_data/integrating_box_with_hcc.md index 868d6b28c864d76e8ef5579ec7b085d283b2facd..480f295c4b0a9b110b710443d263ddd0241f614f 100644 --- a/content/guides/handling_data/integrating_box_with_hcc.md +++ b/content/guides/handling_data/integrating_box_with_hcc.md @@ -1,13 +1,8 @@ -1. [HCC-DOCS](index.html) -2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html) -3. [HCC Documentation](HCC-Documentation_332651.html) -4. [Handling Data](Handling-Data_332256.html) - -<span id="title-text"> HCC-DOCS : Integrating Box with HCC </span> -================================================================== - -Created by <span class="author"> Derek Weitzel</span>, last modified by -<span class="editor"> Adam Caprez</span> on Oct 11, 2018 ++++ +title = "Integrating Box with HCC" +description = "How to integrate Box with HCC" +weight = 30 ++++ UNL has come to an arrangement with <a href="https://www.box.com/" class="external-link">Box.com</a> to @@ -17,221 +12,81 @@ results when the job has completed. Combined with <a href="https://sites.box.com/sync4/" class="external-link">Box Sync</a>, the uploaded files can be sync'd to your laptop or desktop upon job completion. The upload and download speed of Box is about 20 to 30 MB/s -in good network traffic conditions. There are two programs that can be -used to transfer files to/from Box - `cadaver` or `lftp`. Instructions -are provided for both options - -Step-by-step guide for Lftp ---------------------------- - -1. Create an external password for Box as described in steps 1 and 2 - in the Cadaver instructions below. -2. Load the `lftp` module: - - **Load the lftp module** - - ``` syntaxhighlighter-pre - module load lftp - ``` - -3. 
Connect to Box using your full email as the username and external - password you created: - - **Connect to Box** - - ``` syntaxhighlighter-pre - lftp -u <username>,<password> ftps://ftp.box.com - ``` - -4. Test the connection by running the `ls` command. You should see a - listing of your Box files. Assuming it works, add a bookmark named - "box" to use when connecting later: - - **Add lftp bookmark** - - ``` syntaxhighlighter-pre - lftp demo2@unl.edu@ftp.box.com:/> bookmark add box - ``` - -5. Exit `lftp` by typing `quit`. To reconnect later, use bookmark - name: - - **Connect using bookmark name** - - ``` syntaxhighlighter-pre - lftp box - ``` - -6. To upload or download files, use the `get` and `put` commands. For - example: - - **Transferring files** - - ``` syntaxhighlighter-pre - [demo@login.crane ~]$ lftp box - lftp demo2@unl.edu@ftp.box.com:/> put myfile.txt - lftp demo2@unl.edu@ftp.box.com:/> get my_other_file.txt - ``` - -7. To download directories, use the `mirror` command. To upload - directories, use the `mirror` command with the `-R` option. For - example, to download a directory named `my_box_dir` to your current - directory: - - **Download a directory from Box** - - ``` syntaxhighlighter-pre - [demo@login.crane ~]$ lftp box - lftp demo2@unl.edu@ftp.box.com:/> mirror my_box_dir - ``` - - To upload a directory named `my_hcc_dir` to Box, use `mirror` with - the `-R` option: - - **Upload a directory to Box** - - ``` syntaxhighlighter-pre - [demo@login.crane ~]$ lftp box - lftp demo2@unl.edu@ftp.box.com:/> mirror -R my_hcc_dir - ``` - -8. Lftp also supports using scripts to transfer files. This can be - used to automatic downloading or uploading files during jobs. 
<span - style="color: rgb(0,0,0);">For example, create a file called - "transfer.sh" with the following lines:</span> - - **transfer.sh** - - ``` syntaxhighlighter-pre - open box - get some_input_file.tar.gz - put my_output_file.tar.gz - ``` - - To run this script, do: - - **Run transfer.sh** - - ``` syntaxhighlighter-pre - module load lftp - lftp -f transfer.sh - ``` - -Step-by-step guide for Cadaver ------------------------------- - -1. You need to create your UNL - <a href="http://Box.com" class="external-link">Box.com</a> account - <a href="http://box.unl.edu/" class="external-link">here</a>. -2. Since we are going to be using - <a href="https://en.wikipedia.org/wiki/WebDAV" class="external-link">webdav</a> protocol - to access your - <a href="http://Box.com" class="external-link">Box.com</a> storage, - you need to create an **External Password**. In the - <a href="http://Box.com" class="external-link">Box.com</a> - interface, you can create it - at **<a href="https://unl.app.box.com/settings" class="external-link">Account Settings</a>** > **Create - External Password. - <span - class="confluence-embedded-file-wrapper confluence-embedded-manual-size"><img src="assets/images/8192521/8126683.png" class="confluence-embedded-image" width="747" height="185" /></span>** -3. Create a - <a href="http://www.mavetju.org/unix/netrc.php" class="external-link">.netrc</a> - file in order to automatically login to your box account without - typing the password. The file needs to be in your home directory, - ie `~/.netrc`. You can easily create this file using the nano text - editor by using the command: - - ``` syntaxhighlighter-pre - nano ~/.netrc - ``` - - The file should contain the following lines: - - ``` syntaxhighlighter-pre - machine dav.box.com - login <box_username>@unl.edu - password <external_password> - ``` - - Once you have typed or pasted these lines into the file, press - CTRL-X to exit. Follow the prompts to save the file as `.netrc`. - - -4. 
Be sure to have the correct permissions on the file. You can change - the permissions with the command: - - ``` syntaxhighlighter-pre - $ chmod 600 ~/.netrc - ``` - -5. Try out the webdav client by issuing the command: - - ``` syntaxhighlighter-pre - $ cadaver https://dav.box.com/dav - ``` - - It should give you a prompt like: - - ``` syntaxhighlighter-pre - dav:/dav/> - ``` - - Within this prompt, you can view files and navigate through the file - system using the usual Bash commands **cd** and **ls**. To download - files from Box, use the command: - - ``` syntaxhighlighter-pre - get <filename> - ``` - - Or, alternately, to upload files to your Box, use: - - ``` syntaxhighlighter-pre - put <filename> - ``` - - To exit the prompt, press **ctrl-d** - - -6. Within a submit script, you can upload and download files by using - commands such as: - - ``` syntaxhighlighter-pre - #!/bin/sh - #SBATCH ... - .... - cat << EOF | cadaver https://dav.box.com/dav - get inputfile.txt - EOF - - cat << EOF | cadaver https://dav.box.com/dav - put outputfile.txt - EOF - ``` - -7. The files should automatically appear in your Box account, and be - sync'd to your computer if you have the - <a href="https://sites.box.com/sync4/" class="external-link">sync client</a> - installed. - -Related articles ----------------- - -- <span class="icon aui-icon aui-icon-small aui-iconfont-page-default" - title="Page">Page:</span> - - [Integrating Box with HCC](/display/HCCDOC/Integrating+Box+with+HCC) - -- <span class="icon aui-icon aui-icon-small aui-iconfont-page-default" - title="Page">Page:</span> - - [Handling Data](/display/HCCDOC/Handling+Data) - -Attachments: ------------- - -<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" /> [Screen -Shot 2014-08-14 at 4.55.18 PM.png](attachments/8192521/8126683.png) -(image/png) - +in good network traffic conditions. Users can use a tool called lftp to transfer files between HCC clusters and their Box accounts. 
+ +--- +### Step-by-step guide for Lftp + +1. You need to create your UNL [Box.com](https://www.box.com/) account [here](https://box.unl.edu/). + +2. Since we are going to be using the FTPS protocol to access your [Box.com](https://www.box.com/) storage, you need to create an **External Password**. In the [Box.com](https://www.box.com/) interface, you can create it at **[Account Settings](https://unl.app.box.com/settings) > Create External Password.** +{{< figure src="/images/box_create_external_password.png" class="img-border" >}} + +3. After logging into the cluster of your choice, load the `lftp` module by entering the command below at the prompt: +{{% panel theme="info" header="Load the lftp module" %}} +{{< highlight bash >}} +module load lftp +{{< /highlight >}} +{{% /panel %}} + +4. Connect to Box using your full email as the username and the external password you created: +{{% panel theme="info" header="Connect to Box" %}} +{{< highlight bash >}} +lftp -u <username>,<password> ftps://ftp.box.com +{{< /highlight >}} +{{% /panel %}} + +5. Test the connection by running the `ls` command. You should see a listing of your Box files. Assuming it works, add a bookmark named "box" to use when connecting later: +{{% panel theme="info" header="Add lftp bookmark" %}} +{{< highlight bash >}} +lftp demo2@unl.edu@ftp.box.com:/> bookmark add box +{{< /highlight >}} +{{% /panel %}} + +6. Exit `lftp` by typing `quit`. To reconnect later, use the bookmark name: +{{% panel theme="info" header="Connect using bookmark name" %}} +{{< highlight bash >}} +lftp box +{{< /highlight >}} +{{% /panel %}} + +7. To upload or download files, use the `get` and `put` commands. For example: +{{% panel theme="info" header="Transferring files" %}} +{{< highlight bash >}} +[demo2@login.crane ~]$ lftp box +lftp demo2@unl.edu@ftp.box.com:/> put myfile.txt +lftp demo2@unl.edu@ftp.box.com:/> get my_other_file.txt +{{< /highlight >}} +{{% /panel %}} + +8. 
To download directories, use the `mirror` command. To upload directories, use the `mirror` command with the `-R` option. For example, to download a directory named `my_box_dir` to your current directory: +{{% panel theme="info" header="Download a directory from Box" %}} +{{< highlight bash >}} +[demo2@login.crane ~]$ lftp box +lftp demo2@unl.edu@ftp.box.com:/> mirror my_box_dir +{{< /highlight >}} +{{% /panel %}} +To upload a directory named `my_hcc_dir` to Box, use `mirror` with the `-R` option: +{{% panel theme="info" header="Upload a directory to Box" %}} +{{< highlight bash >}} +[demo2@login.crane ~]$ lftp box +lftp demo2@unl.edu@ftp.box.com:/> mirror -R my_hcc_dir +{{< /highlight >}} +{{% /panel %}} + +9. Lftp also supports using scripts to transfer files. This can be used to automatically download or upload files during jobs. For example, create a file called "transfer.sh" with the following lines: +{{% panel theme="info" header="transfer.sh" %}} +{{< highlight bash >}} +open box +get some_input_file.tar.gz +put my_output_file.tar.gz +{{< /highlight >}} +{{% /panel %}} +To run this script, do: +{{% panel theme="info" header="Run transfer.sh" %}} +{{< highlight bash >}} +module load lftp +lftp -f transfer.sh +{{< /highlight >}} +{{% /panel %}} diff --git a/content/guides/handling_data/preventing_file_loss.md b/content/guides/handling_data/preventing_file_loss.md index c47b38e4225d79e351edd3a321417ebf27ba7f61..8d20435f97caebbde64fc7aa792db534747d73ee 100644 --- a/content/guides/handling_data/preventing_file_loss.md +++ b/content/guides/handling_data/preventing_file_loss.md @@ -1,13 +1,8 @@ -1. [HCC-DOCS](index.html) -2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html) -3. [HCC Documentation](HCC-Documentation_332651.html) -4. 
[Handling Data](Handling-Data_332256.html) - -<span id="title-text"> HCC-DOCS : Preventing File Loss </span> -============================================================== - -Created by <span class="author"> Adam Caprez</span>, last modified by -<span class="editor"> Carrie Brown</span> on Feb 09, 2018 ++++ +title = "Preventing File Loss" +description = "How to prevent file loss on HCC clusters" +weight = 60 ++++ Each research group is allocated 50TB of storage in `/work` on HCC clusters. With over 400 active groups, HCC does not have the resources @@ -23,6 +18,7 @@ needs. For truly robust file backups, we recommend combining multiple methods. For example, use Git regularly along with manual backups to an external hard-drive at regular intervals such as monthly or biannually. +--- ### 1. Use your local machine: If you have sufficient hard drive space, regularly backup your `/work` @@ -30,11 +26,11 @@ directories to your personal computer. To avoid filling up your personal hard-drives, consider using an external drive that can easily be placed in a fireproof safe or at an off-site location for an extra level of protection. To do this, you can either use [Globus -Connect](https://hcc-docs.unl.edu/display/HCCDOC/Globus+Connect) or an +Connect]({{< relref "globus_connect" >}}) or an SCP client, such as <a href="https://cyberduck.io/" class="external-link">Cyberduck</a> or <a href="https://winscp.net/eng/index.php" class="external-link">WinSCP</a>. For help setting up an SCP client, check out our [Quick Start -Guides](https://hcc-docs.unl.edu/display/HCCDOC/Quick+Start+Guides). +Guides]({{< relref "/quickstarts" >}}). For those worried about personal hard drive crashes, UNL offers <a href="http://nsave.unl.edu/" class="external-link">the backup service NSave</a>. @@ -43,19 +39,20 @@ automatically backup selected files from their personal machine. Benefits: -- Gives you full control over what is backed up and when. 
+- Gives you full control over what is backed up and when. - Doesn't require the use of third party servers (when using SCP clients). - Take advantage of our high speed data transfers (10 Gb/s) when using Globus Connect or [setup your SCP client to use our dedicated high speed transfer - servers](https://hcc-docs.unl.edu/display/HCCDOC/High-Speed+Data+Transfers) + servers]({{< relref "high_speed_data_transfers" >}}) Limitations: - The amount you can backup is limited by available hard-drive space. - Manual backups of many files can be time consuming. +--- ### 2. Use Git to preserve files and revision history: Git is a revision control service which can be run locally or can be @@ -84,12 +81,13 @@ Limitations: tracking files over 1GB in size can be time consuming and lead to errors when using other repository hosts. +--- ### 3. Use Attic: HCC offers long-term, <a href="https://en.wikipedia.org/wiki/Nearline_storage" class="external-link">near-line</a> data storage -through [Attic](https://hcc-docs.unl.edu/display/HCCDOC/Using+Attic). +through [Attic]({{< relref "using_attic" >}}). HCC users with an existing account can <a href="http://hcc.unl.edu/attic" class="external-link">apply for an Attic account</a> for a <a href="http://hcc.unl.edu/priority-access-pricing" class="external-link">small annual fee</a> that @@ -102,14 +100,15 @@ Benefits: layer against file loss. - No limits on individual or total file sizes. - High speed data transfers between Attic and the clusters when using - Globus Connect and [HCC's high-speed data - servers](https://hcc-docs.unl.edu/display/HCCDOC/High-Speed+Data+Transfers). + [Globus Connect]({{< relref "globus_connect" >}}) and [HCC's high-speed data + servers]({{< relref "high_speed_data_transfers" >}}). Limitations: - Backups must be done manually which can be time consuming. Setting up automated scripts can help speed up this process. +--- ### 4. 
Use a cloud-based service, such as Box: Many of us are familiar with services such as Google Drive, Dropbox, Box @@ -117,8 +116,8 @@ and OneDrive. These cloud-based services provide a convenient portal for accessing your files from any computer. NU offers OneDrive and Box services to all students, staff and faculty. But did you know that you can link your Box account to HCC’s clusters to provide quick and easy -access to files stored there? [Follow a few set-up -steps](https://hcc-docs.unl.edu/display/HCCDOC/Integrating+Box+with+HCC) and +access to files stored there? [Follow a few set-up +steps]({{< relref "integrating_box_with_hcc" >}}) and you can add files to and access files stored in your Box account directly from HCC clusters. Setup your submit scripts to automatically upload results as they are generated or use it interactively to store @@ -137,7 +136,8 @@ Limitations: - Box has individual file size limitations, larger files will need to be backed up using an alternate method. -### 5. Copy important files to /home: +--- +### 5. Copy important files to `/home`: While `/work` files and directories are not backed up, files and directories in `/home` are backed up on a daily basis. Due to the @@ -149,7 +149,7 @@ the cluster. Benefits: -- No need to make manual backups. home files are automatically backed +- No need to make manual backups. `/home` files are automatically backed up daily. - Files in `/home` are not subject to the 6 month purge policy that exists on `/work`. diff --git a/content/guides/handling_data/using_attic.md b/content/guides/handling_data/using_attic.md index 5d0c76db0e86f3ef92a3fe07d3ee3e0e7812e7de..ae1b5cac483e8760bea73aeb338a2c80f187f21d 100644 --- a/content/guides/handling_data/using_attic.md +++ b/content/guides/handling_data/using_attic.md @@ -1,45 +1,40 @@ -1. [HCC-DOCS](index.html) -2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html) -3. [HCC Documentation](HCC-Documentation_332651.html) -4. 
[Handling Data](Handling-Data_332256.html) - -<span id="title-text"> HCC-DOCS : Using Attic </span> -===================================================== - -Created by <span class="author"> Zhongtao Zhang</span>, last modified by -<span class="editor"> Natasha Pavlovikj</span> on Oct 19, 2018 ++++ +title = "Using Attic" +description = "How to store data on Attic" +weight = 20 ++++ For users who need long-term storage for large amount of data, HCC provides an economical solution called Attic. Attic is a reliable <a href="https://en.wikipedia.org/wiki/Nearline_storage" class="external-link">near-line data archive</a> storage system. The files in Attic can be accessed and shared from anywhere using [Globus -Connect](https://hcc-docs.unl.edu/display/ADMIN/Globus+Connect+Usage), +Connect]({{< relref "globus_connect" >}}), with a fast 10Gb/s link. Also, the data in Attic is backed up between our Lincoln and Omaha facilities to ensure high availability and disaster tolerance. The data and user activities on Attic are subject to our <a href="http://hcc.unl.edu/hcc-policies" class="external-link">HCC Policies</a>. -Accounts and Cost ------------------ +--- +### Accounts and Cost To use Attic you will first need an <a href="https://hcc.unl.edu/new-user-request" class="external-link">HCC account</a>, and -then you may apply for an -<a href="http://hcc.unl.edu/attic" class="external-link">Attic account</a>. +then you may request an +<a href="http://hcc.unl.edu/attic" class="external-link">Attic allocation</a>. We charge a small fee per TB per year, but it is cheaper than most commercial cloud storage solutions. For the user application form and cost, please see the <a href="http://hcc.unl.edu/attic" class="external-link">HCC Attic page</a>. -Transfer Files Using Globus Connect ------------------------------------ +--- +### Transfer Files Using Globus Connect The easiest and fastest way to access Attic is via Globus. 
You can -transfer files between your computer, our clusters ($HOME and $WORK on -Crane, Tusker, and Sandhills), and Attic. Here is a detailed tutorial on -how to set up and use [Globus Connect](Globus-Connect_6357013.html). For +transfer files between your computer, our clusters ($HOME, $WORK, and $COMMON on +Crane and Tusker), and Attic. Here is a detailed tutorial on +how to set up and use [Globus Connect]({{< relref "globus_connect" >}}). For Attic, use the Globus Endpoint **hcc\#attic**. Your Attic files are located at `~, `which is a shortcut for `/attic/<groupname>/<username>`. @@ -47,24 +42,22 @@ for `/attic/<groupname>/<username>`. group, you should explicitly set the path to /attic/<supplementary\_groupname>/. If you don't do that, by default the endpoint will try to place you in your primary group's Attic -path, which access will be denied if the primary group doesn't have -Attic allocation.* - -Transfer Files Using SCP/SFTP/RSYNC ------------------------------------ - -The transfer server for Attic storage is named `attic.unl.edu`. +path, to which access will be denied if the primary group doesn't have an Attic allocation.* -**SCP Example** +--- +### Transfer Files Using SCP/SFTP/RSYNC -``` syntaxhighlighter-pre -$ scp /source/file <username>@attic.unl.edu:~/destination/file -``` +The transfer server for Attic storage is `attic.unl.edu` (or `attic-xfer.unl.edu`). -**SFTP example** +{{% panel theme="info" header="SCP Example" %}} +{{< highlight bash >}} +scp /source/file <username>@attic.unl.edu:~/destination/file +{{< /highlight >}} +{{% /panel %}} -``` syntaxhighlighter-pre -$ sftp <username>@attic.unl.edu +{{% panel theme="info" header="SFTP Example" %}} +{{< highlight bash >}} +sftp <username>@attic.unl.edu Password: Duo two-factor login for <username> Connected to attic.unl.edu. 
@@ -72,38 +65,35 @@ sftp> pwd Remote working directory: /attic/<groupname>/<username> sftp> put source/file destination/file sftp> exit -``` +{{< /highlight >}} +{{% /panel %}} - - -**RSYNC Example** - -``` syntaxhighlighter-pre +{{% panel theme="info" header="RSYNC Example" %}} +{{< highlight bash >}} # local to remote rsync command -$ rsync -avz /local/source/path <username>@attic.unl.edu:remote/destination/path +rsync -avz /local/source/path <username>@attic.unl.edu:remote/destination/path # remote to local rsync command -$ rsync -avz <username>@attic.unl.edu:remote/source/path /local/destination/path -``` - - +rsync -avz <username>@attic.unl.edu:remote/source/path /local/destination/path +{{< /highlight >}} +{{% /panel %}} -You can also access your data on Attic using our three [high-speed -transfer servers](High-Speed-Data-Transfers_6946857.html) if you prefer. +You can also access your data on Attic using our [high-speed +transfer servers]({{< relref "high_speed_data_transfers" >}}) if you prefer. Simply use scp or sftp to connect to one of the transfer servers, and your directory is mounted at `/attic/<groupname>/<username>`. -Check Attic Usage ------------------ +--- +### Check Attic Usage The usage and quota information for your group and the users in the -group are stored in a filed named "disk\_usage.txt" in your group's -directory (`/attic/<groupname>`). You can use either Globus Connect or +group are stored in a file named "disk\_usage.txt" in your group's +directory (`/attic/<groupname>`). You can use either [Globus Connect]({{< relref "globus_connect" >}}) or scp to download it. Your usage and expiration is also shown in the web interface (see below). -Use the web interface ---------------------- +--- +### Use the web interface For convenience, a web interface is also provided. 
Simply go to <a href="https://attic.unl.edu" class="external-link">https://attic.unl.edu</a> diff --git a/content/guides/handling_data/using_the_common_file_system.md b/content/guides/handling_data/using_the_common_file_system.md index e6e2d009eecd543f17a2985c81f6bd24246b4289..3ca200bf26c9c2e430e8fa6be44a33fa03ab7c80 100644 --- a/content/guides/handling_data/using_the_common_file_system.md +++ b/content/guides/handling_data/using_the_common_file_system.md @@ -1,20 +1,13 @@ -1. [HCC-DOCS](index.html) -2. [HCC-DOCS Home](HCC-DOCS-Home_327685.html) -3. [HCC Documentation](HCC-Documentation_332651.html) -4. [Handling Data](Handling-Data_332256.html) ++++ +title = "Using the /common File System" +description = "How to use HCC's /common file system" +weight = 70 ++++ -<span id="title-text"> HCC-DOCS : Using the /common File System </span> -======================================================================= - -Created by <span class="author"> David Swanson</span>, last modified on -Sep 10, 2018 - -<span style="color: rgb(0,0,0);">Quick overview: </span> --------------------------------------------------------- +### Quick overview: - Connected read/write to all HCC HPC cluster resources – you will see - the same files "in common" on any HCC cluster (i.e. crane, tusker, - sandhills) + the same files "in common" on any HCC cluster (i.e. Crane and Tusker). - 30 TB Per-group quota at no charge – larger quota available for $100/TB/year - No backups are made! Don't be silly! Precious data should still be @@ -23,58 +16,53 @@ Sep 10, 2018 disk failure or user error, they won't be removed by the purge scripts. - Accessing common +--- +### Accessing common - <span - class="aui-icon aui-icon-small aui-iconfont-info confluence-information-macro-icon"></span> - Your /common directory can be accessed via the `$COMMON` environment - variable, i.e. `cd $COMMON.` +{{% notice info %}} +Your `/common` directory can be accessed via the `$COMMON` environment variable, i.e. 
`cd $COMMON`. +{{% /notice %}} -How should I use /common? +--- +### How should I use `/common`? -- Store things that are routinely needed on multiple clusters -- /common is a network attached FS, so limit the number of files per - directory (1 million files in a directory is a very bad idea) -- I**f you are accessing /common for a job, you will need to add a +- Store things that are routinely needed on multiple clusters. +- `/common` is a network attached FS, so limit the number of files per + directory (1 million files in a directory is a very bad idea). +- **If you are accessing `/common` for a job, you will need to add a line to your submission script! ** - - We have each user check out a "license" to access /common for a - given job - - this allows us to know exactly who is accessing it, and for how + - We have each user check out a "license" to access `/common` for a + given job. + - This allows us to know exactly who is accessing it, and for how long, in case of the need for a shut down so we can try to avoid - killing jobs whenever possible - - it also allows us to limit how many jobs can hammer this single - filesystem so it remains healthy and happy - -<span style="color: rgb(0,0,0);">To gain access to the path on worker -nodes, a job must be </span><span style="color: rgb(0,0,0);">submitted -with the following SLURM directive:</span> - -**SLURM Submit File** - -``` syntaxhighlighter-pre + killing jobs whenever possible. + - It also allows us to limit how many jobs can hammer this single + filesystem so it remains healthy and happy. 
+ +To gain access to the path on worker +nodes, a job must be submitted +with the following SLURM directive: +{{% panel theme="info" header="**SLURM Submit File**" %}} +{{< highlight bash >}} #SBATCH --licenses=common -``` +{{< /highlight >}} +{{% /panel %}} -<span style="color: rgb(0,0,0);">If a job lacks the above Slurm -directive, /common will not be accessible from </span><span -style="color: rgb(0,0,0);">the</span><span -style="color: rgb(0,0,0);"> worker nodes. (<span -style="">Briefly</span>, this construct will allow us +If a job lacks the above SLURM directive, `/common` will not be accessible from the worker nodes. (Briefly, this construct will allow us to quickly do maintenance on a single cluster without having to unmount -$COMMON </span><span style="color: rgb(0,0,0);">from</span><span -style="color: rgb(0,0,0);"> all HCC resources). </span> +`$COMMON` from all HCC resources). -<span style="color: rgb(0,0,0);">What should I **not** do when using /common? </span> -------------------------------------------------------------------------------------- +--- +### What should I **not** do when using `/common`? -- Don't use it for high I/O work flows, use /work for that – /common - should mostly be used to read largely static files or data +- Don't use it for high I/O work flows, use `/work` for that – `/common` + should mostly be used to read largely static files or data. - Do not expect your compiled program binaries to work everywhere! - /common is available on machines with different cpu architecture, + `/common` is available on machines with different cpu architecture, different network connections, and so on. *caveat emptor!* - - Serial codes will not be optimized for all clusters + - Serial codes will not be optimized for all clusters. - MPI codes, in particular, will likely not work unless recompiled - for each cluster - - if you use `module `things should be just fine! + for each cluster. + - If you use `module` things should be just fine! 
diff --git a/static/images/35325560.png b/static/images/35325560.png new file mode 100644 index 0000000000000000000000000000000000000000..ade4160900b388271b7abcb6126f496d340a3dbb Binary files /dev/null and b/static/images/35325560.png differ diff --git a/static/images/box_create_external_password.png b/static/images/box_create_external_password.png new file mode 100644 index 0000000000000000000000000000000000000000..03a3b89fbb08ca6b2c6dc01b928c2198914d5c73 Binary files /dev/null and b/static/images/box_create_external_password.png differ