adding documentation for globus auto backups

753fbb3d · eharstad · 718b4156 · 753fbb3d
Commit 753fbb3d authored 4 years ago by eharstad
--- a/content/handling_data/data_transfer/globus_connect/globus_auto_backups.md
+++ b/content/handling_data/data_transfer/globus_connect/globus_auto_backups.md
+++
+title = "Globus Automated Backups"
+description = "How to use the Globus CLI to Automate Backups"
+weight = 80
+++
+
+Users with Attic allocations can automate regular rsync-style backups to Attic from Crane or Rhino using the Globus CLI.  The Globus transfer command runs inside a Slurm job which, upon completion, submits another identical Slurm job with a built-in delayed start.
+
+---
+### Set Up Automated Backups Using Globus CLI
+
+1.  Create shared Globus endpoints for both source and destination directories.
+
+	a.    
+	Create the destination shared endpoint.  In this example, our destination endpoint is on Attic.
+
+	Sign in to globus.org, click on "ENDPOINTS" on the left-hand side of the window,
+ then search for and select the *hcc#attic* endpoint.  Select the *Collections* tab, and click on 
+"+ Add a Guest Collection".  (You will first be required to authenticate the endpoint if you have 
+not done so in the last 6 days.  To do so, click "Activate" on the *Overview* tab of the endpoint.)
+
+	Enter the Host Path for your shared endpoint.  It should be at least as high in the directory tree 
+as the path to which you will transfer data.  For example, if you are backing up to a directory 
+located somewhere inside of your home directory, then you can set the Host Path to your home 
+directory:  "/~/"
+     
+	Enter a name for your shared endpoint and click "Create Share".
+
+	b.  
+	Repeat the above instructions to create the source shared endpoint (for example, on hcc#crane or hcc#rhino).  This time, the Host Path should be at least as high in the directory tree as the path to the source data.
+
+2.  Regardless of the location of the source endpoint host, the Globus transfer script can be executed on 
+either Crane or Rhino (because Globus does "third-party" transfers).  We will use Crane in this example.  
+
+	First log in to Crane and load the globus-cli module:
+{{< highlight bash >}}module load globus-cli {{< /highlight >}}
+
+	Log in to Globus with the command:
+{{< highlight bash >}}globus login{{< /highlight >}}
+
+	Copy and paste the URL that appears into a web browser, then copy and paste the resulting Authorization Code at the command-line prompt.  These credentials should not expire as long as they are used periodically (at least every 6 months).
+
+3.  Still on Crane, cd into your $WORK directory and clone the Git repo containing the automated backup scripts:
+{{< highlight bash >}}cd $WORK
+git clone https://github.com/eharstad/globus_transfer.git
+cd globus_transfer{{< /highlight >}}
+
+4.  Edit the files for your personal use.
+
+	a.   
+	**globus_transfer.submit**
+
+	If you would like to receive email notifications when the Slurm job fails, uncomment the `--mail-user` and `--mail-type` directives and add your email address to the end of the `--mail-user` line.  For example:  
+{{< highlight bash >}}
+#SBATCH --mail-user=<your_email@huskers.unl.edu>
+#SBATCH --mail-type=FAIL{{< /highlight >}}
+
+	Set the desired number of days between backups by editing the following line.  The interval is in days and should be long enough for one backup to finish before the next one begins, so some testing may be needed to find a suitable value:
+{{< highlight bash >}}export BACKUP_FREQ=60{{< /highlight >}}
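
How the repository's scripts implement the delayed resubmission is not shown here. As a minimal sketch, assuming the delay is expressed with Slurm's `--begin` option, the following turns `BACKUP_FREQ` into a deferred `sbatch` command and prints it instead of submitting it. The helper name `begin_option` is hypothetical and not taken from the repository.

```shell
#!/usr/bin/env bash
# Sketch only: the actual submit script may implement the delay differently.
BACKUP_FREQ="${BACKUP_FREQ:-60}"   # days between backups, as in globus_transfer.submit

# Build the sbatch option that defers a job start by N days (hypothetical helper).
begin_option() {
    local days="$1"
    # Reject anything that is not a positive integer.
    if ! [[ "$days" =~ ^[1-9][0-9]*$ ]]; then
        echo "BACKUP_FREQ must be a positive integer number of days" >&2
        return 1
    fi
    printf -- '--begin=now+%sdays\n' "$days"
}

# Dry run: print the resubmission command rather than executing it.
echo "sbatch $(begin_option "$BACKUP_FREQ") globus_transfer.submit"
```

Printing the command first makes it easy to check the interval before a job is actually queued.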
+
+	b.   
+	**globus_transfer.sh**
+
+	Edit the source and destination endpoint UUIDs.  These can be retrieved from the endpoints tab in the Globus web portal.  Select the desired shared endpoint (it should appear in the "Administered By You" tab) and from the *Overview* screen, scroll down to "Endpoint UUID".  Copy the UUID to the clipboard and paste it into the *globus_transfer.sh* script:
+{{< highlight bash >}}export SourceShare=<Your Source UUID>
+export DestinationShare=<Your Destination UUID>{{< /highlight >}}
+
+	
+	Define the source and destination paths *relative* to the Host Path you chose when creating each shared endpoint.  For example, if your source host path is /work/group/user/ and your destination host path is /~/ and you want to back up the directory /work/group/user/data on the source to the directory /~/data_backup on the destination, then your *SourcePath* and *DestinationPath* variables should be defined as follows:
+{{< highlight bash >}}export SourcePath=/data
+export DestinationPath=/data_backup{{< /highlight >}}
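
The relative-path rule above can be checked with a small helper. This is an illustrative sketch, not part of the repository: the function name `relative_to_host` is hypothetical, and it only strips the Host Path prefix from an absolute path so the result can be pasted into *SourcePath* or *DestinationPath*.

```shell
#!/usr/bin/env bash
# relative_to_host HOST_PATH ABSOLUTE_PATH
# Prints ABSOLUTE_PATH relative to HOST_PATH, with a leading slash.
relative_to_host() {
    local host="${1%/}"    # drop one trailing slash from the host path
    local full="$2"
    case "$full" in
        "$host"/*) printf '/%s\n' "${full#"$host"/}" ;;   # path lies inside the host path
        "$host")   printf '/\n' ;;                        # path is the host path itself
        *) echo "error: $full is not inside $host" >&2; return 1 ;;
    esac
}

# The example from the text:
relative_to_host /work/group/user/ /work/group/user/data   # prints /data
relative_to_host "/~/" "/~/data_backup"                     # prints /data_backup
```

A path outside the Host Path produces an error instead of a value, which mirrors the requirement that the Host Path sit at least as high in the directory tree as the data being transferred.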
+
+5.  Submit the Slurm job:
+{{< highlight bash >}}sbatch globus_transfer.submit{{< /highlight >}}
+
+6.  Check the status of the transfer.  An easy way to do this is through the globus.org user portal.  Log in to globus.org and select the "ACTIVITY" tab.  Click on an individual task for a display with more detailed information.
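
The same information should also be available from the command line on Crane. The sketch below assumes the `globus-cli` module is loaded and that you have already run `globus login`. It wraps `globus task list` in a hypothetical helper, `show_recent_tasks`, which fails with a hint when the CLI is not on the path.

```shell
#!/usr/bin/env bash
# show_recent_tasks [LIMIT]
# Lists your most recent Globus tasks (default: 5). Hypothetical helper name.
show_recent_tasks() {
    local limit="${1:-5}"
    if ! command -v globus >/dev/null 2>&1; then
        echo "globus not found; run 'module load globus-cli' first" >&2
        return 1
    fi
    globus task list --limit "$limit"
}

# Usage:
#   show_recent_tasks 5
# For details on one task, pass its ID to the CLI directly:
#   globus task show <task-id>
```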
+
+
+---