Merge branch 'postgres' into 'master'

Add page on running postgres. See merge request !223

e879e304 · Adam Caprez · d7094b9b · 01abadcc · e879e304
Commit e879e304 authored 5 years ago by Adam Caprez
--- a/content/applications/app_specific/running_postgres.md
+++ b/content/applications/app_specific/running_postgres.md
+++
+title = "Running PostgreSQL"
+description = "How to run a PostgreSQL server within a SLURM job"
+++
+
+This page describes how to run a PostgreSQL server instance within a SLURM job
+on HCC resources. Many software packages require an SQL database as part of
+their workflows. The database will be available as long as
+the SLURM job containing it is running, and other jobs may then be submitted to
+connect to and use it. The database files are stored on the cluster's filesystem
+(here `$COMMON` is used), so the data persists even after the containing SLURM job ends.
+That is, you can submit a subsequent identical PostgreSQL server job
+and data that was previously imported into the database will still be there.
+
+{{% notice warning %}}
+Only **one** instance of the database server job can run at a time. Submitting multiple
+server jobs simultaneously will result in undefined behavior and database corruption.
+{{% /notice %}}
+
+### Initial setup steps
+
+A few initial setup commands must be run first. These commands only need to be run once for
+each database instance you wish to run. The commands should be run from the _login node_.
+
+First, choose a location to hold the database and configuration files. Here, we use a
+folder named `postgres` in the `$COMMON` directory. Change the value of `POSTGRES_HOME`
+if you wish to use another location. Run the following commands to create the needed directory
+structure and generate a random password for the database, which is stored in `$POSTGRES_HOME/config/postgres-password`.
+
+{{< highlight bash >}}
+$ export POSTGRES_HOME=$COMMON/postgres
+$ mkdir -p $POSTGRES_HOME/{config,db/data,run}
+$ uuidgen > $POSTGRES_HOME/config/postgres-password
+$ chmod 600 $POSTGRES_HOME/config/postgres-password
+{{< /highlight >}}
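+
+Before submitting the server job, it can help to confirm that the setup completed as expected.
+The following is a minimal sketch, not part of the original instructions: the function name
+`check_postgres_home` is hypothetical, and it assumes GNU `stat` (as found on typical Linux
+login nodes) for reading the file mode.
+

```bash
# Hypothetical helper: verify the directory layout and password file
# created by the setup commands. Assumes GNU coreutils `stat`.
check_postgres_home() {
    local home="$1" dir mode
    # All three directories created by `mkdir -p` must exist.
    for dir in config db/data run; do
        if [ ! -d "$home/$dir" ]; then
            echo "missing directory: $home/$dir" >&2
            return 1
        fi
    done
    # The password file must exist and be non-empty.
    if [ ! -s "$home/config/postgres-password" ]; then
        echo "password file missing or empty" >&2
        return 1
    fi
    # The password file should be readable and writable by the owner only.
    mode=$(stat -c %a "$home/config/postgres-password")
    if [ "$mode" != "600" ]; then
        echo "password file mode is $mode, expected 600" >&2
        return 1
    fi
}
```

+
+Running `check_postgres_home "$POSTGRES_HOME"` prints nothing and returns success when the layout is in place.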
+
+### Start the PostgreSQL SLURM job
+
+Use the following submit script to start the job for the database:
+
+{{% panel theme="info" header="postgres.submit" %}}
+{{< highlight bash >}}
+#!/bin/bash
+#SBATCH --time=168:00:00
+#SBATCH --mem=8gb
+#SBATCH --job-name=postgres_server
+#SBATCH --error=postgres_server.err
+#SBATCH --output=postgres_server.out
+#SBATCH --licenses=common
+#SBATCH --dependency=singleton
+#SBATCH --signal=B:SIGINT@60
+
+export POSTGRES_HOME=$COMMON/postgres
+export POSTGRES_PASSWORD_FILE=$POSTGRES_HOME/config/postgres-password
+export POSTGRES_USER=$USER
+export POSTGRES_DB=mydb
+export PGDATA=$POSTGRES_HOME/db/data
+export POSTGRES_HOST_AUTH_METHOD=md5
+export POSTGRES_INITDB_ARGS="--data-checksums"
+export POSTGRES_PORT=$(shuf -i 2000-65000 -n 1)
+echo "Postgres server running on $(hostname) on port $POSTGRES_PORT"
+echo "This job started at $(date +%Y-%m-%dT%T)"
+echo "This job will end at $(squeue --noheader -j $SLURM_JOBID -o %e) (in $(squeue --noheader -j $SLURM_JOBID -o %L))"
+module load singularity
+exec singularity run -B $POSTGRES_HOME/db:/var/lib/postgresql -B $POSTGRES_HOME/run:/var/run/postgresql docker://postgres:11 -c "port=$POSTGRES_PORT"
+{{< /highlight >}}
+{{% /panel %}}
+
+This script starts a PostgreSQL server instance with the following properties:
+
+- The superuser username is set to your HCC username and the password is the random one generated earlier.
+- The server is started on a random port to avoid collisions.
+- The database name is `mydb`. This can be changed to whatever name you would like (some applications may require a specific name).
+- Checksums on data pages are enabled to help detect corruption.
+- Password authentication is required for security.
+
+Additionally, the job is run with `--dependency=singleton` to ensure that only one instance (based on job name) is running
+at a time. Duplicate jobs submitted afterwards will remain queued until the earlier job exits. The `--signal=B:SIGINT@60` option
+instructs SLURM to send a shutdown signal (`SIGINT`) to the PostgreSQL server 60 seconds before the time limit of the job. This
+helps to avoid corruption by allowing the server to shut down cleanly.
+
+Once the job starts, check the `postgres_server.out` file for information on which host and port the server is listening on. For example:
+
+{{< highlight bash >}}
+Postgres server running on c1725.crane.hcc.unl.edu on port 10332
+This job started at 2020-06-19T10:20:58
+This job will end at 2020-06-19T10:50:57 (in 29:59)
+{{< /highlight >}}
+
+Here, the server is running on host `c1725.crane.hcc.unl.edu` on port `10332`.
+The output also contains information on when the job will end. This can be useful when submitting
+the companion analysis job(s) that will use the database. It is recommended to adjust the requested walltime of
+the analysis job(s) to ensure they will end _before_ the database job does.
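+
+Since the host and port change with each server job, reading them from the output file by hand
+can become tedious. The sketch below shows one way to extract them in a script. The function name
+`parse_pg_endpoint` is hypothetical, and it assumes the output file still contains the
+"Postgres server running on ..." line exactly as written by the submit script above.
+

```bash
# Hypothetical helper: print "<host> <port>" taken from the server job's
# output file (default: postgres_server.out in the current directory).
parse_pg_endpoint() {
    local outfile="${1:-postgres_server.out}"
    # Match the line echoed by the submit script:
    #   Postgres server running on <host> on port <port>
    # If the line appears more than once, keep the most recent one.
    sed -n 's/^Postgres server running on \([^ ]*\) on port \([0-9][0-9]*\).*$/\1 \2/p' "$outfile" | tail -n 1
}
```

+
+In bash, the two values could then be stored with `read -r PGHOST PGPORT <<< "$(parse_pg_endpoint)"`;
+`PGHOST` and `PGPORT` are the environment variables that `psql` and other libpq-based clients read by default.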
+
+### Accessing the PostgreSQL instance
+
+The server instance can be accessed using the hostname and port from the job output, together with your HCC username
+and the random password generated during the initial setup. The exact method will depend on your application. Take care
+to keep the password secure.
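+
+For clients built on libpq (such as `psql`), one way to supply the password without typing it or
+placing it on the command line is a password file in the `host:port:database:username:password`
+format, pointed to by the `PGPASSFILE` environment variable. The following is a minimal sketch
+under those assumptions; the function name `write_pgpass` and its argument order are hypothetical,
+and your application may use a different mechanism entirely.
+

```bash
# Hypothetical helper: write a libpq password file for one server instance.
# Arguments: host port database user password_file pgpass_path
write_pgpass() {
    local host="$1" port="$2" db="$3" user="$4" password_file="$5" pgpass="$6"
    local password
    # Read the password generated during the initial setup.
    password=$(cat "$password_file") || return 1
    # Write with a restrictive umask so the file is never group/world readable.
    ( umask 077; printf '%s:%s:%s:%s:%s\n' "$host" "$port" "$db" "$user" "$password" > "$pgpass" ) || return 1
    # libpq ignores a password file with looser permissions, so enforce 600
    # even if the file already existed.
    chmod 600 "$pgpass"
}
```

+
+After calling the function and exporting `PGPASSFILE` with the chosen path, a command such as
+`psql -h <host> -p <port> -U $USER mydb` should connect without prompting, assuming a `psql`
+client is available in your environment.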
+
+### Restarting the PostgreSQL instance
+
+To restart the server, simply resubmit the same SLURM job as above. The first time the job is run, PostgreSQL
+will create the database from scratch.  Subsequent runs will detect an existing database and will not
+overwrite it. Data entered into the database from previous runs will be available.
+
+{{% notice info %}}
+Each new instance of the server picks a new random port and may run on a different host. You will need
+to update these values before submitting subsequent analysis jobs.
+{{% /notice %}}
+
+### Submitting jobs that require PostgreSQL
+
+The simplest way to manage jobs that need the database is to manually submit them after the PostgreSQL SLURM job
+has started. However, this is not terribly convenient. A better way is
+to use the dependency feature of SLURM. Submit the PostgreSQL job first and make a note of the job id. In the
+submit script(s) of the analysis jobs, add the line
+
+{{< highlight bash >}}
+#SBATCH --dependency=after:<job id>
+{{< /highlight >}}
+
+replacing `<job id>` with the numeric job id noted before. This will instruct SLURM to only begin running
+the analysis job(s) once the database job has begun.
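+
+The job id does not have to be copied by hand. As a sketch of how the two submissions could be
+chained, the function below captures the id with `sbatch --parsable` and passes the dependency on
+the command line instead of editing the analysis script. The function name `submit_with_db` and
+the script name `analysis.submit` used in the note below are hypothetical.
+

```bash
# Hypothetical helper: submit the database job, then an analysis job that
# may only start once the database job has begun running.
# Arguments: server_submit_script analysis_submit_script
submit_with_db() {
    local server_script="$1" analysis_script="$2"
    local db_jobid
    # --parsable makes sbatch print only the job id (optionally "id;cluster").
    db_jobid=$(sbatch --parsable "$server_script") || return 1
    # Drop the ";cluster" suffix if one is present.
    db_jobid=${db_jobid%%;*}
    # Same effect as adding "#SBATCH --dependency=after:<job id>" to the script.
    sbatch --dependency=after:"$db_jobid" "$analysis_script"
}
```

+
+For example, `submit_with_db postgres.submit analysis.submit` submits both jobs in order. The
+analysis job still needs the host and port of the server, which are only known once the
+database job has started.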