diff --git a/content/applications/app_specific/running_postgres.md b/content/applications/app_specific/running_postgres.md
new file mode 100644
index 0000000000000000000000000000000000000000..bd2747dee464c52c1cb1632141ae875f0ed88dfe
--- /dev/null
+++ b/content/applications/app_specific/running_postgres.md
@@ -0,0 +1,125 @@
+++
title = "Running PostgreSQL"
description = "How to run a PostgreSQL server within a SLURM job"
+++

This page describes how to run a PostgreSQL server instance within a SLURM job
on HCC resources. Many software packages require an SQL database as part of their
workflows. The database will be available as long as the SLURM job containing it
is running, and other jobs may then be submitted to connect to and use it. The
database files are stored on the cluster's filesystem (here `$COMMON` is used),
so the data persists even after the containing SLURM job ends. That is, you can
submit a subsequent identical PostgreSQL server job, and data that was previously
imported into the database will still be there.

{{% notice warning %}}
Only **one** instance of the database server job can run at a time. Submitting multiple
server jobs simultaneously will result in undefined behavior and database corruption.
{{% /notice %}}

### Initial setup steps

A few initial setup commands must be run first. These commands only need to be run once
for each database instance you wish to run, and should be run from the _login node_.

First, choose a location to hold the database and configuration files. Here, we use a
folder named `postgres` in the `$COMMON` directory. Change the value of `POSTGRES_HOME`
if you wish to use another location. Run the following commands to create the needed
directory structure and to generate a random password for the database, which is stored
in `$POSTGRES_HOME/config/postgres-password`.
{{< highlight bash >}}
$ export POSTGRES_HOME=$COMMON/postgres
$ mkdir -p $POSTGRES_HOME/{config,db/data,run}
$ uuidgen > $POSTGRES_HOME/config/postgres-password
$ chmod 600 $POSTGRES_HOME/config/postgres-password
{{< /highlight >}}

### Start the PostgreSQL SLURM job

Use the following submit script to start the job for the database:

{{% panel theme="info" header="postgres.submit" %}}
{{< highlight bash >}}
#!/bin/bash
#SBATCH --time=168:00:00
#SBATCH --mem=8gb
#SBATCH --job-name=postgres_server
#SBATCH --error=postgres_server.err
#SBATCH --output=postgres_server.out
#SBATCH --licenses=common
#SBATCH --dependency=singleton
#SBATCH --signal=B:SIGINT@60

export POSTGRES_HOME=$COMMON/postgres
export POSTGRES_PASSWORD_FILE=$POSTGRES_HOME/config/postgres-password
export POSTGRES_USER=$USER
export POSTGRES_DB=mydb
export PGDATA=$POSTGRES_HOME/db/data
export POSTGRES_HOST_AUTH_METHOD=md5
export POSTGRES_INITDB_ARGS="--data-checksums"
# Choose a random port to avoid collisions with other jobs on the same node
export POSTGRES_PORT=$(shuf -i 2000-65000 -n 1)
echo "Postgres server running on $(hostname) on port $POSTGRES_PORT"
echo "This job started at $(date +%Y-%m-%dT%T)"
echo "This job will end at $(squeue --noheader -j $SLURM_JOBID -o %e) (in $(squeue --noheader -j $SLURM_JOBID -o %L))"
module load singularity
exec singularity run -B $POSTGRES_HOME/db:/var/lib/postgresql -B $POSTGRES_HOME/run:/var/run/postgresql docker://postgres:11 -c "port=$POSTGRES_PORT"
{{< /highlight >}}
{{% /panel %}}

This script starts a PostgreSQL server instance with the following properties:

- The superuser username is set to your HCC username, and the password is the random one generated earlier.
- The server listens on a random port to avoid collisions.
- The database name is `mydb`. This can be changed to whatever name you would like (some applications may require a specific name).
- Checksums on data pages are enabled to help detect corruption.
- Password authentication is required for security.

Additionally, the job is run with `--dependency=singleton` to ensure that only one instance (based on job name) is running
at a time. Duplicate jobs submitted afterwards will queue until the earlier job exits. The `--signal=B:SIGINT@60` option
instructs SLURM to send a shutdown signal to the PostgreSQL server 60 seconds before the job's time limit is reached. This
helps to avoid corruption by allowing the server to perform a graceful shutdown.

Once the job starts, check the `postgres_server.out` file for information on which host and port the server is listening on. For example:

{{< highlight bash >}}
Postgres server running on c1725.crane.hcc.unl.edu on port 10332
This job started at 2020-06-19T10:20:58
This job will end at 2020-06-19T10:50:57 (in 29:59)
{{< /highlight >}}

Here, the server is running on host `c1725.crane.hcc.unl.edu` on port 10332.
The output also contains information on when the job will end. This can be useful when submitting
the companion analysis job(s) that will use the database. It is recommended to adjust the requested walltime of
the analysis job(s) to ensure they end _before_ the database job does.

### Accessing the PostgreSQL instance

The server instance can be accessed using the hostname and port from the job output, along with your HCC username
and the random password generated during setup. The exact method will depend on your application. Take care to treat the
password in a secure manner.
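
For example, connectivity can be checked with the `psql` command-line client. The following is a
minimal sketch that runs `psql` from the same container image used by the server; the hostname
`c1725.crane.hcc.unl.edu` and port `10332` are the example values from the job output above and
should be replaced with the values from your own job:

{{< highlight bash >}}
$ export POSTGRES_HOME=$COMMON/postgres
$ module load singularity
$ PGPASSWORD=$(cat $POSTGRES_HOME/config/postgres-password) \
    singularity exec docker://postgres:11 \
    psql -h c1725.crane.hcc.unl.edu -p 10332 -U $USER -d mydb -c '\conninfo'
{{< /highlight >}}

Here `psql` reads the password from the `PGPASSWORD` environment variable, which avoids typing it
interactively or placing it on the command line.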
### Restarting the PostgreSQL instance

To restart the server, simply resubmit the same SLURM job as above. The first time the job is run, PostgreSQL
will create the database from scratch. Subsequent runs will detect the existing database and will not
overwrite it. Data entered into the database during previous runs will be available.

{{% notice info %}}
Each new instance of the server will run on a different host and port. You will need to update these
values before submitting subsequent analysis jobs.
{{% /notice %}}

### Submitting jobs that require PostgreSQL

The simplest way to manage jobs that need the database is to manually submit them after the PostgreSQL SLURM job
has started. However, this is not terribly convenient. A better way is
to use the dependency feature of SLURM. Submit the PostgreSQL job first and make a note of the job id. In the
submit script(s) of the analysis jobs, add the line

{{< highlight bash >}}
#SBATCH --dependency=after:<job id>
{{< /highlight >}}

replacing `<job id>` with the numeric job id noted before. This will instruct SLURM to only begin running
the analysis job(s) once the database job has begun.
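
Putting the pieces together, a companion analysis job might look like the sketch below. The job id
`1234567`, the `awk`-based parsing of `postgres_server.out`, and the final `psql` query are all
illustrative assumptions: substitute your actual database job id, and replace the query with your
application's real workload. The sketch also assumes it is submitted from the same directory where
`postgres_server.out` is written.

{{% panel theme="info" header="analysis.submit (example)" %}}
{{< highlight bash >}}
#!/bin/bash
#SBATCH --time=2:00:00
#SBATCH --mem=4gb
#SBATCH --job-name=postgres_analysis
#SBATCH --licenses=common
#SBATCH --dependency=after:1234567   # hypothetical id of the database job

export POSTGRES_HOME=$COMMON/postgres

# Parse the host and port from the database job's output file
DB_HOST=$(awk '/^Postgres server running/ {print $5}' postgres_server.out)
DB_PORT=$(awk '/^Postgres server running/ {print $8}' postgres_server.out)
export PGPASSWORD=$(cat $POSTGRES_HOME/config/postgres-password)

# Run the analysis; here a trivial query stands in for the real workload
module load singularity
singularity exec docker://postgres:11 \
    psql -h $DB_HOST -p $DB_PORT -U $USER -d mydb -c 'SELECT now();'
{{< /highlight >}}
{{% /panel %}}

Note the shorter walltime request relative to the database job's 168 hours, which helps ensure the
analysis finishes before the database job does.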