---
title: FAQ
summary: "HCC Frequently Asked Questions"
weight: 10
---
## Table of Contents
**Account Management**
- [I have an account, now what?](#i-have-an-account-now-what)
- [How can I change or reset my password?](#how-can-i-change-or-reset-my-password)
- [How do I (re)activate Duo?](#how-do-i-reactivate-duo)
- [I want to create an HCC account, but when I try to request one, I am getting the error "Your account email must match the email on record for this group". What should I do?](#i-want-to-create-an-hcc-account-but-when-i-try-to-request-one-i-am-getting-the-error-your-account-email-must-match-the-email-on-record-for-this-group-what-should-i-do)
- [I want to change my primary group or add an additional group to my HCC account](#i-want-to-change-my-primary-group-or-add-an-additional-group-to-my-hcc-account)
- [My account has been locked and I would like to gain access to it.](#my-account-has-been-locked-and-i-would-like-to-gain-access-to-it)
**Data Storage**
- [I just deleted some files and didn't mean to! Can I get them back?](#i-just-deleted-some-files-and-didnt-mean-to-can-i-get-them-back)
- [How can I check which directories utilize the most storage on Swan?](#how-can-i-check-which-directories-utilize-the-most-storage-on-swan)
- [I want to compress a large directory with many files. How can I do that?](#i-want-to-compress-a-large-directory-with-many-files-how-can-i-do-that)
- [I want to share data with others on Swan. How can I do that?](#i-want-to-share-data-with-others-on-swan-how-can-i-do-that)
**Job Submission**
- [How many nodes/memory/time should I request?](#how-many-nodesmemorytime-should-i-request)
- [I am trying to run a job but nothing happens?](#i-am-trying-to-run-a-job-but-nothing-happens)
- [I keep getting the error "slurmstepd: error: Exceeded step memory limit at some point." What does this mean and how do I fix it?](#i-keep-getting-the-error-slurmstepd-error-exceeded-step-memory-limit-at-some-point-what-does-this-mean-and-how-do-i-fix-it)
- [I keep getting the error "Some of your processes may have been killed by the cgroup out-of-memory handler." What does this mean and how do I fix it?](#i-keep-getting-the-error-some-of-your-processes-may-have-been-killed-by-the-cgroup-out-of-memory-handler-what-does-this-mean-and-how-do-i-fix-it)
- [I keep getting the error "Job cancelled due to time limit." What does this mean and how do I fix it?](#i-keep-getting-the-error-job-cancelled-due-to-time-limit-what-does-this-mean-and-how-do-i-fix-it)
- [My submitted job is waiting a long time in the queue or is not running. Why?](#my-submitted-job-is-waiting-a-long-time-in-the-queue-or-is-not-running-why)
- [Why is my job showing (ReqNodeNotAvail, Reserved for maintenance) before a downtime?](#why-is-my-job-showing-reqnodenotavail-reserved-for-maintenance-before-a-downtime)
- [My job is submitted to the highmem partition and is pending with QOSMinMemory reason. What does this mean?](#my-job-is-submitted-to-the-highmem-partition-and-is-pending-with-qosminmemory-reason-what-does-this-mean)
**Open OnDemand**
- [Why is my Open OnDemand JupyterLab or Interactive App Session stuck and not starting?](#why-is-my-open-ondemand-jupyterlab-or-interactive-app-session-stuck-and-not-starting)
- [My directories are not full, but my Open OnDemand JupyterLab Session is still not starting. What else should I try?](#my-directories-are-not-full-but-my-open-ondemand-jupyterlab-session-is-still-not-starting-what-else-should-i-try)
- [Why is my Open OnDemand RStudio Session crashing?](#why-is-my-open-ondemand-rstudio-session-crashing)
- [I need more resources than I can select with Open OnDemand Apps, can I do that?](#i-need-more-resources-than-i-can-select-with-open-ondemand-apps-can-i-do-that)
**Data Transfer**
- [Why can I not access files under a shared Attic/Swan Globus Collection?](#why-can-i-not-access-files-under-a-shared-atticswan-globus-collection)
- [I used Globus to copy my data across HCC file systems. How can I check that all data was successfully transferred and the data checksums match?](#i-used-globus-to-copy-my-data-across-hcc-file-systems-how-can-i-check-that-all-data-was-successfully-transferred-and-the-data-checksums-match)
**Account Offboarding**
- [I am graduating soon, what will happen with my HCC account?](#i-am-graduating-soon-what-will-happen-with-my-hcc-account)
- [A member of my HCC group left and I need access to the data under their directories.](#a-member-of-my-hcc-group-left-and-i-need-access-to-the-data-under-their-directories)
**HCC Support and Training**
- [I want to talk to a human about my problem. Can I do that?](#i-want-to-talk-to-a-human-about-my-problem-can-i-do-that)
- [Can HCC provide training for my group?](#can-hcc-provide-training-for-my-group)
- [Can HCC provide help and resources for my workshop?](#can-hcc-provide-help-and-resources-for-my-workshop)
- [Where can I get training on using HCC resources?](#where-can-i-get-training-on-using-hcc-resources)
**Networking**
- [What IPs do I use to allow connections to/from HCC resources?](#what-ips-do-i-use-to-allow-connections-tofrom-hcc-resources)
---
## Account Management
#### I have an account, now what?
Congrats on getting an HCC account! Now you need to connect to a Holland
cluster. To do this, we use an SSH connection. SSH stands for Secure
Shell, and it allows you to securely connect to a remote computer and
operate it just like you would a personal machine.
Depending on your operating system, you may need to install software to
make this connection. Check out our documentation on [Connecting to HCC Clusters](/connecting/).
Additional details on next steps and important links for new account holders are available [here in our documentation](/FAQ/new_account/).
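As a quick example, from a terminal on Mac, Linux, or a recent version of Windows, you can typically connect to the Swan cluster with a command like the following (a sketch; replace `<username>` with your HCC username, and note that the exact prompts depend on your setup):
```bash
# Connect to Swan; you will be prompted for your HCC password and then for Duo
ssh <username>@swan.unl.edu
```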
#### How can I change or reset my password?
Information on how to change or retrieve your password can be found on
the documentation page: [How to change your password](/accounts/how_to_change_your_password)
All passwords must be at least 8 characters in length and must contain
at least one capital letter and one numeric digit. Passwords also cannot
contain any dictionary words. If you need help picking a good password,
consider using a (secure!) password generator such as
[this one provided by Random.org](https://www.random.org/passwords)
To preserve the security of your account, we recommend changing the
default password you were given as soon as possible.
#### How do I (re)activate Duo?
!!! info "If you have not activated Duo before:**"
Please join our [Remote Open Office hours](https://hcc.unl.edu/OOH) or schedule another remote
session at [hcc-support@unl.edu](mailto:hcc-support@unl.edu) and show your photo ID and we will be happy to activate it for you.
!!! info "If you have activated Duo previously but now have a different phone number:"
Join our [Remote Open Office hours](https://hcc.unl.edu/OOH) or schedule another remote
session at [hcc-support@unl.edu](mailto:hcc-support@unl.edu) and show your photo ID and we will be happy to activate it for you.
!!! info "If you have activated Duo previously and have the same phone number:"
Email us at
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
from the email address your account is registered under and we will send
you a new link that you can use to activate Duo.
#### I want to create an HCC account, but when I try to request one, I am getting the error "Your account email must match the email on record for this group". What should I do?
This error message indicates that you have probably selected the *"I am the owner of this group and this account is for me."* checkbox when filling out the New User Request Form.
This checkbox should be selected only by the owner of the HCC group.
If you are not the owner of the HCC group, please leave this checkbox unselected and submit the form again.
#### I want to change my primary group or add an additional group to my HCC account
If you would like to change or add groups to your account, such as a class group or additional research group, please fill out our [Group Modification Request Form](https://hcc.unl.edu/group-addchange-request). Once the change is approved by the new group owner, HCC staff will make the adjustments.
#### My account has been locked and I would like to gain access to it.
HCC automatically locks inactive accounts after 1 year for security purposes.
If you would like to reactivate your HCC account in the original group, please email [hcc-support@unl.edu](mailto:hcc-support@unl.edu) and HCC staff will start the process.
If you want to activate it under a new group, please fill out our [Group Modification Request Form](https://hcc.unl.edu/group-addchange-request). Once the change is approved by the new group owner, HCC staff will make the adjustments.
---
## Data Storage
#### I just deleted some files and didn't mean to! Can I get them back?
That depends. Where were the files you deleted?
!!! info "**If the files were in your $HOME directory (/home/group/user/)**"
**It's possible.**
$HOME directories are backed up daily and we can restore your files as
they were at the time of our last backup. Please note that any changes
made to the files between when the backup was made and when you deleted
them will not be preserved. To have these files restored, please contact
HCC Support at
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
as soon as possible.
!!! info "**If the files were in your $WORK directory (/work/group/user/) or $NRDSTOR directory ({{ hcc.nrdstor.path }})**"
**No.**
Unfortunately, the $WORK directories are created as a short term place
to hold job files. This storage was designed to be quickly and easily
accessed by our worker nodes and as such is not conducive to backups.
Any irreplaceable files should be backed up in a secondary location,
such as Attic, the cloud, or on your personal machine. For more
information on how to prevent file loss, check out [Preventing File
Loss](/handling_data/data_storage/preventing_file_loss/).
#### How can I check which directories utilize the most storage on Swan?
You can run `ncdu` from the Swan terminal on the location in question and (re)move directories and data if needed, for example:
```bash
ncdu $HOME/my-folder
```
!!! note
If you have thousands or millions of files in a location on Swan, please run `ncdu` only on a sub-directory you suspect may contain large numbers of files.
You may also use `ncdu` on locations in $WORK or $COMMON. Note that running `ncdu` puts additional load on the filesystem(s), so **please run it sparingly**.
HCC suggests running `ncdu` once and saving the output to a file; `ncdu` will read from this file instead of potentially scanning the filesystem multiple times.
To run `ncdu` in this manner, first scan the location using the `-o` option
```bash
ncdu -o ncdu_output.txt $HOME/my-folder
```
Then use the `-f` option to start `ncdu` interactively using this file:
```bash
ncdu -f ncdu_output.txt
```
Note that re-reading the filesystem to see changes in real time is not supported in this mode. After making changes (deleting/moving files), a new output file
will need to be created and read by repeating the steps above.
#### I want to compress a large directory with many files. How can I do that?
In general, we recommend using `zip` as the archive format as `zip` files keep an index of the files.
Moreover, `zip` files can be quickly indexed by the various `zip` tools, and allow extraction of all files or a subset of files.
To compress the directory named `input_folder` into `output.zip`, you can use:
```bash
zip -r output.zip input_folder/
```
If you don't need to list and extract subsets of the archived data, we recommend using `tar` instead.
To compress the directory named `input_folder` into `output.tar.gz`, you can use:
```bash
tar zcf output.tar.gz input_folder/
```
Depending on the size of the directory and number of files you want to compress, you can perform the compressing via an [Interactive Job](/submitting_jobs/creating_an_interactive_job/) or [SLURM job](/submitting_jobs/).
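As a rough sketch (the resource values are placeholders you should adjust for your data), a minimal SLURM submit script that runs the `tar` command above could look like this:
```bash
#!/bin/bash
#SBATCH --job-name=compress          # placeholder job name
#SBATCH --ntasks=1
#SBATCH --mem=4gb                    # placeholder memory request
#SBATCH --time=04:00:00              # placeholder runtime
#SBATCH --output=compress.%J.out
#SBATCH --error=compress.%J.err

cd $WORK/my-project                  # hypothetical path containing input_folder
tar zcf output.tar.gz input_folder/
```
Save it as, for example, `compress.submit` and submit it with `sbatch compress.submit`.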
#### I want to share data with others on Swan. How can I do that?
There are multiple methods of sharing data on Swan, including file permissions and Globus.
NRDStor by default has a shared directory for every group at `{{hcc.nrdstor.path}}/group_name/shared`. The $WORK filesystem can also have a shared folder, but it needs to be requested by emailing [hcc-support@unl.edu](mailto:hcc-support@unl.edu), and HCC staff will start the process.
More details on data sharing are available in the [Sharing Data on Swan documentation](/handling_data/swan_data_sharing/).
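As a minimal illustration of the file-permission approach (the linked documentation covers the recommended details, and the paths below are hypothetical placeholders), standard Unix permissions can grant group members read access to a directory you own:
```bash
# Hypothetical example: let members of your Unix group read a directory under $WORK
# (every parent directory must also allow group traversal, i.e. g+x)
chmod -R g+rX /work/group_name/your_username/data_to_share
```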
---
## Job Submission
#### How many nodes/memory/time should I request?
**Short answer:** We don’t know.
**Long answer:** The amount of resources required is highly dependent on
the application you are using, the input file sizes and the parameters
you select. Sometimes it can help to speak with someone else who has
used the software before to see if they can give you an idea of what has
worked for them.
Ultimately, it comes down to trial and error; try different
combinations and see what works and what doesn’t. Good practice is to
check the output and utilization of each job you run. This will help you
determine what parameters you will need in the future.
For more information on how to determine how many resources a completed
job used, check out the documentation on [Monitoring Jobs](/submitting_jobs/monitoring_jobs/).
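As one example, `sacct` can summarize how much time and memory a finished job actually used, which you can compare against what you requested (a sketch; the fields available can vary by site):
```bash
# Replace <jobid> with your job's ID; MaxRSS is the peak memory used, Elapsed the wall time
sacct -j <jobid> --format=JobID,JobName,Elapsed,MaxRSS,ReqMem,State
```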
#### I am trying to run a job but nothing happens?
Where are you trying to run the job from? You can check this by typing
the command `pwd` into the terminal.
**If you are running from inside your $HOME directory
(/home/group/user/)**:
Move your files to your $WORK directory (/work/group/user) and resubmit
your job. The $HOME folder is not meant for job output. You may be attempting
to write too much data from the job.
**If you are running from inside your $WORK directory:**
Contact us at
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
with your login, the name of the cluster you are running on, and the
full path to your submit script and we will be happy to help solve the
issue.
#### I keep getting the error "slurmstepd: error: Exceeded step memory limit at some point." What does this mean and how do I fix it?
This error occurs when the job you are running uses more memory than was
requested in your submit script.
If you specified `--mem` or `--mem-per-cpu` in your submit script, try
increasing this value and resubmitting your job.
If you did not specify `--mem` or `--mem-per-cpu` in your submit script,
chances are the default amount allotted is not sufficient. Add the line
```bash
#SBATCH --mem=<memory_amount>
```
to your script with a reasonable amount of memory and try running it again. If you keep
getting this error, continue to increase the requested memory amount and
resubmit the job until it finishes successfully.
For additional details on how to monitor usage on jobs, check out the
documentation on [Monitoring Jobs](/submitting_jobs/monitoring_jobs/).
If you continue to run into issues, please contact us at
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
for additional assistance.
#### I keep getting the error "Some of your processes may have been killed by the cgroup out-of-memory handler." What does this mean and how do I fix it?
This is another error that occurs when the job you are running uses more memory than was
requested in your submit script.
If you specified `--mem` or `--mem-per-cpu` in your submit script, try
increasing this value and resubmitting your job.
If you did not specify `--mem` or `--mem-per-cpu` in your submit script,
chances are the default amount allotted is not sufficient. Add the line
```bash
#SBATCH --mem=<memory_amount>
```
to your script with a reasonable amount of memory and try running it again. If you keep
getting this error, continue to increase the requested memory amount and
resubmit the job until it finishes successfully.
For additional details on how to monitor usage on jobs, check out the
documentation on [Monitoring Jobs](/submitting_jobs/monitoring_jobs/).
If you continue to run into issues, please contact us at
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
for additional assistance.
#### I keep getting the error "Job cancelled due to time limit." What does this mean and how do I fix it?
This error occurs when the job you are running reached the time limit that was
requested in your submit script without finishing successfully.
If you specified `--time` in your submit script, try
increasing this value and resubmitting your job.
If you did not specify `--time` in your submit script,
chances are the default runtime of 1 hour is not sufficient. Add the line
```bash
#SBATCH --time=<runtime>
```
to your script with an increased runtime value and try running it again. The maximum runtime on Swan
is 7 days (168 hours).
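For reference, `--time` accepts several formats; for example, these two lines are equivalent ways of requesting the 7-day maximum:
```bash
#SBATCH --time=7-00:00:00   # days-hours:minutes:seconds
#SBATCH --time=168:00:00    # hours:minutes:seconds
```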
For additional details on how to monitor usage on jobs, check out the
documentation on [Monitoring Jobs](/submitting_jobs/monitoring_jobs/).
If you continue to run into issues, please contact us at
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
for additional assistance.
#### My submitted job is waiting a long time in the queue or is not running. Why?
If your submitted jobs are waiting a long time in the queue, it usually means your account has been heavily utilized recently and its fairshare score is low. This can happen if you have submitted a large number of jobs over the past period of time, and/or if the amount of resources (memory, time) you requested for your jobs is large.
For additional details on how to monitor usage on jobs, check out the documentation on [Monitoring queued Jobs](/submitting_jobs/monitoring_jobs/).
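If you would like to investigate yourself, the standard SLURM tools below show your fairshare information and the scheduler's estimated start time for pending jobs (a sketch; output formats vary by site):
```bash
# Show fairshare usage for your user
sshare -u $USER
# Show the estimated start time of your pending jobs
squeue -u $USER --start
```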
#### Why is my job showing (ReqNodeNotAvail, Reserved for maintenance) before a downtime?
Jobs submitted before a downtime may pend and show _(ReqNodeNotAvail, Reserved for maintenance)_ for their status.
(Information on upcoming downtimes can be found at [status.hcc.unl.edu](https://status.hcc.unl.edu/).)
Any job which cannot finish before a downtime is scheduled to begin will pend and show this message. For example,
the downtime starts in 6 days but the script is requesting (via the `--time` option) 7 days of runtime.
If you are sure your job can finish in time, you can lower the requested time to be less than the interval before
the downtime begins (for example, 4 days if the downtime starts in 6 days). Use this with care however to ensure your
job isn't prematurely terminated. Alternatively, you can simply wait until the downtime is completed. Jobs will
automatically resume normally afterwards; no special action is required.
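If you prefer to lower the time limit of a job that is already pending rather than resubmitting it, this can typically be done with `scontrol` (a sketch; depending on site configuration, users can lower, but not raise, their own job's time limit):
```bash
# Lower the time limit of pending job <jobid> to 4 days so it can fit before the downtime
scontrol update JobId=<jobid> TimeLimit=4-00:00:00
```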
#### My job is submitted to the highmem partition and is pending with QOSMinMemory reason. What does this mean?
The majority of nodes in the `batch` partition on Swan have 256 GB of RAM, with a few nodes having up to 2 TB of RAM. To ensure that jobs which require lots of memory run on the nodes with more RAM, SLURM uses the `highmem` partition, which is part of the `batch` partition. **This is not an actual partition, so it cannot be used separately.** SLURM internally submits the job to both the `highmem` and `batch` partitions and, depending on the requested RAM, allocates the requested resources. During this process, when checking the job status, you may see:
```bash
$ squeue -u demo
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1000000 highmem,b job_name demo PD 0:00 1 (QOSMinMemory)
```
This message means that the job does not require high memory and it will be submitted to the `batch` partition when the requested resources are available. Once this internal process is completed, the `NODELIST(REASON)` message will be updated accordingly.
Please note that `highmem,b` is truncated from `highmem,batch`. The expanded output can be seen with:
```bash
$ squeue -u demo -o "%.18i %.20P %.8j %.8u %.2t %.10M %.6D %R"
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1000000 highmem,batch job_name demo PD 0:00 1 (QOSMinMemory)
```
!!! note
The number of nodes with high memory is limited, so please only request high amounts of memory if the job really needs it. Otherwise, you may encounter longer waiting times, lower submission priority and underutilized resources.
---
## Open OnDemand
#### Why is my Open OnDemand JupyterLab or Interactive App Session stuck and not starting?
The most common reason for this is a full `$HOME` directory. You can check the size of the directories in `$HOME` by running `ncdu` on the terminal from the `$HOME` directory.
Then, please remove any unnecessary data; move data to [$COMMON or $WORK](/handling_data/); or [back up important data elsewhere](/handling_data/data_storage/preventing_file_loss/).
If the majority of storage space in `$HOME` is utilized by `conda` environments, please [move the conda environments](/applications/user_software/using_anaconda_package_manager/#moving-and-recreating-existing-environment) or [remove unused conda packages and caches](/applications/user_software/using_anaconda_package_manager/#remove-unused-anaconda-packages-and-caches).
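As a quick illustration (the linked Anaconda pages describe the recommended steps in detail), you can check how much space conda-related directories take and clear cached packages:
```bash
# Check the size of common conda/pip locations in $HOME (paths are typical examples and may differ)
du -sh $HOME/.conda $HOME/.cache/pip 2>/dev/null
# Remove cached package tarballs, index caches, and unused packages (asks for confirmation)
conda clean --all
```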
#### My directories are not full, but my Open OnDemand JupyterLab Session is still not starting. What else should I try?
If the inode and storage quotas of the `$HOME` and `$WORK` directories on Swan are not exceeded, and your Open OnDemand JupyterLab Session is still not starting, there are two additional things you can check:
- If a custom conda environment is not used, packages installed locally with `pip` end up in `$HOME/.local`. When using other modules or applications that are based on Python/conda (e.g., Open OnDemand JupyterLab), these local installs can cause conflicts and errors. In this case, please rename the `$HOME/.local` directory (e.g., `mv $HOME/.local $HOME/.local_old`).
- Please make sure that you don't set variables such as `PYTHONPATH` in your `$HOME/.bashrc` file. If you have this variable set in the `$HOME/.bashrc` file, please comment out that line and run `source $HOME/.bashrc` to apply the changes.
Whether you have renamed the `$HOME/.local` directory and/or modified the file `$HOME/.bashrc`, please cancel and restart your JupyterLab Session.
#### Why is my Open OnDemand RStudio Session crashing?
There are two main reasons why this may be happening:
1) The requested RAM is not enough for the analyses you are performing. In this case, please terminate your running Session and start a new one requesting more RAM.
2) Some R packages installed as part of the OOD RStudio App may be incompatible with each other. In this case, please terminate your running Session and rename the directory where these packages are installed (e.g., `mv $HOME/R $HOME/R.bak`). To reduce the number of R packages you need to install, please use a specific variant such as Bioconductor, Tidyverse, or Geospatial when needed, rather than, for example, installing Bioconductor packages in the OOD RStudio Basic variant.
#### I need more resources than I can select with Open OnDemand Apps, can I do that?
The Open OnDemand Apps are meant for learning, development, and light testing, and they have limited resources compared to what is available for batch submissions. If the resources provided by the OOD Apps are not enough, you should migrate your workflow to a batch script.
---
## Data Transfer
#### Why can I not access files under a shared Attic/Swan Globus Collection?
On some occasions, errors such as _"Mapping collection to specified ID failed."_ may occur when accessing files from a shared Attic/Swan Globus Collection.
In order to resolve this issue, the owner of the collection needs to log in to Globus and activate the `hcc#attic` or `hcc#swan` endpoint, respectively.
This should reactivate the correct permissions for the collection.
#### I used Globus to copy my data across HCC file systems. How can I check that all data was successfully transferred and the data checksums match?
Globus automatic file integrity verification using checksums is turned off for HCC-specific Globus collections/endpoints.
Performing automatic file integrity verification with checksums via Globus reads and compares all source and destination files again, which adds significant I/O load on the HCC filesystems.
All HCC-specific filesystems already have built-in integrity checksums, so additional verification is not needed.
If the status of your Globus transfer is `SUCCEEDED`, then the checksums of the source and destination files should match.
If you would still like to compare the checksums of the transferred files regardless of the Globus transfer status, there are two ways to do this:
- Start another Globus transfer and select _both_ **"sync - only transfer new or changed files"** and **"checksum is different"** under _"Transfer & Timer Options"_ under the _File Manager_ Globus tab.
- Run `rsync` with the `--checksum` option using an [Interactive Session](https://hcc.unl.edu/docs/submitting_jobs/creating_an_interactive_job/) or [SLURM job](https://hcc.unl.edu/docs/submitting_jobs/) (see the example below).
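For the `rsync` option, a dry-run comparison such as the following lists any files whose contents differ without transferring anything (a sketch; the paths are placeholders):
```bash
# -a preserve attributes, -v verbose, -n dry run, --checksum compare file contents instead of size/mtime
rsync -avn --checksum /source/path/ /destination/path/
```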
---
## Account Offboarding
#### I am graduating soon, what will happen with my HCC account?
Access to HCC resources is separate from access to NU resources, so you do not lose access to HCC when you graduate.
- If the HCC account is part of a research group, the account will remain active until the owner of the group requests that it be deactivated or until the account has not been used for at least a year, whichever comes first.
- If the account holder continues collaborating with the HCC group owner as an outside collaborator, proof of collaboration may be required. For more information on the user regulations, please see [here](https://hcc.unl.edu/hcc-policies#user-regulations).
- If the account is only part of a course group, then according to [our class policy](https://hcc.unl.edu/hcc-policies#class-groups), the account will be deactivated one week after the course end date.
#### A member of my HCC group left and I need access to the data under their directories.
User directories under `{{ hcc.swan.home.path }}`, `{{ hcc.swan.work.path }}`, and `{{ hcc.nrdstor.path }}` by default are only accessible by the individual user account for those directories.
If you need access to a user directory under HCC's filesystems, please email [hcc-support@unl.edu](mailto:hcc-support@unl.edu) and HCC staff will start the process.
---
## HCC Support and Training
#### I want to talk to a human about my problem. Can I do that?
Of course! We have an open door policy and invite you to join our [Remote Open Office hours](https://hcc.unl.edu/OOH), schedule a remote
session at [hcc-support@unl.edu](mailto:hcc-support@unl.edu), or you can drop one of us a line and we'll arrange a time to meet: [Contact Us](https://hcc.unl.edu/contact-us).
#### Can HCC provide training for my group?
HCC can provide introductory training (up to 2 hours) for groups of more than 2 people on request, via Zoom or in person.
Before submitting a request for training, please ensure everyone who will be attending has an active HCC account **and** has activated DUO for their HCC account.
Training requests can be submitted to [hcc-support@unl.edu](mailto:hcc-support@unl.edu).
Please include:
- A list of available times
- How many will be attending
- Preference on Zoom or on-site training
- If on-site, what location is it at?
An HCC staff member will reach out to confirm the date and location.
HCC also provides virtual Open Office Hours every Tuesday and Thursday from 2-3 PM. More details are available on the [Office Hours webpage](https://hcc.unl.edu/OOH).
#### Can HCC provide help and resources for my workshop?
We are happy to help with your workshop! We are able to provide up to 40 demo accounts for participants who don't already have HCC accounts and can have a staff member on-site to provide assistance with issues related to HCC and answering HCC related questions.
Please submit your request at least **1 month in advance**. Requests may not be fulfilled depending on staff availability. It is strongly recommended to involve HCC staff during the initial planning of the hands-on portion of the workshop in order to provide a smooth and timely experience with HCC resources.
Before submitting a request for workshop support, please fully test your materials using Swan and provide us a complete list of any software packages and environments that are needed.
Workshop support requests can be submitted to [hcc-support@unl.edu](mailto:hcc-support@unl.edu).
Please include:
- The date(s) and time(s) HCC will be utilized during the workshop, including any necessary setup.
- How many will be attending. If possible, please provide how many already have HCC accounts.
- Location of the workshop
- A list of software packages or environments that everyone would need.
- If you are using Open OnDemand JupyterLab or RStudio, we can create a custom kernel/image for the purpose of the workshop.
- If you are using a conda environment, we can create a shared environment for participants to use, without participants needing to create their own.
An HCC staff member will reach out to confirm the date and location.
#### Where can I get training on using HCC resources?
HCC provides free and low cost training events throughout the year. Most events are held in-person, but some will be hybrid or Zoom.
New events are posted on our [upcoming events page](https://hcc.unl.edu/upcoming-events) and announced through our [hcc-announce mailing list](https://hcc.unl.edu/subscribe-mailing-list).
Past events and their materials are also available on our [past events page](https://hcc.unl.edu/past-events).
---
## Networking
#### What IPs do I use to allow connections to/from HCC resources?
Under normal circumstances no special network permissions are needed to access HCC resources. Occasionally, it may be necessary to whitelist the public IP
addresses HCC utilizes. Most often this is needed to allow incoming connections for an external-to-HCC license server, but may also be required
if your local network blocks outgoing connections. To allow HCC IPs, add the following ranges to the whitelist:
```
129.93.175.0/26
129.93.227.64/26
129.93.241.16/28
```
If you are unsure how to do this, contact your local IT support staff for assistance.
For additional questions or issues with this, please [Contact Us](https://hcc.unl.edu/contact-us).
---
title: I have an HCC account, now what?
summary: "Information after getting a new HCC account"
---
- [Important links and information](#important-links-and-information)
- [HCC Support](#hcc-support)
- [HCC Resource Status](#hcc-resource-status)
- [Data Storage](#data-storage)
- [Getting started running jobs on HCC resources.](#getting-started-running-jobs-on-hcc-resources)
- [Account 2FA and Password](#account-2fa-and-password)
- [I forgot my password, how can I retrieve it?](#i-forgot-my-password-how-can-i-retrieve-it)
- [If I get a new phone, how do I (re)activate Duo?](#if-i-get-a-new-phone-how-do-i-reactivate-duo)
---
#### I have requested an account, what are my next steps?
Once you request an account, HCC staff will wait for a confirmation from either your instructor or lab group advisor.
When HCC receives this confirmation, your account will be created. A temporary password will be sent to your email.
However, you will not be able to sign in just yet. First, you will need to set up multifactor authentication.
- If you are a part of the NU System (UNL, UNO, UNK, UNMC), HCC will ask if you would like to use the phone number tied to your TrueYou account. Once you confirm this number, HCC staff will have a text message sent by Duo to activate multifactor authentication for HCC resources.
- If you are not a part of the NU system or wish to use a different phone number than what is in TrueYou, you will need to attend one of our remote office hours where an HCC staff member will assist you.
- If you do not wish to use your smartphone to authenticate with HCC resources, you can use a YubiKey with HCC resources. More information on YubiKeys is available [here](/accounts/setting_up_and_using_duo/#yubikeys)
It is strongly recommended to [change your password](/accounts/how_to_change_your_password/) as soon as possible.
At this point, you can now connect to and begin using HCC resources.
#### How do I connect to HCC resources?
Congrats on getting an HCC account! Now you need to connect to an HCC resource.
To do this, we can use an SSH connection or we can use a web browser to access the online web portal, Open OnDemand.
Both methods allow you to securely connect to a remote HCC resource and
operate it just like you would a personal machine.
Check out our documentation on [Connecting to HCC Clusters using SSH](/connecting/) and on [using Open OnDemand](/open_ondemand/).
## Important links and information
#### HCC Support
- **Real time and email support**
We have an open door policy and invite you to join our [Remote Open Office hours](https://hcc.unl.edu/OOH), schedule a remote or in-person session at [hcc-support@unl.edu](mailto:hcc-support@unl.edu), or you can drop one of us a line and we'll arrange a time to meet: [Contact Us](https://hcc.unl.edu/contact-us).
- **Class and group tutorial and introduction sessions**
If your class or group is interested in an introductory session, please [email us](mailto:hcc-support@unl.edu) and we will work to schedule a time.
- **Training events**
We also host training events throughout the year that are announced via email to all with HCC accounts and viewable on our [Upcoming Events](https://hcc.unl.edu/upcoming-events) page. Our materials are also available for prior events on our [Past Events](https://hcc.unl.edu/past-events) page.
- **HCC Courses**
In addition to live events, we also have courses using Bridge, available on our [HCC Courses](https://hcc.unl.edu/hcc-courses) page.
#### HCC Resource Status
From time to time, HCC resources may be unavailable due to scheduled maintenance or unforeseen issues. Upcoming maintenance will be announced ahead of time on the [status page](https://status.hcc.unl.edu/) and via email. Any unforeseen issues will be posted to the [status page](https://status.hcc.unl.edu/) once HCC staff have identified the issue.
#### Data Storage
- **Where to store data:** HCC has 5 filesystems available for use. Specific details on the file systems are available on our [data storage documentation.](/handling_data/data_storage/)
- **Key information:** The main thing to note is that the Common, Work, and NRDStor filesystems do not have any backups. Any files deleted on these filesystems are **permanently lost**.
- **Purge policy:** The Work filesystem also has a purge policy in place to delete old and unused files to keep the file system performant. There is **no** purge policy on Attic, Home, Common, or NRDStor.
Attic and Home are the only file systems with backups available.
- **Preventing file loss:** It is _**critical**_ to backup any important data related to your class, creative, or research activities. HCC provides a [guide](/handling_data/data_storage/preventing_file_loss/) on how to backup important data and prevent file loss.
## Getting started running jobs on HCC resources.
Once you have your account setup and reviewed the important information above, you can begin conducting class, creative, or research activities on HCC resources.
For Swan, you will need to take a few steps.
1. [Transfer data to HCC Clusters](/handling_data/)
2. [Check software availability](/applications/)
3. [Submit jobs on HCC Clusters](/submitting_jobs/) or use [Open OnDemand](/open_ondemand/)
## Account 2FA and Password
#### I forgot my password, how can I retrieve it?
Information on how to change or retrieve your password can be found on
the documentation page: [How to change your
password](/accounts/how_to_change_your_password)
All passwords must be at least 8 characters in length and must contain
at least one capital letter and one numeric digit. Passwords also cannot
contain any dictionary words. If you need help picking a good password,
consider using a (secure!) password generator such as
[this one provided by Random.org](https://www.random.org/passwords)
To preserve the security of your account, we recommend changing the
default password you were given as soon as possible.
#### If I get a new phone, how do I (re)activate Duo?
**If you have not activated Duo before:**
If you are a part of the NU System, email us at [hcc-support@unl.edu](mailto:hcc-support@unl.edu)
from the email address your account is registered under, include the last 4 digits of your phone number, and we will send you a link that you can use to activate Duo.
Otherwise, please join our [Remote Open Office hours](https://hcc.unl.edu/OOH) or schedule a remote
session at [hcc-support@unl.edu](mailto:hcc-support@unl.edu) and show your photo ID and we will be happy to activate it for you.
**If you have activated Duo previously but now have a different phone number:**
Join our [Remote Open Office hours](https://hcc.unl.edu/OOH) or schedule another remote
session at [hcc-support@unl.edu](mailto:hcc-support@unl.edu) and show your photo ID and we will be happy to activate it for you.
**If you have activated Duo previously and have the same phone number:**
Email us at [hcc-support@unl.edu](mailto:hcc-support@unl.edu)
from the email address your account is registered under and we will send
you a new link that you can use to activate Duo.
---
title: SSH host keys
summary: "SSH keys for HCC services"
---
- [Summary](#summary)
- [Swan](#swan)
- [Crane](#crane)
---
## Summary
SSH host keys securely identify SSH servers. When an SSH client connects to a server, the server presents a host key to the client. The client confirms the key has not changed from previous connections.
HCC tries to keep the SSH keys unchanged throughout the lifetime of a service, but security requirements evolve, and key changes may be necessary. If you get a key change warning, please confirm the fingerprint against the list below before updating your SSH client known hosts list. If the key fingerprint does not match, do not accept the new key and *do not enter* your credentials, and please contact HCC.
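If a key change is legitimate and you have verified the new fingerprint against this page, the stale entry can be inspected and removed with standard OpenSSH tooling, for example:
```bash
# Show the entries (and their fingerprints) currently stored for the host in your known_hosts
ssh-keygen -l -F swan.unl.edu
# After verifying the new fingerprint, remove the stale entry so the new key can be accepted
ssh-keygen -R swan.unl.edu
```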
### Example warning message
```
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
```
[//]: # ( for i in /etc/ssh/ssh_host_*pub ; do ssh-keygen -lf $i ; done )
## Swan
### swan.unl.edu fingerprints
```
SHA256:qcyi6CEw1gUgumEghA+TcXFmu39MAO4Pyrt8rT6+ymk (ECDSA)
SHA256:SrpwIZFSaZ3Nt6Ne9PW/7SSHXo1sdT0QnputriPAmA0 (ED25519)
SHA256:GfkTzeP/gWn0NChgWwAqOpuSVWPNtXbjlVqy2pyRGlk (RSA)
```
### swan-xfer.unl.edu fingerprints
```
SHA256:TvONmFeLVTA3IyA1IkGqzcLnLSOYZ2lkOWUesQ33nE0 (ECDSA)
SHA256:hY4dkI8ngY//lwuOC3sUgZtRNnMl6zkWPX10ptlSgiY (ED25519)
SHA256:l8pMknftfqRVtFF+BQc2WXwbZ23QhnjbG2erzPLzrGc (RSA)
```
### known\_hosts
```
swan.unl.edu ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBGJgEGqgkU3g8tzgedOXnNmGvBmgU8wPHnoFW1MDREhfdsDwyOvq+Pu+O+vSf1B4f3Krl49VkDhk1/kzMOSa/2U=
swan.unl.edu ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFH3i+E4EKT20y+tXmnizsXN2c6Lg2SlaGjsbERegll6
swan.unl.edu ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDFjkdZ8i/zRJ1XVopK4n4YNVgNG7eKMvrUXh2gHFXPkWKoIfUAnTywZIRx/iWRTmgWMn9QPIiD9cyULFbeqOCgXg0NoBlrUxThoKVF7TlXd4XXW7cK8n3a8/H/c8PrRON+5bNE6La+b1zwnTA5KbPYkSJ4A0F622c+LH2EEta0RYzEuskfNY7l8gBWpvtX11Xj2nOstXW543xHFZETqW4C2Cz7dYxTxU1kduZPSUFYb9SAXOfr9cnUn9z1+HzfWv9s3E3pn8irwP7YVDWxEXJi4atiHnfcKhWI/sY1TDhbFlJL0mMKqeBpKlP0LR4B4ck8XaXpPqd4QpDbFtqY8Vmol8pxtjkFydPOozbshvs7MWMwfQsXqMfZRlVOJ9ecitognxn3nIffmpZpXFB+lfyYrZvCXr74MVF2QXeZzX0Js0Y7q0Xst+lacRnt/cF4grH6Z0k8LTV+CiuzUvxcjwePLHBFYmtD2bN/H418Y7IOEHsGQd5xaYMrxidwOB0DQjE=
swan-xfer.unl.edu ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCWWKy921NTqcvbDISuOmNnViBegfwVk3Ni25Z0e8GBmTPXPDCyoVGCwAyXtiqxVFjPwNt9/zD8VWi/hInE9O8o=
swan-xfer.unl.edu ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDKgahWZaLmYmGfIOigoAGaW2Wpj84UzrpmgnXvkxMtN
swan-xfer.unl.edu ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCxyaMW7B/ifN8AyNIfX89fpzg8zE9SYTxDwqRKkW9giUQKER7GNZsF+WHlEkdkHcmv51jvFHNTwk4ZB0g/lZNAGxqJFbUIHf3K0gYCdji5MVQXqvMT40m8Flup8gxJWG07D3rqj2qeDlGxkMJQ1gUUsN90Exh---2rVkaOHMwjST2Q2HqklGdgDt9m3cgfAqhofLGNnQBUJCSy1fcpl58kJflMQXvnWvmAsUbzxoelqbifQGUgl9pdUDm+MJRAxnTI+KlNrIvDqokHXvb5Sy7UqLxVE8sg2CWmrfCReEnKzAM4cQ22bekRh9MrZqU9gLgt5Ez0DT1caFCk5xo+7MTYOfo3wd6KG+5flY+7MsgTVPHlFO8G/vZVKk1hTuPkOyByrzlbQ/IEzMQYXun6v/CKgiMvh2/rkAKX9oRnXVQjCq9o4EzKE6bkP4dYDNfdKePXANrS2J7mOpgQgKTzH4rO3wxnDG8BfOBctnyCeZo8GuRvRJoiBYD8f3SdeRSabXWU=
```
## Crane
### crane.unl.edu fingerprints
```
SHA256:0C+JzhK+Lw0Sm3HBcqn84vLIVr79ewp548Q4EnMikRQ (ECDSA)
SHA256:s/ZLlfcinYzEkamJ15Htek+0tpKlwo5uq+HajcxowRE (ED25519)
SHA256:GDH3+iqSp3WJxtUE6tXNQcWRwpf0xjYgkQrYBDX3Ir0 (RSA)
```
### crane-xfer.unl.edu fingerprints
```
SHA256:k9w+Khxea9ZZCuayJc+YIFi+9LRV68QPHXl7OSEANis (ECDSA)
SHA256:V64bGApa8S1MbvfkFQGEUL9xx7RWOzWetorPpGyBTMw (ED25519)
SHA256:WfS6HcnjbOrD8z11C1RYgF4TNZsqUnM/ZBpoTO2J4Pg (RSA)
```
### known\_hosts
```
crane.unl.edu ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBKFSyNCtoEMiMCLyvh6FsoFhfmGtJEbmeDqfoqi3tU2WaX7Dtc6Ti0k87MxfgPmLSn+wqSXDUxNL2eMdzuy9LRg=
crane.unl.edu ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILqREOJ9NtVi9d7p0pBBOumIbUJxc1Io3847pJacp8Ct
crane.unl.edu ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAv0LS4DwF7vdSk0jz5/otXyGn93hieUXu9ZFQgAVZLHyOYveBYSTyW9GCicAGtBBePbt3PvGzYARLHa9CX3K/sqAeax1DpmHuM1x6SNvIrvYySymqRJZWcvRHOo8o2Zc/5NZMmh06AtCLKuG9SxdmjuoBcfgX6AtG6gfH9t/7k2Q04qwpvaRy8cRbJCndiW8UAJblqg722m52reydqg8iN5C1QD/773yEUPdfVNVqIMsnSvoiEbSijxocaopiKiU9f1/QdvOIDyh6/b5BXp1o2zQ1fd4I0OveZDKn3QAqpX44wfuJ1u9D/Gq87K5NhyE3TqPUIe/+ZqZNzrcZmii66w==
crane-xfer.unl.edu ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBAWBwHOTjsPaftbFP1hL/rEo9E29yrEy+l82TfN6NJCvaFb4kdppDo86utmadWMB08YbUghEdOJXoNpudb7AHlI=
crane-xfer.unl.edu ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBL/nZV+XQEV19FODr1F2Z6Kn0jDhM+xEaq9QRDWetEY
crane-xfer.unl.edu ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDBggvAv6Gj6q5RnmMegh3S9YqNN2PEIyjEodbcvKyncY7r0seKVP4ABthrS3fU8bz1wSVi4nSVhMhGT4eRaDViy9NJPoebyORmiAczWgIZKip79YM+UZSQgZI0OtgfDv3Ouv2/ti1ymn0qVEHLrm7pLGto7HJlLGRpj4Acwid5T6sY3QTGyaJ64hkblYWfAryVcNIS/Qfz7wyeZEG1WE4XKPy2Ax+dRiYBiSWoORWzdMAs5OASrgtM1CeScjqrBlx3BDBi9FLzqbBq+IO/5Ee31Tvg9zBZtTgwAGwxsWhCvA+V95FDvSSjrjjNFEZ1ThuWbyDlEOmfF/B/k6wdxSTP
```
---
title: Basic Kubernetes
summary: "Basic Kubernetes"
weight: 20
---
### Setup
This section assumes you've completed the [Quick Start](quick_start.md) section.
If you are in multiple namespaces, you need to be aware of which namespace you're working in, and either set it with `kubectl config set-context nautilus --namespace=the_namespace` or specify it in each `kubectl` command by adding `-n namespace`.
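For example (using `the_namespace` as a placeholder namespace name):
```bash
# Option 1: set the namespace once for the current context
kubectl config set-context nautilus --namespace=the_namespace
# Option 2: pass the namespace explicitly on each command
kubectl get pods -n the_namespace
```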
### Explore the system
To get the list of cluster nodes (although you may not have access to all of them), type:
```
kubectl get nodes
```
Right now you probably don't have anything running in the namespace, and these commands will return `No resources found in ... namespace.`. There are three categories we will examine: pods, deployments and services. Later these commands will be useful to see what's running:
List all the pods in your namespace
```
kubectl get pods
```
List all the deployments in your namespace
```
kubectl get deployments
```
List all the services in your namespace
```
kubectl get services
```
### Launch a simple pod
Let’s create a simple generic pod and log into it.
You can copy-and-paste the lines below. Create the `pod1.yaml` file with the following content:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: mypod
    image: ubuntu
    resources:
      limits:
        memory: 100Mi
        cpu: 100m
      requests:
        memory: 100Mi
        cpu: 100m
    command: ["sh", "-c", "echo 'Im a new pod' && sleep infinity"]
```
Reminder, indentation is important in YAML, just like in Python.
*If you don't want to create the file and are using Mac or Linux, you can create YAML manifests dynamically like this:*
```
kubectl create -f - << EOF
<contents you want to deploy>
EOF
```
Now let’s start the pod:
```
kubectl create -f pod1.yaml
```
See if you can find it:
```
kubectl get pods
```
Note: You may see the other pods too.
If it is not yet in Running state, you can check what is going on with
```
kubectl get events --sort-by=.metadata.creationTimestamp
```
Events and other useful information about the pod can be seen in `describe`:
```
kubectl describe pod test-pod
```
If the pod is in Running state, we can check its logs
```
kubectl logs test-pod
```
Let’s log into it
```
kubectl exec -it test-pod -- /bin/bash
```
You are now inside the (container in the) pod!
Does it feel any different than a regular, dedicated node?
Try to create some directories and some files with content.
(Hello world will do, but feel free to be creative)
We will want to check the status of the networking.
But `ifconfig` is not available in the image we are using, so let’s install it.
First, let's make sure our installation tools are updated.
```
apt update
```
Now, we can use apt to install the necessary network tools.
```
apt install net-tools
```
Now check the networking:
```
ifconfig -a
```
Get out of the Pod (with either Control-D or exit).
You should see the same IP displayed with kubectl
```
kubectl get pod -o wide test-pod
```
We can now destroy the pod
```
kubectl delete -f pod1.yaml
```
Check that it is actually gone:
```
kubectl get pods
```
Now, let’s create it again:
```
kubectl create -f pod1.yaml
```
Does it have the same IP?
```
kubectl get pod -o wide test-pod
```
Log back into the pod:
```
kubectl exec -it test-pod -- /bin/bash
```
What does the network look like now?
What is the status of the files you created?
Finally, let’s delete the pod explicitly:
```
kubectl delete pod test-pod
```
### Let’s make it a deployment
You saw that when a pod was terminated, it was gone.
While above we did it ourselves, the result would have been the same if a node had died or been restarted.
In order to gain higher availability, the use of Deployments is recommended. So, that’s what we will do next.
You can copy-and-paste the lines below.
###### dep1.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-dep
  labels:
    k8s-app: test-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: test-dep
  template:
    metadata:
      labels:
        k8s-app: test-dep
    spec:
      containers:
      - name: mypod
        image: ubuntu
        resources:
          limits:
            memory: 500Mi
            cpu: 500m
          requests:
            memory: 100Mi
            cpu: 50m
        command: ["sh", "-c", "sleep infinity"]
```
Now let’s start the deployment:
```
kubectl create -f dep1.yaml
```
See if you can find it:
```
kubectl get deployments
```
The Deployment is just a conceptual service, though.
See if you can find the associated pod:
```
kubectl get pods
```
Once you have found its name, let’s log into it
```
kubectl get pod -o wide test-dep-<hash>
kubectl exec -it test-dep-<hash> -- /bin/bash
```
You are now inside the (container in the) pod!
Create directories and files as before.
Try various commands as before.
Let’s now delete the pod!
```
kubectl delete pod test-dep-<hash>
```
Is it really gone?
```
kubectl get pods
```
What happened to the deployment?
```
kubectl get deployments
```
Get into the new pod
```
kubectl get pod -o wide test-dep-<hash>
kubectl exec -it test-dep-<hash> -- /bin/bash
```
Was anything preserved?
Let’s now delete the deployment:
```
kubectl delete -f dep1.yaml
```
Verify everything is gone:
```
kubectl get deployments
kubectl get pods
```
### More tutorials are available at [Nautilus Documentation - Tutorials](https://docs.pacificresearchplatform.org)
---
title: Batch Jobs
summary: "Batch Jobs"
weight: 40
---
### Running batch jobs
#### Basic example
Kubernetes has support for running batch jobs. A Job is a daemon which watches your pod and makes sure it exited with exit status 0. If it did not, for any reason, it will be restarted up to `backoffLimit` number of times.
Since jobs in Nautilus are not limited in runtime, you can only run jobs with a meaningful `command` field. Running in manual mode (a `sleep infinity` `command` and manual start of computation) is prohibited.
Let's run a simple job and get its result.
Create a job.yaml file and submit:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      restartPolicy: Never
  backoffLimit: 4
```
Explore what's running:
```
kubectl get jobs
kubectl get pods
```
When the job is finished, your pod will stay in Completed state, and Job will have COMPLETIONS field 1 / 1. For long jobs, the pods can have Error, Evicted, and other states until they finish properly or backoffLimit is exhausted.
This example job did not use any storage and output the result to STDOUT, which can be seen in our pod logs:
```
kubectl logs pi-<hash>
```
The pod and job will remain for you to come and look at for `ttlSecondsAfterFinished=604800` seconds (1 week) by default, and you can adjust this value in your job definition if desired.
**Please make sure you did not leave any pods and jobs behind.** To delete the job, run
```
kubectl delete job pi
```
#### Running several bash commands
You can group several commands, and use pipes, like this:
```
command:
- sh
- -c
- "cd /home/user/my_folder && apt-get install -y wget && wget pull some_file && do something else"
```
#### Logs
All stdout and stderr outputs from the script will be preserved and accessible by running
```
kubectl logs pod_name
```
Output from initContainer can be seen with
```
kubectl logs pod_name -c init-clone-repo
```
To see logs in real time do:
```
kubectl logs -f pod_name
```
The pod will remain in Completed state until you delete it or timeout is passed.
#### Retries
The `backoffLimit` field specifies how many times your pod will be rerun if the exit status of your script is not 0
or if the pod was terminated for a different reason (for example, a node was rebooted). It's a good idea to set it to more than 0.
#### Fair queueing
There is no fair queue implemented on Nautilus. If you submit 1000 jobs, you block **all** other users from submitting in the cluster.
To limit your submission to a fair portion of the cluster, refer to [this guide](https://kubernetes.io/docs/tasks/job/fine-parallel-processing-work-queue/). Make sure to use a deployment and persistent storage for the Redis pod. Here's [our example](https://gitlab.nrp-nautilus.io/prp/job-queue/-/blob/master/redis.yaml).
#### CPU only jobs
Nautilus is primarily used for GPU jobs. While it's possible to run large CPU-only jobs, you have to take certain measures to prevent taking over all cluster resources.
You can run the jobs with lower priority and allow other jobs to preempt yours. This way you should not worry about the size of your jobs and you can use the maximum number of resources in the cluster. To do that, add the `opportunistic` priority class to your pods:
```yaml
spec:
  priorityClassName: opportunistic
```
Another thing to do is to avoid the GPU nodes. This way you can be sure you're only using the CPU-only nodes and jobs are not preventing any GPU usage. To do this, add the node antiaffinity for GPU device to your pod:
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: feature.node.kubernetes.io/pci-10de.present
            operator: NotIn
            values:
            - "true"
```
You can use either of the two methods or a combination of both.
---
title: Deployments
summary: "Deployments"
weight: 50
---
## Running an idle deployment
In case you need to have an idle pod in the cluster that might occasionally do some computations, you have to run it as a [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). Deployments in Nautilus are limited to 2 weeks (unless the namespace is added to the exceptions list and runs a permanent service). This ensures your pod will not run in the cluster forever after you no longer need it and have moved on to other projects.
Please don't run such pods as Jobs, since those are not purged by the cleaning daemon and will stay in the cluster forever if you forget to remove them.
Such a deployment **can not request a GPU**. You can use the
```
command:
- sleep
- "100000000"
```
as the command if you just want a pure shell, and `busybox`, `centos`, `ubuntu` or any other general image you like.
Follow the [guide for creating deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) and add the minimal requests to it and limits that make sense, for example:
```
resources:
  limits:
    cpu: "1"
    memory: 10Gi
  requests:
    cpu: "10m"
    memory: 100Mi
```
Example of running an nginx deployment:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    k8s-app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: nginx
  template:
    metadata:
      labels:
        k8s-app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx-pod
        resources:
          limits:
            cpu: 1
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 500Mi
```
## Quickly stopping and starting the pod
If you need a simple way to start and stop your pod without redeploying every time, you can scale down the deployment. This will leave the definition, but delete the pod.
To stop the pod, scale down:
```
kubectl scale deployment deployment-name --replicas=0
```
To start the pod, scale up:
```
kubectl scale deployment deployment-name --replicas=1
```
---
title: GPU Pods
summary: "GPU Pods"
weight: 20
---
The Nautilus Cluster provides over 200 GPU nodes. In this section you will request GPUs. Make sure you don't waste those and delete your pods when not using the GPUs.
Use this definition to create your own pod and deploy it to Kubernetes (refer to [Basic Kubernetes](basic_kubernetes.md)):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-example
spec:
  containers:
  - name: gpu-container
    image: gitlab-registry.nrp-nautilus.io/prp/jupyter-stack/prp:latest
    command: ["sleep", "infinity"]
    resources:
      limits:
        nvidia.com/gpu: 1
```
This example requests 1 GPU device. You can request up to 2 GPUs for pods. If you request GPU devices in your pod,
Kubernetes will automatically schedule your pod to an appropriate node. There's no need to specify the location manually.
**You should always delete your pod** when your computation is done to let other users use the GPUs.
Consider using [Jobs]({{ nrp.docs_url }}/userdocs/running/jobs/) **with actual script instead of `sleep`** whenever possible to ensure your pod is not wasting GPU time.
If you have never used Kubernetes before, see the [tutorial]({{ nrp.docs_url }}/userdocs/start/getting-started/).
#### Requesting high-demand GPUs
Certain kinds of GPUs have much higher specs than others, and to avoid wasting them on regular jobs, your pods will only be scheduled on those GPUs if you request the type explicitly.
Currently those include:
* NVIDIA-TITAN-RTX
* NVIDIA-RTX-A5000
* Quadro-RTX-6000
* Tesla-V100-SXM2-32GB
* NVIDIA-A40
* NVIDIA-RTX-A6000
* Quadro-RTX-8000
* NVIDIA-A100-SXM4-80GB*
*An A100 running in [MIG mode](#mig-mode) is not considered a high-demand GPU.
#### Requesting many GPUs
Since 1 and 2 GPU jobs are blocking nodes from getting 4 and 8 GPU jobs, there are some nodes reserved for those. Once you submit a job requesting 4 or 8 GPUs, a controller will automatically add toleration which will allow you to use the node reserved for more GPUs. You don't need to do anything manually for that.
#### Choosing GPU type
We have a variety of GPU flavors attached to Nautilus. You can get a list of GPU models from the actual cluster information (e.g. `kubectl get nodes -L nvidia.com/gpu.product`).
<div id="observablehq-chart-35acf314"></div>
<p>Credit: <a href="https://observablehq.com/d/7c0f46855b4212e0">GPU types by NRP Nautilus</a></p>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@observablehq/inspector@5/dist/inspector.css">
<script type="module">
import {Runtime, Inspector} from "https://cdn.jsdelivr.net/npm/@observablehq/runtime@5/dist/runtime.js";
import define from "https://api.observablehq.com/d/7c0f46855b4212e0.js?v=4";
new Runtime().module(define, name => {
if (name === "chart") return new Inspector(document.querySelector("#observablehq-chart-35acf314"));
});
</script>
If you need more GPU memory, use the official specs to choose the type. The table below lists example GPU types in the Nautilus Cluster and their memory sizes:
GPU Type | Memory size (GB)
---|---
NVIDIA-GeForce-GTX-1070 | 8G
NVIDIA-GeForce-GTX-1080 | 8G
Quadro-M4000 | 8G
NVIDIA-A100-PCIE-40GB-MIG-2g.10gb | 10G
NVIDIA-GeForce-GTX-1080-Ti | 12G
NVIDIA-GeForce-RTX-2080-Ti | 12G
NVIDIA-TITAN-Xp | 12G
Tesla-T4 | 16G
NVIDIA-A10 | 24G
NVIDIA-GeForce-RTX-3090 | 24G
NVIDIA-TITAN-RTX | 24G
NVIDIA-RTX-A5000 | 24G
Quadro-RTX-6000 | 24G
Tesla-V100-SXM2-32GB | 32G
NVIDIA-A40 | 48G
NVIDIA-RTX-A6000 | 48G
Quadro-RTX-8000 | 48G
**NOTE**: [Not all nodes are available to all users]({{ nrp.docs_url }}/userdocs/running/special/). You can consult about your available resources in [Matrix]({{ nrp.docs_url }}/userdocs/start/support) and on [resources page]({{ nrp.resources.url }}).
Labs connecting their hardware to our cluster have preferential access to all our resources.
To use a **specific type of GPU**, add the affinity definition to your pod yaml
file. The example below specifies *1080Ti* GPU:
```yaml
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: nvidia.com/gpu.product
operator: In
values:
- NVIDIA-GeForce-GTX-1080-Ti
```
**To make sure you did everything correctly** after you've submitted the job, look at the corresponding pod yaml (`kubectl get pod ... -o yaml`) and check that resulting nodeAffinity is as expected.
#### Selecting CUDA version
In general, a container image built for a given CUDA version requires a node driver that supports that version; newer drivers also support the same and older CUDA runtimes. The nodes are labelled with the major and minor CUDA and driver versions. You can check those at the [resources page]({{ nrp.resources.url }}) or list them with this command (it will also select only GPU nodes):
```bash
kubectl get nodes -L nvidia.com/cuda.driver.major,nvidia.com/cuda.driver.minor,nvidia.com/cuda.runtime.major,nvidia.com/cuda.runtime.minor -l nvidia.com/gpu.product
```
If you're using the container image with higher CUDA version, you have to pick the nodes supporting it. Example:
```yaml
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: nvidia.com/cuda.runtime.major
operator: In
values:
- "12"
- key: nvidia.com/cuda.runtime.minor
operator: In
values:
- "2"
```
You can also require a driver version above a certain level if you know which one you need (this example picks drivers **above** 535):
```yaml
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: nvidia.com/cuda.driver.major
operator: Gt
values:
- "535"
```
#### MIG mode
A100 GPUs can be sliced into several logical GPUs ([MIG mode](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#a100-profiles)). This mode is enabled in our cluster. Things can change, but currently we're thinking about slicing those in halves. The current MIG mode can be obtained from a node's `nvidia.com/gpu.product` label: `NVIDIA-A100-PCIE-40GB-MIG-2g.10gb` means 2 compute instances (out of 7 total) and 10GB of memory per virtual GPU.
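If you specifically want one of these MIG slices (or want to test on one), you can pin the `nvidia.com/gpu.product` node label using the same node-affinity mechanism shown above, for example:
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: nvidia.com/gpu.product
            operator: In
            values:
            - NVIDIA-A100-PCIE-40GB-MIG-2g.10gb
```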
---
title: The National Research Platform
summary: "How to utilize the National Research Platform (NRP)."
weight: 8
---
### What is the National Research Platform (NRP)?
The [National Research Platform](https://nationalresearchplatform.org) is a partnership of more than 50 institutions, led by researchers at UC San Diego, University of Nebraska-Lincoln, and UC Berkeley and includes the National Science Foundation, Department of Energy, and multiple research universities in the US and around the world.
The major resource of NRP is a heterogeneous, globally distributed, open system that features a variety of CPUs, GPUs and storage, arranged into a Kubernetes cluster called [Nautilus](https://docs.pacificresearchplatform.org).
The map below shows the National Research Platform resources located across the world.
<iframe
src="https://dash.nrp-nautilus.io/map"
style="width:100%; height:600px; border:1px solid black"
></iframe>
This help document covers these topics:
- [Quick Start](quick_start)
- [Basic Kubernetes](basic_kubernetes)
- [GPU Pods](gpu_pods)
- [Batch Jobs](batch_jobs)
- [Deployments](deployments)
- [Storage](storage)
- [JupyterHub Services](jupyterhub)
The full documentation of the NRP Nautilus Cluster can be found at https://docs.pacificresearchplatform.org.
To get help regarding using the NRP Nautilus Cluster, please refer to the [Contact page](https://docs.pacificresearchplatform.org/userdocs/start/contact/)
---
title: JupyterHub Service
summary: "JupyterHub Service"
weight: 70
---
### [JupyterHub](https://jupyterhub-west.nrp-nautilus.io) on Nautilus
A [JupyterHub](https://jupyterhub-west.nrp-nautilus.io) service is provided on the Nautilus Cluster, which is great
if you need to quickly run your workflow and do not want to learn any
Kubernetes. Simply follow the link to [https://jupyterhub-west.nrp-nautilus.io](https://jupyterhub-west.nrp-nautilus.io), click the **Sign in With CILogon** button, and use your institutional credentials to log in via CILogon. After authentication, choose the hardware specs to spawn your instance. An example of the specs selection is shown below:
<img src="/images/nrp-jupyterhub-options.png">
Your persistent home folder initially will be limited to 5GB. If you need more, you can request it to be extended.
You can also request for [cephFS storage](https://docs.pacificresearchplatform.org/userdocs/storage/ceph/) that is mounted to a shared disk space. All these requests can be made by **contacting NRP admins through [Matrix](https://docs.pacificresearchplatform.org/userdocs/start/contact/)**.
Please use this to store all the data, code and results that you would need for long experiments.
**NOTE:** Your Jupyter container will shut down 1 hour after your browser disconnects from it. If you need your job to keep running, don't close the browser window.
You could either use a desktop with a persistent Internet connection, or only use this service for testing your code.
**NOTE:** Available images are described in the [scientific images section](https://docs.pacificresearchplatform.org/userdocs/running/sci-img/).
If you need to use an image that is not provided by NRP, proceed to [Step by Step Tensorflow with Jupyter](https://docs.pacificresearchplatform.org/userdocs/jupyter/jupyter-pod). If you prefer a customized JupyterHub, follow the guide to [Deploy JupyterHub](https://docs.pacificresearchplatform.org/userdocs/jupyter/jupyterhub/) to deploy your own JupyterHub instance on the Nautilus Cluster.
!!! tip "Deploying your own JupyterHub instance"
Instructions on how to create and customize your own instance of JupyterHub are available on the [Deploy JupyterHub]({{ nrp.docs_url }}/userdocs/jupyter/jupyterhub/) documentation page.
---
title: Quick Start
summary: "Quick Start"
weight: 10
---
The Nautilus Cluster is a globally distributed [Kubernetes](https://kubernetes.io) cluster.
The general guide for getting access to the Nautilus Cluster can be found [here](https://docs.pacificresearchplatform.org/userdocs/start/get-access/). The guidance on this page is tailored to NU users:
### Get access to the Nautilus cluster
1. Point your browser to the [Nautilus Portal](https://portal.nrp-nautilus.io)
2. On the portal page, click the "Login" button at the top right corner
<img src="/images/nautilus-portal-login.png" height="50">
3. You will be redirected to the "CILogon" page
4. On this page, select "University of Nebraska-Lincoln" as the Identity Provider from the menu and click the "Log On" button to use your UNL credentials to login. For users from other NU campuses, select the institution of the NU system that you are affiliated with.
<img src="/images/cilogon-unl.png">
5. After a successful authentication you will be logged in to the portal.
_On first login you become a **guest**. Any admin user can
validate your guest account, promote you to **user** and add your account to their **namespace**. You need to be assigned to at least one namespace (usually a group project but can be your new namespace)._
- To get access to a namespace, please contact its owner (usually email). Once you are granted the user role in the cluster and are added to the namespace, you will get access to all namespace resources.
- If you're starting a new project and would like to have your own namespace, either for yourself or for your group, you can request to be promoted to admin by **contacting NRP admins through [Matrix](https://docs.pacificresearchplatform.org/userdocs/start/contact/)**.
This will give you permission to create any number of namespaces and invite other users to your namespace(s). Please note, **you'll be the one responsible for all activity happening in your namespaces**.
6. Once you are made either a user or admin of a namespace, you'll need to accept the **Acceptable Use Policy (AUP)** on the portal page \(as shown in the screenshot below\) in order to get access to the cluster.
<img src="/images/nrp-aup.png" height="50">
7. Please review [Policies](https://docs.pacificresearchplatform.org/userdocs/start/policies/) before starting any work on the Nautilus Cluster.
### Configure a client to use the Nautilus Cluster
Now you have been given access to the Nautilus Cluster. To interact with the cluster, you need to configure a client with the `kubectl` command-line tool. A client can be your desktop or laptop computer, a virtual machine, or a terminal environment.
1. [Install][1] the kubectl tool
2. Login to [NRP Nautilus portal][2]
<img src="/images/nautilus-portal-login.png" height="50">
3. Click the **Get Config** link on top right corner of the page to get your configuration file.
<img src="/images/nrp-get-config.png" height="50">
4. Save the file as **config** and put it in your \<home\>/.kube folder.
This folder may not exist on your machine; to create it, run the following in a terminal:
```
mkdir ~/.kube
```
5. Test that kubectl can connect to the cluster from the command line:
```
kubectl get pods -n your_namespace
```
It's possible there are no pods in your namespace yet. If you get `No resources found.`, your namespace is empty and you can start running workloads in it.
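Optionally, if you don't want to pass `-n your_namespace` to every command, you can set a default namespace for your current kubeconfig context:
```
kubectl config set-context --current --namespace=your_namespace
```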
[1]: https://kubernetes.io/docs/tasks/tools/install-kubectl/
[2]: https://portal.nrp-nautilus.io
---
title: Storage
summary: "Storage"
weight: 60
---
### Using Storage
Different Kubernetes clusters will have different storage options available.
Let's explore the most basic one: emptyDir. It allocates a local scratch volume, which is gone once the pod is destroyed.
You can copy-and-paste the lines below.
###### strg1.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-storage
labels:
k8s-app: test-storage
spec:
replicas: 1
selector:
matchLabels:
k8s-app: test-storage
template:
metadata:
labels:
k8s-app: test-storage
spec:
containers:
- name: mypod
image: alpine
resources:
limits:
memory: 100Mi
cpu: 100m
requests:
memory: 100Mi
cpu: 100m
command: ["sh", "-c", "apk add dumb-init && dumb-init -- sleep 100000"]
volumeMounts:
- name: mydata
mountPath: /mnt/myscratch
volumes:
- name: mydata
emptyDir: {}
```
Now let’s start the deployment:
```
kubectl create -f strg1.yaml
```
Now log into the created pod and create a directory:
```
mkdir /mnt/myscratch/username
```
then store some files in it.
Also put some files in some other (unrelated) directories.
Now kill the container with `kill 1`, wait for a new one to be created, then log back in.
What happened to the files?
You can now delete the deployment.
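For reference, a sketch of the commands used in this exercise (the generated pod name will differ, so copy it from the `kubectl get pods` output):
```
# find the pod created by the deployment
kubectl get pods
# open a shell inside it (substitute your actual pod name)
kubectl exec -it test-storage-xxxxxxxxxx-yyyyy -- sh
# ... inside the pod: mkdir /mnt/myscratch/username, create some files, run `kill 1` ...
# when you are done with the exercise, remove the deployment
kubectl delete deployment test-storage
```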
### Using outer persistent storage
In our cluster we have ceph storage connected, which allows using it for real data persistence.
To get storage, we need to create an abstraction called a PersistentVolumeClaim. By doing that we "claim" some storage space - a "Persistent Volume". A PersistentVolume will actually be created, but it's a cluster-wide resource which you cannot see.
Create the file:
###### pvc.yaml:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: test-vol
spec:
storageClassName: rook-ceph-block
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
```
We're creating a 1GB volume and formatting it with XFS.
Look at its status with `kubectl get pvc test-vol`. The `STATUS` field should be equal to `Bound` - this indicates successful allocation.
Now we can attach it to our pod. Create one:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: test-pod
spec:
containers:
- name: mypod
image: centos:centos7
command: ["sh", "-c", "sleep infinity"]
resources:
limits:
memory: 100Mi
cpu: 100m
requests:
memory: 100Mi
cpu: 100m
volumeMounts:
- mountPath: /examplevol
name: examplevol
volumes:
- name: examplevol
persistentVolumeClaim:
claimName: test-vol
```
In the `volumes` section we're attaching the requested persistent volume to the pod (by its claim name), and in `volumeMounts` we're mounting the attached volume into the container at the specified folder.
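Assuming you saved the pod definition above as `pod.yaml` (the filename is arbitrary), you can create the pod and check that the volume is mounted:
```
kubectl create -f pod.yaml
# once the pod is Running, verify the 1Gi volume is mounted at /examplevol
kubectl exec test-pod -- df -h /examplevol
```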
### Exploring storageClasses
Attaching persistent storage is usually done based on storage class. Different clusters will have different storageClasses, and you have to read the [documentation](https://docs.pacificresearchplatform.org/userdocs/storage/intro) on which one to use. Some are restricted and you need to contact admins to ask for permission to use those.
Note that the one we used is the default - it will be used if you define none.
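You can list the storage classes visible to you, and see which one is marked as the default, with:
```
kubectl get storageclass
```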
### Cleaning up
After you've deleted all the pods and deployments, delete the volume claim:
```
kubectl delete pvc test-vol
```
Please make sure you did not leave any running pods, deployments, or volumes.
---
title: Characteristics of an OSG friendly job
summary: "Characteristics of an OSG friendly job"
weight: 10
---
Any researcher performing Open Science in the US can become an [OSPool](https://osg-htc.org/services/open_science_pool.html) user. The OSPool provides its users with fair-share access (no allocation needed!) to processing and storage capacity contributed by university campuses, government-supported supercomputing institutions and research collaborations. Using state-of-the-art distributed computing technologies the OSPool is designed to support High Throughput workloads that consist of large ensembles of independent computations.
Your jobs must fit a set of criteria in order to be
eligible to run on OSG. The list below provides some rule of thumb
characteristics that can help us make a decision if using OSG for a
given job is a viable option.
!!! tip "**Characteristics of an OSG friendly job**"

    | Variable | Suggested Values |
    | -------------------------------------- | ------------------------------------------------------------------------------------- |
    | Memory/Process | <= 40 GB |
    | Type of job | serial (i.e. mostly single core) |
    | Network traffic<br>(input or output files) | <= 2GB each side |
    | Running Time | Ideal time is 1-10 hours - max is 40 hours |
    | Runtime Disk Usage | <= 10GB |
    | Binary Type | Portable RHEL6/7 |
    | Software | Non-licensed, pre-compiled binaries, containers |
    | Total CPU Time (of job workflow) | Large, typically >= 1000 hours |
### OSG Job Runtime
If a job runs longer than a site allows, it can be preempted and will have to start
over from the beginning. For this reason, it is good practice to build
automatic checkpointing into your job, or break a large job into
multiple small jobs if it is at all possible.
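As an illustration only (this is not an HCC-specific template, and `run_chunk.sh` is a placeholder for your own wrapper script), an HTCondor submit file that breaks a large workload into many short, independent jobs might look like:
```
# run 100 independent chunks, each sized to fit comfortably within OSG limits
executable     = run_chunk.sh
arguments      = $(Process)
output         = logs/chunk_$(Process).out
error          = logs/chunk_$(Process).err
log            = chunks.log
request_cpus   = 1
request_memory = 2GB
request_disk   = 4GB
queue 100
```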
Next: [How to submit an OSG Job with HTCondor](how_to_submit_an_osg_job_with_htcondor)
---
title: The OSG Consortium
summary: "How to utilize the OSG."
weight: 9
---
If you find that you are not getting access to the volume of computing
resources needed for your research through HCC, you might also consider
submitting your jobs to the OSG.
### What is the OSG?
The [OSG](https://osg-htc.org/) advances
science through open distributed computing. Established in 2005, the OSG Consortium operates a fabric of distributed High Throughput Computing (dHTC) services in support of the National Science & Engineering community. The research collaborations, campuses, national laboratories, and software providers that form the consortium are unified in their commitment to advance open science via these services. The Holland Computing Center is a proud member of the OSG Consortium. If you are interested in hearing more about OSG or utilizing its resources for your research, please email [hcc-support@unl.edu](mailto:hcc-support@unl.edu).
The map below shows the Open Science Grid sites located across the U.S.
<img src="/images/17044917.png">
- [Characteristics of an OSG friendly job](characteristics_of_an_osg_friendly_job)
---
title: Changing Your Password
summary: "How to change your HCC password"
---
How to change your password
---------------------------
!!! info
**Your account must be active with Duo authentication setup in order for the following instructions to work.**
- [How to change your password](#how-to-change-your-password)
- [HCC password requirements](#hcc-password-requirements)
- [Changing a known HCC password](#changing-a-known-hcc-password)
- [Change your password via the command line](#change-your-password-via-the-command-line)
- [Tutorial Video](#tutorial-video)
Every HCC user has a password that is the same on all HCC machines
(Swan, Anvil). This password needs to satisfy the HCC
password requirements.
### HCC password requirements
#### Change your password via the command line
To change a current or temporary password, the user needs to login to
any HCC cluster (Crane or Tusker) and use the ***passwd*** command:
any HCC cluster and use the ***passwd*** command:
**Change HCC password**
```bash
$ passwd
Changing password for user <username>.
Current Password:
New password:
Retype new password:
```
With this command, the user is first prompted for his/her old password.
If the "*Current Password*" is correct, then the user is asked twice for
......@@ -70,12 +69,12 @@ needs to fulfill the HCC password requirements.
1. Log in to the myHCC User Portal with your HCC credentials.
2. Click **Update Account** in the top menu
{{< figure src="/images/35326617.png" height="150" >}}
<img src="/images/35326617.png" height="150">
3. Enter your new password in the **Password** and **Retype Password**
boxes and click **Modify** to save
{{< figure src="/images/35326618.png" height="150" >}}
<img src="/images/35326618.png" height="150">
### Resetting a forgotten HCC password
To reset your password, navigate to the [myHCC User Portal](https://hcc.unl.edu/
Click the link to reset your forgotten password
{{< figure src="/images/35326619.png" height="400" >}}
<img src="/images/35326619.png" height="400">
Fill in the requested information (your HCC user name and email
associated with your account) and click **Reset Password**. A reset link
will be emailed to you; follow the onscreen prompts to set a new password.
### Tutorial Video
{{ youtube('eaTW6FDhpsM') }}
---
title: Creating an Account
weight: 2
---
Anyone affiliated with the University of Nebraska (NU) system can request an account
and use HCC shared resources for free.
How to create an HCC account:
1. **Identify or Setup a Group:** All HCC accounts must be associated
with an HCC group **owned by NU faculty**. Usually, a user's HCC group is the research group owned by their advisor
but it may also be a class group owned by the course instructor. To establish a new
group, please complete a [new group request](https://hcc.unl.edu/new-group-request) if you are NU faculty.
2. **Request an Account:** All accounts must be associated with an HCC group.
Your group will usually be owned by your advisor, however, it could also be a
class group owned by your instructor. Once you know the group your account will
be associated with, please complete a [new user request](http://hcc.unl.edu/new-user-request/).
3. **Setup Two Factor Authentication:** Once your account has been approved, you will receive an email
with login instructions. To finish activating your account, you will need to either have a phone number registered with TrueYou, or join our [Remote Open Office hours](https://hcc.unl.edu/OOH) (or schedule another remote session) and show your photo ID in order to [activate Two Factor Authentication](setting_up_and_using_duo.md).
4. **Reset your Temporary Password:** To maintain the security of your account, please
[change your password](how_to_change_your_password.md) as soon as
your account is active.
Once the above steps are complete, your account is now active and you are ready to
[connect to HCC resources](/connecting) and
[begin submitting jobs](/submitting_jobs). If you
have any questions or would like to setup a consultation meeting, please [contact us](/contact_us/).
---
title: Setting Up and Using Duo
summary: "Duo Setup Instructions"
---
!!! note
The information here only pertains to using Duo with Holland Computing Center accounts.
For help with your general University (i.e. TrueYou) account and Duo, contact
the [Huskertech Help Center](https://its.unl.edu/helpcenter/) via email at [support@nebraska.edu](mailto:support@nebraska.edu).
##### **Use of Duo two-factor authentication (https://www.duosecurity.com) is required for access to HCC resources.**
Users will connect via SSH and enter their username/passwords as usual. One additional
authentication step through Duo is then needed before the login is completed. This requires users to either install the free Duo Mobile app on their
smartphone or purchase a YubiKey USB device.
### Smartphone
If you *are not* currently using Duo with your TrueYou account:
1. Install the free **Duo Mobile** application from the
[Google Play Store](https://play.google.com/store/apps/details?id=com.duosecurity.duomobile), [Apple App Store](https://itunes.apple.com/us/app/duo-mobile/id422663827), or [Microsoft Store](https://www.microsoft.com/en-us/store/apps/duo-mobile/9nblggh08m1g)
2. ~~Visit one of the following locations. **Bring your smartphone and a valid photo ID** such as your university ID card or drivers license.~~
1. ~~Visit either HCC location [118 Schorr Center, UNL](http://www1.unl.edu/tour/SHOR) |
[152 Peter Kiewit Institute, UNO](http://pki.nebraska.edu/new/pages/about-pki/maps-directions-and-parking) in-person anytime from 9am-5pm to enroll.~~
2. ~~Visit Information Technology Services [115 Otto Olsen, UNK](http://www.unk.edu/campus-map/?q=m15)
in-person and ask for HCC identity verification.~~
**Due to current health and safety concerns, Duo activation is entirely remote.** Join one of [HCC's Remote Open Office hours](https://hcc.unl.edu/OOH)
sessions every Tues/Thurs from 2-3PM CST to activate Duo. Contact [hcc-support@unl.edu](mailto:hcc-support@unl.edu) for alternate
times if you are not able to attend.
Faculty/staff members with a verified NU telephone number can enroll by
phone. If you would like an HCC staff member to call your NU telephone
number to enroll, please email
{{< icon name="envelope" >}}[hcc-support@unl.edu] (mailto:hcc-support@unl.edu)
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
with a time you will be available.
If you *are* currently using Duo with your TrueYou account:
1. You can request to use the same phone for HCC's Duo as you are using for TrueYou.
Please contact [hcc-support@unl.edu](mailto:hcc-support@unl.edu) with the request
using the email address associated with your TrueYou account. In the email, include
the last 4 digits of the phone number for verification.
### YubiKeys
YubiKey devices are currently a one-time cost of around $25 from HCC, or can be purchased directly from [Yubico](http://www.yubico.com/).
Example login using Duo Push
----------------------------
This demonstrates an example login to Swan using the Duo Push method.
Using another method (SMS, phone call, etc.) proceeds in the same way.
(Click on any image for a larger version.)
First, a user connects via SSH using their normal HCC username/password,
exactly as before.
{{< figure src="/images/5832713.png" width="600" >}}
{{% notice warning%}}**Account lockout**
<img src="/images/duo_login_pass.png" width="600">
!!! warning "**Account lockout**"
After 10 failed authentication attempts, the user's account is
disabled. If this is the case, then the user needs to send an email to
[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
including his/her username and the reason why multiple failed
authentication attempts occurred.
After entering the password, instead of completing the login, the user
will be presented with the Duo prompt. This gives the choice to use any of the available authentication methods. In
this example, the choices are Duo Push notification, SMS message, or
phone call. Choosing option 1 for Duo Push, a request to verify the
login will be sent to the user's smartphone.
{{< figure src="/images/5832716.png" height="350" >}}
<img src="/images/duo_app_request.png" height="350">
Simply tap `Approve` to verify the login.
{{< figure src="/images/5832717.png" height="350" >}}
<img src="/images/duo_app_approved.png" height="350">
!!! warning
    **If you receive a verification request you didn't initiate, deny the request and contact HCC immediately via email at [hcc-support@unl.edu](mailto:hcc-support@unl.edu)**
In the terminal, the login will now complete and the user will be logged in
as usual.
{{< figure src="/images/5832714.png" height="350" >}}
<img src="/images/duo_login_successful.png" height="350">
Duo Authentication Methods
--------------------------
### Duo Push
##### [[Watch the Duo Push Demo]](https://www.duosecurity.com/duo-push)
{{< figure src="/images/5832709.png" height="350" caption="Photo credit: https://duosecurity.com" >}}
<img src="/images/5832709.png" height="350" caption="Photo credit: https://duosecurity.com">
For smartphone or tablet users (iPhone, Android, Blackberry, Windows
Phone), the Duo Mobile app is available for free. A push notification is sent to the device, and the login can be approved with
one tap.
### Duo Mobile Passcodes
{{< figure src="/images/5832711.png" height="350" caption="Photo credit: https://duosecurity.com" >}}
<img src="/images/5832711.png" height="350" caption="Photo credit: https://duosecurity.com">
The Duo Mobile app can also be used to generate numeric passcodes, even
when internet and cell service is unavailable. Press the key icon to generate a passcode, then enter it at the Duo
prompt to complete authentication.
### SMS Passcodes
{{< figure src="/images/5832712.png" height="350" >}}
<img src="/images/5832712.png" height="350">
For non-smartphone users, Duo can send passcodes via normal text
messages which are entered manually to complete login.
### YubiKey
##### [[Yubico]](http://www.yubico.com/)
{{< figure src="/images/5832710.jpg" height="200" caption="Photo credit: Yubico" >}}
<img src="/images/5832710.jpg" height="200" caption="Photo credit: Yubico">
YubiKeys are USB hardware tokens that generate passcodes when pressed.
With HCC clusters, there is no prompt to press on the YubiKey. When the DUO prompt
appears in the terminal, press the YubiKey and it will output a string to the terminal
to authenticate you.
They appear as a USB keyboard to the computer they are connected to, and
so require no driver software with almost all modern operating systems.
YubiKeys are available from the Husker Tech store at UNL. Users may also purchase them directly from [Yubico](http://www.yubico.com/).
title: "Anvil: HCC's Cloud"
+++
title = "Adding SSH Key Pairs"
description = "How to add key pairs to your OpenStack account."
+++
---
title: Adding SSH Key Pairs
summary: "How to add key pairs to your OpenStack account."
---
If you have not already generated your key pairs and need help doing so,
please see the documentation that relates to your operating system:
- [Creating SSH key pairs on Mac](../creating_ssh_key_pairs_on_mac/)
- [Creating SSH key pairs on Windows](../creating_ssh_key_pairs_on_windows/)
!!! note
This guide assumes you are either accessing Anvil from on-campus, or are connected to the [Anvil VPN](../connecting_to_the_anvil_vpn/).
Log into the Anvil web dashboard at **https://anvil.unl.edu** using
your HCC credentials. On the left-hand side navigation menu,
click *Access & Security*.
{{< figure src="/images/13599031.png" >}}
<img src="/images/13599031.png">
Choose the *Key Pairs* tab in the main window section.
{{< figure src="/images/13599033.png" >}}
<img src="/images/13599033.png">
Open your **public** key file, select the entire text, and copy it. On
the right-hand side, click the *Import Key Pair* button.
{{< figure src="/images/13599036.png" >}}
<img src="/images/13599036.png">
In the pop-up window, fill in the *Key Pair Name* field with a
convenient name. Paste the copied public key text in the larger *Public
Key* box.
{{< figure src="/images/13599039.png" width="650" >}}
<img src="/images/13599039.png" width="650">
Click the *Import Key Pair* button to close the pop-up and save the key.
You should then see an entry with the saved key (the fingerprint value
will be different than the example below).
{{< figure src="/images/13599043.png" >}}
<img src="/images/13599043.png">
The key pair can now be associated with any newly created instances.