+++
title = "FAQ"
description = "HCC Frequently Asked Questions"
weight = "95"
+++

- [I have an account, now what?](#i-have-an-account-now-what)
- [How do I change my password?](#how-do-i-change-my-password)
- [I forgot my password, how can I retrieve it?](#i-forgot-my-password-how-can-i-retrieve-it)
- [I just deleted some files and didn't mean to! Can I get them back?](#i-just-deleted-some-files-and-didn-t-mean-to-can-i-get-them-back)
- [How do I (re)activate Duo?](#how-do-i-re-activate-duo)
- [How many nodes/memory/time should I request?](#how-many-nodes-memory-time-should-i-request)
- [I am trying to run a job but nothing happens?](#i-am-trying-to-run-a-job-but-nothing-happens)
- [I keep getting the error "slurmstepd: error: Exceeded step memory limit at some point." What does this mean and how do I fix it?](#i-keep-getting-the-error-slurmstepd-error-exceeded-step-memory-limit-at-some-point-what-does-this-mean-and-how-do-i-fix-it)
- [I want to talk to a human about my problem. Can I do that?](#i-want-to-talk-to-a-human-about-my-problem-can-i-do-that)
- [My submitted job takes long time waiting in the queue or it is not running?](#my-submitted-job-takes-long-time-waiting-in-the-queue-or-it-is-not-running)

---

#### I have an account, now what?

Congrats on getting an HCC account! Now you need to connect to a Holland
cluster. To do this, we use an SSH connection. SSH stands for Secure
Shell, and it allows you to securely connect to a remote computer and
operate it just like you would a personal machine.

Depending on your operating system, you may need to install software to
make this connection. Check out our documentation on
[Connecting to HCC Clusters]({{< relref "../connecting/" >}}).

#### How do I change my password?

#### I forgot my password, how can I retrieve it?

Information on how to change or retrieve your password can be found on
the documentation page: [How to change your
password]({{< relref "/accounts/how_to_change_your_password" >}}).

All passwords must be at least 8 characters in length and must contain
at least one capital letter and one numeric digit. Passwords also cannot
contain any dictionary words. If you need help picking a good password,
consider using a (secure!) password generator such as
[this one provided by Random.org](https://www.random.org/passwords).

To preserve the security of your account, we recommend changing the
default password you were given as soon as possible.

#### I just deleted some files and didn't mean to! Can I get them back?

That depends. Where were the files you deleted?

**If the files were in your $HOME directory (/home/group/user/):** It's
possible.

$HOME directories are backed up daily and we can restore your files as
they were at the time of our last backup. Please note that any changes
made to the files between when the backup was made and when you deleted
them will not be preserved. To have these files restored, please contact
HCC Support at
{{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
as soon as possible.

**If the files were in your $WORK directory (/work/group/user/):** No.

Unfortunately, the $WORK directories are intended as a short-term place
to hold job files. This storage was designed to be quickly and easily
accessed by our worker nodes and, as such, is not backed up.
Any irreplaceable files should be backed up in a secondary location,
such as Attic, the cloud, or on your personal machine. For more
information on how to prevent file loss, check out [Preventing File
Loss]({{< relref "preventing_file_loss" >}}).
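As a rough illustration of keeping a second copy, the sketch below bundles a directory into a timestamped archive in another location. The function name `archive_dir` and the example paths are hypothetical, and the destination is only an assumption; Attic or another location may be a better fit for your data.

```bash
# Archive a directory into a timestamped tarball in another location.
# Usage: archive_dir <source_dir> <destination_dir>
archive_dir() {
    local src="$1" dest="$2"
    local name stamp archive
    name="$(basename "$src")"
    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="${dest}/${name}-${stamp}.tar.gz"
    mkdir -p "$dest" || return 1
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$src")" "$name" || return 1
    # Print the archive path so the caller can verify or move it.
    echo "$archive"
}

# Hypothetical example (not run here):
# archive_dir "$WORK/results" "$HOME/backups"
```

Listing the archive afterwards with `tar -tzf <archive>` is a quick way to confirm that the expected files were captured.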

#### How do I (re)activate Duo?

**If you have not activated Duo before:**

Please stop by
[our offices](http://hcc.unl.edu/location)
along with a photo ID and we will be happy to activate it for you. If
you are not local to Omaha or Lincoln, contact us at
{{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
and we will help you activate Duo remotely.

**If you have activated Duo previously but now have a different phone
number:**

Stop by our offices along with a photo ID and we can help you reactivate
Duo and update your account with your new phone number.

**If you have activated Duo previously and have the same phone number:**

Email us at
{{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
from the email address your account is registered under and we will send
you a new link that you can use to activate Duo.

#### How many nodes/memory/time should I request?

**Short answer:** We don’t know.

**Long answer:** The amount of resources required is highly dependent on
the application you are using, the input file sizes and the parameters
you select. Sometimes it can help to speak with someone else who has
used the software before to see if they can give you an idea of what has
worked for them.

Ultimately, it comes down to trial and error; try different
combinations and see what works and what doesn’t. Good practice is to
check the output and utilization of each job you run. This will help you
determine what parameters you will need in the future.

For more information on how to determine how many resources a completed
job used, check out the documentation on [Monitoring Jobs]({{< relref "monitoring_jobs" >}}).
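One way to review a finished job is with Slurm's `sacct` command, assuming job accounting is enabled on the cluster. The wrapper below is a minimal sketch; the function name `job_usage` is hypothetical, while the `--format` fields are standard `sacct` field names.

```bash
# Show run time and peak memory for a completed job.
# Usage: job_usage <job_id>
job_usage() {
    local job_id="$1"
    if [ -z "$job_id" ]; then
        echo "usage: job_usage <job_id>" >&2
        return 2
    fi
    if ! command -v sacct >/dev/null 2>&1; then
        echo "sacct not found; run this on a cluster login node" >&2
        return 127
    fi
    # Elapsed vs. Timelimit shows how much of the time request was used;
    # MaxRSS vs. ReqMem shows how much of the memory request was used.
    sacct -j "$job_id" --format=JobID,JobName,State,Elapsed,Timelimit,ReqMem,MaxRSS
}
```

Comparing the requested and used values over a few runs gives a reasonable starting point for the next request.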

#### I am trying to run a job but nothing happens?

Where are you trying to run the job from? You can check this by typing
the command `pwd` into the terminal.

**If you are running from inside your $HOME directory
(/home/group/user/)**:

Move your files to your $WORK directory (/work/group/user) and resubmit
your job.

The worker nodes on our clusters have read-only access to the files in
$HOME directories. This means that when a job is submitted from $HOME,
the scheduler cannot write the output and error files in the directory
and the job is killed. It appears the job does nothing because no output
is produced.
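The check described above can be sketched as a small shell helper that reports whether a directory sits under `$HOME` or `$WORK`. The function names are hypothetical, and the sketch assumes the cluster environment sets the `$WORK` variable.

```bash
# True when path $1 equals base $2 or lies beneath it; false if $2 is empty.
is_under() {
    [ -n "$2" ] || return 1
    case "$1" in
        "$2"|"$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print WORK, HOME, or OTHER for the given directory (default: current one).
where_am_i() {
    local dir="${1:-$PWD}"
    if is_under "$dir" "${WORK:-}"; then
        echo "WORK"
    elif is_under "$dir" "${HOME:-}"; then
        echo "HOME"
    else
        echo "OTHER"
    fi
}

# Report on the current directory before submitting a job.
where_am_i
```

If this prints `HOME`, move the job files to `$WORK` before submitting.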

**If you are running from inside your $WORK directory:**

Contact us at
{{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
with your login, the name of the cluster you are running on, and the
full path to your submit script and we will be happy to help solve the
issue.

#### I keep getting the error "slurmstepd: error: Exceeded step memory limit at some point." What does this mean and how do I fix it?

This error occurs when the job you are running uses more memory than was
requested in your submit script.

If you specified `--mem` or `--mem-per-cpu` in your submit script, try
increasing this value and resubmitting your job.

If you did not specify `--mem` or `--mem-per-cpu` in your submit script,
chances are the default amount allotted is not sufficient. Add the line

{{< highlight batch >}}
#SBATCH --mem=<memory_amount>
{{< /highlight >}}

to your script with a reasonable amount of memory and try running it again. If you keep
getting this error, continue to increase the requested memory amount and
resubmit the job until it finishes successfully.
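For context, here is a minimal sketch of where the memory request sits in a complete submit script. The job name, time limit, and the `8G` figure are illustrative assumptions, not recommended values, and the application line is a placeholder.

```bash
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=mem_example
# Illustrative one-hour time limit.
#SBATCH --time=01:00:00
# Illustrative memory request; raise it if the job still exceeds its limit.
#SBATCH --mem=8G
# %j expands to the job ID in the output and error file names.
#SBATCH --output=mem_example.%j.out
#SBATCH --error=mem_example.%j.err

# SLURM_JOB_ID is set by Slurm inside a running job; the fallback only
# applies when this sketch is run outside the scheduler.
JOB_LABEL="${SLURM_JOB_ID:-not-in-slurm}"
echo "Job ${JOB_LABEL} running on $(uname -n)"

# Replace the line below with your actual application command (hypothetical):
# ./my_application input.dat
```

Use either `--mem` or `--mem-per-cpu` for a given job, not both.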

For additional details on how to monitor usage on jobs, check out the
documentation on [Monitoring Jobs]({{< relref "monitoring_jobs" >}}).

If you continue to run into issues, please contact us at
{{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu)
for additional assistance.

#### I want to talk to a human about my problem. Can I do that?

Of course! We have an open door policy and invite you to stop by
[either of our offices](http://hcc.unl.edu/location)
anytime Monday through Friday between 9 am and 5 pm. One of the HCC
staff would be happy to help you with whatever problem or question you
have.  Alternatively, you can drop one of us a line and we'll arrange a
time to meet:  [Contact Us](https://hcc.unl.edu/contact-us).

#### My submitted job takes long time waiting in the queue or it is not running?
If your submitted jobs spend a long time waiting in the queue, it usually means your account has been over-utilizing resources and your fairshare score is low. This can happen if you have submitted a large number of jobs recently, if the amount of resources (memory, time) requested for the job is large, or both.
For additional details on how to monitor usage on jobs, check out the documentation on [Monitoring queued Jobs]({{< relref "monitoring_jobs" >}}).
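
As a quick first look at waiting jobs, Slurm's `squeue` command can list your pending jobs along with the scheduler's estimated start times. The wrapper below is a minimal sketch; the function name `pending_jobs` is hypothetical, and estimated start times may be unavailable or may change as the queue changes.

```bash
# List pending jobs for a user with the scheduler's estimated start times.
# Usage: pending_jobs [username]   (defaults to the current user)
pending_jobs() {
    local user="${1:-${USER:-$(id -un)}}"
    if ! command -v squeue >/dev/null 2>&1; then
        echo "squeue not found; run this on a cluster login node" >&2
        return 127
    fi
    # -u limits output to one user, -t PENDING to jobs still waiting,
    # and --start adds the expected start time for each pending job.
    squeue -u "$user" -t PENDING --start
}
```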