+++
title = "Using Anaconda Package Manager"
description = "How to use the Anaconda Package Manager on HCC resources."
weight=10
+++

[Anaconda](https://www.anaconda.com/what-is-anaconda),
from [Anaconda, Inc](https://www.anaconda.com)
is a completely free enterprise-ready distribution for large-scale data
processing, predictive analytics, and scientific computing. It includes
over 195 of the most popular Python packages for science, math,
engineering, and data analysis. **It also offers the ability to easily
create custom _environments_ by mixing and matching different versions
of Python and/or R and other packages into isolated environments that
individual users are free to create.**  Anaconda includes the `conda`
package and environment manager to make managing these environments
straightforward.

- [Using Anaconda](#using-anaconda)
- [Searching for Packages](#searching-for-packages)
- [Creating Custom Anaconda Environment](#creating-custom-anaconda-environment)
- [Adding and Removing Packages from an Existing Environment](#adding-and-removing-packages-from-an-existing-environment)
- [Creating Custom GPU Anaconda Environment](#creating-custom-gpu-anaconda-environment)
- [Using an Anaconda Environment in a Jupyter Notebook on Crane](#using-an-anaconda-environment-in-a-jupyter-notebook-on-crane)

### Using Anaconda

While the standard methods of installing packages via `pip`
and `easy_install` work with Anaconda, the preferred method is using
the `conda` command.  

{{% notice info %}}
Full documentation on using Conda is available
at http://conda.pydata.org/docs/

A [cheatsheet](/attachments/11635089.pdf) is also provided.
{{% /notice %}}

A few examples of the basic commands are provided here.  For a full
explanation of all of Anaconda/Conda's capabilities, see the
documentation linked above. 

Anaconda is provided through the `anaconda` module on HCC machines.  To
begin using it, load the Anaconda module.

{{% panel theme="info" header="Load the Anaconda module to start using Conda" %}}
{{< highlight bash >}}
module load anaconda
{{< /highlight >}}
{{% /panel %}}

To display general information about Conda/Anaconda, use the `info` subcommand.

{{% panel theme="info" header="Display general information about Conda/Anaconda" %}}
{{< highlight bash >}}
conda info
{{< /highlight >}}
{{% /panel %}}

Conda allows the easy creation of isolated, custom environments with
packages and versions of your choosing.  To show all currently available
environments, and which is active, use the `info` subcommand with the
`-e` option.

{{% panel theme="info" header="List available environments" %}}
{{< highlight bash >}}
conda info -e
{{< /highlight >}}
{{% /panel %}}

The active environment will be marked with an asterisk (\*) character.
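
As a small illustration of how that marker can be used in a script, the sketch below pulls the active environment's name out of `conda info -e`-style output. The function name `active_env` is hypothetical (it is not part of Conda), and the sketch assumes each line consists of a name, an optional `*`, and a path; environments listed without a name are not handled.

```bash
# Hypothetical helper (not part of conda): print the name of the environment
# marked with "*" in `conda info -e`-style output read from stdin.
# Assumes each non-comment line looks like: NAME [*] PATH
active_env() {
    awk '!/^#/ && $2 == "*" { print $1 }'
}

# Usage, guarded so it only runs where conda is on the PATH:
if command -v conda >/dev/null 2>&1; then
    conda info -e | active_env
fi
```

In a shell where an environment has been activated, the `$CONDA_DEFAULT_ENV` variable holds the same name.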

The `list` command will show all packages installed
in the currently active environment.

{{% panel theme="info" header="List installed packages in current environment" %}}
{{< highlight bash >}}
conda list
{{< /highlight >}}
{{% /panel %}}

### Searching for Packages

To find packages, use the `search` subcommand.

{{% panel theme="info" header="Search for packages" %}}
{{< highlight bash >}}
conda search numpy
{{< /highlight >}}
{{% /panel %}}

If the package is available, this will also display available package
versions and compatible Python versions the package may be installed
under.

### Creating Custom Anaconda Environment

The `create` command is used to create a new environment.  It requires
at a minimum a name for the environment, and at least one package to
install.  For example, suppose we wish to create a new environment, and
need version 1.17 of NumPy.

{{% panel theme="info" header="Create a new environment by providing a name and package specification" %}}
{{< highlight bash >}}
conda create -n mynumpy numpy=1.17
{{< /highlight >}}
{{% /panel %}}

This will create a new environment called 'mynumpy' and install NumPy
version 1.17, along with any required dependencies.

To use the environment, we must first *activate* it.

{{% panel theme="info" header="Activate environment" %}}
{{< highlight bash >}}
conda activate mynumpy
{{< /highlight >}}
{{% /panel %}}

Our new environment is now active, and we can use it.  The shell prompt will change to indicate this as well.

### Adding and Removing Packages from an Existing Environment

To install additional packages in an environment, use the `install`
subcommand.  Suppose we want to install IPython in our 'mynumpy'
environment.  While the environment is active, use `install` with only
the package name.

{{% panel theme="info" header="Install a new package in the currently active environment" %}}
{{< highlight bash >}}
conda install ipython
{{< /highlight >}}
{{% /panel %}}

If you aren't currently in the environment you wish to install the
package in, add the `-n` option to specify the name.

{{% panel theme="info" header="Install new packages in a specified environment" %}}
{{< highlight bash >}}
conda install -n mynumpy ipython
{{< /highlight >}}
{{% /panel %}}

The `remove` subcommand, which uninstalls a package, works the same way.

{{% panel theme="info" header="Remove package from currently active environment" %}}
{{< highlight bash >}}
conda remove ipython
{{< /highlight >}}
{{% /panel %}}

{{% panel theme="info" header="Remove package from environment specified by name" %}}
{{< highlight bash >}}
conda remove -n mynumpy ipython
{{< /highlight >}}
{{% /panel %}}

To exit an environment, we *deactivate* it.

{{% panel theme="info" header="Exit current environment" %}}
{{< highlight bash >}}
conda deactivate
{{< /highlight >}}
{{% /panel %}}

Finally, to completely remove an environment, add the `--all` option
to `remove`.

{{% panel theme="info" header="Completely remove an environment" %}}
{{< highlight bash >}}
conda remove -n mynumpy --all
{{< /highlight >}}
{{% /panel %}}

### Creating Custom GPU Anaconda Environment

We provide GPU versions of various frameworks, such as `tensorflow`, `keras`, and `theano`, via [modules](../../modules). 
However, sometimes you may need additional libraries or packages that are not available as part of these modules. 
In this case, you will need to create your own GPU Anaconda environment.

To do this, you need to first clone one of our GPU modules to a new Anaconda environment, and then install the desired packages in this new environment.

The reason for this is that the GPU modules we support are built using the specific CUDA drivers our GPU nodes have. 
If you create a custom GPU environment without cloning the module, your code will not utilize the GPUs correctly.


For example, if you want to use `tensorflow` with additional packages, first do:
{{% panel theme="info" header="Cloning GPU module to a new Anaconda environment" %}}
{{< highlight bash >}}
module load tensorflow-gpu/py36/1.14
module load anaconda
conda create -n tensorflow-gpu-1.14-custom --clone $CONDA_DEFAULT_ENV
module purge
{{< /highlight >}}
{{% /panel %}}

This will create a new `tensorflow-gpu-1.14-custom` environment in your home directory that is a copy of the `tensorflow-gpu` module. 
Then, you can install the additional packages you need in this environment.
{{% panel theme="info" header="Install new packages in the currently active environment" %}}
{{< highlight bash >}}
module load anaconda
conda activate tensorflow-gpu-1.14-custom
conda install <packages>
{{< /highlight >}}
{{% /panel %}}

Next, whenever you want to use this custom GPU Anaconda environment, add these two lines to your submit script:
{{< highlight bash >}}
module load anaconda
conda activate tensorflow-gpu-1.14-custom
{{< /highlight >}}

{{% notice info %}}
If you have a custom GPU Anaconda environment, please use only the two lines above and **DO NOT** load the module you cloned earlier. 
Using `module load tensorflow-gpu/py36/1.14` and `conda activate tensorflow-gpu-1.14-custom` in the same script is **wrong** and may cause errors and incorrect results.
{{% /notice %}}
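
One way to catch this mistake before submitting a job is to scan the submit script for both commands. The following is a minimal sketch, not an HCC-provided tool: the function name `check_submit_script` and the file name `submit.sh` are hypothetical, and the pattern only looks for the `tensorflow-gpu` module used in the example above.

```bash
# Hypothetical lint helper (not an HCC tool): fail if a submit script both
# loads a tensorflow-gpu module and activates a conda environment.
check_submit_script() {
    local script="$1"
    if grep -Eq '^[[:space:]]*module[[:space:]]+load[[:space:]]+tensorflow-gpu' "$script" &&
       grep -Eq '^[[:space:]]*conda[[:space:]]+activate[[:space:]]' "$script"; then
        echo "WARNING: $script loads a tensorflow-gpu module and activates a conda environment" >&2
        return 1
    fi
    return 0
}

# Usage (submit.sh is a placeholder name):
#   check_submit_script submit.sh && sbatch submit.sh
```

The check is deliberately narrow; to cover other cloned GPU modules, the first pattern would need to be adjusted to match their names.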

### Using an Anaconda Environment in a Jupyter Notebook on Crane

It is not difficult to make an Anaconda environment available to a
Jupyter Notebook. To do so, follow the steps below, replacing
`myenv` with the name of the Python or R environment you wish to use:

1.  Stop any running Jupyter Notebooks and ensure you are logged out of
    the JupyterHub instance on the cluster you are using.
    1.  If you are not logged out, please click the Control Panel button
        located in the top right corner.
    2.  Click the "Stop My Server" Button to terminate the Jupyter
        server.
    3.  Click the logout button in the top right corner.  
          
2.  Using the command-line environment, load the target conda
    environment:
    {{< highlight bash >}}conda activate myenv{{< /highlight >}}

3.  Install the Jupyter kernel and add the environment:

    1.  For a **Python** conda environment, install the IPykernel
        package, and then the kernel specification:

        {{< highlight bash >}}
        # Install ipykernel
        conda install ipykernel

        # Install the kernel specification
        python -m ipykernel install --user --name "$CONDA_DEFAULT_ENV" --display-name "Python ($CONDA_DEFAULT_ENV)"
        {{< /highlight >}}

    2.  For an **R** conda environment, install the jupyter\_client and
        IRkernel packages, and then the kernel specification:

        {{< highlight bash >}}
        # Install PNG support for R, the R kernel for Jupyter, and the Jupyter client
        conda install r-png
        conda install r-irkernel jupyter_client

        # Install jupyter_client 5.2.3 from anaconda channel for bug workaround
        conda install -c anaconda jupyter_client

        # Install the kernel specification
        R -e "IRkernel::installspec(name = '$CONDA_DEFAULT_ENV', displayname = 'R ($CONDA_DEFAULT_ENV)', user = TRUE)"
        {{< /highlight >}}

4.  Once you have the environment set up, deactivate it:
    {{< highlight bash >}}conda deactivate{{< /highlight >}}

5.  To make your conda environments accessible from the worker nodes,
    enter the following commands:

    {{< highlight bash >}}
    mkdir -p $WORK/.jupyter
    mv ~/.local/share/jupyter/kernels $WORK/.jupyter
    ln -s $WORK/.jupyter/kernels ~/.local/share/jupyter/kernels
    {{< /highlight >}}

{{% notice note %}}
**Note**: Step 5 only needs to be done once. Any environments created
afterward will automatically be accessible from SLURM notebooks
once this is done.
{{% /notice %}}

6.  Log in to JupyterHub and create a new notebook using the
    environment by selecting the
    correct entry in the `New` dropdown menu in the top right
    corner.  
    {{< figure src="/images/24151931.png" height="400" class="img-border">}}
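
Since Step 5 above only needs to be run once, it can be convenient to wrap it so that repeated runs are harmless. The sketch below is one possible way to do that, under stated assumptions: the function name `link_jupyter_kernels` is hypothetical, it takes the home and work directories as optional arguments (defaulting to `$HOME` and `$WORK`), and it does nothing if the kernels path is already a symbolic link.

```bash
# Hypothetical wrapper around Step 5: move the Jupyter kernels directory to
# the work filesystem and symlink it back, skipping the work if already done.
link_jupyter_kernels() {
    local home_dir="${1:-$HOME}"
    local work_dir="${2:-$WORK}"
    local src="$home_dir/.local/share/jupyter/kernels"
    local dest="$work_dir/.jupyter/kernels"

    # Already a symlink: Step 5 was done earlier, nothing to do.
    if [ -L "$src" ]; then
        echo "Already linked: $src"
        return 0
    fi
    # No kernels installed yet (Step 3 has not been run).
    if [ ! -d "$src" ]; then
        echo "No kernels directory at $src; install a kernel first" >&2
        return 1
    fi
    # Do not clobber a kernels directory that already exists under work.
    if [ -e "$dest" ]; then
        echo "Refusing to overwrite existing $dest" >&2
        return 1
    fi

    mkdir -p "$work_dir/.jupyter"
    mv "$src" "$dest"
    ln -s "$dest" "$src"
}

# Usage: link_jupyter_kernels            # uses $HOME and $WORK
```

Running the function a second time prints an "Already linked" message and leaves the existing link untouched.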