+++
title = "Using Anaconda Package Manager"
description = "How to use the Anaconda Package Manager on HCC resources."
weight = 10
+++
Anaconda, from Anaconda, Inc., is a completely free, enterprise-ready distribution for large-scale data
processing, predictive analytics, and scientific computing. It includes
over 195 of the most popular Python packages for science, math,
engineering, and data analysis. It also offers the ability to easily
create custom environments by mixing and matching different versions
of Python and/or R and other packages into isolated environments that
individual users are free to create. Anaconda includes the conda
package and environment manager to make managing these environments
straightforward.
- Using Anaconda
- Searching for Packages
- Creating Custom Anaconda Environments
- Using /common for environments
- Adding and Removing Packages from an Existing Environment
- Creating Custom GPU Anaconda Environment
- Using an Anaconda Environment in a Jupyter Notebook
Using Anaconda
While the standard methods of installing packages via `pip` and `easy_install`
work with Anaconda, the preferred method is using the `conda` command.
{{% notice info %}} Full documentation on using Conda is available at http://conda.pydata.org/docs/
A cheatsheet is also provided. {{% /notice %}}
A few examples of the basic commands are provided here. For a full explanation of all of Anaconda/Conda's capabilities, see the documentation linked above.
Anaconda is provided through the `anaconda` module on HCC machines. To
begin using it, load the Anaconda module.
{{% panel theme="info" header="Load the Anaconda module to start using Conda" %}}
{{< highlight bash >}}
module load anaconda
{{< /highlight >}}
{{% /panel %}}
To display general information about Conda/Anaconda, use the `info` subcommand.
{{% panel theme="info" header="Display general information about Conda/Anaconda" %}}
{{< highlight bash >}}
conda info
{{< /highlight >}}
{{% /panel %}}
Conda allows the easy creation of isolated, custom environments with
packages and versions of your choosing. To show all currently available
environments, and which one is active, use the `info` subcommand with the `-e` option.
{{% panel theme="info" header="List available environments" %}}
{{< highlight bash >}}
conda info -e
{{< /highlight >}}
{{% /panel %}}
The active environment will be marked with an asterisk (*) character.
The `list` command will show all packages installed
in the currently active environment.
{{% panel theme="info" header="List installed packages in current environment" %}}
{{< highlight bash >}}
conda list
{{< /highlight >}}
{{% /panel %}}
Searching for Packages
To find packages, use the `search` subcommand.
{{% panel theme="info" header="Search for packages" %}}
{{< highlight bash >}}
conda search numpy
{{< /highlight >}}
{{% /panel %}}
If the package is available, this will also display available package versions and compatible Python versions the package may be installed under.
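A search can also be narrowed with a version constraint. The sketch below is only an illustration: the package name and the version bound are example values, and the `command -v` check simply keeps the snippet harmless on a machine where the Anaconda module has not been loaded.

```shell
# Example package and version constraint (illustrative values only).
pkg="numpy"
spec="${pkg}>=1.17"

if command -v conda >/dev/null 2>&1; then
    # Quote the spec: unquoted, the shell would treat ">" as an output
    # redirection and write to a file named "=1.17".
    conda search "$spec"
else
    echo "conda not found; run 'module load anaconda' first" >&2
fi
```

The quoting is the important detail here; the rest of the command is the same `conda search` call shown above.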
Creating Custom Anaconda Environments
The `create` command is used to create a new environment. It requires
at a minimum a name for the environment, and at least one package to
install. For example, suppose we wish to create a new environment, and
need version 1.17 of NumPy.
{{% panel theme="info" header="Create a new environment by providing a name and package specification" %}}
{{< highlight bash >}}
conda create -n mynumpy numpy=1.17
{{< /highlight >}}
{{% /panel %}}
This will create a new environment called 'mynumpy' and install NumPy version 1.17, along with any required dependencies.
To use the environment, we must first activate it.
{{% panel theme="info" header="Activate environment" %}}
{{< highlight bash >}}
conda activate mynumpy
{{< /highlight >}}
{{% /panel %}}
Our new environment is now active, and we can use it. The shell prompt will change to indicate this as well.
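Besides the changed prompt, a quick check along the following lines can confirm which environment is active and which NumPy it provides. This is a minimal sketch, assuming it is run in the same shell session right after `conda activate mynumpy`; `CONDA_DEFAULT_ENV` is the variable conda sets on activation.

```shell
# Name of the environment conda reports as active ("none" if nothing is active).
active_env="${CONDA_DEFAULT_ENV:-none}"
echo "Active environment: ${active_env}"

# Show which interpreter is first on PATH and, if NumPy is importable, its version.
if command -v python >/dev/null 2>&1; then
    command -v python
    python -c "import numpy; print('NumPy', numpy.__version__)" \
        || echo "NumPy is not importable from this interpreter"
fi
```

With the environment active, the interpreter path is expected to point inside the environment's directory rather than at a system-wide Python.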
Using /common for environments
By default, conda environments are installed in the user's home
directory at `~/.conda/envs`.
This is fine for smaller environments, but larger environments (especially ML/AI-based ones) can quickly
exhaust the space in the home directory.
For larger environments, we recommend using the `$COMMON` folder instead. To do so, use the `-p` option
instead of `-n` for `conda create`. For example, to create the same environment as above but
place it in the folder `$COMMON/mynumpy` instead:
{{% panel theme="info" header="Create environment in /common" %}}
{{< highlight bash >}}
conda create -p $COMMON/mynumpy numpy=1.17
{{< /highlight >}}
{{% /panel %}}
To activate the environment, you must use the full path.
{{% panel theme="info" header="Activate environment in /common" %}}
{{< highlight bash >}}
conda activate $COMMON/mynumpy
{{< /highlight >}}
{{% /panel %}}
Please note that in order to use environments in `$COMMON`, you'll need to add the
`#SBATCH --licenses=common` directive to your submit scripts.
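As a rough sketch of how this directive fits into a job, the snippet below writes out a small submit script that uses the `$COMMON/mynumpy` environment created above. The file name, job name, time limit, and memory request are placeholder assumptions, not HCC recommendations; only the `--licenses=common` line and the activation commands come from this page.

```shell
# Write a hypothetical submit script to a file so it can be reviewed first.
cat > submit_common_env.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=mynumpy-test
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --licenses=common

# Load Anaconda, then activate the environment by its full path.
module load anaconda
conda activate $COMMON/mynumpy

python -c "import numpy; print(numpy.__version__)"
EOF

# Submit it afterwards with: sbatch submit_common_env.sh
```

The heredoc delimiter is quoted so that `$COMMON` is written literally and expanded only when the job runs.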
Adding and Removing Packages from an Existing Environment
To install additional packages in an environment, use the `install`
subcommand. Suppose we want to install IPython in our 'mynumpy'
environment. While the environment is active, use `install` with no
additional options.
{{% panel theme="info" header="Install a new package in the currently active environment" %}}
{{< highlight bash >}}
conda install ipython
{{< /highlight >}}
{{% /panel %}}
If you aren't currently in the environment you wish to install the
package in, add the `-n` option to specify the name.
{{% panel theme="info" header="Install new packages in a specified environment" %}}
{{< highlight bash >}}
conda install -n mynumpy ipython
{{< /highlight >}}
{{% /panel %}}
The `remove` subcommand, which uninstalls a package, works the same way.
{{% panel theme="info" header="Remove package from currently active environment" %}}
{{< highlight bash >}}
conda remove ipython
{{< /highlight >}}
{{% /panel %}}
{{% panel theme="info" header="Remove package from environment specified by name" %}}
{{< highlight bash >}}
conda remove -n mynumpy ipython
{{< /highlight >}}
{{% /panel %}}
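To confirm that a package was actually added or removed, one option is to filter the output of `conda list`. The sketch below assumes the 'mynumpy' environment and the `ipython` package from the examples above; the variable names are only for illustration.

```shell
env_name="mynumpy"   # environment from the examples above
pkg="ipython"        # package to look for

# conda list treats its argument as a regular expression; the anchors
# restrict the match to the exact package name.
pattern="^${pkg}\$"

if command -v conda >/dev/null 2>&1; then
    conda list -n "$env_name" "$pattern"
else
    echo "conda not found; run 'module load anaconda' first" >&2
fi
```

If the package is installed, it appears in the listing; after a successful `conda remove`, the listing contains no package rows.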
To exit an environment, we deactivate it.
{{% panel theme="info" header="Exit current environment" %}}
{{< highlight bash >}}
conda deactivate
{{< /highlight >}}
{{% /panel %}}
Finally, to completely remove an environment, add the `--all` option to `remove`.
{{% panel theme="info" header="Completely remove an environment" %}}
{{< highlight bash >}}
conda remove -n mynumpy --all
{{< /highlight >}}
{{% /panel %}}
Creating Custom GPU Anaconda Environment
We provide GPU versions of various frameworks, such as `tensorflow`, `keras`, and `theano`, via modules.
However, sometimes you may need additional libraries or packages that are not available as part of these modules.
In this case, you will need to create your own GPU Anaconda environment.
To do this, you need to first clone one of our GPU modules to a new Anaconda environment, and then install the desired packages in this new environment.
The reason for this is that the GPU modules we support are built using the specific CUDA drivers our GPU nodes have. If you just create a custom GPU environment without cloning the module, your code will not utilize the GPUs correctly.
For example, if you want to use `tensorflow` with additional packages, first do:
{{% panel theme="info" header="Cloning GPU module to a new Anaconda environment" %}}
{{< highlight bash >}}
module load tensorflow-gpu/py36/1.14
module load anaconda
conda create -n tensorflow-gpu-1.14-custom --clone $CONDA_DEFAULT_ENV
module purge
{{< /highlight >}}
{{% /panel %}}
This will create a new `tensorflow-gpu-1.14-custom` environment in your home directory that is a copy of the `tensorflow-gpu` module.
Then, you can install the additional packages you need in this environment.
{{% panel theme="info" header="Install new packages in the currently active environment" %}}
{{< highlight bash >}}
module load anaconda
conda activate tensorflow-gpu-1.14-custom
conda install <packages>
{{< /highlight >}}
{{% /panel %}}
Next, whenever you want to use this custom GPU Anaconda environment, you need to add these two lines in your submit script:
{{< highlight bash >}}
module load anaconda
conda activate tensorflow-gpu-1.14-custom
{{< /highlight >}}
{{% notice info %}}
If you have a custom GPU Anaconda environment, please only use the two lines from above and DO NOT load the module you have cloned earlier.
Using `module load tensorflow-gpu/py36/1.14` and `conda activate tensorflow-gpu-1.14-custom` in the same script is wrong and may give you various errors and incorrect results.
{{% /notice %}}
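To show where the two lines sit in a complete job, here is a minimal sketch that writes out a GPU submit script. The file name, job name, resource requests, the `gpu` partition name, the `--gres` value, and `train.py` are all placeholder assumptions; check the cluster's own submission documentation for the right values.

```shell
# Write a hypothetical GPU submit script to a file so it can be reviewed first.
cat > submit_gpu_custom.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=tf-custom
#SBATCH --time=01:00:00
#SBATCH --mem=8G
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

# Load only Anaconda and the cloned environment; the GPU module that the
# environment was cloned from must not be loaded in the same script.
module load anaconda
conda activate tensorflow-gpu-1.14-custom

python train.py
EOF

# Submit it afterwards with: sbatch submit_gpu_custom.sh
```

Because the cloned environment lives in the home directory by default, this sketch does not request the `common` license.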
Using an Anaconda Environment in a Jupyter Notebook
It is not difficult to make an Anaconda environment available to a
Jupyter Notebook. To do so, follow the steps below, replacing `myenv`
with the name of the Python or R environment you wish to use:
- Stop any running Jupyter Notebooks and ensure you are logged out of the JupyterHub instance on the cluster you are using.
  - If you are not logged out, please click the Control Panel button located in the top right corner.
  - Click the "Stop My Server" button to terminate the Jupyter server.
  - Click the logout button in the top right corner.
- Using the command-line environment of the login node, load the target conda environment:
  {{< highlight bash >}}conda activate myenv{{< /highlight >}}
- Install the Jupyter kernel and add the environment:
  - For a Python conda environment, install the `ipykernel` package, and then the kernel specification:
    {{< highlight bash >}}
    # Install ipykernel
    conda install ipykernel
    # Install the kernel specification
    python -m ipykernel install --user --name "$CONDA_DEFAULT_ENV" --display-name "Python ($CONDA_DEFAULT_ENV)"
    {{< /highlight >}}
  - For an R conda environment, install the `jupyter_client` and IRkernel packages, and then the kernel specification:
    {{< highlight bash >}}
    # Install PNG support for R, the R kernel for Jupyter, and the Jupyter client
    conda install r-png
    conda install r-irkernel jupyter_client
    # Install jupyter_client 5.2.3 from anaconda channel for bug workaround
    conda install -c anaconda jupyter_client
    # Install the kernel specification
    R -e "IRkernel::installspec(name = '$CONDA_DEFAULT_ENV', displayname = 'R ($CONDA_DEFAULT_ENV)', user = TRUE)"
    {{< /highlight >}}
- Once you have the environment set up, deactivate it:
  {{< highlight bash >}}conda deactivate{{< /highlight >}}
- Log in to JupyterHub and create a new notebook using the environment by selecting the correct entry in the "New" dropdown menu in the top right corner.
{{< figure src="/images/24151931.png" height="400" class="img-border">}}
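If the new entry does not appear in the menu, one way to check whether the kernel specification was registered is to list the installed kernels from the login node. This is a sketch under the assumption that the `jupyter` command is available in the current shell (for example, while the environment with `ipykernel` installed is still active); `myenv` is the same placeholder name used in the steps above.

```shell
kernel_name="myenv"   # placeholder: the environment name used in the steps above

if command -v jupyter >/dev/null 2>&1; then
    # List all registered kernel specifications and look for ours.
    # The match is case-insensitive in case the kernel name is stored
    # in a different case than the environment name.
    jupyter kernelspec list | grep -i "$kernel_name" \
        || echo "No kernel matching '${kernel_name}' is registered"
else
    echo "jupyter not found; activate the environment first" >&2
fi
```

Kernels installed with the `--user` option are registered under the home directory, so they should be listed regardless of which environment is active.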