Commit addb85eb authored by Garhan Attebury

Some updates to the facilities docs. More to come.

parent 063902d0
@@ -2,15 +2,17 @@
title: "Facilities of the Holland Computing Center"
---
-This document details the equipment resident in the Holland Computing Center (HCC) as of December 2020.
+This document details the equipment resident in the Holland Computing Center (HCC) as of June 2022.
-HCC has two primary locations directly interconnected by a 100 Gbps primary link with a 10 Gbps backup. The 1800 sq. ft. HCC machine room at the Peter Kiewit Institute (PKI) in Omaha can provide up to 500 kVA in UPS and genset protected power, and 160 ton cooling. A 2200 sq. ft. second machine room in the Schorr Center at the University of Nebraska-Lincoln (UNL) can currently provide up to 100 ton cooling with up to 400 kVA of power. Dell S4248FB-ON edge switches and Z9264F-ON core switches provide high WAN bandwidth and Software Defined Networking (SDN) capability for both locations. The Schorr and PKI machine rooms both have 100 Gbps paths to the University of Nebraska, Internet2, and ESnet as well as backup 10 Gbps paths. HCC uses multiple data transfer nodes as well as a FIONA (flash IO network appliance) to facilitate end-to-end performance for data intensive workflows.
+HCC has two primary locations directly interconnected by a 100 Gbps primary link with a 10 Gbps backup. The 1800 sq. ft. HCC machine room at the Peter Kiewit Institute (PKI) in Omaha can provide up to 500 kVA in UPS and genset protected power, and 160 ton cooling. A 2200 sq. ft. second machine room in the Schorr Center at the University of Nebraska-Lincoln (UNL) can currently provide up to 100 ton cooling with up to 400 kVA of power. Dell S4248FB-ON edge switches and Z9264F-ON core switches provide high WAN bandwidth and Software Defined Networking (SDN) capability for both locations. The Schorr and PKI machine rooms both have 100 Gbps paths to the University of Nebraska, Internet2, and ESnet as well as a 100 Gbps backup path. HCC uses multiple data transfer nodes as well as a FIONA (flash IO network appliance) to facilitate end-to-end performance for data intensive workflows.
-HCC's resources at UNL include two distinct offerings: Rhino and Red. Rhino is a Linux cluster dedicated to general campus usage with 7,040 compute cores interconnected by low-latency Mellanox QDR InfiniBand networking. 360 TB of BeeGFS storage is complemented by 50 TB of NFS storage and 1.5 TB of local scratch per node. Each compute node is a Dell R815 server with at least 192 GB RAM and 4 Opteron 6272 / 6376 (2.1 / 2.3 GHz) processors.
+HCC's main resources at UNL include Red, a high-throughput cluster for high energy physics, and hardware supporting the PATh, PRP, and OSG NSF projects. The largest machine on the Lincoln campus is Red, with 15,984 job slots interconnected by a mixture of 1, 10, 25, 40, and 100 Gbps Ethernet. Red serves up over 11 PB of storage using the CEPH filesystem. Red primarily serves as a major site for storage and analysis in the international high energy physics project known as CMS (Compact Muon Solenoid) and is integrated with the Open Science Grid (OSG).
-The largest machine on the Lincoln campus is Red, with 15,984 job slots interconnected by a mixture of 1, 10, 25, 40, and 100 Gbps Ethernet. More importantly, Red serves up over 11 PB of storage using the Hadoop Distributed File System (HDFS). Red primarily serves as a major site for storage and analysis in the international high energy physics project known as CMS (Compact Muon Solenoid) and is integrated with the Open Science Grid (OSG).
+Other resources at UNL include hardware supporting the PATh, PRP, and OSG projects as well as the off-site replica of the Attic archival storage system.
-HCC's resources at PKI (Peter Kiewit Institute) in Omaha include the Crane and Anvil clusters along with the Attic and Common storage services.
+HCC's resources at PKI (Peter Kiewit Institute) in Omaha include the Swan, Crane and Anvil clusters along with the Attic and Common storage services.
+Swan is the newest HPC resource and currently contains 8,848 modern CPU cores with high-speed Mellanox HDR100 interconnects and 5.3 PB of Lustre scratch storage. Swan additionally contains 24x NVIDIA T4 GPUs and will be expanded over time as HCC's primary HPC system.
Crane debuted at 474 on the Top500 list with an HPL benchmark of 121.8 TeraFLOPS. Intel Xeon chips (8-core, 2.6 GHz) provide the processing, with 4 GB RAM available per core and a total of 12,236 cores. The cluster shares 1.5 PB of Lustre storage and contains HCC's GPU resources. We have since expanded the existing cluster: 96 nodes with new Intel Xeon E5-2697 v4 chips and a 100 Gbps Intel Omni-Path interconnect were added to Crane. Moreover, Crane has 43 GPU nodes with 110 NVIDIA GPUs in total, which enable state-of-the-art research, from drug discovery to deep learning.
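As a rough check on the Crane figures above, here is a back-of-the-envelope peak estimate. It assumes the original 452 dual-socket nodes with 8-core 2.6 GHz Xeons (the Relion 2840e systems listed in the Crane section; the 12,236-core total includes the later expansion) and 8 double-precision FLOPs per core per cycle, an assumption about the CPU generation that the document does not state:

$$
R_{\text{peak}} \approx 452 \times 16 \times 2.6\,\text{GHz} \times 8 \approx 150\ \text{TFLOPS},
\qquad
\frac{R_{\text{max}}}{R_{\text{peak}}} \approx \frac{121.8}{150} \approx 81\%
$$

Under those assumptions, the quoted 121.8 TeraFLOPS HPL result corresponds to roughly 81% of theoretical peak.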
@@ -24,18 +26,7 @@ These resources are detailed further below.
# 1. HCC at UNL Resources
-## 1.1 Rhino
-* 107 4-socket Opteron 6272 / 6376 (16-core, 2.1 / 2.3 GHz) with 192 or 256 GB RAM
-* 2x with 512 GB RAM, 2x with 1024 GB RAM
-* Mellanox QDR InfiniBand
-* 1 and 10 GbE networking
-* 5x Dell N3048 switches
-* 50TB shared storage (NFS) -> /home
-* 360TB BeeGFS storage over InfiniBand -> /work
-* 1.5TB local scratch
-## 1.2 Red
+## 1.1 Red
* USCMS Tier-2 resource, available opportunistically via the Open Science Grid
* 18 2-socket Xeon Gold 6248R (3.00GHz) (96 slots per node)
@@ -52,7 +43,7 @@ These resources are detailed further below.
* 1 2-socket Xeon E5-1660 v3 (3.0GHz) (16 slots per node)
* 40 2-socket Opteron 6128 (2.0GHz) (32 slots per node)
* 40 4-socket Opteron 6272 (2.1GHz) (64 slots per node)
-* 11 PB HDFS storage
+* 11 PB CEPH storage
* Mix of 1, 10, 25, 40, and 100 GbE networking
* 2x Dell Z9264F-ON switches
* 1x Dell S5248F-ON switch
@@ -62,14 +53,27 @@ These resources are detailed further below.
* 2x Dell S4810 switches
* 5x Dell N3048 switches
-## 1.3 Silo (backup mirror for Attic)
+## 1.2 Silo (backup mirror for Attic)
* 1 Mercury RM216 2U Rackmount Server 2-socket Xeon E5-2630 (12 cores total, 2.6GHz)
* 10 Mercury RM445J 4U Rackmount JBOD with 45x 4TB NL SAS Hard Disks
# 2. HCC at PKI Resources
-## 2.1 Crane
+## 2.1 Swan
+* 144 PowerEdge R650 2-socket Xeon Gold 6348 (28-core, 2.6GHz) with 256GB RAM
+* 12 PowerEdge R650 2-socket Xeon Gold 6348 (28-core, 2.6GHz) with 256GB RAM and 2x T4 GPUs
+* 2 PowerEdge R650 2-socket Xeon Gold 6348 (28-core, 2.6GHz) with 2TB RAM
+* Mellanox HDR100 InfiniBand
+* 25 GbE networking with 4x Dell N5248F-ON switches
+* Management network with 6x Dell N3248TE-ON switches
+* 10TB NVMe-backed /home filesystem
+* 5.3PB Lustre /work filesystem
+* 3.5TB local flash scratch per node
+## 2.2 Crane
* 452 Relion 2840e systems from Penguin
* 452x with 64 GB RAM
@@ -120,12 +124,12 @@ These resources are detailed further below.
* 2 NVIDIA V100S GPUs
-## 2.2 Attic
+## 2.3 Attic
* 1 Mercury RM216 2U Rackmount Server 2-socket Xeon E5-2630 (6-core, 2.6GHz)
* 10 Mercury RM445J 4U Rackmount JBOD with 45x 4TB NL SAS Hard Disks
-## 2.3 Anvil
+## 2.4 Anvil
* 76 PowerEdge R630 systems
* 76x with 256 GB RAM
@@ -143,7 +147,7 @@ These resources are detailed further below.
* 10 GbE networking
* 6x Dell S4048-ON switches
-## 2.4 Shared Common Storage
+## 2.5 Shared Common Storage
* Storage service providing 1.9PB usable capacity
* 6 SuperMicro 1028U-TNRTP+ systems