---
title: "Facilities of the Holland Computing Center"
---

This document details the equipment resident in the Holland Computing Center (HCC) as of November 2018.

HCC has two primary locations directly interconnected by a pair of 10 Gbps fiber optic links (20 Gbps total). The 1800 sq. ft. HCC machine room at the Peter Kiewit Institute (PKI) in Omaha can provide up to 500 kVA in UPS- and genset-protected power and 160 tons of cooling. A second, 2200 sq. ft. machine room in the Schorr Center at the University of Nebraska-Lincoln (UNL) can currently provide up to 100 tons of cooling with up to 400 kVA of power. One Brocade MLXe router and two Dell Z9264F-ON core switches in each location provide both high WAN bandwidth and Software Defined Networking (SDN) capability. The Schorr machine room connects to campus and Internet2/ESnet at 100 Gbps, while the PKI machine room connects at 10 Gbps. HCC uses multiple data transfer nodes as well as a FIONA (Flash I/O Network Appliance) to facilitate end-to-end performance for data-intensive workflows.

HCC's resources at UNL include two distinct offerings: Rhino and Red. Rhino is a Linux cluster dedicated to general campus usage with 7,040 compute cores interconnected by low-latency Mellanox QDR InfiniBand networking. 360 TB of BeeGFS storage is complemented by 50 TB of NFS storage and 1.5 TB of local scratch per node. Each compute node is a Dell R815 server with at least 192 GB RAM and 4 Opteron 6272 / 6376 (2.1 / 2.3 GHz) processors.

The largest machine on the Lincoln campus is Red, with 9,536 job slots interconnected by a mixture of 1, 10, and 40 Gbps Ethernet. More importantly, Red serves over 6.6 PB of storage using the Hadoop Distributed File System (HDFS). Red is integrated with the Open Science Grid (OSG) and serves as a major site for storage and analysis in the international high energy physics project known as CMS (Compact Muon Solenoid).

HCC's resources at PKI (Peter Kiewit Institute) in Omaha include Crane, Anvil, Attic, and Common storage.

Crane debuted at 474 on the Top500 list with an HPL benchmark of 121.8 TeraFLOPS. Intel Xeon chips (8-core, 2.6 GHz) provide the processing, with 4 GB RAM available per core and a total of 12,236 cores. The cluster shares 1.5 PB of Lustre storage and contains HCC's GPU resources. The cluster has since been expanded: 96 nodes with newer Intel Xeon E5-2697 v4 chips and a 100 Gbps Intel Omni-Path interconnect were added to Crane. Moreover, Crane has 21 GPU nodes with 57 NVIDIA GPUs in total, which enable state-of-the-art research ranging from drug discovery to deep learning.

Anvil is an OpenStack cloud environment consisting of 1,520 cores and 400 TB of Ceph storage, all connected by 10 Gbps networking. The Anvil cloud exists to address the needs of NU researchers that cannot be served by traditional scheduler-based HPC environments, such as GUI applications, Windows-based software, test environments, and persistent services. In addition, a project to expand Ceph storage by 1.1 PB is in progress.

Attic and Silo form a near-line archive with 1.0 PB of usable storage. Attic is located at PKI in Omaha, while Silo acts as an online backup located in Lincoln. Both Attic and Silo are connected with 10 Gbps network connections.

In addition to the cluster-specific Lustre storage, a shared common storage space with 1.9 PB capacity exists between all HCC resources.

These resources are detailed further below.

# 1. HCC at UNL Resources

## 1.1 Rhino
* 107 4-socket Opteron 6272 / 6376 (16-core, 2.1 / 2.3 GHz) with 192 or 256 GB RAM
    * 2x with 512 GB RAM, 2x with 1024 GB RAM
* Mellanox QDR InfiniBand
* 1 and 10 GbE networking
    * 5x Dell N3048 switches
* 50TB shared storage (NFS) -> /home
* 360TB BeeGFS storage over InfiniBand -> /work
* 1.5TB local scratch per node

## 1.2 Red

* USCMS Tier-2 resource, available opportunistically via the Open Science Grid
* 60 2-socket Xeon E5530 (2.4GHz) (16 slots per node)
* 16 2-socket Xeon E5520 (2.27 GHz) (16 slots per node)
* 36 2-socket Xeon X5650 (2.67GHz) (24 slots per node)
* 16 2-socket Xeon E5-2640 v3 (2.6GHz) (32 slots per node)
* 40 2-socket Xeon E5-2650 v3 (2.3GHz) (40 slots per node)
* 24 4-socket Opteron 6272 (2.1 GHz) (64 slots per node)
* 28 2-socket Xeon E5-2650 v2 (2.6GHz) (32 slots per node)
* 48 2-socket Xeon E5-2660 (2.2GHz) (32 slots per node)
* 24 2-socket Xeon E5-2660 v4 (2.0GHz) (56 slots per node)
* 2 2-socket Xeon E5-1660 v3 (3.0GHz) (16 slots per node)
* 10.8 PB HDFS storage
* Mix of 1, 10, and 40 GbE networking
    * 1x Dell S6000-ON switch
    * 2x Dell S4048-ON switch
    * 5x Dell S3048-ON switches
    * 2x Dell S4810 switches
    * 2x Dell N3048 switches
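
As a quick consistency check, the per-node slot counts listed above for Red can be tallied against the 9,536 job slots quoted in the overview. The following is a minimal sketch, not an HCC tool; the names `RED_NODE_GROUPS` and `total_slots` are hypothetical and introduced here only for illustration.

```python
# (node_count, slots_per_node) pairs copied from the Red node list above.
# The variable and function names are illustrative assumptions.
RED_NODE_GROUPS = [
    (60, 16),  # 2-socket Xeon E5530
    (16, 16),  # 2-socket Xeon E5520
    (36, 24),  # 2-socket Xeon X5650
    (16, 32),  # 2-socket Xeon E5-2640 v3
    (40, 40),  # 2-socket Xeon E5-2650 v3
    (24, 64),  # 4-socket Opteron 6272
    (28, 32),  # 2-socket Xeon E5-2650 v2
    (48, 32),  # 2-socket Xeon E5-2660
    (24, 56),  # 2-socket Xeon E5-2660 v4
    (2, 16),   # 2-socket Xeon E5-1660 v3
]


def total_slots(groups):
    """Sum node_count * slots_per_node over all node groups."""
    return sum(count * slots for count, slots in groups)


print(total_slots(RED_NODE_GROUPS))  # 9536
```

The result agrees with the job-slot total given in the overview for Red.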

## 1.3 Silo (backup mirror for Attic)

* 1 Mercury RM216 2U Rackmount Server 2-socket Xeon E5-2630 (6-core, 2.6GHz)
* 10 Mercury RM445J 4U Rackmount JBOD with 45x 4TB NL SAS Hard Disks
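
For reference, the raw disk capacity implied by the JBOD listing above can be worked out directly. This is a rough sketch using decimal terabytes; the helper name `raw_capacity_tb` is hypothetical. The document does not state the redundancy or filesystem layout, so the gap between this raw figure and the 1.0 PB of usable storage quoted in the overview is not broken down here.

```python
def raw_capacity_tb(enclosures, disks_per_enclosure, disk_size_tb):
    """Raw (pre-redundancy) capacity in decimal TB for identical enclosures."""
    return enclosures * disks_per_enclosure * disk_size_tb


# 10 JBOD enclosures, each holding 45 disks of 4 TB, as listed above.
silo_raw_tb = raw_capacity_tb(enclosures=10, disks_per_enclosure=45, disk_size_tb=4)
print(silo_raw_tb)  # 1800
```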

# 2. HCC at PKI Resources

## 2.1 Crane

* 452 Relion 2840e systems from Penguin
    * 452x with 64 GB RAM
    * 2-socket Intel Xeon E5-2670 (8-core, 2.6GHz)
    * Intel QDR InfiniBand
* 96 nodes from multiple vendors
    * 59x with 256 GB RAM
    * 37x with 512 GB RAM
    * 2-socket Intel Xeon E5-2697 v4 (18-core, 2.3GHz)
    * Intel Omni-Path
* 1 and 10 GbE networking
    * 4x 10 GbE switches
    * 14x 1 GbE switches
* 1500 TB Lustre storage over InfiniBand
* 3 Supermicro SYS-6016GT systems
    * 48 GB RAM
    * 2-socket Intel Xeon E5620 (4-core, 2.4GHz)
    * 2 Nvidia M2070 GPUs
* 3 Supermicro SYS-1027GR-TSF systems
    * 128 GB RAM
    * 2-socket Intel Xeon E5-2630 (6-core, 2.3GHz)
    * 3 Nvidia K20M GPUs
* 1 Supermicro SYS-5017GR-TF system
    * 32 GB RAM
    * 1-socket Intel Xeon E5-2650 v2 (8-core, 2.6GHz)
    * 2 Nvidia K40C GPUs
* 5 Supermicro SYS-2027GR-TRF systems
    * 64 GB RAM
    * 2-socket Intel Xeon E5-2650 v2 (8-core, 2.6GHz)
    * 4 Nvidia K40M GPUs
* 2 Supermicro SYS-5018GR-T systems
    * 64 GB RAM
    * 2-socket Intel Xeon E5-2620 v4 (8-core, 2.1GHz)
    * 2 Nvidia P100 GPUs


## 2.2 Attic

* 1 Mercury RM216 2U Rackmount Server 2-socket Xeon E5-2630 (6-core, 2.6GHz)
* 10 Mercury RM445J 4U Rackmount JBOD with 45x 4TB NL SAS Hard Disks


## 2.3 Anvil

* 76 PowerEdge R630 systems
    * 76x with 256 GB RAM
    * 2-socket Intel Xeon E5-2650 v3 (10-core, 2.3GHz)
    * Dual 10Gb Ethernet
* 12 PowerEdge R730xd systems
    * 12x with 128 GB RAM
    * 2-socket Intel Xeon E5-2630L v3 (8-core, 1.8GHz)
    * 12x 4TB NL SAS Hard Disks and 2x 200 GB SSD
    * Dual 10 Gb Ethernet
* 2 PowerEdge R320 systems
    * 2x with 48 GB RAM
    * 1-socket Intel Xeon E5-2403 v3 (4-core, 1.8GHz)
    * Quad 10Gb Ethernet
* 10 GbE networking
    * 6x Dell S4048-ON switches
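
The 1,520-core figure quoted for Anvil in the overview can be reproduced from the PowerEdge R630 entry above. This is a minimal sketch that assumes only the 76 R630 systems are counted toward that total; the helper name `total_cores` is hypothetical.

```python
def total_cores(nodes, sockets_per_node, cores_per_socket):
    """Physical core count for a group of identical nodes."""
    return nodes * sockets_per_node * cores_per_socket


# 76 PowerEdge R630 systems, each 2-socket with 10-core Xeon E5-2650 v3 CPUs.
# Assumption: the R730xd and R320 systems are not part of the quoted core total.
anvil_compute_cores = total_cores(nodes=76, sockets_per_node=2, cores_per_socket=10)
print(anvil_compute_cores)  # 1520
```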

## 2.4 Shared Common Storage

* Storage service providing 1.9PB usable capacity
* 6 SuperMicro 1028U-TNRTP+ systems
    * 2-socket Intel Xeon E5-2637 v4 (4-core, 3.5GHz)
    * 256 GB RAM
    * 120x 4TB SAS Hard Disks
* 2 SuperMicro 1028U-TNRTP+ systems
    * 2-socket Intel Xeon E5-2637 v4 (4-core, 3.5GHz)
    * 128 GB RAM
    * 6x 200 GB SSD
* Intel Omni-Path
* 10 GbE networking