Documentation of the CIIRC Cluster
Welcome to the CIIRC cluster documentation pages.
Danger
The cluster has a limited number of nodes. When allocating large jobs, keep in mind that you are not alone here. Never allocate large, long-running jobs that block other users.
DO NOT run remote IDE connections against node-head. They cause heavy overload of the NFS filesystem.
All heavy computing has to be done via Slurm on the GPU/compute nodes.
Breaking these rules will first result in your session being terminated without warning; after that, you will be denied access to the cluster.
These pages contain all the documentation formerly hosted on the CIIRC Dokuwiki and presented in informational mails. This documentation is written in the hope that it will be clearer and easier to navigate.
We all benefit from well-written and up-to-date documentation. Therefore, contributions to improving the docs are more than welcome.
Drop a line to it@ciirc.cvut.cz and become a member of the documentation team. We are grateful for your support.
Cluster access
Note
Everyone with a CIIRC account can access the master node of the cluster via SSH at cluster.ciirc.cvut.cz. To gain permission to use the Slurm job planning service, please send a request ticket through the CIIRC HelpDesk.
On the master node you can look around, explore the available software modules, try your code, etc. If you want to perform heavy computing via Slurm (see below), you have to apply for additional permission. Please ask via the HelpDesk or mail it@ciirc.cvut.cz.
It is not recommended to run heavy computations on the login / head / master node. Please use interactive jobs for that.
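For example, an interactive shell on a compute node can be requested with srun (a minimal sketch; adjust the partition and resource values to your needs):
srun --partition=compute --cpus-per-task=4 --mem=8G --pty bash -i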
SSH access without a password
Copy your public key to the head node to connect to the cluster without typing your password every time.
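If you do not have an SSH key pair yet, generate one first (a standard OpenSSH command; RSA is used here to match the id_rsa.pub path below):
ssh-keygen -t rsa -b 4096
Then copy the public key to the cluster: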
ssh-copy-id -i ~/.ssh/id_rsa.pub <your_cluster_user>@cluster.ciirc.cvut.cz
Once this is done, you can simply ssh <your_cluster_user>@cluster.ciirc.cvut.cz without a password.
Note
To have a personal certificate generated by CIIRC, follow the instructions on the wiki page Personal certificate.
SSH username autofill
If your local username is the same as your CIIRC / cluster username, you can just ssh cluster.ciirc.cvut.cz.
If your local username differs from the cluster username,
add the following entry into ~/.ssh/config
to allow connecting by calling ssh cluster.ciirc.cvut.cz
(without specifying the username explicitly):
Host cluster.ciirc.cvut.cz
User <your_cluster_user>
Source: Stack Overflow
Basic info
The CIIRC computational cluster serves CIIRC members for their intensive batch computations. It also offers acceleration on graphics cards. The cluster is under continuous development.
The CIIRC cluster is an XSEDE Compatible Basic Cluster based on the OpenHPC project. It runs on CentOS Linux release 7.5 and uses Slurm as its workload manager and job scheduler. Cluster management and orchestration are done with the Warewulf toolkit.
User software is managed with EasyBuild, a software build and installation framework, and is made available through the Lmod environment module system.
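For example, you can explore and load software modules like this (standard Lmod commands; the Python module name is only an illustrative example, check module avail for what is actually installed):
module avail          # list all available software modules
module spider Python  # search for a module and its versions
module load Python    # load a module into your environment
module list           # show the modules currently loaded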
The container platform on the CIIRC cluster is Singularity.
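For example, a container image can be pulled and run roughly like this (a sketch only; the Docker image used here is an arbitrary example):
singularity pull docker://ubuntu:20.04
singularity exec ubuntu_20.04.sif cat /etc/os-release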
The cluster currently comprises 12 nodes, split into gpu and compute partitions (for details about the nodes, please see the cluster system configuration):
PARTITION   NODES   NODELIST          CPUS   MEMORY (MB)   GRES
gpu         1       node-01           56     257552        gpu:1080Ti:1
gpu         4       node-[14-17]      72     192088        gpu:1080Ti:4
gpu         2       node-[11-12]      56     257552        gpu:K40:2
compute*    5       node-[02-05,09]   56     257552        (null)
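A minimal batch script requesting one GPU on the gpu partition could look roughly like this (a sketch only; the resource values and the script body are placeholders to adapt to your own job):

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00

nvidia-smi            # show the GPU allocated to the job
python my_script.py   # placeholder for your actual workload

Submit the script with sbatch <script_name> and check its state with squeue -u $USER.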
Nodes are interconnected by 10 Gbit Ethernet; 100 Gbit EDR InfiniBand is also available. All nodes contain a local SSD scratch disk, and the whole cluster is connected to the 600 TB Isilon NAS storage.
The present Slurm configuration is fairly standard:
- SelectType=select/cons_res
- SelectTypeParameters=CR_CPU_Memory
- SchedulerType=sched/backfill
- PriorityType=priority/multifactor
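In practice, this means that both CPUs and memory are tracked per job (so request memory explicitly) and that accurate --time limits help the backfill scheduler start your job sooner. With the multifactor priority plugin, you can inspect pending-job priorities and your fair-share usage with standard Slurm commands (the output depends on the site configuration):
sprio -l           # detailed priority-factor breakdown for pending jobs
sshare -u $USER    # fair-share usage for your account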
Future extensions and plans:
- at least 3 nodes will be added soon
- BeeGFS distributed scratch space
- five DGX-1 nodes this year
- better QOS settings to improve cluster throughput