Version 11 (modified by Pieter Neerincx, 13 years ago)

GCC cluster

The GCC has its own 480-core cluster. The main workhorses are 10 servers, each with:

48 cores
256 GB RAM
1 Gbit management NIC
10 Gbit NIC for a dedicated fast I/O connection to 2 PB of shared GPFS storage

For users

Login to the User Interface server

To submit jobs, check their status, test scripts, etc. you need to log in to the user interface server, a.k.a. cluster.gcc.rug.nl, using SSH. Please note that cluster.gcc.rug.nl is only available from within certain RUG/UMCG subnets. From outside you need a double hop: first log in to the proxy:

$> ssh [your_account]@proxy.gcc.rug.nl

followed by:

$> ssh [your_account]@cluster.gcc.rug.nl

If you are inside one of these RUG/UMCG subnets, you can skip the proxy and log in to cluster.gcc.rug.nl directly.
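
With a recent OpenSSH client the double hop can usually be done in one step using the -J (ProxyJump) option. The sketch below is not part of the original setup: it assumes an OpenSSH client of version 7.3 or newer on your own machine and a proxy that permits connection forwarding. gcc_login_cmd is a hypothetical helper that only prints the command, so you can inspect it before running it:

```shell
#!/bin/bash
# Hypothetical helper: print the single-step login command for a given account.
# Assumes an OpenSSH client that supports -J (ProxyJump), i.e. version 7.3 or newer.
gcc_login_cmd() {
    local account="$1"
    echo "ssh -J ${account}@proxy.gcc.rug.nl ${account}@cluster.gcc.rug.nl"
}

# Print the command using the account name of the current local user.
gcc_login_cmd "$(id -un)"
```

If your cluster account differs from your local account name, pass it explicitly as the first argument.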

Available queues

In order to quickly test jobs you are allowed to run them directly on cluster.gcc.rug.nl outside the scheduler. Please think twice though before you hit enter: if you crash cluster.gcc.rug.nl, others can no longer submit or monitor their jobs, which is pretty annoying. On the other hand it's not a disaster, as the scheduler and execution daemons run on physically different servers and hence are not affected by a crash of cluster.gcc.rug.nl.

To test how your jobs perform on an execution node and to get an idea of the typical resource requirements for your analysis, you should submit a few jobs to the test queues first. The test queues run on a dedicated execution node, so if your jobs accidentally make that server run out of disk space, run out of memory or do other nasty things, it will not affect the production queues or the production nodes.

Once you've tested your job scripts and are sure they will behave nicely and perform well, you can submit jobs to the production queue named gcc. In case you happen to be part of the gaf group and need to process high priority sequence data for the Genome Analysis Facility, you can also use the gaf queue.

Queue	Job type	Limits
test-short	debugging	10 minutes max. walltime per job; limited to a single test node / 48 cores
test-long	debugging	max 4 jobs running simultaneously per user; limited to half the test node / 24 cores
gcc	production - default prio	none
gaf	production - high prio	only available to users from the gaf group
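
Before submitting to test-short it can help to check that the requested walltime stays within the 10 minute limit listed above. A minimal sketch, assuming walltimes are written in HH:MM:SS notation; walltime_to_seconds and fits_test_short are hypothetical helper names, not part of Torque:

```shell
#!/bin/bash
# Convert a walltime in HH:MM:SS notation to seconds.
walltime_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed if the walltime fits within the 10 minute (600 second) limit of test-short.
fits_test_short() {
    [ "$(walltime_to_seconds "$1")" -le 600 ]
}

if fits_test_short "00:06:00"; then
    echo "00:06:00 fits in test-short"
fi
```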

Useful commands

Please refer to the Torque manuals for a complete overview. Some examples:

Submitting jobs:

Simple submit of job script to the default queue, which routes your job to the gcc production queue:

$> qsub myScript.sh

Submitting a job with a jobname different from the filename of the submitted script (default) and with a dependency on a previously submitted job. This job will not start before the dependency has completed successfully:

$> qsub -N [nameOfYourJob] -W depend=afterok:[ID of a previously submitted job] myScript.sh
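
Because qsub prints the ID of the newly submitted job, dependencies can be chained in a script instead of copying job IDs by hand. The following is a sketch under assumptions rather than a tested part of this setup: submit_chain is a hypothetical helper and QSUB is a hypothetical override hook that makes dry runs possible.

```shell
#!/bin/bash
# Submit job scripts as a chain: every job starts only after the previous one
# completed successfully. Relies on qsub printing the ID of the new job on stdout.
# QSUB is a hypothetical override hook for dry runs; by default the real qsub is used.
submit_chain() {
    local previous_id="" job_id script
    for script in "$@"; do
        if [ -z "$previous_id" ]; then
            job_id=$("${QSUB:-qsub}" "$script") || return 1
        else
            job_id=$("${QSUB:-qsub}" -W "depend=afterok:${previous_id}" "$script") || return 1
        fi
        echo "${script} -> ${job_id}"
        previous_id="$job_id"
    done
}
```

Example usage with hypothetical script names:

$> submit_chain step1.sh step2.sh step3.sh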

Instead of providing arguments to qsub on the command line, you can also add them to your (bash) job script using the #PBS syntax, which is a special type of comment, like this:

#!/bin/bash
#PBS -N jobName
#PBS -q test-short
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:06:00
#PBS -l mem=10mb
#PBS -e /some/path/to/your/testScript1.err
#PBS -o /some/path/to/your/testScript1.out

[Your actual work...]

Checking for the status of your jobs:

Default output for all users:

$> qstat

Long job names:

$> wqstat

Limit output to your own jobs:

$> wqstat -u [your_account]

Get "full", a.k.a. detailed, output for a specific job (you probably don't want that for all jobs):

$> qstat -f [jobID]

Get other detailed status info for a specific job:

$> checkjob [jobID]

List jobs based on priority as in who is next in the queue:

$> diagnose -p

List available nodes:

$> pbsnodes
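
For a quick overview of how busy the cluster is, the default qstat output can be summarized per job state. This is a sketch that assumes the usual default qstat layout: two header lines followed by the columns Job id, Name, User, Time Use, S (state) and Queue. It does not apply to the alternate layout printed by qstat -u. count_job_states is a hypothetical helper name:

```shell
#!/bin/bash
# Count jobs per state (for example Q = queued, R = running) from default
# qstat output read from stdin. The state is taken from the second to last column.
count_job_states() {
    awk 'NR > 2 && NF >= 6 { count[$(NF - 1)]++ }
         END { for (state in count) print state, count[state] }' | sort
}
```

Example usage:

$> qstat | count_job_states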

For admins

Servers

Function	DNS	IP	Daemons	Comments
User interface node	cluster.gcc.rug.nl	195.169.22.156	- (clients only)	Login node to submit and inspect jobs. Relatively powerful machine. Users can run code outside the scheduler for debugging purposes.
scheduler VM	scheduler01	195.169.22.214	pbs_server, maui	Dedicated scheduler. No user logins if this one is currently the production scheduler.
scheduler VM	scheduler02	195.169.22.190	pbs_server, maui	Dedicated scheduler. No user logins if this one is currently the production scheduler.
Execution node	targetgcc01	192.168.211.191	pbs_mom	Dedicated test node: only the test-short and test-long queues run on this node. Crashing the test node shall not affect production!
Execution node	targetgcc02	192.168.211.192	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc03	192.168.211.193	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc04	192.168.211.194	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc05	192.168.211.195	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc06	192.168.211.196	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc07	192.168.211.197	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc08	192.168.211.198	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc09	192.168.211.199	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.
Execution node	targetgcc10	192.168.211.200	pbs_mom	Redundant production node: only the default gcc and priority gaf queues run on this node.

PBS software / flavour

The current setup uses the resource manager Torque 2.5.12 combined with the scheduler Maui 3.3.1.

Maui

Runs only on the schedulers with config files in $MAUI_HOME:

/usr/local/maui/

Torque

Torque clients are available on all servers.
Torque's pbs_server daemon runs only on the schedulers.
Torque's pbs_mom daemon runs only on the execution nodes where the real work is done.
Torque config files are installed in $TORQUE_HOME:

/var/spool/torque/

Dual scheduler setup for seamless cluster upgrades

We use two schedulers: scheduler01 and scheduler02. These alternate as production and test scheduler. The production scheduler is hooked up to cluster.gcc.rug.nl and does not allow direct user logins. Hence you cannot submit jobs from the production scheduler itself, but only from cluster.gcc.rug.nl. The other one is the test scheduler, which does not have a dedicated user interface machine and does allow direct user logins: to submit jobs to the test scheduler you need to log in to it directly. When it is time to upgrade software or tweak the Torque/Maui configs:

We drain a few nodes: running jobs are allowed to finish, but no new ones will start.
On the production scheduler as root:
```
$> qmgr -c 'set node targetgcc[0-9][0-9] state = offline'
```
Once idle, move the drained nodes from the production to the test scheduler.
Change the name of the scheduler in both these files on each node to be moved:
```
$TORQUE_HOME/server_name
$TORQUE_HOME/mom_priv/config
```
On each execution node where the config changed run as root:
```
$> service pbs_mom restart
```
On the test scheduler as root:
```
$> qmgr -c 'set node targetgcc[0-9][0-9] state = online'
```
Check the change in available execution nodes using:
```
$> pbsnodes
```
Test the new setup
Disable direct logins to the test scheduler
Enable direct logins to the production scheduler
Disable job submission from cluster.gcc.rug.nl on the production scheduler
Take cluster.gcc.rug.nl offline
Make cluster.gcc.rug.nl the user interface and submit host for the test scheduler
Take cluster.gcc.rug.nl back online: the test scheduler is now the new production scheduler and vice versa
Drain additional nodes and move them to the new production scheduler
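
The drain and re-enable steps above can be sketched as a loop over explicitly listed nodes, assuming that targetgcc[0-9][0-9] in the qmgr commands is meant as a placeholder for individual node names. set_node_state is a hypothetical helper and QMGR is a hypothetical override hook, neither is part of Torque:

```shell
#!/bin/bash
# Set the state (offline or online) of one or more explicitly named nodes.
# QMGR is a hypothetical override hook for dry runs; by default the real qmgr is used.
set_node_state() {
    local state="$1" node
    shift
    for node in "$@"; do
        "${QMGR:-qmgr}" -c "set node ${node} state = ${state}" || return 1
    done
}
```

On the production scheduler as root, draining two nodes would then look like this:

$> set_node_state offline targetgcc09 targetgcc10

Setting QMGR=echo prints the qmgr arguments instead of executing them, which allows a dry run before touching production.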

Installation details

/etc/hosts files

Extremely important: make sure hosts are named consistently in the /etc/hosts files on all hosts that are part of the cluster. More explicitly:

there shall be only one line per IP address.
in case of multiple names/aliases for the same IP address these shall all be listed on all hosts and in exactly the same order on that single line.

Inconsistent naming of hosts will result in miscommunication between Torque or Maui daemons. A typical symptom of inconsistent host names is when qsub fails to register job dependencies.
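
The first rule can be checked automatically. The sketch below lists every IP address that occurs on more than one line of a hosts file; duplicate_ips is a hypothetical helper name. It does not cover the second rule: for that you would still need to compare the files of the different hosts with each other, for example using diff.

```shell
#!/bin/bash
# Print every IP address that occurs on more than one line of a hosts file.
# Comment lines and blank lines are ignored. No output means the file passes this check.
duplicate_ips() {
    awk '/^[ \t]*(#|$)/ { next }
         { count[$1]++ }
         END { for (ip in count) if (count[ip] > 1) print ip }' "$1" | sort
}

duplicate_ips /etc/hosts
```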

Our current config files:

$TORQUE_HOME/mom_priv/config
$MAUI_HOME/maui.cfg
Other Torque settings can be loaded from a file using qmgr.
To export/inspect the settings use:
```
$> qmgr -c 'p s'
```
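
Since the output of 'p s' consists of qmgr commands, it can be saved to a file and loaded again by feeding that file to qmgr on standard input. A minimal sketch; export_torque_settings is a hypothetical helper, QMGR is a hypothetical override hook for testing and the file name in the usage example is made up:

```shell
#!/bin/bash
# Export the current Torque server settings to a file. The output is written to a
# temporary file first, so a failed export does not leave a truncated file behind.
# QMGR is a hypothetical override hook for testing; by default the real qmgr is used.
export_torque_settings() {
    local target="$1"
    "${QMGR:-qmgr}" -c 'p s' > "${target}.tmp" || { rm -f "${target}.tmp"; return 1; }
    mv "${target}.tmp" "$target"
}
```

Example usage as root on a scheduler, followed by loading the settings again:

$> export_torque_settings torque_settings.txt
$> qmgr < torque_settings.txt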

Init scripts

Both the Torque and Maui source downloads contain a contrib folder with /etc/init.d/ scripts to start/stop the daemons:

/etc/init.d/pbs_server: SuSE flavor (suse.pbs_server) | Redhat/CentOS/Fedora flavor (redhat.pbs_server)
/etc/init.d/maui: SuSE flavor (suse.maui) | Redhat/CentOS/Fedora flavor (redhat.maui)
/etc/init.d/pbs_mom: SuSE flavor (suse.pbs_mom)

We use versions patched for:

The location where the daemons are installed.
The run levels at which the daemons should be started or stopped.
Dependencies: GPFS is explicitly defined as service required for starting/stopping the Torque and Maui daemons.
Make sure to check whether your scheduler runs on a SuSE or Redhat/CentOS/Fedora VM

To install:

On scheduler[01|02] as root:

$> cp [suse|redhat].pbs_server /etc/init.d/pbs_server; chkconfig --add pbs_server; service pbs_server status
$> cp [suse|redhat].maui       /etc/init.d/maui;       chkconfig --add maui;       service maui status

On targetgcc[01-10] as root:

$> cp *.pbs_mom    /etc/init.d/pbs_mom;    chkconfig --add pbs_mom;    service pbs_mom status
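
To pick the matching flavor of the patched scripts, the distribution can be guessed from its release file. This sketch assumes that SuSE systems provide /etc/SuSE-release and that Redhat/CentOS/Fedora systems provide /etc/redhat-release, which was common for distributions of this generation but is not guaranteed. init_flavor is a hypothetical helper name:

```shell
#!/bin/bash
# Guess which init script flavor applies by looking for distribution release files.
# The optional argument is a root directory; it defaults to / and is handy for testing.
init_flavor() {
    local root="${1:-}"
    if [ -e "${root}/etc/SuSE-release" ]; then
        echo "suse"
    elif [ -e "${root}/etc/redhat-release" ]; then
        echo "redhat"
    else
        echo "unknown"
        return 1
    fi
}
```

The result matches the prefixes of the attached scripts, so on a scheduler the copy step could become:

$> cp "$(init_flavor).pbs_server" /etc/init.d/pbs_server

Note that the pbs_mom script is only provided in SuSE flavor.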

wqstat

We have patched torque-2.5.12/src/cmds/qstat.c and recompiled the clients to create wqstat, which reports long job names up to 40 characters as compared to the default 16. As normal user:

$> cd torque-2.5.12
$> ./configure --with-default-server=scheduler01 --disable-server --disable-mom --prefix=/some/other/location/cluster_clients/
$> make
$> make install

As root:

$> cp /some/other/location/cluster_clients/bin/qstat   /usr/local/bin/wqstat

Attachments (9)

gcc_pbs_mom.config.txt (805 bytes) - added by Pieter Neerincx 13 years ago. pbs_mom config
suse.maui (1.2 KB) - added by Pieter Neerincx 13 years ago. /etc/init.d/ script for maui
suse.pbs_mom (1.8 KB) - added by Pieter Neerincx 13 years ago. /etc/init.d/ script for pbs_mom
suse.pbs_server (1.7 KB) - added by Pieter Neerincx 13 years ago. /etc/init.d/ script for pbs_server
qstat.c (55.1 KB) - added by Pieter Neerincx 13 years ago. Patched qstat.c for long job names
gcc_maui.cfg.txt (659 bytes) - added by Pieter Neerincx 13 years ago. maui.cfg
redhat.maui (565 bytes) - added by Pieter Neerincx 13 years ago. /etc/init.d/ script for maui (Redhat/CentOS/Fedora flavor)
redhat.pbs_server (2.3 KB) - added by Pieter Neerincx 13 years ago. /etc/init.d/ script for pbs_server (Redhat/CentOS/Fedora flavor)
gcc_torque.txt (2.5 KB) - added by Pieter Neerincx 13 years ago. torque setup commands to be imported with qmgr
