HTCondor

Fundamentals

HTCondor is a high-throughput job scheduler for high-performance computing (HPC) environments. It receives jobs from users, queues them, and allocates shared resources from a cluster of worker nodes to run them.
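
To make the allocation idea concrete, here is a minimal toy sketch in Python. It is not HTCondor's actual matchmaking, which relies on ClassAds and a negotiator with priorities and fair-share; it only assumes a simple first-fit rule over CPU and memory. The names Job, Node, and schedule are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical job record: a name plus the resources it requests.
    name: str
    cpus: int
    memory_mb: int


@dataclass
class Node:
    # Hypothetical worker node with its currently free resources.
    name: str
    free_cpus: int
    free_memory_mb: int
    running: list = field(default_factory=list)

    def fits(self, job):
        # A job fits only if both CPU and memory requests can be met.
        return job.cpus <= self.free_cpus and job.memory_mb <= self.free_memory_mb

    def start(self, job):
        # Reserve the requested resources and record the job as running.
        self.free_cpus -= job.cpus
        self.free_memory_mb -= job.memory_mb
        self.running.append(job.name)


def schedule(queue, nodes):
    """Place each queued job on the first node with enough free resources.

    Returns (placements, idle): a dict mapping job name to node name,
    and a list of job names that could not be placed and stay queued.
    """
    placements = {}
    idle = []
    for job in queue:
        node = next((n for n in nodes if n.fits(job)), None)
        if node is None:
            idle.append(job.name)
        else:
            node.start(job)
            placements[job.name] = node.name
    return placements, idle


if __name__ == "__main__":
    nodes = [Node("worker-1", 4, 8192), Node("worker-2", 8, 16384)]
    queue = [Job("align", 4, 4096), Job("assemble", 8, 16384), Job("plot", 1, 512)]
    placements, idle = schedule(queue, nodes)
    print(placements)
    print(idle)
```

In this toy run the first two jobs consume every CPU in the pool, so the third job stays idle until resources free up. A real scheduler behaves the same way in that respect: jobs that cannot be matched wait in the queue.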

As an SRE for Galaxy, I have run an HTCondor-managed HPC cluster where thousands of jobs are submitted every day. I also migrated the cluster from HTCondor 8 to HTCondor 23 with minimal issues and no downtime.