+++
title = "Condor Jobs on HCC"
description = "How to run jobs using Condor on HCC machines"
weight = "54"
+++

This quick start demonstrates how to run multiple copies of a Fortran/C program using Condor on HCC supercomputers. The sample codes and submit scripts can be downloaded from condor_dir.zip.

Login to an HCC Cluster

Log in to an HCC cluster through PuTTY (for Windows users) or Terminal (for Mac/Linux users) and make a subdirectory called condor_dir under the $WORK directory. In the subdirectory condor_dir, create job subdirectories that host the input data files. Here we create two job subdirectories, job_0 and job_1, and put a data file (data.dat) in each subdirectory. The data file in job_0 has a column of data listing the integers from 1 to 5. The data file in job_1 has an integer list from 6 to 10.

{{< highlight bash >}}
$ cd $WORK
$ mkdir condor_dir
$ cd condor_dir
$ mkdir job_0
$ mkdir job_1
{{< /highlight >}}
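The two data files can also be generated from the shell instead of a text editor. This is a minimal sketch, assuming a POSIX shell with `seq` available and run from the $WORK directory (`mkdir -p` is harmless if the directories already exist):

```shell
# Populate each job subdirectory with its data file:
# job_0/data.dat holds the integers 1-5, job_1/data.dat holds 6-10.
mkdir -p condor_dir/job_0 condor_dir/job_1
seq 1 5  > condor_dir/job_0/data.dat
seq 6 10 > condor_dir/job_1/data.dat
```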

In the subdirectory condor_dir, save all the relevant codes. Here we include two demo programs, demo_f_condor.f90 and demo_c_condor.c, that compute the sum of the data stored in each job subdirectory (job_0 and job_1). The parallelization scheme works as follows. First, the master node sends out copies of the executable from the condor_dir subdirectory, along with a copy of the data file in each job subdirectory. The number of executable copies is specified in the submit script (queue), and it usually matches the number of job subdirectories. Next, the workload is distributed among a pool of worker nodes. At any given time, the number of available worker nodes may vary. Each worker node executes its job independently of the other worker nodes, and the output files are stored separately in the job subdirectories. No additional coding is needed to turn the serial code "parallel"; parallelization here is achieved entirely through the submit script.

{{%expand "demo_f_condor.f90" %}} {{< highlight fortran >}}
Program demo_f_condor
    implicit none
    integer, parameter :: N = 5
    real*8 w
    integer i
    common/sol/ x
    real*8 x
    real*8, dimension(N) :: y_local
    real*8, dimension(N) :: input_data

    open(10, file='data.dat')

    do i = 1,N
        read(10,*) input_data(i)
    enddo

    do i = 1,N
        w = input_data(i)*1d0
        call proc(w)
        y_local(i) = x
        write(6,*) 'i,x = ', i, y_local(i)
    enddo
    write(6,*) 'sum(y) =', sum(y_local)

    Stop
End Program

Subroutine proc(w)
    real*8, intent(in) :: w
    common/sol/ x
    real*8 x

    x = w

    Return
End Subroutine
{{< /highlight >}} {{% /expand %}}

{{%expand "demo_c_condor.c" %}} {{< highlight c >}}
//demo_c_condor
#include <stdio.h>

double proc(double w){
    double x;
    x = w;
    return x;
}

int main(int argc, char* argv[]){
    int N = 5;
    double w;
    int i;
    double x;
    double y_local[N];
    double sum;
    double input_data[N];
    FILE *fp;

    fp = fopen("data.dat","r");
    for (i = 1; i <= N; i++){
        fscanf(fp, "%lf", &input_data[i-1]);
    }

    for (i = 1; i <= N; i++){
        w = input_data[i-1]*1e0;
        x = proc(w);
        y_local[i-1] = x;
        printf("i,x= %d %lf\n", i, y_local[i-1]);
    }

    sum = 0e0;
    for (i = 1; i <= N; i++){
        sum = sum + y_local[i-1];
    }

    printf("sum(y)= %lf\n", sum);

    return 0;
}
{{< /highlight >}} {{% /expand %}}


Compiling the Code

The compiled executable needs to match the "standard" environment of the worker nodes. The easiest way is to use the compilers installed on the HCC supercomputer directly, without loading extra modules. The standard compiler on HCC supercomputers is the GNU Compiler Collection; the installed version can be looked up with the commands gcc -v or gfortran -v.

{{< highlight bash >}}
$ gfortran demo_f_condor.f90 -o demo_f_condor.x
$ gcc demo_c_condor.c -o demo_c_condor.x
{{< /highlight >}}

Creating a Submit Script

Create a submit script to request 2 jobs (queue 2). The names of the job subdirectories are specified by the line initialdir: the $(process) macro appends an integer to the job subdirectory prefix job_, and the numbers run from 0 to queue-1. The name of the input data file is specified by the line transfer_input_files.
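As an illustration of the macro (not part of the submit script itself), with queue 2 HTCondor assigns $(process) the values 0 and 1, so the directory and file names expand per job. A rough sketch of the expansion:

```shell
# Sketch of how $(process) expands when "queue 2" is given:
# each job gets its own initialdir, output, and error names.
for p in 0 1; do
  echo "process $p: initialdir=job_$p output=Fortran_$p.out error=Fortran_$p.err"
done
```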

{{% panel header="submit_f.condor"%}} {{< highlight bash >}}
universe = grid
grid_resource = pbs
batch_queue = guest
should_transfer_files = yes
when_to_transfer_output = on_exit
executable = demo_f_condor.x
output = Fortran_$(process).out
error = Fortran_$(process).err
initialdir = job_$(process)
transfer_input_files = data.dat
queue 2
{{< /highlight >}} {{% /panel %}}

{{% panel header="submit_c.condor"%}} {{< highlight bash >}}
universe = grid
grid_resource = pbs
batch_queue = guest
should_transfer_files = yes
when_to_transfer_output = on_exit
executable = demo_c_condor.x
output = C_$(process).out
error = C_$(process).err
initialdir = job_$(process)
transfer_input_files = data.dat
queue 2
{{< /highlight >}} {{% /panel %}}

Submit the Job

The job can be submitted with the command condor_submit. The job status can be monitored by entering condor_q followed by your username.

{{< highlight bash >}}
$ condor_submit submit_f.condor
$ condor_submit submit_c.condor
$ condor_q <username>
{{< /highlight >}}

Replace <username> with your HCC username.

Sample Output

In the job subdirectory job_0, the sum from 1 to 5 is computed and printed to the .out file. In the job subdirectory job_1, the sum from 6 to 10 is computed and printed to the .out file. 

{{%expand "Fortran_0.out" %}} {{< highlight batchfile>}}  i,x = 1 1.0000000000000000
i,x = 2 2.0000000000000000
i,x = 3 3.0000000000000000
i,x = 4 4.0000000000000000
i,x = 5 5.0000000000000000
sum(y) = 15.000000000000000
{{< /highlight >}} {{% /expand %}}

{{%expand "Fortran_1.out" %}} {{< highlight batchfile>}}  i,x = 1 6.0000000000000000
i,x = 2 7.0000000000000000
i,x = 3 8.0000000000000000
i,x = 4 9.0000000000000000
i,x = 5 10.000000000000000
sum(y) = 40.000000000000000
{{< /highlight >}} {{% /expand %}}
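The sums in the sample output can be sanity-checked directly from the shell. This sketch assumes `seq` and `awk` are available, as they are on typical Linux login nodes:

```shell
# Sum the integers 1-5 (job_0's data) and 6-10 (job_1's data).
seq 1 5  | awk '{s += $1} END {print "sum(y) =", s}'   # prints sum(y) = 15
seq 6 10 | awk '{s += $1} END {print "sum(y) =", s}'   # prints sum(y) = 40
```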