ce-all - CE Information Provider
(c) by Florian Schintke <schintke@zib.de>,
Thomas Röblitz <roeblitz@zib.de>,
Jörg Meltzer <meltzer@zib.de>
Konrad-Zuse-Zentrum für Informationstechnik Berlin, 2001, 2002
Author(s):
Thomas Röblitz <roeblitz@zib.de>
Florian Schintke <schintke@zib.de>
Jörg Meltzer <meltzer@zib.de>
PBS: qstat
LSF: lsid, bqueues, lshosts, bhosts, bjobs
Condor: condor_q, condor_status, condor_version, condor_config_val
Globus: globus-hostname
Cluster
'-cluster <cluster>' specifies the name of the server. The supplied server name is only honored for the PBS batch system. If the PBS server named in <cluster> doesn't exist, an attempt is made to use the default PBS server for this client.
Cluster batch system bin path
'-cluster-batch-system-bin-path <cluster-batch-system-bin-path>' specifies the path where the commands of the cluster batch system can be found.
Globus hostname script
'-globus-hostname-script <globus-hostname-script>' specifies the complete path to a script that prints out the hostname of the gatekeeper
(e.g. 'globus-hostname').
Globus config file
'-globus-config-file <globus-config-file>' specifies the path and filename of the Globus configuration file (e.g. globus-jobmanager.conf).
Grid mapfile
'-auth-users-from-grid-mapfile' reads authorized users from the grid-mapfile rather than from the static configuration file.
Host info bin path
'-host-info-bin' specifies the location of the host info script.
Local resource management system
'-lrms <lrmsstring>' specifies the local Resource Management System (default: PBS).
Maximum cputime
'-maxcputime <hh:mm:ss>' defines the maximum CPU time for a job submitted to the CE (Condor only).
Maximum wallclocktime
'-maxwalltime <hh:mm:ss>' defines the maximum wall clock time allowed for jobs submitted to the CE (Condor only).
Queue
'-queue <queue1 queue2 ...>' specifies non-RMS submission queues or a set of Condor central managers (CONDOR_HOST).
Static config file
'-static <static>' specifies the name of the file that contains static information.
VO file
'-vo <VO>' specifies the name of the file which contains the VO tag information.
Resource management system execution queue
'-rms-execution-queue <queue>' specifies the RMS execution queue (stats are only shown if it is also set in the -queue switch) (PBS & LSF only).
Resource management system submission queues
'-rms-submission-queues <queue1> <queue2> ...' specifies RMS submission queues (PBS & LSF only).
CESE Bindings
'-cesebind <configfile> | queueregex se | queueregex se directory' specifies the CE-SE binding configuration.
Ttl
'-ttl <ttl>' specifies the value for entryTtl.
Dynamic subcluster
'-sshworkernode <node>' specifies the node from which to take subcluster information. This feature is currently disabled.
General:
LSF allows various ways to configure a queue. We try to determine the CE's values directly from the bqueues command output. Usually the configuration is done node by node, so we view a queue as a set of nodes and fill in missing bqueues values from the node values instead (bhosts, lshosts commands).
MaxRunningJobs:
Unless a queue specifies MaxRunningJobs, we take the value from the nodes' MaxRunningJobs. If the admin employed a user/group based policy and no information for this attribute can be found at all, we use the total number of CPUs.
TotalCPUs:
TotalCPUs differs from queue to queue; the value is an accumulation over the hosts taken from the bqueues -l field 'HOSTS: {hosts}'. Each host's CPUs are determined first from the lshosts command and, failing that, from the bhosts MaxRunningJobs value, as sketched below.
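A minimal Perl sketch of this accumulation, with a hypothetical %Nodes layout and queue host list (not the provider's actual data structures):

    use strict;
    use warnings;

    # Hypothetical node table: per host, the CPU count reported by lshosts
    # (ncpus) and the per-host job slot limit reported by bhosts (MXJ).
    my %Nodes = (
        'node01' => { ncpus => 4,     maxjobs => 4 },
        'node02' => { ncpus => undef, maxjobs => 2 },   # lshosts gave no value
    );

    # Hosts listed in the 'HOSTS:' field of 'bqueues -l' for one queue.
    my @QueueHosts = ('node01', 'node02');

    # Accumulate TotalCPUs: prefer the lshosts CPU count, fall back to the
    # bhosts MaxRunningJobs value when lshosts gave nothing.
    my $TotalCPUs = 0;
    for my $host (@QueueHosts) {
        my $node = $Nodes{$host} or next;
        $TotalCPUs += defined $node->{ncpus} ? $node->{ncpus} : $node->{maxjobs};
    }
    print "TotalCPUs: $TotalCPUs\n";   # prints: TotalCPUs: 6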
See README.Condor
(when script is configured for all record types)
set defaults & initialize variables
parse cmdline
obtain static attribute values
create ce records
create cluster record
create subcluster records
create filesystem records
print ce records
print cluster record
print subcluster records
print remote filesystem records
call host info script
see subprocedures: setGlobalDefaults, determineLrms, setLrmsDefaults, processCommandLineParameters
Here we set some global defaults.
try to get LRMS information from the program name and check for the '-lrms' command-line option
aborts: an unknown LRMS is found on the command line; script name and command-line option are mutually exclusive
PBS & LSF:
set WP4 RMS variables
CONDOR:
set defaults for config files and policy variables
parse command line parameters
aborts: the cluster batch system's bin path could not be found; WP4 RMS managed queues are misconfigured.
The procedures
getStaticData, readStaticInformation, getCeid
obtain all relevant static data, which is integrated into the CE records and cluster records
PBS:
post: The Cluster name is defined to be the gatekeeper host. ClusterArg, if specified, is used as the hostname of the PBS server to query. The server name as returned by the server is used to set Server for subsequent PBS commands. The LrmsVersion is the pbs_version info taken from the 'qstat -B -f' shell command, or "-" if the info could not be found. The ServerParam is specified as '@' plus the name of the Server: $ServerParam = "@$Server". [Max|Default][CPU|Wall]TimeServer is the value in seconds of resources_[max|default].[cput|walltime] from the 'qstat -B -f $Server' command, or "-" if the value is missing.
The @AllQueues array contains the queue names from 'qstat -Q -f $ServerParam'.
aborts: no queues could be found on the cluster, no node information found
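A simplified Perl sketch of reading these server attributes from the 'qstat -B -f' output (variable names and the seconds conversion are illustrative, not the provider's actual code):

    use strict;
    use warnings;

    my $LrmsVersion      = '-';
    my $MaxCPUTimeServer = '-';

    # Read the server description; '-' stays in place if a value is missing.
    open(my $qstat, '-|', 'qstat', '-B', '-f') or die "cannot run qstat: $!";
    while (my $line = <$qstat>) {
        if ($line =~ /pbs_version\s*=\s*(\S+)/) {
            $LrmsVersion = $1;
        }
        elsif ($line =~ /resources_max\.cput\s*=\s*(\d+):(\d+):(\d+)/) {
            $MaxCPUTimeServer = $1 * 3600 + $2 * 60 + $3;   # hh:mm:ss -> seconds
        }
    }
    close $qstat;
    print "LrmsVersion=$LrmsVersion MaxCPUTimeServer=$MaxCPUTimeServer\n";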
LSF:
post: The Cluster name is defined to be the gatekeeper host. The LrmsVersion value is taken from the lsid shell command.
Node and running-job information is gathered here in the hashes %Jobs and %Nodes, since the bjobs and bhosts/lshosts shell calls are expensive at run time. Nodes and jobs hold references to each other (see Readme.Implementationdetails), which enables us to view a queue as a set of nodes.
@AllQueues contains all queues found by bqueues
aborts: no queues could be found on the cluster, no node information found
CONDOR:
post: The Cluster name is defined to be the gatekeeper host. The LrmsVersion value is taken from the condor_version shell command. @AllQueues contains the collector names (CEIdNames) of all Condor hosts specified with 'condor_config_val -pool CONDOR_HOST -name CONDOR_HOST COLLECTOR_NAME'. The %CondorHosts hash maps collector name (CE name) -> CONDOR_HOST.
aborts: no pools could be found
Here we perform all config file reading.
We return a provisional CEID which stays the same for each queue in the further course of the program. The string is gatekeeper_host:gatekeeper_port/jobmanager-lrms.
If we have to determine the gatekeeper host and port and a Globus config file is specified on the command line, we read out the Globus variables; otherwise we try to read the values from the static config file, using the first CE with the current LRMS.
aborts: GlobusGatekeeperHost or GlobusGatekeeperPort is undefined.
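A minimal sketch of how the provisional CEID string is assembled, with example values for the hypothetical variables:

    use strict;
    use warnings;

    # Values determined from the Globus config file or the static config file.
    my $GlobusGatekeeperHost = 'ce.example.org';   # example value
    my $GlobusGatekeeperPort = 2119;               # example value
    my $Lrms                 = 'pbs';

    die "GlobusGatekeeperHost or GlobusGatekeeperPort is undefined\n"
        unless defined $GlobusGatekeeperHost && defined $GlobusGatekeeperPort;

    my $CEId = "$GlobusGatekeeperHost:$GlobusGatekeeperPort/jobmanager-$Lrms";
    print "$CEId\n";   # prints: ce.example.org:2119/jobmanager-pbs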
Read a list of user records, which will be added to the GlueCEAccessControlBaseRules of all found CE records.
We can redefine or add new CE/subcluster/cluster/filesystem information by adding static config file information to the dynamically created CE records.
Read out the static config file by fetching either Computing Element, cluster, subCluster or remoteFileSystem records. First, check the line for an object identifier. Second, scan the dn string for sub keys and store their values. Third, check the found key values and figure out what kind of record the dn points to. Fourth, now that we are context aware, read the attribute/value pairs and store them in the hash identified for that record type, at the location given by the full dn string, with the attribute name as the hash key and the attribute value as its value.
aborts: Static config file could not be opened.
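A simplified Perl sketch of this parsing scheme, assuming an LDIF-like static file with 'dn:' lines followed by 'attribute: value' lines (the real file format and hash names may differ):

    use strict;
    use warnings;

    my %StaticRecords;     # dn string -> { attribute => value }
    my $current_dn;

    my $file = shift @ARGV or die "usage: $0 <static-config-file>\n";
    open(my $fh, '<', $file) or die "Static config file could not be opened: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(#|$)/;               # skip comments and blank lines
        if ($line =~ /^dn:\s*(.+)$/i) {             # object identifier line
            $current_dn = $1;                       # context for following attributes
        }
        elsif (defined $current_dn && $line =~ /^([\w-]+):\s*(.*)$/) {
            $StaticRecords{$current_dn}{$1} = $2;   # attribute/value pair under this dn
        }
    }
    close $fh;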
This procedure enables us to include subcluster and filesystem information from a remote host config file that is periodically updated by the admin.
DN-STRING construction for subcluster and filesystem records: Read out the remote host config file specified by the first parameter. There are two modes: if we allow only one subcluster (WP1 people can operate with only one), we create a subcluster record and set the subcluster unique id rdn to the cluster's unique id. Otherwise, if we allow more than one subcluster, we process an additional hostname parameter from the file, which represents the subcluster unique id. We create the filesystem dn by adding the filesystem name rdn to the subcluster's dn.
Alternatively a dynamic mechanism is provided: given the dn string of a subcluster node, the procedure will cause ssh to hop to the node specified by the GlueSubClusterUniqueID. Information is read from the stdout of the subcluster info script. NOTE: The effective subcluster id depends on whether WP1 mode is used; in WP1 mode the worker node must be set with the '-sshworkernode' option. Assigning attributes: The attributes GlueSubClusterUniqueID and GlueSubClusterName are set after we have found the subcluster's unique id rdn. Then we fetch CPU, memory and filesystem info (see README) and set the attributes.
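A sketch of the dn construction in both modes; the variable names and rdn attribute names are illustrative:

    use strict;
    use warnings;

    my $Wp1SubClusterMode = 1;                      # only one subcluster allowed
    my $ClusterUniqueID   = 'ce.example.org';       # example value
    my $HostnameFromFile  = 'wn01.example.org';     # extra hostname parameter (multi-subcluster mode)
    my $FileSystemName    = 'home';                 # example value

    # In WP1 mode the subcluster unique id rdn equals the cluster unique id,
    # otherwise it is the hostname read from the remote host config file.
    my $SubClusterUniqueID = $Wp1SubClusterMode ? $ClusterUniqueID : $HostnameFromFile;

    my $SubClusterDN = "GlueSubClusterUniqueID=$SubClusterUniqueID,"
                     . "GlueClusterUniqueID=$ClusterUniqueID";
    # The filesystem dn is the filesystem name rdn prepended to the subcluster dn.
    my $FileSystemDN = "GlueHostRemoteFileSystemName=$FileSystemName,$SubClusterDN";
    print "$SubClusterDN\n$FileSystemDN\n";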
We add dynamic information either from cluster batch system commands or from static information gathered before.
see subprocedures:
staticCEDataToQueue,
glueCEToQueue,
glueCEInfoToQueue,
glueCEPolicyToQueue,
glueCEStateToQueue
Resolve the static config file CE records' dn string regular expressions and integrate the content into the matching actual CE records.
Append the CE-SE bindings specified by the command-line tuples/triples, which are stored in the CESEBinds array.
For each CE, add all job records whose queue equals the CEName.
For each CE, add all node records that have a QUEUE flag for the CEName.
For each CE add the GlueCE attributes to the CE.
For all CEs, add the hosting cluster's cluster id to the queue's GlueForeignKeys.
For each CE add the GlueCEInfo attributes to the CE.
LSF:
Count the CPUs of each of the queue's nodes.
For each CE add the GlueCEPolicy attributes to the CE.
PBS:
Process 'qstat -Q -f queue@server' for policy information.
Get GlueCEPolicyPriority from the line 'Priority = aPriority'.
Get GlueCEPolicyMaxTotalJobs from the line 'max_queuable = aMaxQueuable'.
Get GlueCEPolicyMaxRunningJobs from the line 'max_running = aMaxRunning'.
Get GlueCEMaxCPUTime and GlueCEMaxWallClockTime from the lines 'resources_max.cput = aCPUTime' and 'resources_max.walltime = aWallClockTime'.
(For the full description and fallbacks see the README.)
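A simplified Perl sketch of extracting these policy values from the 'qstat -Q -f' output (hash keys and the seconds conversion are illustrative, not the provider's actual code):

    use strict;
    use warnings;

    my $queue = shift @ARGV or die "usage: $0 <queue\@server>\n";
    my %Policy;

    open(my $qstat, '-|', 'qstat', '-Q', '-f', $queue) or die "cannot run qstat: $!";
    while (my $line = <$qstat>) {
        $Policy{GlueCEPolicyPriority}       = $1 if $line =~ /Priority\s*=\s*(-?\d+)/;
        $Policy{GlueCEPolicyMaxTotalJobs}   = $1 if $line =~ /max_queuable\s*=\s*(\d+)/;
        $Policy{GlueCEPolicyMaxRunningJobs} = $1 if $line =~ /max_running\s*=\s*(\d+)/;
        if ($line =~ /resources_max\.cput\s*=\s*(\d+):(\d+):(\d+)/) {
            $Policy{GlueCEMaxCPUTime} = $1 * 3600 + $2 * 60 + $3;
        }
        if ($line =~ /resources_max\.walltime\s*=\s*(\d+):(\d+):(\d+)/) {
            $Policy{GlueCEMaxWallClockTime} = $1 * 3600 + $2 * 60 + $3;
        }
    }
    close $qstat;
    print "$_ = $Policy{$_}\n" for sort keys %Policy;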
LSF:
Process 'bqueues ClusterParam -l' for policy information.
Get GlueCEMaxCPUTime and GlueCEMaxWallClockTime from the lines following 'CPULIMIT' and 'RUNLIMIT',
and GlueCEPolicyPriority and GlueCEPolicyMaxRunningJobs from the line following 'PRIO NICE STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SSUSP USUSP RSV'.
We also set GlueCEPolicyMaxTotalJobs to GlueCEPolicyMaxRunningJobs. (For the full description and fallbacks see the README.)
CONDOR:
Process condor_status and condor_config_val for policy information.
Get GlueCEPolicyMax[CPU|Wall]Time from the command-line option or the default value.
Get GlueCEPolicyMaxRunningJobs by counting the machines in the condor_status output.
Get GlueCEPolicyMaxTotalJobs as the product of CondorMaxTotalJobsFactor and GlueCEPolicyMaxRunningJobs (if defined, CondorMaxTotalJobsFactor is dynamic: 'condor_config_val MAX_JOBS_RUNNING').
Get GlueCEPolicyPriority from the $PriorityUndefined variable.
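A minimal sketch of the Condor policy arithmetic, with one simple way of counting machines (the real provider may obtain these values differently):

    use strict;
    use warnings;

    # Count the machine ClassAds reported by 'condor_status -long'
    # (one ad per slot; the provider's actual counting may differ).
    my $GlueCEPolicyMaxRunningJobs =
        grep { /^Machine\s*=/ } qx(condor_status -long);

    # Hypothetical factor; the provider may take the dynamic value
    # from 'condor_config_val MAX_JOBS_RUNNING' instead.
    my $CondorMaxTotalJobsFactor = 2;
    my $GlueCEPolicyMaxTotalJobs =
        $CondorMaxTotalJobsFactor * $GlueCEPolicyMaxRunningJobs;

    print "MaxRunningJobs=$GlueCEPolicyMaxRunningJobs ",
          "MaxTotalJobs=$GlueCEPolicyMaxTotalJobs\n";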
For each CE add the GlueCEState attributes to the CE.
see subprocedures: checkImmediateJobStart, checkIllegalValues
PBS:
Process qstat -Q -f queue@server command
GlueCEStateFreeCPUs is the sum of all nodes' free CPUs.
Get GlueCEStateRunningJobs from the line 'state_count = ... Running:num_running Exiting:num_exiting'.
Get GlueCEStateTotalJobs from the line 'total_jobs = num_total_jobs'.
GlueCEStateWaitingJobs is the difference between total and running jobs.
For GlueCEStateWorstResponseTime and GlueCEStateEstimatedResponseTime see the README.
Get GlueCEStateStatus by processing the lines 'enabled = is_enabled' and 'started = is_started'.
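A simplified Perl sketch of this state parsing; summing the Running and Exiting counts is one reading of the description above, and the variable names are illustrative:

    use strict;
    use warnings;

    my $queue = shift @ARGV or die "usage: $0 <queue\@server>\n";
    my ($Running, $Total) = (0, 0);

    open(my $qstat, '-|', 'qstat', '-Q', '-f', $queue) or die "cannot run qstat: $!";
    while (my $line = <$qstat>) {
        if ($line =~ /state_count\s*=.*Running:(\d+).*Exiting:(\d+)/) {
            $Running = $1 + $2;            # Running + Exiting jobs occupy CPUs now
        }
        elsif ($line =~ /total_jobs\s*=\s*(\d+)/) {
            $Total = $1;
        }
    }
    close $qstat;

    my $Waiting = $Total - $Running;       # GlueCEStateWaitingJobs
    print "Running=$Running Total=$Total Waiting=$Waiting\n";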
LSF:
Process bqueues -l -m cluster
GlueCEStateFreeCPUs is the sum of all nodes' free CPUs.
Get GlueCEStateRunningJobs, GlueCEStateStatus and GlueCEStateTotalJobs from the line following 'PRIO NICE STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SSUSP USUSP RSV'.
GlueCEStateWaitingJobs is the difference between total and running jobs.
For GlueCEStateWorstResponseTime and GlueCEStateEstimatedResponseTime see the README.
Get GlueCEStateStatus by processing the lines 'enabled = is_enabled' and 'started = is_started'.
see: setLSFCPUPower, setLsfRemainingResponse
CONDOR:
Process condor_q -global -pool $Server
Get GlueCEStateFreeCPUs from the command 'condor_status -avail -long' by counting CPUs.
Get GlueCEStateRunningJobs, GlueCEStateTotalJobs and GlueCEStateWaitingJobs from the command 'condor_q -global', looking for the regexp 'numJobs jobs; numIdle idle, numRunning running'.
GlueCEStateStatus is defined as "Production".
GlueCEStateWorstResponseTime and GlueCEStateEstimatedResponseTime are calculated in the procedure computeResponseTimes with a static approximation.
see: computeResponseTimes
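A minimal Perl sketch of the condor_q summary parsing described above (the exact wording of the summary line varies between Condor versions):

    use strict;
    use warnings;

    my ($Total, $Waiting, $Running) = (0, 0, 0);

    # condor_q prints a summary line per schedd such as
    #   "42 jobs; 10 idle, 30 running, 2 held"
    open(my $cq, '-|', 'condor_q', '-global') or die "cannot run condor_q: $!";
    while (my $line = <$cq>) {
        if ($line =~ /(\d+)\s+jobs;.*?(\d+)\s+idle,\s*(\d+)\s+running/) {
            $Total   += $1;    # GlueCEStateTotalJobs
            $Waiting += $2;    # GlueCEStateWaitingJobs
            $Running += $3;    # GlueCEStateRunningJobs
        }
    }
    close $cq;
    print "Total=$Total Waiting=$Waiting Running=$Running\n";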
checkImmediateJobStart(dn)
All LRMSs check whether they can immediately run a job. If we have free CPUs and the queue policy allows more jobs to run, we can consider an immediate job start (i.e. we set the response times to 0). Before allowing an immediate job start we also require that the fraction of waiting jobs in the system is less than or equal to IdleJobThreshold.
see: glueCEStateToQueue
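A minimal sketch of this check, reading 'fraction of waiting jobs' as waiting/total (hypothetical values, not the provider's actual code):

    use strict;
    use warnings;

    # Hypothetical per-CE values; IdleJobThreshold is the tolerated fraction
    # of waiting jobs for an immediate start.
    my ($FreeCPUs, $RunningJobs, $MaxRunningJobs) = (3, 5, 8);
    my ($WaitingJobs, $TotalJobs)                 = (0, 5);
    my $IdleJobThreshold                          = 0.1;

    my ($EstimatedResponseTime, $WorstResponseTime);
    my $waiting_fraction = $TotalJobs ? $WaitingJobs / $TotalJobs : 0;

    if ($FreeCPUs > 0
        && $RunningJobs < $MaxRunningJobs
        && $waiting_fraction <= $IdleJobThreshold) {
        # immediate job start: response times are set to 0
        ($EstimatedResponseTime, $WorstResponseTime) = (0, 0);
    }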
checkIllegalValues(dn)
All LRMSs check whether the number of running jobs is greater than max_running jobs and whether the total number of jobs is greater than max_total jobs. In that case the max value is increased to match the actual value of the CE specified by the parameter dn.
We compute a value for the CPUPOWER attribute for every queue. We traverse the queue's nodes and accumulate the CPU power of the nodes. Each queue can use a 1/number_hosted_queues share of a CPU. If all nodes are used exclusively by a queue then CPUPOWER = TotalCpus.
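A minimal sketch of this CPUPOWER accumulation with a hypothetical node list for one queue:

    use strict;
    use warnings;

    # Hypothetical node list: CPU count per node and the number of queues
    # hosted on that node.
    my @QueueNodes = (
        { cpus => 4, hosted_queues => 1 },   # exclusive to this queue
        { cpus => 8, hosted_queues => 2 },   # shared with one other queue
    );

    # Each queue may use a 1/number_hosted_queues share of every CPU.
    my $CpuPower = 0;
    $CpuPower += $_->{cpus} / $_->{hosted_queues} for @QueueNodes;
    print "CPUPOWER: $CpuPower\n";           # prints: CPUPOWER: 8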
setLsfRemainingResponse(dn)
We compute the remaining wall clock time for each queue.
computeResponseTimes(dn)
This general formula is used if no job info is available to compute the estimated and worst response times. We add the GlueCEStateWorstResponseTime and GlueCEStateEstimatedResponseTime attributes to the CE specified by the parameter dn.
This procedure is used by the RMS and determines the suggested values for execution queue attributes.
Add the WP4 RMS state information (submission/execution/normal) of a queue using the RMSSTATE flag. Shared attributes of RMS submission queues are recalculated, since they depend on the execution queues' values.
calculateSubmissionQueue(dn)
Set the running jobs of the CE specified by the parameter dn to the execution queue's running jobs. Level the maximums for running and total jobs. Decrease the wall clock time of a submission queue to the execution queue's value.
Activate all queues set in '-queue' for printing using PRINTABLEQUEUE flag.
We allow multiple sources of record definitions: the admin may create multiple cluster / subCluster / remoteFileSystem records, each with different regular expressions. Therefore we have to unify these records.
see: unifyClusterData, addCEs, addSubclusters, unifySubClusterData, unifyFileSystemData
This procedure resolves the regular expressions in %Clusters hash and copies entries to %UnifiedCluster.
Add the hosted CEs to the cluster.
For all CEs, get subcluster information if a subcluster node specified in the static config file matches one of the CE's nodes. For each CE determine at least one SubCluster.
This procedure resolves the regular expressions in %SubCluster hash and copies entries to %UnifiedSubCluster.
This procedure resolves the regular expressions in %FileSystems hash and copies entries to %UnifiedFileSystem.
Each record type has got its own print procedure.
see: printCeInformation, printClusters, printSubclusters, printFileSystems, printHostInfo
This procedure traverses the hash of all CEs and calls the printCE procedure if the LRMS, the RMS state information and the mds-vo-name of the CE apply to the current environment. If we use WP4 RMS queues, the execution queue is only printed if set in '-queue'.
This procedure prints the CE information. We print all data for the CE specified by the parameter $ce.
1. print the dn
2. print the objectclasses: cetop, ce, schemaversion, and all other classes using the tags $AllCE{$ce}{OBJECTCLASS_class}
3. print the attributes: schemaversion first, then the remaining attributes in sorted order
4. print timestamps
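A minimal Perl sketch of this printing order; the %AllCE layout and attribute names used here are illustrative, not the provider's actual data structures:

    use strict;
    use warnings;

    sub printCE {
        my ($ce, $AllCE) = @_;
        print "dn: $ce\n";                                               # 1. the dn
        print "objectclass: $_\n" for @{ $AllCE->{$ce}{OBJECTCLASSES} }; # 2. the objectclasses
        for my $attr (sort keys %{ $AllCE->{$ce}{ATTRIBUTES} }) {        # 3. attributes, sorted
            print "$attr: $AllCE->{$ce}{ATTRIBUTES}{$attr}\n";
        }
        # 4. timestamps would be printed here
        print "\n";                                                      # blank line ends the record
    }

    # Example usage with a tiny hand-built record:
    my %AllCE = (
        'GlueCEUniqueID=ce.example.org:2119/jobmanager-pbs' => {
            OBJECTCLASSES => [ 'GlueCETop', 'GlueCE' ],
            ATTRIBUTES    => { GlueCEName => 'short', GlueCEInfoLRMSType => 'pbs' },
        },
    );
    printCE($_, \%AllCE) for keys %AllCE;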
If the information provider is configured to print cluster records we append the records matching the CE's GlueCEHostingClusters.
Note: prints only the first cluster if $Wp1SubClusterMode is set.
1. print the dn
2. print the objectclasses: clustertop, cluster, schemaversion, and all other classes using the tags $UnifiedCluster{$cl}{OBJECTCLASS_class}
3. print the attributes: schemaversion first, then the remaining attributes in sorted order
4. print timestamps
If the information provider is configured to print cluster records we append the subCluster records from the hash %UnifiedSubCluster that match the CE's GlueCEHostingClusters.
Note: prints only the first cluster if $Wp1SubClusterMode is set.
1. print the dn
2. print the objectclasses: clustertop, subcluster, schemaversion, and all other classes using the tags $UnifiedSubCluster{$scl}{OBJECTCLASS_class}
3. print the attributes: schemaversion first, then the remaining attributes in sorted order
4. print timestamps
Print all fileSystems from hash %UnifiedFileSystem.
1. print the dn
2. print the objectclasses: clustertop, hostremotefilesystem, schemaversion
3. print the attributes: schemaversion first, then the remaining attributes in sorted order, and all other classes using the tags $UnifiedFileSystem{$fileSys}{OBJECTCLASS_class}
4. print timestamps
Print host information if HostInfoBinArg is specified on the command line.
Print CE SE binding record.