Abstract
This document contains the information needed to configure the gLite Data Transfer Agents. For additional information on how to install the module, please refer to the gLite Data Transfer Agents Module Install Guide.
This package provides the agents that perform the actions concerning data transfers. There are two kinds of agent: the Channel Agent and the VO Agent.
The Channel Agent is responsible for managing all the file transfers through a channel, i.e. the entity that represents the physical, unidirectional link between two sites: this agent fetches the file transfer requests from a Queue and submits them to the configured TransferService.
The other type of agent, the VO Agent, is in charge of performing all the actions that are specific to a Virtual Organization (VO), which involves applying VO policies and interacting with a Catalog. If a VO requires Catalog interaction, it has to configure a plugin to talk to a Catalog Service, in order to retrieve the source and destination SURLs from the logical file names (LFNs and GUIDs) and the source and destination sites. Once a transfer is completed, the new replicas are also registered in the appropriate catalog.
One Channel Agent is needed for each channel available on the site, and one VO Agent is needed for each VO that wants to perform data transfers. All of these agents share the same Queue, but the FTA framework guarantees that they interact with each other in the proper way: a VO Agent is allowed to see only the jobs (and related information) that belong to its own VO, and in the same way a Channel Agent is not able to process requests belonging to other channels. You can imagine that each agent acts on a view of the entire Queue:
                 /---------------\
                 |     Queue     |
                 |---------------|
 VO_1 Agent ===> | Vo_1 View     |
                 |     Ch_1 View | <=== Channel_1 Agent
 VO_2 Agent ===> | Vo_2 View     |
                 |     Ch_2 View | <=== Channel_2 Agent
 VO_3 Agent ===> | Vo_3 View     |
                 \---------------/
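The idea of per-agent views can be illustrated with a small sketch. It assumes an in-memory list in place of the real Queue (which is a database reached through the DAO connector), and the `Job` class and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical queue entry; the field names are illustrative only."""
    job_id: str
    vo: str
    channel: str


def vo_view(queue, vo_name):
    """Return only the jobs that belong to the given VO."""
    return [job for job in queue if job.vo == vo_name]


def channel_view(queue, channel_name):
    """Return only the jobs assigned to the given channel."""
    return [job for job in queue if job.channel == channel_name]


# One shared queue, seen through two different views.
queue = [
    Job("j1", "VO_1", "Ch_1"),
    Job("j2", "VO_2", "Ch_1"),
    Job("j3", "VO_1", "Ch_2"),
]
```

Here the VO_1 agent would see jobs j1 and j3, while the Ch_1 agent would see j1 and j2, even though both read the same Queue.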
The Channel Agent and the VO Agent work in the same way: they periodically run a set of actions in order to perform the steps required to transfer data.
The Channel Agent actions are:
Submit new file transfer requests to the TransferService
Check the state of all the active File transfer requests and update the Queue with the retrieved information
Revoke active file transfers marked as canceling on the Queue
The VO Agent actions are:
Allocate a transfer job to a channel based on the source and destination of the related files. If no SURLs are provided, it resolves the logical names by contacting a Catalog service
Check if the source files are ready to be transferred, i.e. already staged in the SE cache
Perform the finalization of the transfer, registering, if required, the new replicas into a Catalog service
Reschedule failed transfers that are in waiting state
Revoke pending (i.e. not yet processed by the Channel Agent) file transfers marked as canceling on the Queue
The gLite Data Transfer Agents module provides a default action for each of these types, but the behavior of an agent can also be extended by configuring different actions.
The gLite Data Transfer Agents use the gLite Data Config Service modules to retrieve initialization and configuration parameters. For a detailed explanation of how that module works, please have a look at the gLite Data Config Service User's Guide.
The gLite Channel Agent requires the configuration files named glite-transfer-channel-agent-<CHANNEL_NAME>.properties.xml and glite-transfer-channel-agent-<CHANNEL_NAME>.log-properties in $GLITE_LOCATION_USER/etc, $GLITE_LOCATION/etc or $GLITE_LOCATION_VAR/etc, where <CHANNEL_NAME> is the name of the Channel which the agent is responsible for. The structure of the configuration file depends on the type of Queue and Transfer Service the Channel Agent has to contact:
<service name="transfer-channel-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration -->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration -->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-ts-****">
      <!-- Transfer Service Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration -->
    </component>
    <component name="transfer-agent-channel-actions">
      <!-- Channel Actions Configuration -->
    </component>
    <component name="transfer-channel-agent">
      <!-- Channel Agent Configuration -->
    </component>
  </components>
</service>
where the sections marked with **** depend on the connector that has to be used.
The gLite VO Agent requires the configuration files named glite-transfer-vo-agent-<VO_NAME>.properties.xml and glite-transfer-vo-agent-<VO_NAME>.log-properties in $GLITE_LOCATION_USER/etc, $GLITE_LOCATION/etc or $GLITE_LOCATION_VAR/etc, where <VO_NAME> is the name of the Virtual Organization which the Agent belongs to. The structure of the configuration file also depends on the chosen VO Agent deployment type (with or without catalog interaction) and on the type of the Queue the Agent has to contact. Please note that a VO can install only one VO Agent. The two structures below correspond to a deployment without and with catalog interaction, respectively:
<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration -->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration -->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration -->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>
<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration -->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration -->
    </component>
    <!-- Optional NameGeneration Plugin:
    <component name="transfer-agent-namegen-****">
    </component>
    -->
    <component name="transfer-agent-catalog-****">
      <!-- Catalog Service client Plugin -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration -->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration -->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>
where the sections marked with **** depend on the connector/plug-in that has to be used.
Currently, the following Connectors are implemented:
Transfer Service:
It can be configured to use third party gsiftp transfers (type is "urlcopy") or to issue SRM copy requests (type is "srmcopy")
Data Access Object:
Oracle -- transfer-agent-dao-oracle
The requirement is that the configuration should contain one connector per type (transfer-agent-ts-* and transfer-agent-dao-*).
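The "one connector per type" requirement can be checked mechanically. The following is a minimal sketch for a Channel Agent configuration, not part of the gLite tooling: the function names are hypothetical, and the component names `transfer-agent-ts-urlcopy` and `agents-dao-oracle` in the sample are assumed from the naming pattern rather than confirmed by this guide.

```python
import xml.etree.ElementTree as ET

# Connector prefixes named in the requirement above.
PREFIXES = ("transfer-agent-ts-", "transfer-agent-dao-")


def count_connectors(config_xml):
    """Count the components whose name starts with each connector prefix."""
    root = ET.fromstring(config_xml)
    counts = {prefix: 0 for prefix in PREFIXES}
    for component in root.iter("component"):
        name = component.get("name", "")
        for prefix in PREFIXES:
            if name.startswith(prefix):
                counts[prefix] += 1
    return counts


def has_one_connector_per_type(config_xml):
    """True when exactly one connector of each type is configured."""
    return all(count == 1 for count in count_connectors(config_xml).values())


# Sample configuration; component names other than
# transfer-agent-dao-oracle are assumptions.
SAMPLE = """
<service name="transfer-channel-agent">
  <components>
    <component name="agents-dao-oracle"/>
    <component name="transfer-agent-dao-oracle"/>
    <component name="transfer-agent-ts-urlcopy"/>
  </components>
</service>
"""
```

Note that a VO Agent configuration has no Transfer Service connector, so this check applies as written only to the Channel Agent.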
For the Catalog Client, the following plugins are currently provided:
gLite Fireman -- transfer-agent-catalog-fireman (provided by the module org.glite.data.transfer-agent-fireman)
PythonCatalog -- transfer-agent-catalog-python
Please note that a Catalog Client is required only if the VO Agent has to accept transfer job requests with logical names. Please also note that the PythonCatalog plugin does not itself interact with a Catalog Service: it is intended to act as a "bridge" that allows VO-specific Catalog Service plugins to be developed as Python scripts.
Finally, if Catalog interaction is required, it is possible to specify the strategy used to generate the name of the destination file from the logical name. This plugin is optional: if no component is provided, the FTA uses a default implementation that concatenates the SRM endpoint, the Storage Element path and the logical name. The available plugins are:
*Default* (used if no plugin is provided)
PythonNameGen -- transfer-agent-namegen-python
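As an illustration of the default naming strategy, the sketch below joins the three parts named above. The function name, the example values and the way slashes are normalized are assumptions; the exact output format of the real default implementation is not specified in this guide.

```python
def default_destination_name(srm_endpoint, se_path, logical_name):
    """Concatenate SRM endpoint, Storage Element path and logical name.

    Slash handling is an assumption made for this sketch: duplicate
    separators between the parts are collapsed into a single "/".
    """
    parts = [
        srm_endpoint.rstrip("/"),
        se_path.strip("/"),
        logical_name.lstrip("/"),
    ]
    return "/".join(part for part in parts if part)


# Hypothetical values, for illustration only.
example = default_destination_name(
    "srm://se.example.org:8443", "/data/vo1/", "/grid/vo1/file.dat"
)
```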
The following section describes the configuration parameters for the gLite Transfer Agents. The configuration of the Connectors/Plug-ins is reported in a dedicated section. Since most of the modules are common to the Channel and VO Agents, deployment-specific details are discussed in the description notes. For the structure of the configuration file for a given agent, please check the previous section.
Represents the module that should be used to retrieve service information from ServiceDiscovery
libglite_data_agents_common_sd.so
The type used to register SRM Services on ServiceDiscovery
Type: string; Default value: 'SRM'
The type used to register GRIDFTP Servers on ServiceDiscovery
Type: string; Default value: 'gsiftp'
The type used to register MyProxy Catalog Services on ServiceDiscovery
Type: string; Default value: 'MyProxy'
The type used to register FileTransfer Services on ServiceDiscovery
Type: string; Default value: 'org.glite.FileTransfer'
The type used to register FileTransferStat Services on ServiceDiscovery
Type: string; Default value: 'org.glite.FileTransferStats'
The type used to register ChannelManagement Services on ServiceDiscovery
Type: string; Default value: 'org.glite.ChannelManagement'
The type used to register Channel Agents on ServiceDiscovery
Type: string; Default value: 'org.glite.ChannelAgent'
The type used to register BDII Services on ServiceDiscovery
Type: string; Default value: 'BDII'
The type used to register VOMS Services on ServiceDiscovery
Type: string; Default value: 'org.glite.voms'
The type used to register SEIndex Services on ServiceDiscovery
Type: string; Default value: 'org.glite.SEIndex'
The type used to register Fireman Services on ServiceDiscovery
Type: string; Default value: 'org.glite.Fireman'
The name of the SRM Service Property that provides the path to be used. This is computed starting from the Glue SAPath property
Type: string; Default value: 'SEMountPoint'
The name of the ChannelAgent Service Property that provides the channel source site.
Type: string; Default value: 'source'
The name of the ChannelAgent Service Property that provides the channel destination site.
Type: string; Default value: 'destination'
The name of the MyProxy Server Property that provides the Fts Mode.
Type: string; Default value: 'FtsMode'
The value of the MyProxy Server FtsMode Property for retrievers.
Type: string; Default value: 'retriever'
The value of the MyProxy Server FtsMode Property for renewers.
Type: string; Default value: 'renewer'
Enable caching of ServiceDiscovery information
Type: boolean; Default value: 'true'
The validity, in seconds, of the entries created in the cache. When this time has elapsed, the entry is refreshed automatically by the cache the next time the user asks for it. Entries related to queries that returned no result are not refreshed. If this parameter is not provided, the internal default applies (1 hour)
Type: integer; Default value: ''
The reduced validity time for stale entries in the cache. When the validity of an entry expires, the cache tries to refresh it using ServiceDiscovery. In case of failure, the entry is considered stale and kept in the cache with a shorter validity. If this parameter is not provided, the internal default applies (15 minutes)
Type: integer; Default value: ''
The number of seconds that should elapse before the entries in the cache are considered obsolete and can be purged. Since entries are periodically refreshed based on usage, the purge operation affects only stale entries. If this parameter is not provided, the internal default applies (5 hours)
Type: integer; Default value: ''
The number of seconds that should elapse before the entries related to queries that didn't return any result are considered obsolete. These "negative" results are not automatically refreshed and should therefore be cleaned more often. If this parameter is not provided, the internal default applies (5 minutes)
Type: integer; Default value: ''
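The interplay of the four cache timing parameters can be summarized in a short sketch. It uses the internal defaults quoted above; the `CacheEntry` class, its fields and the way timestamps are tracked are assumptions made for illustration and do not describe the actual cache implementation.

```python
from dataclasses import dataclass

# Internal defaults quoted in the parameter descriptions, in seconds.
VALIDITY = 3600                # 1 hour
STALE_VALIDITY = 900           # 15 minutes
OBSOLETE_AFTER = 18000         # 5 hours
NEGATIVE_OBSOLETE_AFTER = 300  # 5 minutes


@dataclass
class CacheEntry:
    """Hypothetical cache entry; timestamps are seconds since some epoch."""
    last_attempt: float   # last refresh attempt, successful or not
    last_success: float   # last lookup that reached ServiceDiscovery
    has_result: bool      # False for "negative" entries (no result found)
    stale: bool = False   # True after a failed refresh


def needs_refresh(entry, now):
    """Positive entries are refreshed on access once their validity expires."""
    if not entry.has_result:
        return False  # negative entries are never refreshed
    limit = STALE_VALIDITY if entry.stale else VALIDITY
    return now - entry.last_attempt >= limit


def can_be_purged(entry, now):
    """Negative entries expire quickly; positive ones only when stale and old."""
    age = now - entry.last_success
    if not entry.has_result:
        return age >= NEGATIVE_OBSOLETE_AFTER
    return entry.stale and age >= OBSOLETE_AFTER
```

Under these assumptions a healthy entry is refreshed hourly and never purged, a stale entry is retried every 15 minutes until it is 5 hours old, and a negative entry simply disappears after 5 minutes.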
Represents the module that provides a client to the MyProxy Server for retrieving the proxy certificates to be used for the File Transfers
libglite_data_agents_common_cred_myproxy.so
The host name of the MyProxy Server. If this parameter is not set or is empty, the MyProxy Server is looked up using the Service Discovery and then, if not found, the MyProxy default applies (MYPROXY_SERVER environment variable)
Type: string; Default value: ''
The port of the MyProxy Server. If this parameter is not set or is 0, the MyProxy default applies
Type: integer; Default value: '0'
The lifetime in seconds of the proxy certificates that will be created
Type: integer; Default value: '86400'
The location where the certificates retrieved from the MyProxy Service will be stored. That location must already exist
Type: string; Default value: '/tmp'
The minimum validity time (in seconds) an existing certificate should have before a new job is submitted. If the certificate cannot satisfy this requirement, a new certificate is retrieved from the MyProxy Service
Type: integer; Default value: '3600'
<No configuration parameters>
Represents the library that provides the logic for the File and Job State transitions
libglite_data_transfer_agent_fsm.so
When this parameter is set to true, in case of consecutive failures, a transfer will be moved to the "Hold" state, waiting for manual intervention, otherwise it will go in "Failed"
Type: boolean; Default value: 'true'
<No configuration parameters>
This module provides a scheduler class that is able to periodically execute FTA Actions
libglite_data_transfer_agent_scheduler.so
<No initialization parameters>
The number of consecutive failures before an Action is considered disabled for <DisableTime> seconds. If that value is set to zero, actions will never be disabled and the parameter <DisableTime> is ignored
Type: integer; Default value: '0'
The number of seconds an action stays disabled
Type: integer; Default value: '300'
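The effect of these two scheduler parameters can be sketched as follows. The `ActionState` class is a hypothetical helper, and resetting the failure counter when an action is disabled is an assumption; the guide only states the threshold and the disable duration.

```python
class ActionState:
    """Tracks consecutive failures of one scheduled action (hypothetical)."""

    def __init__(self, max_failures=0, disable_time=300):
        self.max_failures = max_failures  # 0 means "never disable"
        self.disable_time = disable_time  # seconds an action stays disabled
        self.failures = 0
        self.disabled_until = 0.0

    def record(self, succeeded, now):
        """Update the failure counter after one run of the action."""
        if succeeded:
            self.failures = 0
            return
        self.failures += 1
        if self.max_failures > 0 and self.failures >= self.max_failures:
            # Threshold reached: disable the action for disable_time seconds.
            self.disabled_until = now + self.disable_time
            self.failures = 0  # assumption: counting restarts afterwards

    def is_enabled(self, now):
        """True when the action may be scheduled at time `now`."""
        return now >= self.disabled_until
```

With the default threshold of 0, `record` never sets `disabled_until`, which matches the statement that the disable time is then ignored.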
Specify the time interval, in seconds, to periodically check if the DAO context is valid. If 0 is specified, the context is checked on every iteration
Type: integer; Default value: '60'
Specify the timeout, in seconds, that a scheduler has to wait for its thread to stop gracefully. If the timeout elapses, a signal is sent in order to try to abort the running action
Type: integer; Default value: '100'
Catalog Client: client for retrieving Replica Information from a Catalog Service. If no catalog interaction is foreseen, this component can be removed from the configuration file. This module is loaded at run-time. Currently, the supported plugins are FiremanCatalog and the Python catalog bridge
For the Fireman plugin, please have a look at the org.glite.data.transfer-agent-fireman module.
The Python catalog bridge details are:
Represents the module that should be used to call a Python script in order to resolve logical file names and register new replicas
libglite_data_transfer_agent_catalog_python.so
The name of the python module that contains the function that would be used in order to check replication permissions, resolve logical file names and register new replicas
Type: string; Default value: ''
This parameter represents an initialization string that is passed to the Python module that contains the catalog plugin, specified by the "CatalogModule" parameter. The format of that string is module dependent
Type: string; Default value: ''
<No configuration parameters>
File Transfer Service Connector: the configuration file for the GLite Transfer Agent should specify one connector for contacting the Transfer Service. This module will be loaded at run-time.
Represents the connector to use the URLCopy as Transfer Service
libglite_data_transfer_agent_ts_urlcopy.so
The type of transfer that should be performed. This value can be either "urlcopy" or "srmcopy": In the first case, the TransferService would perform the SRM interaction and start a GridFTP transfer, in the latter the TransferService would delegate the transfer to the SRM by an SRMCopy call
Type: string; Default value: 'urlcopy'
The maximum number of transfer requests that the Transfer Service can process simultaneously
Type: integer; Default value: '50'
The maximum number of parallel streams that can be used during the transfer. A transfer is performed with the number of streams specified in the channel table if it is lower than "Streams"; otherwise this threshold is used. In any case the value can be overridden by a job using the job's parameter "-p". Currently, this value has no effect if the "TransferType" parameter is set to "srmcopy"
Type: integer; Default value: '10'
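The rule for choosing the number of streams can be written as a one-line selection. This is a sketch with a hypothetical function name; it assumes that a job-level "-p" value is applied without being capped by the threshold, which the description suggests but does not state explicitly.

```python
def effective_streams(channel_streams, streams_limit=10, job_streams=None):
    """Pick the number of parallel streams for one urlcopy transfer.

    channel_streams: value from the channel table
    streams_limit:   the "Streams" threshold (default 10)
    job_streams:     optional per-job "-p" override
    """
    if job_streams is not None:
        return job_streams  # assumption: the job override is not capped
    return min(channel_streams, streams_limit)
```

For example, a channel configured with 20 streams would be limited to 10 by the default threshold, unless the job itself requests a different value.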
The TCP window size that should be used during the transfer. If set to 0, the default is used. This parameter can be overridden by a job using the job's parameter "-tcpbs". Currently, this value has no effect if the "TransferType" parameter is set to "srmcopy"
Type: integer; Default value: '0'
Deprecated. Use "TcpBufferSize" instead
Type: integer; Default value: '0'
The Block Size that should be used during the transfer. If set to 0, the default is used. This parameter can be overridden by a job using the job's parameter "-bs". Currently, this value has no effect if the "TransferType" parameter is set to "srmcopy"
Type: integer; Default value: '0'
The timeout value (in seconds) that should be used to detect that a transfer is hanging. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies. Please note that in case of SrmCopy bulk requests the global transfer timeout is computed by multiplying this value by the size of the request. If this property is not set, it is assumed to be 600 for the UrlCopy transfer type and 0 for the SrmCopy transfer type
Type: integer; Default value: '600'
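The way the transfer timeout is resolved can be sketched as below. The function name and signature are hypothetical; the sketch only encodes the rules stated in the description (per-type default when unset, 0 and -1 as special values, multiplication by the request size for SrmCopy bulk requests).

```python
def global_transfer_timeout(transfer_type, request_size, transfer_timeout=None):
    """Return the timeout in seconds for one request.

    0 means "no timeout" and -1 means "use the glite-url-copy default";
    both are passed through unchanged because they are not durations.
    """
    if transfer_timeout is None:
        # Property not set: the default depends on the transfer type.
        transfer_timeout = 600 if transfer_type == "urlcopy" else 0
    if transfer_timeout <= 0:
        return transfer_timeout
    if transfer_type == "srmcopy":
        # Bulk request: scale the per-file value by the number of files.
        return transfer_timeout * request_size
    return transfer_timeout
```

With a configured value of 120 seconds, an SrmCopy bulk request of 10 files would therefore get a global timeout of 1200 seconds.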
The timeout value (in seconds) that should be used for the SOAP calls to the SRM. This timeout is used both for sending and receiving. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies.
Type: integer; Default value: '-1'
The timeout value (in seconds) that should be used for an SRM Put request. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies. This value is ignored if the "TransferType" parameter is different from "urlcopy"
Type: integer; Default value: '60'
The timeout value (in seconds) that should be used for an SRM Get request. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies. This value is ignored if the "TransferType" parameter is different from "urlcopy"
Type: integer; Default value: '60'
The timeout value (in seconds) that should be used for trying to set the SRM status of the Destination File to Done. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies. This value is ignored if the "TransferType" parameter is different from "urlcopy"
Type: integer; Default value: '60'
The timeout value (in seconds) that should be used for trying to set the SRM status of the Source File to Done. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies. This value is ignored if the "TransferType" parameter is different from "urlcopy"
Type: integer; Default value: '60'
The timeout value (in seconds) between two Transfer Markers that should be used to detect that a transfer is hanging. 0 means no timeout. If the specified value is -1, the glite-url-copy default applies. This value is ignored if the "TransferType" parameter is different from "urlcopy"
Type: integer; Default value: '120'
This parameter is now deprecated. In order to set the timeout for an srmcopy transfer, please use the "TransferTimeout" parameter
Type: integer; Default value: '-1'
The time interval (in seconds) that should be used in order to abort an srm copy call: when the SRM can't be contacted for the time specified by that parameter, the transfer will be aborted. 0 means no timeout. This value is ignored if the "TransferType" parameter is different from "srmcopy"
Type: integer; Default value: '600'
The maximum size for a bulk SrmCopy request. This value is ignored if the "TransferType" parameter is different from "srmcopy"
Type: integer; Default value: '100'
The direction of the srm copy transfer. It can assume the following values: "pull" (the transfer is driven by the destination SRM), "push" (the transfer is driven by the source SRM), "default" (default value of glite-url-copy, usually pull). This value is ignored if the "TransferType" parameter is different from "srmcopy"
Type: string; Default value: 'default'
The Log Level for the Glite URL Copy Transfer. Allowed values are: DEBUG, INFO, WARN, and ERROR
Type: string; Default value: 'DEBUG'
Data Access Object Connector: the configuration file for the GLite Transfer Agent should specify one connector for contacting the Queue. This module will be loaded at run-time.
Represents the base module to Access Data (DAO) using Oracle as Data Source
libglite_data_agents_common_dao_oracle.so
The Oracle ConnectString identifying the DB
Type: string; Default value: 'localhost'
The name of the user that should be used to connect to the DB
Type: string; Default value: ''
The password of the user that should be used to connect to the DB
Type: string; Default value: ''
The size of the statement cache. 0 means that caching is disabled. Note: since some memory leaks have been observed, it is better for the moment to keep the cache disabled
Type: integer; Default value: '0'
<No configuration parameters>
Represents the module to Access Data (DAO) using Oracle as Data Source
libglite_data_transfer_agent_dao_oracle.so
Enable the VO View DAO Factory
Type: string; Default value: 'false'
Enable the Channel View DAO Factory
Type: string; Default value: 'true'
Enable the Credentials View DAO Factory
Type: string; Default value: 'true'
<No configuration parameters>
Represents the module to Access Data (DAO) using Oracle as Data Source
libglite_data_transfer_agent_dao_oracle.so
Enable the VO View DAO Factory
Type: string; Default value: 'true'
Enable the Channel View DAO Factory
Type: string; Default value: 'false'
Enable the Credentials View DAO Factory
Type: string; Default value: 'false'
<No configuration parameters>
Channel Agent: this section will report the Channel Agent-specific modules
Represents the module that provides the Channel Agent Actions. These actions are:
  <Fetch> : Submit job through the channel;
  <CheckState> : Check Active Jobs' State;
  <CancelActive>: Cancel Active Transfers;
libglite_data_transfer_agent_channel_actions.so
The strategy to be used to change the format of the SURLs. Allowed values are:
  compact: SURLs are converted to the format srm://<host>/<sfn>
  compact-with-port: SURLs are converted to the format srm://<host>:<port>/<sfn>
  fully-qualified: SURLs are converted to the format srm://<host>:<port>/<path>?SFN=<sfn>, where <path> is usually retrieved from the SRM endpoint and is usually <srm/managerv1>
  disabled: no conversion is performed. The SURLs are processed as submitted by the users
Type: string; Default value: 'compact'
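The SURL formats listed above can be rendered with a few string templates. This is a sketch only: the function name is hypothetical, the SURL is assumed to be already split into its components, and the templates are applied literally as written in the description.

```python
def format_surl(strategy, host, port, path, sfn, original):
    """Render a SURL according to the selected conversion strategy.

    host, port, path and sfn are the already-parsed parts of the SURL;
    `original` is the SURL exactly as submitted by the user.
    """
    if strategy == "compact":
        return f"srm://{host}/{sfn}"
    if strategy == "compact-with-port":
        return f"srm://{host}:{port}/{sfn}"
    if strategy == "fully-qualified":
        return f"srm://{host}:{port}/{path}?SFN={sfn}"
    if strategy == "disabled":
        return original  # no conversion is performed
    raise ValueError(f"unknown SURL strategy: {strategy}")


# Hypothetical values, for illustration only.
parts = dict(host="se.example.org", port=8443, path="srm/managerv1",
             sfn="data/file.dat", original="srm://se.example.org/data/file.dat")
```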
The default SRM version to be used when looking for an SRM endpoint
Type: string; Default value: '1.1'
The policy that should be applied when choosing the SRM version to use. Allowed values are:
  default: if the user provided a fully-qualified SURL, use the endpoint extracted from it, otherwise use the value specified in "SrmVersion"
  with-space-token: use 2.2 if a space token is provided, otherwise behave as the default policy
  force: force the use of the version specified in "SrmVersion"
Type: string; Default value: 'default'
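The three policies can be expressed as a small decision function. The function name, its arguments and the tagged-tuple return value are assumptions made for this sketch; they are not part of the agent's interface.

```python
def choose_srm(policy, srm_version="1.1", surl_endpoint=None, space_token=None):
    """Decide how the SRM is selected for one transfer.

    Returns ("endpoint", value) when the endpoint extracted from a
    fully-qualified SURL is used, or ("version", value) when an SRM
    endpoint of the given version has to be looked up.
    """
    if policy == "force":
        return ("version", srm_version)
    if policy == "with-space-token" and space_token:
        return ("version", "2.2")
    if policy in ("default", "with-space-token"):
        # Default behaviour: prefer the endpoint given in the SURL.
        if surl_endpoint is not None:
            return ("endpoint", surl_endpoint)
        return ("version", srm_version)
    raise ValueError(f"unknown SRM version policy: {policy}")
```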
Enable Client Credential Delegation. If this value is set to true, transfers are executed using client delegated credentials retrieved from MyProxy; otherwise the agent uses its Service credentials.
Type: boolean; Default value: 'true'
The maximum number of Ready/Active files to cancel at each iteration of the Cancel action.
Type: integer; Default value: '500'
This module represents the agent that runs the actions concerning a Channel. Actions are provided by different libraries that are loaded into the agent process using the gLite Data Config Service. The Channel Agent needs to run three different kinds of action:
  <Fetch> : Transfer files through the channel using the configured TransferService;
  <Check> : Check the status of pending TransferService requests;
  <Cancel>: Cancel the active and pending TransferService requests;
This configuration assigns a default value to each of these types; however, it is possible to provide different action types and use them by specifying the type of the action in the related <action_type>_Type initialization parameter. By default, all the actions (except the Cancel one) are scheduled to be executed periodically with the interval provided by the "DefaultInterval" configuration parameter. To modify the scheduling interval of a given action type, provide an <action_type>_Interval configuration parameter with the value you want to assign. In addition to the actions described above, the Channel Agent also executes two internal actions:
  <Heartbeat> : Periodically refresh the status of the agent in the database
  <CleanSDCache>: Clean the ServiceDiscovery cache
The scheduling intervals of these actions are specified by the parameters <Heartbeat_Interval> and <CleanSDCache_Interval>
libglite_data_transfer_channel_agent.so
The name of the Channel which the agent is responsible for
Type: string; Default value: ''
The contact information of the Administrator responsible for that agent
Type: string; Default value: ''
This parameter specifies how to interpret the VO Share values associated with a channel. It should be one of the following values:
  * normalized: the share is the value of the channel voshare property for the given VO, normalized to the sum of the shares of all the VOs in the same channel. This option can be used when channel administrators want to guarantee slots for certain VOs, in order to implement some sort of QoS, accepting that the total throughput may be penalized (transfer slots are reserved for a VO even if that VO has no job to process)
  * absolute: the share is the value of the channel voshare property expressed as a percentage. No normalization is performed, which means that the sum of all the shares on the same channel can exceed 100%. This option can be used when channel administrators want to balance the share between the VOs, without allowing a single VO to fully allocate a channel but minimizing the risk of allocating slots to VOs that don't have any job to process. This option implies some tuning of the VO share values based on experience, but it allows a compromise between throughput and QoS
  * normalized-on-active: the share is the value of the channel voshare property for the given VO, normalized to the sum of the shares of all the VOs in the same channel that have at least one job that can be processed by the Channel Agent (job state should be Active, Pending or Canceling). This option is the default one and can be used when the channel administrators want to optimize the throughput of the channel (the channel can be fully allocated even by one VO), but with a lower QoS
Type: string; Default value: 'normalized-on-active'
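The three interpretations of the VO share can be compared with a short sketch. The function name, the dictionary-based input and the VO names are hypothetical, and returning 0.0 for a VO without processable jobs under "normalized-on-active" is an assumption.

```python
def vo_share(policy, shares, vo, active_vos=()):
    """Return the fraction of the channel assigned to `vo` (1.0 == 100%).

    shares:     mapping VO name -> channel voshare value
    active_vos: VOs with at least one job the Channel Agent can process
    """
    if policy == "absolute":
        return shares[vo] / 100.0  # plain percentage, no normalization
    if policy == "normalized":
        total = sum(shares.values())
    elif policy == "normalized-on-active":
        if vo not in active_vos:
            return 0.0  # assumption: inactive VOs get no slots
        total = sum(v for name, v in shares.items() if name in active_vos)
    else:
        raise ValueError(f"unknown share policy: {policy}")
    return shares[vo] / total if total else 0.0


# Hypothetical share values for three VOs on one channel.
shares = {"vo_a": 50, "vo_b": 30, "vo_c": 20}
```

With these values, "normalized" gives vo_a half of the channel at all times, while "normalized-on-active" gives it 50/80 of the channel when only vo_a and vo_b have jobs, and the whole channel when it is the only active VO.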
The name of the action that provides the logic to fetch transfers
Type: string; Default value: 'glite:Fetch'
The name of the action that provides the logic to check the state of running transfers
Type: string; Default value: 'glite:CheckState'
The name of the action that provides the logic to cancel active and pending transfers
Type: string; Default value: 'glite:CancelActive'
The default interval, in seconds, to be used for scheduling the Channel actions
Type: integer; Default value: '3'
The interval, in seconds, to be used for scheduling the Heartbeat action. The purpose of this action is to periodically update the lastActive timestamp in the t_agent table, in order to demonstrate that agent is up and running. If this value is 0, the Heartbeat action will be disabled
Type: integer; Default value: '60'
The interval, in seconds, to be used for purging obsolete entries from the ServiceDiscovery cache in order to evaluate changes in the information system. If the SD Cache is disabled, this action doesn't do anything
Type: integer; Default value: '300'
The interval, in seconds, to be used for scheduling the Fetch action. If this parameter is not set or is empty, the Fetch action is scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Fetch action is disabled (useful only for debugging purposes)
Type: integer; Default value: ''
The interval, in seconds, to be used for scheduling the Check State action. If this parameter is not set or is empty, the Check State action is scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Check State action is disabled (useful only for debugging purposes)
Type: integer; Default value: ''
The interval, in seconds, to be used for scheduling the Cancel action. If this parameter is not set or is empty, the Cancel action is scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Cancel action is disabled (useful only for debugging purposes)
Type: integer; Default value: '60'
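The fallback rule shared by the per-action interval parameters can be sketched as follows. The function name is hypothetical, and representing a disabled action as `None` is a choice made for this sketch.

```python
def effective_interval(action_interval, default_interval=3):
    """Resolve the scheduling interval (in seconds) of one action.

    action_interval:  raw value of the <action_type>_Interval parameter,
                      or None / "" when it is not set
    default_interval: value of the "DefaultInterval" parameter
    Returns None when the action is disabled (interval set to 0).
    """
    if action_interval is None or action_interval == "":
        return default_interval
    seconds = int(action_interval)
    return None if seconds == 0 else seconds
```

With the documented defaults, Fetch and Check State would run every 3 seconds, while Cancel, whose interval defaults to 60, would run once a minute.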
VO Agent: this section will report the VO Agent-specific modules
Represents the module that provides the VO Agent Actions. These actions are:
  <Allocate> : allocate a job to a channel based on the source and destination; if needed, interact with an external Catalog Service in order to perform the name resolution
  <CheckReadiness>: check if the files are ready to be transferred (i.e. staged on the disk cache)
  <BasicRetry> : reschedule transfers that are in waiting state;
  <Finalize> : finalize the transfer request and, if needed, register the new replicas into an external catalog service
libglite_data_transfer_agent_vo_actions.so
The strategy to be used to change the format of the SURLs. Allowed values are:
  compact: SURLs are converted to the format srm://<host>/<sfn>
  compact-with-port: SURLs are converted to the format srm://<host>:<port>/<sfn>
  fully-qualified: SURLs are converted to the format srm://<host>:<port>/<path>?SFN=<sfn>, where <path> is usually retrieved from the SRM endpoint and is usually <srm/managerv1>
  disabled: no conversion is performed. The SURLs are processed as submitted by the users
Type: string; Default value: 'compact'
The default SRM version to be used when looking for an SRM endpoint
Type: string; Default value: '1.1'
The policy that should be applied when choosing the SRM version to use. Allowed values are:
  default: if the user provided a fully-qualified SURL, use the endpoint extracted from it, otherwise use the value specified in "SrmVersion"
  with-space-token: use 2.2 if a space token is provided, otherwise behave as the default policy
  force: force the use of the version specified in "SrmVersion"
Type: string; Default value: 'default'
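The decision made by the three policies can be sketched as follows. This is an illustration only: the function name and its arguments are hypothetical, and how the version is derived from the endpoint of a fully-qualified SURL is not described here, so the sketch simply receives it as surl_version (None when the SURL is not fully qualified).

```python
def choose_srm_version(policy, srm_version="1.1", surl_version=None,
                       space_token=None):
    """Pick the SRM version following the documented policies.

    srm_version  -- value of the "SrmVersion" parameter
    surl_version -- version implied by the endpoint of a fully-qualified
                    SURL, or None (how it is obtained is an assumption)
    space_token  -- space token supplied with the request, if any
    """
    if policy == "force":
        return srm_version
    if policy == "with-space-token" and space_token:
        return "2.2"
    if policy in ("default", "with-space-token"):
        # Fall back to the configured version when the SURL gives no hint.
        return surl_version or srm_version
    raise ValueError("unknown policy: %r" % policy)
```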
Enable Client Credential Delegation. If this value is set to true, the VO Agent actions are executed using client delegated credentials retrieved from MyProxy; otherwise the agent uses its Service credentials. Please note that this option was introduced for testing purposes and is therefore not supported in production
Type: boolean; Default value: 'true'
The maximum number of failures before moving the file to Hold (or Failed, depending on the "Transfer Agent FSM" EnableHold advanced parameter)
Type: integer; Default value: '3'
The delay, in seconds, before a Waiting file is resubmitted
Type: integer; Default value: '600'
The delay, in seconds, before a Catalog Waiting file is retried
Type: integer; Default value: '600'
Enable Unknown Source Site. If this value is set to true, during the allocation phase, in case the source SE is not listed on the Information System, the job would be assigned to the fake "UNKNOWN" site; otherwise the job would fail. This option is useful in order to transfer files from an SE that is on a different Grid
Type: boolean; Default value: 'false'
Enable Unknown Destination Site. If this value is set to true, during the allocation phase, in case the destination SE is not listed on the Information System, the job would be assigned to the fake "UNKNOWN" site; otherwise the job would fail. This option is useful in order to transfer files to an SE that is on a different Grid
Type: boolean; Default value: 'false'
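The effect of the two "Unknown Site" options during allocation can be sketched as below. The Information System is replaced by a plain dictionary and the function name is hypothetical; only the "UNKNOWN" site name and the fail-otherwise behaviour come from the descriptions above.

```python
UNKNOWN_SITE = "UNKNOWN"  # the fake site name mentioned above


def resolve_site(se_host, info_system, allow_unknown=False):
    """Map a Storage Element host to its site.

    info_system is a dict {se_host: site}; it stands in for the real
    Information System lookup, which is not shown here.
    """
    site = info_system.get(se_host)
    if site is not None:
        return site
    if allow_unknown:
        return UNKNOWN_SITE
    # Without the option the allocation of the job fails.
    raise LookupError("SE %r is not listed in the Information System" % se_host)
```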
This module represents the agent that runs the FTS actions that are strictly related to Virtual Organization policies. Actions are provided by different libraries that should be loaded into the agent process using the GLite Data Config Service. The VO Agent needs to run the following kinds of actions:
<Allocate>: allocate a transfer job to a channel; in case logical names are provided, it will interact with an external catalog service in order to perform the name resolution;
<Retry>: schedule the resubmission of failed transfers;
<Cancel>: cancel a pending file (not yet queued in the channel queue);
<CheckReadiness>: check if a file is already "online", i.e. available in the disk cache for transferring;
<Finalize>: finalize the request; in case logical names are provided, this action will register the new replicas into an external catalog service.
This module provides a default value for all of these types; however, it is possible to provide different action types and use them by specifying the type of the action in the related <action_type>_Type initialization parameter.
By default, all the actions (except the Retry and Cancel ones) are scheduled to be executed periodically with the interval provided by the "DefaultInterval" configuration parameter. The interval for the Retry action is specified by the <Retry_Interval> parameter, the one for the Cancel action by <Cancel_Interval>. In order to modify the scheduling interval of a specific action type, you can provide an <action_type>_Interval configuration parameter with the value you want to assign.
In addition to the actions described above, the VO Agent also executes two internal actions:
<Heartbeat>: periodically refresh the status of the agent in the database;
<CleanSDCache>: clean the ServiceDiscovery cache.
The scheduling intervals of these actions are specified by the parameters <Heartbeat_Interval> and <CleanSDCache_Interval>.
libglite_data_transfer_vo_agent.so
The name of the VO to which the agent belongs
Type: string; Default value: ''
The contact information of the Administrator responsible for that agent
Type: string; Default value: ''
If this parameter is set to true, the transfers will be performed using the related Channel Agent service credentials; otherwise they will use the client proxy certificate downloaded from MyProxy
Type: boolean; Default value: 'false'
The name of the action type that provides the logic to allocate a transfer job to a channel. In case the source and destination SURLs are not provided, this action will also contact a Catalog Service (if the appropriate plugin is configured) in order to perform the name resolution.
Type: string; Default value: 'glite:Allocate'
The name of the action type that provides the logic to retry failed transfers
Type: string; Default value: 'glite:BasicRetry'
The name of the action type that provides the logic to cancel pending (i.e. not yet processed by a channel) file transfers
Type: string; Default value: 'glite:CancelPending'
The name of the action type that provides the logic to finalize the transfer, registering the new replicas into a Catalog Service when the file transfers are completed.
Type: string; Default value: 'glite:Finalize'
The name of the action type that provides the logic to check if the source files are ready to be transferred, i.e. already available ("online") in the source Storage Element cache.
Type: string; Default value: 'glite:CheckReadiness'
The default interval, in seconds, for scheduling the VO actions
Type: integer; Default value: '3'
The interval, in seconds, to be used for scheduling the Heartbeat action. The purpose of this action is to periodically update the lastActive timestamp in the t_agent table, in order to show that the agent is up and running. If this value is 0, the Heartbeat action will be disabled
Type: integer; Default value: '60'
The interval, in seconds, to be used for purging obsolete entries from the ServiceDiscovery cache in order to evaluate changes in the information system. If the SD Cache is disabled, this action doesn't do anything
Type: integer; Default value: '300'
The interval, in seconds, to be used for scheduling the Allocate action. If this parameter is not set or is empty, the Allocate action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Allocate action will be disabled (useful only for debugging purposes)
Type: integer; Default value: ''
The interval, in seconds, to be used for scheduling the Retry action. If this parameter is not set or is empty, the Retry action will be scheduled every 60 seconds. If this value is 0, the Retry action will be disabled (useful only for debugging purposes)
Type: integer; Default value: '60'
The interval, in seconds, to be used for scheduling the Cancel action. If this parameter is not set or is empty, the Cancel action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Cancel action will be disabled (useful only for debugging purposes)
Type: integer; Default value: '60'
The interval, in seconds, to be used for scheduling the Finalize action. If this parameter is not set or is empty, the Finalize action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Finalize action will be disabled (useful only for debugging purposes).
Type: integer; Default value: ''
The interval, in seconds, to be used for scheduling the CheckReadiness action. If this parameter is not set or is empty, the CheckReadiness action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the CheckReadiness action will be disabled (useful only for debugging purposes).
Type: integer; Default value: ''
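The fallback rules described for the <action_type>_Interval parameters can be summarised with a small Python sketch. The function name and the use of a plain dictionary for the configuration are assumptions made for illustration; the rules themselves (empty means "DefaultInterval", Retry falls back to 60 seconds, 0 disables the action) are the ones stated above.

```python
def resolve_interval(action, params):
    """Return the scheduling interval in seconds, or None if disabled.

    params is a dict of configuration values, e.g.
    {"DefaultInterval": "3", "Retry_Interval": "60"}.
    """
    raw = str(params.get(action + "_Interval", "")).strip()
    if raw == "":
        if action == "Retry":
            seconds = 60  # documented fallback for the Retry action
        else:
            # 3 is the documented default value of "DefaultInterval"
            seconds = int(params.get("DefaultInterval", 3))
    else:
        seconds = int(raw)
    # A value of 0 disables the action (debugging only).
    return seconds if seconds > 0 else None
```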
This section contains some examples of the GLite Data Transfer Channel Agent and VO Agent configuration files. The values should be changed according to the environment where the agents are installed. The Fireman, UrlCopy and Oracle connectors are used.
<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
      <init>
        <param name="Server"><value>lxb1414.cern.ch</value></param>
      </init>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString"><value>@DBNAME</value></param>
        <param name="User"><value>@DBUSER</value></param>
        <param name="Password"><value>@DBPASSWORD</value></param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView"><value>false</value></param>
        <param name="ChannelView"><value>true</value></param>
      </init>
    </component>
    <component name="transfer-agent-ts-urlcopy">
      <lib>libglite_data_transfer_agent_ts_urlcopy.so</lib>
      <dependencies>
        <lib>libglite_data_transfer_url_copy.so</lib>
      </dependencies>
      <init>
        <param name="MaxTransfers"><value>3</value></param>
      </init>
      <config>
        <param name="LogLevel"><value>DEBUG</value></param>
      </config>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-channel-actions">
      <lib>libglite_data_transfer_agent_channel_actions.so</lib>
    </component>
    <component name="transfer-channel-agent">
      <lib>libglite_data_transfer_channel_agent.so</lib>
      <init>
        <param name="Name"><value>channel_1</value></param>
        <param name="DefaultInterval"><value>3</value></param>
      </init>
    </component>
  </components>
</service>
<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
      <init>
        <param name="Server"><value>lxb1414.cern.ch</value></param>
      </init>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString"><value>@DBNAME</value></param>
        <param name="User"><value>@DBUSER</value></param>
        <param name="Password"><value>@DBPASSWORD</value></param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView"><value>true</value></param>
        <param name="ChannelView"><value>false</value></param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name"><value>EGEE</value></param>
        <param name="DefaultInterval"><value>3</value></param>
      </init>
    </component>
  </components>
</service>
<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
      <init>
        <param name="Server"><value>lxb1414.cern.ch</value></param>
      </init>
    </component>
    <component name="transfer-agent-catalog-fireman">
      <lib>libglite_data_transfer_agent_catalog_fireman.so</lib>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString"><value>@DBNAME</value></param>
        <param name="User"><value>@DBUSER</value></param>
        <param name="Password"><value>@DBPASSWORD</value></param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView"><value>true</value></param>
        <param name="ChannelView"><value>false</value></param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name"><value>EGEE</value></param>
        <param name="DefaultInterval"><value>3</value></param>
      </init>
    </component>
  </components>
</service>
An example of a logging configuration file is:
log4j.rootCategory=DEBUG, file
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%m%n
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.fileName=/var/log/glite/glite-transfer-vo-agent-EGEE.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d %-6p %c - %m%n
The agent configuration template file can then be generated by concatenating the different transfer-agent-*.config-template.xml files contained in the config folder. This configuration template file can then be used to generate the configuration file as explained in the gLite Data Transfer Agents Install Guide document.
All the VO Agent Actions can be overridden by specifying the appropriate action type in the vo-agent configuration. In order to do that, the library providing the action should be loaded inside the process (using the Glite Service Configurator Library) and the related action types should be registered in the ActionFactory. However, this approach is quite complex and requires the actions to reimplement most of the logic already provided by the default ones. For that reason, starting from version 2.2.0 the FileTransferAgent adopts the Strategy Pattern, decoupling the logic of the action (that a VO may want to customize) from the common tasks (getting instances from the DB, updating changes, etc.). This new approach, combined with the possibility of calling Python scripts within the VO Agent process, allows these strategies to be specified in an easier way. At the moment, the only action strategy that can be overridden is the Retry one, but new ones will be added in future versions.
In order to use this feature, the structure of the VO Agent configuration should be modified as illustrated below:
<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>
    <component name="agents-python">
      <!-- Python interpreter module -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-agent-vo-actions-python">
      <!-- VO Python Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>
The configuration properties of these new modules are:
The purpose of this module is to load and initialize the Python interpreter in order to run Python scripts inside an Agent process. The module also initializes some wrapper modules that expose the main agents-common functionalities
libglite_data_agents_common_python.so
The locations where the Python interpreter looks for modules and packages. The value can be a list of directory paths separated by ':' (on a Unix-like system). If this property is set, the module adds the value to the PYTHONPATH environment variable
Type: string; Default value: ''
<No configuration parameters>
Represents the module that overrides some VO Actions, allowing the strategy to be specified as a Python script. These actions are, at the moment:
<Retry>: retry failed transfers or move them to hold.
libglite_data_transfer_agent_vo_actions_python.so
The name of the Python module that contains the function that is evaluated in order to choose whether a file has to be retried, failed or left in the Waiting state. If you want to configure the VO Agent to use the retry action that calls this module, you have to set the property "transfer-vo-agent.Retry_Type" to "glite:PythonRetry"
Type: string; Default value: ''
This parameter represents an initialization string that will be passed to the Python module that contains the retry logic, specified by the "RetryModule" parameter. The format of that string is module dependent
Type: string; Default value: ''
The name of the Python module that contains the function that is evaluated in order to choose whether a file has to be retried, failed or left in a waiting state after a failure during the catalog interaction. In case the VO doesn't configure a Catalog plugin, this parameter can be left empty
Type: string; Default value: ''
This parameter represents an initialization string that will be passed to the Python module that contains the Catalog retry logic, specified by the "CatalogRetryModule" parameter. The format of that string is module dependent
Type: string; Default value: ''
<No configuration parameters>
Please note that in order to be able to call the strategy defined in the python script you have to set the expert property "transfer-vo-agent.Retry_Type" to "glite:PythonRetry"
The configuration of a VO Agent that handles transfers starting from logical names is exactly the same; you just need to configure a Catalog Connector as well.
The current release of transfer-agents provides two python retry strategies, located in GLITE_LOCATION/lib/python/glite/fts/strategies:
basic_retry.py: the Python implementation of the default BasicRetry Strategy. No configuration parameters are exposed by this script
smarter_retry.py: a more advanced retry logic that checks the last transfer failures: fail immediately if the source file doesn't exist; wait longer if a timeout on get is received; delete the destination file if a "file exists" error follows a "transfer error". This script accepts the following configuration parameters, which you can pass through the "transfer-agent-vo-actions-python.RetryParams" property:
MaxFailures : The maximum number of failures allowed for a file (Default: 100)
MaxFileExistsFailures : The maximum number of consecutive "File Exists" failures (Default: 3)
HoldEnabled : If this value is set to true, a file that fails is moved to HOLD. If this value is set to false, the file status will be FAILED (Default: true)
OverwriteFailedFiles : When the strategy detects that a "file exists" failure is due to an incomplete cleanup of a previous failed transfer, delete the destination file before performing the new attempt (Default: true)
OverwriteExistingFiles : In case of a "file exists" failure, always delete the destination before retrying the transfer (Default: false)
DefaultRetryDelay : The default interval (in seconds) before retrying a transfer (Default: 600)
RetryDelayForTimeoutOnGet : The interval (in seconds) before retrying a transfer that failed with a "Timeout on Get" error (Default: 1800)
RetryDelayForDestFileExists : The interval (in seconds) before retrying a transfer that failed at the destination (Default: 600)
SrmServiceType : The service type that identifies an SRM in the information system (Default: SRM)
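Since the format of the RetryParams string is module dependent, a strategy script has to parse it itself. The sketch below parses the "name = value ;" form used in the configuration example of this guide; it is not the code of smarter_retry.py, and the names parse_retry_params and DOCUMENTED_DEFAULTS are hypothetical. The default values are the ones listed above.

```python
# Defaults copied from the parameter list above.
DOCUMENTED_DEFAULTS = {
    "MaxFailures": 100,
    "MaxFileExistsFailures": 3,
    "HoldEnabled": True,
    "OverwriteFailedFiles": True,
    "OverwriteExistingFiles": False,
    "DefaultRetryDelay": 600,
    "RetryDelayForTimeoutOnGet": 1800,
    "RetryDelayForDestFileExists": 600,
    "SrmServiceType": "SRM",
}


def _convert(text):
    """Turn 'true'/'false' into booleans and digit strings into integers."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(text)
    except ValueError:
        return text


def parse_retry_params(params, defaults=DOCUMENTED_DEFAULTS):
    """Parse 'Name = value ; Name = value ;' into a dict over the defaults."""
    result = dict(defaults)
    for chunk in params.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ';' and blank segments
        key, sep, value = chunk.partition("=")
        if not sep:
            raise ValueError("expected 'name = value', got %r" % chunk)
        result[key.strip()] = _convert(value.strip())
    return result
```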
In case you're interested in providing other strategies, you have to comply with the following rules:
In order to provide a Python script that can be used as a Retry Strategy for the gLite File Transfer Agent, you need to provide the following functions:
RetryVersion(): This method should return a string containing the version of the Retry Strategy interface. The rationale for this method is that, based on the returned value, the FTA will be able to ensure backward compatibility with scripts compliant with older versions. The current version of the FTA supports "1.0".
InitRetry(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:
params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script
Retry(job,file,transfers): This method provides the retry strategy that should be applied for the failed files. The input parameters are:
job: the Transfer Job request object
file: the Transfer File object that represents the file to evaluate
transfers: a list containing all the transfer attempts already performed for the given file, ordered from the most recent one
This method should return a glite.fts.RetryResult value:
Wait : Not enough time has elapsed since the last try
Retry: File should be retried
Hold : File should be moved to the Hold state
Fail : File should be considered Failed
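A minimal strategy module following the rules above could look like the sketch below. It is only an illustration under stated assumptions: the attribute finish_time on a transfer attempt, the "name=value;name=value" format of params and the two settings are hypothetical, and the stand-in RetryResult class exists only so that the script can be tried outside the agent, where glite.fts is not importable.

```python
import time

try:
    from glite.fts import RetryResult  # available inside the VO Agent process
except ImportError:
    class RetryResult(object):  # stand-in with arbitrary values, for testing only
        Wait, Retry, Hold, Fail = range(4)

# Hypothetical settings of this example strategy.
_settings = {"max_failures": 3, "retry_delay": 600}


def RetryVersion():
    return "1.0"


def InitRetry(params=""):
    # Assumed format: "max_failures=3;retry_delay=600"
    try:
        for item in params.split(";"):
            if item.strip():
                key, value = item.split("=", 1)
                _settings[key.strip()] = int(value)
    except ValueError:
        return False
    return True


def Retry(job, file, transfers):
    # Every entry of 'transfers' is a previous attempt for this file.
    if len(transfers) >= _settings["max_failures"]:
        return RetryResult.Hold
    # 'finish_time' (epoch seconds) is a hypothetical attribute name;
    # transfers[0] is the most recent attempt.
    last = getattr(transfers[0], "finish_time", 0) if transfers else 0
    if time.time() - last < _settings["retry_delay"]:
        return RetryResult.Wait
    return RetryResult.Retry
```

A strategy that should fail files instead of holding them would return RetryResult.Fail in the first branch.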
In order to provide a Python script that can be used as a Catalog Retry Strategy for the gLite File Transfer Agent, you need to provide the following functions:
CatalogRetryVersion(): This method should return a string containing the version of the Catalog Retry Strategy interface. The rationale for this method is that, based on the returned value, the FTA will be able to ensure backward compatibility with scripts compliant with older versions. The current version of the FTA supports "1.0".
InitCatalogRetry(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:
params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script
CatalogRetry(job,files): This method provides the Catalog retry strategy that should be applied to the files that failed during the Catalog interaction. The input parameters are:
job: the Transfer Job request object
files: the list of Transfer File objects that represent the files to evaluate
This method should return a glite.fts.CatalogRetryResult value:
Wait : Not enough time has elapsed since the last try
Retry: File should be retried
Fail : File should be considered Failed
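The Catalog Retry interface can be sketched in the same way. Again this is an assumption-laden example: the attribute catalog_failures on a file object and the meaning of params (a single integer) are hypothetical, and the stand-in CatalogRetryResult class is only there to run the script outside the agent. The Wait value is not used because the sketch has no timing information to base it on.

```python
try:
    from glite.fts import CatalogRetryResult  # available inside the VO Agent
except ImportError:
    class CatalogRetryResult(object):  # stand-in with arbitrary values
        Wait, Retry, Fail = range(3)

_max_attempts = 3  # hypothetical setting of this example


def CatalogRetryVersion():
    return "1.0"


def InitCatalogRetry(params=""):
    # Assumed format: a single integer, the maximum number of attempts.
    global _max_attempts
    if params.strip():
        try:
            _max_attempts = int(params)
        except ValueError:
            return False
    return True


def CatalogRetry(job, files):
    # 'catalog_failures' is a hypothetical per-file failure counter.
    worst = max([getattr(f, "catalog_failures", 0) for f in files] or [0])
    if worst >= _max_attempts:
        return CatalogRetryResult.Fail
    return CatalogRetryResult.Retry
```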
Please note that the usage of the Catalog Retry Strategy is not compulsory. In case the Python Strategy bridge is used and no module is specified for overriding the Catalog Retry Strategy, all the files that encountered an error during the Catalog interaction will be failed
The following section illustrates an example of a configuration that uses a Python script as retry strategy
<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString"><value>@DBNAME</value></param>
        <param name="User"><value>@DBUSER</value></param>
        <param name="Password"><value>@DBPASSWORD</value></param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView"><value>true</value></param>
        <param name="ChannelView"><value>false</value></param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-python">
      <lib>libglite_data_transfer_agent_python.so</lib>
      <init>
        <param name="PythonPath">
          <value>/opt/glite/lib/python2.2/site-packages:/opt/glite/lib/python/glite/fts/strategies/</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-agent-vo-actions-python">
      <lib>libglite_data_transfer_agent_vo_actions_python.so</lib>
      <init>
        <param name="RetryModule"><value>smarter_retry</value></param>
        <param name="RetryParams">
          <value>
            MaxFailures = 10 ;
            HoldEnabled = true ;
            OverwriteFailedFiles = true ;
            OverwriteExistingFiles = false ;
            DefaultRetryDelay = 60 ;
            RetryDelayForTimeoutOnGet = 300 ;
            RetryDelayForDestFileExists = 60 ;
          </value>
        </param>
      </init>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name"><value>EGEE</value></param>
        <param name="Retry_Type"><value>glite:PythonRetry</value></param>
      </init>
    </component>
  </components>
</service>
A VO can provide its specific CatalogService Plugin by providing an implementation of the CatalogService interface defined in /glite/data/transfer/agents/catalog/CatalogService.h. In order to do that, the library providing that class should be loaded inside the process (using the Glite Service Configurator Library) together with a Factory to create that instance (it uses the AbstractFactory Pattern: see /glite/data/transfer/agents/catalog/CatalogServiceFactory.h). However, this approach is slightly complex and has some deployment implications (e.g. a possible mismatch between the versions of the FTA and the provided library). For that reason, version 2.3.0 of the FileTransferAgent introduced a Python CatalogService Connector that implements the CatalogService interface by delegating the execution of the related methods to a Python script. This approach, combined with the possibility of calling Python scripts within the VO Agent process, allows the VOs to specify their own connector in an easier way.
In order to use this feature, the structure of the VO Agent Configuration should be modified as illustrated below:
<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>
    <component name="agents-python">
      <!-- Python interpreter module -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="transfer-agent-catalog-python">
      <!-- PythonCatalog Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-agent-vo-actions-python">
      <!-- VO Python Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>
The configuration properties of these new modules are:
The purpose of this module is to load and initialize the Python interpreter in order to run Python scripts inside an Agent process. The module also initializes some wrapper modules that expose the main agents-common functionalities
libglite_data_agents_common_python.so
The locations where the Python interpreter looks for modules and packages. The value can be a list of directory paths separated by ':' (on a Unix-like system). If this property is set, the module adds the value to the PYTHONPATH environment variable
Type: string; Default value: ''
<No configuration parameters>
This is the module that should be used to call a Python script in order to resolve logical file names and register new replicas
libglite_data_transfer_agent_catalog_python.so
The name of the Python module that contains the functions used to check replication permissions, resolve logical file names and register new replicas
Type: string; Default value: ''
This parameter represents an initialization string that will be passed to the Python module that contains the catalog plugin, specified by the "CatalogModule" parameter. The format of that string is module dependent
Type: string; Default value: ''
<No configuration parameters>
Please note that you can also configure the Python Retry Strategy described in the paragraph above
In case you're interested in providing such a connector, you have to comply with the following rules:
In order to provide a Python script that can be used as a CatalogService Connector for the gLite File Transfer Agent, you need to provide the following functions:
CatalogPluginVersion(): This method should return a string containing the version of the PythonCatalog interface. The rationale for this method is that, based on the returned value, the FTA will be able to ensure backward compatibility with scripts compliant with older versions. The current version of the FTA supports "1.0".
InitCatalogPlugin(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:
params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script
GlobalCatalogType(): This method should return a string containing the type of the Global CatalogService the plugin is using. This value is then used to look up the CatalogService endpoint in the information system. In case the VO uses local Catalogs, this method can be skipped
LocalCatalogType(): This method should return a string containing the type of the Local CatalogService the plugin is using. This value is then used to look up the CatalogService endpoint in the information system. In case the VO uses a global Catalog, this method can be skipped
GetEndpoint(site,vo_name): This method should return a StringPair that contains the endpoint and the type of the CatalogService that should be used when the Agent has to access or register files in the given site. If this method is not provided, the VO Agent will look up the CatalogService endpoint in the information system using the types returned by the LocalCatalogType and GlobalCatalogType calls. The input parameters are:
site: the name of the site where the file is/has to be registered
vo_name: the name of the VO which the Agent is responsible for
CheckPermissions(endpoint, type, names): This method should check that the user is allowed to replicate the files indicated by the logical names listed in the "names" parameter. The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The "endpoint" and "type" input parameters contain the endpoint and the type of the CatalogService the plugin should contact. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. In case the user is authorized, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed and CatalogResult.Reason should contain the reason for the failure. In case only some of the files can't be authorized, you can return the detailed information by setting CatalogResult.Status to ResultStatus.SomeFailures and storing the reason for each failure in CatalogResult.Failures, an array that should hold the (LogicalName, ErrorReason) pairs. The input parameters are:
endpoint: the Catalog service endpoint
type: the Catalog service type
names: the list of logical names
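The result handling described for CheckPermissions can be sketched as follows. The example does not contact any catalog: the set _READABLE replaces the real authorization lookup and is purely hypothetical. It is also an assumption that CatalogResult can be created without arguments, that ResultStatus lives in glite.fts.catalog, and that Failures accepts a Python list of pairs; the stand-in classes are only used when the script is run outside the agent.

```python
import os

try:
    from glite.fts.catalog import CatalogResult, ResultStatus
except ImportError:
    class ResultStatus(object):  # stand-in with arbitrary values
        Success, Failed, SomeFailures = range(3)

    class CatalogResult(object):  # stand-in exposing the documented fields
        def __init__(self):
            self.Status = ResultStatus.Success
            self.Reason = ""
            self.Failures = []

# Hypothetical replacement for a real catalog authorization lookup.
_READABLE = set(["lfn:/grid/egee/file1", "lfn:/grid/egee/file2"])


def CatalogPluginVersion():
    return "1.0"


def CheckPermissions(endpoint, type, names):
    result = CatalogResult()
    # The FTA sets X509_USER_PROXY before calling the plugin.
    if not os.environ.get("X509_USER_PROXY"):
        result.Status = ResultStatus.Failed
        result.Reason = "X509_USER_PROXY is not set"
        return result
    denied = [(name, "permission denied") for name in names
              if name not in _READABLE]
    if not denied:
        result.Status = ResultStatus.Success
    elif len(denied) == len(names):
        result.Status = ResultStatus.Failed
        result.Reason = "permission denied for all the requested files"
    else:
        result.Status = ResultStatus.SomeFailures
        result.Reason = "permission denied for some of the requested files"
        result.Failures = denied  # (LogicalName, ErrorReason) pairs
    return result
```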
ListSurls(endpoint, type, names, source): This method should return all the replicas registered in the CatalogService identified by the "endpoint" and "type" input parameters, for all the logical names listed in "names". The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The optional parameter "source" can contain the requested source site/SE, as a hint to return only the SURLs involved. The result of the method should be an instance of the glite.fts.catalog.ListSurlsResult class. In case of success, the ListSurlsResult.Status field should be set to ResultStatus.Success and the ListSurlsResult.Surls field should contain the list of the replicas' SURLs (the type of this field is glite.fts.catalog.StringArray2, an array of string arrays). In case of errors, ListSurlsResult.Status should be set to ResultStatus.Failed (or ResultStatus.SomeFailures) and ListSurlsResult.Reason should contain the reason for the failure. In case the status is set to ResultStatus.SomeFailures, you can provide the list of failed files and the related reasons in the ListSurlsResult.Failures field, an array that should hold the (LogicalName, ErrorReason) pairs. Please note that in case you want to provide these details, you have to fill ListSurlsResult.Surls with the list of SURLs for the files that succeeded and an empty array for the failed ones (the ones listed in ListSurlsResult.Failures). The input parameters are:
endpoint: the Catalog service endpoint
type: the Catalog service type
names: the list of logical names
source: [Optional] the source SE that can be used to filter the request
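The rule that Surls must hold one entry per logical name, with an empty array for each failed name, can be sketched as follows. The classes are local stand-ins modelled on the field names in the text, the REPLICAS table is hypothetical, and matching "source" as a substring of the SURL is only one possible reading of the hint.

```python
# Stand-ins for the glite.fts.catalog classes named in the text.
class ResultStatus(object):
    Success, Failed, SomeFailures = "Success", "Failed", "SomeFailures"

class ListSurlsResult(object):
    def __init__(self):
        self.Status = ResultStatus.Success
        self.Reason = ""
        self.Surls = []     # one list of SURLs per logical name
        self.Failures = []  # (LogicalName, ErrorReason) pairs

# Hypothetical replica table; a real plugin would query the catalog.
REPLICAS = {
    "lfn:/grid/egee/a.dat": ["srm://se1.example.org/data/a.dat",
                             "srm://se2.example.org/data/a.dat"],
    "lfn:/grid/egee/b.dat": ["srm://se2.example.org/data/b.dat"],
}

def ListSurls(endpoint, type, names, source=None):
    result = ListSurlsResult()
    for name in names:
        surls = list(REPLICAS.get(name, []))
        if source:
            # Assumption: the hint is matched as a substring of the SURL.
            surls = [s for s in surls if source in s]
        # Keep Surls aligned with names: failed names get an empty list.
        result.Surls.append(surls)
        if not surls:
            result.Failures.append((name, "no replica found"))
    if result.Failures:
        if len(result.Failures) == len(names):
            result.Status = ResultStatus.Failed
        else:
            result.Status = ResultStatus.SomeFailures
        result.Reason = "%d of %d names have no replica" % (
            len(result.Failures), len(names))
    return result
```

Because Surls stays index-aligned with "names", the caller can pair each logical name with its replicas without consulting Failures first.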
CheckSurls(endpoint, type, names): This method should check that the files identified by the "logical name-surl" pairs listed in the "names" parameter don't already exist. The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The "endpoint" and "type" input parameters contain the endpoint and the type of the CatalogService the plugin should contact. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If none of the files already exists, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed and CatalogResult.Reason should contain the reason for the failure. If only some of the files already exist, you can return the detailed information by setting CatalogResult.Status to ResultStatus.SomeFailures and storing the reason for each failure in CatalogResult.Failures, an array that holds the (LogicalName, ErrorReason) pairs. The input parameters are:
endpoint: the Catalog service endpoint
type: the Catalog service type
names: the list of "logical name-surl" pairs
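A compact sketch of this check is shown below. It assumes that each "logical name-surl" pair arrives as a two-element tuple, which the text does not specify; the classes are local stand-ins for the glite.fts.catalog ones and the EXISTING set is hypothetical.

```python
# Stand-ins for the glite.fts.catalog classes named in the text.
class ResultStatus(object):
    Success, Failed, SomeFailures = "Success", "Failed", "SomeFailures"

class CatalogResult(object):
    def __init__(self):
        self.Status = ResultStatus.Success
        self.Reason = ""
        self.Failures = []  # (LogicalName, ErrorReason) pairs

# Hypothetical set of SURLs that are already registered.
EXISTING = set(["srm://se2.example.org/data/a.dat"])

def CheckSurls(endpoint, type, names):
    result = CatalogResult()
    # Assumption: each entry of "names" is a (logical name, surl) tuple.
    for logical, surl in names:
        if surl in EXISTING:
            result.Failures.append((logical, "file already exists: " + surl))
    if result.Failures:
        if len(result.Failures) == len(names):
            result.Status = ResultStatus.Failed
        else:
            result.Status = ResultStatus.SomeFailures
        result.Reason = "%d of %d files already exist" % (
            len(result.Failures), len(names))
    return result
```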
GetStats(endpoint, type, names): This method should return the stat info of the files identified by the "logical name-surl" pairs listed in the "names" parameter. The purpose of this method is to synchronize the source and destination catalogs; therefore, if the endpoints of these catalogs are the same, this method is not called. A plugin implementation should derive its own implementation of the FileStat class, adding the stat info and the related metadata that should be synchronized between the catalogs; the FTA then takes the returned object and passes it as an opaque pointer to the RegisterSurls method. The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The "endpoint" and "type" input parameters contain the endpoint and the type of the CatalogService the plugin should contact. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If the stat info can be retrieved for all the files, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed and CatalogResult.Reason should contain the reason for the failure. If only some of the files failed, you can return the detailed information by setting CatalogResult.Status to ResultStatus.SomeFailures and storing the reason for each failure in CatalogResult.Failures, an array that holds the (LogicalName, ErrorReason) pairs. The input parameters are:
endpoint: the Catalog service endpoint
type: the Catalog service type
names: the list of "logical name-surl" pairs
RegisterSurls(endpoint, type, replicas): This method should register the new replicas in the CatalogService identified by the "endpoint" and "type" input parameters. The replicas are contained in "replicas", an array of ReplicaStat objects (whose properties are Logical, Surl and FileStat). The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If the registration succeeds, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed (or ResultStatus.SomeFailures) and CatalogResult.Reason should contain the reason for the failure. If the result is set to ResultStatus.SomeFailures, you can provide the list of failed files and the related reasons by filling the CatalogResult.Failures field, an array that holds the (LogicalName, ErrorReason) pairs. The input parameters are:
endpoint: the Catalog service endpoint
type: the Catalog service type
replicas: the list of ReplicaStat objects
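The link between the plugin's own FileStat subclass and RegisterSurls can be sketched as follows. All classes here are local stand-ins: FileStat and ReplicaStat only mirror the names and properties mentioned in the text, MyFileStat and its fields are hypothetical, and the in-memory CATALOG replaces a real catalog service. How GetStats hands the FileStat objects back to the FTA is not shown, since the text does not spell it out; treating a missing FileStat as acceptable (for the case where GetStats is not called) is likewise an assumption.

```python
# Stand-ins for the classes named in the text.
class ResultStatus(object):
    Success, Failed, SomeFailures = "Success", "Failed", "SomeFailures"

class CatalogResult(object):
    def __init__(self):
        self.Status = ResultStatus.Success
        self.Reason = ""
        self.Failures = []  # (LogicalName, ErrorReason) pairs

class FileStat(object):
    """Stand-in for the FileStat base class."""

class ReplicaStat(object):
    """Stand-in exposing the Logical, Surl and FileStat properties."""
    def __init__(self, logical, surl, file_stat=None):
        self.Logical = logical
        self.Surl = surl
        self.FileStat = file_stat

class MyFileStat(FileStat):
    """Hypothetical plugin-specific metadata to synchronize."""
    def __init__(self, size, checksum):
        self.size = size
        self.checksum = checksum

CATALOG = {}  # hypothetical catalog: logical name -> list of (surl, stat)

def RegisterSurls(endpoint, type, replicas):
    result = CatalogResult()
    for replica in replicas:
        stat = replica.FileStat
        # The FTA passes the object through untouched, so the plugin
        # checks that it gets back the type it created itself.
        if stat is not None and not isinstance(stat, MyFileStat):
            result.Failures.append((replica.Logical, "unexpected FileStat type"))
            continue
        CATALOG.setdefault(replica.Logical, []).append((replica.Surl, stat))
    if result.Failures:
        if len(result.Failures) == len(replicas):
            result.Status = ResultStatus.Failed
        else:
            result.Status = ResultStatus.SomeFailures
        result.Reason = "%d of %d replicas not registered" % (
            len(result.Failures), len(replicas))
    return result
```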
In addition to the above extension, you may be interested in providing a plugin that defines the function used to generate the destination SURL. As in the other cases, this is possible either by providing an implementation of the interface defined in glite/transfer/agent/namegen/NameGeneration.h, or via a Python script. If you choose the latter option, the configuration properties are:
This is the module that should be used to call a Python script in order to generate the names of new replicas
libglite_data_transfer_agent_namegen_python.so
NameGenModule: The name of the Python module that contains the function used to generate the name of new replicas starting from the logical name. If this parameter is not provided or is empty, the default implementation is used
Type: string; Default value: ''
NameGenParams: An initialization string that is passed to the Python module containing the NameGeneration plugin, specified by the "NameGenModule" parameter. The format of this string is module dependent
Type: string; Default value: ''
<No configuration parameters>
In that case, the structure of a VO Agent Configuration File that uses Python catalog and name generation plugins would be:
<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>
    <component name="agents-python">
      <!-- Python interpreter module -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="transfer-agent-namegen-python">
      <!-- Python NameGeneration Configuration-->
    </component>
    <component name="transfer-agent-catalog-python">
      <!-- PythonCatalog Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-agent-vo-actions-python">
      <!-- VO Python Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>
Please note that the previously described component "agents-python" is also needed.
If you are interested in providing such a function, you have to comply with the following rules:
In order to provide a Python script that generates the destination SURL from a given logical name, you need to provide the following functions:
NameGenVersion(): This method should return a string containing the version of the NameGeneration interface. The rationale for this method is that, based on the returned value, the FTA is able to ensure backward compatibility with scripts compliant with older versions. The current version of the FTA supports "1.0".
InitNameGen(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:
params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as it is. The format is specific to the script
Generate(logical,endpoint,se_path,storage_class): This method should return the SURL of the destination file, starting from the input parameters. The input parameters are:
logical: the logical name of the file, as specified by the user in the Transfer Job Request
endpoint: the endpoint of the destination SRM
se_path: the path in the Storage Element, retrieved from the Information System (SAPath property)
storage_class: [Optional] the storage class for the file. This parameter is provided when it is set in the Transfer Job request; note that se_path may already take this parameter into account (if the SA is configured accordingly)
This method should return a string containing the generated SURL
The following section illustrates an example configuration that uses a Python script as catalog connector
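A minimal name generation script that follows the rules above might look like the sketch below. The three function names and their parameters come from the text; everything else is an assumption made for illustration: the "prefix=<dir>" format of the initialization string, the way the SURL is assembled, and the expectation that "endpoint" arrives as a host[:port] string.

```python
# Module-level state filled in by InitNameGen.
_prefix = ""

def NameGenVersion():
    # Version of the NameGeneration interface this script complies with.
    return "1.0"

def InitNameGen(params=""):
    # Hypothetical format for the initialization string: "prefix=<dir>".
    global _prefix
    _prefix = ""
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        if "=" not in item:
            return False
        key, value = [part.strip() for part in item.split("=", 1)]
        if key != "prefix":
            return False
        _prefix = value
    return True

def Generate(logical, endpoint, se_path, storage_class=None):
    # Drop a leading scheme such as "lfn:" and any leading slashes.
    path = logical.split(":", 1)[-1].lstrip("/")
    # storage_class is ignored here: se_path may already account for it.
    parts = [p.strip("/") for p in (se_path, _prefix, path)]
    # Assumption: endpoint is given as host[:port].
    return "srm://" + endpoint + "/" + "/".join([p for p in parts if p])
```

With this sketch, the value configured for the initialization string would be something like prefix=generated, and an empty string leaves the path under se_path unchanged.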
<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="transfer-agent-namegen-python">
      <lib>libglite_data_transfer_agent_namegen_python.so</lib>
      <init>
        <param name="NameGenModule">
          <value>@VO_NameGen_Script</value>
        </param>
        <param name="NameGenParams">
          <value>@SOME_VALUES</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-catalog-python">
      <lib>libglite_data_transfer_agent_catalog_python.so</lib>
      <init>
        <param name="CatalogModule">
          <value>@VO_Catalog_PLUGIN</value>
        </param>
        <param name="CatalogParams">
          <value>@SOME_VALUES</value>
        </param>
      </init>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString">
          <value>@DBNAME</value>
        </param>
        <param name="User">
          <value>@DBUSER</value>
        </param>
        <param name="Password">
          <value>@DBPASSWORD</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView">
          <value>true</value>
        </param>
        <param name="ChannelView">
          <value>false</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-python">
      <lib>libglite_data_transfer_agent_python.so</lib>
      <init>
        <param name="PythonPath">
          <value>/opt/glite/lib/python2.2/site-packages: /opt/glite/lib/python/glite/fts/strategies/</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-agent-vo-actions-python">
      <lib>libglite_data_transfer_agent_vo_actions_python.so</lib>
      <init>
        <param name="RetryModule">
          <value>smarter_retry</value>
        </param>
        <param name="RetryParams">
          <value>
            MaxFailures = 10 ;
            HoldEnabled = true ;
            OverwriteFailedFiles = true ;
            OverwriteExistingFiles = false ;
            DefaultRetryDelay = 60 ;
            RetryDelayForTimeoutOnGet = 300 ;
            RetryDelayForDestFileExists = 60 ;
          </value>
        </param>
      </init>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name">
          <value>EGEE</value>
        </param>
        <param name="Retry_Type">
          <value>glite:PythonRetry</value>
        </param>
      </init>
    </component>
  </components>
</service>
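Before starting an agent, it can be handy to check that no value in a configuration like the one above has been left as a template placeholder. The sketch below uses Python's standard xml.etree.ElementTree module to read the component/lib/init/param/value layout shown in the example. It is not part of the FTA; treating values that start with "@" as unfilled placeholders is an assumption based on the example, and the SAMPLE fragment (including the fts_user value) is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment using the same layout as the example configuration.
SAMPLE = """
<service>
  <components>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString"><value>@DBNAME</value></param>
        <param name="User"><value>fts_user</value></param>
      </init>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name"><value>EGEE</value></param>
      </init>
    </component>
  </components>
</service>
"""

def read_components(xml_text):
    """Return {component name: {"lib": ..., "params": {name: value}}}."""
    root = ET.fromstring(xml_text)
    components = {}
    for comp in root.findall("./components/component"):
        params = {}
        for param in comp.findall("./init/param"):
            params[param.get("name")] = (param.findtext("value") or "").strip()
        components[comp.get("name")] = {
            "lib": (comp.findtext("lib") or "").strip(),
            "params": params,
        }
    return components

def unfilled_placeholders(components):
    """List (component, parameter) pairs whose value still starts with '@'."""
    found = []
    for name in sorted(components):
        params = components[name]["params"]
        for key in sorted(params):
            if params[key].startswith("@"):
                found.append((name, key))
    return found
```

Calling unfilled_placeholders(read_components(SAMPLE)) reports the ConnectString parameter of agents-dao-oracle, the only value in the fragment that is still a placeholder.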