The connection between the dCache system and the pnfs filesystem is made through the PnfsManager. The PnfsManager reads most of its configuration data out of the pnfs filesystem itself, so pnfs needs some preparation to work together with the PnfsManager. Currently the PnfsManager is able to detect and to scan multiple pnfs servers. One of these servers will be declared the default pnfs server. Only the default server will be used by the PnfsManager to answer pnfs related queries containing pnfsIds without serverId extensions. pnfs serverId extensions are not yet supported by the current dCache (1.4.x) system. If more than one pnfs filesystem is mounted on the PnfsManager host, the cell behaves as follows:
- If all mounted pnfs filesystems turn out to be different mountpoints of the same server, the PnfsManager will choose the mountpoint with the largest file tree. The corresponding server will be the default pnfs server.
- If the PnfsManager detects different pnfs servers, the default pnfs server will be chosen at random, which is certainly not what you want. In that case, use the defaultPnfsServer option in the dCacheSetup file to choose an appropriate server (see below).
If the PnfsManager is not running on the pnfs server host itself, the host the PnfsManager is running on must be made root trusted on the pnfs server host.
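To check which server a given pnfs mountpoint belongs to, the server name can be read out of the pnfs admin tree, e.g. (assuming the filesystem is mounted under /pnfs/fs):

cat /pnfs/fs/admin/etc/config/serverName

This is the same value the defaultPnfsServer variable has to match (see below).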
export CVS_RSH=ssh1
export CVSROOT=:ext:cvs@watphrakeo.desy.de:/home/cvs/cvs-root
cvs checkout Sandbox/Java/diskCacheV111
cd Sandbox/Java/diskCacheV111
mkdir classes
cp /<cellsBaseDir>/cells.jar classes
The most important parameters steering the behaviour of nearly all dCache components are set up in the config/dCacheSetup configuration file. The distribution doesn't contain this file to prevent overwriting it with a new 'checkout'. Instead, we provide a template in config/dCacheSetup.temp. This file has to be renamed to dCacheSetup and modified accordingly (see below).
cp config/dCacheSetup.temp config/dCacheSetup

Preparing dCacheSetup
Most of the variables in dCacheSetup are already assigned to the correct value. The variable ${ourHomeDir} can be used to address certain files or directories relative to the dCache base.
Only the following variables may need to be tuned (a sketch of a corresponding dCacheSetup excerpt is shown after this list).
- The kerberized FTP door from Fermilab requires extra configuration.
- java has to be set to the java virtual machine executable. It is not sufficient to have java defined in the PATH variable.
- java_options needs to be set up as well if a certain java option is required. For the original Sun JVM (Solaris, Linux), at least -server should be specified. The IRIX JVM 1.3.1 allows choosing the -hotspot JVM. In addition, it might be necessary to set the stack sizes to some large value if it is intended to run large numbers of pools within the same virtual machine.
- billingDb is forwarded to the BillingCell. Currently the billing module regards this argument as the directory where it can create the billing files. This variable should be set to some reasonable value instead of /tmp.
- serviceLocatorHost, serviceLocatorPort define the host and port number of the serviceLocator.
- xxxPort The names of the different port variables should be self-describing. The template defines these port numbers relative to a portBase variable. This can be overwritten without doing any harm to the setup scheme.
The telnetPort should only be uncommented for debug purposes. It allows unauthenticated access to the system. In addition, if uncommented, each JVM (Domain) will try to start the telnet daemon on the specified port, which will certainly fail if more than one dCache Domain is started on the same host.
- pnfs overwrites the self-detection mechanism of the PnfsManager.
- defaultPnfsServer specifies the pnfs server name in case multiple pnfs servers are mounted on the PnfsDomain host. The value of the defaultPnfsServer parameter must match the content of the /pnfs/fs/admin/etc/config/serverName file.
- trash overwrites the autodetection mechanism of the Cleaner module. This variable should simply be left empty (an empty assignment); it must not be commented out. If the pnfsDomain is not started on the actual pnfs server, the Cleaner will print an error message and stop.
- ftpBase The current (DESY) ftp implementation is rather old and still needs a pointer to the pnfs filesystem to reply to 'dir' requests. Future releases will have an extended communication scheme with the PnfsManager in addition to a directory pool and won't touch pnfs directly.
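As an illustration only, an excerpt of config/dCacheSetup covering the variables discussed above might look like the following sketch. All paths, host names and port numbers are placeholders and have to be adapted to the local installation; the authoritative list of variables and their syntax is the dCacheSetup.temp template.

java="/usr/java/j2sdk1.4.0/bin/java"
java_options="-server"
billingDb=${ourHomeDir}/billing
serviceLocatorHost=admin.example.org
serviceLocatorPort=11111
portBase=22000
# telnetPort=22223
defaultPnfsServer=pnfs.example.org
trash=
ftpBase=/pnfs/fs/usr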
Preparing the ssh access

Hosts which are going to run the doorDomain or the adminDoorDomain need to have a valid host and server key. It is very likely that the host key is already available in /etc/ssh/ssh_host_key. To make it accessible by the ssh door, this file has to be linked into the config directory:

ln -s /etc/ssh/ssh_host_key ./config/host_key

or, if /etc/ssh/ssh_host_key is not present:

ssh-keygen -b 1024 -f ./host_key -N ""

Whenever a host key already exists, one MUST NOT create a new one for the dCache access, otherwise the ssh clients will get confused. They can only associate a single key with a host. According to the ssh specification, a server key has to be generated regularly. Our ssh daemon currently violates this protocol. We simply create a single server key and use it until a new one is created. So, if one wants to be very precise with this idea, one simply needs to set up a cron job which frequently creates a new server_key in the ./config directory. The door will, after a short delay, use the most recent one.
cd config
ssh-keygen -b 768 -f ./server_key -N ""
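If the server key is to be regenerated regularly as described above, a cron job can do it. A purely hypothetical crontab entry (hourly, with the dCache base path to be adapted) could look like this:

# recreate the dCache ssh server key once per hour (path is a placeholder)
0 * * * * cd /<dCacheBase>/config && rm -f server_key server_key.pub && ssh-keygen -q -b 768 -f ./server_key -N ""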
NOTE I

The 1.4.0 ssh daemon only supports the idea, des and blowfish encryption ciphers. The 3des cipher is still suffering from a software bug which makes it essentially unusable. Since in ssh-II 'des' is no longer being used due to security considerations and 'idea' due to licensing considerations, the blowfish algorithm is the only one left right now. Consequently the ssh client has to be launched with the -c blowfish option.
NOTE II
The current doorDomain contains an ssh door which runs the ssh I protocol using the keys described above, but which doesn't authenticate the user. Everybody is immediately logged in without password or key check. We are in the process of replacing this door by the adminDoorDomain, which does real password checks using local password files or the pam mechanism.
Running initPackage
The jobs/initPackage.sh script mainly sets up some necessary links. In addition, it checks for the existence of the dCacheSetup file, the host and server keys, and the correct JVM and cells versions. The dCache is not likely to come up if ./initPackage.sh reports an error.
cd jobs
./initPackage.sh
The purpose of the LocationManager is to store and distribute information about services and the corresponding port numbers. In addition, it steers network related actions like default routes and listen ports. Therefore, the LocationManager is the first Domain which has to be started. It is assumed not to be stopped or restarted while other components are alive.
The config directory contains a lm.config.temp template file which needs to be renamed to lm.config. The content should be sufficient for all central dCache services. New pool domain names have to be added manually (see below). Only defaultPoolsDomain is already predefined to be able to run the example setup without modifications.
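The authoritative syntax is given by the lm.config.temp template itself. Purely as a sketch, and assuming lm.config holds the same define/connect/defaultroute commands that the LocationManager's 'setup write' command persists (see the session further below), the predefined defaultPoolsDomain entry might look like this:

define defaultPoolsDomain
connect defaultPoolsDomain dCacheDomain
defaultroute defaultPoolsDomain dCacheDomain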
cd ../config
cp lm.config.temp lm.config

The LocationManagerDomain has to be started on the host which is specified in the config/dCacheSetup file.

jobs/lm start -telnet=22123

The telnet port may be chosen arbitrarily.

WARNING
The login into the lm is not yet part of the security scheme and therefore not really protected.
Configuration
The following procedure is necessary to add a new domain. The domain is assumed to contain pools only. Its name might be e.g. fallbackPoolsDomain, cdf-dac3poolDomain or anything else. When you later try to start the pool, these names must match. The word Domain is automatically appended for running domains, but must be entered explicitly here. Log into the LocationManager:
telnet <locationManagerHost> 22123

When you are done, be sure to stop the lm and restart it without the telnet option.
#
# specify user and password.
#
tlm-username-112 > exit
Shell Exit (code=0;msg=(0) )
Back to .. mode
.. > set dest lm
lm > define fallbackPoolsDomain
lm > connect fallbackPoolsDomain dCacheDomain
lm > defaultroute fallbackPoolsDomain dCacheDomain
lm > setup write
lm > exit
Back to .. mode
.. > exit
jobs/lm stop
jobs/lm start
The following central services are packed together into Domains as follows. Although the startup sequence of the Domains is not important, the fastest way is to follow this order:
- dCacheDomain PoolManager, Prestager
- httpdDomain httpd, (data) collector, billing
- doorDomain ftpDoor, dCapDoor, (unauthenticated) SshDoor
- pnfsDomain PnfsManager, Cleaner
- adminDoorDomain pamModule, AccessControlModule, (authenticated) sshDoor
All these services will find each other (sooner or later). One may log in and check the services:
#
# i) the following commands might be issued on different hosts.
# ii) at this point we assume that the LocationManager is already running.
#
cd jobs
./pnfs start
./httpd start
./dCache start
./door start
# ./adminDoor start

ssh -l trude -c blowfish -p ${sshPort from dCacheSetup} <doorHost>
slm-username-100 > ps -f
Cell List
------------------
skm A 0 2 SshKeyManager Ssh Key Manager
DCap A 0 2 LoginManager p=24125;c=diskCacheV111.doors.DCapDoor
System A 0 1 SystemCell doorDomain:IOrec=25772;IOexc=0;MEM=1125696
RoutingMgr A 0 1 RoutingManager RoutingMgr
slm-trude-110 A 0 2 StreamLoginCell trude@watphrakeo.desy.de/131.169.1.222
FTP A 0 2 LoginManager p=24126;c=diskCacheV111.cells.FTPDoor2
c-100 A 0 2 LocationMgrTunnel <>/Connected -> dCacheDomain
slm A 0 2 LoginManager p=24124;c=dmg.cells.services.StreamLoginCell
lm A 0 2 LocationManager ClientReady
slm-username-100 > route
Dest Cell Dest Domain Gateway Type
* dCacheDomain c-100 Domain
* * *@dCacheDomain Default
slm-username-100 > exit
Shell Exit (code=0;msg=(0) )
Back to .. mode
.. > set dest System@dCacheDomain
System@dCacheDomain > ps -f
l-100-Unknown-104 A 0 2 LocationMgrTunnel <io>/Accepted -> doorDomain
Prestager A 0 2 DummyStager Req=0;Err=0;
PoolManager A 0 2 PoolManagerV3 PoolManager
System A 0 1 SystemCell dCacheDomain:IOrec=25787;IOexc=0;MEM=791032
RoutingMgr A 0 1 RoutingManager RoutingMgr
c-102 A 0 2 LocationMgrTunnel <io>/Connected -> httpdDomain
c-101 A 0 2 LocationMgrTunnel <io>/Connected -> pnfsDomain
lm A 0 2 LocationManager ClientReady
l-100 A 0 2 LoginManager p=44250;c=dmg.cells.network.LocationMgrTunnel
System@dCacheDomain > route
Dest Cell Dest Domain Gateway Type
r-pal01-2 * *@genericPoolsDomain Wellknown
r-pal01-1 * *@genericPoolsDomain Wellknown
PnfsManager * *@pnfsDomain Wellknown
billing * *@httpdDomain Wellknown
w-pal01-0 * *@genericPoolsDomain Wellknown
* doorDomain l-100-Unknown-104 Domain
* pnfsDomain c-101 Domain
* httpdDomain c-102 Domain
The PoolManager is responsible for distributing data among the different pools. It uses the HSM storageClasses and the IP number of the data client for its decision where to cache a data file. The config directory contains a template (PoolManager.conf.temp) for the PoolManager configuration. It needs to be renamed to PoolManager.conf to become active. This file should only be regarded as the initial configuration. Changes to the setup have to go through the command line interface of the PoolManager itself. The 1.4.0 PoolManager hosts a rather sophisticated decision unit. We are preparing additional documentation for configuration and maintenance.
A pool domain may contain one to 'n' pools. One pool serves exactly one contiguous disk area. Let's say the pool base is /<poolBase>/. To prepare a pool:

cd /<poolBase>/
mkdir control data
cp /<dCacheBase>/config/setup.temp ./setup
vi ./setup

Adjust the specified variables according to your needs (a sketch of such a setup file follows the list below):
- set max diskspace maximum number of bytes the system is allowed to use. Abbreviations may be used, e.g. 'g' for gigabytes and 'm' for megabytes.
- rh set max active maximum number of restore processes allowed for this pool.
- st set max active maximum number of store processes allowed for this pool.
- mover set max active maximum number of dCache-client movers allowed for this pool.
- hsm set <hsmName> -command=... defines the external script or binary to store and restore datasets into/from the hsm (<hsmName>).
- hsm set <hsmName> -pnfs=<pnfsMountpoint> specifies the pnfs mountpoint. It can be used by the store/restore script to move the file from/to the HSM store.
- hsm set <hsmName> -<key>=<value> specifies any value(s) the store/restore script needs.
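For illustration, a pool ./setup file using the commands listed above might look like the following sketch; the size limit, the active-request limits, the hsm name 'osm' and the script path are placeholders to be adapted:

set max diskspace 100g
rh set max active 4
st set max active 4
mover set max active 10
hsm set osm -command=/<dCacheBase>/jobs/hsmcp.sh
hsm set osm -pnfs=/pnfs/fs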
All pools which are handled by a single pool domain have to be specified in the <poolDomainName>.poollist file in the config directory. The format of this file is:

<poolName>-1 /<poolBase>-1
<poolName>-2 /<poolBase>-2
<poolName>-3 /<poolBase>-3
The poolNames must match the names of the pools described in the config/PoolManager.conf file. poolBase is the name of the directory you created above.
The <poolDomainName>Domain must match the Domain name specified in the LocationManager description file (lm.config).
The pool is started by
jobs/pool -pool=<poolDomainName> start

Remark: Configuration templates in the config directory all fit together. Only the pathnames in config/defaultPools.poollist have to be adjusted according to the real <poolBase> directories. When adding additional pools, remember to add their routes to the lm.config file in order to allow them to communicate with the system.
Building the SRM interface to dCache
Within the grid community there is work being done on a common protocol for accessing mass storage systems. This is commonly referred to as the Storage Resource Manager (SRM) specification. By combining the Apache Tomcat server and dCache it is possible to run both in one process space and thus share many resources. A proof of concept implementation of this is available in the dCache archives. This is not a production ready system.
To activate the SRM:
- Install the apache tomcat server to the right place (document?)
- Create an environment variable "build_srm=yes"
- Follow the rest of the standard dCache build
- When editing the dCacheSetup file, add the following
java_options="-Dcatalina.base=/var/tomcat4 -Dcatalina.home=/var/tomcat4"When these steps are completed then execution of the dCache automatically starts an instance of the Tomcat server in a seperate thread.
Note: The classpath for the tomcat server and SRM servlet classes must also be specified in dCacheSetup. This isn't covered here, as it is installation dependent. Please make certain that you have all needed classes in your classpath; missing classes are a common mistake.
Deploying the Fermilab Kerberized FTP Door
The kerberized door (FtpDoorFnal.class) requires some extra parameters and setup in order to run and authenticate users. First the dCacheSetup file must be modified to pass additional parameters to the java runtime.
java="/usr/java/j2sdk1.4.0/bin/java -Djava.security.krb5.realm=FNAL.GOV \The kerberos door requires at least JVM 1.4 in order to operate. This is due to the inclusion of GSS API support in the new runtime. The parameters passed in must be configured as appropriate for the specific install.
-Djava.security.krb5.kdc=krb-fnal-1.fnal.gov \
-Djavax.security.auth.useSubjectCredsOnly=false \
-Djava.security.auth.login.config=/home/wellner/dev/Sandbox/Java/diskCacheV111/config/jgss.conf"
The jgss.conf file looks as follows:
com.sun.security.jgss.accept {
com.sun.security.auth.module.Krb5LoginModule required
debug=true
principal="ftp${/}orchard.fnal.gov@FNAL.GOV"
doNotPrompt=true
useKeyTab=true
keyTab="${/}etc${/}krb5.keytab"
storeKey=true;
};
The principal must be assigned by the kerberos administrator and installed using standard kerberos mechanisms. The keytab file looks as follows:

wellner: 8977 1530 / /pnfs/fnal.gov/usr/wellner
wellner@FNAL.GOV
bakken: 5406 1530 / /pnfs/fnal.gov/usr/cdfen
bakken@FNAL.GOV
kennedy: 1056 3200 / /pnfs/fnal.gov/usr/cdfen
kennedy@FNAL.GOV
Where the components are as follows:

localname: localId localGroup virtualPath physicalPath
principalName
In the config directory there is a file named door.batch. It contains an example of how to configure a door for operation. Uncomment this example and change the paths to ones suitable for your installation.