    Hello, this is the user documentation for the ADAGE GFarm plugin.

    The plugin shipped with the tarball distribution of ADAGE is designed for GFarm v1. To deploy the newer GFarm v2, check out the ADAGE project from svn instead. All GFarm deployments are performed as a non-privileged user (private mode).

    Describing a GFarm application

    A GFarm application is made of 3 or 4 roles: the metadata server (gfarm), the metadata cache server (agent, no longer used as of GFarm v2), the GFarm file system node (gfsd) and the client. Please refer to the GFarm documentation to understand how these roles interact.

    A GFarm application description example is available in the adage/tests/gfarm directory (source directory) and it starts with:

    <?xml version="1.0"?>
    <!DOCTYPE GFARM_application>
    <GFARM_application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="../../src/plugin/gfarm//GFARM_application.xsd">

    Describing the Metadata Server (gfarm)

    The metadata server is a single, centralized server that must appear exactly once in the application description. It is declared as follows:

    <server
     name="metadata_server"
     root="$ROOT/bin"
     prefix="."
     port_master="10602"
     port_gfmd="10601"
     port_gfsd="10600"
    />
    • (required) name: the name of the Metadata Server. It has to be unique.
    • (optional) root: the path where the server binary is found. If not set, $PATH is used.
    • (optional) prefix: IMPORTANT. Argument passed to config-gfarm. It should be set to “.” so that the default local ADAGE directory is used. If not set, gfarm falls back to its own default directory, “/”.
    • (optional) port_master, port_gfmd, port_gfsd (for v1 only): arguments passed to config-gfarm. If not set, gfarm uses its default values. If you are deploying GFarm in user mode, you probably need to specify ports above 1024.
    • (optional) config: a path in a shared (NFS) location where the gfarm.conf file is copied.
    • (optional) args: additional arguments for this role. Use this field to pass specific parameters to gfarm (see 'config-gfarm -t' for the complete list of options).
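
    To illustrate the optional config attribute, here is a sketch of the same declaration with a shared directory added. The path /home/alice/gfarm-shared is a made-up placeholder, not something shipped with ADAGE; replace it with a directory that is really visible from all your nodes.

    ```xml
    <!-- Sketch only: /home/alice/gfarm-shared is a hypothetical NFS-mounted directory -->
    <server
     name="metadata_server"
     root="$ROOT/bin"
     prefix="."
     port_master="10602"
     port_gfmd="10601"
     port_gfsd="10600"
     config="/home/alice/gfarm-shared"
    />
    ```

    With config set, the other roles can pick up gfarm.conf from the shared location instead of copying it with scp.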

    The remaining roles (agent, gfsd, client) are declared as profiles that can be instantiated several times through the cardinality attribute. The list of processes is opened with:

    <procs>

    Describing a Metadata Cache Server (agent, for v1 only)

    GFarm agents first need to retrieve the gfarm.conf file generated by the Metadata Server. They copy it either with secure copy (scp) or from the config path, if one is set in the metadata server description above.

     <proc
      name="metadata_cache_server"
      role="agent"
      prefix="."
      port_master="10603"
      cardinality="1"
     />
    • (required) name: the name of the Metadata Cache Server. It has to be unique.
    • (required) role: has to be set to agent.
    • (required) cardinality: the number of instances of this profile to create.
    • (optional) prefix: IMPORTANT. Argument passed to config-agent. It should be set to “.” so that the default local ADAGE directory is used. If not set, gfarm falls back to its own default directory, “/”.
    • (optional) root: the path where the cache server binary is found. If not set, $PATH is used.
    • (optional) port_master: argument passed to config-agent. If not set, gfarm uses its default value.
    • (optional) args: additional arguments for this role. Use this field to pass specific parameters to gfarm (see 'config-agent -t' for the complete list of options).

    Describing a GFarm File System node (gfsd)

    Every File System node is attached to a rendez-vous, which is either the Metadata Server or a given Metadata Cache Server (v1 only). To establish this link, a File System node first copies the GFarm configuration file generated by its rendez-vous, using a remote copy method (e.g. scp).

    <proc
      name="fs_node"
      role="gfsd"
      prefix="."
      cardinality="1"
      agent="metadata_cache_server"
     />
    • (required) name: the name of the File System node. It has to be unique.
    • (required) role: has to be set to gfsd.
    • (required) agent: the name of the rendez-vous to connect to. The attribute is called agent for historical reasons.
    • (optional) prefix: IMPORTANT. Argument passed to config-gfarm. It should be set to “.” so that the default local ADAGE directory is used. If not set, gfarm falls back to its own default directory, “/”.
    • (required) cardinality: the number of instances of this profile to create.
    • (optional) root: the path where the gfsd binary is found. If not set, $PATH is used.
    • (optional, for v2 only) port_master: argument passed to config-agent. If not set, gfarm uses its default value.
    • (optional) args: additional arguments for this role. Use this field to pass specific parameters to gfarm (see 'config-gfsd -t' for the complete list of options).
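
    Since a rendez-vous can also be the Metadata Server itself, the following sketch shows what a profile of four File System nodes attached directly to it might look like (the only option with GFarm v2, which has no agents). It assumes that the agent attribute accepts the name given to the server element, here "metadata_server" as in the example above; the cardinality of 4 is an arbitrary illustration.

    ```xml
    <!-- Sketch: four File System nodes attached directly to the Metadata Server.
         Assumption: agent takes the name of the <server> element declared earlier. -->
    <proc
      name="fs_node"
      role="gfsd"
      prefix="."
      cardinality="4"
      agent="metadata_server"
    />
    ```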

    Describing a Client

    Like File System nodes, every client is attached to a rendez-vous, which is either the Metadata Server or a given Metadata Cache Server (v1 only). It also has to retrieve the right GFarm configuration file from its rendez-vous. This configuration file is stored in the local ADAGE directory of the node ($destdir/$user/adage-$adagepid/gf_client_0/0/gf_client_0_p/0/etc) and is selected through the GFARM_CONFIG_FILE environment variable. $destdir may change depending on the node; it is usually set to the local /tmp directory. A copy of the configuration file is also available in the $destdir/$user/adage-$adagepid/share folder, for use by other applications (see the GFarm+JuxMem tutorial for further information).

    <proc
      name="gf_client"
      role="client"
      binary="gfmkdir"
      args="adagetest"
      cardinality="1"
      agent="metadata_cache_server"
     />
    • (required) name: the name of the Client node. It has to be unique. If you plan to use the JuxMem basic tutorial with the GFarm feature, this is the name you need to enter when asked.
    • (required) role: has to be set to client.
    • (required) agent: the name of the rendez-vous to connect to. The attribute is called agent for historical reasons.
    • (required) cardinality: the number of instances of this profile to create.
    • (optional) binary: the binary to run. If not set, nothing is started. You can specify a binary that successfully does nothing (/bin/true) if you want to deploy JuxMem afterwards.
    • (optional) args: additional arguments for this role.
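
    As a sketch of the /bin/true case mentioned above, a client profile that only sets up the GFarm configuration on its node, without running any GFarm command, could look like this. The other attribute values are simply reused from the example above.

    ```xml
    <!-- Sketch: the client retrieves its gfarm.conf, then runs a binary that does nothing -->
    <proc
      name="gf_client"
      role="client"
      binary="/bin/true"
      cardinality="1"
      agent="metadata_cache_server"
    />
    ```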

    Your GFarm application description is now complete; you can close the file:

    </procs>
    </GFARM_application>
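
    For reference, the fragments shown in this section can be put together into one file, as sketched below for GFarm v1. The nesting (the server element first, followed by the procs list) is inferred from the order in which the fragments appear on this page; the example shipped in adage/tests/gfarm remains the authoritative version.

    ```xml
    <?xml version="1.0"?>
    <!DOCTYPE GFARM_application>
    <GFARM_application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="../../src/plugin/gfarm//GFARM_application.xsd">

      <!-- Exactly one Metadata Server -->
      <server name="metadata_server" root="$ROOT/bin" prefix="."
              port_master="10602" port_gfmd="10601" port_gfsd="10600"/>

      <procs>
        <!-- v1 only: Metadata Cache Server -->
        <proc name="metadata_cache_server" role="agent" prefix="."
              port_master="10603" cardinality="1"/>

        <!-- File System node, attached to the agent above -->
        <proc name="fs_node" role="gfsd" prefix="."
              cardinality="1" agent="metadata_cache_server"/>

        <!-- Client running "gfmkdir adagetest" through the same agent -->
        <proc name="gf_client" role="client" binary="gfmkdir" args="adagetest"
              cardinality="1" agent="metadata_cache_server"/>
      </procs>
    </GFARM_application>
    ```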

    Getting started

    Deploying GFarm

    GFarm v2 users: set the GFarm version attribute to “2” in the tests/gfarm/ctrl-params-spec.xml file. PostgreSQL users: set the database type to “gfarm-pgsql” in the same file. If you are NOT using the OAR2(Grid) scheduler, now is the time to create your resource file (see tests/nodes.res for an example). Then type:

    <xterm> ./src/adage -a tests/gfarm/appl-gfarmv1.xml -c tests/gfarm/ctrl-params-ssh.xml -r tests/nodes.res -x </xterm>

    If you are using the OAR2(Grid) scheduler, you can pass your reservation number directly. Grid5000 users should pass the tests/gfarm/ctrl-params-oarsh.xml file to force the use of OAR2 tools.

    <xterm> ./src/adage -a tests/gfarm/appl-gfarmv1.xml -c tests/gfarm/ctrl-params-oarsh.xml -j 203580 -x </xterm>

    The deployment ends with a report indicating hosts and PIDs:

    <xterm>
    DBG[scripts_generator.cc:73] generate(): hosts_pid[paraquad64.rennes.grid5000.fr] = 14783 14777
    DBG[scripts_generator.cc:73] generate(): hosts_pid[paraquad39.rennes.grid5000.fr] = 14346
    DBG[scripts_generator.cc:73] generate(): hosts_pid[paraquad24.rennes.grid5000.fr] = 12743
    DBG[scripts_generator.cc:73] generate(): hosts_pid[paraquad30.rennes.grid5000.fr] = 12609
    </xterm>

    One node (paraquad64.rennes.grid5000.fr) has 2 PIDs, corresponding to the Metadata Server and the LDAP server. The get_status.sh script can be used to check that everything is running. You can also connect to the node hosting the client, set the GFARM_CONFIG_FILE environment variable to the full path of the client's configuration file ($destdir/$user/adage-$adagepid/share/gfarm.conf) and check that basic gfarm commands such as gfhost or gfls work.

    You can also use the clean-up.sh script to stop all the processes.

    What this plugin really does

    Starting from the GFarm application description, this plugin behaves as follows:

    • On each node, a local directory is created, using the ADAGE naming convention ($destdir/$user/adage-$adagepid/$pg_id/$i/$proc_id/$j). If you have not set the prefix option in the GFarm description, this directory is used to store the copy of the appropriate GFarm configuration file and is also passed as the --prefix argument to the config-{gfarm,agent,gfsd} binary.
    • [only for GFarm v1] Metadata Cache Servers (agents) wait for the initialization of the Metadata Server (gfarm) before copying the GFarm configuration file.
    • File System nodes (gfsd) and clients wait for their rendez-vous (metadata server for GFarm v2 or agent for GFarm v1) before copying the GFarm configuration file.
    • [only for GFarm v1] The GFarm user home directory is created by the plugin using the gfmkdir gfarm:~ command.
