European Grid Infrastructure

Scale up on EGI

The European Grid Infrastructure (EGI) is a grid infrastructure that gathers computing resources from all over the world. It is a very powerful computing environment, but it is technically challenging to use. OpenMOLE makes it very simple to benefit from the grid.

Setting up an EGI authentication

You first need to import your EGI certificate into OpenMOLE, as described in the GUI guide.

Authentication in console mode

!!! This is not recommended !!!

In the console, execute the following:

EGIAuthentication() = P12Certificate(password, "/path/to/your/certificate.p12")

You only need to execute this operation once. OpenMOLE will store this information in your preferences folder.

Submitting jobs to EGI

Mandatory parameter

In order to use EGI you must be registered in a Virtual Organisation (VO). The VO is the only compulsory parameter when creating an EGI environment within OpenMOLE. In the following example the VO biomed is specified, but you can use any VO:

val env = EGIEnvironment("biomed")

Optional parameters

Other optional parameters are available when defining an EGIEnvironment:

cpuTime: the maximum duration of the job in terms of CPU consumption, for instance 1 hour,
openMOLEMemory: the memory allocated to the OpenMOLE runtime on the execution node. If you run external tasks, you can reduce the memory of the OpenMOLE runtime to 256MB in order to leave more memory for your program on the execution node, for instance openMOLEMemory = 256 megabytes,
debug: generate debugging information about the execution node (hostname, date, memory, maximum number of file descriptors, user proxy, ...). Defaults to debug = false.

Here is an example using these parameters:

val env =
  EGIEnvironment(
    "biomed",
    cpuTime = 4 hours,
    openMOLEMemory = 200 megabytes
  )

Advanced parameters

The EGIEnvironment also accepts a set of more advanced options:

service: a DIRAC REST API,
group: the name of the DIRAC group,
bdii: the BDII to use for listing the resources accessible from this VO. When this field is left unspecified, the BDII in your preference file is used,
voms: the VOMS server used for the authentication,
fqan: additional flags for authentication,
setup: the setup to use on the DIRAC server. It is set to "Dirac-Production" by default.
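
As an illustration, advanced options are passed as named arguments, in the same way as the optional parameters. The sketch below only uses values stated on this page (the default setup name and the debug flag); passing setup as a plain string is an assumption, and the other options (service, group, bdii, voms, fqan) are omitted because their values depend on your VO and DIRAC server.

```scala
// A minimal sketch, assuming `setup` accepts a plain string.
// "biomed" stands for any VO you are registered in.
val env =
  EGIEnvironment(
    "biomed",
    setup = "Dirac-Production", // same value as the documented default
    debug = true                // report details about the execution node
  )
```

Spelling out the default setup has no effect on its own; it is shown only to indicate where a different setup name would go.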

Grouping

You should also note that a batch environment is generally not suited to short tasks, i.e. tasks lasting less than 1 hour on a grid. If your tasks are short, you can group several executions with the keyword by in your workflow. For instance, the workflow below groups the executions of model by 100 in each job submitted to the environment:

// Define the variables that are transmitted between the tasks
val i = Val[Double]
val res = Val[Double]

// Define the model, here it is a simple task executing "res = i * 2", but it can be your model
val model =
  ScalaTask("val res = i * 2") set (
    inputs += i,
    outputs += (i, res)
  )

// Define a local environment
val env = LocalEnvironment(10)

// Run the model on the local environment, grouping 100 executions per job
DirectSampling(
  evaluation = model on env by 100 hook display,
  sampling = i in (0.0 to 1000.0 by 1.0)
)
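
The same grouping can be applied to jobs sent to the grid. The sketch below reuses the workflow above and only swaps the local environment for the EGIEnvironment definition shown earlier on this page; the VO name and the parameter values are taken from those examples and should be adapted to your own setup.

```scala
// Variables transmitted between the tasks
val i = Val[Double]
val res = Val[Double]

// Placeholder model computing "res = i * 2"; replace it with your own model
val model =
  ScalaTask("val res = i * 2") set (
    inputs += i,
    outputs += (i, res)
  )

// EGI environment; "biomed" stands for any VO you are registered in
val env =
  EGIEnvironment(
    "biomed",
    cpuTime = 4 hours,
    openMOLEMemory = 200 megabytes
  )

// Each job submitted to the grid runs 100 executions of the model
DirectSampling(
  evaluation = model on env by 100 hook display,
  sampling = i in (0.0 to 1000.0 by 1.0)
)
```

If the range includes both bounds, this sampling explores 1001 values of i, so grouping by 100 would yield 11 jobs, the last one holding a single execution. Since cpuTime bounds the duration of a whole job, it should presumably be large enough to cover all the executions grouped in that job.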