Meanwhile, you have to package your code using CARE, as explained in the Native Packaging section. What follows shows how to handle your packaged model within OpenMOLE, using a small R script called script.R as an example:
args<-commandArgs(trailingOnly = TRUE)
data<-read.csv("data.csv",header=T,sep=",")
result<-as.numeric(args[1])*data
write.csv(result,"result.csv", row.names=FALSE)
With an example data.csv:
h1,h2,h3
7,8,9
9,7,3
1,1,1
This script reads a file called data.csv, multiplies its content by a number provided on the command line and writes the result to an output file called result.csv.
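To make the expected numbers concrete, here is a minimal sketch that reproduces the multiplication on the example data.csv with awk instead of R. It is only a stand-in for checking the arithmetic, not the script from this section. The care_demo scratch folder and the expected.csv file name are assumptions made for this illustration, and R's write.csv may format the file differently (for example by quoting the header).

```shell
# Hypothetical scratch folder so the sketch does not touch your real files.
mkdir -p care_demo

# Recreate the example input from the text.
cat > care_demo/data.csv <<'EOF'
h1,h2,h3
7,8,9
9,7,3
1,1,1
EOF

factor=4  # same value as the one passed after --args in the text

# Keep the header row unchanged, multiply every other cell by the factor.
awk -F, -v OFS=, -v k="$factor" '
  NR == 1 { print; next }
  { for (c = 1; c <= NF; c++) $c = $c * k; print }
' care_demo/data.csv > care_demo/expected.csv

cat care_demo/expected.csv
```

With a factor of 4, the three data rows become 28,32,36 then 36,28,12 then 4,4,4.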
To call this script from the command line, assuming R is installed on your system, type:
R -f script.R --slave --args 4
Once the script is up and running, remember that the first step to run it from OpenMOLE is to package it. This is done using CARE on your system:
care -r ~ -o R.tgz.bin R -f script.R --slave --args 4
Notice how the command line is identical to the original one. The call to the R script remains unchanged, as CARE and its options are inserted at the beginning of the command line.
The result of the previous command line is a file named R.tgz.bin. It is an archive containing a portable version of your execution. It can be extracted and executed on any other Linux platform.
The method described here packages everything, including R itself! Therefore there is no need to install R on the target execution machine. All that is needed is for the remote execution host to run Linux, which is the case for the vast majority of (decent) high performance computing environments.
Packaging an application is done once and for all by running the original application against CARE. CARE's re-execution mechanisms allow you to change the original command line when re-running your application. This way you can update the parameters passed on the command line and the re-execution will be impacted accordingly. As long as all the configuration files, libraries and other dependencies were used during the original execution, there is no need to package the application multiple times with different input parameters.
You can now upload this archive, along with a data.csv file, to a subfolder named data in your OpenMOLE workspace. Let's now explore the complete combination of all the data files and a numeric parameter with OpenMOLE. The input data files are located in data and the result files are written to a folder called result. The second input parameter is a numeric value i ranging from 1 to 10. The corresponding OpenMOLE script looks like this:
// Declare the variables
val i = Val[Double]
val input = Val[File]
val inputName = Val[String]
val output = Val[File]

// R task
// "workDirectory" is automatically set to the location of your .oms script in your OpenMOLE workspace
val rTask = CARETask(workDirectory / "data/R.tgz.bin", "R --slave -f script.R --args ${i}") set (
  (inputs, outputs) += (i, inputName),
  inputFiles += (input, "data.csv"),
  outputFiles += ("result.csv", output)
)

val exploration =
  ExplorationTask(
    (i in (1.0 to 10.0 by 1.0)) x
    (input in (workDirectory / "data").files withName inputName)
  )

val copy = CopyFileHook(output, workDirectory / "result" / "${inputName}-${i}.csv")

exploration -< (rTask hook copy hook ToStringHook())
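The exploration in the script above pairs every value of i with every file found in data. As a rough illustration of that cross product outside OpenMOLE, the shell sketch below enumerates the same pairs with two nested loops. The sampling_demo folder and the second input file other.csv are hypothetical and only serve the illustration; the sketch does not run R or CARE.

```shell
# Hypothetical scratch layout standing in for the data folder of the workspace.
mkdir -p sampling_demo/data
printf 'h1,h2,h3\n7,8,9\n' > sampling_demo/data/data.csv
printf 'h1,h2,h3\n1,1,1\n' > sampling_demo/data/other.csv  # hypothetical second input

# Cross product: each value of i is paired with each file in data.
for i in $(seq 1 10); do
  for f in sampling_demo/data/*; do
    echo "i=$i input=$(basename "$f")"
  done
done > sampling_demo/combinations.txt

# One line per pair, that is, one task execution per line.
wc -l < sampling_demo/combinations.txt
```

With 10 values of i and 2 files, the sketch lists 20 pairs; in the OpenMOLE script each such pair corresponds to one execution of rTask.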
The CARETask performs two actions: it first unarchives the CARE container by running R.tgz.bin. Then the actual execution takes place as a second command. Note that for each execution of the CARETask, any command starting with / is relative to the root of the CARE archive, and any other command is executed in the current directory. The current directory defaults to the original packaging directory.
Several notions from OpenMOLE are reused in this example. If you're not too familiar with Hooks or Samplings, check the relevant sections of the documentation.
The CARETask differs from the SystemExecTask only in the archive given as its first parameter.