Documentation > Advanced Concepts
The various methods available in OpenMOLE make an extensive use of genetic algorithms (GA).
For instance, it is the case for the model calibration method (which is an optimization problem), or the search for output diversity with the PSE method (which boils down to a GA with a novelty incentive).
GAs can be smartly distributed on grid environments using an island scheme, and are able to deal with stochastic models.
The different calibration methods in OpenMOLE use GAs to explore the parameter space of a simulation model, looking for parameter sets that will produce outputs reaching one or several given objectives. Objectives functions, also called fitness functions, compute quantities from the model outputs that have to be minimized or maximized. They are a quantification of the optimal model output you are looking for.
A common optimization problem is data fitting. In this particular case, the objective function could compute the distance between simulation results and data, a classical example is the Squared Error function.
If you want your model to reproduce several characteristics (sometimes called stylised facts), you need several objectives, each of them quantifying the similarity between your model outputs and the characteristics you want to reproduce.
To calibrate your model, you need to define:
Notice that the objectives are given as a sequence of model outputs variables to minimize. So if you want to reach specific target values, like Pi and 42, you can use the
NB: in this case the results in the saved file will be the difference between the outputs of the model and your objectives.
Obviously, maximization problems are performed by taking the opposite of variables as objectives. You may use a
Contents
GAs can be smartly distributed on grid environments using an island scheme, and are able to deal with stochastic models.
About Calibration and GA ðŸ”—
OpenMOLE provides advanced methods to help you calibrate your model. These methods automatically generate workflows to explore the parameter space of your model towards the best parameter set, according to a previously defined criterion or objective. This is commonly addressed in the literature as a calibration, or optimization, problem.The different calibration methods in OpenMOLE use GAs to explore the parameter space of a simulation model, looking for parameter sets that will produce outputs reaching one or several given objectives. Objectives functions, also called fitness functions, compute quantities from the model outputs that have to be minimized or maximized. They are a quantification of the optimal model output you are looking for.
A common optimization problem is data fitting. In this particular case, the objective function could compute the distance between simulation results and data, a classical example is the Squared Error function.
If you want your model to reproduce several characteristics (sometimes called stylised facts), you need several objectives, each of them quantifying the similarity between your model outputs and the characteristics you want to reproduce.
To calibrate your model, you need to define:
- the genome of your model, i.e. the parameters to be calibrated. They are the dimensions of the parameter space that will be explored by the GA. The GA will try different genomes, and keep the best one discovered yet.
- the objectives you want to reach, expressed as variables to be minimized.
- a termination criterion, to stop the method eventually.
Dummy Model Optimization Example ðŸ”—
This workflow optimizes a dummy model using the generational NSGA II multi-objective algorithm. You can replace the instances of model by your own model, and adapt the variation range of the input variables. If you are not familiar with parameter tuning using GA, you should first consult the tutorial explaining how to calibrate a NetLogo model with a GA. //model inputs
val x = Val[Double]
val y = Val[Double]
val s = Val[String]
//model outputs
val o1 = Val[Double]
val o2 = Val[Double]
val model =
ScalaTask("""
val o1 = x
val o2 =
s match {
case "apple" => y
case "banana" => -y
case "strawberry" => -2 * y
}
""") set (
inputs += (x, y, s),
outputs += (o1, o2)
)
// Construction of the workflow orchestrating the genetic algorithm
// genome is the inputs prototype and their variation ranges
// objectives are the objectives to minimize
// termination is the termination criterion, here it runs for 100 generations. A time limit could be set as an
// alternative by replacing 100 by 1 hour (hour is a duration type understood by OpenMOLE).
// the parallelism specifies how many evaluation are concurrently submitted to the execution environment
val evolution =
NSGA2Evolution(
genome = Seq(
x in (0.0, 1.0),
y in (0.0, 1.0),
s in List("apple", "banana", "strawberry")),
objectives = Seq(o1, o2),
evaluation = model,
parallelism = 10,
termination = 100
)
// Construction of the complete workflow with the execution environment, and the hook.
// A hook is attached to save the population of solutions to workDirectory /evolution on the local machine running OpenMOLE
// Here the generated workflow will run using 4 threads of the local machine.
evolution hook (workDirectory / "evolution") on LocalEnvironment(4)
Notice that the objectives are given as a sequence of model outputs variables to minimize. So if you want to reach specific target values, like Pi and 42, you can use the
DeltaTask
and plug it after your model:
//model inputs
val x = Val[Double]
val y = Val[Double]
val s = Val[String]
//model outputs
val o1 = Val[Double]
val o2 = Val[Double]
val model =
ScalaTask("""
val o1 = x
val o2 =
s match {
case "apple" => y
case "banana" => -y
case "strawberry" => -2 * y
}
""") set (
inputs += (x, y, s),
outputs += (o1, o2)
)
NSGA2Evolution(
genome = Seq(
x in (0.0, 1.0),
y in (0.0, 1.0),
s in List("apple", "banana", "strawberry")),
objectives = Seq(o1, o2),
evaluation = model -- DeltaTask(o1 -> math.Pi, o2 -> 42),
parallelism = 10,
termination = 100
) hook (workDirectory / "evolution")
NB: in this case the results in the saved file will be the difference between the outputs of the model and your objectives.
Obviously, maximization problems are performed by taking the opposite of variables as objectives. You may use a
ScalaTask
in place of the DeltaTask
to perform such a small computation in the workflow.
//model inputs
val x = Val[Double]
val y = Val[Double]
val s = Val[String]
//model outputs
val o1 = Val[Double]
val o2 = Val[Double]
val model =
ScalaTask("""
val o1 = x
val o2 =
s match {
case "apple" => y
case "banana" => -y
case "strawberry" => -2 * y
}
""") set (
inputs += (x, y, s),
outputs += (o1, o2)
)
val maximize = ScalaTask("o1 = -o1") set ((inputs, outputs) += (o1, o2))
NSGA2Evolution(
genome = Seq(
x in (0.0, 1.0),
y in (0.0, 1.0),
s in List("apple", "banana", "strawberry")),
objectives = Seq(o1, o2),
evaluation = model -- maximize,
parallelism = 10,
termination = 100) hook (workDirectory / "evolution")
Outputs of NSGA2Evolution ðŸ”—
As an output, the method will produce in output a population file for each generation in the directory provided to theSavePopulationHook
, named by the generation number as populationN.csv
.
Each csv file contains a column with the generation number, the values of parameters, the median value of the objectives at each point, and in the variable evolution$samples
, the number of runs of the model used for the evaluation (in the case of stochastic models).