Method scores
The PSE method is designed to cover the output space, hence its highest possible score in output exploration and its low scores in optimization and input space exploration. As the method discovers patterns in the output space, the input values that lead to these patterns are available, but they give little insight into the model's sensitivity. Contrary to calibration-based methods, PSE is sensitive to the dimensionality of the output space: it maintains an archive of every output space location covered so far, which quickly becomes costly beyond 3 or 4 dimensions.
PSE handles stochasticity in the sense that each selected pattern is estimated by the median of the output values from several model executions.
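The median-based estimate can be illustrated with a small plain Scala sketch. The object and function names are hypothetical, and taking one median per output variable is an assumption made for the illustration, not a description of OpenMOLE's internals.

```scala
// Illustrative sketch only: hypothetical names, not OpenMOLE code.
object MedianSketch {
  // Median of a non-empty sequence of replicated output values.
  def median(values: Seq[Double]): Double = {
    require(values.nonEmpty, "at least one replication is needed")
    val sorted = values.sorted
    val n = sorted.size
    if (n % 2 == 1) sorted(n / 2)
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
  }

  // Assumption: replications are aggregated dimension by dimension,
  // giving one median per output variable.
  // All replications must have the same number of output values.
  def aggregate(replications: Seq[Vector[Double]]): Vector[Double] =
    replications.transpose.map(median).toVector
}
```

For instance, three replications of a model with two outputs would be reduced to a single two-dimensional pattern by calling MedianSketch.aggregate on them.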
The PSE method searches for diverse output values. As with all evolutionary algorithms, PSE generates new individuals through a combination of genetic inheritance from parent individuals and mutation. PSE (inspired by novelty search) selects the parents whose output values are rare compared to the rest of the population and to the previous generations. In order to evaluate the rarity of the output values, PSE discretises the output space, dividing it into cells. Each time a simulation is run and its output is known, a counter is incremented in the corresponding cell. PSE preferentially selects the parents whose associated cells have low counters. By selecting parents with rare output values, we try to increase the chances of producing new individuals with previously unobserved behaviours.
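The cell counters and the rarity-based selection described above can be sketched in plain Scala as follows. This is a minimal illustration under stated assumptions, not the OpenMOLE implementation: all names are hypothetical, and the binary tournament is only one possible way of preferring parents whose cells have low counters.

```scala
import scala.util.Random

// Illustrative sketch only: hypothetical names, not OpenMOLE code.
object PseSketch {
  // A cell is identified by one integer index per output dimension.
  type Cell = Vector[Int]

  // Map an output vector to its cell, given a lower bound and a step per dimension.
  def cellOf(output: Vector[Double], lows: Vector[Double], steps: Vector[Double]): Cell =
    output.indices.map(i => math.floor((output(i) - lows(i)) / steps(i)).toInt).toVector

  // Increment the counter of the cell hit by a new simulation output.
  def record(hits: Map[Cell, Int], cell: Cell): Map[Cell, Int] =
    hits.updated(cell, hits.getOrElse(cell, 0) + 1)

  // Assumption: binary tournament. Draw two candidates at random and keep
  // the one whose cell has been hit less often (ties go to the first draw).
  def selectParent[A](population: Vector[(A, Cell)], hits: Map[Cell, Int], rng: Random): A = {
    val a = population(rng.nextInt(population.size))
    val b = population(rng.nextInt(population.size))
    if (hits.getOrElse(a._2, 0) <= hits.getOrElse(b._2, 0)) a._1 else b._1
  }
}
```

In this sketch, a cell that has never been reached has an implicit counter of 0, so any individual that lands in a new cell wins every tournament it takes part in until that cell fills up.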
Typed signature
The PSE method can be typed as follows:
such that: PSE(M) = M(X)
with M the model, X the input space, Y the output space, and 𝓟(X) the power set of X (i.e. every subset of X, including X and ∅).
In other words: this function takes a model M (whose signature is X → Y) and finds a set of elements x of X whose images M(x) together cover, as fully as possible, the part of the output space that the model can reach, M(X).
PSE takes the following parameters:
genome: the model parameters, varying within their minimum and maximum bounds,
objectives: the observables measured for each simulation and within which we search for diversity, with a discretization step,
stochastic: the seed generator, which generates suitable seeds for the method. Mandatory if your model contains randomness. The generated seed for the model task is transmitted through the variable given as an argument of Stochastic (here myseed).
You will also need an evolutionary scheme; you can use SteadyStateEvolution, as described in Calibration.
To use PSE as the exploration method in OpenMOLE, use the PSEEvolution constructor like so:
    // seed declaration for random number generation
    val myseed = Val[Int]

    val exploration =
      PSEEvolution(
        evaluation = modelTask,
        parallelism = 10,
        termination = 100,
        genome = Seq(
          param1 in (0.0, 1.0),
          param2 in (-10.0, 10.0)),
        objectives = Seq(
          output1 in (0.0 to 40.0 by 5.0),
          output2 in (0.0 to 4000.0 by 50.0)),
        stochastic = Stochastic(seed = myseed)
      )
Here, param1 and param2 are inputs of the task that runs the model, and output1 and output2 are outputs of that same task. The number of inputs and outputs is unlimited.
Note that the method is subject to the curse of dimensionality on the output space: the number of possible output patterns (cells) grows exponentially with the number of output variables. With more than just a few output variables, the search space may become so big that the search takes too long to complete and the results take more memory than a modern computer can handle. Restricting the number of output variables to 2 or 3 also facilitates the interpretation of the results, making them easy to visualise.
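To give an order of magnitude, the number of cells in a bounded grid is the product of the number of cells along each output dimension. The plain Scala sketch below computes it; the names are hypothetical, and it assumes a regular grid restricted to the declared bounds, which may differ from how OpenMOLE handles values falling outside them.

```scala
// Illustrative sketch only: hypothetical names, not OpenMOLE code.
object GridSizeSketch {
  // Number of cells along one output dimension for a range [low, high] cut with a given step.
  def cellsPerDimension(low: Double, high: Double, step: Double): Long =
    math.ceil((high - low) / step).toLong

  // Total number of cells: the product of the per-dimension counts.
  // Each dimension is given as a (low, high, step) triple.
  def totalCells(dims: Seq[(Double, Double, Double)]): Long =
    dims.map { case (low, high, step) => cellsPerDimension(low, high, step) }.product
}
```

With the two objectives used in the example (0 to 40 by 5, and 0 to 4000 by 50), this gives 8 × 80 = 640 cells. Adding a third output with the same discretization as the second one would already raise the total to 51,200 cells.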
The PSE method is described in the following scientific paper: Guillaume Chérel, Clémentine Cottineau and Romain Reuillon, "Beyond Corroboration: Strengthening Model Validation by Looking for Unexpected Patterns", PLOS ONE 10(9), 2015.