ABC-SMC¶
ABC-SMC algorithms for Bayesian model selection.
-
class
abcsmc.
ABCLoader
(data_store: abcsmc.loader.SQLDataStore)¶ Bases:
object
Load ABC results from database and analyse.
- Parameters
data_store (DataStore) – The datastore provides the database’s tables as pandas dataframes. Can be a SQLDataStore or pandas.HDFStore.
-
average_mass_at_tround_truth
()¶ Averaged posterior probabilities, grouped by
group_parameters
.
-
property
confusion_matrices_table
¶ Confusion matrices.
-
confusion_matrix_dict
()¶ Confusion matrices as dict, with keys indicating the sweep parameters.
-
property
group_parameters
¶ Parameters for grouping ABC sweeps.
-
property
max_nr_populations
¶ Maximum number of populations.
-
maximum_a_posteriori
()¶ MAP estimates, grouped by
group_parameters
.
-
property
maxs
¶ Maxima of the results, grouped by the
group_parameters
.
-
means
()¶ Means of the results, grouped by the
group_parameters
.
-
property
model_names
¶ Unique names of the models found in the database.
-
particles_of_population
(abc_smc_id: int, model_name: str, t: int)¶ Return the particles of a given population. Useful if the posterior parameters are of interest.
- Parameters
abc_smc_id (int) – ID of the ABCSMC run.
model_name (str) – Name of the model.
t (int) – Population number.
- Returns
particles – The particles of the chosen population.
- Return type
DataFrame
-
results
()¶ Final results of the ABC runs.
-
terminated_abc_smc_ids
()¶ IDs of already terminated ABCSMC runs.
-
class
abcsmc.
ABCSMC
(models: List[Callable[[util.parameters.Parameter], dict]], model_prior_distribution: util.random_variables.RV, model_perturbation_kernel: util.random_variables.ModelPerturbationKernel, parameter_given_model_prior_distribution: List[util.random_variables.Distribution], adaptive_parameter_perturbation_kernels: List[Callable[[int, dict], util.random_variables.Kernel]], distance_function: abcsmc.distance_functions.DistanceFunction, eps: abcsmc.epsilon.Epsilon, nr_particles: int, mapper=<class 'map'>, debug: bool = False, max_nr_allowed_sample_attempts_per_particle: int = 500, min_nr_particles_per_population: int = 1)¶ Bases:
object
Approximate Bayesian Computation - Sequential Monte Carlo (ABCSMC).
This is an implementation of an ABCSMC algorithm similar to the one described in reference 1.
- Parameters
models (List[Callable[[Parameter], dict]]) –
Calling models[m](par) returns the calculated summary statistics of model m with the corresponding parameters par. Each callable thus represents one single model.
model_prior_distribution (RV) – A random variable giving the prior weights of the model classes. If the prior is uniform over the model classes, this is something like RV("randint", 0, len(models)).
model_perturbation_kernel (ModelPerturbationKernel) – Kernel which governs the probability of switching the model for a given sample.
parameter_given_model_prior_distribution (List[Distribution]) – A list of prior distributions for the models’ parameters. Each list entry is the prior distribution for the corresponding model.
adaptive_parameter_perturbation_kernels (List[Callable[[int, dict], Kernel]]) –
A list of functions mapping (t, stat) -> Kernel, where t is the population number and stat is a dictionary of summary statistics. E.g. stat['std']['parameter_1'] is the standard deviation of parameter_1.
Warning
If a model has only one particle left, the standard deviation is zero.
This callable is called at the beginning of a new population with the statistics dictionary from the last population to determine the new parameter perturbation kernel for the next population.
distance_function (DistanceFunction) – Measures the distance of the tentatively sampled particle to the measured data.
eps (Epsilon) – Returns the current acceptance epsilon. This epsilon changes from population to population. The eps instance provides the strategy according to which to change it.
mapper (map like) – A callable which behaves like the built-in map function, i.e. mapper(f, args) takes a callable f and applies it to the arguments in the list args. This mapper is used for particle sampling. It can be a distributed mapper such as the parallel.sge.SGE class.
debug (bool) – Whether to output additional debug information.
max_nr_allowed_sample_attempts_per_particle (int) – The maximum number of sample attempts allowed for each particle. If this number is reached, the sampling for a particle is stopped. Hence, a population may return with fewer particles than it started with. This is an approximation to the ABCSMC algorithm which ensures that the algorithm terminates.
min_nr_particles_per_population (int) – Minimum number of samples which have to be accepted for a population. If this number is not reached, the algorithm stops. This option, together with the
max_nr_allowed_sample_attempts_per_particle
ensures that the algorithm terminates. This parameter determines to which extent the ABCSMC algorithm is approximated.
- 1
Toni, Tina, and Michael P. H. Stumpf. “Simulation-Based Model Selection for Dynamical Systems in Systems and Population Biology.” Bioinformatics 26, no. 1 (2010): 104–10. doi:10.1093/bioinformatics/btp619.
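As a rough illustration of the models and mapper arguments described above, the following sketch defines two toy model callables that map a parameter dictionary to a dictionary of summary statistics and evaluates them with the built-in map. The model names, parameter keys ("mu", "width", "seed") and statistic keys are invented for illustration, and nothing here calls the abcsmc package itself.

```python
import random
import statistics


def gaussian_model(par):
    """Toy model: draw 50 normal values and summarise them."""
    rng = random.Random(par.get("seed", 0))
    draws = [rng.gauss(par["mu"], 1.0) for _ in range(50)]
    return {"mean": statistics.fmean(draws), "std": statistics.stdev(draws)}


def uniform_model(par):
    """Toy model: draw 50 uniform values and summarise them."""
    rng = random.Random(par.get("seed", 0))
    draws = [rng.uniform(0.0, par["width"]) for _ in range(50)]
    return {"mean": statistics.fmean(draws), "std": statistics.stdev(draws)}


# One callable per model; models[m](par) yields the summary statistics of model m.
models = [gaussian_model, uniform_model]
stats_model_0 = models[0]({"mu": 2.0})


def evaluate(job):
    """Evaluate one (model index, parameter) pair."""
    m, par = job
    return models[m](par)


# Any callable with the calling convention of the built-in map can act as mapper.
mapper = map
jobs = [(0, {"mu": 2.0}), (1, {"width": 4.0})]
results = list(mapper(evaluate, jobs))
```

Swapping mapper for a distributed map-like callable would leave the rest of the sketch unchanged, which is the point of keeping the interface identical to the built-in map.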
-
do_not_stop_when_only_single_model_alive
()¶ Calling this method causes the ABCSMC run to continue even if only a single model is still alive. This is useful if the interest lies in estimating the model parameters rather than in performing model selection.
The default behavior is to stop when only a single model is alive.
-
run
(nr_samples_per_particle: List[int], minimum_epsilon: float) → abcsmc.storage.History¶ Run the ABCSMC model selection. This method can be called many times. It makes another step continuing where it has stopped before.
It stops when the maximum number of populations is reached or the minimum_epsilon value is reached.
- Parameters
nr_samples_per_particle (List[int]) –
The length of the list determines the maximal number of populations.
The entries of the list are the numbers of iterated simulations; in the notation from reference 2, these are the \(B_t\). Usually, the entries are all ones: nr_samples_per_particle = [1] * nr_populations.
minimum_epsilon (float) – Stop if epsilon is smaller than the minimum epsilon specified here.
- 2
Toni, Tina, David Welch, Natalja Strelkowa, Andreas Ipsen, and Michael P. H. Stumpf. “Approximate Bayesian Computation Scheme for Parameter Inference and Model Selection in Dynamical Systems.” Journal of The Royal Society Interface 6, no. 31 (2009): 187–202. doi:10.1098/rsif.2008.0172.
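The two stopping criteria of run can be pictured with a small stand-alone sketch. It assumes a fixed, hypothetical epsilon schedule and assumes that the population whose epsilon falls below minimum_epsilon is still processed before stopping; the actual order of checks inside the library is not documented here.

```python
def populations_processed(nr_samples_per_particle, epsilon_schedule, minimum_epsilon):
    """Return the population indices visited before a stopping criterion fires."""
    processed = []
    # The list length bounds the number of populations.
    for t, _b_t in enumerate(nr_samples_per_particle):
        processed.append(t)
        # Assumed check: stop once epsilon has dropped below the minimum.
        if epsilon_schedule[t] < minimum_epsilon:
            break
    return processed


nr_populations = 5
nr_samples_per_particle = [1] * nr_populations  # B_t = 1 for every population
epsilon_schedule = [1.0, 0.5, 0.25, 0.1, 0.05]  # hypothetical epsilons

stopped_early = populations_processed(nr_samples_per_particle, epsilon_schedule, 0.2)
ran_all = populations_processed(nr_samples_per_particle, epsilon_schedule, 0.0)
```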
-
sample_from_prior
() → List[dict]¶ Only sample from prior and return results without changing the history. This can be used to get initial samples for the distance function or the epsilon to calibrate them.
Warning
The sample is cached.
-
set_data
(observed_summary_statistics: dict, ground_truth_model_nr_or_name: Union[int, str], ground_truth_parameter: dict, abc_options: dict, model_names: Iterable[str])¶ Set the data to be fitted.
- Parameters
observed_summary_statistics (dict) –
This is the really important parameter here. It is of the form {'statistic_1': val_1, 'statistic_2': val_2, ... }.
The dictionary provided here represents the measured data. Particles during ABCSMC sampling are compared with the summary statistics provided here.
ground_truth_model_nr_or_name (Union[int, str]) – This is only metadata stored in the database; it is not used by the ABCSMC algorithm. To evaluate the ABCSMC procedure against synthetic samples, this parameter can be used to indicate the ground truth model number or name. This helps with further analysis. If actually measured data is used, it is recommended to set this parameter to -1.
ground_truth_parameter (dict) – Similar to ground_truth_model_nr_or_name, this is only for recording purposes and is not used in the ABCSMC algorithm. This stores the parameters of the ground truth model if it was synthetically obtained.
abc_options (dict) – Has to contain the key “db_path”, which has to be a valid SQLAlchemy database identifier. Can contain an arbitrary number of additional keys, only for recording purposes. Store arbitrary meta information in this dictionary.
model_names (List[str]) – Only for recording purposes. Record names of the models.
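The arguments of set_data are plain Python containers, so a call can be prepared as in the sketch below. All values, the SQLite file name, the extra "comment" key and the empty ground_truth_parameter for measured data are made-up placeholders; check_abc_options is a hypothetical helper, not part of the package.

```python
# Measured data, expressed as summary statistics.
observed_summary_statistics = {"statistic_1": 0.42, "statistic_2": 17.0}

# Real measurements: no ground truth model is known, so -1 is recorded.
ground_truth_model_nr_or_name = -1
ground_truth_parameter = {}  # assumption: left empty when no ground truth exists

# "db_path" is mandatory; any further keys are only recorded as metadata.
abc_options = {"db_path": "sqlite:///abc_results.db", "comment": "pilot run"}
model_names = ["model_a", "model_b"]


def check_abc_options(options):
    """Hypothetical pre-flight check for the one mandatory key."""
    if "db_path" not in options:
        raise KeyError("abc_options must contain the key 'db_path'")
    return options["db_path"]


db_path = check_abc_options(abc_options)
```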
-
class
abcsmc.
ConstantEpsilon
(constant_epsilon_value: float)¶ Bases:
abcsmc.epsilon.Epsilon
Keep epsilon constant over all populations.
- Parameters
constant_epsilon_value (float) – The epsilon value for all populations.
-
__call__
(t, history)¶
- Parameters
t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.
- Returns
eps – The new epsilon for population t.
- Return type
float
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
class
abcsmc.
DistanceFunction
¶ Bases:
abc.ABC
Abstract base class for distance functions.
Any other distance function should inherit from this class.
-
abstract
__call__
(x: dict, x_0: dict) → float¶ Abstract method. This method has to be overwritten by all concrete implementations.
Evaluate the distance of the tentatively sampled particles relative to the measured data.
- Parameters
x (dict) – Summary statistics of the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the measured data.
- Returns
distance – Distance of the tentatively sampled particles to the measured data.
- Return type
float
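A concrete distance function only has to implement __call__. The sketch below mirrors the documented interface with a stand-in abstract base class (DistanceFunctionSketch, defined here so the example is self-contained) and a hypothetical Euclidean subclass; with the real package one would presumably inherit from DistanceFunction instead.

```python
import math
from abc import ABC, abstractmethod


class DistanceFunctionSketch(ABC):
    """Stand-in for the documented abstract base class; not the library class."""

    @abstractmethod
    def __call__(self, x: dict, x_0: dict) -> float:
        """Distance between sampled statistics x and measured statistics x_0."""

    def get_config(self) -> dict:
        return {"name": type(self).__name__}


class EuclideanDistance(DistanceFunctionSketch):
    """Hypothetical example: Euclidean distance over the keys of x_0."""

    def __call__(self, x: dict, x_0: dict) -> float:
        return math.sqrt(sum((x[key] - x_0[key]) ** 2 for key in x_0))


distance = EuclideanDistance()
d = distance({"mean": 3.0, "std": 4.0}, {"mean": 0.0, "std": 0.0})
```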
-
get_config
() → dict¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
initialize
(sample_from_prior: List[dict])¶ This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.
Per default, no calibration is made.
- Parameters
sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
-
to_json
() → str¶ Return JSON encoded configuration of the distance function.
- Returns
json_str – JSON encoded string describing the distance function. The default implementation is to try to convert the dictionary returned by get_config.
- Return type
str
-
-
class
abcsmc.
DistanceFunctionWithMeasureList
(measures_to_use='all')¶ Bases:
abcsmc.distance_functions.DistanceFunction
Base class for distance functions with measure list.
- Parameters
measures_to_use (Union[str, List[str]]) –
If set to “all”, all measures are used. This is the default.
If a list is provided, the measures in the list are used.
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
initialize
(sample_from_prior)¶ This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.
Per default, no calibration is made.
- Parameters
sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
-
measures_to_use
¶ The measures (summary statistics) to use for distance calculation.
-
sanitize_sample_from_prior
(sample)¶ Remove samples in which any of the measures is NaN. Added by Alessandro Motta <alessandro.motta@brain.mpg.de>
-
class
abcsmc.
Distribution
(*args, **kwargs)¶ Bases:
util.parameters.ParameterStructure
Distribution of parameters for a model.
A distribution is a collection of RVs and/or distributions. It is a dictionary-like object of random variables or distributions.
This should be used as prior and also as Kernel density.
-
copy
() → util.random_variables.Distribution¶ Copy the distribution.
- Returns
copied_distribution – A copy of the distribution.
- Return type
-
static
from_dictionary_of_dictionaries
(dict_of_dicts: dict) → util.random_variables.Distribution¶ Create distribution from dictionary of dictionaries.
- Parameters
dict_of_dicts (dict) – The keys of the dict indicate the parameters’ names. The values are themselves dictionaries representing scipy.stats distributions, i.e. they have the key “name” and at least one of the keys “args” or “kwargs”.
- Returns
distribution – Created distribution.
- Return type
-
get_parameter_names
() → list¶ Sorted list of parameter names.
- Returns
sorted_names – Sorted list of parameter names.
- Return type
list
-
pdf
(x: Union[util.parameters.Parameter, dict])¶ Get combination of probability density function (for continuous variables) and probability mass function (for discrete variables) at point x.
- Parameters
x (Union[Parameter, dict]) – Evaluate at the given Parameter
x
.
-
rvs
() → util.parameters.Parameter¶ Sample from joint distribution.
- Returns
parameter – A parameter which was sampled.
- Return type
-
update_random_variables
(**random_variables)¶ Update random variables within the distribution.
- Parameters
**random_variables – keywords are the parameters’ names, the values are random variables.
-
-
class
abcsmc.
EmptyMultivariateMultiTypeNormalDistribution
¶ Bases:
object
Empty multivariate distribution.
Always returns empty parameters upon sampling.
-
pdf
(x)¶ Return always 1.
-
rvs
()¶ Return empty Parameter.
-
-
class
abcsmc.
Epsilon
¶ Bases:
abc.ABC
Abstract epsilon base class.
This class encapsulates a strategy for setting a new epsilon for each new population.
-
abstract
__call__
(t: int, history: abcsmc.storage.History)¶
- Parameters
t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.
- Returns
eps – The new epsilon for population t.
- Return type
float
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
initialize
(sample_from_prior: List[dict], distance_to_ground_truth_function: Callable[[dict], float])¶ This method is called by the ABCSMC framework before the first usage of the epsilon and can be used to calibrate it to the statistics of the samples.
Per default, no calibration is made.
- Parameters
sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
distance_to_ground_truth_function (Callable[[dict], float]) – One of the distance functions pre-evaluated at its second argument (the one representing the measured data). E.g. similar to
lambda x: distance_function(x, x_measured)
.
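The distance_to_ground_truth_function argument is a distance function with its second argument already fixed to the measured data. A minimal sketch of that partial application, using a made-up absolute-difference distance and invented statistics:

```python
from functools import partial


def distance_function(x, x_0):
    """Hypothetical distance: sum of absolute differences over the keys of x_0."""
    return sum(abs(x[key] - x_0[key]) for key in x_0)


x_measured = {"mean": 1.0, "std": 2.0}

# Equivalent to: lambda x: distance_function(x, x_measured)
distance_to_ground_truth = partial(distance_function, x_0=x_measured)

sample_from_prior = [
    {"mean": 0.0, "std": 2.0},
    {"mean": 1.5, "std": 1.0},
]
distances = [distance_to_ground_truth(sample) for sample in sample_from_prior]
```

An epsilon strategy can then calibrate itself from distances without ever seeing the measured data directly.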
-
to_json
()¶ Return JSON encoded configuration of the distance function.
- Returns
json_str – JSON encoded string describing the distance function. The default implementation is to try to convert the dictionary returned by get_config.
- Return type
str
-
-
class
abcsmc.
History
(db_path: str, nr_models: int, model_names: List[str], min_nr_particles_per_population: int, debug=False)¶ Bases:
object
History for ABCSMC.
This class records the evolution of the populations and stores the ABCSMC results.
- Parameters
db_path (str) – SQLAlchemy database identifier.
nr_models (int) – Number of models.
model_names (List[str]) – List of model names.
min_nr_particles_per_population (int) – Minimum number of particles per population.
debug (bool) – Whether to print additional debug output.
Warning
Most likely this class is never manually instantiated. An instance of this class is returned by the ABCSMC.run method. It can then be used for querying. However, most likely even that won’t be used since querying is usually done on the stored database using the abc_loader.
-
append_population
(t: int, current_epsilon: float, particle_population: list)¶ Append population to database.
- Parameters
t (int) – Population number.
current_epsilon (float) – Current epsilon value.
particle_population (list) – List of sampled particles.
- Returns
enough_particles – Whether enough particles were found in the population.
- Return type
bool
-
done
()¶ Close database sessions and store end time of population.
-
get_complete_population_median
(t: int) → float¶ Median of a population’s distances to the measured sample.
- Parameters
t (int) – Population number.
- Returns
median – The median of the distances.
- Return type
float
-
static
get_cov
(particles: list) → Union[util.random_variables.NonEmptyMultivariateMultiTypeNormalDistribution, util.random_variables.EmptyMultivariateMultiTypeNormalDistribution]¶ Covariance from particles.
- Parameters
particles (list) – List of particles.
- Returns
cov – The covariance representing distribution.
- Return type
Union[NonEmptyMultivariateMultiTypeNormalDistribution, EmptyMultivariateMultiTypeNormalDistribution]
-
get_distribution
(t: int, m: int, parameter: str) → Tuple[numpy.ndarray]¶ Returns parameter values and weights.
- Parameters
t (int) – Population number.
m (int) – Model number.
parameter (str) – Parameter name.
- Returns
(points, weights) – The points and their weights.
- Return type
Tuple[np.ndarray]
-
get_model_probabilities
(t=-1) → numpy.ndarray¶ Model probabilities.
- Parameters
t (int) – Population. Defaults to -1, i.e. the last population.
- Returns
probabilities – Model probabilities.
- Return type
np.ndarray
-
get_parameter_std
(t: int, m: int) → dict¶ Standard deviation of the parameters in a given population.
- Parameters
t (int) – Population number.
m (int) – Model number.
- Returns
std – Dictionary with keys the parameter names and values their standard deviations.
- Return type
dict
-
get_results
()¶ Get the full last record.
-
get_results_distribution
(m: int, parameter: str) → Tuple[numpy.ndarray]¶ Returns parameter values and weights of the last population.
- Parameters
m (int) – Model number.
parameter (str) – Parameter name.
- Returns
results – results = (points, weights) with the points and the weights of the last population.
- Return type
Tuple[np.ndarray]
-
get_statistics
(t: int) → dict¶ Statistics from particle populations.
- Parameters
t (int) – Population number.
- Returns
stat – List of population statistics at time t. Each list entry corresponds to a model:
[{"std": ..., "nr_particles": ..., "cov": ...}, {"std": ..., "nr_particles": ..., "cov": ...}, ... ]
- Return type
list
-
nr_of_models_alive
(t=-1) → int¶ Number of models still alive.
- Parameters
t (int) – Population number.
- Returns
nr_alive – Number of models still alive.
- Return type
int
-
nr_simulations
¶ Only counts the simulations which appear in particles. If a simulation terminated prematurely it is not counted.
-
sample_from_models
(t: int) → int¶ Sample from the distribution over models.
- Parameters
t (int) – Population number.
- Returns
model_choice – This is \(m^*\) in the notation from reference 3.
- Return type
int
- 3
Toni, Tina, and Michael P. H. Stumpf. “Simulation-Based Model Selection for Dynamical Systems in Systems and Population Biology.” Bioinformatics 26, no. 1 (2010): 104–10. doi:10.1093/bioinformatics/btp619.
-
sample_from_population
(t: int, m: int) → Optional[abcsmc.storage.Parameter]¶ Sample from population.
- Parameters
t (int) – Population number.
m (int) – Model number.
- Returns
sample – Returns None if population t,m is empty, otherwise a sample parameter from it.
- Return type
Union[Parameter, None]
-
store_initial_data
(ground_truth_model_nr_or_name: Union[int, str], options, observed_summary_statistics: dict, ground_truth_parameter: dict, distance_function_json_str: str, eps_function_json_str: str)¶ Store the initial configuration data.
- Parameters
ground_truth_model_nr_or_name (Union[int, str]) – number or name of the ground truth model.
observed_summary_statistics (dict) – the measured summary statistics.
ground_truth_parameter (dict) – the ground truth parameters.
distance_function_json_str (str) – the distance function represented as json string.
eps_function_json_str (str) – the epsilon represented as json string.
-
property
t
¶ Current population.
-
property
total_nr_simulations
¶ Total number of simulations/samples.
-
class
abcsmc.
Kernel
(*distribution, **random_variables)¶ Bases:
object
A Kernel of the form K(x,y) = K(x-y).
Can be initialized from a distribution or using individual variables. E.g. do Kernel(distribution) or Kernel(par_name_1=rv1, par_name2=rv2).
If X is a given RV with pdf f, then K(x,y) = f(x-y).
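The relation K(x,y) = f(x-y) means the kernel density depends only on the difference of its arguments, and a sample from K(·, theta) is theta plus a draw from f. The sketch below illustrates this for a single scalar parameter with a normal density f; the function names and the choice of a normal density are assumptions made for the example.

```python
import math
import random


def normal_pdf(z, sigma):
    """Density f of a zero-mean normal random variable with scale sigma."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def kernel_pdf(x, y, sigma=1.0):
    """K(x, y) = f(x - y): density of moving from y to x."""
    return normal_pdf(x - y, sigma)


def kernel_rvs(theta, sigma=1.0, rng=random):
    """Sample from K(., theta): perturb theta by a draw from f."""
    return theta + rng.gauss(0.0, sigma)


# Shift invariance: only the difference x - y matters.
same_step_a = kernel_pdf(2.0, 1.0)
same_step_b = kernel_pdf(5.0, 4.0)
perturbed = kernel_rvs(3.0, sigma=0.1, rng=random.Random(0))
```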
-
add_random_variables
(**random_variables)¶ Add random variables to kernel.
- Parameters
random_variables (keyword arguments) – Keys are the names, values the random variables.
-
pdf
(x, y)¶ Return density \(K(x,y)\), i.e., the probability of transitioning from y to x.
-
rvs
(theta)¶ Return sample from \(K(\cdot, \theta)\).
-
-
class
abcsmc.
ListEpsilon
(values: List[float])¶ Bases:
abcsmc.epsilon.Epsilon
Return epsilon values from a predefined list.
- Parameters
values (List[float]) – List of epsilon values.
values[k]
is the value for population k.
-
__call__
(t, history)¶
- Parameters
t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.
- Returns
eps – The new epsilon for population t.
- Return type
float
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
class
abcsmc.
LowerBoundDecorator
(component: util.random_variables.RV, lower_bound: float)¶ Bases:
util.random_variables.RVDecorator
Impose a strict lower bound on a random variable. Condition RV X to “X > lower bound”. In particular P(X = lower_bound) = 0.
Note
Sampling is done via rejection. Up to 10000 samples are taken from the decorated RV. The first sample within the permitted range is then taken. Otherwise None is returned.
- Parameters
component (RV) – The decorated random variable.
lower_bound (float) – The lower bound.
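The rejection scheme in the note above can be sketched independently of the package. The function name sample_above and the draw callable are hypothetical; only the attempt limit of 10000, the strict inequality and the None fallback are taken from the description.

```python
import random

MAX_ATTEMPTS = 10000  # limit stated in the note above


def sample_above(draw, lower_bound, max_attempts=MAX_ATTEMPTS):
    """Return the first draw strictly above lower_bound, or None if none is found."""
    for _ in range(max_attempts):
        candidate = draw()
        if candidate > lower_bound:  # strict: X == lower_bound is rejected
            return candidate
    return None


rng = random.Random(1)
sample = sample_above(lambda: rng.gauss(0.0, 1.0), lower_bound=0.5)

# A variable that never exceeds the bound exhausts all attempts.
impossible = sample_above(lambda: 0.0, lower_bound=1.0, max_attempts=100)
```

Callers therefore need to handle a None result, which becomes likely when the bound lies far in the upper tail of the decorated variable.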
-
cdf
(x)¶ Cumulative distribution function.
- Parameters
x (float) – Cumulative distribution function at x.
- Returns
density – Cumulative distribution function at x.
- Return type
float
-
copy
()¶ Copy the random variable.
- Returns
copied_rv – A copy of the random variable.
- Return type
-
decorator_repr
()¶ Represent the decorator itself.
Template method.
The __repr__ method uses decorator_repr and the __repr__ of the decorated RV to build a combined representation.
- Returns
decorator_repr – A string representing the decorator only.
- Return type
str
-
pdf
(x)¶ Probability density function.
- Parameters
x (float) – Probability density at x.
- Returns
density – Probability density at x.
- Return type
float
-
pmf
(x)¶ Probability mass function.
- Parameters
x (int) – Probability mass at x.
- Returns
mass – The mass at x.
- Return type
float
-
rvs
()¶ Sample from the RV.
- Returns
sample – A sample from the random variable.
- Return type
float
-
class
abcsmc.
MedianEpsilon
(initial_epsilon: Union[str, int] = 'from_sample', median_multiplier: float = 1)¶ Bases:
abcsmc.epsilon.Epsilon
Calculate epsilon as median from the last population.
- Parameters
initial_epsilon (Union[str, int]) –
If ‘from_sample’, the initial median is calculated from the samples.
If a number is given, this number is used.
median_multiplier (float) – Multiplies the median by that number. Also applies it to the initial median if it is calculated from samples. However, it does not apply to the initial median if it is given as a number.
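The interplay of initial_epsilon and median_multiplier described above can be summarised in a few lines. This is a sketch of the documented rule only; the helper name initial_epsilon_value and the list of distances are invented, and the library's internal bookkeeping may differ.

```python
import statistics


def initial_epsilon_value(initial_epsilon, distances, median_multiplier=1.0):
    """Initial epsilon following the documented rule."""
    if initial_epsilon == "from_sample":
        # The multiplier applies to a median computed from samples.
        return statistics.median(distances) * median_multiplier
    # A number given explicitly is used as is; the multiplier is not applied.
    return float(initial_epsilon)


distances = [1.0, 3.0, 2.0]  # hypothetical distances of prior samples
from_sample = initial_epsilon_value("from_sample", distances, median_multiplier=0.5)
fixed = initial_epsilon_value(4, distances, median_multiplier=0.5)
```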
-
__call__
(t, history)¶
- Parameters
t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.
- Returns
eps – The new epsilon for population t.
- Return type
float
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
initialize
(sample_from_prior, distance_to_ground_truth_function)¶ This method is called by the ABCSMC framework before the first usage of the epsilon and can be used to calibrate it to the statistics of the samples.
Per default, no calibration is made.
- Parameters
sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
distance_to_ground_truth_function (Callable[[dict], float]) – One of the distance functions pre-evaluated at its second argument (the one representing the measured data). E.g. similar to
lambda x: distance_function(x, x_measured)
.
-
class
abcsmc.
MinMaxDistanceFunction
(measures_to_use='all')¶ Bases:
abcsmc.distance_functions.RangeEstimatorDistanceFunction
Calculate upper and lower margins as max and min of the parameters.
-
static
lower
(parameter_list)¶ Calculate the lower margin from a list of parameter values.
- Parameters
parameter_list (List[float]) – List of values of a parameter.
- Returns
lower_margin – The lower margin of the range calculated from these parameters.
- Return type
float
-
static
upper
(parameter_list)¶ Calculate the upper margin from a list of parameter values.
- Parameters
parameter_list (List[float]) – List of values of a parameter.
- Returns
upper_margin – The upper margin of the range calculated from these parameters.
- Return type
float
-
-
class
abcsmc.
ModelPerturbationKernel
(nr_of_models: int, probability_to_stay: Optional[float] = None)¶ Bases:
object
Model perturbation kernel.
- Parameters
nr_of_models (int) – Number of models.
probability_to_stay (Union[float, None]) – If None, the probability to stay is set to 1/nr_of_models. Otherwise, the supplied value is used.
-
pmf
(n: int, m: int) → float¶
- Parameters
n (int) – Model target number.
m (int) – Model source number.
- Returns
probability – Probability with which to jump from m to n.
- Return type
float
-
rvs
(m: int) → int¶ Sample a kernel jump from model m to another model.
- Parameters
m (int) – Model source number.
- Returns
target – Target model number.
- Return type
int
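One way to picture the model perturbation kernel is the sketch below. It assumes that the probability mass not used for staying is spread uniformly over the other models; that assumption is consistent with the default 1/nr_of_models but is not stated in the documentation. ModelKernelSketch is a stand-in, not the library class.

```python
import random


class ModelKernelSketch:
    """Stand-in model perturbation kernel (uniform jumps to other models assumed)."""

    def __init__(self, nr_of_models, probability_to_stay=None):
        self.nr_of_models = nr_of_models
        if nr_of_models == 1:
            self.probability_to_stay = 1.0  # nowhere else to go
        elif probability_to_stay is None:
            self.probability_to_stay = 1.0 / nr_of_models
        else:
            self.probability_to_stay = probability_to_stay

    def pmf(self, n, m):
        """Probability of jumping from source model m to target model n."""
        if n == m:
            return self.probability_to_stay
        return (1.0 - self.probability_to_stay) / (self.nr_of_models - 1)

    def rvs(self, m, rng=random):
        """Sample a target model for source model m."""
        targets = range(self.nr_of_models)
        weights = [self.pmf(n, m) for n in targets]
        return rng.choices(targets, weights=weights)[0]


kernel = ModelKernelSketch(3, probability_to_stay=0.7)
row_sum = sum(kernel.pmf(n, 0) for n in range(3))
target = kernel.rvs(0, rng=random.Random(0))
default_kernel = ModelKernelSketch(4)
```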
-
abcsmc.
MultivariateMultiTypeNormalDistribution
(covariance_matrix, parameter_names, parameter_types, zero_covariance_substitutes=0.0001) → Union[util.random_variables.NonEmptyMultivariateMultiTypeNormalDistribution, util.random_variables.EmptyMultivariateMultiTypeNormalDistribution]¶ Factory function for multivariate and multitype normal distribution.
This distribution is essentially a multivariate normal, but takes into account if a type is an integer and returns it always as rounded integer. This is useful if some of the model parameters are discrete.
- Parameters
covariance_matrix (np.ndarray) – 2D array. The covariance matrix.
parameter_names (List[str]) – List of parameter names.
parameter_types (list) – A list containing int and/or float to indicate whether a parameter is of type float or int.
zero_covariance_substitutes (float) – Substitutes zero variance entries on the diagonal of the diagonal representation of the covariance matrix.
- Returns
multivariate_distribution – Returns a NonEmptyMultivariateMultiTypeNormalDistribution if len(parameter_names) > 0; otherwise an EmptyMultivariateMultiTypeNormalDistribution is returned.
- Return type
Union[NonEmptyMultivariateMultiTypeNormalDistribution, EmptyMultivariateMultiTypeNormalDistribution]
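The idea of a "multitype" normal can be shown in simplified form. The sketch below assumes a zero mean and a diagonal covariance (independent parameters), which is a simplification of the full covariance matrix the factory accepts; sample_mixed_normal and the parameter names are hypothetical.

```python
import random


def sample_mixed_normal(variances, parameter_names, parameter_types,
                        zero_substitute=1e-4, rng=random):
    """Draw one sample; int-typed parameters are returned as rounded integers."""
    sample = {}
    for name, variance, kind in zip(parameter_names, variances, parameter_types):
        if variance == 0:
            variance = zero_substitute  # avoid a degenerate distribution
        value = rng.gauss(0.0, variance ** 0.5)
        sample[name] = int(round(value)) if kind is int else value
    return sample


rng = random.Random(0)
sample = sample_mixed_normal(
    variances=[4.0, 0.0],
    parameter_names=["n_cells", "rate"],
    parameter_types=[int, float],
    rng=rng,
)

# With no parameters the sample is empty, mirroring the "empty" distribution.
empty_sample = sample_mixed_normal([], [], [])
```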
-
class
abcsmc.
NonEmptyMultivariateMultiTypeNormalDistribution
(covariance_matrix: numpy.ndarray, parameter_names: List[str], parameter_types: list, zero_covariance_substitutes=0.0001)¶ Bases:
object
Multivariate and multitype normal distribution.
This distribution is essentially a multivariate normal, but takes into account if a type is an integer and returns it always as rounded integer. This is useful if some of the model parameters are discrete.
- Parameters
covariance_matrix (np.ndarray) – 2D array. The covariance matrix.
parameter_names (List[str]) – List of parameter names.
parameter_types (list) – A list containing int and/or float to indicate whether a parameter is of type float or int.
zero_covariance_substitutes (float) – Substitutes zero variance entries on the diagonal of the covariance matrix.
-
pdf
(x: dict) → float¶ Probability density function at x.
- Parameters
x (dict) – Where to predict.
- Returns
density – The probability density.
- Return type
float
-
rvs
() → util.parameters.Parameter¶ Sample from distribution.
-
class
abcsmc.
PCADistanceFunction
(measures_to_use='all')¶ Bases:
abcsmc.distance_functions.DistanceFunctionWithMeasureList
Calculate distance in whitened coordinates.
A whitening transformation \(W\) is calculated from an initial sample. The distance is measured as the Euclidean distance in the transformed space, i.e.
\[d(x,y) = \| Wx - Wy \|.\]
-
__call__
(x, y)¶ Abstract method. This method has to be overwritten by all concrete implementations.
Evaluate the distance of the tentatively sampled particles relative to the measured data.
- Parameters
x (dict) – Summary statistics of the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the measured data.
- Returns
distance – Distance of the tentatively sampled particles to the measured data.
- Return type
float
-
initialize
(sample_from_prior)¶ This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.
Per default, no calibration is made.
- Parameters
sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
-
-
class
abcsmc.
Parameter
(*args, **kwargs)¶ Bases:
util.parameters.ParameterStructure
A single model parameter.
Parameters are a dictionary with the additional functionality to add and subtract parameters.
I.e. par_1 + par_2 adds key-wise.
-
copy
() → util.parameters.Parameter¶ Copy the parameter.
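Key-wise addition and subtraction on a dictionary-like parameter can be sketched as follows. ParameterSketch is a stand-in written for this example and assumes both operands share the same keys; the real class may handle further cases.

```python
class ParameterSketch(dict):
    """Dictionary with key-wise addition and subtraction (stand-in class)."""

    def __add__(self, other):
        return ParameterSketch({key: self[key] + other[key] for key in self})

    def __sub__(self, other):
        return ParameterSketch({key: self[key] - other[key] for key in self})


par_1 = ParameterSketch(a=1.0, b=2.0)
par_2 = ParameterSketch(a=0.5, b=4.0)

total = par_1 + par_2       # adds key-wise
difference = par_1 - par_2  # subtracts key-wise
```

Subtraction is what makes a kernel of the form K(x,y) = K(x-y) straightforward to evaluate on whole parameter dictionaries.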
-
-
class
abcsmc.
PercentileDistanceFunction
(measures_to_use='all')¶ Bases:
abcsmc.distance_functions.RangeEstimatorDistanceFunction
Calculate the normalization using the 20% and 80% percentiles as lower and upper margins.
-
PERCENTILE
= 20¶ The percentiles
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
static
lower
(measures)¶ Calculate the lower margin from a list of parameter values.
- Parameters
parameter_list (List[float]) – List of values of a parameter.
- Returns
lower_margin – The lower margin of the range calculated from these parameters.
- Return type
float
-
static
upper
(measures)¶ Calculate the upper margin from a list of parameter values.
- Parameters
parameter_list (List[float]) – List of values of a parameter.
- Returns
upper_margin – The upper margin of the range calculated from these parameters.
- Return type
float
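The lower and upper margins can be illustrated with a small stdlib-only percentile helper. The linear-interpolation percentile below is one common convention and may not match the one used internally, and treating upper minus lower as the normalization scale is an assumption.

```python
PERCENTILE = 20  # lower margin; the upper margin uses 100 - PERCENTILE


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    fraction = position - low
    return ordered[low] + fraction * (ordered[high] - ordered[low])


def lower(parameter_list):
    return percentile(parameter_list, PERCENTILE)


def upper(parameter_list):
    return percentile(parameter_list, 100 - PERCENTILE)


values = [float(v) for v in range(11)]  # 0.0, 1.0, ..., 10.0
lower_margin = lower(values)
upper_margin = upper(values)
scale = upper_margin - lower_margin  # assumed normalization range
```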
-
-
class
abcsmc.
RV
(name: str, *args, **kwargs)¶ Bases:
util.random_variables.RVBase
Concrete random variable.
- Parameters
name (str) – Name of the distribution as in scipy.stats.
args – Arguments as in scipy.stats matching the distribution with name “name”.
kwargs – Keyword arguments as in scipy.stats matching the distribution with name “name”.
-
cdf
(x)¶ Cumulative distribution function.
- Parameters
x (float) – Cumulative distribution function at x.
- Returns
density – Cumulative distribution function at x.
- Return type
float
-
copy
()¶ Copy the random variable.
- Returns
copied_rv – A copy of the random variable.
- Return type
-
distribution
¶ the scipy.stats. … distribution object
-
static
from_dictionary
(dictionary: dict) → util.random_variables.RV¶ Construct random variable from dictionary.
- Parameters
dictionary (dict) –
A dictionary with the keys
”name” (mandatory)
”args” (optional)
”kwargs” (optional)
as in scipy.stats.
Note
Either the “args” or the “kwargs” key has to be present.
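The dictionary layout expected by from_dictionary can be checked before use. The helper check_rv_dictionary below is hypothetical and only validates the keys described above; it does not construct a random variable. The two example specifications use the scipy.stats distribution names uniform and randint.

```python
def check_rv_dictionary(dictionary):
    """Validate the documented layout and return (name, args, kwargs)."""
    if "name" not in dictionary:
        raise ValueError("the key 'name' is mandatory")
    if "args" not in dictionary and "kwargs" not in dictionary:
        raise ValueError("either 'args' or 'kwargs' has to be present")
    args = tuple(dictionary.get("args", ()))
    kwargs = dict(dictionary.get("kwargs", {}))
    return dictionary["name"], args, kwargs


# Continuous uniform on [0, 10], given through keyword arguments.
uniform_spec = {"name": "uniform", "kwargs": {"loc": 0.0, "scale": 10.0}}

# Discrete uniform over {0, 1, 2}, given through positional arguments.
randint_spec = {"name": "randint", "args": [0, 3]}

uniform_parts = check_rv_dictionary(uniform_spec)
randint_parts = check_rv_dictionary(randint_spec)
```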
-
pdf
(x)¶ Probability density function.
- Parameters
x (float) – Probability density at x.
- Returns
density – Probability density at x.
- Return type
float
-
pmf
(x)¶ Probability mass function.
- Parameters
x (int) – Probability mass at x.
- Returns
mass – The mass at x.
- Return type
float
-
rvs
()¶ Sample from the RV.
- Returns
sample – A sample from the random variable.
- Return type
float
-
class
abcsmc.
RVBase
¶ Bases:
abc.ABC
Random variable abstract base class.
Note
The reason we introduced another random variable class is that scipy.stats distributions are not pickleable. This class is a thin wrapper around scipy.stats distributions to make them pickleable. It is important to be able to pickle them to execute the ABCSMC algorithm in a distributed cluster environment.
-
abstract
cdf
(x: float) → float¶ Cumulative distribution function.
- Parameters
x (float) – Point at which to evaluate the cumulative distribution function.
- Returns
density – Value of the cumulative distribution function at x.
- Return type
float
-
abstract
copy
() → util.random_variables.RVBase¶ Copy the random variable.
- Returns
copied_rv – A copy of the random variable.
- Return type
util.random_variables.RVBase
-
abstract
pdf
(x: float) → float¶ Probability density function.
- Parameters
x (float) – Point at which to evaluate the probability density.
- Returns
density – Probability density at x.
- Return type
float
-
abstract
pmf
(x) → float¶ Probability mass function.
- Parameters
x (int) – Point at which to evaluate the probability mass.
- Returns
mass – The probability mass at x.
- Return type
float
-
abstract
rvs
() → float¶ Sample from the RV.
- Returns
sample – A sample from the random variable.
- Return type
float
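The pickling motivation given in the RVBase note can be illustrated with a minimal sketch: the wrapper pickles only the recipe (name and arguments) and rebuilds the distribution object after unpickling. Everything here is an assumption for illustration. The class `PicklableRV` and the registry `_FACTORIES` are hypothetical, and `statistics.NormalDist` from the standard library is used as a stand-in so that the example runs without scipy; a real wrapper would presumably look the name up in scipy.stats instead.

```python
import pickle
import statistics

# Stand-in registry (assumption): maps a distribution name to a factory.
_FACTORIES = {"norm": statistics.NormalDist}


class PicklableRV:
    """Hypothetical thin wrapper that pickles the recipe, not the object."""

    def __init__(self, name, *args, **kwargs):
        self.name = name
        self.args = args
        self.kwargs = kwargs
        self._dist = _FACTORIES[name](*args, **kwargs)

    def __getstate__(self):
        # Only the name and arguments are serialized.
        return {"name": self.name, "args": self.args, "kwargs": self.kwargs}

    def __setstate__(self, state):
        # Restore the recipe and rebuild the wrapped distribution from it.
        self.__dict__.update(state)
        self._dist = _FACTORIES[self.name](*self.args, **self.kwargs)

    def cdf(self, x):
        return self._dist.cdf(x)


rv = PicklableRV("norm", 0.0, 1.0)
restored = pickle.loads(pickle.dumps(rv))
```

After the round trip, `restored` behaves like `rv` because the wrapped distribution is reconstructed from the stored name and arguments.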
-
abstract
-
class
abcsmc.
RVDecorator
(component: util.random_variables.RVBase)¶ Bases:
util.random_variables.RVBase
Random variable decorator base class.
Implements the decorator pattern. Further decorators should derive from this class. It stores the decorated random variable in self.component. Override the method decorator_repr to represent the decorator type. The decorated variable will then be automatically included in the call to __repr__.
- Parameters
component (RVBase) – The random variable to be decorated.
-
cdf
(x)¶ Cumulative distribution function.
- Parameters
x (float) – Point at which to evaluate the cumulative distribution function.
- Returns
density – Value of the cumulative distribution function at x.
- Return type
float
-
component
¶ The decorated random variable
-
copy
()¶ Copy the random variable.
- Returns
copied_rv – A copy of the random variable.
- Return type
-
decorator_repr
() → str¶ Represent the decorator itself.
Template method.
The __repr__ method uses decorator_repr and the __repr__ of the decorated RV to build a combined representation.
- Returns
decorator_repr – A string representing the decorator only.
- Return type
str
-
pdf
(x)¶ Probability density function.
- Parameters
x (float) – Point at which to evaluate the probability density.
- Returns
density – Probability density at x.
- Return type
float
-
pmf
(x)¶ Probability mass function.
- Parameters
x (int) – Point at which to evaluate the probability mass.
- Returns
mass – The probability mass at x.
- Return type
float
-
rvs
()¶ Sample from the RV.
- Returns
sample – A sample from the random variable.
- Return type
float
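The decorator pattern described for RVDecorator, including the decorator_repr template method, can be sketched as follows. This is a simplified illustration under stated assumptions and not the library's code: the classes `SketchRVBase`, `ConstantRV`, `SketchRVDecorator` and `LowerBoundDecorator` are hypothetical, only `rvs` is modelled, and the exact format of the combined representation is made up for this example.

```python
from abc import ABC, abstractmethod


class SketchRVBase(ABC):
    """Hypothetical, reduced stand-in for the abstract RV interface."""

    @abstractmethod
    def rvs(self) -> float:
        ...


class ConstantRV(SketchRVBase):
    """Toy random variable that always returns the same value."""

    def __init__(self, value: float):
        self.value = value

    def rvs(self) -> float:
        return self.value

    def __repr__(self) -> str:
        return f"ConstantRV({self.value})"


class SketchRVDecorator(SketchRVBase):
    """Stores the decorated RV in self.component and forwards calls to it."""

    def __init__(self, component: SketchRVBase):
        self.component = component

    def rvs(self) -> float:
        return self.component.rvs()

    def decorator_repr(self) -> str:
        # Template method: subclasses describe only themselves here.
        return "Decorator"

    def __repr__(self) -> str:
        # Combine the decorator's own description with the decorated RV's repr.
        return f"[{self.decorator_repr()}] {self.component!r}"


class LowerBoundDecorator(SketchRVDecorator):
    """Hypothetical decorator that clamps samples at a lower bound."""

    def __init__(self, component: SketchRVBase, lower: float):
        super().__init__(component)
        self.lower = lower

    def rvs(self) -> float:
        return max(self.lower, self.component.rvs())

    def decorator_repr(self) -> str:
        return f"LowerBound({self.lower})"


rv = LowerBoundDecorator(ConstantRV(-1.0), 0.0)
```

The concrete decorator overrides only `decorator_repr` and the methods whose behaviour it changes; the base decorator takes care of including the decorated variable in `__repr__`.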
-
class
abcsmc.
RangeEstimatorDistanceFunction
(measures_to_use='all')¶ Bases:
abcsmc.distance_functions.DistanceFunctionWithMeasureList
Abstract base class for distance functions whose estimate is based on a range.
It defines the two template methods lower and upper. Hence
\[d(x, y) = \sum_{i \in \text{measures}} \left | \frac{x_i - y_i}{u_i - l_i} \right |,\]
where \(l_i\) and \(u_i\) are the lower and upper margins for measure \(i\).
-
__call__
(x, y)¶ Abstract method. This method has to be overwritten by all concrete implementations.
Evaluate the distance of the tentatively sampled particles relative to the measured data.
- Parameters
x (dict) – Summary statistics of the tentatively sampled parameter.
y (dict) – Summary statistics of the measured data.
- Returns
distance – Distance of the tentatively sampled particles to the measured data.
- Return type
float
-
get_config
()¶ Return configuration of the distance function.
- Returns
config – Dictionary describing the distance function.
- Return type
dict
-
initialize
(sample_from_prior)¶ This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.
By default, no calibration is performed.
- Parameters
sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
-
static
lower
(parameter_list: List[float])¶ Calculate the lower margin from a list of parameter values.
- Parameters
parameter_list (List[float]) – List of values of a parameter.
- Returns
lower_margin – The lower margin of the range calculated from these parameters.
- Return type
float
-
static
upper
(parameter_list: List[float])¶ Calculate the upper margin from a list of parameter values.
- Parameters
parameter_list (List[float]) – List of values of a parameter.
- Returns
upper_margin – The upper margin of the range calculated from these parameters.
- Return type
float
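The range-normalized distance of RangeEstimatorDistanceFunction can be sketched in a few lines. The concrete margins are left to subclasses in the documentation above, so this example assumes the simplest choice, minimum and maximum of the prior samples. The function names `calibrate` and `range_distance` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple


def lower(parameter_list: List[float]) -> float:
    # Assumption: the lower margin is the minimum of the sampled values.
    return min(parameter_list)


def upper(parameter_list: List[float]) -> float:
    # Assumption: the upper margin is the maximum of the sampled values.
    return max(parameter_list)


def calibrate(
    sample_from_prior: List[Dict[str, float]], measures: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Compute (l_i, u_i) per measure from summary statistics of prior samples."""
    margins = {}
    for measure in measures:
        values = [sample[measure] for sample in sample_from_prior]
        margins[measure] = (lower(values), upper(values))
    return margins


def range_distance(
    x: Dict[str, float], y: Dict[str, float], margins: Dict[str, Tuple[float, float]]
) -> float:
    """Sum of |x_i - y_i| / (u_i - l_i) over the calibrated measures."""
    return sum(abs((x[m] - y[m]) / (u - l)) for m, (l, u) in margins.items())


samples = [{"a": 0.0, "b": 10.0}, {"a": 4.0, "b": 30.0}]
margins = calibrate(samples, ["a", "b"])
d = range_distance({"a": 1.0, "b": 15.0}, {"a": 3.0, "b": 20.0}, margins)
```

With these samples the margins are (0, 4) for `a` and (10, 30) for `b`, so the distance is 2/4 + 5/20 = 0.75. A measure whose samples are all identical would give a zero range, which this sketch does not guard against.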
-
-
class
abcsmc.
SQLDataStore
(db: str)¶ Bases:
object
SQL data store for the ABCLoader class.
- Parameters
db (str) – SQLAlchemy connection string. E.g.: sqlite:////home/user/my_database.db.
-
class
abcsmc.
ZScoreDistanceFunction
(measures_to_use='all')¶ Bases:
abcsmc.distance_functions.DistanceFunctionWithMeasureList
Calculate distance as sum of ZScores over the selected measures. The measured data is the reference for the ZScore.
Hence
\[d(x, y) = \sum_{i \in \text{measures}} \left| \frac{x_i-y_i}{y_i} \right|.\]
-
__call__
(x, y)¶ Abstract method. This method has to be overwritten by all concrete implementations.
Evaluate the distance of the tentatively sampled particles relative to the measured data.
- Parameters
x (dict) – Summary statistics of the tentatively sampled parameter.
y (dict) – Summary statistics of the measured data.
- Returns
distance – Distance of the tentatively sampled particles to the measured data.
- Return type
float
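The formula given for ZScoreDistanceFunction can be written out directly. This is a minimal sketch of the formula only, not the library's implementation; the function name `zscore_distance` and the explicit `measures` argument are assumptions for this example.

```python
from typing import Dict, List


def zscore_distance(
    x: Dict[str, float], y: Dict[str, float], measures: List[str]
) -> float:
    """Sum of |x_i - y_i| / |y_i| over the selected measures.

    x holds the summary statistics of the sampled particle,
    y those of the measured data, which serve as the reference.
    """
    return sum(abs((x[m] - y[m]) / y[m]) for m in measures)


d = zscore_distance({"a": 3.0, "b": 8.0}, {"a": 2.0, "b": 4.0}, ["a", "b"])
```

Here the distance is |3 - 2| / 2 + |8 - 4| / 4 = 1.5. Because the measured value is the denominator, a measured statistic equal to zero raises a ZeroDivisionError in this sketch.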
-