ABC-SMC¶

ABC-SMC algorithms for Bayesian model selection.

class abcsmc.ABCLoader(data_store: abcsmc.loader.SQLDataStore)¶

Bases: object

Load ABC results from database and analyse.

Parameters: data_store (DataStore) – The datastore provides the database’s tables as pandas dataframes. Can be a SQLDataStore or pandas.HDFStore.

average_mass_at_tround_truth()¶: Averaged posterior probabilities, grouped by group_parameters.

property confusion_matrices_table¶: Confusion matrices.

confusion_matrix_dict()¶: Confusion matrices as dict, with keys indicating the sweep parameters.

property group_parameters¶: Parameters for grouping ABC sweeps.

property max_nr_populations¶: Maximum number of populations.

maximum_a_posteriori()¶: MAP estimates, grouped by group_parameters.

property maxs¶: Maxima of the results, grouped by the group_parameters.

means()¶: Means of the results, grouped by the group_parameters.

property model_names¶: Unique names of the models found in the database.

particles_of_population(abc_smc_id: int, model_name: str, t: int)¶

Return the particles of a given population. Useful if the posterior parameters are of interest.

Parameters

abc_smc_id (int) – ID of the ABCSMC run.
model_name (str) – Name of the model.
t (int) – Population number.

Returns

particles – The particles of the chosen population.

Return type

DataFrame

results()¶: Final results of the ABC runs.

terminated_abc_smc_ids()¶: IDs of already terminated ABCSMC runs.

class abcsmc.ABCSMC(models: List[Callable[[util.parameters.Parameter], dict]], model_prior_distribution: util.random_variables.RV, model_perturbation_kernel: util.random_variables.ModelPerturbationKernel, parameter_given_model_prior_distribution: List[util.random_variables.Distribution], adaptive_parameter_perturbation_kernels: List[Callable[[int, dict], util.random_variables.Kernel]], distance_function: abcsmc.distance_functions.DistanceFunction, eps: abcsmc.epsilon.Epsilon, nr_particles: int, mapper=<class 'map'>, debug: bool = False, max_nr_allowed_sample_attempts_per_particle: int = 500, min_nr_particles_per_population: int = 1)¶

Bases: object

Approximate Bayesian Computation - Sequential Monte Carlo (ABCSMC).

This is an implementation of an ABCSMC algorithm similar to the one described in reference 1 (Toni and Stumpf, 2010).

Parameters

models (List[Callable[[Parameter], dict]]) –
Calling models[m](par) returns the calculated summary statistics of model m with the corresponding parameters par.

Each callable thus represents a single model.
model_prior_distribution (RV) – A random variable giving the prior weights of the model classes. If the prior is uniform over the model classes this is something like RV("randint", 0, len(models)).
model_perturbation_kernel (ModelPerturbationKernel) – Kernel which governs with which probability to switch the model for a given sample.
parameter_given_model_prior_distribution (List[Distribution]) – A list of prior distributions for the models’ parameters. Each list entry is the prior distribution for the corresponding model.
adaptive_parameter_perturbation_kernels (List[Callable[[int, dict], Kernel]]) –
A list of functions mapping (t, stat) -> Kernel, where
- t is the population nr
- stat a dictionary of summary statistics.
  E.g. stat['std']['parameter_1'] is the standard deviation of parameter_1.
  
  Warning
  
  If a model has only one particle left the standard deviation is zero.
This callable is called at the beginning of a new population with the statistics dictionary from the last population to determine the new parameter perturbation kernel for the next population.
distance_function (DistanceFunction) – Measures the distance of the tentatively sampled particle to the measured data.
eps (Epsilon) – Returns the current acceptance epsilon. This epsilon changes from population to population. The eps instance provides the strategy according to which to change it.
mapper (map like) – A callable which behaves like the built-in map function. I.e. mapper(f, args) takes a callable f and applies it to the arguments in the list args. This mapper is used for particle sampling. It can be a distributed mapper such as the parallel.sge.SGE class.
debug (bool) – Whether to output additional debug information.
max_nr_allowed_sample_attempts_per_particle (int) – The maximum number of sample attempts allowed for each particle. If this number is reached, the sampling for that particle is stopped. Hence, a population may return with fewer particles than it started with. This is an approximation to the ABCSMC algorithm which ensures that the algorithm terminates.
min_nr_particles_per_population (int) – Minimum number of samples which have to be accepted for a population. If this number is not reached, the algorithm stops. This option, together with the max_nr_allowed_sample_attempts_per_particle ensures that the algorithm terminates. This parameter determines to which extent the ABCSMC algorithm is approximated.
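
How mapper, max_nr_allowed_sample_attempts_per_particle and min_nr_particles_per_population interact can be illustrated with a minimal, self-contained sketch. It does not use the library: the function names, the scalar toy model and the acceptance rule "distance <= eps" are all assumptions made for illustration only.

```python
import functools
import random
from typing import Callable, List, Optional, Tuple


def sample_one_particle(seed: int, simulate: Callable[[random.Random], float],
                        observed: float, eps: float,
                        max_attempts: int) -> Optional[float]:
    """Simulate up to max_attempts times; return the first accepted value or None."""
    rng = random.Random(seed)
    for _ in range(max_attempts):
        candidate = simulate(rng)
        if abs(candidate - observed) <= eps:  # assumed acceptance rule
            return candidate
    return None  # attempt budget exhausted: this particle is dropped


def sample_population(nr_particles: int, simulate, observed: float, eps: float,
                      max_attempts: int = 500, min_particles: int = 1,
                      mapper=map) -> Tuple[List[float], bool]:
    """Sample one population with any map-like callable (built-in map by default)."""
    worker = functools.partial(sample_one_particle, simulate=simulate,
                               observed=observed, eps=eps,
                               max_attempts=max_attempts)
    results = list(mapper(worker, range(nr_particles)))
    accepted = [r for r in results if r is not None]
    # The population may hold fewer particles than requested.
    return accepted, len(accepted) >= min_particles
```

Because the work is handed to mapper(f, args), a parallel map with the same calling convention could be substituted for the built-in map without touching the sampling logic.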

1: Toni, Tina, and Michael P. H. Stumpf. “Simulation-Based Model Selection for Dynamical Systems in Systems and Population Biology.” Bioinformatics 26, no. 1 (2010): 104–10. doi:10.1093/bioinformatics/btp619.

do_not_stop_when_only_single_model_alive()¶

Calling this method causes the ABCSMC to continue even if only a single model is still alive. This is useful if the interest lies in estimating the model parameters rather than in performing model selection.

The default behavior is to stop when only a single model is alive.

run(nr_samples_per_particle: List[int], minimum_epsilon: float) → abcsmc.storage.History¶

Run the ABCSMC model selection. This method can be called many times. It makes another step continuing where it has stopped before.

It is stopped when the maximum number of populations is reached or the minimum_epsilon value is reached.

Parameters

nr_samples_per_particle (List[int]) –
The length of the list determines the maximal number of populations.

The entries of the list are the numbers of iterated simulations per particle; in the notation of reference 2 (Toni et al., 2009), these are the \(B_t\). Usually, the entries are all ones: nr_samples_per_particle = [1] * nr_populations.
minimum_epsilon (float) – Stop if epsilon is smaller than minimum epsilon specified here.

2: Toni, Tina, David Welch, Natalja Strelkowa, Andreas Ipsen, and Michael P. H. Stumpf. “Approximate Bayesian Computation Scheme for Parameter Inference and Model Selection in Dynamical Systems.” Journal of The Royal Society Interface 6, no. 31 (2009): 187–202. doi:10.1098/rsif.2008.0172.

sample_from_prior() → List[dict]¶: Only sample from prior and return results without changing the history. This can be used to get initial samples for the distance function or the epsilon to calibrate them.

Warning

The sample is cached.

set_data(observed_summary_statistics: dict, ground_truth_model_nr_or_name: Union[int, str], ground_truth_parameter: dict, abc_options: dict, model_names: Iterable[str])¶

Set the data to be fitted.

Parameters

observed_summary_statistics (dict) –
This is the really important parameter here. It is of the form {'statistic_1' : val_1, 'statistic_2': val_2, ... }.

The dictionary provided here represents the measured data. Particles during ABCSMC sampling are compared with the summary statistics provided here.
ground_truth_model_nr_or_name (Union[int, str]) – This is only metadata stored in the database; it is not used by the ABCSMC algorithm. To evaluate the ABCSMC procedure against synthetic samples, this parameter can be used to indicate the ground truth model number or name. This helps with further analysis. If actually measured data is used, it is recommended to set this parameter to -1.
ground_truth_parameter (dict) – Similar to ground_truth_model_nr_or_name, this is only for recording purposes, but not used in the ABCSMC algorithm. This stores the parameters of the ground truth model if it was synthetically obtained.
abc_options (dict) – Has to contain the key “db_path” which has to be a valid SQLAlchemy database identifier. Can contain an arbitrary number of additional keys, only for recording purposes. Store arbitrary meta information in this dictionary.
model_names (List[str]) – Only for recording purposes. Record names of the models.
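
The shapes of the set_data arguments described above might look as follows. This is only a sketch: all values, the statistic names, the extra "experiment" key and the SQLite file name are hypothetical, and using an empty dictionary for ground_truth_parameter with measured data is an assumption.

```python
# Measured data: one value per summary statistic.
observed_summary_statistics = {"statistic_1": 4.2, "statistic_2": 0.7}

# "db_path" is mandatory; any further keys are free-form metadata.
abc_options = {
    "db_path": "sqlite:///abc_results.db",  # hypothetical SQLAlchemy identifier
    "experiment": "pilot run",              # hypothetical extra key
}

# Real measured data: no ground truth model is known.
ground_truth_model_nr_or_name = -1
ground_truth_parameter = {}  # assumption: nothing to record for measured data

model_names = ["model_a", "model_b"]  # hypothetical names, recorded only
```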

class abcsmc.ConstantEpsilon(constant_epsilon_value: float)¶

Bases: abcsmc.epsilon.Epsilon

Keep epsilon constant over all populations.

Parameters: constant_epsilon_value (float) – The epsilon value for all populations.

__call__(t, history)¶

Parameters

t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.

Returns

eps – The new epsilon for population t.

Return type

float

get_config()¶

Return configuration of the epsilon.

Returns: config – Dictionary describing the epsilon.
Return type: dict

class abcsmc.DistanceFunction¶

Bases: abc.ABC

Abstract base class for distance functions.

Any other distance function should inherit from this class.

abstract __call__(x: dict, x_0: dict) → float¶

Abstract method. This method has to be overridden by all concrete implementations.

Evaluate the distance of the tentatively sampled particles relative to the measured data.

Parameters

x (dict) – Summary statistics of the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the measured data.

Returns

distance – Distance of the tentatively sampled particles to the measured data.

Return type

float
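
A concrete distance function following this interface could be sketched as below. The classes are hypothetical stand-ins, not the library's own: the Euclidean distance, the content of the configuration dictionary and the use of json.dumps are assumptions chosen for illustration.

```python
import json
import math
from abc import ABC, abstractmethod


class DistanceFunctionSketch(ABC):
    """Hypothetical stand-in for the abstract interface described above."""

    @abstractmethod
    def __call__(self, x: dict, x_0: dict) -> float:
        """Distance between sampled statistics x and measured statistics x_0."""

    def initialize(self, sample_from_prior):
        """No calibration by default."""

    def get_config(self) -> dict:
        return {"name": type(self).__name__}  # assumed configuration content

    def to_json(self) -> str:
        return json.dumps(self.get_config())


class EuclideanDistanceSketch(DistanceFunctionSketch):
    def __call__(self, x: dict, x_0: dict) -> float:
        # Compare statistic by statistic, keyed by the measured data.
        return math.sqrt(sum((x[key] - x_0[key]) ** 2 for key in x_0))


distance = EuclideanDistanceSketch()
d = distance({"a": 3.0, "b": 0.0}, {"a": 0.0, "b": 4.0})
```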

get_config() → dict¶

Return configuration of the distance function.

Returns: config – Dictionary describing the distance function.
Return type: dict

initialize(sample_from_prior: List[dict])¶

This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.

Per default, no calibration is made.

Parameters: sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.

to_json() → str¶

Return JSON encoded configuration of the distance function.

Returns: json_str – JSON encoded string describing the distance function. The default implementation is to try to convert the dictionary returned by get_config.
Return type: str

class abcsmc.DistanceFunctionWithMeasureList(measures_to_use='all')¶

Bases: abcsmc.distance_functions.DistanceFunction

Base class for distance functions with measure list.

Parameters

measures_to_use (Union[str, List[str]]) –

If set to “all”, all measures are used. This is the default.
If a list is provided, the measures in the list are used.

get_config()¶

Return configuration of the distance function.

Returns: config – Dictionary describing the distance function.
Return type: dict

initialize(sample_from_prior)¶

This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.

Per default, no calibration is made.

Parameters: sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.

measures_to_use¶: The measures (summary statistics) to use for distance calculation.

sanitize_sample_from_prior(sample)¶: Remove samples in which any of the measures is NaN. Added by Alessandro Motta <alessandro.motta@brain.mpg.de>
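
The NaN filtering described for sanitize_sample_from_prior can be sketched in a few lines. The function name and signature below are hypothetical and the sketch is independent of the library; it assumes each sample is a dictionary of numeric summary statistics.

```python
import math
from typing import Dict, List


def drop_nan_samples(samples: List[Dict[str, float]],
                     measures: List[str]) -> List[Dict[str, float]]:
    """Keep only the samples in which none of the given measures is NaN."""
    return [
        sample for sample in samples
        if not any(math.isnan(sample[measure]) for measure in measures)
    ]


samples = [
    {"mean": 1.0, "var": 0.5},
    {"mean": float("nan"), "var": 0.2},  # dropped: "mean" is NaN
    {"mean": 2.0, "var": 0.1},
]
clean = drop_nan_samples(samples, ["mean", "var"])
```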

class abcsmc.Distribution(*args, **kwargs)¶

Bases: util.parameters.ParameterStructure

Distribution of parameters for a model.

A distribution is a collection of RVs and/or distributions. It is a dictionary-like object of random variables or distributions.

This should be used as prior and also as Kernel density.

copy() → util.random_variables.Distribution¶

Copy the distribution.

Returns: copied_distribution – A copy of the distribution.
Return type: Distribution

static from_dictionary_of_dictionaries(dict_of_dicts: dict) → util.random_variables.Distribution¶

Create distribution from dictionary of dictionaries.

Parameters: dict_of_dicts (dict) – The keys of the dict indicate the parameters’ names. The values are themselves dictionaries representing scipy.stats distributions. I.e. they have the key “name” and at least one of the keys “args” or “kwargs”.
Returns: distribution – Created distribution.
Return type: Distribution
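
The expected layout of dict_of_dicts can be illustrated with a small validation sketch. The helper check_distribution_spec is hypothetical and not part of the library; it only checks the keys named above and does not construct any distribution.

```python
from typing import Dict, List


def check_distribution_spec(dict_of_dicts: Dict[str, dict]) -> List[str]:
    """Validate the layout and return the sorted parameter names."""
    for parameter_name, rv_spec in dict_of_dicts.items():
        if "name" not in rv_spec:
            raise ValueError(f"{parameter_name}: missing key 'name'")
        if "args" not in rv_spec and "kwargs" not in rv_spec:
            raise ValueError(f"{parameter_name}: need 'args' or 'kwargs'")
    return sorted(dict_of_dicts)


# Parameter names are hypothetical; distribution names follow scipy.stats.
spec = {
    "rate": {"name": "uniform", "args": (0, 10)},
    "count": {"name": "randint", "kwargs": {"low": 0, "high": 5}},
}
names = check_distribution_spec(spec)
```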

get_parameter_names() → list¶

Sorted list of parameter names.

Returns: sorted_names – Sorted list of parameter names.
Return type: list

pdf(x: Union[util.parameters.Parameter, dict])¶

Get combination of probability density function (for continuous variables) and probability mass function (for discrete variables) at point x.

Parameters: x (Union[Parameter, dict]) – Evaluate at the given Parameter x.

rvs() → util.parameters.Parameter¶

Sample from joint distribution.

Returns: parameter – A parameter which was sampled.
Return type: Parameter

update_random_variables(**random_variables)¶

Update random variables within the distribution.

Parameters: **random_variables – keywords are the parameters’ names, the values are random variables.

class abcsmc.EmptyMultivariateMultiTypeNormalDistribution¶

Bases: object

Empty multivariate distribution.

Always returns empty parameters upon sampling.

pdf(x)¶: Always return 1.

rvs()¶: Return empty Parameter.

class abcsmc.Epsilon¶

Bases: abc.ABC

Abstract epsilon base class.

This class encapsulates a strategy for setting a new epsilon for each new population.
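
The strategy idea can be sketched with two simple schedules. These classes are hypothetical stand-ins that mirror only the documented call signature (t, history); they ignore the history argument and are not the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import List


class EpsilonSketch(ABC):
    """Hypothetical stand-in: maps a population number to an epsilon."""

    @abstractmethod
    def __call__(self, t: int, history) -> float:
        """Return the acceptance threshold for population t."""


class ConstantEpsilonSketch(EpsilonSketch):
    def __init__(self, constant_epsilon_value: float):
        self.constant_epsilon_value = constant_epsilon_value

    def __call__(self, t: int, history) -> float:
        return self.constant_epsilon_value  # same value for every population


class ListEpsilonSketch(EpsilonSketch):
    def __init__(self, values: List[float]):
        self.values = list(values)

    def __call__(self, t: int, history) -> float:
        return self.values[t]  # values[k] belongs to population k


schedule = ListEpsilonSketch([1.0, 0.5, 0.25])
thresholds = [schedule(t, None) for t in range(3)]
```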

abstract __call__(t: int, history: abcsmc.storage.History)¶

Parameters

t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.

Returns

eps – The new epsilon for population t.

Return type

float

get_config()¶

Return configuration of the epsilon.

Returns: config – Dictionary describing the epsilon.
Return type: dict

initialize(sample_from_prior: List[dict], distance_to_ground_truth_function: Callable[[dict], float])¶

This method is called by the ABCSMC framework before the first usage of the epsilon and can be used to calibrate it to the statistics of the samples.

Per default, no calibration is made.

Parameters

sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
distance_to_ground_truth_function (Callable[[dict], float]) – One of the distance functions pre-evaluated at its second argument (the one representing the measured data). E.g. similar to lambda x: distance_function(x, x_measured).

to_json()¶

Return JSON encoded configuration of the epsilon.

Returns: json_str – JSON encoded string describing the epsilon. The default implementation is to try to convert the dictionary returned by get_config.
Return type: str

class abcsmc.History(db_path: str, nr_models: int, model_names: List[str], min_nr_particles_per_population: int, debug=False)¶

Bases: object

History for ABCSMC.

This class records the evolution of the populations and stores the ABCSMC results.

Parameters

db_path (str) – SQLAlchemy database identifier.
nr_models (int) – Number of models.
model_names (List[str]) – List of model names.
min_nr_particles_per_population (int) – Minimum number of particles per population.
debug (bool) – Whether to print additional debug output.

Warning

Most likely this class is never manually instantiated. An instance of this class is returned by the ABCSMC.run method. It can then be used for querying. However, most likely even that won’t be used since querying is usually done on the stored database using the abc_loader.

append_population(t: int, current_epsilon: float, particle_population: list)¶

Append population to database.

Parameters

t (int) – Population number.
current_epsilon (float) – Current epsilon value.
particle_population (list) – List of sampled particles.

Returns

enough_particles – Whether enough particles were found in the population.

Return type

bool

done()¶: Close database sessions and store end time of population.

get_complete_population_median(t: int) → float¶

Median of a population’s distances to the measured sample.

Parameters: t (int) – Population number.
Returns: median – The median of the distances.
Return type: float

static get_cov(particles: list) → Union[util.random_variables.NonEmptyMultivariateMultiTypeNormalDistribution, util.random_variables.EmptyMultivariateMultiTypeNormalDistribution]¶

Covariance from particles.

Parameters: particles (list) – List of particles.
Returns: cov – The covariance representing distribution.
Return type: Union[NonEmptyMultivariateMultiTypeNormalDistribution, EmptyMultivariateMultiTypeNormalDistribution]

get_distribution(t: int, m: int, parameter: str) → Tuple[numpy.ndarray]¶

Returns parameter values and weights.

Parameters

t (int) – Population number.
m (int) – Model number.
parameter (str) –

Returns

(points, weights) – The points and their weights.

Return type

Tuple[np.ndarray]

get_model_probabilities(t=-1) → numpy.ndarray¶

Model probabilities.

Parameters: t (int) – Population. Defaults to -1, i.e. the last population.
Returns: probabilities – Model probabilities.
Return type: np.ndarray

get_parameter_std(t: int, m: int) → dict¶

Standard deviation of the parameters in a given population.

Parameters

t (int) – Population number.
m (int) – Model number.

Returns

std – Dictionary with keys the parameter names and values their standard deviations.

Return type

dict

get_results()¶: Get the full last record.

get_results_distribution(m: int, parameter: str) → Tuple[numpy.ndarray]¶

Returns parameter values and weights of the last population.

Parameters

m (int) – Model number.
parameter (str) – Parameter name.

Returns

results – results = (points, weights) with the points and the weights of the last population.

Return type

Tuple[np.ndarray]

get_statistics(t: int) → dict¶

Statistics from particle populations.

Parameters: t (int) – Population number.
Returns: stat – List of population statistics at the time t. Each list entry corresponds to a model. [{"std": ..., "nr_particles": ..., "cov": ...}, {"std": ..., "nr_particles": ..., "cov": ...}, ... ].
Return type: list

nr_of_models_alive(t=-1) → int¶

Number of models still alive.

Parameters: t (int) – Population number.
Returns: nr_alive – Number of models still alive.
Return type: int

nr_simulations¶: Only counts the simulations which appear in particles. If a simulation terminated prematurely it is not counted.

sample_from_models(t: int) → int¶

Sample from the distribution over models.

Parameters: t (int) – Population number.
Returns: model_choice – This is \(m^*\) in the notation of reference 3.
Return type: int

3: Toni, Tina, and Michael P. H. Stumpf. “Simulation-Based Model Selection for Dynamical Systems in Systems and Population Biology.” Bioinformatics 26, no. 1 (2010): 104–10. doi:10.1093/bioinformatics/btp619.

sample_from_population(t: int, m: int) → Optional[abcsmc.storage.Parameter]¶

Sample from population.

Parameters

t (int) – Population number.
m (int) – Model number.

Returns

sample – Returns None if population t,m is empty, otherwise a sample parameter from it.

Return type

Union[Parameter, None]

store_initial_data(ground_truth_model_nr_or_name: Union[int, str], options, observed_summary_statistics: dict, ground_truth_parameter: dict, distance_function_json_str: str, eps_function_json_str: str)¶

Store the initial configuration data.

Parameters

ground_truth_model_nr_or_name (Union[int, str]) – number or name of the ground truth model.
observed_summary_statistics (dict) – the measured summary statistics.
ground_truth_parameter (dict) – the ground truth parameters.
distance_function_json_str (str) – the distance function represented as json string.
eps_function_json_str (str) – the epsilon represented as json string.

property t¶: Current population.

property total_nr_simulations¶: Total number of simulations/samples.

class abcsmc.Kernel(*distribution, **random_variables)¶

Bases: object

A Kernel of the form K(x,y) = K(x-y).

Can be initialized from a distribution or using individual variables. E.g. do Kernel(distribution) or Kernel(par_name_1=rv1, par_name2=rv2).

If X is a given RV with pdf f, then K(x,y) = f(x-y).
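
The relation K(x,y) = f(x-y) can be made concrete with a one-dimensional sketch. The class below is a hypothetical stand-in using a zero-mean Gaussian for f; it is not the library's Kernel and handles a single scalar instead of named parameters.

```python
import math
import random


class GaussianKernelSketch:
    """Translation-invariant kernel K(x, y) = f(x - y) with f a N(0, sigma^2) pdf."""

    def __init__(self, sigma: float):
        self.sigma = sigma

    def pdf(self, x: float, y: float) -> float:
        # Density of moving from y to x; depends only on the difference x - y.
        z = (x - y) / self.sigma
        return math.exp(-0.5 * z * z) / (self.sigma * math.sqrt(2.0 * math.pi))

    def rvs(self, theta: float, rng=random) -> float:
        # Sample from K(., theta): perturb theta by a draw from f.
        return theta + rng.gauss(0.0, self.sigma)


kernel = GaussianKernelSketch(sigma=0.5)
same_shift = math.isclose(kernel.pdf(1.0, 0.0), kernel.pdf(3.0, 2.0))
```

Shifting both arguments by the same amount leaves the density unchanged, which is what the form K(x-y) expresses.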

add_random_variables(**random_variables)¶

Add random variables to kernel.

Parameters: random_variables (keyword arguments) – Keys are the names, values the random variables.

pdf(x, y)¶: Return density \(K(x,y)\), i.e., the probability of transitioning from y to x.

rvs(theta)¶: Return sample from \(K( \cdot, theta)\).

class abcsmc.ListEpsilon(values: List[float])¶

Bases: abcsmc.epsilon.Epsilon

Return epsilon values from a predefined list.

Parameters: values (List[float]) – List of epsilon values. values[k] is the value for population k.

__call__(t, history)¶

Parameters

t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.

Returns

eps – The new epsilon for population t.

Return type

float

get_config()¶

Return configuration of the epsilon.

Returns: config – Dictionary describing the epsilon.
Return type: dict

class abcsmc.LowerBoundDecorator(component: util.random_variables.RV, lower_bound: float)¶

Bases: util.random_variables.RVDecorator

Impose a strict lower bound on a random variable. Condition RV X to “X > lower bound”. In particular P(X = lower_bound) = 0.

Note

Sampling is done via rejection. Up to 10000 samples are taken from the decorated RV. The first sample within the permitted range is then taken. Otherwise None is returned.

Parameters

component (RV) – The decorated random variable.
lower_bound (float) – The lower bound.
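
The rejection scheme from the note above can be sketched as follows. The function sample_above is a hypothetical helper, not the library's code; it takes any zero-argument sampler in place of the decorated RV.

```python
import random
from typing import Callable, Optional


def sample_above(sample: Callable[[], float], lower_bound: float,
                 max_attempts: int = 10000) -> Optional[float]:
    """Return the first draw strictly above lower_bound, or None if none is found."""
    for _ in range(max_attempts):
        value = sample()
        if value > lower_bound:  # strict inequality: P(X = lower_bound) = 0
            return value
    return None  # all attempts were rejected


rng = random.Random(0)
positive_draw = sample_above(lambda: rng.gauss(0.0, 1.0), lower_bound=0.0)
```

Callers need to handle the None case, which occurs when the permitted range has very little probability mass under the decorated RV.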

cdf(x)¶

Cumulative distribution function.

Parameters: x (float) – Cumulative distribution function at x.
Returns: density – Cumulative distribution function at x.
Return type: float

copy()¶

Copy the random variable.

Returns: copied_rv – A copy of the random variable.
Return type: RVBase

decorator_repr()¶

Represent the decorator itself.

Template method.

The __repr__ method uses decorator_repr and the __repr__ of the decorated RV to build a combined representation.

Returns: decorator_repr – A string representing the decorator only.
Return type: str

pdf(x)¶

Probability density function.

Parameters: x (float) – Probability density at x.
Returns: density – Probability density at x.
Return type: float

pmf(x)¶

Probability mass function.

Parameters: x (int) – Probability mass at x.
Returns: mass – The mass at x.
Return type: float

rvs()¶

Sample from the RV.

Returns: sample – A sample from the random variable.
Return type: float

class abcsmc.MedianEpsilon(initial_epsilon: Union[str, int] = 'from_sample', median_multiplier: float = 1)¶

Bases: abcsmc.epsilon.Epsilon

Calculate epsilon as median from the last population.

Parameters

initial_epsilon (Union[str, int]) –
- If ‘from_sample’, then the initial epsilon is calculated from the samples as their median.
- If a number is given, this number is used.
median_multiplier (float) – Multiplies the median by that number. Also applies it to the initial median if it is calculated from samples. However, it does not apply to the initial median if it is given as a number.
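
The rules above for initial_epsilon and median_multiplier can be sketched with two hypothetical helper functions. They operate on a plain list of distances and are an illustration of the described behaviour, not the library's implementation.

```python
import statistics
from typing import List, Union


def median_epsilon(distances: List[float], median_multiplier: float = 1.0) -> float:
    """Epsilon for the next population: scaled median of the last distances."""
    return statistics.median(distances) * median_multiplier


def initial_epsilon(initial: Union[str, float], sample_distances: List[float],
                    median_multiplier: float = 1.0) -> float:
    """Epsilon for the first population."""
    if initial == "from_sample":
        # Calibrated from samples: the multiplier applies.
        return median_epsilon(sample_distances, median_multiplier)
    # Given as a number: used as is, the multiplier does not apply.
    return initial


distances = [1.0, 3.0, 2.0]
calibrated = initial_epsilon("from_sample", distances, median_multiplier=0.5)
fixed = initial_epsilon(7.0, distances, median_multiplier=0.5)
```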

__call__(t, history)¶

Parameters

t (int) – The population number.
history (History) – ABC history object. Can be used to query summary statistics to set the epsilon.

Returns

eps – The new epsilon for population t.

Return type

float

get_config()¶

Return configuration of the epsilon.

Returns: config – Dictionary describing the epsilon.
Return type: dict

initialize(sample_from_prior, distance_to_ground_truth_function)¶

This method is called by the ABCSMC framework before the first usage of the epsilon and can be used to calibrate it to the statistics of the samples.

Per default, no calibration is made.

Parameters

sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
distance_to_ground_truth_function (Callable[[dict], float]) – One of the distance functions pre-evaluated at its second argument (the one representing the measured data). E.g. similar to lambda x: distance_function(x, x_measured).

class abcsmc.MinMaxDistanceFunction(measures_to_use='all')¶

Bases: abcsmc.distance_functions.RangeEstimatorDistanceFunction

Calculate upper and lower margins as max and min of the parameters.

static lower(parameter_list)¶

Calculate the lower margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: lower_margin – The lower margin of the range calculated from these parameters.
Return type: float

static upper(parameter_list)¶

Calculate the upper margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: upper_margin – The upper margin of the range calculated from these parameters.
Return type: float

class abcsmc.ModelPerturbationKernel(nr_of_models: int, probability_to_stay: Optional[float] = None)¶

Bases: object

Model perturbation kernel.

Parameters

nr_of_models (int) – Number of models.
probability_to_stay (Union[float, None]) – If None, probability to stay is set to 1/nr_of_models. Otherwise, the supplied value is used.

pmf(n: int, m: int) → float¶

Parameters

n (int) – Model target number.
m (int) – Model source number.

Returns

probability – Probability with which to jump from m to n.

Return type

float

rvs(m: int) → int¶

Sample a Kernel jump from model m to another model.

Parameters: m (int) – Model source number.
Returns: target – Target model number.
Return type: int
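
A model perturbation kernel of this kind can be sketched as below. The documentation only specifies the probability to stay; spreading the remaining probability uniformly over the other models is an assumption of this sketch, and both function names are hypothetical.

```python
import random
from typing import Optional


def model_jump_pmf(n: int, m: int, nr_of_models: int,
                   probability_to_stay: Optional[float] = None) -> float:
    """Probability of jumping from model m to model n."""
    if nr_of_models == 1:
        return 1.0  # nowhere else to go
    if probability_to_stay is None:
        probability_to_stay = 1.0 / nr_of_models
    if n == m:
        return probability_to_stay
    # Assumption: the remaining mass is shared equally by the other models.
    return (1.0 - probability_to_stay) / (nr_of_models - 1)


def sample_model_jump(m: int, nr_of_models: int,
                      probability_to_stay: Optional[float] = None,
                      rng=random) -> int:
    """Draw a target model for source model m."""
    weights = [model_jump_pmf(n, m, nr_of_models, probability_to_stay)
               for n in range(nr_of_models)]
    return rng.choices(range(nr_of_models), weights=weights)[0]
```

With the default probability_to_stay of 1/nr_of_models, this sketch reduces to a uniform choice over all models.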

abcsmc.MultivariateMultiTypeNormalDistribution(covariance_matrix, parameter_names, parameter_types, zero_covariance_substitutes=0.0001) → Union[util.random_variables.NonEmptyMultivariateMultiTypeNormalDistribution, util.random_variables.EmptyMultivariateMultiTypeNormalDistribution]¶

Factory function for multivariate and multitype normal distribution.

This distribution is essentially a multivariate normal, but takes into account if a type is an integer and returns it always as rounded integer. This is useful if some of the model parameters are discrete.

Parameters

covariance_matrix (np.ndarray) – 2D array. The covariance matrix.
parameter_names (List[str]) – List of parameter names.
parameter_types (list) – A list containing int and/or float to indicate whether a parameter is of type float or int.
zero_covariance_substitutes (float) – Substitutes zero variance entries on the diagonal of the diagonal representation of the covariance matrix.

Returns

multivariate_distribution – Returns a NonEmptyMultivariateMultiTypeNormalDistribution if len(parameter_names) > 0; otherwise an EmptyMultivariateMultiTypeNormalDistribution is returned.

Return type

Union[NonEmptyMultivariateMultiTypeNormalDistribution, EmptyMultivariateMultiTypeNormalDistribution]
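
The multitype idea (round integer-typed parameters, replace zero variances) can be sketched in a strongly simplified form. The function below is hypothetical and makes several assumptions: it uses only the diagonal of the covariance matrix (independent components), a zero mean, and the standard library instead of a true multivariate normal.

```python
import random
from typing import Dict, List, Union


def sample_multitype_normal(variances: List[float], parameter_names: List[str],
                            parameter_types: list,
                            zero_variance_substitute: float = 1e-4,
                            rng=random) -> Dict[str, Union[int, float]]:
    """Draw one zero-mean sample; parameters typed int are rounded to integers."""
    sample = {}
    for name, variance, parameter_type in zip(parameter_names, variances,
                                              parameter_types):
        if variance == 0:
            variance = zero_variance_substitute  # avoid a degenerate component
        value = rng.gauss(0.0, variance ** 0.5)
        sample[name] = int(round(value)) if parameter_type is int else value
    return sample  # empty dict if there are no parameters


draw = sample_multitype_normal([1.0, 0.0], ["rate", "count"], [float, int],
                               rng=random.Random(1))
```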

class abcsmc.NonEmptyMultivariateMultiTypeNormalDistribution(covariance_matrix: numpy.ndarray, parameter_names: List[str], parameter_types: list, zero_covariance_substitutes=0.0001)¶

Bases: object

Multivariate and multitype normal distribution.

This distribution is essentially a multivariate normal, but takes into account if a type is an integer and returns it always as rounded integer. This is useful if some of the model parameters are discrete.

Parameters

covariance_matrix (np.ndarray) – 2D array. The covariance matrix.
parameter_names (List[str]) – List of parameter names.
parameter_types (list) – A list containing int and/or float to indicate whether a parameter is of type float or int.
zero_covariance_substitutes (float) – Substitutes zero variance entries on the diagonal of the covariance matrix.

pdf(x: dict) → float¶

Probability density function at x.

Parameters: x (dict) – Where to predict.
Returns: density – The probability density.
Return type: float

rvs() → util.parameters.Parameter¶: Sample from distribution.

class abcsmc.PCADistanceFunction(measures_to_use='all')¶

Bases: abcsmc.distance_functions.DistanceFunctionWithMeasureList

Calculate distance in whitened coordinates.

A whitening transformation \(W\) is calculated from an initial sample. The distance is measured as Euclidean distance in the transformed space. I.e

\[d(x,y) = \| Wx - Wy \|.\]
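
One common way to obtain a whitening transformation is shown below as an illustrative derivation; it is not a statement of what this class computes internally. The symbols \(\Sigma\) (covariance of the initial sample, assumed positive definite), \(V\) and \(\Lambda\) (its eigenvectors and eigenvalues) are introduced here and do not appear in the original documentation.

```latex
\begin{aligned}
\Sigma &= V \Lambda V^{\top}, \qquad W = \Lambda^{-1/2} V^{\top}, \\
W \Sigma W^{\top} &= \Lambda^{-1/2} V^{\top} V \Lambda V^{\top} V \Lambda^{-1/2} = I, \\
d(x,y)^2 &= \| Wx - Wy \|^2 = (x-y)^{\top} W^{\top} W (x-y) = (x-y)^{\top} \Sigma^{-1} (x-y).
\end{aligned}
```

Under these assumptions the transformed coordinates have identity covariance, and the distance coincides with the Mahalanobis distance with respect to \(\Sigma\).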

__call__(x, y)¶

Abstract method. This method has to be overridden by all concrete implementations.

Evaluate the distance of the tentatively sampled particles relative to the measured data.

Parameters

x (dict) – Summary statistics of the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the measured data.

Returns

distance – Distance of the tentatively sampled particles to the measured data.

Return type

float

initialize(sample_from_prior)¶

This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.

Per default, no calibration is made.

Parameters: sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.

class abcsmc.Parameter(*args, **kwargs)¶

Bases: util.parameters.ParameterStructure

A single model parameter.

Parameters are a dictionary with the additional functionality to add and subtract parameters.

I.e. par_1 + par_2 adds key wise.
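
Key-wise arithmetic on a dictionary can be sketched as follows. ParameterSketch is a hypothetical stand-in, not the library's Parameter class, and it assumes both operands have the same keys.

```python
class ParameterSketch(dict):
    """Dictionary with key-wise addition and subtraction."""

    def __add__(self, other):
        return ParameterSketch({key: self[key] + other[key] for key in self})

    def __sub__(self, other):
        return ParameterSketch({key: self[key] - other[key] for key in self})


par_1 = ParameterSketch(a=1, b=2.5)
par_2 = ParameterSketch(a=3, b=0.5)
total = par_1 + par_2
difference = par_1 - par_2
```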

copy() → util.parameters.Parameter¶: Copy the parameter.

class abcsmc.PercentileDistanceFunction(measures_to_use='all')¶

Bases: abcsmc.distance_functions.RangeEstimatorDistanceFunction

Calculate the normalization from percentiles, using the 20% and 80% percentiles as lower and upper margins.

PERCENTILE = 20¶: The percentiles

get_config()¶

Return configuration of the distance function.

Returns: config – Dictionary describing the distance function.
Return type: dict

static lower(measures)¶

Calculate the lower margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: lower_margin – The lower margin of the range calculated from these parameters.
Return type: float

static upper(measures)¶

Calculate the upper margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: upper_margin – The upper margin of the range calculated from these parameters.
Return type: float

class abcsmc.RV(name: str, *args, **kwargs)¶

Bases: util.random_variables.RVBase

Concrete random variable.

Parameters

name (str) – Name of the distribution as in scipy.stats.
args – Arguments as in scipy.stats matching the distribution with name “name”.

kwargs – Keyword arguments as in scipy.stats matching the distribution with name “name”.

cdf(x)¶

Cumulative distribution function.

Parameters: x (float) – Cumulative distribution function at x.
Returns: density – Cumulative distribution function at x.
Return type: float

copy()¶

Copy the random variable.

Returns: copied_rv – A copy of the random variable.
Return type: RVBase

distribution¶: the scipy.stats. … distribution object

static from_dictionary(dictionary: dict) → util.random_variables.RV¶

Construct random variable from dictionary.

Parameters

dictionary (dict) –

A dictionary with the keys

“name” (mandatory)

“args” (optional)

“kwargs” (optional)

as in scipy.stats.

Note

Either the “args” or the “kwargs” key has to be present.

pdf(x)¶

Probability density function.

Parameters: x (float) – Probability density at x.
Returns: density – Probability density at x.
Return type: float

pmf(x)¶

Probability mass function.

Parameters: x (int) – Probability mass at x.
Returns: mass – The mass at x.
Return type: float

rvs()¶

Sample from the RV.

Returns: sample – A sample from the random variable.
Return type: float

class abcsmc.RVBase¶

Bases: abc.ABC

Random variable abstract base class.

Note

The reason we introduced another random variable class is that scipy.stats distributions are not pickleable. This class is a thin wrapper around scipy.stats distributions to make them pickleable. It is important to be able to pickle them to execute the ABCSMC algorithm in a distributed cluster environment.

abstract cdf(x: float) → float¶

Cumulative distribution function.

Parameters: x (float) – Point at which the cumulative distribution function is evaluated.
Returns: density – Value of the cumulative distribution function at x.
Return type: float

abstract copy() → util.random_variables.RVBase¶

Copy the random variable.

Returns: copied_rv – A copy of the random variable.
Return type: RVBase

abstract pdf(x: float) → float¶

Probability density function.

Parameters: x (float) – Point at which the probability density is evaluated.
Returns: density – Probability density at x.
Return type: float

abstract pmf(x) → float¶

Probability mass function.

Parameters: x (int) – Point at which the probability mass is evaluated.
Returns: mass – The mass at x.
Return type: float

abstract rvs() → float¶

Sample from the RV.

Returns: sample – A sample from the random variable.
Return type: float

class abcsmc.RVDecorator(component: util.random_variables.RVBase)¶

Bases: util.random_variables.RVBase

Random variable decorator base class.

Implement a decorator pattern.

Further decorators should derive from this class.

It stores the decorated random variable in self.component.

Overwrite the method decorator_repr to represent the decorator type. The decorated variable will then be automatically included in the call to __repr__.

Parameters: component (RVBase) – The random variable to be decorated.

cdf(x)¶

Cumulative distribution function.

Parameters: x (float) – Point at which the cumulative distribution function is evaluated.
Returns: density – Value of the cumulative distribution function at x.
Return type: float

component¶: The decorated random variable

copy()¶

Copy the random variable.

Returns: copied_rv – A copy of the random variable.
Return type: RVBase

decorator_repr() → str¶

Represent the decorator itself.

Template method.

The __repr__ method uses decorator_repr and the __repr__ of the decorated RV to build a combined representation.

Returns: decorator_repr – A string representing the decorator only.
Return type: str
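The decorator pattern and the decorator_repr template method can be illustrated with a minimal sketch. All class names here (ConstantRV, DecoratorSketch, ClipAtZero) are hypothetical, and the combined representation format is an assumption; the actual output format of RVDecorator.__repr__ is not specified in this reference.

```python
class ConstantRV:
    """Toy random variable that always returns the same value."""

    def __init__(self, value):
        self.value = value

    def rvs(self):
        return self.value

    def __repr__(self):
        return f"ConstantRV({self.value})"


class DecoratorSketch:
    """Forward calls to the wrapped variable stored in self.component."""

    def __init__(self, component):
        self.component = component

    def rvs(self):
        return self.component.rvs()

    def decorator_repr(self):
        # Template method: subclasses override this to name themselves.
        return "Decorator"

    def __repr__(self):
        # Assumed format: decorator part followed by the decorated RV.
        return f"[{self.decorator_repr()}] {self.component!r}"


class ClipAtZero(DecoratorSketch):
    """Hypothetical decorator that replaces negative samples by 0."""

    def rvs(self):
        return max(0.0, self.component.rvs())

    def decorator_repr(self):
        return "ClipAtZero"


clipped = ClipAtZero(ConstantRV(-1.5))
```

Only rvs and decorator_repr are overridden in the subclass; __repr__ stays in the base class and picks up the subclass name through the template method.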

pdf(x)¶

Probability density function.

Parameters: x (float) – Point at which the probability density is evaluated.
Returns: density – Probability density at x.
Return type: float

pmf(x)¶

Probability mass function.

Parameters: x (int) – Point at which the probability mass is evaluated.
Returns: mass – The mass at x.
Return type: float

rvs()¶

Sample from the RV.

Returns: sample – A sample from the random variable.
Return type: float

class abcsmc.RangeEstimatorDistanceFunction(measures_to_use='all')¶

Bases: abcsmc.distance_functions.DistanceFunctionWithMeasureList

Abstract base class for distance functions whose estimate is based on a range.

It defines the two template methods lower and upper.

Hence

\[d(x, y) = \sum_{i \in \text{measures}} \left | \frac{x_i - y_i}{u_i - l_i} \right |,\]

where \(l_i\) and \(u_i\) are the lower and upper margins for measure \(i\).
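As a worked example of this formula, the sketch below assumes that the lower and upper margins are simply the minimum and maximum of the prior samples. That choice, and the helper names calibrate and range_distance, are assumptions for illustration; concrete subclasses define their own lower and upper.

```python
def lower(values):
    # Assumption for this sketch: the minimum is the lower margin.
    return min(values)


def upper(values):
    # Assumption for this sketch: the maximum is the upper margin.
    return max(values)


def calibrate(sample_from_prior, measures):
    """Compute the margins (l_i, u_i) per measure from prior samples."""
    margins = {}
    for key in measures:
        values = [sample[key] for sample in sample_from_prior]
        margins[key] = (lower(values), upper(values))
    return margins


def range_distance(x, y, margins):
    """Sum of absolute differences, each scaled by its range u_i - l_i."""
    return sum(
        abs((x[key] - y[key]) / (u - l)) for key, (l, u) in margins.items()
    )


prior_samples = [
    {"a": 0.0, "b": 10.0},
    {"a": 4.0, "b": 30.0},
    {"a": 2.0, "b": 20.0},
]
margins = calibrate(prior_samples, ["a", "b"])
# |1 - 3| / (4 - 0) + |25 - 20| / (30 - 10) = 0.5 + 0.25
d = range_distance({"a": 1.0, "b": 25.0}, {"a": 3.0, "b": 20.0}, margins)
```

Scaling by the range keeps measures with large numeric values from dominating the sum. The sketch does not guard against a measure whose samples are all equal, which would make the range zero.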

__call__(x, y)¶

Abstract method. This method has to be overwritten by all concrete implementations.

Evaluate the distance of the tentatively sampled particles relative to the measured data.

Parameters

x (dict) – Summary statistics of the tentatively sampled parameter.
y (dict) – Summary statistics of the measured data.

Returns

distance – Distance of the tentatively sampled particles to the measured data.

Return type

float

get_config()¶

Return configuration of the distance function.

Returns: config – Dictionary describing the distance function.
Return type: dict

initialize(sample_from_prior)¶

This method is called by the ABCSMC framework before the first usage of the distance function and can be used to calibrate it to the statistics of the samples.

Per default, no calibration is made.

Parameters: sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.

static lower(parameter_list: List[float])¶

Calculate the lower margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: lower_margin – The lower margin of the range calculated from these parameters.
Return type: float

static upper(parameter_list: List[float])¶

Calculate the upper margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: upper_margin – The upper margin of the range calculated from these parameters.
Return type: float

class abcsmc.SQLDataStore(db: str)¶

Bases: object

SQL data store for the ABCLoader class.

Parameters: db (str) – SQLAlchemy connection string. E.g.: sqlite:////home/user/my_database.db.

class abcsmc.ZScoreDistanceFunction(measures_to_use='all')¶

Bases: abcsmc.distance_functions.DistanceFunctionWithMeasureList

Calculate the distance as the sum of ZScores over the selected measures. The measured data serve as the reference for the ZScore.

Hence

\[d(x, y) = \sum_{i \in \text{measures}} \left| \frac{x_i-y_i}{y_i} \right|.\]
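A direct transcription of this formula might look as follows. The function name zscore_distance and the explicit measures argument are assumptions for illustration, not the class's actual interface.

```python
def zscore_distance(x, y, measures):
    """Sum over measures of |x_i - y_i| relative to the measured value y_i."""
    return sum(abs((x[key] - y[key]) / y[key]) for key in measures)


simulated = {"a": 3.0, "b": 8.0}
measured = {"a": 2.0, "b": 10.0}
# |3 - 2| / 2 + |8 - 10| / 10 = 0.5 + 0.2
d = zscore_distance(simulated, measured, ["a", "b"])
```

Because each term divides by the measured value, the formula is undefined for a measure whose measured value is zero; the sketch does not handle that case.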

__call__(x, y)¶

Abstract method. This method has to be overwritten by all concrete implementations.

Evaluate the distance of the tentatively sampled particles relative to the measured data.

Parameters

x (dict) – Summary statistics of the tentatively sampled parameter.
y (dict) – Summary statistics of the measured data.

Returns

distance – Distance of the tentatively sampled particles to the measured data.

Return type

float
