API reference

pysco

pysco.utils

pysco.utils.find_files(dir, extension)[source]

Find all files in a directory with a given extension.

Parameters:

dir (str) – Directory to search for files.
extension (str) – File extension to search for.

Returns:

files – List of files found in the directory with the given extension.

Return type:

list
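
As a rough illustration of the behaviour described above, a helper of this kind could be written with pathlib. This is a sketch, not pysco's implementation: the name find_files_sketch, the non-recursive search, the sorted output, and the acceptance of extensions with or without a leading dot are all assumptions made for the example.

```python
from pathlib import Path


def find_files_sketch(dir, extension):
    """Hypothetical stand-in for pysco.utils.find_files (non-recursive)."""
    # Accept both "txt" and ".txt" (an assumption of this sketch).
    suffix = extension if extension.startswith(".") else "." + extension
    # Keep regular files whose suffix matches; sort for a stable order.
    return sorted(
        str(p) for p in Path(dir).iterdir()
        if p.is_file() and p.suffix == suffix
    )
```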

pysco.utils.generate_symbols(N=26)[source]

Generate symbols from 'a' to 'z', then 'aa' to 'zz', and so on, until N symbols have been generated.

Parameters:: N (int) – The number of symbols to generate.
Returns:: all_symbols – A list of generated symbols.
Return type:: list
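
The naming scheme described above can be sketched with itertools. The function name generate_symbols_sketch is hypothetical, and the code only assumes the ordering stated in the description (all one-letter symbols, then all two-letter symbols, and so on).

```python
from itertools import count, islice, product
from string import ascii_lowercase


def generate_symbols_sketch(N=26):
    """Hypothetical stand-in: 'a'..'z', then 'aa'..'zz', and so on."""
    def all_symbols():
        # Lengths 1, 2, 3, ...: every lowercase combination in order.
        for length in count(1):
            for letters in product(ascii_lowercase, repeat=length):
                yield "".join(letters)

    # Take only the first N symbols of the infinite stream.
    return list(islice(all_symbols(), N))
```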

pysco.utils.get_free_gpus(n_gpus=1)[source]

Get the IDs of free GPUs (those with load below 1% and memory usage below 1%).

Parameters:: n_gpus (int) – Number of free GPUs to return.
Returns:: free_gpus – List of IDs of free GPUs.
Return type:: list

pysco.utils.remove_directory(path)[source]

Removes a directory and all its contents.

Parameters:: path (str) – The path to the directory to remove.

pysco.utils.remove_files(directory, files)[source]

Removes a list of files from a directory.

Parameters:

directory (str) – The directory containing the files to remove.
files (list) – List of files to remove.

pysco.utils.reorder_dict(dict_in, ordered_keys)[source]

Reorder the keys of a dictionary.

Parameters:

dict_in (dict) – The dictionary to reorder.
ordered_keys (list) – The keys of the dictionary in the desired order.

Returns:

dict_out – The reordered dictionary.

Return type:

dict
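
Because Python dictionaries preserve insertion order, reordering amounts to rebuilding the dictionary in the requested key order. The sketch below uses a hypothetical name, reorder_dict_sketch, and assumes that keys absent from ordered_keys are dropped and that an unknown key raises KeyError; the real function may handle those cases differently.

```python
def reorder_dict_sketch(dict_in, ordered_keys):
    """Hypothetical stand-in for pysco.utils.reorder_dict."""
    # Dicts keep insertion order (Python 3.7+), so rebuilding the
    # dict in the requested key order is enough.
    return {key: dict_in[key] for key in ordered_keys}
```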

pysco.utils.timeit(function)[source]: Time the execution of a decorated function.
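
A timing decorator of this kind is commonly written as follows. This is a generic sketch under the assumption that the elapsed wall-clock time is printed; the name timeit_sketch and the message format are illustrative and may not match pysco's output.

```python
import functools
import time


def timeit_sketch(function):
    """Hypothetical timing decorator; prints the elapsed wall time."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = function(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{function.__name__} took {elapsed:.6f} s")
        # The decorated function's return value is passed through unchanged.
        return result
    return wrapper


@timeit_sketch
def add(a, b):
    return a + b
```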

pysco.plots

pysco.plots.plot

pysco.plots.plot.CMAPS = {'div': <matplotlib.colors.LinearSegmentedColormap object>, 'div_r': <matplotlib.colors.LinearSegmentedColormap object>, 'seq': <matplotlib.colors.LinearSegmentedColormap object>, 'seq_r': <matplotlib.colors.LinearSegmentedColormap object>}: Continuous colormaps derived from the sequential and diverging palettes.

pysco.plots.plot.PALETTES = {'cat': ['#1B6BBF', '#E8832A', '#7B3FA0', '#2A9E8F', '#C94040', '#F0B429', '#5AAADE', '#A0522D'], 'colors10': ['#3f90da', '#ffa90e', '#bd1f01', '#94a4a2', '#832db6', '#a96b59', '#e76300', '#b9ac70', '#717581', '#92dadd'], 'colors6': ['#5790fc', '#f89c20', '#e42536', '#964a8b', '#9c9ca1', '#7a21dd'], 'colors8': ['#1845fb', '#ff5e02', '#c91f16', '#c849a9', '#adad7d', '#86c8dd', '#578dff', '#656364'], 'div': ['#033270', '#0A85B8', '#A8D8EA', '#A8EDE6', '#06CFC2', '#04E8C8'], 'fancy': ['#1c161c', '#324b58', '#088395', '#8ac5ad', '#a1f5a8'], 'seq': ['#033270', '#0B5896', '#0A85B8', '#08ACBE', '#06CFC2', '#04E8C8']}: All built-in colour palettes, keyed by short name.

pysco.plots.plot.chainplot(dfs, names=None, columns=None, truths=None, colors=None, savefig=None, ls=None, padding=(4.0, 2.5), n_ticks=4, chain_kwargs=None, chainconfig_kwargs=None, legend_kwargs=None, plotconfig_kwargs=None)[source]

Plot MCMC chains using ChainConsumer with adaptive styling.

Parameters:

dfs (list, dict, or DataFrame) – Samples to plot. Accepts a single DataFrame, a list of DataFrames, or a {name: DataFrame} dictionary.
names (list of str, optional) – Display name for each chain. When dfs is a dict the keys are used.
columns (None, int, or list, optional) – Which columns to plot. None → all, int → first n, list → explicit names, list[list] → per-chain selections.
truths (dict, optional) – True parameter values passed to chainconsumer.Truth.
colors (str or list, optional) – Colour(s) for the chains. A palette name ('cat', 'colors6', 'colors10', …), a single colour string, or an explicit list. Default is 'cat' (8-colour CVD-safe).
savefig (str or path-like, optional) – If given, save the figure to this path.
ls (str, optional) – Linestyle applied to every chain. None cycles ['-', '--', '-.', ':'].
padding (float or tuple of float, optional) – Extra padding (in points) between tick labels and axis labels, passed directly to ax.xaxis.labelpad / ax.yaxis.labelpad. A scalar applies to both axes; a 2-tuple sets (xpad, ypad). Default is (4.0, 2.5).
n_ticks (int, optional) – Maximum number of major ticks per subplot axis. A smaller value (e.g. 3) gives more breathing room between tick labels; a larger value (e.g. 5) adds more detail. Default is 4.
chain_kwargs (dict, optional) – Extra keyword arguments forwarded to chainconsumer.Chain.
chainconfig_kwargs (dict, optional) – Overrides for chainconsumer.ChainConfig.
legend_kwargs (dict, optional) – Overrides for the matplotlib legend.
plotconfig_kwargs (dict, optional) – Overrides for chainconsumer.PlotConfig.

Returns:

C (chainconsumer.ChainConsumer) – The configured ChainConsumer object.
fig (matplotlib.figure.Figure) – The resulting figure.

pysco.plots.plot.check_latex()[source]: Enable LaTeX rendering in matplotlib if a LaTeX installation is found on the system PATH, otherwise disable it.

pysco.plots.plot.corner(*args, **kwargs)[source]

Produce a corner plot via corner.corner() with adaptive styling.

Accepts all keyword arguments of corner.corner() plus the pysco-specific keys documented in custom_corner().

pysco.plots.plot.custom_color_cycle(colors='colors6', linestyles=['-'], skip=0, lsfirst=False)[source]

Configure the matplotlib color (and optionally linestyle) cycle.

Parameters:

colors (str or list, optional) – Either the name of a built-in color list ('colors6', 'colors8', 'colors10', 'fancy' — all sourced from arXiv:2107.02270 except 'fancy') or an explicit list of matplotlib color strings. Default is 'colors6'.
linestyles (list or int, optional) – Linestyles to include in the cycle. Pass an int to take the first n styles from ['-', '--', '-.', ':']. Default is ['-'].
skip (int, optional) – Number of colors to skip from the beginning of the list. Useful when the first color conflicts with other plot elements. Default is 0.
lsfirst (bool, optional) – If True, the linestyle axis cycles faster than color (i.e. cycler(color) * cycler(linestyle)). If False (default), color cycles faster.
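
The effect of lsfirst comes down to the order of the two factors in a cycle product. The sketch below uses itertools.product, whose ordering (the right-hand factor varies fastest) matches that of a cycler product; the helper name cycle_order_sketch and the example colors are hypothetical.

```python
from itertools import product


def cycle_order_sketch(colors, linestyles, lsfirst=False):
    """Return (color, linestyle) pairs in the order they would cycle."""
    if lsfirst:
        # color * linestyle: the right-hand factor (linestyle) varies fastest.
        return [(c, ls) for c, ls in product(colors, linestyles)]
    # linestyle * color: color varies fastest.
    return [(c, ls) for ls, c in product(linestyles, colors)]


colors = ["C0", "C1"]
styles = ["-", "--"]
print(cycle_order_sketch(colors, styles, lsfirst=False))
print(cycle_order_sketch(colors, styles, lsfirst=True))
```

With lsfirst=False every color is used once before the linestyle changes; with lsfirst=True every linestyle is used once before the color changes.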

pysco.plots.plot.custom_corner(function)[source]

Decorator that wraps a corner-plot function with adaptive styling.

Temporarily applies gently scaled rcParams (based on the number of parameters) inside a try/finally block so the caller’s rcParams are always restored, even if the wrapped function raises.

Pysco-specific keyword arguments consumed by the wrapper (not forwarded to the underlying corner function):

savefig – path-like; save the figure to this file.
n_ref – int; reference parameter count for scaling (default 5).
custom_rc – bool; apply corner-specific rcParam overrides (default True).
histtype – str; histogram type for 1-D marginals (default 'step').
histalpha – float; face-colour alpha for 1-D histograms (default 0.1).

Parameters:: function (callable) – The function that produces the corner plot (receives *args, **kwargs).
Returns:: Wrapped function.
Return type:: callable
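
The restore-on-exit pattern described above can be sketched as follows. To keep the example free of dependencies, a plain dictionary RC stands in for matplotlib's rcParams; the scaling rule and the n_params key are purely illustrative assumptions, not pysco's actual behaviour.

```python
import functools

# Stand-in for matplotlib.rcParams so the sketch runs without matplotlib.
RC = {"font.size": 10.0, "axes.linewidth": 1.0}


def custom_corner_sketch(function):
    """Hypothetical decorator: scale settings, call, always restore."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        # Wrapper-only keys are popped so they are not forwarded.
        n_ref = kwargs.pop("n_ref", 5)
        n_params = kwargs.pop("n_params", n_ref)  # hypothetical key
        saved = dict(RC)
        try:
            # Illustrative scaling rule only (an assumption of this sketch).
            scale = (n_params / n_ref) ** 0.5
            RC["font.size"] = saved["font.size"] * scale
            return function(*args, **kwargs)
        finally:
            # Runs on success and on error alike.
            RC.clear()
            RC.update(saved)
    return wrapper


@custom_corner_sketch
def failing_plot(**kwargs):
    raise RuntimeError("plot failed")
```

Calling failing_plot() raises RuntimeError, yet RC holds its original values afterwards because the restore step sits in the finally clause.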

pysco.plots.plot.default_plotting(style='light', backcolor=None, frontcolor=None)[source]

Set the default plotting parameters for matplotlib.

Parameters:

style (str) – The style of the plot, either 'light' or 'dark'. Default is 'light'.
backcolor (str) – The background color of the plot. Default is 'white'.
frontcolor (str) – The foreground color of the plot. Default is 'black'.

pysco.plots.plot.get_cmap(name='seq')[source]

Return one of the pysco colormaps by short name.

Parameters:: name (str, optional) – One of 'seq', 'seq_r', 'div', 'div_r'. Default is 'seq'.
Return type:: matplotlib.colors.LinearSegmentedColormap

pysco.plots.plot.get_colors_from_cmap(N, cmap='viridis', reverse=False)[source]

Sample N evenly-spaced colors from a named matplotlib colormap.

Colors are drawn from the upper 70 % of the colormap range ([0.3, 1.0]) to avoid overly light shades.

Parameters:

N (int) – Number of colors to return.
cmap (str, optional) – Name of the matplotlib colormap to sample. Default is 'viridis'.
reverse (bool, optional) – If True, return the colors in reversed order. Default is False.

Returns:

Array of shape (N, 4) containing RGBA colors.

Return type:

numpy.ndarray

pysco.plots.plot.get_colorslist(colors='colors6')[source]

Return a list of hex color strings for one of the built-in palettes.

Parameters:

colors (str, optional) –

Key identifying the palette. One of:

'colors6' — 6-color palette (arXiv:2107.02270)
'colors8' — 8-color palette (arXiv:2107.02270)
'colors10' — 10-color palette (arXiv:2107.02270)
'fancy' — a small personal palette
'cat' — CVD-safe categorical (8 colours)
'seq' — sequential anchor colours (navy → turquoise)
'div' — diverging anchor colours (navy ↔ turquoise)

Default is 'colors6'.

Returns:

Hex color strings for the requested palette.

Return type:

list of str

pysco.plots.plot.reset_rc()[source]: Reset all matplotlib rcParams to their default values.

pysco.plots.plot.set_color_cycle_from_cmap(cmap=None)[source]

Set the matplotlib color cycle from a colormap object.

Parameters:: cmap (matplotlib.colors.Colormap or None, optional) – A colormap whose .colors attribute is used as the new cycle. If None, the default matplotlib color cycle is restored.

pysco.plots.plot.set_colors(backcolor='white', frontcolor='black')[source]

Apply background and foreground colors to all relevant matplotlib rcParams.

This updates axes, tick, legend, grid, and text colors in one call so that the entire figure adopts a consistent color scheme.

Parameters:

backcolor (str, optional) – Background color applied to the figure, axes, and legend. Accepts any matplotlib color string (name, hex, 'none', …). Default is 'white'.
frontcolor (str, optional) – Foreground color applied to text, labels, edges, ticks, grid lines, and legend entries. Default is 'black'.

pysco.plots.plot.set_ticker()[source]

Create a ScalarFormatter that respects the current LaTeX rendering setting.

Note: the formatter is returned but not applied to any axis; assign it explicitly with ax.xaxis.set_major_formatter(set_ticker()).

Returns:: Formatter instance configured to match text.usetex.
Return type:: ticker.ScalarFormatter

pysco.plots.plot.to_pandas(samples, labels)[source]

Converts the samples to a pandas DataFrame.

Parameters:

samples (array) – The samples to convert.
labels (list) – The labels for the samples.

Returns:

df (DataFrame) – The samples as a pandas DataFrame.

pysco.plots.plot.which_corner()[source]: Print the file path of the corner package currently in use.

pysco.plots.journals

pysco.plots.journals.get_style(style, journal='prd', cols='onecol', aspect=1.618033988749895)[source]

Return a style list for use with plt.style.context().

Parameters:

style (str) – Style name, e.g. 'paper'.
journal (str) – Journal name, must be a key in journal_sizes.
cols (str) – Column key, e.g. 'onecol' or 'twocol'.
aspect (float) – Height = width / aspect. Defaults to the golden ratio.

Returns:

Two-element list: [style_file_or_name, {"figure.figsize": (width, height)}].

Return type:

list[str | dict]
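
The figure size follows from the column width and the aspect ratio through height = width / aspect. The sketch below shows only that arithmetic; the widths in journal_sizes_sketch are hypothetical placeholders, not the values pysco stores in journal_sizes.

```python
GOLDEN_RATIO = (1 + 5 ** 0.5) / 2  # approximately 1.618

# Hypothetical widths in inches; the real journal_sizes table may differ.
journal_sizes_sketch = {"prd": {"onecol": 3.4, "twocol": 7.0}}


def figsize_sketch(journal="prd", cols="onecol", aspect=GOLDEN_RATIO):
    """Return (width, height) with height = width / aspect."""
    width = journal_sizes_sketch[journal][cols]
    return (width, width / aspect)
```

A larger aspect gives a flatter figure: with aspect=2.0 the height is exactly half the width.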

pysco.eryn

class pysco.eryn.AutoCorrelationStopping(*args: Any, **kwargs: Any)[source]

Bases: Stopping

get_N_from_ess(nw, tau)[source]

Get the number of samples from the effective sample size.

Parameters:

nw (int) – Number of walkers.
tau (float) – Auto-correlation time.

Returns:

Target number of samples.

Return type:

int

class pysco.eryn.DiagnosticPlotter(sampler, path, truths, labels, plot_kwargs={}, plot_all_temps=False, transform_all=None, true_logl=None, discard=0.3, suffix='', converter=None)[source]

Bases: object

A class for generating diagnostic plots based on the state of the sampler.

Attributes:

sampler – The sampler object to use for the diagnostic plots.
path – The path to save the diagnostic plots.
truths – A dictionary containing the true values for each branch.
labels – A dictionary containing the labels for each parameter in each branch.
transform_all – A transform_container object containing the transformation functions for each branch.
true_logl – The true log likelihood value.
discard – The fraction of iterations to discard before plotting.
suffix – A suffix to append to the filenames of the diagnostic plots.
converter – A callable object to convert the samples to a different basis.

Methods:

setup(sampler) – Set up the diagnostic plotter with the given sampler.
__call__(**kwargs) – Perform various plotting operations based on the current state of the sampler.
plot_corners(samples, logl, trace=True, **kwargs) – Plot corner plots and trace plots for the given samples.
plot_acceptance() – Plot the acceptance fraction for each update step in the MCMC sampling process.
plot_leaves_hist() – Plot the histogram of the number of leaves for each temperature.

plot_acceptance()[source]

Plot the acceptance fraction for each update step in the MCMC sampling process. If the has_rj attribute is True, it also plots the RJ acceptance fraction.

Returns: None

plot_act_evolution(N=10, all_T=False, **kwargs)[source]

Plot the auto-correlation time evolution.

Parameters:

N (int) – Number of points to plot.
all_T (bool) – Whether to plot for all temperatures or just one.
**kwargs – Additional keyword arguments to pass to the get_integrated_act function.

Returns: None

plot_corners(samples, logl, trace=True, covs=True, **kwargs)[source]

Plot corner plots and trace plots for the given samples.

Parameters:

samples (dict) – A dictionary containing the samples for each branch.
logl (array-like) – The log likelihood values.
trace (bool, optional) – Whether to plot trace plots. Defaults to True.
covs (bool, optional) – Whether to plot the diagonal elements of the covariance matrix of the samples. Defaults to True.
**kwargs – Additional keyword arguments to be passed to the corner plot function.

Returns: None

plot_leaves_hist()[source]

Plot the histogram of the number of leaves for each temperature.

This method plots a histogram of the number of leaves for each temperature in the rj_branches dictionary. It uses the sampler object to get the number of leaves for each temperature. The histogram is plotted using the plt.hist function from the matplotlib.pyplot module. The plot includes temperature-specific colors and a legend for the colors.

Returns: None

plot_logl_betas(betas, logl)[source]

Plots the evolution of log-likelihood values for each temperature.

Parameters:

betas (numpy.ndarray) – Array of inverse temperatures.
logl (numpy.ndarray) – Array of log-likelihood values.

Returns:

None

plot_logl_evolution(logl)[source]

Plots the evolution of log-likelihood values.

Parameters:: logl (numpy.ndarray) – Array of log-likelihood values.
Returns:: None

setup(sampler)[source]

Set up the diagnostic plotter with the given sampler.

Parameters:: sampler – The sampler object to use for the diagnostic plots.

Returns: None

class pysco.eryn.GelmanRubinStopping(*args: Any, **kwargs: Any)[source]: Bases: Stopping

class pysco.eryn.ImportanceSampler(current_backend, target_likelihood, likelihood_kwargs={}, savename='importance_sampler.pkl')[source]

Bases: object

compute_target_probabilities(samples, groups)[source]

Compute the weights for the importance sampling.

Parameters:

samples (list) – Current samples.
groups (list) – Groups of indices for the active leaves.

Returns:

Array of weights.

Return type:

array

compute_weights(current_logL, target_logL)[source]

Compute the weights for the importance sampling.

Parameters:

current_logL (array) – Array of log likelihood values for the current samples.
target_logL (array) – Array of log likelihood values for the target samples.

Returns:

Array of weights.

Return type:

array
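
For reference, the standard importance-sampling weight for a sample is proportional to exp(target_logL - current_logL). The sketch below computes normalised weights in a numerically stable way; it assumes that both distributions share the same prior (so only the likelihood ratio matters) and that the weights are normalised to sum to one, neither of which is confirmed here for pysco. The function name is hypothetical.

```python
import math


def importance_weights_sketch(current_logL, target_logL):
    """Normalised weights w_i proportional to exp(target_i - current_i)."""
    log_w = [t - c for c, t in zip(current_logL, target_logL)]
    # Subtract the maximum before exponentiating to avoid overflow.
    shift = max(log_w)
    unnorm = [math.exp(lw - shift) for lw in log_w]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

If the target and current log likelihoods coincide, every sample receives the same weight.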

property current_backend

property current_information

evaluate()[source]

Evaluate the importance sampling weights.

Returns:

weights (array) – Array of weights.
target_logL (array) – Array of the target log likelihood evaluated at the current samples.

fetch_current_information(ess=10000.0, discard=None, thin=None)[source]

Fetch the relevant information contained in the current backend.

Parameters:

ess (float) – Effective sample size. Default is 1e4. It is used if either discard or thin is None.
discard (int) – Number of samples to discard. Default is None.
thin (int) – Thinning factor. Default is None.

Returns:

Dictionary containing the current samples, indices, number of leaves, log posterior and log likelihood.

Return type:

dict

resample(weights, target_logL)[source]

Resample the current samples using the importance sampling weights as probabilities.

Parameters:

weights (array) – Array of weights.
target_logL (array) – Array of the target log likelihood evaluated at the current samples.

Returns:

new_samples (dict) – Dictionary containing the resampled samples.
new_inds (dict) – Dictionary containing the resampled indices.
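
Resampling with the weights used as probabilities amounts to drawing indices with replacement, each index being picked with probability proportional to its weight. The sketch below works on a flat list rather than on the per-branch dictionaries used by the class; the function name, the n_draws and seed arguments, and the return layout are assumptions made for the example.

```python
import random


def resample_sketch(samples, weights, n_draws=None, seed=None):
    """Draw indices with probability proportional to weights (with replacement)."""
    rng = random.Random(seed)
    n_draws = len(samples) if n_draws is None else n_draws
    new_inds = rng.choices(range(len(samples)), weights=weights, k=n_draws)
    new_samples = [samples[i] for i in new_inds]
    return new_samples, new_inds
```

Samples with zero weight never appear in the output, while heavily weighted samples tend to be repeated.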

run(ess=10000.0, backend='./importance_resampled_backend.h5')[source]

Run the importance sampler.

Parameters:

ess (float) – Effective sample size. Default is 1e4.
backend (str) – Name of the file to save the target backend.

Returns:

Dictionary containing the resampled samples.

Return type:

dict

save()[source]: Dump the ImportanceSampler object to a pickle file.

save_target_backend(backend='./importance_resampled_backend.h5')[source]

Save the target backend to a h5 file.

Parameters:: backend (str) – Name of the file to save the target backend.

property target_backend

update_target_backend(new_samples, new_inds, new_logp, new_logL)[source]

Update the target backend with the new samples.

Parameters:

new_samples (dict) – Dictionary containing the resampled samples.
new_inds (dict) – Dictionary containing the resampled indices.
new_logp (array) – Array of log prior values for the resampled samples.
new_logL (array) – Array of log likelihood values for the target samples.

class pysco.eryn.PPplotter(truths, true_logL=None)[source]

Bases: object

compute_all_ratios(all_logLs, all_samples=None, all_weights=None)[source]

Compute the ratio for each set of log-likelihood values in all_logLs.

Parameters:

all_logLs (list of np.ndarray) – List of log-likelihood values for each sample set.
all_samples (list of np.ndarray or pd.DataFrame, optional) – List of samples from the posterior distribution for each log-likelihood.
all_weights (list of np.ndarray, optional) – List of weights for each sample set. If None, equal weights are assumed.

Returns:

None

compute_ratio(logL, samples=None, weights=None)[source]

Compute the ratio between the number of samples whose log-likelihood is greater than the true log-likelihood and the total number of samples.

Parameters:

logL (np.ndarray) – Log-likelihood values at each sample.
samples (np.ndarray, optional) – The samples from the posterior distribution.
weights (np.ndarray, optional) – Weights for each sample. If None, equal weights are assumed.

Returns:

ratio – The ratio of samples with log-likelihood greater than the true log-likelihood.

Return type:

float
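
The ratio described above can be sketched in a few lines. The function name and the explicit true_logL argument are hypothetical (the class stores the true log-likelihood itself), and the weighted form, in which each sample counts by its weight, is an assumption about how the weights enter.

```python
def compute_ratio_sketch(logL, true_logL, weights=None):
    """Weighted fraction of samples with logL greater than true_logL."""
    if weights is None:
        # Equal weights: the ratio reduces to a simple count fraction.
        weights = [1.0] * len(logL)
    above = sum(w for value, w in zip(logL, weights) if value > true_logL)
    return above / sum(weights)
```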

distance(samples, weights=None)[source]

Calculate the distance from the truths for each sample.

Parameters:

samples (np.ndarray) – The samples from the posterior distribution.
weights (np.ndarray, optional) – Weights for each sample. If None, equal weights are assumed.

Returns:

distances – The distances of each sample from the truths.

Return type:

np.ndarray

plot(significance=0.95, fig=None, ax=None, N_draws=0, **kwargs)[source]

Plot the posterior predictive plot.

Parameters:

significance (float, optional) – The significance level for the posterior predictive plot. Default is 0.95.
fig (matplotlib.figure.Figure, optional) – The figure to plot on. If None, a new figure is created.
ax (matplotlib.axes.Axes, optional) – The axes to plot on. If None, a new axes is created.
N_draws (int, optional) – The number of random realizations of empirical distributions to draw. Default is 0, which means no random draws.
**kwargs – Additional keyword arguments to pass to the plot.

Returns:

ax – The axes with the posterior predictive plot.

Return type:

matplotlib.axes.Axes

property ratios

property truths

class pysco.eryn.SamplesLoader(path, transform_fn=None)[source]

Bases: object

property backend

get_leaves(ess=10000.0)[source]

Get the number of leaves for each temperature.

Returns:: Dictionary containing the number of leaves for each temperature.
Return type:: dict

load(ess=10000.0, discard=None, thin=None, squeeze=False, leaves_to_ndim=False)[source]

Load the samples from the backend.

Parameters:

ess (float) – Effective sample size. Default is 1e4.
discard (int) – Number of samples to discard. Default is None.
thin (int) – Thinning factor. Default is None.
squeeze (bool) – Whether to ‘squeeze’ the samples. Default is False.
leaves_to_ndim (bool) – Whether to reshape the samples to have the shape (nsteps, nleaves*ndim). Default is False.

Returns:

If squeeze is True and there is only one branch:

samples_out (array) – 2D array containing the samples of the single branch.
logL (array) – 1D array containing the log likelihood.
logP (array) – 1D array containing the log posterior.

Otherwise:

samples_out (dict) – Dictionary containing the samples for each branch, the log likelihood and the log posterior.

make_dataframe(labels=None, samples=None, ess=10000.0, return_dict=False)[source]

Make a pandas DataFrame from the samples.

Parameters:

labels (dict) – Dictionary containing the labels for each parameter in each branch.
samples (dict) – Dictionary containing the samples for each branch. Default is None.
ess (float) – Effective sample size. Default is 1e4.
return_dict (bool) – Whether to return a dictionary of DataFrames or a single DataFrame. Default is False.

Returns:

Dictionary containing the pandas DataFrame for each branch.

Return type:

dict

property transform_fn

pysco.eryn.adjust_covariance(samp, discard=0.8, svd=False, lr=None, skip_idxs=[])[source]

Adjusts the covariance matrix for each branch in the given sample.

Parameters:

samp (Sampler) – The sampler object.
discard (float, optional) – The fraction of iterations to discard. Defaults to 0.8.
svd (bool, optional) – Whether to perform singular value decomposition on the covariance matrix. Defaults to False.
lr (float, optional) – The learning rate for the covariance matrix adjustment. Defaults to None. If None, the covariance matrix is computed from the samples; otherwise, the covariance matrix is adjusted by a factor of lr.
skip_idxs (list, optional) – List of indices to skip. Defaults to an empty list.

Returns: None

pysco.eryn.adjust_gamma0(samp, skip_idxs=[])[source]

Adjusts the gamma_0 parameter for the given sample.

Parameters:

samp (Sampler) – The sampler object.
skip_idxs (list, optional) – List of indices to skip. Defaults to an empty list.

Returns: None

pysco.eryn.arrange_inds(inds)[source]

Rearrange the indices in an array of dictionaries containing the indices at each step.

Parameters:: inds (dict) – Dictionary containing the indices at every step.
Returns:: inds_out – The rearranged indices.
Return type:: object

pysco.eryn.compute_discard_thin(backend, ess=10000.0, discard_multiplier=5, thin_multiplier=0.5)[source]

Compute the number of samples to discard and thin.

Parameters:

backend (object) – The backend object.
ess (float) – Effective sample size. Default is 1e4.
discard_multiplier (float) – Multiplier for the maximum auto-correlation time to compute the number of samples to discard. Default is 5.
thin_multiplier (float) – Multiplier for the minimum auto-correlation time to compute the thinning factor. Default is 0.5.

Returns:

discard (int) – Number of samples to discard. If ess is None, it is set to discard_multiplier times the maximum auto-correlation time (based on emcee).
thin (int) – Thinning factor. This is computed as thin_multiplier times the minimum auto-correlation time (based on emcee).
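
The rule stated above for the case where ess is None can be sketched as follows. The function takes the auto-correlation times directly instead of a backend, and the rounding (ceiling for discard, truncation with a floor of 1 for thin) is an assumption of this example; the ess-based branch is not reproduced.

```python
import math


def discard_thin_sketch(taus, discard_multiplier=5, thin_multiplier=0.5):
    """Hypothetical helper: burn-in and thinning from auto-correlation times."""
    # Burn-in: a multiple of the largest auto-correlation time.
    discard = int(math.ceil(discard_multiplier * max(taus)))
    # Thinning: a fraction of the smallest one, never less than 1.
    thin = max(1, int(thin_multiplier * min(taus)))
    return discard, thin
```

With auto-correlation times of 40, 60 and 100 steps and the default multipliers, this gives a burn-in of 500 steps and a thinning factor of 20.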

pysco.eryn.general_act(sampler, discard=None, all_T=False, return_max=True, act_kwargs={})[source]

Compute the generalised auto-correlation time for the given sampler.

Parameters:

sampler (object) – The sampler object.
discard (int) – The number of samples to discard. Default is None.
all_T (bool) – Whether to compute the auto-correlation time for all temperatures. Default is False.
return_max (bool) – Whether to return the maximum auto-correlation time. Default is True.
act_kwargs (dict) – Additional keyword arguments to pass to the auto-correlation time function.

Returns:

tau_all – If return_max is True, the maximum auto-correlation time across all the parameters (float); otherwise, the auto-correlation time for all the parameters (array-like).

pysco.eryn.get_clean_chain(coords, ndim, temp=0)[source]: Simple utility function to extract the squeezed chains for all the parameters.

pysco.eryn.get_groups_from_all_inds(inds)[source]

pysco.eryn.get_groups_from_inds_array(inds)[source]

pysco.eryn.get_integrated_act_wrap(samples, nleaves=None, average=True, fast=False)[source]

Compute the integrated auto-correlation time for RJ moves.

Parameters:

samples (dict) – The samples to compute the auto-correlation time for.
nleaves (dict) – The number of active leaves. This is only used for RJ branches. Default is None.
average (bool) – Whether to average the auto-correlation time across all dimensions. Default is True.
fast (bool) – Whether to use the fast method for computing the auto-correlation time. Default is False.

Returns:

tau – The integrated auto-correlation time for the samples.

Return type:

array

pysco.eryn.get_numpy(x)[source]