Running MCMC
Choosing a Sampler
The choice of sampler significantly affects both the quality of results and computational efficiency. Here’s guidance on when to use each:
Emcee (Ensemble MCMC) - Recommended for Most Cases
result = fitter.fit(params, sampler="emcee", nsteps=10000, nburn=2000, ...)
Use emcee when:
You need fast exploration of parameter space
The posterior is expected to be unimodal (single peak)
You’re doing quick preliminary fits or production runs
VegasAfterglow uses an optimized emcee configuration:
Custom proposal moves: DEMove (70%) + DESnookerMove (30%) for better mixing than default stretch move
Thread-based parallelism: Likelihood evaluations are parallelized via ThreadPoolExecutor with the GIL released during C++ computation
Automatic nwalkers: Optimized based on parameter count and CPU cores
Note
Parallelization: For emcee, set npool to the number of CPU cores. Each thread creates a Model and calls flux_density() with the GIL released during C++ computation, achieving near-native parallelism.
Dynesty (Nested Sampling) - For Model Comparison
result = fitter.fit(params, sampler="dynesty", nlive=1000, dlogz=0.5, ...)
Use dynesty when:
You need Bayesian evidence for model comparison
The posterior may be multimodal (multiple peaks)
You want to compare different physics configurations
VegasAfterglow uses optimized dynesty settings:
rslice sampling: More efficient than random walk for high dimensions
Thread pool: Uses Python threading (not multiprocessing) for better performance with C++ backend
Optimal queue size: Automatically calculated based on nlive and npool
Note
For dynesty, npool controls the number of parallel threads. If not specified, it defaults to the number of CPU cores on your machine.
Recommended Workflow
Quick exploration: Use emcee with moderate steps (5000-10000) to find approximate parameter region
Model comparison: Use dynesty when comparing different physics (forward shock only vs. with reverse shock, etc.)
Production run: Use emcee with longer chains (20000-50000 steps) for final parameter constraints
Fitter.fit() Interface
The Fitter.fit() method provides a unified interface for parameter estimation using different sampling algorithms via bilby.
Method Signature
def fit(
    self,
    param_defs: Sequence[ParamDef],      # Parameter definitions
    sampler: str = "emcee",              # Sampler algorithm
    resolution: Tuple = None,            # Override constructor resolution
    npool: int = None,                   # Number of parallel threads (default: all cores)
    top_k: int = 10,                     # Number of best fits to return
    outdir: str = "bilby_output",        # Output directory
    label: str = "afterglow",            # Run label
    clean: bool = True,                  # Clean up intermediate files
    resume: bool = False,                # Resume previous run
    log_likelihood_fn: Callable = None,  # Custom log-likelihood
    priors: dict = None,                 # Custom bilby priors (all samplers)
    **sampler_kwargs                     # Sampler-specific parameters
) -> FitResult
Common Parameters
param_defs: List of ParamDef objects defining parameters to fit
Required for all samplers
See “Parameter Definition” section for details
resolution: Optional tuple of (phi_res, theta_res, t_res)
Overrides the resolution set in the Fitter constructor for this run
If None, uses the constructor value (default: (0.1, 0.25, 10))
Example: (0.1, 1, 15) for higher accuracy
sampler: Sampling algorithm to use
"emcee": Affine-invariant MCMC ensemble sampler (recommended for speed)
"dynesty": Dynamic nested sampling (for Bayesian evidence)
"nestle": Alternative nested sampling implementation
"cpnest": Nested sampling with parallel tempering
"pymultinest": MultiNest nested sampling algorithm
"ultranest": Nested sampling with slice sampling
Other samplers supported by bilby - see the bilby documentation for details
npool: Number of parallel threads
Set to the number of CPU cores for best performance
Default: number of CPU cores
Example: npool=8 on an 8-core machine
top_k: Number of best-fit parameter sets to extract
Useful for exploring multiple local maxima
Default: 10
Results sorted by log probability (highest first)
outdir: Directory for output files
Default: "bilby_output"
Contains checkpoint files, result files, and plots
label: Run identifier
Default: "afterglow"
Used in output filenames: {label}_result.json
clean: Whether to remove intermediate files
Default: True
Set False to keep checkpoint files for inspection
resume: Resume from a previous run
Default: False
Requires matching outdir and label from the previous run
Emcee-Specific Parameters (sampler="emcee")
nsteps: Number of MCMC steps per walker
Default: 5000
More steps = better convergence but longer runtime
Typical range: 5000-50000
nburn: Number of burn-in steps to discard
Default: 1000
Should be long enough to reach equilibrium
Rule of thumb: 10-20% of nsteps
thin: Thinning factor (save every nth sample)
Default: 1 (save all samples)
Use to reduce autocorrelation in the chain
Example: thin=10 saves every 10th step
nwalkers: Number of ensemble walkers
Default: 2 * n_params (automatically set)
More walkers = better exploration but more computation
Minimum: 2 * n_params
Dynesty-Specific Parameters (sampler="dynesty")
nlive: Number of live points
Default: 500 (recommended: 1000 for production runs)
More points = better evidence estimate but slower
Typical range: 500-2000
dlogz: Evidence tolerance (stopping criterion)
Default: 0.1 (recommended: 0.5 for faster convergence)
Smaller values = more thorough sampling
Typical range: 0.01-0.5
sample: Sampling method
Default: "rwalk" (random walk)
Other options: "slice", "rslice", "unif"
"rwalk" works well for most problems
walks: Number of random walk steps
Default: 100
Only relevant for sample="rwalk"
maxmcmc: Maximum MCMC steps for slice sampling
Default: 5000
Only relevant for slice samplers
Other Samplers
VegasAfterglow supports all samplers available in bilby, including:
"nestle": Nested sampling (similar to dynesty)
"cpnest": Nested sampling with parallel tempering
"pymultinest": MultiNest algorithm (requires separate installation)
"ultranest": Advanced nested sampling with slice sampling
"ptemcee": Parallel-tempered MCMC
"kombine": Clustered KDE proposal MCMC
"pypolychord": PolyChord nested sampling
For sampler-specific parameters, refer to the bilby sampler documentation.
Pass sampler-specific parameters via **sampler_kwargs in the fit() method.
Example with alternative sampler:
# Using nestle sampler
result = fitter.fit(
    params,
    resolution=(0.1, 0.25, 10),
    sampler="nestle",
    nlive=1000,      # nestle-specific parameter
    method='multi',  # nestle-specific: 'classic', 'single', or 'multi'
    npool=8,
    top_k=10,
)
Return Value: FitResult
The method returns a FitResult object with the following attributes:
samples: Posterior samples array
Shape: [n_samples, 1, n_params]
In transformed space (log10 for LOG-scale parameters)
labels: Parameter names as used in sampling
LOG-scale parameters have a log10_ prefix
Example: ["log10_E_iso", "log10_Gamma0", "theta_c", "p", ...]
latex_labels: LaTeX-formatted labels for plotting
Example: ["$\\log_{10}(E_{\\rm iso})$", "$\\log_{10}(\\Gamma_0)$", "$\\theta_c$", ...]
Use these for corner plots and visualizations
top_k_params: Best-fit parameter values
Shape: [top_k, n_params]
Sorted by log probability (highest first)
In transformed space (log10 for LOG-scale parameters)
top_k_log_probs: Log probabilities for the top-k fits
Shape: [top_k]
Can convert to chi-squared via chi2 = -2 * log_prob
log_probs: Log probabilities for all samples
Shape: [n_samples, 1]
bilby_result: Full bilby Result object
Contains additional diagnostics and metadata
Use for advanced analysis and plotting
Access via: result.bilby_result.plot_corner()
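Because samples, top_k_params, and top_k_log_probs are stored in transformed space, a common first post-processing step is to map log10_-prefixed parameters back to physical values and convert log probabilities to chi-squared. Below is a minimal sketch of that step using mock arrays in place of a real FitResult (the label names match the example above; the numeric values are invented for illustration):

```python
import numpy as np

# Mock stand-ins for result.labels, result.top_k_params, result.top_k_log_probs
labels = ["log10_E_iso", "log10_Gamma0", "theta_c", "p"]
top_k_params = np.array([[52.0, 2.3, 0.10, 2.2],
                         [51.8, 2.4, 0.12, 2.3]])
top_k_log_probs = np.array([-15.2, -16.8])

# Undo the log10 transform for LOG-scale parameters only,
# identified by their "log10_" label prefix
physical = top_k_params.copy()
for j, name in enumerate(labels):
    if name.startswith("log10_"):
        physical[:, j] = 10.0 ** physical[:, j]

# chi2 = -2 * log_prob, as described above
chi2 = -2.0 * top_k_log_probs

print(physical[0, 0])  # E_iso of the best fit, now in physical units (1e52)
print(chi2[0])         # chi-squared of the best fit (30.4)
```

The same loop works on the flattened samples array for corner plots in physical space.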
Basic MCMC Execution
# Create fitter object and add data (see above for data loading examples)
fitter = Fitter(z=1.58, lumi_dist=3.364e28, jet="tophat", medium="ism")
# ... add data via fitter.add_flux_density(), fitter.add_spectrum(), etc.
# Option 1 (Recommended): Nested sampling with dynesty (computes Bayesian evidence, robust for multimodal posteriors)
result = fitter.fit(
    params,
    resolution=(0.1, 0.25, 10),  # Grid resolution (phi, theta, t)
    sampler="dynesty",           # Nested sampling algorithm
    nlive=1000,                  # Number of live points
    walks=100,                   # Number of random walks per live point
    dlogz=0.5,                   # Stopping criterion (evidence tolerance)
    npool=8,                     # Number of parallel threads
    top_k=10,                    # Number of best-fit parameters to return
)
# Option 2: MCMC with emcee (faster, good for unimodal posteriors, hard to converge for multimodal posteriors)
result = fitter.fit(
    params,
    resolution=(0.1, 0.25, 10),  # Grid resolution (phi, theta, t)
    sampler="emcee",             # MCMC sampler
    nsteps=50000,                # Number of steps per walker
    nburn=10000,                 # Burn-in steps to discard
    thin=1,                      # Save every nth sample
    npool=8,                     # Number of parallel threads
    top_k=10,                    # Number of best-fit parameters to return
)
Important Notes:
Parameters with Scale.log are sampled as log10_<name> (e.g., log10_E_iso)
The sampler works in log10 space for LOG-scale parameters, then transforms back to physical values
Use npool to parallelize likelihood evaluations across multiple CPU cores
result.latex_labels provides properly formatted labels for corner plots
For emcee, total samples = nwalkers * (nsteps - nburn) / thin
For dynesty, the sample count depends on convergence (not fixed)
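The emcee sample-count formula in the note above is plain arithmetic, so it is easy to budget chain length before a run. A quick sketch (the parameter values below are illustrative choices, not library defaults):

```python
# Illustrative emcee chain bookkeeping; no library calls involved.
n_params = 8
nwalkers = 2 * n_params  # the automatic default described above
nsteps = 50000           # steps per walker
nburn = 10000            # burn-in steps discarded per walker
thin = 10                # keep every 10th post-burn-in sample

# total samples = nwalkers * (nsteps - nburn) / thin
total_samples = nwalkers * (nsteps - nburn) // thin
print(total_samples)  # 64000
```

Running this backwards is equally useful: if you want ~100k posterior samples with thin=1, nwalkers=16 walkers need roughly nburn + 100000/16 steps each.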
Analyzing Results
Parameter Constraints
# Print best-fit parameters
print("Top-k parameters:")
header = f"{'Rank':>4s} {'chi^2':>10s} " + " ".join(f"{name:>10s}" for name in result.labels)
print(header)
print("-" * len(header))
for i in range(result.top_k_params.shape[0]):
    chi2 = -2 * result.top_k_log_probs[i]
    vals = " ".join(f"{val:10.4f}" for val in result.top_k_params[i])
    print(f"{i+1:4d} {chi2:10.2f} {vals}")
Model Predictions
# Generate model predictions with best-fit parameters
import numpy as np

t_model = np.logspace(2, 8, 200)
nu_model = np.array([1e9, 5e14, 2e17])  # Radio, optical, X-ray
# Light curves at specific frequencies
lc_model = fitter.flux_density_grid(result.top_k_params[0], t_model, nu_model)
# Spectra at specific times
nu_spec = np.logspace(8, 20, 100)
times_spec = [1000, 10000]
spec_model = fitter.flux_density_grid(result.top_k_params[0], times_spec, nu_spec)
# Frequency-integrated flux (broadband light curves)
# Useful for comparing with instruments like Swift/BAT, Fermi/LAT
from VegasAfterglow.units import band
flux_integrated = fitter.flux(result.top_k_params[0], t_model,
                              band=band("BAT"))
Visualization
import matplotlib.pyplot as plt
import corner
flat_chain = result.samples.reshape(-1, result.samples.shape[-1])
# Corner plot for parameter correlations
fig = corner.corner(
    flat_chain,
    labels=result.latex_labels,
    quantiles=[0.16, 0.5, 0.84],
    show_titles=True,
    title_kwargs={"fontsize": 12},
)
plt.savefig("corner_plot.png", dpi=300, bbox_inches='tight')
# Light curve comparison
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
colors = ['blue', 'orange', 'red']
for i, (nu, color) in enumerate(zip(nu_model, colors)):
    ax = axes[i]
    # Plot data (if available)
    # ax.errorbar(t_data, flux_data, flux_err, fmt='o', color=color)
    # Plot model
    ax.loglog(t_model, lc_model[i], '-', color=color, linewidth=2)
    ax.set_xlabel('Time [s]')
    ax.set_ylabel('Flux Density [erg/cm^2/s/Hz]')
    ax.set_title(f'nu = {nu:.1e} Hz')
plt.tight_layout()
plt.savefig("lightcurve_fit.png", dpi=300, bbox_inches='tight')