You need to enable JavaScript to run this app.
最新活动
大模型
产品
解决方案
定价
生态与合作
支持与服务
开发者
了解我们

PyMC代码迁移至PyMC3求助:贝叶斯统计模型转换问题

Migrating PyMC's @pm.deterministic to PyMC3 for Bayesian Change-Point Model

Problem

I'm learning Bayesian statistics via Probabilistic Programming and Bayesian Methods for Hackers, but the book's examples use the deprecated PyMC library. I'm migrating to PyMC3 and got stuck on replacing the @pm.deterministic decorator. Here's the original PyMC code:

import numpy as np
import pymc as pm
count_data = np.loadtxt("txtdata.csv")
n_count_data = len(count_data)
alpha = 1.0 / count_data.mean()
lambda_1 = pm.Exponential("lambda_1", alpha)
lambda_2 = pm.Exponential("lambda_2", alpha)
tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data)
@pm.deterministic
def lambda_(tau=tau, lambda_1=lambda_1, lambda_2=lambda_2):
    out = np.zeros(n_count_data)
    out[:tau] = lambda_1 # lambda before tau is lambda1
    out[tau:] = lambda_2 # lambda after (and including) tau is lambda2
    return out
observation = pm.Poisson("obs", lambda_, value=count_data, observed=True)
model = pm.Model([observation, lambda_1, lambda_2, tau])
mcmc = pm.MCMC(model)
mcmc.sample(40000, 10000, 1)
lambda_1_samples = mcmc.trace('lambda_1')[:]
lambda_2_samples = mcmc.trace('lambda_2')[:]
tau_samples = mcmc.trace('tau')[:]

I've started rewriting for PyMC3 but hit a wall with the deterministic function:

import numpy as np
import pymc3 as pm
from matplotlib import pyplot as plt
count_data = np.loadtxt("txtdata.csv")
n_count_data = len(count_data)
alpha = 1.0 / count_data.mean()
basic_model = pm.Model()
with basic_model:
    lambda_1 = pm.Exponential("lambda_1", alpha)
    lambda_2 = pm.Exponential("lambda_2", alpha)
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data)
    @pm.deterministics
    def lambda_(tau=tau, lambda_1=lambda_1, lambda_2=lambda_2):
        out = np.zeros(n_count_data)
        out[:tau] = lambda_1 # lambda before tau is lambda1
        out[tau:] = lambda_2 # lambda after (and including) tau is lambda2
        return out
    observation = pm.Poisson("obs", lambda_, value=count_data, observed=True)

How can I fix this PyMC3 code to run correctly?


Solution

PyMC3 removed the @pm.deterministic decorator entirely, and uses tensor operations (via Theano/TensorFlow) instead of raw NumPy array manipulation for model variables. Here's how to adapt your code:

Key Changes:

  1. Replace the deterministic function with tensor-based conditional logic (using pm.math.where or pm.math.switch)
  2. Remove the invalid @pm.deterministics decorator (you also had a typo: extra "s")
  3. Update the observation variable to use PyMC3's observed parameter syntax (no value= needed)
  4. Use PyMC3's modern sampling API (pm.sample() instead of pm.MCMC)

Full Working PyMC3 Code:

import numpy as np
import pymc3 as pm
import arviz as az  # For trace analysis/visualization (recommended for PyMC3)

# Load data
count_data = np.loadtxt("txtdata.csv")
n_count_data = len(count_data)
alpha = 1.0 / count_data.mean()

# Define the model
basic_model = pm.Model()
with basic_model:
    # Priors for the two lambda values
    lambda_1 = pm.Exponential("lambda_1", alpha)
    lambda_2 = pm.Exponential("lambda_2", alpha)
    
    # Prior for the change point tau
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data)
    
    # Define the piecewise lambda using tensor operations (replaces @pm.deterministic)
    # Create an index array for each time point
    idx = pm.math.arange(n_count_data)
    # Use where to select lambda_1 before tau, lambda_2 after
    lambda_ = pm.math.where(idx < tau, lambda_1, lambda_2)
    
    # Observed data (PyMC3 uses observed directly, no value= parameter)
    observation = pm.Poisson("obs", lambda_, observed=count_data)
    
    # Run MCMC sampling: tune = burn-in iterations, cores = parallel processing
    trace = pm.sample(40000, tune=10000, cores=2, return_inferencedata=True)

# Extract samples from the trace (using ArviZ's InferenceData object)
lambda_1_samples = trace.posterior["lambda_1"].values.flatten()
lambda_2_samples = trace.posterior["lambda_2"].values.flatten()
tau_samples = trace.posterior["tau"].values.flatten()

# Optional: Visualize the trace with ArviZ
az.plot_trace(trace)

Explanation:

  • Tensor Conditional Logic: PyMC3 variables are tensors, not NumPy arrays, so we can't slice and assign values directly. pm.math.where() acts like a vectorized if-else, selecting lambda_1 for indices before tau and lambda_2 for indices after.
  • Sampling API: pm.sample() handles MCMC setup automatically, including burn-in (via tune). The return_inferencedata=True flag returns an ArviZ object, which is the standard way to analyze and visualize PyMC3 results.
  • Deterministic Variable Optional: If you want to save lambda_ to the trace (for later analysis), wrap it in pm.Deterministic:
    lambda_ = pm.Deterministic("lambda_", pm.math.where(idx < tau, lambda_1, lambda_2))
    

内容的提问来源于stack exchange,提问作者Alpha Delta

火山引擎 最新活动