Recent developments in fitting negative binomial models with GEE in R
Hey there! It’s great to revisit this question 5 years later. The options in R for fitting Generalized Estimating Equations (GEE) with a negative binomial distribution have moved on since then, and because you already have your data aggregated and a working Poisson GEE model with geeglm, you have a good starting point. Here’s what you need to know:
1. What geepack (the package powering geeglm) supports
As far as I can tell from the geepack documentation, geeglm handles the gaussian, binomial, poisson, and Gamma variance functions. There is no neg.binomial family in geepack, MASS, or base R, so a negative binomial GEE with an estimated dispersion parameter (theta) is not something geeglm does directly. Please check this against your installed version. What geeglm does give you is a Poisson GEE with an estimated scale parameter and sandwich standard errors, which already tolerates a fair amount of overdispersion. Here’s that baseline for your behavioral count data (Diract):

    library(geepack)

    # Poisson GEE baseline; rows must be ordered so each cluster is contiguous.
    # The robust (sandwich) standard errors remain valid under overdispersion.
    pois_gee <- geeglm(Diract ~ Group + Dir + Rec,  # Add your other covariates here
                       data = your_aggregated_data,
                       id = cluster_id,             # Replace with your cluster identifier (e.g., subject/group)
                       family = poisson,
                       corstr = "exchangeable")     # Pick the correlation structure matching your design

    # Check results
    summary(pois_gee)

Pro tip: If you want a true negative binomial variance function, estimate theta first with a standard negative binomial GLM (glm.nb() from the MASS package) and pass it as a fixed value to a package that accepts it, such as geeM in the next section.
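To make the theta step concrete, here is a minimal sketch of pulling a theta estimate out of glm.nb(). The data frame sim and its columns are simulated stand-ins I made up for illustration, not your aggregated data, and the GLM ignores clustering on purpose because it is only used to get a plausible fixed theta.

```r
library(MASS)  # ships with R; provides glm.nb()

set.seed(42)

# Hypothetical stand-in data: 40 clusters with 5 rows each
n_clusters <- 40
sim <- data.frame(
  cluster_id = rep(seq_len(n_clusters), each = 5),
  Group = factor(rep(c("A", "B"), each = 5, times = n_clusters / 2))
)

# Simulate negative binomial counts (rnbinom's size plays the role of theta)
mu <- exp(1 + 0.5 * (sim$Group == "B"))
sim$Diract <- rnbinom(nrow(sim), size = 2, mu = mu)

# Ordinary (non-clustered) negative binomial GLM, used only to obtain theta
nb_glm <- glm.nb(Diract ~ Group, data = sim)
estimated_theta <- nb_glm$theta
estimated_theta
```

You would then plug estimated_theta into whichever GEE routine accepts a fixed-theta negative binomial family.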
2. Alternative Packages for Better Convergence & Flexibility
If you run into snags with geepack, two packages have matured nicely since your original question:
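geeM: Built to address limitations of older GEE implementations. Its geem() function accepts a family object, so you can pass negative.binomial() from MASS with a fixed theta. Theta is held constant rather than estimated inside the GEE, so get it from glm.nb() first. Example:

    library(geeM)
    library(MASS)  # for glm.nb() and negative.binomial()

    # Estimate theta from an ordinary negative binomial GLM
    estimated_theta <- glm.nb(Diract ~ Group + Dir + Rec,
                              data = your_aggregated_data)$theta

    nb_geem <- geem(Diract ~ Group + Dir + Rec,
                    data = your_aggregated_data,
                    id = cluster_id,
                    family = negative.binomial(theta = estimated_theta),
                    corstr = "exchangeable")
    summary(nb_geem)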
glmmTMB: Although it’s known for mixed-effects models, a glmmTMB fit without random effects is an ordinary negative binomial regression, and pairing it with cluster-robust standard errors gives inference similar to a GEE with an independence working correlation. It also supports zero-inflated variants if you have excess zeros in Diract. Note that glmmTMB() has no cluster argument, and whether vcovCL() from sandwich works on a glmmTMB fit depends on your installed versions, so treat the last line as something to verify:

    library(glmmTMB)
    library(lmtest)
    library(sandwich)

    # Fit a marginal negative binomial model (no random effects)
    nb_marginal <- glmmTMB(Diract ~ Group + Dir + Rec,
                           data = your_aggregated_data,
                           family = nbinom2)  # nbinom1: variance linear in the mean; nbinom2: quadratic

    # Cluster-robust standard errors for GEE-like inference
    # (assumes sandwich methods are available for glmmTMB objects in your versions)
    coeftest(nb_marginal, vcov = vcovCL, cluster = ~cluster_id)
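On the nbinom1 versus nbinom2 comment in the code: the two families differ in how the variance grows with the mean. The sketch below just evaluates the two variance functions as glmmTMB documents them, using a dispersion value phi that I picked arbitrarily for illustration.

```r
# Means at which to compare the two variance functions
mu <- c(1, 5, 20)
phi <- 2  # hypothetical dispersion parameter, chosen only for illustration

# nbinom1: variance is linear in the mean
var_nbinom1 <- mu * (1 + phi)

# nbinom2: variance is quadratic in the mean
var_nbinom2 <- mu * (1 + mu / phi)

data.frame(mu, var_nbinom1, var_nbinom2)
```

If the spread of your counts grows much faster than the mean across groups, nbinom2 is usually the more natural first choice; comparing the two fits by AIC is a reasonable check.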
3. Critical Checks for Your Data
Before diving in, make sure to:
- Verify overdispersion: Use the dispersiontest() function from the AER package on a plain Poisson glm() fit with the same formula (the test is written for Poisson GLMs rather than GEE fits). Significant overdispersion supports moving to negative binomial.
- Choose the right correlation structure: Compare options such as "exchangeable", "ar1", or "unstructured" using the QIC (Quasi-likelihood Information Criterion), available through QIC() in geepack (check the geeM documentation for an equivalent). Lower QIC means a better fit.
- Tune for convergence: If the model won’t converge, simplify your covariate list first, or use a fixed theta value from a simpler negative binomial model.
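If you’d like a quick overdispersion check that needs nothing beyond base R, a Pearson dispersion ratio is a rough substitute for the formal test (this is a different, simpler diagnostic than dispersiontest(), not a reimplementation of it). The data here are simulated and the variable names are made up for the sketch.

```r
set.seed(1)

# Hypothetical overdispersed counts: negative binomial with theta = 1.5
x <- rnorm(300)
y <- rnbinom(300, size = 1.5, mu = exp(0.8 + 0.4 * x))

# Plain Poisson GLM (no clustering), used only for the diagnostic
pois_fit <- glm(y ~ x, family = poisson)

# Pearson chi-square divided by residual degrees of freedom;
# values well above 1 suggest overdispersion relative to Poisson
pearson_ratio <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)
pearson_ratio
```

A ratio close to 1 means the Poisson variance assumption is roughly fine; a ratio of 2 or more is a good reason to look at the negative binomial options above.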
Since you already have a working Poisson model, transitioning to negative binomial should be straightforward with these updated tools. Let me know if you hit specific roadblocks!
The question comes from Stack Exchange; asked by BethanyKaye.




