如何在R语言中根据自定义概率密度函数随机生成数据?
Great question! Generating random data from a custom probability density function (PDF) in R is totally feasible, and there are several reliable approaches depending on how complex your PDF is. Let’s walk through the most common methods, with practical code examples.
1. Inverse Transform Sampling (Best if you can find the inverse CDF)
The inverse transform method is the simplest approach—if you can derive the cumulative distribution function (CDF) of your PDF, then find its inverse. Here’s how it works:
- Generate uniform random numbers
u ~ Uniform(0,1) - Plug each
uinto the inverse of your CDF to get samples from your target PDF.
Example: Custom PDF f(x) = 2x for 0 ≤ x ≤ 1
For this PDF, the CDF is F(x) = x², and its inverse is F⁻¹(u) = sqrt(u).
# Define the inverse CDF of our custom PDF inv_cdf <- function(u) { sqrt(u) } # Generate 1000 samples n <- 1000 u_samples <- runif(n) # Uniform(0,1) samples custom_samples <- inv_cdf(u_samples) # Verify the results: histogram vs. true PDF hist(custom_samples, prob = TRUE, main = "Samples from f(x) = 2x (0 ≤ x ≤ 1)", xlab = "x", breaks = 20) curve(2*x, from = 0, to = 1, add = TRUE, col = "red", lwd = 2)
2. Accept-Reject Sampling (For when inverse CDF is hard to compute)
If deriving the inverse CDF isn’t feasible (e.g., your PDF has a complicated form), accept-reject sampling is a go-to. The idea is to use a "proposal distribution" that’s easy to sample from, then accept/reject candidates based on how well they match your target PDF.
Example: Custom Laplace-like PDF f(x) = 0.5*exp(-|x|)
Let’s use a normal distribution as our proposal distribution (since it’s easy to sample from with rnorm()):
# Define our target custom PDF target_pdf <- function(x) { 0.5 * exp(-abs(x)) } # Define a proposal distribution (here, Normal(0, 2)) proposal_pdf <- function(x) { dnorm(x, mean = 0, sd = 2) } # Find k: a constant where k*proposal_pdf(x) ≥ target_pdf(x) for all x # For this example, k ≈ 1.13 (you can calculate this via optimization) k <- 1.13 # Function to generate samples via accept-reject accept_reject <- function(n_samples) { samples <- numeric(n_samples) accepted <- 0 while (accepted < n_samples) { # Generate a candidate from the proposal distribution candidate <- rnorm(1, mean = 0, sd = 2) # Calculate acceptance probability accept_prob <- target_pdf(candidate) / (k * proposal_pdf(candidate)) # Accept the candidate if a uniform random number is < accept_prob if (runif(1) < accept_prob) { accepted <- accepted + 1 samples[accepted] <- candidate } } return(samples) } # Generate 1000 samples ar_samples <- accept_reject(1000) # Verify hist(ar_samples, prob = TRUE, main = "Samples via Accept-Reject Sampling", xlab = "x", breaks = 30) curve(target_pdf(x), from = -5, to = 5, add = TRUE, col = "blue", lwd = 2)
3. Using R Packages for Simplified Workflows
If you don’t want to implement the math yourself, several R packages let you define custom distributions and generate samples with minimal code. The distr package is a great option:
# Install first if you haven't # install.packages("distr") library(distr) # Define your custom PDF (same 2x example as before) custom_pdf <- function(x) { ifelse(x >= 0 & x <= 1, 2*x, 0) } # Create an absolute continuous distribution object my_dist <- AbscontDistribution(d = custom_pdf) # Generate 1000 random samples package_samples <- r(my_dist)(1000) # Verify hist(package_samples, prob = TRUE, main = "Samples via distr Package", xlab = "x", breaks = 20) curve(custom_pdf(x), from = 0, to = 1, add = TRUE, col = "green", lwd = 2)
For Specific Known Distributions
If your "custom" PDF is just a parameter variation of a standard distribution (e.g., a Gamma distribution with non-default shape/rate), you can skip the custom code and use R’s built-in r* functions:
rnorm()for normal,rgamma()for Gamma,rbeta()for Beta, etc. Just tweak the parameters to match your PDF.
内容的提问来源于stack exchange,提问作者steinchen




