
Question about the relationship between the breaks parameter of R's hist() function and the number of histogram bins

Hey there! Let's unpack R's hist() function and that confusing behavior with the breaks parameter you've observed using the mtcars dataset.

What is R's hist() function?

It's a built-in tool for creating histograms: visualizations that show the distribution of numeric data by grouping values into contiguous intervals (called bins) and counting how many observations fall into each bin.

The breaks parameter & bin count confusion

Here's the key point that trips up a lot of folks: when you pass a single integer to breaks, you're not telling R to make exactly that many bins. You're giving it a suggested number of intervals. hist() hands that suggestion to pretty(), which picks "nice" human-readable break points (steps of 1, 2, or 5 times a power of 10) that cover the range of the data, so the result can have a different number of bins than you asked for.
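To see where the adjusted break points come from, here is a minimal sketch. It assumes the default histogram method (hist.default) and the built-in mtcars data; the variable names x, suggested, h, and p are just for illustration. It compares the break points hist() returns with what pretty() gives for the same suggestion.

```r
# Sketch: compare hist()'s break points with pretty() for one suggested count.
x <- mtcars$mpg    # built-in dataset, available without calling data()
suggested <- 4     # the value we would otherwise pass as breaks

# plot = FALSE computes the histogram object without drawing anything
h <- hist(x, breaks = suggested, plot = FALSE)

# For a single number, hist.default derives its breaks from pretty() like this
p <- pretty(range(x), n = suggested, min.n = 1)

print(h$breaks)                         # break points hist() actually used
print(p)                                # break points proposed by pretty()
print(isTRUE(all.equal(h$breaks, p)))   # do the two agree?
```

If the two vectors agree, the "surprising" bin count is simply length(p) - 1, not the number you passed in.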

Let's walk through your mtcars$mpg examples to see this in action (mtcars ships with R's datasets package, so it is available without any setup; running data(mtcars) first is optional):

  • hist(mtcars$mpg, breaks=3): You get exactly 3 bins here. This is the exception, not the rule. The mpg range (10.4 to 33.9) happens to split cleanly into 3 evenly spaced bins with round break points (10, 20, 30, 40), so R stuck with your suggestion.
  • hist(mtcars$mpg, breaks=4): You end up with 5 bins. If you check the actual break points R uses (run hist_info <- hist(mtcars$mpg, breaks=4); hist_info$breaks), you'll see 10 15 20 25 30 35. Six break points make five bins: R prioritized clean, round break points over strictly following your 4-bin request.
  • hist(mtcars$mpg, breaks=5): The bin count stays at 5. The "nice" break points R calculates for a suggested 5 bins are identical to the ones it used for breaks=4, so the number of bins does not change.
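The three cases above can be checked side by side without drawing any plots. This is a small sketch assuming the built-in mtcars data; suggestions and bin_counts are illustrative names, not part of any API.

```r
# Sketch: how many bins does hist() really produce for each suggestion?
x <- mtcars$mpg
suggestions <- 3:5

for (n in suggestions) {
  h <- hist(x, breaks = n, plot = FALSE)   # compute only, no plot
  cat("breaks =", n, "->", length(h$counts), "bins; break points:",
      h$breaks, "\n")
}

# The same information collected into a named integer vector
bin_counts <- vapply(
  suggestions,
  function(n) length(hist(x, breaks = n, plot = FALSE)$counts),
  integer(1)
)
names(bin_counts) <- paste0("breaks=", suggestions)
print(bin_counts)
```

The number of bins is always length(h$breaks) - 1, which is why inspecting h$breaks is the quickest way to explain an unexpected count.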

How to get exactly the number of bins you want

If you need strict control over bin count, pass a vector of exact break points instead of a single integer. For example, to get exactly 4 bins for mtcars$mpg:

data(mtcars)
# Exact break points: 5 breaks give 4 bins
mpg_breaks <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 5)
hist(mtcars$mpg, breaks = mpg_breaks)

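Because breaks is now a vector, R uses your break points as given, so the number of bins is always one less than the length of the vector. The vector has to span the full range of the data; otherwise hist() stops with an error saying that some values of x were not counted.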
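If you need this often, the same idea can be wrapped in a small helper. The function exact_breaks() below is hypothetical (it is not part of base R), and the sketch assumes x is numeric, has no missing values, and contains at least two distinct values.

```r
# Hypothetical helper: n_bins equal-width bins spanning the data exactly.
# Assumes x is numeric, has no NA values, and has at least two distinct values.
exact_breaks <- function(x, n_bins) {
  stopifnot(is.numeric(x), !anyNA(x), n_bins >= 1, min(x) < max(x))
  # n_bins bins need n_bins + 1 break points
  seq(min(x), max(x), length.out = n_bins + 1)
}

x <- mtcars$mpg
for (k in c(4, 7)) {
  h <- hist(x, breaks = exact_breaks(x, k), plot = FALSE)  # compute only
  cat("requested", k, "bins, got", length(h$counts), "\n")
}
```

The trade-off is that these break points are usually not round numbers, so the axis labels are less tidy than with the defaults. To draw the plot, drop plot = FALSE.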


This question comes from Stack Exchange; it was asked by MMEL.
