Help: model fitting fails with the new xgboost package under caret, and all evaluation metrics are NA

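I have been trying to train an xgboost model on the iris dataset through the caret package, but the run stops with an error: every Accuracy and Kappa value is NA, and a pile of warnings comes with it. Here is my situation:

My code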

library(caret)
data(iris)

TrainData <- iris[,1:4]
TrainClasses <- iris[,5]

xgbFit <- train(TrainData, TrainClasses,
                 method = "xgbTree",
                 preProcess = c("center", "scale"),
                 trControl = trainControl(method = "cv"))

Error output after running

Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :108   NA's   :108  
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Warning messages (key excerpts)

Calling warnings() shows mainly two kinds of repeated warnings, plus a model-fit failure error:

Warning messages:
1: In check.deprecation(deprecated_train_params, match.call(),  ... :
  Passed invalid function arguments: num_class. These should be passed as a list to argument 'params'. Conversion from argument to 'params' entry will be done automatically, but this behavior will become an error in a future version.
2: In check.custom.obj(params, objective) :
  Argument 'objective' is only for custom objectives. For built-in objectives, pass the objective under 'params'. This warning will become an error in a future version.
3: model fit failed for Fold01: eta=0.3, max_depth=1, gamma=0, colsample_bytree=0.6, min_child_weight=1, subsample=0.50, nrounds=150 Error in modelFit$xNames <- colnames(x) : 
  ALTLIST classes must provide a Set_elt method [class: XGBAltrepPointerClass, pkg: xgboost]
...
# The remaining entries repeat the warnings and the fit failure above

My session info
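Because the same messages repeat for every fold and tuning combination, counting the distinct ones makes the output easier to read. The following is a minimal base R sketch. `tally_messages` is a hypothetical helper name, and it assumes the messages are taken from `names(warnings())` right after the failed `train()` call (R keeps only the first 50 warnings there).

```r
# Hypothetical helper: count how often each distinct message occurs.
# The fold label (e.g. "Fold01") and the tuning values are stripped so
# that per-fold fit failures collapse into a single entry.
tally_messages <- function(msgs) {
  # Drop the "for FoldNN: <tuning values>" part that precedes "Error in"
  cleaned <- sub("for Fold[0-9]+.*?(Error in)", "\\1", msgs, perl = TRUE)
  counts <- table(cleaned)
  sort(counts, decreasing = TRUE)
}

# Usage right after the failed train() call:
# tally_messages(names(warnings()))
```

Grouping this way separates the two deprecation warnings from the fit failure, which is the entry that explains the NA metrics.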

sessionInfo()
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_India.utf8  LC_CTYPE=English_India.utf8   
[3] LC_MONETARY=English_India.utf8 LC_NUMERIC=C                  
[5] LC_TIME=English_India.utf8    

time zone: Asia/Calcutta
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] caret_7.0-1    lattice_0.22-7 ggplot2_4.0.1 

loaded via a namespace (and not attached):
 [1] future_1.68.0        generics_0.1.4       class_7.3-23         stringi_1.8.7       
 [5] pROC_1.19.0.1        listenv_0.10.0       digest_0.6.39        magrittr_2.0.4      
 [9] grid_4.5.1           timechange_0.3.0     RColorBrewer_1.1-3   iterators_1.0.14    
[13] jsonlite_2.0.0       xgboost_3.1.2.1      foreach_1.5.2        plyr_1.8.9          
[17] Matrix_1.7-4         e1071_1.7-16         ModelMetrics_1.2.2.2 nnet_7.3-20         
[21] survival_3.8-3       purrr_1.2.0          scales_1.4.0         codetools_0.2-20    
[25] lava_1.8.2           cli_3.6.5            rlang_1.1.6          hardhat_1.4.2       
[29] parallelly_1.46.0    future.apply_1.20.1  splines_4.5.1        withr_3.0.2         
[33] prodlim_2025.04.28   tools_4.5.1          parallel_4.5.1       reshape2_1.4.5      
[37] dplyr_1.1.4          recipes_1.3.1        globals_0.18.0       vctrs_0.6.5         
[41] R6_2.6.1             rpart_4.1.24         proxy_0.4-28         stats4_4.5.1        
[45] lifecycle_1.0.4      lubridate_1.9.4      stringr_1.6.0        MASS_7.3-65         
[49] pkgconfig_2.0.3      pillar_1.11.1        gtable_0.3.6         glue_1.8.0          
[53] data.table_1.17.8    Rcpp_1.1.0           tibble_3.3.0         tidyselect_1.2.1    
[57] rstudioapi_0.17.1    farver_2.1.2         nlme_3.1-168         ipred_0.9-15        
[61] timeDate_4051.111    gower_1.0.2          compiler_4.5.1       S7_0.2.1

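Analysis

The error and the warnings point to two compatibility issues:

  1. Changed argument-passing rules in xgboost: newer xgboost versions expect built-in settings such as num_class and objective to be passed inside the params list rather than as separate arguments. The code above never sets either one, so the old-style call appears to come from caret's xgbTree wrapper. The old style is deprecated and, according to the warnings, will become an error in a future version.
  2. An ALTREP-backed object that caret cannot modify: the failure happens at modelFit$xNames <- colnames(x), where caret 7.0-1 tries to attach the column names to the model object returned by xgboost 3.x. That object is backed by the ALTREP class XGBAltrepPointerClass, which provides no Set_elt method, so the assignment fails. Every cross-validation fold fails the same way, which is why all metrics end up NA.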
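To check whether a given installation matches the combination described above, the package versions can be compared directly. This is a minimal sketch: `is_affected_combo` is a hypothetical helper, and the cut-offs (xgboost 3.0.0 or newer, caret 7.0-1 or older) are assumptions drawn from this thread rather than from either package's release notes.

```r
# Hypothetical helper: TRUE when the version pair matches the combination
# reported in this thread (xgboost 3.x together with caret 7.0-1 or older).
# The cut-offs are assumptions taken from this thread, not from release notes.
is_affected_combo <- function(xgb_version, caret_version) {
  package_version(xgb_version) >= "3.0.0" &&
    package_version(caret_version) <= "7.0.1"
}

# Usage against the installed packages:
# is_affected_combo(as.character(packageVersion("xgboost")),
#                   as.character(packageVersion("caret")))
```

Taking version strings as arguments keeps the comparison usable on a machine where neither package is installed, for example when reading someone else's sessionInfo() output.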

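Solutions

Solution 1: work around the ALTREP incompatibility (suggested as a temporary fix)

Before running the code, turn off xgboost's ALTREP support and pass the parameters according to the new rules. Note that options() accepts any name without complaint, so an option that xgboost does not recognize has no effect. If the error persists, confirm the option name against the documentation of your installed xgboost version: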

library(caret)
data(iris)

TrainData <- iris[,1:4]
TrainClasses <- iris[,5]

# Disable xgboost's ALTREP optimization to avoid the incompatibility with older caret
options(xgboost.use_altreplicable = FALSE)

# Define the xgboost parameter list according to the new rules
xgb_params <- list(
  objective = "multi:softmax",  # built-in objective for multi-class tasks
  num_class = length(unique(TrainClasses))  # number of classes; iris has 3
)

xgbFit <- train(TrainData, TrainClasses,
                 method = "xgbTree",
                 preProcess = c("center", "scale"),
                 trControl = trainControl(method = "cv"),
                 params = xgb_params)  # forwarded through ... to the xgboost fitting call

# Inspect the training results
print(xgbFit)

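Solution 2: upgrade caret to a compatible version

To keep xgboost's ALTREP optimization, you can upgrade caret to the latest development version, which is reported to support the ALTREP objects of xgboost 3.x: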

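# Install devtools first
install.packages("devtools")
# Install the latest development version of caret
# (the package lives in the pkg/caret subdirectory of the repository)
devtools::install_github("topepo/caret/pkg/caret")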

After upgrading, num_class and objective should still go into the params list, as the new rules require, to avoid the deprecation warnings.


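Verifying the result

After running the modified code, the cross-validated Accuracy and Kappa values should display normally, and the argument-passing warnings should no longer appear. When calling xgboost through caret in the future, put built-in objectives, the class count, and similar settings in the params list so that later versions do not fail outright.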
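The check can also be done programmatically instead of by reading the printed summary. This is a minimal sketch: `metrics_ok` is a hypothetical helper, and it assumes the resampling table is taken from the `results` element of the object returned by `train()`.

```r
# Hypothetical helper: TRUE when every resampled metric in a caret
# results table is a finite number (no NA, NaN, or Inf).
metrics_ok <- function(results, metrics = c("Accuracy", "Kappa")) {
  present <- intersect(metrics, names(results))
  length(present) > 0 &&
    all(vapply(results[present], function(col) all(is.finite(col)), logical(1)))
}

# Usage after a train() call that returned an object:
# metrics_ok(xgbFit$results)
```

When every fold fails, train() stops with an error and returns nothing, so this check mainly catches partial failures, where only some tuning combinations produce NA.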
