Error predicting a raster with randomForest, caret, and factor variables

Date: 2020-01-02 06:18:29

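I am trying to predict a raster layer using the randomForest and caret packages, but it fails as soon as I introduce a factor variable. Without factors everything works, but once I bring one in I get this error: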

Error in predict.randomForest(modelFit, newdata) : Type of predictors in new data do not match that of the training data.

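I have put together some sample code below that walks through the process. For transparency it is broken into several steps, and it forms a working example.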

(To skip the set-up code, jump from here on down...)

The first step is to create the sample data, fit an RF model, and predict onto a raster with NO factors involved. Everything works.

# simulate data

x1p

x2p

x1a

x2a

# RF Classification on data with no factors... works fine

require(randomForest)

dRF

dRF$y

levels = c("present", "absent"))

rfFit

# Create sample Rasters

require(raster)

values(r1)

values(r2)

names(s)

# raster::predict() with no factors, works fine.

model

spplot(model)

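The next step is to create a factor variable, add it to the training data, and create a raster whose values match the factor levels for prediction. Note that the raster is a plain integer raster, not an as.factor raster. Everything still works.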

# Create factor variable

x3p

x3a

dFac

dFac$x3

dFac

# RF model with factors, works fine

rfFit2

# Create new raster, but not as.factor()

values(r3)

names(s2)

# RF, raster::predict() from fit with factor

model2

progress='text', factors=f, index=1:2)

spplot(model2) # works fine

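After the steps above, I have an RF model that was trained on data containing a factor variable and that predicts onto a raster brick containing an integer raster with matching values. This is my end goal, but I want to be able to do it through the caret package workflow. Below I introduce caret::train() with no factors, and everything works well.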

# RF with Caret and NO factors

require(caret)

rf_ctrl

allowParallel=FALSE, verboseIter=TRUE,

savePredictions=TRUE, classProbs=TRUE)

cFit1

tuneLength=4, trControl = rf_ctrl, importance = TRUE)

model3

progress='text', factors=f, index=1:2)

spplot(model3) # works with caret and NO factors

(...to here. This is where the issues begin)

This is where things fail. Training the RF model through caret with the factor variable works, but raster::predict() fails.

# RF with Caret and FACTORS

rf_ctrl2

allowParallel=FALSE, verboseIter=TRUE,

savePredictions=TRUE, classProbs=TRUE)

cFit2

tuneLength=4, trControl = rf_ctrl2, importance = TRUE)

model4

progress='text', factors=f, index=1:2)

# FAIL: "Type of predictors in new data do not match that of the training data."

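Next I tried the same as above, but instead of using an integer raster with the same values as the factor levels, I converted the raster to a factor with as.factor() and assigned the levels. This also fails.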

#trying with raster as.factor()

r3f

values(r3f)

r3f

f$code

levels(r3f)

s2f

names(s2f)

s2f

model4f

progress='text', factors=f, index=1:2)

# FAIL "Type of predictors in new data do not match that of the training data."
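
Both failing attempts turn on how the type of a predictor column in the new data compares with its type in the training data. The following is a minimal base-R sketch of that integer-versus-factor distinction, kept separate from the raster workflow. The data frames `train` and `new_vals` are hypothetical stand-ins, not objects from the code above, and this is an illustration of the mismatch rather than a confirmed fix for raster::predict().

```r
# Hypothetical stand-ins: training data with a factor column, and
# new values as they might be read from an integer raster.
train <- data.frame(x3 = factor(c(1, 2, 3, 1), levels = 1:3))
new_vals <- data.frame(x3 = c(1L, 3L, 2L))

# The training column is a factor; the new column is a plain integer.
class(train$x3)
class(new_vals$x3)

# Coerce the new column to a factor that reuses the training levels.
new_vals$x3 <- factor(new_vals$x3, levels = levels(train$x3))

# Check that the column types and level sets now agree.
types_match <- is.factor(new_vals$x3) &&
  identical(levels(new_vals$x3), levels(train$x3))
types_match
```

Whether a model sees the coerced column depends on what the prediction function does with the data it is handed, so a check like this is only a starting point for debugging.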

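The errors and the progression of the steps above make it clear that the problem lies in how I am combining caret::train() with raster::predict(). I have stepped through the debugger (as best I can) and addressed the issues I noticed, but I found no smoking gun.

Any and all help would be greatly appreciated. Thanks!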

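Added: I kept experimenting and realized that if the model in caret::train() is written in formula form, it works. Looking at the structure of the model object, it is easy to see that contrasts were created for the factor variable. I guess this also means that raster::predict() recognizes the contrasts. That is good, but because my approach is not set up to use formula-based prediction, it is a bummer. Any additional help is still appreciated.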

#with Caret WITH FACTORS as model formula!

rf_ctrl3

allowParallel=FALSE, verboseIter=TRUE, savePredictions=TRUE, classProbs=TRUE)

cFit3

tuneLength=4, trControl = rf_ctrl2, importance = TRUE)

model5

spplot(model5)
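
The contrasts mentioned in the addendum can be seen with base R alone. This is a minimal sketch, assuming the formula interface expands the factor through a model matrix with R's default treatment contrasts; the data frame `d` and its column values are hypothetical and not taken from the code above.

```r
# Hypothetical data: one numeric predictor and one three-level factor.
d <- data.frame(x1 = c(1.5, 2.5, 3.5, 4.5),
                x3 = factor(c("a", "b", "c", "a")))

# A formula is turned into a design matrix; the factor x3 becomes
# indicator columns for every level except the first (the baseline).
mm <- model.matrix(~ x1 + x3, data = d)
colnames(mm)
```

With this expansion the model is fitted on numeric indicator columns such as `x3b` and `x3c` rather than on a factor column, which would be consistent with the formula version behaving differently from the non-formula version.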
