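I am very new to machine learning and am attempting the forest cover prediction competition on Kaggle, but I am getting stuck very early on. When I run the code below, I get the following error.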
Error in train.default(x, y, weights = w, ...) :
final tuning parameters could not be determined
In addition: There were 50 or more warnings (use warnings() to see the first 50)
# Load the libraries
library(ggplot2); library(caret); library(AppliedPredictiveModeling)
library(pROC)
library(Amelia)
set.seed(1234)
# Load the forest cover dataset from the csv file
rawdata <- read.csv("train.csv",stringsAsFactors = F)
#this data won't be used in model evaluation. It will only be used for the submission.
test <- read.csv("test.csv",stringsAsFactors = F)
########################
### DATA PREPARATION ###
########################
#create a training and test set for building and evaluating the model
samples <- createDataPartition(rawdata$Cover_Type, p = 0.5,list = FALSE)
data.train <- rawdata[samples, ]
data.test <- rawdata[-samples, ]
model1 <- train(as.factor(Cover_Type) ~ Elevation + Aspect + Slope + Horizontal_Distance_To_Hydrology,
                data = data.train,
                method = "rf", prox = "TRUE")
The following should work:
model1 <- train(as.factor(Cover_Type) ~ Elevation + Aspect + Slope + Horizontal_Distance_To_Hydrology,
                data = data.train,
                method = "rf", tuneGrid = data.frame(mtry = 3))
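It is better to specify the tuneGrid parameter, which is a data frame of candidate tuning values. See ?randomForest and ?train for more information. rf has only one tuning parameter, mtry, which controls the number of predictors randomly sampled as candidates at each split.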
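If you would rather compare several values of mtry than fix it at 3, the grid can hold more than one row. The following is only a minimal sketch: it uses the built-in iris data as a stand-in for the competition data, and the range 1:4, the 5-fold cross-validation, and ntree = 100 are illustrative assumptions rather than recommended settings.

```r
library(caret)  # train(), trainControl(); method = "rf" also needs randomForest installed

set.seed(1234)

# Candidate values for mtry. iris has 4 predictors, so 1..4 are all valid.
grid <- expand.grid(mtry = 1:4)

# 5-fold cross-validation is used to compare the candidates (an assumption for this sketch).
ctrl <- trainControl(method = "cv", number = 5)

fit <- train(Species ~ ., data = iris,
             method = "rf",
             tuneGrid = grid,
             trControl = ctrl,
             ntree = 100)  # passed through to randomForest()

# One row of resampled accuracy per candidate, then the value caret selected.
print(fit$results[, c("mtry", "Accuracy")])
print(fit$bestTune)
```

For the competition data, the same pattern should apply with the formula and data = data.train from the question. With four predictors in that formula, mtry can be at most 4.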
You can also run modelLookup to get the list of tuning parameters for a given model:
> modelLookup("rf")
#   model parameter                         label forReg forClass probModel
# 1    rf      mtry #Randomly Selected Predictors   TRUE     TRUE      TRUE
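The output of modelLookup can also be used to sanity-check a grid before calling train, since the column names of tuneGrid have to match the model's tuning parameters. A small sketch, where the variable names grid, tunable, unknown, and absent are just illustrative:

```r
library(caret)

# The grid we intend to pass as tuneGrid.
grid <- data.frame(mtry = 3)

# Names of the tuning parameters caret knows for method = "rf".
tunable <- as.character(modelLookup("rf")$parameter)

# Grid columns the model does not recognise, and parameters the grid leaves out.
unknown <- setdiff(names(grid), tunable)
absent  <- setdiff(tunable, names(grid))

# Stop early with an error if the grid and the model disagree.
stopifnot(length(unknown) == 0, length(absent) == 0)
```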