在调整 SVM 参数时,我观察到一个非常奇怪的行为caret
。当训练单个模型而不进行调整时,具有径向基核的 SVM 比具有线性核的 SVM 花费更多时间,这是预期的。然而,当在相同的惩罚网格上调整具有两个核的 SVM 时,具有线性核的 SVM 比具有径向基核的 SVM 花费更多的时间。使用 R 3.2 可以在 Windows 和 Linux 中轻松重现此行为caret
6.0-47。有谁知道为什么调整线性 SVM 比径向基核 SVM 花费更多时间?
SVM linear
user system elapsed
0.51 0.00 0.52
SVM radial
user system elapsed
0.85 0.00 0.84
SVM linear tuning
user system elapsed
129.98 0.02 130.08
SVM radial tuning
user system elapsed
2.44 0.05 2.48
玩具示例代码如下:
library(data.table)
library(kernlab)
library(caret)
n <- 1000
p <- 10
dat <- data.table(y = as.factor(sample(c('p', 'n'), n, replace = T)))
dat[, (paste0('x', 1:p)) := lapply(1:p, function(x) rnorm(n, 0, 1))]
dat <- as.data.frame(dat)
sigmas <- sigest(as.matrix(dat[, -1]), na.action = na.omit, scaled = TRUE)
sigma <- mean(as.vector(sigmas[-2]))
cat('\nSVM linear\n')
print(system.time(fit1 <- train(y ~ ., data = dat, method = 'svmLinear', tuneLength = 1,
trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM radial\n')
print(system.time(fit2 <- train(y ~ ., data = dat, method = 'svmRadial', tuneLength = 1,
trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM linear tuning\n')
print(system.time(fit3 <- train(y ~ ., data = dat, method = 'svmLinear',
tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5)),
trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM radial tuning\n')
print(system.time(fit4 <- train(y ~ ., data = dat, method = 'svmRadial',
tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5), sigma = sigma),
trControl = trainControl(method = 'cv', number = 3))))