All you need to do is get the model's predictions:
model_predictions <- predict(fmi)
Now you can inspect the predictions at the indices where the data are missing:
missing <- which(is.na(iris.mis$Sepal.Length))
imputed <- model_predictions[missing]
imputed
#> 5 22 27 32 34 35 54 60
#> 5.073695* 5.119113* 5.182343* 4.949794* 5.381427* 4.863149* 5.565716* 5.596861*
#> 89 102 107 117 131 135 145 149
#> 5.950823* 6.217764* 5.757642* 6.829916* 7.116657* 6.726274* 6.738296* 6.662452*
#> 150
#> 6.428420*
See how they compare with the actual values:
actual <- iris$Sepal.Length[missing]
plot(x = actual, y = imputed, xlim = c(4, 8), ylim = c(4, 8), col = "red",
xlab = "Actual", ylab = "Imputed", main = "Imputed vs Actual Sepal Length")
lines(c(4, 8), c(4, 8), lty = 2)
# calculate residuals
imputed - actual
#> 5 22 27 32 34 35
#> 0.07369483* 0.01911295* 0.18234346* -0.45020634* -0.11857279* -0.03685114*
#> 54 60 89 102 107 117
#> 0.06571631* 0.39686061* 0.35082282* 0.41776385* 0.85764178* 0.32991602*
#> 131 135 145 149 150
#> -0.28334270* 0.62627448* 0.03829600* 0.46245174* 0.52842038*
# sum of squared errors
sum((imputed - actual)^2)
#> [1] 2.52802
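The sum of squared errors grows with the number of imputed values, so a per-observation summary can be easier to interpret. The following is a minimal sketch, not part of the original answer: the residuals are copied from the output above, and the names `res`, `sse`, `rmse`, `mae`, and `bias` are chosen here purely for illustration.

```r
# Residuals (imputed - actual) copied from the output above.
res <- c(0.07369483, 0.01911295, 0.18234346, -0.45020634, -0.11857279,
         -0.03685114, 0.06571631, 0.39686061, 0.35082282, 0.41776385,
         0.85764178, 0.32991602, -0.28334270, 0.62627448, 0.03829600,
         0.46245174, 0.52842038)

sse  <- sum(res^2)          # sum of squared errors, same quantity as above
rmse <- sqrt(mean(res^2))   # root mean squared error, in the units of Sepal.Length
mae  <- mean(abs(res))      # mean absolute error
bias <- mean(res)           # positive if the imputed values run high on average

round(c(sse = sse, rmse = rmse, mae = mae, bias = bias), 4)
```

With the objects from the session above, `res` could be replaced by `imputed - actual` directly.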
If you then want to add a new column holding the imputed values to the data set, you can do the following:
iris.mis$Sepal.Length.Imputed <- iris.mis$Sepal.Length
iris.mis$Sepal.Length.Imputed[is.na(iris.mis$Sepal.Length.Imputed)] <- imputed
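For readers who want to run the whole workflow end to end, here is a self-contained sketch under stated assumptions. The original `fmi` and `iris.mis` are created elsewhere and are not reproduced here; a plain `lm()` fit (named `fit`) stands in for `fmi`, and the missing values are generated with an arbitrary seed, so the numbers will differ from those shown above.

```r
# Assumption: blank out 17 Sepal.Length values at random to mimic iris.mis.
set.seed(42)
iris.mis <- iris
missing <- sort(sample(nrow(iris), 17))
iris.mis$Sepal.Length[missing] <- NA

# Assumption: an ordinary linear model stands in for the original fmi.
# lm() drops the rows with a missing response by default.
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species,
          data = iris.mis)

# Supplying newdata returns one prediction per row, including the rows
# whose response is missing, so the vector can be indexed by row position.
model_predictions <- predict(fit, newdata = iris.mis)
imputed <- model_predictions[missing]

# Compare with the true values from the untouched iris data.
actual <- iris$Sepal.Length[missing]
sum((imputed - actual)^2)

# Keep the original column and add a filled-in copy next to it.
iris.mis$Sepal.Length.Imputed <- iris.mis$Sepal.Length
iris.mis$Sepal.Length.Imputed[missing] <- imputed
```

Keeping the imputed values in a separate column, as the answer does, leaves the original `NA` pattern available for later checks.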