Automating the comparison of nested models in mice glm.mids

I have a multiply-imputed model from R's mice package that contains many factor variables. For example:

library(mice)
library(Hmisc)

# turn all the variables into factors
fake = nhanes
fake$age = as.factor(nhanes$age)
fake$bmi = cut2(nhanes$bmi, g=3)
fake$chl = cut2(nhanes$chl, g=3)
head(fake)

  age         bmi hyp       chl
1   1        <NA>  NA      <NA>
2   2 [20.4,25.5)   1 [187,206)
3   1        <NA>   1 [187,206)
4   3        <NA>  NA      <NA>
5   1 [20.4,25.5)   1 [113,187)
6   3        <NA>  NA [113,187)

# impute the factor-coded data (fake), not the original nhanes
imput = mice(fake)

# big model
fit1 = glm.mids((hyp==2) ~ age + bmi + chl, data=imput, family = binomial)

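I want to test the significance of each whole factor variable in the model (not the indicator variables for its individual levels) by comparing the full model against each nested model that drops one variable at a time. Manually, I can do this: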

# small model (no chl)
fit2 = glm.mids((hyp==2) ~ age + bmi, data=imput, family = binomial)

# extract p-value from pool.compare
pool.compare(fit1, fit2)$pvalue

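How can I do this automatically for all the factors in my model? The function drop1 was a very helpful suggestion in a previous question; now I want to do exactly the same thing, except for the mice case.

A possibly useful note: an annoying feature of pool.compare is that it seems to require the "extra" variables in the larger model to be placed after those it shares with the smaller model.

Answer:

Once the predictors are arranged in the order pool.compare requires, you can loop over the different combinations of predictors.

So, using the fake data from above (with the number of categories adjusted):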

library(mice)
library(Hmisc)

# turn all the variables into factors
fake <- nhanes
fake$age <- as.factor(nhanes$age)
fake$bmi <- cut2(nhanes$bmi, g=2)
fake$chl <- cut2(nhanes$chl, g=2)

# Impute
imput <- mice(fake, seed=1)

# Create models
# - reduced models with one variable removed
# - full models with extra variables at end of expression
vars <- c("age", "bmi", "chl")
red <- combn(vars, length(vars)-1 , simplify=FALSE)
diffs <- lapply(red, function(i) setdiff(vars, i))

(full <- lapply(seq_along(red), function(i)
    paste(c(red[[i]], diffs[[i]]), collapse=" + ")))
#[[1]]
#[1] "age + bmi + chl"
#[[2]]
#[1] "age + chl + bmi"
#[[3]]
#[1] "bmi + chl + age"

(red <- combn(vars, length(vars)-1 , FUN=paste, collapse=" + "))
#[1] "age + bmi" "age + chl" "bmi + chl"
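As a quick sanity check of the ordering constraint, here is a minimal base R sketch that confirms each reduced model's terms are the leading terms of the matching full model. The names redTerms, fullTerms and isPrefix are illustrative only and are not used elsewhere in this answer.

```r
vars <- c("age", "bmi", "chl")

# Terms of each reduced model (one variable dropped).
redTerms <- combn(vars, length(vars) - 1, simplify = FALSE)

# Terms of each matching full model: shared terms first, dropped term last.
fullTerms <- lapply(redTerms, function(r) c(r, setdiff(vars, r)))

# TRUE when the reduced terms are exactly the leading terms of the full model.
isPrefix <- mapply(function(r, f) identical(f[seq_along(r)], r),
                   redTerms, fullTerms)

# Stops with an error if any full model has its extra term in the wrong place.
stopifnot(all(isPrefix))
```

If the check passes silently, every pair of term lists satisfies the "extra variables last" ordering described in the question.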

The models are now in the correct order to pass to the glm calls. I have also replaced glm.mids with with.mids, which supersedes it; see ?glm.mids.

out <- vector("list", length(red))

for(i in seq_along(red)) {
    redMod <- with(imput,
        glm(formula(paste("(hyp==2) ~ ", red[[i]])), family = binomial))
    fullMod <- with(imput,
        glm(formula(paste("(hyp==2) ~ ", full[[i]])), family = binomial))
    out[[i]] <- list(predictors = diffs[[i]],
                     pval = c(pool.compare(fullMod, redMod)$pvalue))
}

do.call(rbind.data.frame, out)
#   predictors      pval
#2         chl 0.9976629
#21        bmi 0.9985028
#3         age 0.9815831

# Check manually by leaving out chl
mod1 <- with(imput, glm((hyp==2) ~ age + bmi + chl , family = binomial))
mod2 <- with(imput, glm((hyp==2) ~ age + bmi , family = binomial))
pool.compare(mod1, mod2)$pvalue
#          [,1]
#[1,] 0.9976629

You will get a lot of warnings with this data set.

Edit

You can wrap this in a function:

impGlmDrop1 <- function(vars, outcome, Data=imput, Family="binomial")
{
    red <- combn(vars, length(vars)-1 , simplify=FALSE)
    diffs <- lapply(red, function(i) setdiff(vars, i))
    full <- lapply(seq_along(red), function(i)
        paste(c(red[[i]], diffs[[i]]), collapse=" + "))
    red <- combn(vars, length(vars)-1 , FUN=paste, collapse=" + ")

    out <- vector("list", length(red))
    for(i in seq_along(red)) {
        redMod <- with(Data,
            glm(formula(paste(outcome, red[[i]], sep="~")), family = Family))
        fullMod <- with(Data,
            glm(formula(paste(outcome, full[[i]], sep="~")), family = Family))
        out[[i]] <- list(predictors = diffs[[i]],
                         pval = c(pool.compare(fullMod, redMod)$pvalue) )
    }
    do.call(rbind.data.frame, out)
}

# Run
impGlmDrop1(c("age", "bmi", "chl"), "(hyp==2)")
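In mice version 3 and later, ?pool.compare describes the function as deprecated and points to D1() and D3() instead. The following is only a rough sketch of the same drop-one loop using D1(), assuming a mice release that provides D1(fit1, fit0). The function name impD1Drop1 is hypothetical and not part of mice, and I have not checked its output against the pool.compare results above. It returns the raw result matrix of each comparison rather than assuming particular column names.

```r
library(mice)
library(Hmisc)

# Same setup as above: recode the predictors as factors, then impute.
fake <- nhanes
fake$age <- as.factor(nhanes$age)
fake$bmi <- cut2(nhanes$bmi, g = 2)
fake$chl <- cut2(nhanes$chl, g = 2)
imput <- mice(fake, seed = 1, printFlag = FALSE)

# Hypothetical helper (not part of mice): drop each variable in turn and
# compare the full and reduced models with D1().
impD1Drop1 <- function(vars, outcome, data, family = binomial) {
  rows <- lapply(vars, function(v) {
    keep    <- setdiff(vars, v)
    redRhs  <- paste(keep, collapse = " + ")
    # The dropped variable is kept last, mirroring the ordering used above.
    fullRhs <- paste(c(keep, v), collapse = " + ")
    # As in the answer, the formula is built inside with() so that the
    # model variables are looked up in each imputed data set.
    redMod  <- with(data,
      glm(formula(paste(outcome, redRhs, sep = "~")), family = family))
    fullMod <- with(data,
      glm(formula(paste(outcome, fullRhs, sep = "~")), family = family))
    D1(fullMod, redMod)$result
  })
  out <- do.call(rbind, rows)
  rownames(out) <- vars   # one row per dropped variable
  out
}

res <- impD1Drop1(c("age", "bmi", "chl"), "(hyp==2)", data = imput)
res
```

As with pool.compare, expect warnings from glm on this small data set.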
