Why is the result NaN even when na.rm is set to TRUE with dplyr in R?

Z时代
2024-01-10
Category: General

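When a group contains only NA values, a statistical summary computed with dplyr and na.rm set to TRUE returns NaN for that group: removing the NAs leaves nothing to average, so the mean is 0/0. If na.rm is left out, the same group returns NA instead. Follow the steps below to see the difference between the two: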

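First, create a data frame.
Summarize the data frame with na.rm set to TRUE when the data frame contains NA values.
Summarize the data frame without setting na.rm to TRUE.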

Create the data frame

Let's create a data frame as shown below:

Example

Group<-rep(c("First","Second","Third"),times=c(3,10,7))
Response<-rep(c(NA,3,4,5,7,8),times=c(3,2,5,2,4,4))
df<-data.frame(Group,Response)
df

Running the script above produces the following output:

Output

    Group Response
1   First       NA
2   First       NA
3   First       NA
4  Second        3
5  Second        3
6  Second        4
7  Second        4
8  Second        4
9  Second        4
10 Second        4
11 Second        5
12 Second        5
13 Second        7
14  Third        7
15  Third        7
16  Third        7
17  Third        8
18  Third        8
19  Third        8
20  Third        8
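
Before summarizing, it can help to check how many missing values each group holds. The following is a minimal base R sketch that rebuilds the same data frame; the names `na_per_group` and `rows_per_group` are introduced here only for illustration and do not appear in the original script.

```r
Group <- rep(c("First", "Second", "Third"), times = c(3, 10, 7))
Response <- rep(c(NA, 3, 4, 5, 7, 8), times = c(3, 2, 5, 2, 4, 4))
df <- data.frame(Group, Response)

# Count the NA values of Response within each group
na_per_group <- tapply(is.na(df$Response), df$Group, sum)

# Count all rows in each group for comparison
rows_per_group <- table(df$Group)

na_per_group
rows_per_group
```

All three rows of the First group are NA, while the Second and Third groups have no missing values. That is why only the First group behaves differently in the summaries.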

Summarize the data frame with na.rm set to TRUE

Load the dplyr package and summarize the data frame df with the mean of Response for each group:

Example

library(dplyr)
Group<-rep(c("First","Second","Third"),times=c(3,10,7))
Response<-rep(c(NA,3,4,5,7,8),times=c(3,2,5,2,4,4))
df<-data.frame(Group,Response)
df%>%group_by(Group)%>%summarise(mean=mean(Response,na.rm=TRUE))

Output

# A tibble: 3 x 2
Group mean
  <chr> <dbl>
1 First NaN
2 Second 4.3
3 Third 7.57
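
The NaN for the First group can be reproduced without dplyr. The sketch below is a base R illustration, not part of the original script, and the vector names `x` and `kept` are assumptions chosen for the example. It mimics what happens to a group whose values are all NA once na.rm = TRUE drops them.

```r
# A group whose values are all missing, like the First group
x <- c(NA_real_, NA_real_, NA_real_)

# na.rm = TRUE removes the NAs first, which leaves an empty vector
kept <- x[!is.na(x)]
length(kept)              # 0 values remain

# The mean of nothing is 0 / 0, which R represents as NaN
sum(kept) / length(kept)
mean(x, na.rm = TRUE)

# Without na.rm the missing values propagate and the result is NA
mean(x)

# is.nan() tells the two apart; is.na() is TRUE for both
is.nan(mean(x, na.rm = TRUE))
is.nan(mean(x))
```

So NaN here does not mean that na.rm was ignored. It means that nothing was left to average after the NAs were removed.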

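Summarize the data frame without setting na.rm to TRUE

Summarize the data frame df with the mean of Response for each group, without setting na.rm to TRUE:

Example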

Group<-rep(c("First","Second","Third"),times=c(3,10,7))
Response<-rep(c(NA,3,4,5,7,8),times=c(3,2,5,2,4,4))
df<-data.frame(Group,Response)
df%>%group_by(Group)%>%summarise(mean=mean(Response))

Output

# A tibble: 3 x 2
Group mean
  <chr> <dbl>
1 First NA
2 Second 4.3
3 Third 7.57
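
Leaving out na.rm works here because only the First group contains NA values. If other groups also had some missing values, their means would become NA as well. One possible way to keep na.rm = TRUE and still get NA for an all-NA group is sketched below. `mean_or_na` is a hypothetical helper written for this example; it is not a dplyr or base R function.

```r
library(dplyr)

# Hypothetical helper: returns NA when every value is missing,
# otherwise the mean of the non-missing values
mean_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

Group <- rep(c("First", "Second", "Third"), times = c(3, 10, 7))
Response <- rep(c(NA, 3, 4, 5, 7, 8), times = c(3, 2, 5, 2, 4, 4))
df <- data.frame(Group, Response)

# Apply the helper per group instead of calling mean() directly
result <- df %>%
  group_by(Group) %>%
  summarise(mean = mean_or_na(Response))
result
```

With this data the First group comes out as NA rather than NaN, and the other two groups keep the same means as before.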

Source: utcz.com/z/341473.html
