How to subset the columns of an R data frame that have fewer than four categories?
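A categorical column has at least two categories. There is no fixed upper limit on the number of categories, although it cannot exceed the number of cases. If a data frame contains categorical columns, some with four or more categories and some with fewer, we may want to keep only the columns that have fewer than four. This can be useful when we deliberately want a selective (biased) subset of the data, or when predefined characteristics of the data call for it. Such columns can be selected with the help of the sapply function, as shown in the examples below.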
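To make the mechanics concrete, here is a minimal sketch on a small fixed data frame. The name `toy` and its columns are made up for illustration and are not part of the examples that follow. `sapply` returns one count of distinct values per column, and comparing those counts with 4 gives a logical vector that can be used to index the columns.

```r
# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  size  = c("S", "M", "L", "S", "M", "L"),
  group = c("a", "b", "c", "d", "e", "a"),
  flag  = c("yes", "no", "yes", "no", "yes", "no")
)

# Step 1: count the distinct values in each column
n_categories <- sapply(toy, function(col) length(unique(col)))

# Step 2: logical mask, TRUE where a column has fewer than 4 distinct values
keep <- n_categories < 4

# Step 3: use the mask to select columns
toy_subset <- toy[, keep]
```

Here `size` and `flag` pass the test, while `group` (five distinct values) is dropped.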
Example 1
Consider the following data frame:
> x1<-sample(c("Hot","Cold","Warm"),20,replace=TRUE)
> x2<-sample(c("Male","Female"),20,replace=TRUE)
> x3<-sample(letters[1:4],20,replace=TRUE)
> df1<-data.frame(x1,x2,x3)
> df1
x1 x2 x3
1 Warm Male b
2 Cold Female c
3 Cold Male a
4 Hot Male d
5 Hot Male d
6 Hot Female a
7 Hot Male a
8 Cold Female d
9 Warm Male d
10 Warm Female d
11 Cold Male a
12 Cold Female c
13 Hot Male b
14 Warm Male c
15 Cold Male b
16 Warm Male a
17 Hot Male b
18 Cold Male b
19 Hot Female c
20 Warm Female d
Subsetting the columns of df1 that have fewer than 4 categories:
> df1[,sapply(df1, function(col) length(unique(col)))<4]
Output:
x1 x2
1 Warm Male
2 Cold Female
3 Cold Male
4 Hot Male
5 Hot Male
6 Hot Female
7 Hot Male
8 Cold Female
9 Warm Male
10 Warm Female
11 Cold Male
12 Cold Female
13 Hot Male
14 Warm Male
15 Cold Male
16 Warm Male
17 Hot Male
18 Cold Male
19 Hot Female
20 Warm Female
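One caveat, offered as a side note rather than as part of the example above: when only one column passes the test, indexing with `[ , ]` simplifies the result to a plain vector by default. The sketch below, which uses a made-up data frame `d`, shows how `drop = FALSE` keeps the result a data frame.

```r
# Hypothetical data frame in which only one column has fewer than 4 categories
d <- data.frame(
  a = c("x", "y", "x", "y", "x"),   # 2 distinct values
  b = c("p", "q", "r", "s", "t")    # 5 distinct values
)

keep <- sapply(d, function(col) length(unique(col))) < 4

simplified <- d[, keep]                # collapses to a vector, not a data frame
preserved  <- d[, keep, drop = FALSE]  # stays a one-column data frame
```

Adding `drop = FALSE` is a reasonable default when the number of matching columns is not known in advance.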
Example 2
> y1<-sample(c("Male","Female"),20,replace=TRUE)
> y2<-sample(letters[1:5],20,replace=TRUE)
> y3<-sample(c("Asian","American","Chinese"),20,replace=TRUE)
> df2<-data.frame(y1,y2,y3)
> df2
y1 y2 y3
1 Male b Chinese
2 Female b American
3 Female d Asian
4 Female e American
5 Female e Asian
6 Female c Chinese
7 Female a Chinese
8 Female a Chinese
9 Male d American
10 Female d Chinese
11 Female d Chinese
12 Female c American
13 Female b American
14 Male d Chinese
15 Male a American
16 Male e Asian
17 Male b Asian
18 Female d Chinese
19 Female d Chinese
20 Female c Asian
Subsetting the columns of df2 that have fewer than 4 categories:
> df2[,sapply(df2, function(col) length(unique(col)))<4]
Output:
y1 y3
1 Male Chinese
2 Female American
3 Female Asian
4 Female American
5 Female Asian
6 Female Chinese
7 Female Chinese
8 Female Chinese
9 Male American
10 Female Chinese
11 Female Chinese
12 Female American
13 Female American
14 Male Chinese
15 Male American
16 Male Asian
17 Male Asian
18 Female Chinese
19 Female Chinese
20 Female Asian
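The expression used in both examples counts distinct values in every column, so a numeric column with few distinct values would also be selected. If only categorical columns should qualify, one possible variation is sketched below. The helper name `subset_low_cardinality`, its `max_levels` argument, and the `mixed` data frame are hypothetical, and treating only character and factor columns as categorical is an assumption.

```r
# Hypothetical helper: keep only character/factor columns that have fewer
# than `max_levels` distinct non-missing values
subset_low_cardinality <- function(df, max_levels = 4) {
  keep <- vapply(df, function(col) {
    is_categorical <- is.character(col) || is.factor(col)
    is_categorical && length(unique(col[!is.na(col)])) < max_levels
  }, logical(1))
  # drop = FALSE keeps the result a data frame even for a single column
  df[, keep, drop = FALSE]
}

# Made-up data frame mixing categorical and numeric columns
mixed <- data.frame(
  sex   = c("Male", "Female", "Male", "Female", "Male"),
  grade = c("a", "b", "c", "d", "e"),
  score = c(1, 2, 1, 2, 1)  # numeric with 2 distinct values, not categorical
)

result <- subset_low_cardinality(mixed)
```

In this sketch only `sex` is kept: `grade` has five categories and `score` is numeric. `vapply` is used instead of `sapply` so that the mask is always a logical vector.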