How to find the number of characters in each row of a string column in R?

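Suppose an R data frame has a string column whose values mix letters and digits, and we want to count the letters in each row of that column. Calling nchar alone returns the total length of each string, digits included. To count only the letters, combine nchar with gsub: gsub removes every character that is not a letter, and nchar counts what remains, as the examples below show.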
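As a minimal sketch of the difference (the vector `codes` and the variable names are hypothetical, chosen only for illustration), the following compares the total length of each string with the count of uppercase letters only:

```r
# Hypothetical sample values mixing uppercase letters and digits
codes <- c("A01K", "140AL", "CGSLT")

# Total number of characters in each string (letters and digits)
total_chars <- nchar(codes)

# Delete everything that is not an uppercase letter, then count what is left
letter_chars <- nchar(gsub("[^A-Z]", "", codes))

print(total_chars)   # 4 5 5
print(letter_chars)  # 2 2 5
```

The two results agree only for values with no digits, such as "CGSLT".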

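Because R is case sensitive, the pattern must use the character class that matches the case of the letters being counted: [A-Z] for uppercase letters and [a-z] for lowercase letters.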
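The effect of case on the pattern can be sketched with a single mixed-case value. The string `s` and the variable names are assumptions made for this illustration, and the results assume a typical locale in which the ranges A-Z and a-z cover only the ASCII letters:

```r
# Hypothetical mixed-case value
s <- "Ab12cD"

# Keep only uppercase letters (A, D), then count them
upper_only <- nchar(gsub("[^A-Z]", "", s))

# Keep only lowercase letters (b, c), then count them
lower_only <- nchar(gsub("[^a-z]", "", s))

# Keep letters of either case (A, b, c, D), then count them
all_letters <- nchar(gsub("[^A-Za-z]", "", s))

print(c(upper_only, lower_only, all_letters))  # 2 2 4
```

If a column may contain both cases, a pattern such as "[^A-Za-z]" avoids silently dropping letters from the count.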

Example 1

The following snippet creates a sample data frame:

x<-c("A01K", "140AL", "A142R", "A255SW", "A2474EZ", "CA214N", "C14O", "CGSLT", "DC23QW", "D2411RWEDE", "FL233EGV", "G36521VCLPBA", "G54TRU", "H214FI", "245IA", "ID3699", "IL01", "IFDFDN", "K2254FDES", "KY244RLPKJ")

df1<-data.frame(x)

df1

This creates the following data frame:

   x
1  A01K
2  140AL
3  A142R
4  A255SW
5  A2474EZ
6  CA214N
7  C14O
8  CGSLT
9  DC23QW
10 D2411RWEDE
11 FL233EGV
12 G36521VCLPBA
13 G54TRU
14 H214FI
15 245IA
16 ID3699
17 IL01
18 IFDFDN
19 K2254FDES
20 KY244RLPKJ

To find the number of characters (here, uppercase letters) in each row of column x, add the following code to the above snippet:

x<-c("A01K", "140AL", "A142R", "A255SW", "A2474EZ", "CA214N", "C14O", "CGSLT", "DC23QW", "D2411RWEDE", "FL233EGV", "G36521VCLPBA", "G54TRU", "H214FI", "245IA", "ID3699", "IL01", "IFDFDN", "K2254FDES", "KY244RLPKJ")

df1<-data.frame(x)

# Remove every character that is not an uppercase letter, then count what remains
df1$No_of_Chars<-nchar(gsub("[^A-Z]","",df1$x))

df1

Output

If you execute all the above snippets as a single program, it generates the following output:

   x            No_of_Chars
1  A01K         2
2  140AL        2
3  A142R        2
4  A255SW       3
5  A2474EZ      3
6  CA214N       3
7  C14O         2
8  CGSLT        5
9  DC23QW       4
10 D2411RWEDE   6
11 FL233EGV     5
12 G36521VCLPBA 7
13 G54TRU       4
14 H214FI       3
15 245IA        2
16 ID3699       2
17 IL01         2
18 IFDFDN       6
19 K2254FDES    5
20 KY244RLPKJ   7
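One way to sanity-check these counts is sketched below. It assumes the values contain only uppercase letters and digits, so that letters plus digits should equal the total length. The columns No_of_Digits and Total_Chars are hypothetical additions, and only the first five values of x are used to keep the example short:

```r
# First five values of x from the example above
x <- c("A01K", "140AL", "A142R", "A255SW", "A2474EZ")

# stringsAsFactors = FALSE keeps x as character, since nchar() rejects factors
df1 <- data.frame(x, stringsAsFactors = FALSE)

# Uppercase letters only, as in the example above
df1$No_of_Chars <- nchar(gsub("[^A-Z]", "", df1$x))

# Digits only (hypothetical extra column)
df1$No_of_Digits <- nchar(gsub("[^0-9]", "", df1$x))

# Total length of each string (hypothetical extra column)
df1$Total_Chars <- nchar(df1$x)

# TRUE for every row when the value holds nothing but uppercase letters and digits
check <- df1$No_of_Chars + df1$No_of_Digits == df1$Total_Chars
print(all(check))  # TRUE
```

A row where the check is FALSE would indicate characters that neither pattern matches, such as lowercase letters, spaces, or punctuation.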

Example 2

The following snippet creates a sample data frame:

y<-c("ala5412bama","ala1475ska","american11022samoa","arizona3652","arkan1475sas","califor2365nia","co1475lorado","0014connecticut","dela25366ware","district257of22columbia","florid02535a","57412georgia","gu25987am","hawaii36250","20057idaho","i369852llinois","indiana0146563","3255iowa","kansas3682701","kentucky2574")

df2<-data.frame(y)

df2

This creates the following data frame:

   y
1  ala5412bama
2  ala1475ska
3  american11022samoa
4  arizona3652
5  arkan1475sas
6  califor2365nia
7  co1475lorado
8  0014connecticut
9  dela25366ware
10 district257of22columbia
11 florid02535a
12 57412georgia
13 gu25987am
14 hawaii36250
15 20057idaho
16 i369852llinois
17 indiana0146563
18 3255iowa
19 kansas3682701
20 kentucky2574

To find the number of characters (here, lowercase letters) in each row of column y, add the following code to the above snippet:

y<-c("ala5412bama","ala1475ska","american11022samoa","arizona3652","arkan1475sas","califor2365nia","co1475lorado","0014connecticut","dela25366ware","district257of22columbia","florid02535a","57412georgia","gu25987am","hawaii36250","20057idaho","i369852llinois","indiana0146563","3255iowa","kansas3682701","kentucky2574")

df2<-data.frame(y)

# Remove every character that is not a lowercase letter, then count what remains
df2$No_of_Chars<-nchar(gsub("[^a-z]","",df2$y))

df2

Output

If you execute all the above snippets as a single program, it generates the following output:

   y                       No_of_Chars
1  ala5412bama             7
2  ala1475ska              6
3  american11022samoa      13
4  arizona3652             7
5  arkan1475sas            8
6  califor2365nia          10
7  co1475lorado            8
8  0014connecticut         11
9  dela25366ware           8
10 district257of22columbia 18
11 florid02535a            7
12 57412georgia            7
13 gu25987am               4
14 hawaii36250             6
15 20057idaho              5
16 i369852llinois          8
17 indiana0146563          7
18 3255iowa                4
19 kansas3682701           6
20 kentucky2574            8
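An alternative to deleting the unwanted characters is to count the matches directly. The sketch below is not the method used in this article: it uses the base R functions gregexpr, regmatches, and lengths on three values taken from y, and the variable names are hypothetical:

```r
# Three values of y from the example above
y <- c("ala5412bama", "3255iowa", "kentucky2574")

# Extract every lowercase letter from each string as a list of character vectors
matches <- regmatches(y, gregexpr("[a-z]", y))

# Number of extracted letters per string
count_by_match <- lengths(matches)

# Same count with the gsub approach used in the article
count_by_gsub <- nchar(gsub("[^a-z]", "", y))

print(count_by_match)                            # 7 4 8
print(identical(count_by_match, count_by_gsub))  # TRUE
```

Both approaches give the same counts here. The gsub version is shorter, while the regmatches version also keeps the matched letters in case they are needed later.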

