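How to create separate columns for first and last names in R?
In data analysis, a person's first and last names are often stored together in a single column, separated by a space, so we need to split them to make the data easier to read and work with. To create separate columns for the first and last names in R, we can use the extract function of the tidyr package.
Check out the examples given below to understand how it can be done.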
Example 1
The following snippet creates a sample data frame:
Names <- c("John Jones", "Steve Smith", "Pat Cummins", "David Warner",
           "Andrew Flintoff", "Aaron Finch", "Mitchell Starc", "Nathan Lyon",
           "Mathew Wade", "Adam Zampa", "Adam Gilchrist", "Ricky Ponting",
           "Glenn McGrath", "Ben Cutting", "John Cena", "Brock Williams",
           "Rubel Hussain", "Soumya Sarkar", "Mehidy Hasan", "Liton Das")
df1 <- data.frame(Names)
df1
The following data frame is created:
             Names
1       John Jones
2      Steve Smith
3      Pat Cummins
4     David Warner
5  Andrew Flintoff
6      Aaron Finch
7   Mitchell Starc
8      Nathan Lyon
9      Mathew Wade
10      Adam Zampa
11  Adam Gilchrist
12   Ricky Ponting
13   Glenn McGrath
14     Ben Cutting
15       John Cena
16  Brock Williams
17   Rubel Hussain
18   Soumya Sarkar
19    Mehidy Hasan
20       Liton Das
To load the tidyr package and create separate columns for the first and last names in df1, add the following code to the above snippet:
library(tidyr)
extract(df1, Names, c("First_Name", "Last_Name"), "([^ ]+) (.*)")
If you execute all the above snippets as a single program, it generates the following output:
   First_Name Last_Name
1        John     Jones
2       Steve     Smith
3         Pat   Cummins
4       David    Warner
5      Andrew  Flintoff
6       Aaron     Finch
7    Mitchell     Starc
8      Nathan      Lyon
9      Mathew      Wade
10       Adam     Zampa
11       Adam Gilchrist
12      Ricky   Ponting
13      Glenn   McGrath
14        Ben   Cutting
15       John      Cena
16      Brock  Williams
17      Rubel   Hussain
18     Soumya    Sarkar
19     Mehidy     Hasan
20      Liton       Das
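To see what the pattern "([^ ]+) (.*)" passed to extract is doing, it can help to apply it to a single string with base R. The following is only an illustrative sketch: the variable names pattern, name, first and last are made up for this illustration, and sub is used here as a stand-in to show the two capture groups, not as a description of how tidyr works internally.

```r
# Same pattern as in the extract() call above:
#   ([^ ]+)  first group: one or more non-space characters (the first name)
#   " "      a single literal space
#   (.*)     second group: everything that follows (the last name)
pattern <- "([^ ]+) (.*)"

# Hypothetical variable holding one of the names from df1
name <- "Andrew Flintoff"

# "\\1" and "\\2" refer to the first and second capture groups
first <- sub(pattern, "\\1", name)
last  <- sub(pattern, "\\2", name)

first
last
```

Because the first group stops at the first space, each group of the pattern fills one of the new columns, in the order they are listed in the call to extract.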
Example 2
The following snippet creates a sample data frame:
Names <- c("Kane Williamson", "Devon Conway", "Trent Boult", "Ross Taylor",
           "Martin Guptill", "Tim Southee", "James Neesham", "Lockie Ferguson",
           "Ish Sodhi", "Matt Henry", "Tom Latham", "Mark Chapman",
           "Henry Nicholos", "Tom Bundell", "Sachin Tendulkar", "Rahul Dravid",
           "Chris Gayle", "Tabraiz Shamsi", "Aiden Makram", "David Miller")
df2 <- data.frame(Names)
df2
The following data frame is created:
              Names
1   Kane Williamson
2      Devon Conway
3       Trent Boult
4       Ross Taylor
5    Martin Guptill
6       Tim Southee
7     James Neesham
8   Lockie Ferguson
9         Ish Sodhi
10       Matt Henry
11       Tom Latham
12     Mark Chapman
13   Henry Nicholos
14      Tom Bundell
15 Sachin Tendulkar
16     Rahul Dravid
17      Chris Gayle
18   Tabraiz Shamsi
19     Aiden Makram
20     David Miller
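To create separate columns for the first and last names in df2, add the following code to the above snippet (the tidyr package must already be loaded with library(tidyr)):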
extract(df2, Names, c("First_Name", "Last_Name"), "([^ ]+) (.*)")
If you execute all the above snippets as a single program, it generates the following output:
   First_Name  Last_Name
1        Kane Williamson
2       Devon     Conway
3       Trent      Boult
4        Ross     Taylor
5      Martin    Guptill
6         Tim    Southee
7       James    Neesham
8      Lockie   Ferguson
9         Ish      Sodhi
10       Matt      Henry
11        Tom     Latham
12       Mark    Chapman
13      Henry   Nicholos
14        Tom    Bundell
15     Sachin  Tendulkar
16      Rahul     Dravid
17      Chris      Gayle
18    Tabraiz     Shamsi
19      Aiden     Makram
20      David     Miller
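Both examples use names with exactly two parts. As a hedged sketch of what the same call might do with less regular data, the snippet below uses a small made-up data frame, df3, whose names are hypothetical and not taken from the examples. It assumes the tidyr package is installed.

```r
library(tidyr)

# Hypothetical data: a three-part name, a single-word name, and a regular one
df3 <- data.frame(Names = c("Mary Jane Watson", "Madonna", "Kane Williamson"))

# Same call as in the examples, stored in a hypothetical variable res
res <- extract(df3, Names, c("First_Name", "Last_Name"), "([^ ]+) (.*)")
res
```

With this pattern, everything after the first space goes into Last_Name, so "Mary Jane Watson" is split into "Mary" and "Jane Watson". A name with no space does not match the pattern at all, and extract fills both new columns with NA for that row, so it is worth checking for missing values after the split.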