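How to get summary statistics, including all basic statistical values, for the columns of an R data frame?
When we apply the summary function in R, the output gives the minimum, first quartile, median, mean, third quartile, and maximum. Many other basic statistics also help us understand a variable, such as the range, sum, standard error of the mean, variance, standard deviation, and coefficient of variation. To get all of these values at once, we can use the stat.desc function from the pastecs package, as shown in the following examples.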
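For readers who want to see what these extra statistics amount to, here is a minimal base R sketch that computes them for a single numeric vector. The helper name extra_stats and the sample vector v are hypothetical, chosen only for illustration; the sketch assumes the usual sample definitions (variance with an n - 1 denominator, coefficient of variation as standard deviation divided by mean).

```r
# Hypothetical helper: extra_stats is an illustrative name,
# not a function from base R or pastecs.
extra_stats <- function(v) {
  v <- v[!is.na(v)]                 # drop missing values first
  n <- length(v)
  c(
    range    = max(v) - min(v),     # distance between the extremes
    sum      = sum(v),
    SE.mean  = sd(v) / sqrt(n),     # standard error of the mean
    var      = var(v),              # sample variance (n - 1 denominator)
    std.dev  = sd(v),
    coef.var = sd(v) / mean(v)      # coefficient of variation
  )
}

# Illustrative data, not taken from the article
v <- c(2, 4, 4, 4, 5, 5, 7, 9)
res <- extra_stats(v)
print(res)
```

To apply the same helper to every column of a data frame, something like sapply(df, extra_stats) should work, provided all columns are numeric.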
Example 1
Consider the following data frame:
> x1<-rnorm(20)
> x2<-rnorm(20)
> x3<-rnorm(20)
> df1<-data.frame(x1,x2,x3)
> df1
Output
x1 x2 x3
1 1.37057327 0.96585723 -1.6824440
2 0.43258556 -2.54077794 -1.5962218
3 0.68188832 1.08144561 -0.9956110
4 0.24553258 0.07541754 -0.3527252
5 -0.19946765 0.49262220 -0.7946248
6 -1.93924451 0.13544724 -0.4184053
7 0.27443524 0.08363552 0.8696729
8 -2.02613035 -0.67827697 -0.8940207
9 0.33772301 -1.51171368 0.4032073
10 -0.44463177 1.69245587 1.7037202
11 1.69256604 -0.60384845 0.7247898
12 0.11356829 1.05543184 0.9780191
13 -0.01516246 0.92529906 0.4805570
14 -0.78159893 -0.55414738 -0.4680645
15 -0.08974609 0.76847977 -0.2780631
16 -0.45456509 1.08361106 -1.6672789
17 1.13920983 0.24680491 1.3922984
18 0.55562889 -0.06529163 -0.7083794
19 -0.11607439 1.09421670 2.1602874
20 -0.78351132 0.48005020 0.3453250
Find the summary of df1 using the summary function:
> summary(df1)
Output
x1 x2 x3
Min. :-2.0261304 Min. :-2.5408 Min. :-1.6824
1st Qu.:-0.4471151 1st Qu.:-0.1875 1st Qu.:-0.8195
Median : 0.0492029 Median : 0.3634 Median :-0.3154
Mean :-0.0003211 Mean : 0.2113 Mean :-0.0399
3rd Qu.: 0.4633464 3rd Qu.: 0.9883 3rd Qu.: 0.7610
Max. : 1.6925660 Max. : 1.6925 Max. : 2.1603
Load the pastecs package and use the stat.desc function to find the statistical summary of df1:
> library(pastecs)
> stat.desc(df1)
Output
x1 x2 x3
nbr.val 2.000000e+01 20.0000000 20.00000000
nbr.null 0.000000e+00 0.0000000 0.00000000
nbr.na 0.000000e+00 0.0000000 0.00000000
min -2.026130e+00 -2.5407779 -1.68244397
max 1.692566e+00 1.6924559 2.16028742
range 3.718696e+00 4.2332338 3.84273139
sum -6.421540e-03 4.2267187 -0.79796158
median 4.920292e-02 0.3634276 -0.31539416
mean -3.210770e-04 0.2113359 -0.03989808
SE.mean 2.103941e-01 0.2262258 0.25081489
CI.mean.0.95 4.403600e-01 0.4734961 0.52496160
var 8.853137e-01 1.0235624 1.25816219
std.dev 9.409111e-01 1.0117126 1.12167829
coef.var -2.930484e+03 4.7872246 -28.11359138
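The coef.var value for x1 in the output above looks alarming, but it follows directly from dividing the standard deviation by a mean that is very close to zero. The sketch below reproduces it from the printed mean and std.dev values. It also shows one possible way to make the table easier to read by rounding, which runs only if pastecs is installed; the seed and the data frame df_demo are hypothetical stand-ins, since the article's random draws cannot be regenerated.

```r
# Values copied from the stat.desc(df1) output for column x1
x1_mean <- -3.210770e-04
x1_sd   <-  9.409111e-01

# Coefficient of variation = standard deviation / mean.
# A mean near zero inflates the ratio, hence the huge magnitude.
cv_x1 <- x1_sd / x1_mean
print(cv_x1)

# Optional: rounding avoids the scientific notation in the first column.
# df_demo and the seed are assumptions for illustration only.
if (requireNamespace("pastecs", quietly = TRUE)) {
  set.seed(1)
  df_demo <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
  print(round(pastecs::stat.desc(df_demo), 4))
}
```

For variables centered near zero, such as standard normal draws, the coefficient of variation is usually not a meaningful measure, so it is reasonable to ignore that row here.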
Example 2
> y1<-rpois(20,5)
> y2<-rpois(20,2)
> y3<-rpois(20,10)
> y4<-rpois(20,8)
> df2<-data.frame(y1,y2,y3,y4)
> df2
Output
y1 y2 y3 y4
1 4 4 10 6
2 4 1 9 8
3 2 3 12 9
4 4 0 11 4
5 7 3 7 7
6 6 0 9 18
7 5 1 7 3
8 6 2 5 10
9 5 1 10 5
10 6 1 12 7
11 11 2 8 7
12 4 2 10 11
13 4 3 7 6
14 4 0 11 15
15 10 1 8 8
16 5 0 6 8
17 3 1 13 14
18 4 1 8 5
19 5 1 5 4
20 8 2 13 5
Use the stat.desc function to find the statistical summary of df2:
> stat.desc(df2)
Output
y1 y2 y3 y4
nbr.val 20.0000000 20.0000000 20.0000000 20.0000000
nbr.null 0.0000000 4.0000000 0.0000000 0.0000000
nbr.na 0.0000000 0.0000000 0.0000000 0.0000000
min 2.0000000 0.0000000 5.0000000 3.0000000
max 11.0000000 4.0000000 13.0000000 18.0000000
range 9.0000000 4.0000000 8.0000000 15.0000000
sum 107.0000000 29.0000000 181.0000000 160.0000000
median 5.0000000 1.0000000 9.0000000 7.0000000
mean 5.3500000 1.4500000 9.0500000 8.0000000
SE.mean 0.4988144 0.2562380 0.5547641 0.8795932
CI.mean.0.95 1.0440305 0.5363122 1.1611345 1.8410097
var 4.9763158 1.3131579 6.1552632 15.4736842
std.dev 2.2307657 1.1459310 2.4809803 3.9336604
coef.var 0.4169656 0.7902973 0.2741415 0.4917076
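As a cross-check, the count rows and the confidence-interval row for y2 can be reproduced with base R alone. The sketch below copies the y2 values from the df2 listing above. Reading CI.mean.0.95 as the half-width of a 95% t-based confidence interval for the mean is an inference from the fact that the numbers match the printed output; the variable names are illustrative.

```r
# y2 column copied from the df2 listing above
y2 <- c(4, 1, 3, 0, 3, 0, 1, 2, 1, 1, 2, 2, 3, 0, 1, 0, 1, 1, 1, 2)

n        <- sum(!is.na(y2))               # nbr.val: non-missing values
nbr_null <- sum(y2 == 0, na.rm = TRUE)    # nbr.null: values equal to zero
nbr_na   <- sum(is.na(y2))                # nbr.na: missing values
se_mean  <- sd(y2) / sqrt(n)              # SE.mean
# Half-width of the 95% confidence interval, using the t distribution
ci_half  <- qt(0.975, df = n - 1) * se_mean

print(c(nbr.val = n, nbr.null = nbr_null, nbr.na = nbr_na,
        SE.mean = se_mean, CI.mean.0.95 = ci_half))
```

Under this reading, the interval for the mean of y2 is the mean plus or minus ci_half.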