How to standardize columns in an R data frame?

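Standardizing a vector or a column of an R data frame means transforming it so that its mean becomes 0 and its standard deviation becomes 1, that is, converting its values to z-scores. This only shifts and rescales the values; it does not change the shape of the distribution, so a column that is not normally distributed does not become standard normal. In R, this is easily done with the scale() function. The examples below show how.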
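As a minimal sketch of what this definition means in practice, the snippet below standardizes a vector by hand and compares the result with scale(). The vector `v` and its values are hypothetical and chosen only for illustration; they are not part of the example data.

```r
# Illustrative vector (hypothetical values, not part of the example data)
v <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Manual standardization: subtract the mean, divide by the sample standard deviation
z_manual <- (v - mean(v)) / sd(v)

# scale() returns a one-column matrix; as.vector() drops the matrix structure and attributes
z_scale <- as.vector(scale(v))

all.equal(z_manual, z_scale)  # TRUE
mean(z_manual)                # 0, up to floating-point error
sd(z_manual)                  # 1, up to floating-point error
```

With its default arguments, scale() uses the sample standard deviation (the n - 1 denominator), which is why it agrees with sd().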

Example

Consider the following data frame:

> set.seed(3665)

> x1<-rnorm(20,1,0.35)

> x2<-rnorm(20,50,1.25)

> x3<-rnorm(20,125,10.27)

> x4<-rpois(20,5)

> x5<-runif(20,1,5)

> x6<-rexp(20,1.35)

> df<-data.frame(x1,x2,x3,x4,x5,x6)

> df

Output

x1 x2 x3 x4 x5 x6

1 1.3958185 49.39843 128.5224 3 4.183664 2.33406246

2 1.0467979 48.90103 120.5796 7 3.526731 0.02043217

3 0.9190516 50.74664 110.4765 6 2.145181 0.04268455

4 1.1196425 47.83063 126.3711 9 4.276084 0.87234197

5 1.0033896 51.31879 144.2594 5 3.308073 0.28540083

6 0.7571435 49.92559 109.9660 5 2.349070 0.09613835

7 0.8266129 48.93754 135.5895 3 2.479160 0.15018153

8 1.2786206 50.27384 122.8543 4 4.343062 1.26431542

9 0.8661156 50.36976 122.9482 7 3.517678 0.24045191

10 0.9237285 48.55069 121.6440 4 1.619902 0.72327013

11 0.8191029 49.27937 111.8696 3 4.760655 0.97199973

12 1.2619135 50.91131 129.0021 4 3.355301 1.42184615

13 1.5297983 49.38604 133.4756 1 2.977833 0.50042231

14 0.7858227 47.92899 142.0669 3 3.262058 0.37260602

15 0.5626517 51.22160 107.5586 2 3.194546 0.21176125

16 1.2106700 51.65911 132.4945 3 1.088987 0.78318970

17 1.5351378 48.60769 116.1427 7 3.423079 0.65904040

18 0.4134951 52.09415 125.1567 5 4.309763 0.03658430

19 1.0490230 49.87242 125.6695 6 2.255468 1.32173240

20 0.9521718 50.96409 131.8025 1 4.312514 0.26955446

Standardizing the columns of the data frame df:

Example

> scale(df$x1)

Output

       [,1]

[1,] 1.29357316

[2,] 0.11471258

[3,] -0.31676647

[4,] 0.36075419

[5,] -0.03190435

[6,] -0.86363126

[7,] -0.62898965

[8,] 0.89772243

[9,] -0.49556454

[10,] -0.30096951

[11,] -0.65435564

[12,] 0.84129223

[13,] 1.74610668

[14,] -0.76676371

[15,] -1.52055135

[16,] 0.66821113

[17,] 1.76414130

[18,] -2.02434637

[19,] 0.12222816

[20,] -0.20489902

attr(,"scaled:center")

[1] 1.012835

attr(,"scaled:scale")

[1] 0.2960661
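The two attributes printed at the end of the output record the mean that was subtracted ("scaled:center") and the standard deviation that was divided by ("scaled:scale"). The following sketch, which assumes the same seed and rnorm() call as the example above, reads those attributes and uses them to recover the original values. The names `z1` and `x1_back` are introduced here for illustration only.

```r
# Recreate x1 with the same seed and call as in the example above
set.seed(3665)
x1 <- rnorm(20, 1, 0.35)

z1 <- scale(x1)

# The values scale() used for centering and scaling
center <- attr(z1, "scaled:center")  # equals mean(x1)
spread <- attr(z1, "scaled:scale")   # equals sd(x1)

# Reverse the transformation: multiply by the scale, add back the center
x1_back <- as.vector(z1) * spread + center

all.equal(x1_back, x1)  # TRUE
```

Keeping these two values is useful when new data has to be standardized with the same center and scale as the original column.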

Example

> scale(df$x2)

Output

        [,1]

[1,] -0.41047444

[2,] -0.81044801

[3,] 0.67365908

[4,] -1.67118308

[5,] 1.13374554

[6,] 0.01343048

[7,] -0.78108246

[8,] 0.29346832

[9,] 0.37059777

[10,] -1.09216479

[11,] -0.50620790

[12,] 0.80607839

[13,] -0.42043493

[14,] -1.59208513

[15,] 1.05558665

[16,] 1.40740638

[17,] -1.04632509

[18,] 1.75723305

[19,] -0.02932018

[20,] 0.84852032

attr(,"scaled:center")

[1] 49.90889

attr(,"scaled:scale")

[1] 1.243585

Example

> scale(df$x3)

Output

        [,1]

[1,] 0.34992259

[2,] -0.42214208

[3,] -1.40419396

[4,] 0.14081369

[5,] 1.87961161

[6,] -1.45381494

[7,] 1.03687396

[8,] -0.20103605

[9,] -0.19190685

[10,] -0.31867565

[11,] -1.26878469

[12,] 0.39654942

[13,] 0.83139424

[14,] 1.66649197

[15,] -1.68782930

[16,] 0.73602324

[17,] -0.85342655

[18,] 0.02276361

[19,] 0.07260960

[20,] 0.66875614

attr(,"scaled:center")

[1] 124.9225

attr(,"scaled:scale")

[1] 10.28772

Example

> scale(df$x4)

Output

        [,1]

[1,] -0.6550055

[2,] 1.2164389

[3,] 0.7485778

[4,] 2.1521611

[5,] 0.2807167

[6,] 0.2807167

[7,] -0.6550055

[8,] -0.1871444

[9,] 1.2164389

[10,] -0.1871444

[11,] -0.6550055

[12,] -0.1871444

[13,] -1.5907277

[14,] -0.6550055

[15,] -1.1228666

[16,] -0.6550055

[17,] 1.2164389

[18,] 0.2807167

[19,] 0.7485778

[20,] -1.5907277

attr(,"scaled:center")

[1] 4.4

attr(,"scaled:scale")

[1] 2.137387

Example

> scale(df$x5)

Output

         [,1]

[1,] 0.95270619

[2,] 0.29336256

[3,] -1.09325500

[4,] 1.04546450

[5,] 0.07390313

[6,] -0.88861840

[7,] -0.75805159

[8,] 1.11268817

[9,] 0.28427648

[10,] -1.62046133

[11,] 1.53181336

[12,] 0.12130380

[13,] -0.25754850

[14,] 0.02771942

[15,] -0.04004115

[16,] -2.15332455

[17,] 0.18933035

[18,] 1.07926704

[19,] -0.98256346

[20,] 1.08202896

attr(,"scaled:center")

[1] 3.23444

attr(,"scaled:scale")

[1] 0.9963449

Example

> scale(df$x6)

Output

        [,1]

[1,] 2.8372424

[2,] -1.0124395

[3,] -0.9754135

[4,] 0.4050652

[5,] -0.5715544

[6,] -0.8864709

[7,] -0.7965477

[8,] 1.0572753

[9,] -0.6463456

[10,] 0.1570224

[11,] 0.5708871

[12,] 1.3193929

[13,] -0.2137772

[14,] -0.4264525

[15,] -0.6940844

[16,] 0.2567235

[17,] 0.0501497

[18,] -0.9855638

[19,] 1.1528122

[20,] -0.5979214

attr(,"scaled:center")

[1] 0.6289008

attr(,"scaled:scale")

[1] 0.6009926
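The examples above standardize one column at a time. Because scale() also accepts a data frame or matrix and works column by column, all numeric columns can be standardized in one call. The sketch below assumes the same seed and generating calls as the example data; the names `df_std`, `df_part`, and `cols` are introduced here for illustration only.

```r
# Recreate the example data frame with the same seed and calls
set.seed(3665)
x1 <- rnorm(20, 1, 0.35)
x2 <- rnorm(20, 50, 1.25)
x3 <- rnorm(20, 125, 10.27)
x4 <- rpois(20, 5)
x5 <- runif(20, 1, 5)
x6 <- rexp(20, 1.35)
df <- data.frame(x1, x2, x3, x4, x5, x6)

# Standardize every column at once; scale() returns a matrix,
# so convert the result back to a data frame
df_std <- as.data.frame(scale(df))

colMeans(df_std)    # each 0, up to floating-point error
sapply(df_std, sd)  # each 1, up to floating-point error

# Standardize only selected columns, leaving the others unchanged
cols <- c("x1", "x2")
df_part <- df
df_part[cols] <- lapply(df_part[cols], function(col) as.vector(scale(col)))
```

Wrapping the result in as.data.frame() keeps the column names while returning a data frame instead of a matrix, which is usually more convenient for further analysis.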
