pg性能分析

database

postgresql 库中出现性能问题,对于复杂的sql, 常用分析过程:

  1. 简化SQL,定位性能异常点:
  2. 简化输出。像下面语句,可以先把输出的子查询去掉。有时也可以使用count(*)代替输出。
  3. 逐个测试union(minus),with子句。基于这些语句的独立性,可以逐个测试,逐渐添加条件,找到异常点。
  4. 分析执行计划,查看表数据量,连接方式,统计信息情况,索引情况
  5. Explain  各部分的消耗,连接方式等,如果语句可以在接受时间内执行,可以使用explain(analyze, buffers, timing)
  6. Pg_stat_user_table可以查看什么时候做的vacuum和analyze,live tuple和dead tuple个数,还有增删改查的次数等。
  7. Pg_stats 可以查看值的分布情况

回到下面的SQL:

1. 先做简化,使用count(*)替换所有输出:

explain(analyze , buffers, timing) select count(*)

  from sms_task_content_info    a,

       tsk_type_tbl b,

       tsk_plan_info       c,

       sm_code_tbl      d,

       smu_info            e

where a.course_type = b.course_type

   and a.course_id = c.content_id

   and c.plan_maker = e.user_id

   and e.region_code = d.region_code

   and d.is_valid = "Y"

   and c.date_plan >= to_date("2016-12-01", "yyyy-mm-dd")

   and c.date_plan < to_date("2016-12-31", "yyyy-mm-dd") + 1

;

 

                                                                              QUERY PLAN                                                                              

----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=2947.16..2947.18 rows=1 width=0) (actual time=49.154..49.154 rows=1 loops=1)

   Buffers: shared hit=1602

   ->  Hash Join  (cost=657.32..2928.06 rows=7643 width=0) (actual time=13.259..48.521 rows=7440 loops=1)

         Hash Cond: ((c.content_id)::text = (a.course_id)::text)

         Buffers: shared hit=1602

         ->  Hash Join  (cost=459.24..2615.33 rows=7643 width=33) (actual time=10.020..42.532 rows=7440 loops=1)

               Hash Cond: ((c.plan_maker)::text = (e.user_id)::text)

               Buffers: shared hit=1491

               ->  Seq Scan on tsk_plan_info c  (cost=0.00..2022.34 rows=7643 width=45) (actual time=0.629..29.272 rows=7440 loops=1)

                     Filter: ((date_plan >= to_date("2016-12-01"::text, "yyyy-mm-dd"::text)) AND (date_plan < (to_date("2016-12-31"::text, "yyyy-mm-dd"::text) + 1)))

                     Rows Removed by Filter: 25003

                     Buffers: shared hit=1286

               ->  Hash  (cost=412.29..412.29 rows=3756 width=12) (actual time=9.377..9.377 rows=3756 loops=1)

                     Buckets: 1024  Batches: 1  Memory Usage: 164kB

                     Buffers: shared hit=205

                     ->  Hash Join  (cost=179.00..412.29 rows=3756 width=12) (actual time=3.754..7.788 rows=3756 loops=1)

                           Hash Cond: ((e.region_code)::text = (d.region_code)::text)

                           Buffers: shared hit=205

                           ->  Seq Scan on smu_info e  (cost=0.00..167.56 rows=3756 width=14) (actual time=0.006..1.228 rows=3756 loops=1)

                                 Buffers: shared hit=130

                           ->  Hash  (cost=127.00..127.00 rows=4160 width=6) (actual time=3.736..3.736 rows=4103 loops=1)

                                 Buckets: 1024  Batches: 1  Memory Usage: 156kB

                                 Buffers: shared hit=75

                                 ->  Seq Scan on sms_region_code_tbl d  (cost=0.00..127.00 rows=4160 width=6) (actual time=0.003..2.201 rows=4103 loops=1)

                                       Filter: ((is_valid)::text = "Y"::text)

                                       Rows Removed by Filter: 4

                                       Buffers: shared hit=75

         ->  Hash  (cost=171.94..171.94 rows=2092 width=33) (actual time=3.231..3.231 rows=2093 loops=1)

               Buckets: 1024  Batches: 1  Memory Usage: 133kB

               Buffers: shared hit=111

               ->  Hash Join  (cost=12.25..171.94 rows=2092 width=33) (actual time=0.021..2.231 rows=2093 loops=1)

                     Hash Cond: ((a.course_type)::text = (b.course_type)::text)

                     Buffers: shared hit=111

                     ->  Seq Scan on sms_task_content_info a  (cost=0.00..130.92 rows=2092 width=35) (actual time=0.004..0.818 rows=2093 loops=1)

                           Buffers: shared hit=110

                     ->  Hash  (cost=11.00..11.00 rows=100 width=20) (actual time=0.009..0.009 rows=6 loops=1)

                           Buckets: 1024  Batches: 1  Memory Usage: 1kB

                           Buffers: shared hit=1

                           ->  Seq Scan on tsk_type_tbl b  (cost=0.00..11.00 rows=100 width=20) (actual time=0.003..0.005 rows=6 loops=1)

                                 Buffers: shared hit=1

Planning time: 2.522 ms

Execution time: 49.270 ms

(42 rows)

 

Time: 71.990 ms

 

去掉子查询后,语句很快就输出了, 问题就在输出结果里的子查询,到最后输出7440行,就意味着那两个子查询都需要7440次。整体语句慢在这里。

 

 

select distinct d.description as "RANGE",

                b.description as "COURSE_CLASSIFICATION_DESC",

                to_char(c.date_make, "yyyy-mm-dd") as "DATE_MAKE",

                to_char(c.date_end, "yyyy-mm-dd") as "DATE_END",

                to_char(c.date_plan, "yyyy-mm-dd") as "DATE_PLAN",

                (select cast((case

                               when (select count(1)

                                       from sms_task_content_info a2,

                                            tsk_plan_info    b2,

                                            smu_info         s2

                                      where a2.course_id = b2.content_id

                                        and b2.plan_maker = s2.user_id

                                        and b2.plan_status != "2"

                                        and s2.region_code = e.region_code

                                        and a2.course_type = a.course_type

                                        and to_char(b2.date_make, "yyyy-mm-dd") =

                                            to_char(c.date_make, "yyyy-mm-dd")

                                        and to_char(b2.date_plan, "yyyy-mm-dd") =

                                            to_char(c.date_plan, "yyyy-mm-dd")

                                        and to_char(b2.date_end, "yyyy-mm-dd") =

                                            to_char(c.date_end, "yyyy-mm-dd")) != 0 then

                                (cast(100 AS numeric(5, 2)) *

                                (select count(1)

                                    from sms_task_content_info a1,

                                         tsk_plan_info    b1,

                                         smu_info         s1

                                   where a1.course_id = b1.content_id

                                     and b1.plan_maker = s1.user_id

                                     and b1.plan_status = "1"

                                     and s1.region_code = e.region_code

                                     and a1.course_type = a.course_type

                                     and to_char(b1.date_make, "yyyy-mm-dd") =

                                         to_char(c.date_make, "yyyy-mm-dd")

                                     and to_char(b1.date_plan, "yyyy-mm-dd") =

                                         to_char(c.date_plan, "yyyy-mm-dd")

                                     and to_char(b1.date_end, "yyyy-mm-dd") =

                                         to_char(c.date_end, "yyyy-mm-dd")) /

                                (select count(1)

                                    from sms_task_content_info a2,

                                         tsk_plan_info    b2,

                                         smu_info         s2

                                   where a2.course_id = b2.content_id

                                     and b2.plan_maker = s2.user_id

                                     and b2.plan_status != "2"

                                     and s2.region_code = e.region_code

                                     and a2.course_type = a.course_type

                                     and to_char(b2.date_make, "yyyy-mm-dd") =

                                         to_char(c.date_make, "yyyy-mm-dd")

                                     and to_char(b2.date_plan, "yyyy-mm-dd") =

                                         to_char(c.date_plan, "yyyy-mm-dd")

                                     and to_char(b2.date_end, "yyyy-mm-dd") =

                                         to_char(c.date_end, "yyyy-mm-dd")))

                               else

                                "0"

                             end) AS numeric(5, 2)) || "%"

                   from dual) as "FINISH_RATIO",

                d.region_code,

                b.course_type

  from sms_task_content_info    a,

       tsk_type_tbl b,

       tsk_plan_info       c,

       sm_code_tbl      d,

       smu_info            e

where a.course_type = b.course_type

   and a.course_id = c.content_id

   and c.plan_maker = e.user_id

   and e.region_code = d.region_code

   and d.is_valid = "Y"

  -- and e.region_code in()

  -- and a.course_type = "1"

   and c.date_plan >= to_date("2016-12-01", "yyyy-mm-dd")

   and c.date_plan < to_date("2016-12-31", "yyyy-mm-dd") + 1

order by d.region_code, b.course_type;

 

以上是 pg性能分析 的全部内容, 来源链接: utcz.com/z/534354.html

回到顶部