PostgreSQL performance analysis

Z时代
2024-01-10
Category: General
database
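When a PostgreSQL database runs into a performance problem with a complex SQL statement, the usual analysis goes like this:
Simplify the SQL to locate the slow part:
Simplify the output. For the statement analysed here, start by removing the subqueries from the select list. Sometimes count(*) can stand in for the whole select list.
Test each UNION (or EXCEPT, PostgreSQL's counterpart of Oracle's MINUS) branch and each WITH clause on its own. Because these parts are independent, they can be run one at a time, adding conditions step by step until the slow part shows up.
Analyse the execution plan, and check table sizes, join methods, statistics, and indexes:
EXPLAIN shows the cost of each plan node, the join methods, and so on. If the statement finishes in an acceptable time, use explain (analyze, buffers, timing), which actually executes the statement and reports real timings and buffer usage.
pg_stat_user_tables shows when each table was last vacuumed and analyzed, the number of live and dead tuples, and the counts of scans, inserts, updates, and deletes.
pg_stats shows how the values in each column are distributed.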
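As a rough illustration of the two statistics views mentioned above, the queries below are a minimal sketch. They assume the tables from this post live in a schema on the current search path, and the choice of plan_status as the column to inspect is only an example.

```sql
-- Maintenance history, tuple counts and activity counters per table.
select relname,
       n_live_tup,        -- estimated live rows
       n_dead_tup,        -- estimated dead rows awaiting vacuum
       last_vacuum,
       last_autovacuum,
       last_analyze,
       last_autoanalyze,
       seq_scan,          -- sequential scans started on the table
       idx_scan,          -- index scans started on the table
       n_tup_ins,
       n_tup_upd,
       n_tup_del
  from pg_stat_user_tables
 where relname in ('tsk_plan_info', 'smu_info', 'sms_task_content_info');

-- Value distribution the planner uses for one column
-- (plan_status is picked here purely as an example).
select attname,
       null_frac,
       n_distinct,
       most_common_vals,
       most_common_freqs
  from pg_stats
 where tablename = 'tsk_plan_info'
   and attname = 'plan_status';
```

If last_analyze and last_autoanalyze are both empty for a table, the planner is working from default estimates, and running ANALYZE on that table is usually the first thing to try.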
Back to the full SQL statement listed at the end of this post:
1. Simplify first, replacing the whole select list with count(*):
explain (analyze, buffers, timing) select count(*)
  from sms_task_content_info    a,
       tsk_type_tbl b,
       tsk_plan_info       c,
       sm_code_tbl      d,
       smu_info            e
where a.course_type = b.course_type
   and a.course_id = c.content_id
   and c.plan_maker = e.user_id
   and e.region_code = d.region_code
   and d.is_valid = 'Y'
   and c.date_plan >= to_date('2016-12-01', 'yyyy-mm-dd')
   and c.date_plan < to_date('2016-12-31', 'yyyy-mm-dd') + 1
;
 
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=2947.16..2947.18 rows=1 width=0) (actual time=49.154..49.154 rows=1 loops=1)
   Buffers: shared hit=1602
   ->  Hash Join  (cost=657.32..2928.06 rows=7643 width=0) (actual time=13.259..48.521 rows=7440 loops=1)
         Hash Cond: ((c.content_id)::text = (a.course_id)::text)
         Buffers: shared hit=1602
         ->  Hash Join  (cost=459.24..2615.33 rows=7643 width=33) (actual time=10.020..42.532 rows=7440 loops=1)
               Hash Cond: ((c.plan_maker)::text = (e.user_id)::text)
               Buffers: shared hit=1491
               ->  Seq Scan on tsk_plan_info c  (cost=0.00..2022.34 rows=7643 width=45) (actual time=0.629..29.272 rows=7440 loops=1)
                     Filter: ((date_plan >= to_date('2016-12-01'::text, 'yyyy-mm-dd'::text)) AND (date_plan < (to_date('2016-12-31'::text, 'yyyy-mm-dd'::text) + 1)))
                     Rows Removed by Filter: 25003
                     Buffers: shared hit=1286
               ->  Hash  (cost=412.29..412.29 rows=3756 width=12) (actual time=9.377..9.377 rows=3756 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 164kB
                     Buffers: shared hit=205
                     ->  Hash Join  (cost=179.00..412.29 rows=3756 width=12) (actual time=3.754..7.788 rows=3756 loops=1)
                           Hash Cond: ((e.region_code)::text = (d.region_code)::text)
                           Buffers: shared hit=205
                           ->  Seq Scan on smu_info e  (cost=0.00..167.56 rows=3756 width=14) (actual time=0.006..1.228 rows=3756 loops=1)
                                 Buffers: shared hit=130
                           ->  Hash  (cost=127.00..127.00 rows=4160 width=6) (actual time=3.736..3.736 rows=4103 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 156kB
                                 Buffers: shared hit=75
                                 ->  Seq Scan on sms_region_code_tbl d  (cost=0.00..127.00 rows=4160 width=6) (actual time=0.003..2.201 rows=4103 loops=1)
                                       Filter: ((is_valid)::text = 'Y'::text)
                                       Rows Removed by Filter: 4
                                       Buffers: shared hit=75
         ->  Hash  (cost=171.94..171.94 rows=2092 width=33) (actual time=3.231..3.231 rows=2093 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 133kB
               Buffers: shared hit=111
               ->  Hash Join  (cost=12.25..171.94 rows=2092 width=33) (actual time=0.021..2.231 rows=2093 loops=1)
                     Hash Cond: ((a.course_type)::text = (b.course_type)::text)
                     Buffers: shared hit=111
                     ->  Seq Scan on sms_task_content_info a  (cost=0.00..130.92 rows=2092 width=35) (actual time=0.004..0.818 rows=2093 loops=1)
                           Buffers: shared hit=110
                     ->  Hash  (cost=11.00..11.00 rows=100 width=20) (actual time=0.009..0.009 rows=6 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           Buffers: shared hit=1
                           ->  Seq Scan on tsk_type_tbl b  (cost=0.00..11.00 rows=100 width=20) (actual time=0.003..0.005 rows=6 loops=1)
                                 Buffers: shared hit=1
Planning time: 2.522 ms
Execution time: 49.270 ms
(42 rows)
 
Time: 71.990 ms
 
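With the subqueries removed, the statement returns quickly, so the problem lies in the subqueries in the select list. The join produces 7440 rows, which means each of the two distinct correlated subqueries (one of which appears twice in the CASE expression) has to be evaluated 7440 times. That is where the whole statement spends its time.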
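One common way to avoid evaluating a correlated subquery once per row is to compute the counts once per group and join the result back. The statement below is only a sketch of that idea, not the author's fix. It assumes the subqueries mean exactly what they say in the original statement, switches the comma joins to explicit JOIN syntax so that a LEFT JOIN can be used, and introduces the hypothetical names plan_stats, finished_cnt, and total_cnt. Its output should be compared with the original's before it is relied on.

```sql
-- Sketch only: aggregate both counts once per group, then join.
with plan_stats as (
    select s.region_code,
           a.course_type,
           to_char(b.date_make, 'yyyy-mm-dd') as date_make,
           to_char(b.date_plan, 'yyyy-mm-dd') as date_plan,
           to_char(b.date_end,  'yyyy-mm-dd') as date_end,
           -- numerator of the original ratio: plan_status = '1'
           sum(case when b.plan_status = '1'  then 1 else 0 end) as finished_cnt,
           -- denominator of the original ratio: plan_status != '2'
           sum(case when b.plan_status != '2' then 1 else 0 end) as total_cnt
      from sms_task_content_info a
      join tsk_plan_info b on a.course_id = b.content_id
      join smu_info      s on b.plan_maker = s.user_id
     group by s.region_code,
              a.course_type,
              to_char(b.date_make, 'yyyy-mm-dd'),
              to_char(b.date_plan, 'yyyy-mm-dd'),
              to_char(b.date_end,  'yyyy-mm-dd')
)
select distinct d.description as "RANGE",
       b.description as "COURSE_CLASSIFICATION_DESC",
       to_char(c.date_make, 'yyyy-mm-dd') as "DATE_MAKE",
       to_char(c.date_end,  'yyyy-mm-dd') as "DATE_END",
       to_char(c.date_plan, 'yyyy-mm-dd') as "DATE_PLAN",
       cast(case
              when coalesce(p.total_cnt, 0) != 0
              then cast(100 as numeric(5, 2)) * p.finished_cnt / p.total_cnt
              else 0
            end as numeric(5, 2)) || '%' as "FINISH_RATIO",
       d.region_code,
       b.course_type
  from sms_task_content_info a
  join tsk_type_tbl  b on a.course_type = b.course_type
  join tsk_plan_info c on a.course_id = c.content_id
  join smu_info      e on c.plan_maker = e.user_id
  join sm_code_tbl   d on e.region_code = d.region_code
  -- left join keeps rows with no matching group (for example a NULL date),
  -- which the original reports as 0%
  left join plan_stats p
         on p.region_code = e.region_code
        and p.course_type = a.course_type
        and p.date_make = to_char(c.date_make, 'yyyy-mm-dd')
        and p.date_plan = to_char(c.date_plan, 'yyyy-mm-dd')
        and p.date_end  = to_char(c.date_end,  'yyyy-mm-dd')
 where d.is_valid = 'Y'
   and c.date_plan >= to_date('2016-12-01', 'yyyy-mm-dd')
   and c.date_plan <  to_date('2016-12-31', 'yyyy-mm-dd') + 1
 order by d.region_code, b.course_type;
```

The three tables behind the subqueries are then read and aggregated once, instead of being probed again for every one of the 7440 output rows. Running the sketch under explain (analyze, buffers, timing) would show whether it actually helps on the real data.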
 
 
The full original statement:
select distinct d.description as "RANGE",
                b.description as "COURSE_CLASSIFICATION_DESC",
                to_char(c.date_make, 'yyyy-mm-dd') as "DATE_MAKE",
                to_char(c.date_end, 'yyyy-mm-dd') as "DATE_END",
                to_char(c.date_plan, 'yyyy-mm-dd') as "DATE_PLAN",
                (select cast((case
                               when (select count(1)
                                       from sms_task_content_info a2,
                                            tsk_plan_info    b2,
                                            smu_info         s2
                                      where a2.course_id = b2.content_id
                                        and b2.plan_maker = s2.user_id
                                        and b2.plan_status != '2'
                                        and s2.region_code = e.region_code
                                        and a2.course_type = a.course_type
                                        and to_char(b2.date_make, 'yyyy-mm-dd') =
                                            to_char(c.date_make, 'yyyy-mm-dd')
                                        and to_char(b2.date_plan, 'yyyy-mm-dd') =
                                            to_char(c.date_plan, 'yyyy-mm-dd')
                                        and to_char(b2.date_end, 'yyyy-mm-dd') =
                                            to_char(c.date_end, 'yyyy-mm-dd')) != 0 then
                                (cast(100 AS numeric(5, 2)) *
                                (select count(1)
                                    from sms_task_content_info a1,
                                         tsk_plan_info    b1,
                                         smu_info         s1
                                   where a1.course_id = b1.content_id
                                     and b1.plan_maker = s1.user_id
                                     and b1.plan_status = '1'
                                     and s1.region_code = e.region_code
                                     and a1.course_type = a.course_type
                                     and to_char(b1.date_make, 'yyyy-mm-dd') =
                                         to_char(c.date_make, 'yyyy-mm-dd')
                                     and to_char(b1.date_plan, 'yyyy-mm-dd') =
                                         to_char(c.date_plan, 'yyyy-mm-dd')
                                     and to_char(b1.date_end, 'yyyy-mm-dd') =
                                         to_char(c.date_end, 'yyyy-mm-dd')) /
                                (select count(1)
                                    from sms_task_content_info a2,
                                         tsk_plan_info    b2,
                                         smu_info         s2
                                   where a2.course_id = b2.content_id
                                     and b2.plan_maker = s2.user_id
                                     and b2.plan_status != '2'
                                     and s2.region_code = e.region_code
                                     and a2.course_type = a.course_type
                                     and to_char(b2.date_make, 'yyyy-mm-dd') =
                                         to_char(c.date_make, 'yyyy-mm-dd')
                                     and to_char(b2.date_plan, 'yyyy-mm-dd') =
                                         to_char(c.date_plan, 'yyyy-mm-dd')
                                     and to_char(b2.date_end, 'yyyy-mm-dd') =
                                         to_char(c.date_end, 'yyyy-mm-dd')))
                               else
                                '0'
                             end) AS numeric(5, 2)) || '%'
                   from dual) as "FINISH_RATIO",
                d.region_code,
                b.course_type
  from sms_task_content_info    a,
       tsk_type_tbl b,
       tsk_plan_info       c,
       sm_code_tbl      d,
       smu_info            e
where a.course_type = b.course_type
   and a.course_id = c.content_id
   and c.plan_maker = e.user_id
   and e.region_code = d.region_code
   and d.is_valid = 'Y'
  -- and e.region_code in()
  -- and a.course_type = '1'
   and c.date_plan >= to_date('2016-12-01', 'yyyy-mm-dd')
   and c.date_plan < to_date('2016-12-31', 'yyyy-mm-dd') + 1
order by d.region_code, b.course_type;
Source: utcz.com/z/534354.html