I'm using NOT IN, but it is slow

`crm` example:

+----+--------+---------------------+--------------------+
| id | name   | date                | status             |
+----+--------+---------------------+--------------------+
|  1 | john   | 2017-12-27 10:58:10 | A status           |
|  2 | steve  | 2017-12-27 10:58:08 | A status           |
|  3 | eric   | 2017-12-27 10:58:04 | Delivery Arranged  |
|  4 | phil   | 2017-12-27 10:57:55 | A status           |
|  5 | bob    | 2017-12-27 10:57:52 | A status           |
|  6 | foo    | 2017-12-27 10:57:50 | A status           |
|  7 | steven | 2017-12-27 10:57:48 | Delivery Arranged  |
|  8 | paul   | 2017-12-27 10:57:43 | A status           |
|  9 | alex   | 2017-12-27 10:57:31 | Delivery Arranged  |
+----+--------+---------------------+--------------------+

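The goal of my query is to return the number of `crm` rows whose `status` is Delivery Arranged and whose `date` falls between 2017-12-01 and 2018-01-01.

So, here is my main query: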

SET @from='2017-12-01';
SET @to='2018-01-01';

SELECT
    COUNT(*) AS `delivery_arranged`
FROM
    `crm` a
WHERE
    a.`status` = 'Delivery Arranged'
    AND DATE(a.`date`) BETWEEN @from AND @to

Result:

+-------------------+
| delivery_arranged |
+-------------------+
|                30 |
+-------------------+

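All good. But I want to exclude rows that had already been set to Delivery Arranged at some earlier or later point (that is, outside this date range). I have a `statuslog` table that I can use for this:

`statuslog` example: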

+--------+-------+---------------------+-----------+---------------------+
| id     | crmid | date                | user      | status              |
+--------+-------+---------------------+-----------+---------------------+
| 818572 |     1 | 2017-12-27 10:58:10 | johnsmith | Some status change  |
| 818571 |     2 | 2017-12-27 10:58:08 | johnsmith | Some status change  |
| 818570 |     3 | 2017-12-27 10:58:04 | another   | Delivery Arranged   |
| 818569 |     4 | 2017-12-27 10:57:55 | another   | Delivery Arranged   |
| 818568 |     5 | 2017-12-27 10:57:52 | johnsmith | Some status change  |
| 818567 |     6 | 2017-12-27 10:57:50 | another   | Some status change  |
| 818566 |     7 | 2017-12-27 10:57:48 | johnsmith | Delivery Arranged   |
| 818565 |     8 | 2017-12-27 10:57:43 | another   | Some status change  |
| 818564 |     9 | 2017-12-27 10:57:31 | johnsmith | Some status change  |
+--------+-------+---------------------+-----------+---------------------+

So with this table, I can get the rows from `statuslog` that fall outside the date range and then do a NOT IN:

SELECT
    COUNT(*) AS `delivery_arranged`
FROM
    `crm` a
WHERE
    a.`status` = 'Delivery Arranged'
    AND DATE(a.`date`) BETWEEN @from AND @to
    AND a.`id` NOT IN (
        SELECT
            a.crmid AS `crmid`
        FROM
            statuslog a
        WHERE
            a.status = 'Delivery Arranged'
            AND DATE(a.`date`) NOT BETWEEN @from AND @to
        GROUP BY a.crmid
        ORDER BY a.`date` DESC
    )

This works, but depending on the size of the date range it can take a very long time! `statuslog` has more than 2,000,000 rows.

How can I make this query faster?

Answer:

A LEFT JOIN may perform better than the NOT IN subquery:

SELECT
    COUNT(*) AS `delivery_arranged`
FROM
    `crm` a
LEFT OUTER JOIN
    (
        SELECT
            a.crmid AS `crmid`
        FROM
            statuslog a
        WHERE
            a.status = 'Delivery Arranged'
            AND DATE(a.`date`) NOT BETWEEN @from AND @to
        GROUP BY a.crmid
        -- ORDER BY a.`date` DESC  -- ordering makes no sense here
    ) b
    ON a.`id` = b.crmid
WHERE
    b.crmid IS NULL  -- NOT IN translated to LEFT JOIN
    AND a.`status` = 'Delivery Arranged'
    AND DATE(a.`date`) BETWEEN @from AND @to

Also, remember to use the right indexes.
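One way to check whether indexes are actually being used is to run the subquery on its own under `EXPLAIN`. This is only a sketch that reuses the `@from` and `@to` variables from the question; the output depends on your schema and MySQL version.

```sql
SET @from = '2017-12-01';
SET @to   = '2018-01-01';

-- Inspect the plan for the statuslog subquery by itself.
-- In the output, look at the `key` column (which index was chosen, if any)
-- and the `rows` column (the estimated number of rows examined).
EXPLAIN
SELECT crmid
FROM statuslog
WHERE status = 'Delivery Arranged'
  AND DATE(`date`) NOT BETWEEN @from AND @to
GROUP BY crmid;
```

If `key` is NULL and `rows` is close to the full table size, the query is scanning all of `statuslog`.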

Answer:

This is usually faster if you use LEFT JOIN/WHERE:

SELECT COUNT(*) AS delivery_arranged
FROM crm c LEFT JOIN
     statuslog sl
     ON sl.crmid = c.id AND
        sl.status = 'Delivery Arranged' AND
        -- match log entries outside the range, as the NOT BETWEEN in the question does
        (sl.date < @from OR
         sl.date >= @to + INTERVAL 1 DAY)
WHERE c.status = 'Delivery Arranged' AND
      c.date >= @from AND
      c.date < @to + INTERVAL 1 DAY AND
      sl.crmid IS NULL;

For this version, you want indexes on crm(status, date, id) and statuslog(crmid, status, date).
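As a sketch, those two indexes could be created as follows. The index names are made up for illustration, and the statements assume the columns are of types that can be indexed in full (a long text `status` column might need a prefix length instead).

```sql
-- Hypothetical index names; column order follows the suggestion above.
CREATE INDEX idx_crm_status_date_id
    ON crm (status, `date`, id);

CREATE INDEX idx_statuslog_crmid_status_date
    ON statuslog (crmid, status, `date`);
```

The column order matters: the equality columns (`status`, or `crmid` and `status`) come first, and the range column `date` comes after them.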

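Note that this changes the date comparisons to avoid calling functions on the columns. That makes it more feasible to use an index that includes the `date` column.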
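To illustrate the difference, here are the two forms of the same date filter side by side, using the `@from` and `@to` variables from the question. This is a sketch; it assumes `date` is a DATETIME column, as the sample data suggests.

```sql
SET @from = '2017-12-01';
SET @to   = '2018-01-01';

-- Wraps the column in DATE(): the function has to be evaluated per row,
-- so an ordinary index on `date` generally cannot be used for a range scan.
SELECT COUNT(*)
FROM crm
WHERE DATE(`date`) BETWEEN @from AND @to;

-- Same rows, but the column is left bare: a half-open range from the start
-- of @from up to (not including) the day after @to.
SELECT COUNT(*)
FROM crm
WHERE `date` >= @from
  AND `date` <  @to + INTERVAL 1 DAY;
```

The upper bound uses `<` with one extra day rather than `<= @to`, so that timestamps later in the day on 2018-01-01 are still included, matching what `DATE(...) BETWEEN` returns.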

Source: utcz.com/qa/264890.html
