I'm using NOT IN, but it is slow
`crm` table example:
+----+--------+---------------------+-------------------+
| id | name   | date                | status            |
+----+--------+---------------------+-------------------+
|  1 | john   | 2017-12-27 10:58:10 | A status          |
|  2 | steve  | 2017-12-27 10:58:08 | A status          |
|  3 | eric   | 2017-12-27 10:58:04 | Delivery Arranged |
|  4 | phil   | 2017-12-27 10:57:55 | A status          |
|  5 | bob    | 2017-12-27 10:57:52 | A status          |
|  6 | foo    | 2017-12-27 10:57:50 | A status          |
|  7 | steven | 2017-12-27 10:57:48 | Delivery Arranged |
|  8 | paul   | 2017-12-27 10:57:43 | A status          |
|  9 | alex   | 2017-12-27 10:57:31 | Delivery Arranged |
+----+--------+---------------------+-------------------+
The goal of my query is to return the number of `crm` rows whose `status` is Delivery Arranged and whose `date` falls between 2017-12-01 and 2018-01-01.
So here is my main query:
SET @from = '2017-12-01';
SET @to = '2018-01-01';

SELECT
    COUNT(*) AS `delivery_arranged`
FROM
    `crm` a
WHERE
    a.`status` = 'Delivery Arranged'
    AND DATE(a.`date`) BETWEEN @from AND @to;
Result:
+-------------------+
| delivery_arranged |
+-------------------+
|                30 |
+-------------------+
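All good. But I want to leave out the rows that had already been set to Delivery Arranged at some earlier point (that is, outside this date range). I have a `statuslog` table that I can use for this: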
`statuslog` table example:
+--------+-------+---------------------+-----------+--------------------+
| id     | crmid | date                | user      | status             |
+--------+-------+---------------------+-----------+--------------------+
| 818572 |     1 | 2017-12-27 10:58:10 | johnsmith | Some status change |
| 818571 |     2 | 2017-12-27 10:58:08 | johnsmith | Some status change |
| 818570 |     3 | 2017-12-27 10:58:04 | another   | Delivery Arranged  |
| 818569 |     4 | 2017-12-27 10:57:55 | another   | Delivery Arranged  |
| 818568 |     5 | 2017-12-27 10:57:52 | johnsmith | Some status change |
| 818567 |     6 | 2017-12-27 10:57:50 | another   | Some status change |
| 818566 |     7 | 2017-12-27 10:57:48 | johnsmith | Delivery Arranged  |
| 818565 |     8 | 2017-12-27 10:57:43 | another   | Some status change |
| 818564 |     9 | 2017-12-27 10:57:31 | johnsmith | Some status change |
+--------+-------+---------------------+-----------+--------------------+
So with this table I can fetch the `statuslog` rows that fall outside the date range and then filter with NOT IN:
SELECT COUNT(*) AS `delivery_arranged`
FROM
    `crm` a
WHERE
    a.`status` = 'Delivery Arranged'
    AND DATE(a.`date`) BETWEEN @from AND @to
    AND a.`id`
    NOT IN (
        SELECT
            a.crmid AS `crmid`
        FROM
            statuslog a
        WHERE
            a.status = 'Delivery Arranged'
            AND DATE(a.`date`) NOT BETWEEN @from AND @to
        GROUP BY a.crmid
        ORDER BY a.`date` DESC
    )
This works, but depending on the size of the date range it can take a very long time! `statuslog` has more than 2,000,000 rows.
How can I make this query faster?
Answer:
A LEFT JOIN may perform better than the subquery:
SELECT COUNT(*) AS `delivery_arranged`
FROM
    `crm` a
LEFT OUTER JOIN
    (
        SELECT
            a.crmid AS `crmid`
        FROM
            statuslog a
        WHERE
            a.status = 'Delivery Arranged'
            AND DATE(a.`date`) NOT BETWEEN @from AND @to
        GROUP BY a.crmid
        -- ORDER BY a.`date` DESC  -- <-- this makes no sense here
    ) b
    ON a.`id` = b.crmid
WHERE
    b.crmid IS NULL AND  -- <-- NOT IN translated to a LEFT JOIN
    a.`status` = 'Delivery Arranged'
    AND DATE(a.`date`) BETWEEN @from AND @to
Also, remember to use the right indexes.
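To check that the LEFT JOIN ... IS NULL rewrite counts the same rows as NOT IN, here is a minimal sketch. It rests on several assumptions: it runs on SQLite through Python's `sqlite3` as a stand-in for MySQL, the handful of rows (including the earlier 2017-11-02 log entry for crm id 7) are made up for illustration, and the `:start` / `:end` parameters replace the `@from` / `@to` variables.

```python
import sqlite3

# Hypothetical miniature data set; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crm (id INTEGER PRIMARY KEY, name TEXT, date TEXT, status TEXT);
CREATE TABLE statuslog (id INTEGER PRIMARY KEY, crmid INTEGER, date TEXT,
                        user TEXT, status TEXT);
INSERT INTO crm VALUES
  (3, 'eric',   '2017-12-27 10:58:04', 'Delivery Arranged'),
  (4, 'phil',   '2017-12-27 10:57:55', 'A status'),
  (7, 'steven', '2017-12-27 10:57:48', 'Delivery Arranged'),
  (9, 'alex',   '2017-12-27 10:57:31', 'Delivery Arranged');
INSERT INTO statuslog VALUES
  (1, 3, '2017-12-27 10:58:04', 'another',   'Delivery Arranged'),
  (2, 7, '2017-12-27 10:57:48', 'johnsmith', 'Delivery Arranged'),
  (3, 7, '2017-11-02 09:00:00', 'johnsmith', 'Delivery Arranged'),
  (4, 9, '2017-12-27 10:57:31', 'johnsmith', 'Delivery Arranged'),
  (5, 9, '2017-10-15 14:30:00', 'another',   'Some status change');
""")

params = {"start": "2017-12-01", "end": "2018-01-01"}

# Original formulation: exclude ids returned by the subquery.
not_in_sql = """
SELECT COUNT(*) FROM crm c
WHERE c.status = 'Delivery Arranged'
  AND DATE(c.date) BETWEEN :start AND :end
  AND c.id NOT IN (
    SELECT s.crmid FROM statuslog s
    WHERE s.status = 'Delivery Arranged'
      AND DATE(s.date) NOT BETWEEN :start AND :end
  )
"""

# Anti-join formulation: keep only crm rows with no matching log row.
left_join_sql = """
SELECT COUNT(*) FROM crm c
LEFT JOIN (
  SELECT DISTINCT s.crmid FROM statuslog s
  WHERE s.status = 'Delivery Arranged'
    AND DATE(s.date) NOT BETWEEN :start AND :end
) b ON b.crmid = c.id
WHERE b.crmid IS NULL
  AND c.status = 'Delivery Arranged'
  AND DATE(c.date) BETWEEN :start AND :end
"""

not_in_count = conn.execute(not_in_sql, params).fetchone()[0]
left_join_count = conn.execute(left_join_sql, params).fetchone()[0]
print(not_in_count, left_join_count)
```

In this made-up data only crm id 7 has a Delivery Arranged log entry outside the range, so both queries count ids 3 and 9. One difference to keep in mind: if the subquery could ever return a NULL `crmid`, NOT IN would match no rows at all, while the anti-join is unaffected.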
Answer:
This is usually faster if you use LEFT JOIN / WHERE ... IS NULL:
SELECT COUNT(*) AS delivery_arranged
FROM crm c
LEFT JOIN statuslog sl
    ON sl.crmid = c.id AND
       sl.status = 'Delivery Arranged' AND
       -- match only log entries outside the date range, as the NOT IN subquery does
       (sl.date < @from OR sl.date >= @to + INTERVAL 1 DAY)
WHERE c.status = 'Delivery Arranged' AND
      c.date >= @from AND
      c.date < @to + INTERVAL 1 DAY AND
      sl.crmid IS NULL;
For this version, you want indexes on crm(status, date, id) and statuslog(crmid, status, date).
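Note that this changes the date comparisons to avoid calling a function on the column. That makes it more feasible to use an index that includes the `date` column.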
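The rewritten date comparison can be checked on boundary values. The sketch below is an illustration under assumptions, not the answerer's code: it uses SQLite through Python's `sqlite3` in place of MySQL, so `DATETIME(:end, '+1 day')` stands in for `@to + INTERVAL 1 DAY`, and the rows and the index name `idx_crm_status_date` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crm (id INTEGER PRIMARY KEY, name TEXT, date TEXT, status TEXT);
-- Hypothetical index name; columns follow the crm(status, date, id) suggestion.
CREATE INDEX idx_crm_status_date ON crm (status, date, id);
INSERT INTO crm VALUES
  (1, 'a', '2017-11-30 23:59:59', 'Delivery Arranged'),  -- just before the range
  (2, 'b', '2017-12-01 00:00:00', 'Delivery Arranged'),  -- first instant in range
  (3, 'c', '2018-01-01 23:59:59', 'Delivery Arranged'),  -- last second in range
  (4, 'd', '2018-01-02 00:00:00', 'Delivery Arranged'),  -- just after the range
  (5, 'e', '2017-12-15 12:00:00', 'A status');           -- wrong status
""")

params = {"start": "2017-12-01", "end": "2018-01-01"}

# Function applied to the column: the date part of an index cannot narrow the scan.
function_sql = """
SELECT COUNT(*) FROM crm
WHERE status = 'Delivery Arranged'
  AND DATE(date) BETWEEN :start AND :end
"""

# Half-open range on the bare column: same rows, index-friendly.
range_sql = """
SELECT COUNT(*) FROM crm
WHERE status = 'Delivery Arranged'
  AND date >= :start
  AND date < DATETIME(:end, '+1 day')
"""

function_count = conn.execute(function_sql, params).fetchone()[0]
range_count = conn.execute(range_sql, params).fetchone()[0]
print(function_count, range_count)

# The plan text differs between SQLite versions, so it is printed, not asserted.
for row in conn.execute("EXPLAIN QUERY PLAN " + range_sql, params):
    print(row)
```

Both forms count rows 2 and 3 here: the whole of 2018-01-01 is included and 2018-01-02 00:00:00 is excluded, which is what `DATE(date) BETWEEN ...` does as well.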
Source: utcz.com/qa/264890.html