JDBC内幕subquery
subquery
子查询指一个查询语句嵌套在另一个查询语句内部的查询,在 SELECT 子句中先计算子查询,子查询结果作为外层另一个查询的过滤条件,查询可以基于一个表或者多个表,子查询中常用的操作符有 ANY(SOME)、ALL、IN 和 EXISTS。常见形式如下
数据中间件一般处理subquery思路是拆分SQL语句,先查询内部语句结果作为外部条件继续查询,整体流程类似下图
另外一种处理方式就是确定subquery内部查询与外部查询是否在同一个分片(db)上,如果可以确定这层关系,则可以将SQL直接发送到某个分片上,一般在非代理数据中间件实现中,可以考虑这种方式规避拆分subquery,尤其在oracle嵌套分页查询时需要面临这个问题。
sharding subquery
了解subquery后,我们来看看sharding-JDBC是怎么处理subquery的,希望读者对原理有了解后,能正确使用子查询,首先我们来验证上边这个场景,SQL如下
select * from t_order where order_id in (select order_id from t_order_item where order_name ='zhangsan')
执行结果有点出乎意料
java.lang.IllegalStateException: Must have sharding column with subquery. at com.google.common.base.Preconditions.checkState(Preconditions.java:173)
at org.apache.shardingsphere.core.route.router.sharding.ParsingSQLRouter.checkSubqueryShardingValues(ParsingSQLRouter.java:111)
at org.apache.shardingsphere.core.route.router.sharding.ParsingSQLRouter.route(ParsingSQLRouter.java:75)
at org.apache.shardingsphere.core.route.StatementRoutingEngine.route(StatementRoutingEngine.java:56)
提示我们必须要分片键,我们可以把t_order_item表分片键改为order_name
t_order_item: actualDataNodes:ds_0.t_order_item
tableStrategy:
inline:
shardingColumn:order_name
algorithmExpression:t_order_item
databaseStrategy:
inline:
shardingColumn:order_name
algorithmExpression:ds_0
执行结果还是同样的错误,将SQL改写如下
select * from t_order where order_id in (select order_id from t_order_item where order_id =1001)
执行正常,不猜了,看下源码是怎么处理的,这块和路由相关,先看下路由模块代码
public SQLRouteResult route(final String logicSQL, final List<Object> parameters, final SQLStatement sqlStatement){//此处OptimizedStatement属于ShardingSelectOptimizedStatement,主要是对SQL中的关注的部分从语法树中提取,比如order by , 聚合函数等
OptimizedStatement optimizedStatement = ShardingOptimizeEngineFactory.newInstance(sqlStatement).optimize(shardingRule, shardingMetaData.getTable(), logicSQL, parameters, sqlStatement);
boolean needMergeShardingValues = isNeedMergeShardingValues(optimizedStatement);
if (optimizedStatement instanceof ShardingConditionOptimizedStatement && needMergeShardingValues) {
checkSubqueryShardingValues(optimizedStatement, ((ShardingConditionOptimizedStatement) optimizedStatement).getShardingConditions());
mergeShardingConditions(((ShardingConditionOptimizedStatement) optimizedStatement).getShardingConditions());
}
RoutingResult routingResult = RoutingEngineFactory.newInstance(shardingRule, shardingMetaData.getDataSource(), optimizedStatement).route();
if (needMergeShardingValues) {
Preconditions.checkState(1 == routingResult.getRoutingUnits().size(), "Must have one sharding with subquery.");
}
if (optimizedStatement instanceof ShardingInsertOptimizedStatement) {
setGeneratedValues((ShardingInsertOptimizedStatement) optimizedStatement);
}
SQLRouteResult result = new SQLRouteResult(optimizedStatement);
result.setRoutingResult(routingResult);
return result;
}
optimize方法
public ShardingSelectOptimizedStatement optimize(final ShardingRule shardingRule,final ShardingTableMetaData shardingTableMetaData, final String sql, final List<Object> parameters, final SelectStatement sqlStatement)
{ WhereClauseShardingConditionEngine shardingConditionEngine = new WhereClauseShardingConditionEngine(shardingRule, shardingTableMetaData);
WhereClauseEncryptConditionEngine encryptConditionEngine = new WhereClauseEncryptConditionEngine(shardingRule.getEncryptRule(), shardingTableMetaData);
GroupByEngine groupByEngine = new GroupByEngine();
OrderByEngine orderByEngine = new OrderByEngine();
SelectItemsEngine selectItemsEngine = new SelectItemsEngine(shardingTableMetaData);
PaginationEngine paginationEngine = new PaginationEngine();
//核心点,获取条件,包括where条件和subquery条件
List<ShardingCondition> shardingConditions = shardingConditionEngine.createShardingConditions(sqlStatement, parameters);
List<EncryptCondition> encryptConditions = encryptConditionEngine.createEncryptConditions(sqlStatement);
GroupBy groupBy = groupByEngine.createGroupBy(sqlStatement);
OrderBy orderBy = orderByEngine.createOrderBy(sqlStatement, groupBy);
SelectItems selectItems = selectItemsEngine.createSelectItems(sql, sqlStatement, groupBy, orderBy);
Pagination pagination = paginationEngine.createPagination(sqlStatement, selectItems, parameters);
ShardingSelectOptimizedStatement result = new ShardingSelectOptimizedStatement(sqlStatement, shardingConditions, encryptConditions, groupBy, orderBy, selectItems, pagination);
setContainsSubquery(sqlStatement, result);
return result;
}
createShardingConditions方法
public List<ShardingCondition> createShardingConditions(final SQLStatement sqlStatement, final List<Object> parameters){if (!(sqlStatement instanceof WhereSegmentAvailable)) {
return Collections.emptyList();
}
List<ShardingCondition> result = new ArrayList<>();
//获取where条件
Optional<WhereSegment> whereSegment = ((WhereSegmentAvailable) sqlStatement).getWhere();
//关键点,查询中涉及到的所有表
Tables tables = new Tables(sqlStatement);
if (whereSegment.isPresent()) {
result.addAll(createShardingConditions(tables, whereSegment.get().getAndPredicates(), parameters));
}
//获取subquery条件
Collection<SubqueryPredicateSegment> subqueryPredicateSegments = sqlStatement.findSQLSegments(SubqueryPredicateSegment.class);
for (SubqueryPredicateSegment each : subqueryPredicateSegments) {
//获取具体的subquery查询条件
Collection<ShardingCondition> subqueryShardingConditions = createShardingConditions(tables, each.getAndPredicates(), parameters);
if (!result.containsAll(subqueryShardingConditions)) {
result.addAll(subqueryShardingConditions);
}
}
return result;
}
我们就用上边SQL看下Tables tables涉及的表
从解析结果中可以知道并未对子查询内部表名提取,则暗含着内部与外部是同一个表,验证我们猜测对不对
createShardingConditions方法获取subquery条件
private Collection<ShardingCondition> createShardingConditions(final Tables tables, final Collection<AndPredicate> andPredicates, final List<Object> parameters){ Collection<ShardingCondition> result = new LinkedList<>();
for (AndPredicate each : andPredicates) {
//从子查询条件中获取路由单元,即column-shardingValue数据结构
Map<Column, Collection<RouteValue>> routeValueMap = createRouteValueMap(tables, each, parameters);
if (routeValueMap.isEmpty()) {
return Collections.emptyList();
}
result.add(createShardingCondition(routeValueMap));
}
return result;
}
createRouteValueMap方法
private Map<Column, Collection<RouteValue>> createRouteValueMap(final Tables tables, final AndPredicate andPredicate, final List<Object> parameters) { Map<Column, Collection<RouteValue>> result = new HashMap<>();
for (PredicateSegment each : andPredicate.getPredicates()) {
Optional<String> tableName = tables.findTableName(each.getColumn(), shardingTableMetaData);
//判断当前条件中的column是不是分片键,上一步我们知道tables中只有t_order一个表
//由于当前column是 order_name 并不是t_order表的分片键,所以此处并未构造出RouteValue,
//所有就看到第一开始的异常信息 Must have sharding column with subquery.
if (!tableName.isPresent() || !shardingRule.isShardingColumn(each.getColumn().getName(), tableName.get())) {
continue;
}
Column column = new Column(each.getColumn().getName(), tableName.get());
Optional<RouteValue> routeValue = ConditionValueGeneratorFactory.generate(each.getRightValue(), column, parameters);
if (!routeValue.isPresent()) {
continue;
}
if (!result.containsKey(column)) {
result.put(column, new LinkedList<RouteValue>());
}
result.get(column).add(routeValue.get());
}
return result;
}
到这块基本处理逻辑应该了解的差不多,subquery应该是处理自生表的嵌套子查询即一个表的嵌套子查询,那我们可以推断如下的SQL应该也是可以正常执行的
select * from t_order where order_id in (select order_name from t_order_item where order_id = 1001)
直接结果和推断的一样,SQL直接发往其中一个db上,这明显不正确,因为order_name 和order_id 没有任何关系,并且t_order_item的分片键并不是order_id,分析到这基本结束。
the end
由上可见sharding-JDBC对子查询支持并没有我们想象的那么强大,希望读者了解原理后可以正确的使用sharding subquery,我觉得对于subquery处理还是有不足的地方,比如类似SQL select * from t_order where order_id in (select order_name from t_order_item where order_id = 1001)
涉及多表子查询应该抛异常获取提示信息,不能直接根据 subquery条件order_id = 1001
与外部表的 路由算法直接路由,因为可能存在两张表毫无关系的情况。希望读者清楚sharding-JDBC处理对subquery的处理逻辑,正确的使用subquery
以上是 JDBC内幕subquery 的全部内容, 来源链接: utcz.com/a/19568.html