目录
一、知识点
二、测试验证
三、引申
left join中关于where和on条件的知识点:
例如下面的sql:关联表(右表)的筛选条件如果放置在on后面,则下面sql的执行顺序是:B先按照条件进行过滤,再与A表关联;
#主表:A; 关联表:B 关联条件:A.id = B.id 筛选条件:B.id >1 A left join B on A.id = B.id and B.id >1;
筛选条件如果放置在where后面,则下面sql的执行顺序是:A和B先关联,基于生成的临时表再进行where条件过滤。
#主表:A; 关联表:B 关联条件:A.id = B.id 过滤条件 A left join B on A.id = B.id where A.id <> 1
创建t1 ,t2 两张表
create table if not exists test1( id int comment '用户id',name string comment '用户姓名'); insert overwrite table test1 values (1,'a1'), (2,'a2'), (3,'a3'), (4,'a4'); create table if not exists test2( id int comment '用户id',province string comment '用户所在省份'); insert overwrite table test1 values (1,'hainan'), (2,'beijing'), (5,'sichuan'), (6,'chongqing');
测试1:返回左表所有行,右表符合on条件可匹配上,不满足条件的直接补null
select test1.id, test1.name, test2.id, test2.province from test1 left join test2 on test1.id = test2.id;
测试2:on后面增加对右表的限制条件;结论:左表记录全部返回,右表筛选条件生效
select test1.id, test1.name, test2.id, test2.province from test1 left join test2 on test1.id = test2.id and test2.province = 'beijing';
测试3:where后面增加对右表的限制条件:where test2.province = 'beijing';
结论:where后面是对最后的临时表进行记录筛选,行数可能会减少
select test1.id, test1.name, test2.id, test2.province from test1 left join test2 on test1.id = test2.id where test2.province = 'beijing';
测试4:on中增加对左表的限制条件,不影响返回的行数,只影响右表的匹配内容。
select test1.id, test1.name, test2.id, test2.province from test1 left join test2 on test1.id = test2.id and test1.name = 'a1';
测试5:where后面增加对左表的限制条件:where test1.name ='a4'
结论:where条件是在最后临时表的基础上进行筛选,返回满足条件的行
select test1.id, test1.name, test2.id, test2.province from test1 left join test2 on test1.id = test2.id where test1.name = 'a4';
由上述的where和on之间的区别,引申出来的面试题: t1表和t2表的字段如图,计算以下两个sql的值。
--1.输出结果 select t1.id,t2.id from t1 left join t2 on t1.id = t2.id and t2.id <>2; --2.输出结果 select t1.id,t2.id from t1 left join t2 on t1.id = t2.id where t2.id <>2;
解析:
(1)on后面:跟着的是对右表的限制条件 t2.id <>2;
结论:左表记录全部返回,右表筛选条件生效
with t1 as ( select 1 as id union all select 2 as id union all select 3 as id ), t2 as ( select 1 as id union all select 2 as id union all select 2 as id ) select t1.id,t2.id from t1 left join t2 on t1.id = t2.id and t2.id <>2;
输出结果为:
(2)where后面:跟着的是对右表的限制条件 where t2.id <>2;
结论:最后临时表的基础上进行筛选,最终返回符合where过滤条件的行;
ps: sql中比较的结果一般有:true, false, null; 而where条件只会过滤出true的结果。
with t1 as ( select 1 as id union all select 2 as id union all select 3 as id ), t2 as ( select 1 as id union all select 2 as id union all select 2 as id ) select t1.id,t2.id from t1 left join t2 on t1.id = t2.id where t2.id <>2;
输出结果为:
补充:
select (1>2) --> FALSE select (1<2) --> true select (1null --所以,sql比较的结果会有三种:false, true, null