首先要弄清楚 count() 的函数的语义。count() 是一个聚合函数,对于返回的结果集,一行行地进行判断,如果 count 函数的参数不是 NULL,累计值就加 1,否则不加。最后返回累计值。所以,count(*)、count(1)和count(主键) 都表示返回满足条件的结果集的总行数;而 count(字段),则表示返回满足条件的数据行里面字段不为 NULL 的总个数。
1.测试演示
以表fj_news为例,一共加载了15819行的数据,其中允许为null的字段title中存在3个null值。
create table fj_news
(
id serial not null ,
title varchar(100)
default '',
seo_title varchar(50) not null ,
zone_id integer
default 0 not null ,
primary key (id)
) ;
count(可空字段)
扫描全表,判断字段可空,扫描该字段所有值,判断字段是否为空,不为空则累加。
> select count(title) from fj_news;
(count)
15816
1 row(s) retrieved.
查看其执行计划,可以看到对全表做了完整的顺序扫描:
QUERY: (OPTIMIZATION TIMESTAMP: 07-20-2023 14:29:45)
------
select count(title) from fj_news
Estimated Cost: 1093
Estimated # of Rows Returned: 1
1) sinodb.fj_news: SEQUENTIAL SCAN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 fj_news
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 15819 15819 15819 00:00.00 1094
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 15819 00:00.00
count(非空字段)
扫描全表,判断字段不可空,按行累加。
> select count(seo_title) from fj_news;
(count)
15819
1 row(s) retrieved.
查看其执行计划,也是顺序扫描全表:
select count(seo_title) from fj_news
Estimated Cost: 1093
Estimated # of Rows Returned: 1
1) sinodb.fj_news: SEQUENTIAL SCAN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 fj_news
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 15819 15819 15819 00:00.00 1094
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 15819 00:00.00
count(id)
扫描全表,判断字段不可空,按行累加。
> select count(id) from fj_news;
(count)
15819
1 row(s) retrieved.
查看其执行计划,执行的是索引键值扫描:
QUERY: (OPTIMIZATION TIMESTAMP: 07-20-2023 14:30:55)
------
select count(id) from fj_news
Estimated Cost: 579
Estimated # of Rows Returned: 1
1) sinodb.fj_news: INDEX PATH
(1) Index Name: sinodb. 193_117
Index Keys: id (Key-Only) (Serial, fragments: ALL)
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 fj_news
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 15819 15819 15819 00:00.00 580
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 15819 00:00.00
count(1)
扫描全表,但不取值,默认每一行都是1,判断不可能是null,按值累加。
> select count(1) from fj_news;
(count)
15819
1 row(s) retrieved.
查看其执行计划,顺序扫描整表:
QUERY: (OPTIMIZATION TIMESTAMP: 07-20-2023 14:31:24)
------
select count(1) from fj_news
Estimated Cost: 1093
Estimated # of Rows Returned: 1
1) sinodb.fj_news: SEQUENTIAL SCAN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 fj_news
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 15819 15819 15819 00:00.00 1094
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 15819 00:00.00
count(*)
SinoDB在执行count(*)的时候,查看执行计划,可以发现优化器专门做了优化。如果不带where条件,将直接从内部系统表中返回值。有where条件的话会顺序扫描全表,但不取值,按行累加。
> select count(*) from fj_news;
(count(*))
15819
1 row(s) retrieved.
查看其执行计划,不带where条件的话,直接从内部系统表返回结果:
QUERY: (OPTIMIZATION TIMESTAMP: 07-20-2023 14:31:41)
------
select count(*) from fj_news
Estimated Cost: 1
Estimated # of Rows Returned: 1
1) sinodb.fj_news: INDEX PATH
(1) Index Name: (count)
Index Keys: (count)
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 0 00:00.00
如果带where条件,则遵循条件限制:
> select count(*) from fj_news where title like '福州%';
(count(*))
954
1 row(s) retrieved.
>
查看其执行计划,顺序扫描全表:
QUERY: (OPTIMIZATION TIMESTAMP: 07-20-2023 14:32:01)
------
select count(*) from fj_news where title like '福州%'
Estimated Cost: 1093
Estimated # of Rows Returned: 1
1) sinodb.fj_news: SEQUENTIAL SCAN
Filters: sinodb.fj_news.title LIKE '福州%'
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 fj_news
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 15819 790 15819 00:00.00 1094
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 15819 00:00.00
2. 结论
通过对比以上各个查询计划中的Estimated Cost值,我们有以下结论:
count(*) 无限制条件 > count(主键) > count(1) ~= count(非空字段) ~= count(可空字段)