I have a table numbers (n, ..) that I want to get the count of.

I can either select count(*) or select count(1), and they might be different in runtime.

If they are different, one is faster and one is slower, but which one?

On my Mac Pro, 9 years old:

$ psql ..
(
=# select count(1) from numbers; select count(*) from numbers; select count(1) from numbers;
x3
=# select count(*) from numbers; select count(1) from numbers; select count(*) from numbers;
x3
) x2

And we get (in ms)

run count(1) count(*) count(1) count(*)
1 10.793 5.071 6.009  
2 6.290 5.457 5.461  
3 6.324 5.523 5.242  
4   6.027 5.644 4.850
5   5.561 5.252 4.786
6   5.861 5.612 5.145
7 6.358 5.436 5.122  
8 6.311 5.091 5.058  
9 6.404 5.180 5.502  
10   5.567 5.800 4.810
11   5.534 5.219 5.587
12   5.861 5.802 5.494

So

count low mean high
count(*) 4.786 5.380 6.027
count(1) 5.091 6.013 10.793

This is a pretty rough benchmark; we can do much better. I want to run it on larger sizes, so I really need to automate/scrape the runtime.

It looks like count(*) is slightly faster on a 49,999-row count, no conditions.

Going deeper

I wrote some benchmark code, largely based upon this great jOOQ blog post, at the bottom of this post.

I didn’t parameterize the queries, but it’s very much an improvement over doing any of it by hand.

Let’s repeat with a v_repeat of 5 and decrease the outermost 30 iterations down to 3:

=# select bench_ab(5000); select ab, min(runtime), avg(runtime), max(runtime), count(runtime) from runtimes group by ab;
ab min avg max count
count(*) 00:00:01.333068 00:00:01.348974 00:00:01.373866 9
count(1) 00:00:01.327441 00:00:01.350864 00:00:01.373268 9

Dividing the averages,

count(*) gives 1.350864 / 1.348974 = 1.001x latency speedup on 49,999-row table counts (no conditions)

More rows

Instead of 50,000 rows, let’s do 5,000,000.

delete from numbers;
insert into numbers (n, r2, r3, r5, r7)
select n, n%2, n%3, n%5, n%7 from generate_series(3, 1e7, 2) as numbers(n);
select count(*) from numbers;

I had to reduce the innermost repeat down from 5,000 down to 5 to not wait all night:

=# select bench_ab(5); select ab, min(runtime), avg(runtime), max(runtime), count(runtime) from runtimes group by ab;
ab min avg max count
count(*) 00:00:03.631091 00:00:04.401283 00:00:10.299831 9
count(1) 00:00:03.726291 00:00:03.818642 00:00:04.149109 9

Dividing the averages,

count(1) gives 4.401283 / 3.8186421 = 1.153x latency speedup on 4,999,999-now table counts (no conditions)

Once with conditions

Let’s just get the numbers that are fizzbuzz, divisible by both 3 and 5:

select count(*)
from numbers
where r3 = 0 and r5 = 0
ab min avg max count
count(*) 00:00:06.954637 00:00:07.10634 00:00:07.211031 9
count(1) 00:00:06.960223 00:00:07.089915 00:00:07.154958 9

Dividing the averages,

count(1) gives 7.10634 / 7.089915 = 1.002x latency speedup on 333,333-row table counts (with conditions)

Pretty close.

These results looked a little odd, maybe, so I ran it again the next day, and it flipped back to count(*) generally faster; 4,500 runs in total:

ab min avg max count
count(*) 00:02:44.51731 00:02:46.415199 00:02:49.43029 9
count(1) 00:02:47.562048 00:02:50.376417 00:02:54.533021 9

Giving a 170.376417 / 166.415199 = 1.024x latency speedup. I wonder why the benchmark was different yesterday; I might run it once more just to be sure.

Benchmark code

-- delete from numbers;
-- insert into numbers (n, r2, r3, r5, r7)
-- select n, n%2, n%3, n%5, n%7 from generate_series(3, 1e5, 2) as numbers(n);
-- select count(*) from numbers;

--drop function bench_ab
create table if not exists runtimes (ab text, runtime interval);
create or replace function bench_ab(v_repeat int)
returns text
language plpgsql
as $$
declare
  v_ts timestamp;
  v_ts0 timestamp;
  v_ts2 timestamp;
  jmax constant int := 3;
  imax constant int := 3;
  count int;
begin
  delete from runtimes;
  for j in 1..jmax loop
    for i in 1..imax loop
      v_ts0 := clock_timestamp();
      v_ts := clock_timestamp();
      for i in 1..v_repeat loop
        perform count(1)
          from numbers;
      end loop;
      v_ts2 := clock_timestamp();
      insert into runtimes("ab", runtime) values ('count(1)', v_ts2 - v_ts);
      v_ts = v_ts2;
    end loop;
    raise info '%',j;
    for i in 1..imax loop
      v_ts := clock_timestamp();
      for i in 1..v_repeat loop
        perform count(*)
          from numbers;
      end loop;
      v_ts2 := clock_timestamp();
      insert into runtimes("ab", runtime) values ('count(*)', v_ts2 - v_ts);
      v_ts = v_ts2;
    end loop;
  end loop;
  return (select to_char(jmax*imax*v_repeat, '999,999,999')||' runs in total');
end
$$