Inconsistent Unnesting in PostgreSQL
I thought this was a bug earlier today… It’s not. So here goes in case anyone else ends up scratching his head.
PostgreSQL features some set returning function:
# select i from generate_series(1,2) i; i --- 1 2 (2 rows)
Syntactic sugar also allows to use them in select statements:
# select generate_series(1,2) i; i --- 1 2 (2 rows)
The same internals allow to unnest an array:
# select i from unnest(array[1,2]) i; i --- 1 2 (2 rows)
Which also works in a select statement:
# select unnest(array[1,2]) i; i --- 1 2 (2 rows)
Note that placing set returning functions in a select statement are by no means a recommended practice — on the contrary. Nonetheless, the functionality is occasionally convenient when querying the pg_catalog or when working with a questionably designed schema.
Anyway, this also works with multiple generators:
# select unnest(array[1,2]) i, unnest(array[3,4,5]) j order by i, j; i | j ---+--- 1 | 3 1 | 4 1 | 5 2 | 3 2 | 4 2 | 5 (6 rows)
Until it doesn’t…:
# select unnest(array[1,2]) i, unnest(array[3,4]) j order by i, j; i | j ---+--- 1 | 3 2 | 4 (2 rows)
I don’t know about you, but I expected a cartesian product there.
It turns out, the generators stop yielding rows, not when they’ve expanded the entire cartesian product, but rather when the series generators all end at the same time. In other words, when they reach their smallest common multiplier.
As a result, the following statement yields 2 * 2 = 4 rows:
# select unnest(array[1,2,3,4]) i, unnest(array[1,2]) j;
And this one yields 3 * 4 = 12 rows:
# select generate_series(1,6) i, generate_series(1,4) j;
For the expected 24 rows, use:
# select i, j from generate_series(1,6) i, generate_series(1,4) j;
And starting with Postgres 9.3, prefer using lateral queries anywhere you would have considered using a set returning function in a select statement.