1997-11-25 22:07:18 +00:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
*
|
|
|
|
* parse_agg.h
|
2008-12-28 18:54:01 +00:00
|
|
|
* handle aggregates and window functions in parser
|
1997-11-25 22:07:18 +00:00
|
|
|
*
|
2016-01-02 13:33:40 -05:00
|
|
|
* Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
|
2000-01-26 05:58:53 +00:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
1997-11-25 22:07:18 +00:00
|
|
|
*
|
2010-09-20 22:08:53 +02:00
|
|
|
* src/include/parser/parse_agg.h
|
1997-11-25 22:07:18 +00:00
|
|
|
*
|
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
#ifndef PARSE_AGG_H
|
|
|
|
#define PARSE_AGG_H
|
|
|
|
|
1999-07-15 23:04:24 +00:00
|
|
|
#include "parser/parse_node.h"
|
1997-11-25 22:07:18 +00:00
|
|
|
|
2009-12-15 17:57:48 +00:00
|
|
|
extern void transformAggregateCall(ParseState *pstate, Aggref *agg,
|
2010-03-17 16:52:38 +00:00
|
|
|
List *args, List *aggorder,
|
2010-02-26 02:01:40 +00:00
|
|
|
bool agg_distinct);
|
Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.
This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.
The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.
Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage. The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting. The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.
A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.
Needs a catversion bump because stored rules may change.
Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
|
|
|
|
|
|
|
extern Node *transformGroupingFunc(ParseState *pstate, GroupingFunc *g);
|
|
|
|
|
2008-12-28 18:54:01 +00:00
|
|
|
extern void transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
|
2009-06-11 14:49:15 +00:00
|
|
|
WindowDef *windef);
|
2003-06-06 15:04:03 +00:00
|
|
|
|
2003-01-17 03:25:04 +00:00
|
|
|
extern void parseCheckAggregates(ParseState *pstate, Query *qry);
|
2001-10-28 06:26:15 +00:00
|
|
|
|
Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.
This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.
The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.
Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage. The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting. The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.
A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.
Needs a catversion bump because stored rules may change.
Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:40:59 +02:00
|
|
|
extern List *expand_grouping_sets(List *groupingSets, int limit);
|
|
|
|
|
Support ordered-set (WITHIN GROUP) aggregates.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()). We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions. To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c. This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need. There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.
In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates. Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.
Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
2013-12-23 16:11:35 -05:00
|
|
|
extern int get_aggregate_argtypes(Aggref *aggref, Oid *inputTypes);
|
|
|
|
|
|
|
|
extern Oid resolve_aggregate_transtype(Oid aggfuncid,
|
|
|
|
Oid aggtranstype,
|
|
|
|
Oid *inputTypes,
|
|
|
|
int numArguments);
|
|
|
|
|
2015-08-04 17:53:10 +03:00
|
|
|
extern void build_aggregate_transfn_expr(Oid *agg_input_types,
|
2006-07-27 19:52:07 +00:00
|
|
|
int agg_num_inputs,
|
Support ordered-set (WITHIN GROUP) aggregates.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()). We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions. To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c. This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need. There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.
In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates. Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.
Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
2013-12-23 16:11:35 -05:00
|
|
|
int agg_num_direct_inputs,
|
Allow aggregate functions to be VARIADIC.
There's no inherent reason why an aggregate function can't be variadic
(even VARIADIC ANY) if its transition function can handle the case.
Indeed, this patch to add the feature touches none of the planner or
executor, and little of the parser; the main missing stuff was DDL and
pg_dump support.
It is true that variadic aggregates can create the same sort of ambiguity
about parameters versus ORDER BY keys that was complained of when we
(briefly) had both one- and two-argument forms of string_agg(). However,
the policy formed in response to that discussion only said that we'd not
create any built-in aggregates with varying numbers of arguments, not that
we shouldn't allow users to do it. So the logical extension of that is
we can allow users to make variadic aggregates as long as we're wary about
shipping any such in core.
In passing, this patch allows aggregate function arguments to be named, to
the extent of remembering the names in pg_proc and dumping them in pg_dump.
You can't yet call an aggregate using named-parameter notation. That seems
like a likely future extension, but it'll take some work, and it's not what
this patch is really about. Likewise, there's still some work needed to
make window functions handle VARIADIC fully, but I left that for another
day.
initdb forced because of new aggvariadic field in Aggref parse nodes.
2013-09-03 17:08:38 -04:00
|
|
|
bool agg_variadic,
|
2003-08-04 00:43:34 +00:00
|
|
|
Oid agg_state_type,
|
2011-03-19 20:29:08 -04:00
|
|
|
Oid agg_input_collation,
|
2003-08-04 00:43:34 +00:00
|
|
|
Oid transfn_oid,
|
2014-04-12 11:58:53 -04:00
|
|
|
Oid invtransfn_oid,
|
2003-08-04 00:43:34 +00:00
|
|
|
Expr **transfnexpr,
|
2015-08-04 17:53:10 +03:00
|
|
|
Expr **invtransfnexpr);
|
|
|
|
|
|
|
|
extern void build_aggregate_finalfn_expr(Oid *agg_input_types,
|
|
|
|
int num_finalfn_inputs,
|
|
|
|
Oid agg_state_type,
|
|
|
|
Oid agg_result_type,
|
|
|
|
Oid agg_input_collation,
|
|
|
|
Oid finalfn_oid,
|
2003-08-04 00:43:34 +00:00
|
|
|
Expr **finalfnexpr);
|
2003-07-01 19:10:53 +00:00
|
|
|
|
2001-11-05 17:46:40 +00:00
|
|
|
#endif /* PARSE_AGG_H */
|