/*-------------------------------------------------------------------------
 *
 * parse_agg.h
 *	  handle aggregates and window functions in parser
 *
 * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/parser/parse_agg.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef PARSE_AGG_H
#define PARSE_AGG_H

#include "parser/parse_node.h"

/* Finish the initial transformation of an aggregate call (see parse_agg.c). */
extern void transformAggregateCall(ParseState *pstate, Aggref *agg,
					   List *args, List *aggorder,
					   bool agg_distinct);

/* Transform a GROUPING() expression; levels and nesting are handled as for aggregates. */
extern Node *transformGroupingFunc(ParseState *pstate, GroupingFunc *g);

/* Finish the initial transformation of a window function call. */
extern void transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
						WindowDef *windef);

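/*
 * Check for misplaced aggregates and improper grouping.  Call this after the
 * target list and qualifications are finalized.
 */
extern void parseCheckAggregates(ParseState *pstate, Query *qry);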

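/*
 * Expand a groupingSets clause into a flat list of grouping sets.  If limit
 * is >= 0 and more than that many sets would result, NIL is returned.
 */
extern List *expand_grouping_sets(List *groupingSets, int limit);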

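/*
 * Store the actual argument datatypes of an aggregate call into inputTypes[]
 * (an array of length FUNC_MAX_ARGS); returns the number of arguments.
 */
extern int	get_aggregate_argtypes(Aggref *aggref, Oid *inputTypes);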

/* Resolve the (possibly polymorphic) transition state datatype of an aggregate call. */
extern Oid resolve_aggregate_transtype(Oid aggfuncid,
							Oid aggtranstype,
							Oid *inputTypes,
							int numArguments);

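/*
 * Build an expression tree for an aggregate's transition function, and for
 * its inverse transition function when invtransfn_oid is valid.
 */
extern void build_aggregate_transfn_expr(Oid *agg_input_types,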
						int agg_num_inputs,
						int agg_num_direct_inputs,
						bool agg_variadic,
						Oid agg_state_type,
						Oid agg_input_collation,
						Oid transfn_oid,
						Oid invtransfn_oid,
						Expr **transfnexpr,
						Expr **invtransfnexpr);

/* Like build_aggregate_transfn_expr, but for the aggregate's final function. */
extern void build_aggregate_finalfn_expr(Oid *agg_input_types,
						int num_finalfn_inputs,
						Oid agg_state_type,
						Oid agg_result_type,
						Oid agg_input_collation,
						Oid finalfn_oid,
						Expr **finalfnexpr);

#endif   /* PARSE_AGG_H */