Ranking functions:
1.RANK
2.DENSE_RANK
3.ROW_NUMBER
4.NTILE
5.WIDTH_BUCKET
6.CUME_DIST
7.PERCENT_RANK
8.FIRST / LAST (KEEP)
1.RANK: Starts with 1, if duplicate values, same rank is assigned. Gaps appear in the ranks.
2.DENSE_RANK: Starts with 1, if duplicate values,same rank is assigned. No gap in ranks.
3.ROW_NUMBER: Starts with 1 & unique, if duplicate values,numbers are arbitrarily assigned.
The PARTITION BY clause used in ranking functions is used to divide a result set into pieces so that rankings can be applied within each subset.
Handling NULLS:
All ranking functions allow to specify where in the ranking order NULL values should appear.
Append NULLS FIRST / NULLS LAST after the ORDER BY clause of the function.
USAGE: Top/Bottom-N queries
FIRST / LAST: (KEEP)
"Find the regions with the best and worst total sales last year."
Example:
MIN(region_id)
KEEP (DENSE_RANK FIRST ORDER BY SUM(tot_sales) DESC) best_region,
MIN(region_id)
KEEP (DENSE_RANK LAST ORDER BY SUM(tot_sales) DESC) worst_region
NTILE:
Group the records into buckets ( equiheight buckets ).
Example:
NTILE(4) OVER (ORDER BY SUM(tot_sales) DESC) will divide it into four buckets.
WIDTH_BUCKET:(equiwidth buckets)
WIDTH_BUCKET function groups rows of the result set into buckets.
WIDTH_BUCKET function attempts to create equiwidth buckets, meaning that the range of values is evenly distributed across the buckets.
WIDTH_BUCKET can operate on numeric or date types.
WIDTH_BUCKET uses the values of the second, third, and fourth parameters to generate N buckets.
If the expression yields values that fall outside the range specified by the second and third parameters, the WIDTH_BUCKET function will generate two additional buckets, numbered 0 and N+1, into which the outliers are placed.
Example:
WIDTH_BUCKET(SUM(tot_sales), 1, 3000000, 3) sales_buckets
Based on these parameters, the WIDTH_BUCKET function generates three buckets;
the first bucket starts at 1, and the third bucket has an upper range of 3,000,000.
Since there are three buckets, the ranges for each bucket will be
1 to 1,000,000,
1,000,0001 to 2,000,000, and
2,000,0001 to 3,000,000.
Any values less than 1 or greater than 3,000,000 will be placed to 0th and 4th buckets.
NOTE:
Keep in mind that the WIDTH_BUCKET function does not remove rows from the result set that do not lie within the specified range; rather, they are placed into special buckets that your query can either utilize or ignore as needed.
CUME_DIST and PERCENT_RANK :
CUME_DIST( ) : (Cumulative Distribution)
Ratio of the number of rows that have a lesser or equal ranking to the total number of rows in the partition.
PERCENT_RANK( ) :
Ratio of a row's ranking to the number of rows in the partition.
PERCENT_RANK=((Rank of the row -1) / (Total rows -1))
Example:
CUME_DIST( ) OVER (ORDER BY SUM(tot_sales) DESC) sales_cume_dist
PERCENT_RANK( ) OVER (ORDER BY SUM(tot_sales) DESC) sales_percent_rank
No comments:
Post a Comment