Oracle Analytics - I wish there was one more aggregate function...

February 17, 2011

Oracle Analytics - I wish there was one more aggregate function...

I'm a huge fan of Oracle's analytic functions and try to use them in my SQL whenever I can. I do however wish there was just one more tool in this very neat toolbox, MOST_COMMON().

The toolbox currently consists of the following functions, I've created a test table (DDL and inserts below) for those of us who are visual learners and like to see examples. Please note, these are very simple examples and I'd recommend you doing some Google-Kung-Fu to learn more about them if you find them interesting:

LAG and LEAD: LAG helps you get the value of the field from the previous row and LEAD allows access to the value of the next row. Here's an example but


    SELECT  id,
           str,
           LAG(str) OVER (ORDER BY id) lag_str,
           LEAD(str) OVER (ORDER BY id) lead_str
    FROM    test_stuff
    ORDER BY id;

     ID STR          LAG_STR      LEAD_STR
   ---- ------------ ------------ ------------
      1 Rich                      Richard
      2 Richard      Rich         Richie
      3 Richie       Richard      Rich
      4 Rich         Richie       Richard
      5 Richard      Rich         Richie
      6 Richie       Richard      Rick
      7 Rick         Richie       Ricker
      8 Ricker       Rick         Rich
      9 Rich         Ricker       Richie Rich
     10 Richie Rich  Rich

    10 rows selected.

FIRST_VALUE & LAST_VALUE: These functions allow you to get the First and Last values in a set.


    SELECT  id,
           str,
           FIRST_VALUE(str) over () first_str,
           LAST_VALUE(str) over () last_str
    FROM    test_stuff
    ORDER BY id;

     ID STR          FIRST_STR    LAST_STR
   ---- ------------ ------------ -------------
      1 Rich         Rich         Richie Rich
      2 Richard      Rich         Richie Rich
      3 Richie       Rich         Richie Rich
      4 Rich         Rich         Richie Rich
      5 Richard      Rich         Richie Rich
      6 Richie       Rich         Richie Rich
      7 Rick         Rich         Richie Rich
      8 Ricker       Rich         Richie Rich
      9 Rich         Rich         Richie Rich
     10 Richie Rich  Rich         Richie Rich

    10 rows selected.

Rank & Dense Rank: Using the Oracle Rank function allows assign a number to the row illustrating the sequence or "rank" of that record. Used in aggregate functions, the most common value would have a rank of 1 and the second 2, etc. Dense Rank ensures the values are consecutive, so if you had 2 values with a rank of #2, the next dense rank value would have a value of 3 (where rank would have 4). Take a look at the example:


    SELECT  str,
           COUNT(*) records,
           RANK() OVER (order by COUNT(*) DESC) ranking,
           DENSE_RANK() OVER (order by COUNT(*) DESC) DENSE_ranking
    FROM    test_stuff
    GROUP BY str
    ORDER BY COUNT(*) DESC;

    STR                 RECORDS        RANKING  DENSE_RANKING
    ------------ -------------- -------------- --------------
    Rich                      3              1              1
    Richie                    2              2              2
    Richard                   2              2              2
    Richie Rich               1              4              3
    Rick                      1              4              3
    Ricker                    1              4              3
                --------------
    sum                      10

    6 rows selected.

Other Oracle functions which should be in your SQL Toolbox include: MIN, MAX, SUM, AVG, COUNT, RATIO_TO_REPORT, VARIANCE, and STDDEV.

The one that's missing? MOST_COMMON. I wish while using the other aggregate functions there was a function which would give you the most common value, for example:


    SELECT  MOST_COMMON(str) STR
    FROM    test_stuff;

    STR         
    ------------
    Rich

I'm not sure off hand what the function should do if there are more than one most common values, obviously this would take a little bit deeper thinking than what I've done here. Yesterday I had to do something like this and I cobbled together a big-old SQL statement which used sub-selects and the RANK function, it was really ugly but it worked.

We'll see if Oracle enhances it's analytic functions in the future, perhaps Larry E. will read this and the team there will implement my MOST_COMMON function, maybe I'll get a sailboat ride out of the deal.

Until next time...Rich


    CREATE TABLE test_stuff (
        id      NUMBER PRIMARY KEY,
        str     VARCHAR2(30));

    INSERT INTO test_stuff (id, str) VALUES (1, 'Rich');
    INSERT INTO test_stuff (id, str) VALUES (2, 'Richard');
    INSERT INTO test_stuff (id, str) VALUES (3, 'Richie');
    INSERT INTO test_stuff (id, str) VALUES (4, 'Rich');
    INSERT INTO test_stuff (id, str) VALUES (5, 'Richard');
    INSERT INTO test_stuff (id, str) VALUES (6, 'Richie');
    INSERT INTO test_stuff (id, str) VALUES (7, 'Rick');
    INSERT INTO test_stuff (id, str) VALUES (8, 'Ricker');
    INSERT INTO test_stuff (id, str) VALUES (9, 'Rich');
    INSERT INTO test_stuff (id, str) VALUES (10, 'Richie Rich');
    COMMIT;

Comments

Edgar Chupit said…

You still can achieve it by using analytic and a bit of inline views:

SQL> select id, str, max( case when cnt_ = max_cnt then str end ) over() most_frequent
2 from (
3 select id, str, cnt_, max(cnt_) over() max_cnt
4 from (
5 select id, str, count(*) over(partition by str) cnt_
6 from test_stuff
7 )
8 )
9 /

ID STR MOST_FREQUENT
---------- ------------------------------ ------------------------------
4 Rich Rich
1 Rich Rich
9 Rich Rich
2 Richard Rich
5 Richard Rich
6 Richie Rich
3 Richie Rich
10 Richie Rich Rich
7 Rick Rich
8 Ricker Rich

10 rows selected

Feb 17, 2011, 9:20:00 AM

mig_l said…

http://docs.oracle.com/cd/B28359_01/server.111/b28286/functions161.htm#SQLRF06320

STATS_MODE()

Mar 5, 2014, 9:52:00 AM

Murnane said…

Thanks mig_l, what's crazy is that I knew about the function back in 2008 (http://richmurnane.blogspot.com/2008/07/getting-mean-median-and-mode-in-oracle.html) but completely forgot about it!

Mar 5, 2014, 10:12:00 AM

Search This Blog

Rich Murnane's Blog

Oracle Analytics - I wish there was one more aggregate function...

Comments

Popular Posts

Levenshtein Distance Algorithm: Oracle PL/SQL Implementation using a two-dimensional array of numbers

The "Mosaic Effect"...