Oracle Analytics - I wish there was one more aggregate function...

I'm a huge fan of Oracle's analytic functions and try to use them in my SQL whenever I can. I do however wish there was just one more tool in this very neat toolbox, MOST_COMMON().

The toolbox currently consists of the following functions, I've created a test table (DDL and inserts below) for those of us who are visual learners and like to see examples. Please note, these are very simple examples and I'd recommend you doing some Google-Kung-Fu to learn more about them if you find them interesting:

  • LAG and LEAD: LAG helps you get the value of the field from the previous row and LEAD allows access to the value of the next row. Here's an example but

    SELECT id,
    str,
    LAG(str) OVER (ORDER BY id) lag_str,
    LEAD(str) OVER (ORDER BY id) lead_str
    FROM test_stuff
    ORDER BY id;

    ID STR LAG_STR LEAD_STR
    ---- ------------ ------------ ------------
    1 Rich Richard
    2 Richard Rich Richie
    3 Richie Richard Rich
    4 Rich Richie Richard
    5 Richard Rich Richie
    6 Richie Richard Rick
    7 Rick Richie Ricker
    8 Ricker Rick Rich
    9 Rich Ricker Richie Rich
    10 Richie Rich Rich

    10 rows selected.


  • FIRST_VALUE & LAST_VALUE: These functions allow you to get the First and Last values in a set.

    SELECT id,
    str,
    FIRST_VALUE(str) over () first_str,
    LAST_VALUE(str) over () last_str
    FROM test_stuff
    ORDER BY id;

    ID STR FIRST_STR LAST_STR
    ---- ------------ ------------ -------------
    1 Rich Rich Richie Rich
    2 Richard Rich Richie Rich
    3 Richie Rich Richie Rich
    4 Rich Rich Richie Rich
    5 Richard Rich Richie Rich
    6 Richie Rich Richie Rich
    7 Rick Rich Richie Rich
    8 Ricker Rich Richie Rich
    9 Rich Rich Richie Rich
    10 Richie Rich Rich Richie Rich

    10 rows selected.


  • Rank & Dense Rank: Using the Oracle Rank function allows assign a number to the row illustrating the sequence or "rank" of that record. Used in aggregate functions, the most common value would have a rank of 1 and the second 2, etc. Dense Rank ensures the values are consecutive, so if you had 2 values with a rank of #2, the next dense rank value would have a value of 3 (where rank would have 4). Take a look at the example:

    SELECT str,
    COUNT(*) records,
    RANK() OVER (order by COUNT(*) DESC) ranking,
    DENSE_RANK() OVER (order by COUNT(*) DESC) DENSE_ranking
    FROM test_stuff
    GROUP BY str
    ORDER BY COUNT(*) DESC;

    STR RECORDS RANKING DENSE_RANKING
    ------------ -------------- -------------- --------------
    Rich 3 1 1
    Richie 2 2 2
    Richard 2 2 2
    Richie Rich 1 4 3
    Rick 1 4 3
    Ricker 1 4 3
    --------------
    sum 10

    6 rows selected.


    Other Oracle functions which should be in your SQL Toolbox include: MIN, MAX, SUM, AVG, COUNT, RATIO_TO_REPORT, VARIANCE, and STDDEV.

    The one that's missing? MOST_COMMON. I wish while using the other aggregate functions there was a function which would give you the most common value, for example:

    SELECT MOST_COMMON(str) STR
    FROM test_stuff;

    STR
    ------------
    Rich


    I'm not sure off hand what the function should do if there are more than one most common values, obviously this would take a little bit deeper thinking than what I've done here. Yesterday I had to do something like this and I cobbled together a big-old SQL statement which used sub-selects and the RANK function, it was really ugly but it worked.

    We'll see if Oracle enhances it's analytic functions in the future, perhaps Larry E. will read this and the team there will implement my MOST_COMMON function, maybe I'll get a sailboat ride out of the deal.

    Until next time...Rich




    CREATE TABLE test_stuff (
    id NUMBER PRIMARY KEY,
    str VARCHAR2(30));

    INSERT INTO test_stuff (id, str) VALUES (1, 'Rich');
    INSERT INTO test_stuff (id, str) VALUES (2, 'Richard');
    INSERT INTO test_stuff (id, str) VALUES (3, 'Richie');
    INSERT INTO test_stuff (id, str) VALUES (4, 'Rich');
    INSERT INTO test_stuff (id, str) VALUES (5, 'Richard');
    INSERT INTO test_stuff (id, str) VALUES (6, 'Richie');
    INSERT INTO test_stuff (id, str) VALUES (7, 'Rick');
    INSERT INTO test_stuff (id, str) VALUES (8, 'Ricker');
    INSERT INTO test_stuff (id, str) VALUES (9, 'Rich');
    INSERT INTO test_stuff (id, str) VALUES (10, 'Richie Rich');
    COMMIT;
  • Comments

    Edgar Chupit said…
    You still can achieve it by using analytic and a bit of inline views:

    SQL> select id, str, max( case when cnt_ = max_cnt then str end ) over() most_frequent
    2 from (
    3 select id, str, cnt_, max(cnt_) over() max_cnt
    4 from (
    5 select id, str, count(*) over(partition by str) cnt_
    6 from test_stuff
    7 )
    8 )
    9 /

    ID STR MOST_FREQUENT
    ---------- ------------------------------ ------------------------------
    4 Rich Rich
    1 Rich Rich
    9 Rich Rich
    2 Richard Rich
    5 Richard Rich
    6 Richie Rich
    3 Richie Rich
    10 Richie Rich Rich
    7 Rick Rich
    8 Ricker Rich

    10 rows selected
    mig_l said…
    http://docs.oracle.com/cd/B28359_01/server.111/b28286/functions161.htm#SQLRF06320

    STATS_MODE()
    Murnane said…
    Thanks mig_l, what's crazy is that I knew about the function back in 2008 (http://richmurnane.blogspot.com/2008/07/getting-mean-median-and-mode-in-oracle.html) but completely forgot about it!