Feeds:
Posts

SQL SERVER 2012- Shortest way to calculate median

Calculating median has never been an issue, as we have lots of solutions to calculate medians using CTE, user defined functions etc. But what encourages me to write this post is, I found a shortcut to calculate the median using PERCENTILE_CONT (A new analytic function shipped in SQL Server 2012).

As per the definition : To find the Median, place the numbers you are given in value order and find the middle number. Click here for Reference

Given below is the sample that I will use to demonstrate what is median.

<br />SELECT ProductID,StandardCost FROM<br />[Production].[ProductCostHistory]<br />WHERE ProductID IN (707,708,711,712,713)<br />ORDER BY ProductID<br />GO<br />--OUTPUT<br /> Given below is the script to calculate the median in a shortest way.

<br />Use AdventureWorks2012<br />GO<br />SELECT<br />DISTINCT [ProductID]<br />,PERCENTILE_CONT(0.5)<br />WITHIN GROUP (ORDER BY [StandardCost])<br />OVER (PARTITION BY [ProductID]) AS Median<br />FROM [Production].[ProductCostHistory]<br />WHERE ProductID IN (707,708,711,712,713)<br />ORDER BY [ProductID]<br />GO<br />--OUTPUT<br /> Note : Whenever you calculate median using PERCENTILE_CONT make sure it is PERCENTILE_CONT(0.5).

SQL Server 2012 – Analytic Functions – PERCENTILE_CONT

In this article, we will discuss another important analytical function introduced in SQL SERVER 2012 namely PERCENTILE_CONT. Lets discuss PERCENTILE_CONT syntax, purpose, return type with simple examples.

Syntax

PERCENTILE_CONT ( numeric_literal )
WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC ] )
OVER ( [ <partition_by_clause> ] )

Purpose
The purpose of this function is to calculate the percentile based on a continuous distribution of the field. In other words you can say that it also calculates the median but just to base it on the median, is not correct.

Return Type
The return type is float(53) and the value of percentile should be between 0 and 1.

Formula
Given below is the formula that will help you to understand how it works.
If (CRN = FRN = RN) then the result is
(value of expression from row at RN)
Otherwise the result is
(CRN – RN) * (value of expression for row at FRN) +
(RN – FRN) * (value of expression for row at CRN)

Where
RN = Row number
CRN = Ceiling of RN
FRN = Floor of RN

Lets implement this formula in the examples :
Example 1 : When PERCENTILE_CONT is 0.5
Lets take gold rate as an example to check their percentile based on a continuous distribution of the gold rate in one week.

Create table [Daily_Gold_Rate]
(
[S.No] int,
[Date] datetime,
[Carat] int,
[Gold Rate] numeric(18,2)
)

Insert into [Daily_Gold_Rate] values(1,'2013-01-03',18,155.00)
Insert into [Daily_Gold_Rate] values(2,'2013-01-03',22,190.50)
Insert into [Daily_Gold_Rate] values(3,'2013-01-03',24,202.23)

Insert into [Daily_Gold_Rate] values(4,'2013-01-04',18,153.00)
Insert into [Daily_Gold_Rate] values(5,'2013-01-04',22,191.00)
Insert into [Daily_Gold_Rate] values(6,'2013-01-04',24,202.25)

Insert into [Daily_Gold_Rate] values(7,'2013-01-05',18,150.00)
Insert into [Daily_Gold_Rate] values(8,'2013-01-05',22,190.00)
Insert into [Daily_Gold_Rate] values(9,'2013-01-05',24,203.25)

Insert into [Daily_Gold_Rate] values(10,'2013-01-06',18,158.00)
Insert into [Daily_Gold_Rate] values(11,'2013-01-06',22,189.50)
Insert into [Daily_Gold_Rate] values(12,'2013-01-06',24,201.50)

Select
[Date]
,[Carat]
,[Gold Rate]
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY [Gold rate])
OVER (PARTITION BY [Carat]) AS MedianCont
from [Daily_Gold_Rate] Lets calculate the formula in excel and view the results.
if Percentile is 0.5 then RN is not equal to CRN and is not equal to FRN. Therefore the formula will be
(CRN – RN) * (value of expression for row at FRN) +
(RN – FRN) * (value of expression for row at CRN) Example 2 : When PERCENTILE_CONT is 1

Select
[Date]
,[Carat]
,[Gold Rate]
,PERCENTILE_CONT(1) WITHIN GROUP (ORDER BY [Gold rate])
OVER (PARTITION BY [Carat]) AS MedianCont
from [Daily_Gold_Rate] Lets calculate the formula in excel and view the results.
if Percentile is 1 then RN=CRN=FRN and the formula will be
If (CRN = FRN = RN) then the result is
(value of expression from row at RN) Reference :MSDN