Let’s say we have an sql table of jobs. They can be in any of several states. They have a starte time and a completed time. And they have have a type of A or B.

type started completed state
A 9:01am 9:02am completed
A 9:10am 9:14am aborted
B 9:20am 9:21am cancelled
A 9:20am 9:22am completed

Let’s say we want to know for A and B, what percentage of jobs complete. We’d also like to know what percentage of jobs finish under a certain time threshold. Now this is drastically simplified, but you can imagine that job type could be any complex set of variables you want to analyze by: region, language, type of rendering, whatever.

Here’s a query that will calculate the ratio of completed jobs by type:

select type,
avg(cast(
case
  when state='completed' then 1
  else 0
end
as float)
) as completed_ratio
from jobs
group by type

The way it works:

  1. The case statement sets of value of 1 for the success condition, and 0 for any other.
  2. These are integers, so we cast them as float, so we’ll get the averaging behavior we want.
  3. We avg them, to get the average.
Similarly, if we wanted to calculate the ratio of jobs completed under a certain time threshold, we could do that in the case statement, such as this example which calculates the ratio of jobs whose end time is under 3 minutes:


select type,
avg(cast(
case
  when datediff(minute, completed, started) < 3 then 1
  else 0
end
as float)
) as completed_ratio
from jobs
group by type