Counting Inversions using Merge Sort

One useful application of the merge sort is finding the number of inversions in a list. Inversion count represents how close (or far) the list is from being sorted.

Formal definition à Two elements array[i] and array[j] form an inversion if i < j and array[i] > array[j].

To put in simpler terms, inversions are pairs of numbers, in a disordered list, where the larger of the two numbers is to the left of the smaller number.

The first thought would be "for each element of the array, we count all elements less than it to its right and add the count to output". The complexity of this solution would be O(n2).

Can we reduce time complexity?

Of course! We use merge sort! Merge Sort was already counting inversions for us, we just needed to track it. Let’s see how.

Visualization

These two examples will give you a better understanding of what are inversions.


Pseudo Code

split each element into partitions of size 1
recursively merge adjacent partitions
  for i = leftPartIdx to rightPartIdx
    if leftPartHeadValue <= rightPartHeadValue
      copy leftPartHeadValue
    else
      copy rightPartHeadValue
      Increase InvIdx
copy elements back to the original array

Implementation in C++


The final array gets sorted by following the above code. To avoid that, create a copy of the original array and perform the mergeSort on that.

Practice Problem
Question: Counting Inversions Revisited (CodeChef)
Solution: As you have read in the question, we are given an array of 'n' numbers and that array is repeated 'k' number of times to get an array of length 'n*k'. Now, we have to find the number of inversions in the final array. So, how would we go about solving this?

Your first thought would be to apply merge sort over this new array of length 'n*k' and find the number of inversions. But, the time complexity of this method would be 'nk*log(nk)' and for large values of k, this would make it less efficient. So, any other way to solve this? Yes! We use some basic counting techniques. Let's see how.

First, take the given array of length n, compare the values in arr[0] (let's say A) and arr[1] (let's say B) i.e., the first and the second element. Now,
  • Case 1: If A > B. In this case, (A, B) itself forms an inversion pair. So, from the POV (Point Of View) of the first A, there will be 'k' values of B after it, which form an inversion pair with A. From the POV of the second A, there will 'k-1' values of B after it, which form an inversion pair with the second A and so on. We get a pattern.
  • Case 2: If A < B. In this case, (A, B) don't form an inversion pair. But, the first B and the second A form an inversion pair. So, from the POV of the first B, there will be 'k-1' values of A after it, which form an inversion pair with B. From the POV of the second B, there will be 'k-2' values of A after it, which form an inversion pair with the second B and so on. We once again get a pattern.
By iterating the above method over the array of length n taking two elements like a pair, we get the total number of inversions in the final array of length 'n*k'. Given below is the solution in C++.


Also, check out another practice problem -- Inversion Count (SPOJ)

If you find any errors in the article, please comment below.


Written with  by Ashwin


Comments