I needed to find a ranking of a large data set. Using Python, it makes sense to look at the numpy library for this.
Numpy has the function argsort, which returns index positions [1]. One would think these are exactly the ranks we are after. Unfortunately, this is not the case.
>>> import numpy as np
>>> a = [3.0, 1.0, 5.0, 2.0]
>>> indx = np.argsort(a)
>>> indx
array([1, 3, 0, 2], dtype=int64)
I would expect:
indx = [2, 0, 3, 1]
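Actually, the reported indices describe the reverse mapping: from the sorted vector (which argsort never returns explicitly) to the original unsorted vector. That is, indx[k] is the position in a of the k-th smallest element: the smallest value 1.0 sits at position 1, the next value 2.0 at position 3, and so on.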
On the left is what I was after, and on the right is what argsort returns.
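To make the direction of the two mappings concrete, here is a small check on the same data. The names sorted_a, expected_rank and roundtrip are only illustrative.

```python
import numpy as np

a = np.array([3.0, 1.0, 5.0, 2.0])
indx = np.argsort(a)        # indx[k] = position in a of the k-th smallest value

# Gathering a with indx walks from sorted order back to the original positions,
# so the result is the sorted vector.
sorted_a = a[indx]
print(sorted_a)             # [1. 2. 3. 5.]

# The ranking goes the other way: rank[i] = position of a[i] in sorted order.
expected_rank = np.array([2, 0, 3, 1])

# Gathering the sorted values with the ranks recovers the original vector.
roundtrip = sorted_a[expected_rank]
print(roundtrip)            # [3. 1. 5. 2.]
```

So argsort answers "which element goes in slot k?", while a ranking answers "which slot does element i go to?".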
It is not very difficult to get the inverse mapping. Here are a few ways to do this:
- Using loops. Use the simple fact that for a permutation \(p\) and its inverse mapping \(p'\), we have: \[p_i=j \iff p'_j=i\]
>>> rank = np.empty_like(indx)
>>> for i, j in enumerate(indx):
...     rank[j] = i
...
>>> rank
array([2, 0, 3, 1], dtype=int64)
Looping is in general quite slow in Python, so we may want to look into some alternatives.
- Fancy indexing[2]. Here we use the array indx to permute the values [0,1,2,3].
>>> rank = np.empty_like(indx)
>>> rank[indx] = np.arange(len(indx))
>>> rank
array([2, 0, 3, 1], dtype=int64)
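To get a feel for how much the explicit loop costs compared to fancy indexing, a rough timing sketch could look like the following. The helper names rank_loop and rank_fancy, the problem size, and the number of repetitions are my own choices for illustration. Timings depend on the machine, so none are quoted here.

```python
import timeit
import numpy as np

def rank_loop(indx):
    """Invert the permutation with an explicit Python loop."""
    rank = np.empty_like(indx)
    for i, j in enumerate(indx):
        rank[j] = i
    return rank

def rank_fancy(indx):
    """Invert the permutation with a single fancy-indexing assignment."""
    rank = np.empty_like(indx)
    rank[indx] = np.arange(len(indx))
    return rank

# Illustrative test data: argsort of 100,000 random numbers.
rng = np.random.default_rng(0)
indx = np.argsort(rng.random(100_000))

t_loop = timeit.timeit(lambda: rank_loop(indx), number=3)
t_fancy = timeit.timeit(lambda: rank_fancy(indx), number=3)
print(f"loop:  {t_loop:.4f} s for 3 runs")
print(f"fancy: {t_fancy:.4f} s for 3 runs")
```

Both functions return the same array; only the running time differs.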
- Applying argsort twice [2]. This is a bit of a surprise, and the most intriguing of the three, but it does exactly what we want: argsort(argsort(a)) gives the ranking. Below, indx is the result of the first argsort.
>>> rank = np.argsort(indx)
>>> rank
array([2, 0, 3, 1], dtype=int64)
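Why does the double argsort work? The array indx is a permutation of 0..n-1, so sorting it gives 0, 1, ..., n-1, and argsort(indx)[k] is simply the position where the value k occurs in indx. That is exactly the inverse permutation. The sketch below checks this on random data; the seed and size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.random(1000)

indx = np.argsort(a)

# Ranking via a second argsort.
rank_twice = np.argsort(indx)

# Ranking via explicit inversion of the permutation (fancy indexing).
rank_inverse = np.empty_like(indx)
rank_inverse[indx] = np.arange(len(indx))

# Both approaches agree.
print(np.array_equal(rank_twice, rank_inverse))                # True

# Composing a permutation with its inverse gives the identity.
print(np.array_equal(indx[rank_twice], np.arange(len(indx))))  # True
```

Note that the second argsort is another O(n log n) sort, whereas the fancy-indexing version only needs a single O(n) pass.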
An alternative is to use scipy.stats.rankdata. The above ranking can be replicated with:
>>> import scipy.stats as stats
>>> rank = stats.rankdata(a, method='ordinal') - 1
>>> rank
array([2, 0, 3, 1], dtype=int64)
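The method argument matters once the data contains ties. As a small illustration (the data vector with a repeated 3.0 is made up for this purpose), 'ordinal' gives tied values distinct ranks in order of appearance, which matches a stable argsort, while other methods such as 'average' or 'min' assign tied values the same rank.

```python
import numpy as np
import scipy.stats as stats

a = [3.0, 1.0, 3.0, 2.0]   # illustrative data: the value 3.0 occurs twice

# Ordinal ranks: ties are broken by order of appearance (zero-based after -1).
rank_ordinal = stats.rankdata(a, method='ordinal') - 1
print(rank_ordinal)         # [2 0 3 1]

# Same result from a stable argsort followed by inversion.
indx = np.argsort(a, kind='stable')
rank_argsort = np.empty_like(indx)
rank_argsort[indx] = np.arange(len(indx))
print(rank_argsort)         # [2 0 3 1]

# Other methods give tied values a shared rank (one-based here).
print(stats.rankdata(a, method='average'))   # [3.5 1.  3.5 2. ]
print(stats.rankdata(a, method='min'))       # [3 1 3 2]
```

If ties can occur and a reproducible ranking is needed, passing kind='stable' to argsort is the safer choice, since the default sort kind does not guarantee the order of equal elements.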
References
[1] numpy.argsort, https://numpy.org/doc/stable/reference/generated/numpy.argsort.html
[2] Rank Values in NumPy Array, https://www.delftstack.com/howto/numpy/python-numpy-rank/