I needed to rank the values in a large data set. In Python, the NumPy library is the natural place to look.
NumPy has the function argsort, which returns index positions. One would think these are exactly the ranks we are after. Unfortunately, this is not the case.
I would expect:
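In fact, the indices that argsort reports describe the reverse mapping: from positions in the (unknown) sorted vector to positions in the original unsorted vector. In other words, indexing the original vector with these indices yields the sorted vector.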
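A small sketch makes the difference concrete. The array `a` is made-up example data, not the data set from this post, and the ranks are worked out by hand for this tiny case.

```python
import numpy as np

# Hypothetical example data (not the original data set)
a = np.array([3.5, 1.2, 4.8, 2.0])

# argsort: indx[k] is the position in `a` of the k-th smallest element
indx = np.argsort(a)
print(indx)       # [1 3 0 2]
print(a[indx])    # [1.2 2.  3.5 4.8], i.e. the sorted vector

# The ranks we actually want: rank[i] is the position of a[i] in the sorted vector.
# Worked out by hand here: 3.5 is third smallest (rank 2), 1.2 is smallest (rank 0), ...
rank = np.array([2, 0, 3, 1])
print(np.sort(a)[rank])   # reproduces `a`
```

The two index vectors differ: `indx` maps sorted positions to original positions, while `rank` maps original positions to sorted positions.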
Ranks require the inverse of this mapping. There are several ways to compute it:
- Using loops. Use the simple fact that for a permutation \(p\) and its inverse mapping \(p'\), we have: \[p_i=j \iff p'_j=i\] Looping is generally quite slow in Python, so we may want to look into some alternatives.
- Fancy indexing. Here we use the array indx as an index to place the values [0,1,2,3] at the positions it lists.
- Applying argsort twice. This is the most intriguing option: argsort(argsort(a)) gives the ranking directly. It looks surprising at first, but it works because applying argsort to a permutation returns the inverse permutation.
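The three approaches can be sketched side by side as follows. This is a minimal illustration, not necessarily the code originally used: the array `a` is made-up example data, and the names `rank_loop`, `rank_fancy` and `rank_twice` are introduced here only for comparison.

```python
import numpy as np

# Hypothetical example data (not the original data set)
a = np.array([3.5, 1.2, 4.8, 2.0])
indx = np.argsort(a)   # sorted position -> original position
n = len(a)

# 1. Loop: indx[i] == j  <=>  rank[j] == i
rank_loop = np.empty(n, dtype=int)
for i, j in enumerate(indx):
    rank_loop[j] = i

# 2. Fancy indexing: write 0..n-1 into the positions listed in indx
rank_fancy = np.empty(n, dtype=int)
rank_fancy[indx] = np.arange(n)

# 3. argsort twice: the argsort of a permutation is its inverse
rank_twice = np.argsort(indx)   # same as np.argsort(np.argsort(a))

print(rank_loop, rank_fancy, rank_twice)   # each is [2 0 3 1]
```

All three produce zero-based ranks. Note that when `a` contains ties, tied values receive distinct ranks in whatever order argsort happens to place them; passing `kind="stable"` to `np.argsort` makes that order follow the original positions.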
- Numpy.argsort, https://numpy.org/doc/stable/reference/generated/numpy.argsort.html
- Rank Values in Numpy Array, https://www.delftstack.com/howto/numpy/python-numpy-rank/