数组排序是很常见的一种操作。

一维数组排序大家都很熟：

>>> l=[4,2,5,7,1]
>>> sorted(l)
[1, 2, 4, 5, 7]

>>> l=['a','apple','banana','car','kubernetes']
>>> sorted(l,key=len)
['a', 'car', 'apple', 'banana', 'kubernetes']

>>> sorted(l,key=lambda x:x[-1])
['a', 'banana', 'apple', 'car', 'kubernetes']

>>> sorted(l,key=lambda x:x[-1],reverse=True)
['kubernetes', 'car', 'apple', 'a', 'banana']

>>> class Human:
...     def __init__(self,age):
...             self.age=age
...     def __repr__(self):
...             return f"Human, age:{self.age}"
...
>>> l=[Human(10),Human(5),Human(34),Human(24)]
>>> sorted(l,key=lambda x:x.age)
[Human, age:5, Human, age:10, Human, age:24, Human, age:34]

谈到二维数组或更高维的多重排序时，刚开始自己就摸不到头脑了。

对于数据库表可以很简单地使用多重排序：

  
select a,b from table order by a,b;

但 python 不能使用 sql 对多维数组排序啊。

经过查找资料发现有如下有趣的用法：

lambda x: (x[0], x[1])

  
data = [[2, 3], [1, 3], [2, 1], [3, 5], [2, 6]]
# 多重排序，对元素 item，先按 item[0] 升序，再按 item[1] 升序
s = sorted(data, key=lambda x: (x[0], x[1]))
for i in s:
    print(i)

输出结果：

[1, 3]
[2, 1]
[2, 3]
[2, 6]
[3, 5]

当然可以对每个排序对象指定排序规则：

  
data = [[2, 3], [1, 3], [2, 1], [3, 5], [2, 6]]
s = sorted(data, key=lambda x: (x[0], -x[1]))
for i in s:
    print(i)

输出结果：

[1, 3]
[2, 6]
[2, 3]
[2, 1]
[3, 5]

局限性：上面的 - 表示逆序，只适用于数字的比较。

attrgetter

对于类对象列表的按属性多重排序可以 attrgetter ：

  
from operator import attrgetter


class Human:
    def __init__(self, name, age, height):
        self.name = name
        self.age = age
        self.height = height

    def __repr__(self):
        return f"Human, age:{self.age}, height:{self.height}, name:{self.name}"


data = [Human('july', 12, 134), Human('andy', 13, 145),
        Human('lisa', 10, 125), Human('joe', 10, 130),
        Human('bob', 34, 178), Human('tom', 13, 145)]
s = sorted(data, key=attrgetter('age', 'name'))
for i in s:
    print(i)

输出结果：

Human, age:10, height:130, name:joe
Human, age:10, height:125, name:lisa
Human, age:12, height:134, name:july
Human, age:13, height:145, name:andy
Human, age:13, height:145, name:tom
Human, age:34, height:178, name:bob

局限性：只能按照指定属性升序排列。

自定义比较函数

在 sorted() 函数中我们可以指定排序函数，但是这个函数只能接收一个参数，我们自定义比较函数时需要两个参数来互相比较，这就很矛盾。

python functools 包中有个辅助函数 cmp_to_key 可以帮我们解决。

  
from functools import cmp_to_key


def compare(x, y):
    # item[0] 升序
    if x[0] > y[0]:
        # 返回1，为了排序后 x[0] 在 y[0] 左侧
        return 1
    if x[0] < y[0]:
        return -1

    # 以下是 x[0]=y[0] 的情况

    # item[1] 降序
    if x[1] > y[1]:
        return -1
    if x[1] < y[1]:
        return 1
    return 0


data = [[2, 3], [1, 3], [2, 1], [3, 5], [2, 6]]
s = sorted(data, key=cmp_to_key(compare))
for i in s:
    print(i)

输出结果：

[1, 3]
[2, 6]
[2, 3]
[2, 1]
[3, 5]

cmp_to_key 的实现原理如下：

  
################################################################################
### cmp_to_key() function converter
################################################################################

def cmp_to_key(mycmp):
    """Convert a cmp= function into a key= function"""
    class K(object):
        __slots__ = ['obj']
        def __init__(self, obj):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        __hash__ = None
    return K

cmp_to_key 将函数封装成了一个对象，利用了对象里的大小比较协议来实现排序。

这种方法虽然实现起来比较繁琐，但最灵活，实用性更广。

多维排序

lambda x: (x[0], x[1])

attrgetter

自定义比较函数

相关文章

回文

最长公共前缀

Python中的拷贝