Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

动态规划:最长公共子序列 LCS #131

Open
xxleyi opened this issue Jun 25, 2019 · 4 comments
Open

动态规划:最长公共子序列 LCS #131

xxleyi opened this issue Jun 25, 2019 · 4 comments

Comments

@xxleyi
Copy link
Owner

xxleyi commented Jun 25, 2019

给定两个字符串 a, b, 求这两个字符串最长的公共子序列的长度。

a = 'weare', b = 'are' -> 很明显,是3。但如何给出一个通用的算法吗?

@xxleyi
Copy link
Owner Author

xxleyi commented Jun 25, 2019

貌似简明的递归

def LCS(n1, n2, a, b):
    if n1 < 0 or n2 < 0:
        return 0
    if a[n1] == b[n2]:
        return 1 + LCS(n1 - 1, n2 - 1, a, b)
    return max(LCS(n1 - 1, n2, a, b), LCS(n1, n2 - 1, a, b))
a = 'weare'
b = 'are'
LCS(len(a)-1, len(b)-1, a, b)
# 3
a = 'iwanttobeagreatman'
b = 'styleisthekeything'
LCS(len(a)-1, len(b)-1, a, b)
# 等待很久也没有出来结果

看来这个貌似简明的递归版重复计算太多太多。

@xxleyi
Copy link
Owner Author

xxleyi commented Jun 25, 2019

带 memoizatation 的繁冗但有效降低复杂度的递归版本

cache = {}

def LCS(n1, n2, a, b):

    if (n1, n2) in cache:
        return cache[(n1, n2)]

    if n1 < 0 or n2 < 0:
        res = 0
    elif a[n1] == b[n2]:
        res = 1 + LCS(n1 - 1, n2 - 1, a, b)
    else:
        res = max(LCS(n1 - 1, n2, a, b), LCS(n1, n2 - 1, a, b))

    cache[(n1, n2)] = res

    return res
a = 'iwanttobeagreatman'
b = 'styleisthekeything'
LCS(len(a)-1, len(b)-1, a, b)
# 瞬间出结果: 6

来看一下 cache 的数据结构:

{(-1, 0): 0,
 (-1, 1): 0,
 (-1, 2): 0,
 (-1, 3): 0,
 (-1, 4): 0,
 (-1, 6): 0,
 (-1, 7): 0,
 (-1, 8): 0,
 (-1, 9): 0,
 (-1, 10): 0,
 (-1, 11): 0,
 (-1, 12): 0,
 (-1, 13): 0,
 (-1, 14): 0,
 (0, -1): 0,
 (0, 0): 0,
 (0, 1): 0,
 (0, 2): 0,
 (0, 3): 0,
 (0, 4): 0,
 (0, 5): 1,
 (0, 6): 1,
 (0, 7): 1,
 (0, 8): 1,
 (0, 9): 1,
 (0, 10): 1,
 (0, 11): 1,
 (0, 12): 1,
 (0, 13): 1,
 (0, 14): 1,
 (0, 15): 1,
 (1, -1): 0,
 (1, 0): 0,
 (1, 1): 0,
 (1, 2): 0,
 (1, 3): 0,
 (1, 4): 0,
 (1, 5): 1,
 (1, 6): 1,
 (1, 7): 1,
 (1, 8): 1,
 (1, 9): 1,
 (1, 10): 1,
 (1, 11): 1,
 (1, 12): 1,
 (1, 13): 1,
 (1, 14): 1,
 (1, 15): 1,
 (2, -1): 0,
 (2, 0): 0,
 (2, 1): 0,
 (2, 2): 0,
 (2, 3): 0,
 (2, 4): 0,
 (2, 5): 1,
 (2, 6): 1,
 (2, 7): 1,
 (2, 8): 1,
 (2, 9): 1,
 (2, 10): 1,
 (2, 11): 1,
 (2, 12): 1,
 (2, 13): 1,
 (2, 14): 1,
 (2, 15): 1,
 (3, -1): 0,
 (3, 0): 0,
 (3, 1): 0,
 (3, 2): 0,
 (3, 3): 0,
 (3, 4): 0,
 (3, 5): 1,
 (3, 6): 1,
 (3, 7): 1,
 (3, 8): 1,
 (3, 9): 1,
 (3, 10): 1,
 (3, 11): 1,
 (3, 12): 1,
 (3, 13): 1,
 (3, 14): 1,
 (3, 15): 1,
 (3, 16): 2,
 (4, -1): 0,
 (4, 0): 0,
 (4, 1): 1,
 (4, 2): 1,
 (4, 3): 1,
 (4, 4): 1,
 (4, 5): 1,
 (4, 6): 1,
 (4, 7): 2,
 (4, 8): 2,
 (4, 9): 2,
 (4, 10): 2,
 (4, 11): 2,
 (4, 12): 2,
 (4, 13): 2,
 (4, 14): 2,
 (4, 15): 2,
 (4, 16): 2,
 (5, -1): 0,
 (5, 0): 0,
 (5, 1): 1,
 (5, 2): 1,
 (5, 3): 1,
 (5, 4): 1,
 (5, 5): 1,
 (5, 6): 1,
 (5, 7): 2,
 (5, 8): 2,
 (5, 9): 2,
 (5, 10): 2,
 (5, 11): 2,
 (5, 12): 2,
 (5, 13): 3,
 (5, 14): 3,
 (5, 15): 3,
 (5, 16): 3,
 (6, -1): 0,
 (6, 0): 0,
 (6, 1): 1,
 (6, 2): 1,
 (6, 3): 1,
 (6, 4): 1,
 (6, 5): 1,
 (6, 6): 1,
 (6, 7): 2,
 (6, 8): 2,
 (6, 9): 2,
 (6, 10): 2,
 (6, 11): 2,
 (6, 12): 2,
 (6, 13): 3,
 (6, 14): 3,
 (6, 15): 3,
 (6, 16): 3,
 (7, -1): 0,
 (7, 0): 0,
 (7, 1): 1,
 (7, 2): 1,
 (7, 3): 1,
 (7, 4): 1,
 (7, 5): 1,
 (7, 6): 1,
 (7, 7): 2,
 (7, 8): 2,
 (7, 9): 2,
 (7, 10): 2,
 (7, 11): 2,
 (7, 12): 2,
 (7, 13): 3,
 (7, 14): 3,
 (7, 15): 3,
 (7, 16): 3,
 (8, -1): 0,
 (8, 0): 0,
 (8, 1): 1,
 (8, 2): 1,
 (8, 3): 1,
 (8, 4): 2,
 (8, 5): 2,
 (8, 6): 2,
 (8, 7): 2,
 (8, 8): 2,
 (8, 9): 3,
 (8, 10): 3,
 (8, 11): 3,
 (8, 12): 3,
 (8, 13): 3,
 (8, 14): 3,
 (8, 15): 3,
 (8, 16): 3,
 (9, -1): 0,
 (9, 0): 0,
 (9, 1): 1,
 (9, 2): 1,
 (9, 3): 1,
 (9, 4): 2,
 (9, 5): 2,
 (9, 6): 2,
 (9, 7): 2,
 (9, 8): 2,
 (9, 9): 3,
 (9, 10): 3,
 (9, 11): 3,
 (9, 12): 3,
 (9, 13): 3,
 (9, 14): 3,
 (9, 15): 3,
 (9, 16): 3,
 (10, -1): 0,
 (10, 0): 0,
 (10, 1): 1,
 (10, 2): 1,
 (10, 3): 1,
 (10, 4): 2,
 (10, 5): 2,
 (10, 6): 2,
 (10, 7): 2,
 (10, 8): 2,
 (10, 9): 3,
 (10, 10): 3,
 (10, 11): 3,
 (10, 12): 3,
 (10, 13): 3,
 (10, 14): 3,
 (10, 15): 3,
 (10, 16): 3,
 (10, 17): 4,
 (11, -1): 0,
 (11, 0): 0,
 (11, 1): 1,
 (11, 2): 1,
 (11, 3): 1,
 (11, 4): 2,
 (11, 5): 2,
 (11, 6): 2,
 (11, 7): 2,
 (11, 8): 2,
 (11, 9): 3,
 (11, 10): 3,
 (11, 11): 3,
 (11, 12): 3,
 (11, 13): 3,
 (11, 14): 3,
 (11, 15): 3,
 (11, 16): 3,
 (11, 17): 4,
 (12, -1): 0,
 (12, 0): 0,
 (12, 1): 1,
 (12, 2): 1,
 (12, 3): 1,
 (12, 4): 2,
 (12, 5): 2,
 (12, 6): 2,
 (12, 7): 2,
 (12, 8): 2,
 (12, 9): 3,
 (12, 10): 3,
 (12, 11): 4,
 (12, 12): 4,
 (12, 13): 4,
 (12, 14): 4,
 (12, 15): 4,
 (12, 16): 4,
 (12, 17): 4,
 (13, -1): 0,
 (13, 0): 0,
 (13, 1): 1,
 (13, 2): 1,
 (13, 3): 1,
 (13, 4): 2,
 (13, 5): 2,
 (13, 6): 2,
 (13, 7): 2,
 (13, 8): 2,
 (13, 9): 3,
 (13, 10): 3,
 (13, 11): 4,
 (13, 12): 4,
 (13, 13): 4,
 (13, 14): 4,
 (13, 15): 4,
 (13, 16): 4,
 (13, 17): 4,
 (14, -1): 0,
 (14, 0): 0,
 (14, 1): 1,
 (14, 2): 1,
 (14, 3): 1,
 (14, 4): 2,
 (14, 5): 2,
 (14, 6): 2,
 (14, 7): 3,
 (14, 8): 3,
 (14, 9): 3,
 (14, 10): 3,
 (14, 11): 4,
 (14, 12): 4,
 (14, 13): 5,
 (14, 14): 5,
 (14, 15): 5,
 (14, 16): 5,
 (14, 17): 5,
 (15, -1): 0,
 (15, 0): 0,
 (15, 1): 1,
 (15, 2): 1,
 (15, 3): 1,
 (15, 4): 2,
 (15, 5): 2,
 (15, 6): 2,
 (15, 7): 3,
 (15, 8): 3,
 (15, 9): 3,
 (15, 10): 3,
 (15, 11): 4,
 (15, 12): 4,
 (15, 13): 5,
 (15, 14): 5,
 (15, 15): 5,
 (15, 16): 5,
 (15, 17): 5,
 (16, -1): 0,
 (16, 0): 0,
 (16, 1): 1,
 (16, 2): 1,
 (16, 3): 1,
 (16, 4): 2,
 (16, 5): 2,
 (16, 6): 2,
 (16, 7): 3,
 (16, 8): 3,
 (16, 9): 3,
 (16, 10): 3,
 (16, 11): 4,
 (16, 12): 4,
 (16, 13): 5,
 (16, 14): 5,
 (16, 15): 5,
 (16, 16): 5,
 (16, 17): 5,
 (17, 16): 6,
 (17, 17): 6}

a b 的长度正好都是 18。

转制为 19 x 19 的表格,换个角度看一下:

[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 0],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 0],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 0],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 0],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 0],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 0],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6]]

@xxleyi
Copy link
Owner Author

xxleyi commented Jun 25, 2019

至拙至巧的迭代

根据上述 cache 的数据结构可以断定动态规划的可行性:初始化一个二维数组,然后从左上角迭代到右下角即可得出答案。复杂度为 len(a) * len(b)。

def LCS(n1, n2, a, b):

    T = [[0] * (len(a) + 1) for _ in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i -1] == b[j - 1]:
                T[i][j] = T[i - 1][j - 1] + 1
            else:
                T[i][j] = max(T[i - 1][j], T[i][j - 1])

    print(T)

    return T[i][j]
a = 'iwanttobeagreatman'
b = 'styleisthekeything'
LCS(len(a), len(b), a, b)
# 注意这里传参规则发生了变化,导致函数签名不一致,应该优化成一样的,此处就不改了。
# 输出结果当然也是 6

看看迭代版产生的数据结构:

[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
 [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 5, 5],
 [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6]]

@xxleyi
Copy link
Owner Author

xxleyi commented Jun 25, 2019

递归和迭代产生的数据结构还是有不一样之处的。这也恰恰说明迭代版还有进一步提升性能的空间。不过那都已经是常数意义上的优化。

把动态规划这套解体思路再次运用一遍,力争由生到熟。

@xxleyi xxleyi added the HOW label Mar 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant