Python: comparing and processing differences between texts

2744557306 posted on 2024-3-31 11:12

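difflib is a module in the Python standard library for comparing and processing differences between texts. It provides functions and classes for generating diff reports, computing similarity ratios, finding the longest matching block, and similar operations.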
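
For a quick feel for the similarity side of the module, here is a minimal sketch using difflib.get_close_matches, a standard-library function this post does not otherwise cover. The word list is invented purely for illustration.

```python
import difflib

# Hypothetical candidate words, made up for this example
words = ['ape', 'apple', 'peach', 'puppy']

# Return the candidates most similar to the query, best match first
matches = difflib.get_close_matches('appel', words)
print(matches)
```

By default it returns at most three candidates whose similarity ratio is at least 0.6; both limits can be changed through the n and cutoff arguments.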
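Installation
difflib is part of the standard library, so nothing needs to be installed.

Common usage 1: comparing differences
import difflib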

text1 = "hello world"
text2 = "hello there"

diff = difflib.ndiff(text1, text2)
print('\n'.join(diff))

Common usage 2: comparing the differences between files
import difflib

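with open('file1.txt') as file1, open('file2.txt') as file2:
    diff = difflib.ndiff(file1.readlines(), file2.readlines())
# Lines from readlines() keep their trailing newline, so join with ''
print(''.join(diff))

Common usage 3: comparing the differences between lists
import difflib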

list1 = ['apple', 'banana', 'cherry']
list2 = ['apple', 'banana', 'kiwi']

diff = difflib.ndiff(list1, list2)
print('\n'.join(diff))
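
Each line produced by ndiff starts with a two-character code: '- ' for an item only in the first sequence, '+ ' for an item only in the second, '  ' for an item in both, and '? ' for a hint line. As a rough sketch (the removed and added names are just illustrative), those codes can be used to split the result into two lists:

```python
import difflib

list1 = ['apple', 'banana', 'cherry']
list2 = ['apple', 'banana', 'kiwi']

removed, added = [], []
for line in difflib.ndiff(list1, list2):
    code, item = line[:2], line[2:]
    if code == '- ':
        removed.append(item)   # present only in list1
    elif code == '+ ':
        added.append(item)     # present only in list2
    # '  ' (common) and '? ' (hint) lines are ignored here

print(removed)
print(added)
```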

Common usage 4: comparing string similarity
import difflib

text1 = "hello world"
text2 = "hello there"

similarity = difflib.SequenceMatcher(None, text1, text2).ratio()
print(similarity)
Output (a similarity of about 63.6%):
0.6363636363636364

Common usage 5: getting the matching blocks of two strings
import difflib

text1 = "hello world"
text2 = "hello there"

blocks = difflib.SequenceMatcher(None, text1, text2).get_matching_blocks()
print(blocks)
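Output:
[Match(a=0, b=0, size=6), Match(a=8, b=9, size=1), Match(a=11, b=11, size=0)]

Common usage 6: getting the longest matching block (longest common substring) of two strings
import difflib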
text1 = "hello world"
text2 = "hello there"

lcs = difflib.SequenceMatcher(None, text1, text2).find_longest_match(0, len(text1), 0, len(text2))
print(lcs)
Output:
Match(a=0, b=0, size=6)

To print the matched text itself:
import difflib

text1 = "hello world"
text2 = "hello there"

lcs = difflib.SequenceMatcher(None, text1, text2).find_longest_match(0, len(text1), 0, len(text2))
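# Slice the matched block out of text1
print(text1[lcs.a:lcs.a + lcs.size])
Output ("hello" followed by a space):
hello 

Common usage 7: comparing two strings and returning a context diff
import difflib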

text1 = "hello world"
text2 = "hello there"

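# context_diff expects sequences of lines, so split each string into lines first
diff = difflib.context_diff(text1.splitlines(), text2.splitlines(), lineterm='')
print('\n'.join(diff))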
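
The context format has a more compact sibling, the unified format, available through difflib.unified_diff. A minimal sketch on multi-line text follows; the sample strings and the old.txt / new.txt labels are made up for illustration and no files are actually read.

```python
import difflib

# Hypothetical two-line texts, split into lists of lines
old = "hello world\ngoodbye".splitlines()
new = "hello there\ngoodbye".splitlines()

# lineterm='' keeps the header lines free of trailing newlines,
# so the result can be joined with '\n' cleanly
diff = list(difflib.unified_diff(old, new,
                                 fromfile='old.txt', tofile='new.txt',
                                 lineterm=''))
print('\n'.join(diff))
```

Lines starting with '-' come from the first text, lines starting with '+' from the second, and lines starting with a space are unchanged context.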


Mathematical Modeling Community - 数学中国's Archiver
