当前位置：首页 > C++ > 正文

深入理解KMP算法（C++语言实现高效字符串匹配）

主机测评网
C++
2025-12-22
602

在计算机科学中，KMP算法（Knuth-Morris-Pratt算法）是一种高效的字符串匹配算法，用于在一个主串（文本）中查找一个模式串（子串）出现的位置。与暴力匹配不同，KMP算法通过预处理模式串构建“部分匹配表”（也称为next数组），避免了不必要的回溯，从而将时间复杂度从O(n×m)优化到O(n+m)。

深入理解KMP算法（C++语言实现高效字符串匹配） KMP算法 C++字符串匹配模式匹配算法高效字符串搜索第1张

为什么需要KMP算法？

假设我们要在字符串 "ABABABCABABABD" 中查找子串 "ABABABD"。如果使用暴力匹配法，一旦某一位不匹配，就要将模式串整体右移一位，重新从头开始比较。这在最坏情况下效率极低。

而KMP算法的核心思想是：当发生不匹配时，利用已匹配部分的信息，尽可能多地跳过一些字符，避免重复比较。这就依赖于我们预先计算出的 next 数组。

KMP算法的关键：构建next数组

next[i] 表示模式串中前 i 个字符组成的子串的最长相等前后缀的长度。例如，模式串 "ABABABD" 的 next 数组为：

模式串: A B A B A B D索引:   0 1 2 3 4 5 6next:  -1 0 0 1 2 3 4

注意：有些教材将 next[0] 定义为0，但本文采用更常见的 -1 起始方式，便于统一处理边界。

C++实现KMP算法

下面我们用C++语言完整实现KMP算法，包括 getNext 函数和 kmpSearch 函数。

#include <iostream>#include <vector>#include <string>// 构建next数组std::vector<int> getNext(const std::string& pattern) {    int n = pattern.length();    std::vector<int> next(n);    next[0] = -1;    int i = 0, j = -1;    while (i < n - 1) {        if (j == -1 || pattern[i] == pattern[j]) {            ++i;            ++j;            next[i] = j;        } else {            j = next[j];        }    }    return next;}// KMP搜索函数int kmpSearch(const std::string& text, const std::string& pattern) {    std::vector<int> next = getNext(pattern);    int i = 0, j = 0;    int n = text.length();    int m = pattern.length();    while (i < n && j < m) {        if (j == -1 || text[i] == pattern[j]) {            ++i;            ++j;        } else {            j = next[j];        }    }    if (j == m) {        return i - j; // 返回首次匹配位置    }    return -1; // 未找到}// 测试主函数int main() {    std::string text = "ABABABCABABABD";    std::string pattern = "ABABABD";    int pos = kmpSearch(text, pattern);    if (pos != -1) {        std::cout << "Pattern found at index: " << pos << std::endl;    } else {        std::cout << "Pattern not found." << std::endl;    }    return 0;}