<Go Back

KMP Search

Houssem

June 18, 2021

KMP Search

The Knutt-Morris-Pratt algorithm is a Pattern Matching Substring Search algorithm. It intelligently traverses the longer string to reduce the amount of redundant searching. This method is more efficient than the regular search algorithm (the naive approach). At a high level, the KMP algorithm is similar to the naive algorithm: it considers shifts in order from 1 to n−m, and determines if the pattern matches at that shift. The difference is that the KMP algorithm uses information gleaned from partial matches of the pattern and text to skip over shifts that are guaranteed not to result in a match. (Click here for more)

KMP Search Pseudocode

KMP Search Implementation

Python

def compute_lps_array(short, m, lps_arr):
    i = 0
    j = 1
    lps_arr[0] = 0
    while j < m:
        if short[j] == short[i]:
            lps_arr[j] = i + 1
            i += 1
            j += 1
        else:
            if i != 0:
                i = lps_arr[i-1]
            else:
                lps_arr[j] = 0
                j += 1

def KMP_search(large, short):
    count = 0
    n = len(large)
    m = len(short)
    lps_arr = [0]*m
    compute_lps_array(short, m, lps_arr)
    i = 0
    j = 0
    while i < n:
        if large[i] == short[j]:
            i += 1
            j += 1
        else:
            if j != 0:
                j = lps_arr[j-1]
            else:
                i += 1
        if j == m:
            count += 1
            j = lps_arr[j-1]
    return count

Time Complexity: Big(O)

Worst case: O(N+M): where N is the length of the main string and M is the length of the substring.