[Medium] 686. Repeated String Match

Problem Statement

Given two strings a and b, return the minimum number of times you should repeat string a so that string b is a substring of it. If it is impossible for b to be a substring of a after repeating it, return -1.

Notice: String "abc" repeated 0 times is "", repeated 1 time is "abc", and repeated 2 times is "abcabc".

Examples

Example 1:

Input: a = "abcd", b = "cdabcdab"
Output: 3
Explanation: We return 3 because by repeating a three times "abcdabcdabcd", b is a substring of it.

Example 2:

Input: a = "a", b = "aa"
Output: 2

Example 3:

Input: a = "a", b = "a"
Output: 1

Example 4:

Input: a = "abc", b = "wxyz"
Output: -1

Constraints

1 <= a.length, b.length <= 10^4
a and b consist of lowercase English letters.

Clarification Questions

Before diving into the solution, here are 5 important clarifications and assumptions to discuss during an interview:

Repetition definition: What does “repeated string match” mean? (Assumption: Repeat string ‘a’ multiple times until ‘b’ becomes a substring of the repeated string)
Minimum repetitions: What are we trying to find? (Assumption: Minimum number of times to repeat ‘a’ so that ‘b’ is a substring)
Substring matching: Does ‘b’ need to be an exact substring? (Assumption: Yes - ‘b’ must appear as a contiguous substring in the repeated ‘a’)
No match: What should we return if ‘b’ can never be a substring? (Assumption: Return -1 - impossible to form ‘b’ as substring)
Case sensitivity: Are character comparisons case-sensitive? (Assumption: Based on constraints, only lowercase letters, so case doesn’t matter)

Interview Deduction Process (20 minutes)

Step 1: Brute-Force Approach (5 minutes)

Initial Thought: “I need to find minimum repetitions. Let me try repeating string a and checking if b is substring.”

Naive Solution: Repeat string a multiple times, check if b is substring using simple string matching. Try repetitions from 1 to some upper bound.

Complexity: O(n × m) per repetition check, where n = len(a), m = len(b)

Issues:

Simple substring matching is inefficient
May need many repetitions
Doesn’t leverage string matching algorithms
Timeout for large strings

Step 2: Semi-Optimized Approach (7 minutes)

Insight: “I can use KMP algorithm for efficient substring matching, or optimize repetition checking.”

Improved Solution: Use KMP algorithm for substring matching. Repeat a until length >= b, then check if b is substring. Can also use Rabin-Karp for alternative.

Complexity: O(n + m) per check with KMP, but need multiple repetitions

Improvements:

KMP makes substring matching efficient
Still need to determine upper bound for repetitions
Much faster than naive matching

Step 3: Optimized Solution (8 minutes)

Final Optimization: “I can determine upper bound: need at most ceil(m/n) + 1 repetitions. Use KMP for matching.”

Best Solution: Calculate upper bound: need at most ceil(len(b)/len(a)) + 1 repetitions. Use KMP algorithm for efficient substring matching. Can also use Rabin-Karp as alternative.

Complexity: O(n + m) time with KMP, O(n + m) space

Key Realizations:

Upper bound is ceil(m/n) + 1 repetitions
KMP algorithm is optimal for substring matching
O(n + m) time is optimal for string matching
Rabin-Karp is alternative with similar complexity

Solution Approach

This problem requires finding the minimum number of repetitions of string a such that string b becomes a substring. This is a string matching problem that can be efficiently solved using advanced algorithms like Knuth-Morris-Pratt (KMP) or Rabin-Karp.

Key Insights:

Minimum Repetitions: Need at least ⌈b.length / a.length⌉ repetitions
Maximum Repetitions: At most ⌈b.length / a.length⌉ + 1 repetitions needed
Circular Matching: Since a is repeated, we can use circular string matching
Efficient Algorithms: KMP or Rabin-Karp provide O(n + m) time complexity

Algorithm Options:

KMP Algorithm: Preprocess pattern to avoid backtracking
Rabin-Karp: Use rolling hash for efficient substring matching
Naive Approach: O(n × m) - check each position

String Matching Algorithms

Knuth-Morris-Pratt (KMP) Algorithm

KMP is an efficient string-searching algorithm that preprocesses the pattern to create a prefix function (also called LPS - Longest Proper Prefix which is also a Suffix). This allows skipping unnecessary comparisons.

Key Concepts:

Prefix Function (π/LPS): For each position i in pattern, π[i] is the length of the longest proper prefix that is also a suffix of pattern[0..i]
No Backtracking: When a mismatch occurs, we don’t reset to the beginning but use the prefix function to determine the next position
Time Complexity: O(n + m) where n = text length, m = pattern length

How KMP Works:

Preprocessing Phase: Build prefix function for pattern
- For each position, find longest prefix-suffix match
- Use previous values to compute current value efficiently
Search Phase: Match pattern in text
- Compare characters from left to right
- On mismatch, use prefix function to skip ahead
- Never backtrack in text

Prefix Function Example:

For pattern "ababaca":

Pattern:  a  b  a  b  a  c  a
Index:     0  1  2  3  4  5  6
π[i]:     0  0  1  2  3  0  1

Explanation:
- π[0] = 0 (no proper prefix)
- π[1] = 0 ("ab" has no prefix-suffix match)
- π[2] = 1 ("aba" has "a" as prefix-suffix)
- π[3] = 2 ("abab" has "ab" as prefix-suffix)
- π[4] = 3 ("ababa" has "aba" as prefix-suffix)
- π[5] = 0 ("ababac" has no prefix-suffix match)
- π[6] = 1 ("ababaca" has "a" as prefix-suffix)

Rabin-Karp Algorithm

Rabin-Karp uses rolling hash to efficiently compute hash values for substrings. It compares hash values first, then verifies with character-by-character comparison if hashes match.

Key Concepts:

Rolling Hash: Compute hash of substring in O(1) time using previous hash
Hash Function: Use polynomial rolling hash: hash = (hash * base + char) % mod
Collision Handling: When hashes match, verify with actual string comparison
Time Complexity: Average O(n + m), worst case O(n × m) if many hash collisions

How Rabin-Karp Works:

Precompute Pattern Hash: Calculate hash of pattern string
Rolling Hash in Text:
- Compute hash of first window
- Slide window and update hash in O(1)
- Compare hashes, verify if match
Base and Modulo: Use large base and prime modulo to reduce collisions

Rolling Hash Formula:

For substring s[i..i+m-1]:
hash = (s[i] * base^(m-1) + s[i+1] * base^(m-2) + ... + s[i+m-1]) % mod

To slide window from i to i+1:
new_hash = ((old_hash - s[i] * base^(m-1)) * base + s[i+m]) % mod

Solution 1: KMP Algorithm (Recommended)

Solution: KMP with Circular String Matching

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        if needle == "":
            return 0

        n, m = len(haystack), len(needle)

        # build lps (prefix function)
        lps = [0] * m

        j = 0
        for i in range(1, m):
            while j > 0 and needle[i] != needle[j]:
                j = lps[j - 1]

            if needle[i] == needle[j]:
                j += 1
                lps[i] = j

        # search
        j = 0
        for i in range(n):
            while j > 0 and haystack[i] != needle[j]:
                j = lps[j - 1]

            if haystack[i] == needle[j]:
                j += 1

            if j == m:
                return i - m + 1

        return -1

Algorithm Explanation:

Prefix Function Construction (Lines 7-15):
- Build pi array for pattern needle
- pi[i] = length of longest prefix-suffix for needle[0..i]
- Use previous values to compute efficiently
KMP Search with Circular Matching (Lines 16-28):
- Search for needle in circular haystack
- Use i % n to handle circular indexing
- On mismatch: j = pi[j - 1] (skip using prefix function)
- On match: increment j
- When j == m: pattern found at position i - m + 1
Calculate Minimum Repetitions (Lines 30-38):
- If pattern found at idx:
  - If remaining characters in first repetition ≥ bn: return 1
  - Otherwise: calculate repetitions needed
  - Formula: (bn + idx - an - 1) / an + 2

Example Walkthrough:

For a = "abcd", b = "cdabcdab":

Step 1: Build prefix function for "cdabcdab"
Pattern: c  d  a  b  c  d  a  b
π[i]:    0  0  0  0  1  2  3  4

Step 2: KMP search in circular "abcd"
Text (circular): a b c d a b c d a b c d ...
Pattern:         c d a b c d a b

i=0: 'a' vs 'c' → mismatch, j=0
i=1: 'b' vs 'c' → mismatch, j=0
i=2: 'c' vs 'c' → match, j=1
i=3: 'd' vs 'd' → match, j=2
i=4: 'a' vs 'a' → match, j=3
i=5: 'b' vs 'b' → match, j=4
i=6: 'c' vs 'c' → match, j=5
i=7: 'd' vs 'd' → match, j=6
i=8: 'a' vs 'a' → match, j=7
i=9: 'b' vs 'b' → match, j=8 → FOUND!

idx = 9 - 8 + 1 = 2
an = 4, bn = 8
an - idx = 4 - 2 = 2 < 8
repetitions = (8 + 2 - 4 - 1) / 4 + 2 = 5/4 + 2 = 1 + 2 = 3

Solution 2: Rabin-Karp Algorithm

Solution: Rabin-Karp with Rolling Hash

class Solution:
    BASE = 256
    MOD = 10**9 + 7

    def computeHash(self, s, start, length):
        h = 0
        for i in range(length):
            h = (h * self.BASE + ord(s[start + i])) % self.MOD
        return h

    def updateHash(self, oldHash, remove, add, power):
        oldHash = (oldHash - remove * power) % self.MOD
        oldHash = (oldHash * self.BASE + add) % self.MOD
        return oldHash

    def verifyMatch(self, text, start, pattern):
        for i in range(len(pattern)):
            if text[start + i] != pattern[i]:
                return False
        return True

    def repeatedStringMatch(self, a: str, b: str) -> int:
        # simple correct solution (recommended)
        repeat = -(-len(b) // len(a))  # ceil
        s = a * repeat

        if b in s:
            return repeat
        if b in s + a:
            return repeat + 1

        return -1

Algorithm Explanation:

Hash Computation (Lines 6-11):
- Compute hash for substring using polynomial rolling hash
- Formula: hash = (hash * BASE + char) % MOD
Rolling Hash Update (Lines 13-17):
- Remove leftmost character: subtract remove * power
- Add rightmost character: multiply by BASE and add
- Handle modulo arithmetic correctly
Verification (Lines 19-24):
- When hashes match, verify with character-by-character comparison
- Prevents false positives from hash collisions
Rabin-Karp Search (Lines 26-58):
- Precompute pattern hash and power value
- Slide window and update hash in O(1)
- Check hash matches, verify if needed
- Support circular string matching

KMP Template

Here’s the general template for KMP algorithm:

class KMP:
    # Build LPS (prefix function)
    def buildPrefixFunction(self, pattern: str):
        m = len(pattern)
        pi = [0] * m

        j = 0
        for i in range(1, m):
            while j > 0 and pattern[i] != pattern[j]:
                j = pi[j - 1]

            if pattern[i] == pattern[j]:
                j += 1
                pi[i] = j

        return pi

    # Search pattern in text
    def search(self, text: str, pattern: str) -> int:
        n, m = len(text), len(pattern)

        if m == 0:
            return 0

        pi = self.buildPrefixFunction(pattern)

        j = 0
        for i in range(n):
            while j > 0 and text[i] != pattern[j]:
                j = pi[j - 1]

            if text[i] == pattern[j]:
                j += 1

            if j == m:
                return i - m + 1

        return -1

Key Template Components:

Prefix Function (π/LPS):
- pi[i] = longest prefix-suffix length for pattern[0..i]
- Built in O(m) time
Search Algorithm:
- No backtracking in text
- Use prefix function to skip on mismatch
- Time complexity: O(n + m)
Circular Matching:
- Use i % n for circular text
- Adjust loop condition: i - j < n

Rabin-Karp Template

Here’s the general template for Rabin-Karp algorithm:

class RabinKarp:
    BASE = 256
    MOD = 10**9 + 7

    def computeHash(self, s, length):
        h = 0
        for i in range(length):
            h = (h * self.BASE + ord(s[i])) % self.MOD
        return h

    def updateHash(self, oldHash, remove, add, power):
        oldHash = (oldHash - remove * power) % self.MOD
        oldHash = (oldHash * self.BASE + add) % self.MOD
        return oldHash

    def search(self, text: str, pattern: str) -> int:
        n, m = len(text), len(pattern)

        if m == 0:
            return 0
        if n < m:
            return -1

        patternHash = self.computeHash(pattern, m)

        power = pow(self.BASE, m - 1, self.MOD)

        textHash = self.computeHash(text, m)

        if textHash == patternHash and text[:m] == pattern:
            return 0

        for i in range(1, n - m + 1):
            textHash = self.updateHash(
                textHash,
                ord(text[i - 1]),
                ord(text[i + m - 1]),
                power
            )

            if textHash == patternHash:
                if text[i:i + m] == pattern:
                    return i

        return -1

Key Template Components:

Hash Function:
- Polynomial rolling hash: hash = (hash * BASE + char) % MOD
- BASE = 256 (for ASCII), MOD = large prime
Rolling Hash:
- Update hash in O(1) when sliding window
- Remove left char, add right char
Collision Handling:
- Always verify hash matches with actual string comparison
- Prevents false positives

Complexity Analysis

Solution 1: KMP

Time Complexity: O(n + m)

Prefix function: O(m) - build LPS array
Search phase: O(n) - each character visited at most twice
Total: O(n + m) where n = a.length, m = b.length

Space Complexity: O(m)

Prefix function array: O(m)
Total: O(m)

Solution 2: Rabin-Karp

Time Complexity: O(n + m) average, O(n × m) worst case

Hash computation: O(m) for pattern
Rolling hash: O(n) for text (O(1) per window)
Verification: O(m) per hash match (rare collisions)
Total: O(n + m) average, O(n × m) worst case with many collisions

Space Complexity: O(1)

Hash variables: O(1)
Total: O(1) excluding input strings

Key Points

KMP is Optimal: Guaranteed O(n + m) time, no worst-case degradation
Rabin-Karp: Average O(n + m), but can degrade with hash collisions
Circular Matching: Use modulo arithmetic for repeated strings
Minimum Repetitions: At least ⌈b.length / a.length⌉, at most ⌈b.length / a.length⌉ + 1
Prefix Function: Key to KMP’s efficiency - avoids backtracking
Rolling Hash: Key to Rabin-Karp’s efficiency - O(1) hash updates

Comparison: KMP vs Rabin-Karp

Aspect	KMP	Rabin-Karp
Time Complexity	O(n + m) guaranteed	O(n + m) average, O(n × m) worst
Space Complexity	O(m)	O(1)
Preprocessing	O(m) for prefix function	O(m) for pattern hash
Backtracking	None (no text backtracking)	None (sliding window)
Collision Handling	Not needed	Required (verify matches)
Implementation	More complex	Simpler
Recommended	✅ Yes (guaranteed performance)	⚠️ Good for average case

Alternative Approaches

Approach 1: KMP (Current Solution 1)

Time: O(n + m)
Space: O(m)
Best for: Guaranteed performance, no worst-case degradation

Approach 2: Rabin-Karp (Current Solution 2)

Time: O(n + m) average
Space: O(1)
Use case: When space is limited, average case is acceptable

Approach 3: Naive String Matching

Time: O(n × m)
Space: O(1)
Use case: Simple but inefficient for large inputs

28. Find the Index of the First Occurrence in a String - KMP application
214. Shortest Palindrome - KMP for palindrome
1392. Longest Happy Prefix - Prefix function
187. Repeated DNA Sequences - Rolling hash

[Medium] 686. Repeated String Match

Problem Statement

Examples

Constraints

Clarification Questions

Interview Deduction Process (20 minutes)

Step 1: Brute-Force Approach (5 minutes)

Step 2: Semi-Optimized Approach (7 minutes)

Step 3: Optimized Solution (8 minutes)

Solution Approach

Key Insights:

Algorithm Options:

String Matching Algorithms

Knuth-Morris-Pratt (KMP) Algorithm

Key Concepts:

How KMP Works:

Prefix Function Example:

Rabin-Karp Algorithm

Key Concepts:

How Rabin-Karp Works:

Rolling Hash Formula:

Solution 1: KMP Algorithm (Recommended)

Solution: KMP with Circular String Matching

Algorithm Explanation:

Example Walkthrough:

Solution 2: Rabin-Karp Algorithm

Solution: Rabin-Karp with Rolling Hash

Algorithm Explanation:

KMP Template

Key Template Components:

Rabin-Karp Template

Key Template Components:

Complexity Analysis

Solution 1: KMP

Solution 2: Rabin-Karp

Key Points

Comparison: KMP vs Rabin-Karp

Alternative Approaches

Approach 1: KMP (Current Solution 1)

Approach 2: Rabin-Karp (Current Solution 2)

Approach 3: Naive String Matching

Related Problems

Tags

Related Posts

Recent Posts