Finding the most repeated substring across a collection of strings is a useful task in text analysis, data compression, and pattern recognition. In this Java program, we aim to identify the substring that occurs most frequently in all strings of a given array, helping developers practice substring generation and frequency counting.
Problem Statement
Write a Java program to identify the substring that appears most frequently across all strings in a given array. This problem helps developers practice substring generation, frequency counting, and the use of hash maps for tracking occurrences.
Example
The substring "an"
appears multiple times across different strings ("banana"
, "bandana"
, "anagram"
), more than any other substring.
Input:
String[] input = {“banana”, “bandana”, “anagram”};
Output:
Most Repeated Substring: “an”
Approach to Solve the Problem
- Iterate through each string in the array.
- Generate all possible substrings of each string (length > 1).
- Count occurrences of each substring using a HashMap.
- Track the substring that occurs the most.
- Return the result.
Java Program to Find the Most Repeated Substring
package com.javacodepoint.stringarray;
import java.util.HashMap;
import java.util.Map;
public class MostRepeatedSubstring {
// Method to find the most repeated substring
public static String findMostRepeatedSubstring(String[] strings) {
Map<String, Integer> substringCount = new HashMap<>();
for (String str : strings) {
int len = str.length();
// Generate all possible substrings of each string
for (int i = 0; i < len; i++) {
for (int j = i + 1; j <= len; j++) {
String sub = str.substring(i, j);
if (sub.length() > 1) { // ignoring single-character substrings
substringCount.put(sub, substringCount.getOrDefault(sub, 0) + 1);
}
}
}
}
// Find the substring with the highest frequency
String mostRepeated = "";
int maxCount = 0;
for (Map.Entry<String, Integer> entry : substringCount.entrySet()) {
if (entry.getValue() > maxCount) {
maxCount = entry.getValue();
mostRepeated = entry.getKey();
}
}
return mostRepeated;
}
public static void main(String[] args) {
String[] input = { "banana", "bandana", "anagram" };
String result = findMostRepeatedSubstring(input);
System.out.println("Most Repeated Substring: " + result);
}
}
OUTPUT:
Most Repeated Substring: an
Explanation of Code
- Step 1: Loop through each word in the input array.
- Step 2: Generate all substrings using nested loops.
- Step 3: Use a HashMap to track how many times each substring occurs.
- Step 4: After processing all the strings, find the substring with the highest frequency.
- Step 5: Return or print the result.
Conclusion
Finding the most repeated substring is a classic programming challenge that combines string manipulation, hashing, and logical thinking. This Java implementation is simple and effective for small to medium-sized datasets and lays the foundation for more advanced pattern-matching algorithms.
You can learn the Top 20 string array programs for interview preparation (Click here).