Find the Most Repeated Substring in Java

Finding the most repeated substring across a collection of strings is a useful task in text analysis, data compression, and pattern recognition. In this Java program, we aim to identify the substring that occurs most frequently in all strings of a given array, helping developers practice substring generation and frequency counting.

Problem Statement

Write a Java program to identify the substring that appears most frequently across all strings in a given array. This problem helps developers practice substring generation, frequency counting, and the use of hash maps for tracking occurrences.

Example

The substring "an" appears multiple times across different strings ("banana", "bandana", "anagram"), more than any other substring.

Approach to Solve the Problem

  1. Iterate through each string in the array.
  2. Generate all possible substrings of each string (length > 1).
  3. Count occurrences of each substring using a HashMap.
  4. Track the substring that occurs the most.
  5. Return the result.

Java Program to Find the Most Repeated Substring

package com.javacodepoint.stringarray;

import java.util.HashMap;
import java.util.Map;

public class MostRepeatedSubstring {

	// Method to find the most repeated substring
	public static String findMostRepeatedSubstring(String[] strings) {
		Map<String, Integer> substringCount = new HashMap<>();

		for (String str : strings) {
			int len = str.length();
			// Generate all possible substrings of each string
			for (int i = 0; i < len; i++) {
				for (int j = i + 1; j <= len; j++) {
					String sub = str.substring(i, j);
					if (sub.length() > 1) { // ignoring single-character substrings
						substringCount.put(sub, substringCount.getOrDefault(sub, 0) + 1);
					}
				}
			}
		}

		// Find the substring with the highest frequency
		String mostRepeated = "";
		int maxCount = 0;

		for (Map.Entry<String, Integer> entry : substringCount.entrySet()) {
			if (entry.getValue() > maxCount) {
				maxCount = entry.getValue();
				mostRepeated = entry.getKey();
			}
		}

		return mostRepeated;
	}

	public static void main(String[] args) {
		String[] input = { "banana", "bandana", "anagram" };

		String result = findMostRepeatedSubstring(input);
		System.out.println("Most Repeated Substring: " + result);
	}
}

Explanation of Code

  • Step 1: Loop through each word in the input array.
  • Step 2: Generate all substrings using nested loops.
  • Step 3: Use a HashMap to track how many times each substring occurs.
  • Step 4: After processing all the strings, find the substring with the highest frequency.
  • Step 5: Return or print the result.

Conclusion

Finding the most repeated substring is a classic programming challenge that combines string manipulation, hashing, and logical thinking. This Java implementation is simple and effective for small to medium-sized datasets and lays the foundation for more advanced pattern-matching algorithms.

You can learn the Top 20 string array programs for interview preparation (Click here).

Java logical programs list


Java Basic Programs

Java String Programs

Java String Array Programs

Java Miscellaneous Programs

Java Programs based on the Collection Framework

Leave a Reply

Your email address will not be published. Required fields are marked *