Java Program to Implement Aho-Corasick Algorithm for String Matching

This is a java program to implement Aho-Corasick Algorithm. It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings (the “dictionary”) within an input text. It matches all patterns simultaneously. The complexity of the algorithm is linear in the length of the patterns plus the length of the searched text plus the number of output matches. Note that because all matches are found, there can be a quadratic number of matches if every substring matches (e.g. dictionary = a, aa, aaa, aaaa and input string is aaaa).

Here is the source code of the Java Program to Implement Aho-Corasick Algorithm for String Matching. The Java program is successfully compiled and run on a Windows system. The program output is also shown below.

package com.maixuanviet.setandstring;
 
import java.util.Arrays;
import java.util.Scanner;
 
public class StringSearchUsingAhoCorasickAlgo
{
    static final int ALPHABET_SIZE = 26;
    Node[]           nodes;
    int              nodeCount;
 
    public static class Node
    {
        int     parent;
        char    charFromParent;
        int     suffLink    = -1;
        int[]   children    = new int[ALPHABET_SIZE];
        int[]   transitions = new int[ALPHABET_SIZE];
        boolean leaf;
        {
            Arrays.fill(children, -1);
            Arrays.fill(transitions, -1);
        }
    }
 
    public StringSearchUsingAhoCorasickAlgo(int maxNodes)
    {
        nodes = new Node[maxNodes];
        // create root
        nodes[0] = new Node();
        nodes[0].suffLink = 0;
        nodes[0].parent = -1;
        nodeCount = 1;
    }
 
    public void addString(String s)
    {
        int cur = 0;
        for (char ch : s.toCharArray())
        {
            int c = ch - 'a';
            if (nodes[cur].children == -1)
            {
                nodes[nodeCount] = new Node();
                nodes[nodeCount].parent = cur;
                nodes[nodeCount].charFromParent = ch;
                nodes[cur].children = nodeCount++;
            }
            cur = nodes[cur].children;
        }
        nodes[cur].leaf = true;
    }
 
    public int suffLink(int nodeIndex)
    {
        Node node = nodes[nodeIndex];
        if (node.suffLink == -1)
            node.suffLink = node.parent == 0 ? 0 : transition(
                    suffLink(node.parent), node.charFromParent);
        return node.suffLink;
    }
 
    public int transition(int nodeIndex, char ch)
    {
        int c = ch - 'a';
        Node node = nodes[nodeIndex];
        if (node.transitions == -1)
            node.transitions = node.children != -1 ? node.children
                    : (nodeIndex == 0 ? 0 : transition(suffLink(nodeIndex), ch));
        return node.transitions;
    }
 
    // Usage example
    public static void main(String[] args)
    {
        StringSearchUsingAhoCorasickAlgo ahoCorasick = new StringSearchUsingAhoCorasickAlgo(
                1000);
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the main string: ");
        String main = sc.nextLine().toLowerCase().trim();
        System.out.println("Enter the pattern string: ");
        String pattern = sc.nextLine().toLowerCase().trim();
        ahoCorasick.addString(pattern);
        int node = 0;
        for (char ch : main.toCharArray())
        {
            node = ahoCorasick.transition(node, ch);
        }
        if (ahoCorasick.nodes[node].leaf)
            System.out.println("A '" + pattern + "' string is substring of "
                    + main + " string.");
        else
            System.out.println("A '" + pattern
                    + "' string is not substring of " + main + " string.");
        sc.close();
    }
}

Output:

$ javac StringSearchUsingAhoCorasickAlgo.java
$ java StringSearchUsingAhoCorasickAlgo
 
Enter the main string: 
Sanfoundry
Enter the pattern string: 
Asanfoundry
A 'asanfoundry' string is not substring of sanfoundry string.

Related posts:

Debugging Reactive Streams in Java
Multi Dimensional ArrayList in Java
Getting Started with Stream Processing with Spring Cloud Data Flow
Registration – Password Strength and Rules
Java Program to Find kth Smallest Element by the Method of Partitioning the Array
Java Program to Find the Longest Subsequence Common to All Sequences in a Set of Sequences
Spring Cloud AWS – RDS
Jackson – Marshall String to JsonNode
The HttpMediaTypeNotAcceptableException in Spring MVC
Getting Started with Custom Deserialization in Jackson
Java Program to Implement Maximum Length Chain of Pairs
Java Program to Perform Naive String Matching
Reversing a Linked List in Java
Introduction to Eclipse Collections
Java Program to Implement Treap
Java – Combine Multiple Collections
Spring Boot - Hystrix
Java Program to Check Whether Topological Sorting can be Performed in a Graph
Converting between an Array and a List in Java
The SpringJUnitConfig and SpringJUnitWebConfig Annotations in Spring 5
Convert String to int or Integer in Java
Converting String to Stream of chars
Java Program to Check whether Directed Graph is Connected using DFS
Java Program to Perform Cryptography Using Transposition Technique
Guide to the Volatile Keyword in Java
Java Program to Find the Minimum Element of a Rotated Sorted Array using Binary Search approach
Java Program to Implement Kosaraju Algorithm
Java Program to Implement Booth Algorithm
Template Engines for Spring
Java Program to Implement DelayQueue API
Java – Delete a File
LinkedHashSet trong java