Java Program to Implement Aho-Corasick Algorithm for String Matching

This is a java program to implement Aho-Corasick Algorithm. It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings (the “dictionary”) within an input text. It matches all patterns simultaneously. The complexity of the algorithm is linear in the length of the patterns plus the length of the searched text plus the number of output matches. Note that because all matches are found, there can be a quadratic number of matches if every substring matches (e.g. dictionary = a, aa, aaa, aaaa and input string is aaaa).

Here is the source code of the Java Program to Implement Aho-Corasick Algorithm for String Matching. The Java program is successfully compiled and run on a Windows system. The program output is also shown below.

package com.maixuanviet.setandstring;
 
import java.util.Arrays;
import java.util.Scanner;
 
public class StringSearchUsingAhoCorasickAlgo
{
    static final int ALPHABET_SIZE = 26;
    Node[]           nodes;
    int              nodeCount;
 
    public static class Node
    {
        int     parent;
        char    charFromParent;
        int     suffLink    = -1;
        int[]   children    = new int[ALPHABET_SIZE];
        int[]   transitions = new int[ALPHABET_SIZE];
        boolean leaf;
        {
            Arrays.fill(children, -1);
            Arrays.fill(transitions, -1);
        }
    }
 
    public StringSearchUsingAhoCorasickAlgo(int maxNodes)
    {
        nodes = new Node[maxNodes];
        // create root
        nodes[0] = new Node();
        nodes[0].suffLink = 0;
        nodes[0].parent = -1;
        nodeCount = 1;
    }
 
    public void addString(String s)
    {
        int cur = 0;
        for (char ch : s.toCharArray())
        {
            int c = ch - 'a';
            if (nodes[cur].children == -1)
            {
                nodes[nodeCount] = new Node();
                nodes[nodeCount].parent = cur;
                nodes[nodeCount].charFromParent = ch;
                nodes[cur].children = nodeCount++;
            }
            cur = nodes[cur].children;
        }
        nodes[cur].leaf = true;
    }
 
    public int suffLink(int nodeIndex)
    {
        Node node = nodes[nodeIndex];
        if (node.suffLink == -1)
            node.suffLink = node.parent == 0 ? 0 : transition(
                    suffLink(node.parent), node.charFromParent);
        return node.suffLink;
    }
 
    public int transition(int nodeIndex, char ch)
    {
        int c = ch - 'a';
        Node node = nodes[nodeIndex];
        if (node.transitions == -1)
            node.transitions = node.children != -1 ? node.children
                    : (nodeIndex == 0 ? 0 : transition(suffLink(nodeIndex), ch));
        return node.transitions;
    }
 
    // Usage example
    public static void main(String[] args)
    {
        StringSearchUsingAhoCorasickAlgo ahoCorasick = new StringSearchUsingAhoCorasickAlgo(
                1000);
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the main string: ");
        String main = sc.nextLine().toLowerCase().trim();
        System.out.println("Enter the pattern string: ");
        String pattern = sc.nextLine().toLowerCase().trim();
        ahoCorasick.addString(pattern);
        int node = 0;
        for (char ch : main.toCharArray())
        {
            node = ahoCorasick.transition(node, ch);
        }
        if (ahoCorasick.nodes[node].leaf)
            System.out.println("A '" + pattern + "' string is substring of "
                    + main + " string.");
        else
            System.out.println("A '" + pattern
                    + "' string is not substring of " + main + " string.");
        sc.close();
    }
}

Output:

$ javac StringSearchUsingAhoCorasickAlgo.java
$ java StringSearchUsingAhoCorasickAlgo
 
Enter the main string: 
Sanfoundry
Enter the pattern string: 
Asanfoundry
A 'asanfoundry' string is not substring of sanfoundry string.

Related posts:

Java Program to Create a Balanced Binary Tree of the Incoming Data
A Guide to Java HashMap
Allow user:password in URL
Spring’s RequestBody and ResponseBody Annotations
Guide to the Java Queue Interface
Spring REST API + OAuth2 + Angular
Guide to CountDownLatch in Java
Working With Maps Using Streams
Java Program to Use rand and srand Functions
Java Program to Compute the Area of a Triangle Using Determinants
Java Program to Compare Binary and Sequential Search
Java Program to Check whether Graph is a Bipartite using 2 Color Algorithm
Java Program to Check Whether an Undirected Graph Contains a Eulerian Cycle
Java Program to Check whether Directed Graph is Connected using DFS
Spring Boot Integration Testing with Embedded MongoDB
Spring Cloud – Tracing Services with Zipkin
Java Program to Implement Strassen Algorithm
Java Program to Use Boruvka’s Algorithm to Find the Minimum Spanning Tree
Java Program to Perform Insertion in a 2 Dimension K-D Tree
Function trong Java 8
Using Spring @ResponseStatus to Set HTTP Status Code
Java Program to implement Circular Buffer
Check If a String Is Numeric in Java
Java Program to Test Using DFS Whether a Directed Graph is Strongly Connected or Not
Comparing Long Values in Java
Java – Get Random Item/Element From a List
How to Read HTTP Headers in Spring REST Controllers
Circular Dependencies in Spring
The Spring @Controller and @RestController Annotations
Java Multi-line String
The Basics of Java Security
Java Program to Perform the Unique Factorization of a Given Number