An autocomplete and autocorrect program written in Java 8.

kashiish/autotext

AutoText

AutoText is a small autocomplete and autocorrect program written in Java 8. I created this project to use and understand data structures that were new to me.

A Trie is an efficient data structure for prefix searching, which is why it is widely used in autocomplete programs.
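To make the prefix search concrete, here is a minimal sketch of a trie that returns a capped number of completions. It is an illustration only, not the code in this repo: the class name `TrieSketch`, its methods, and the sorted `TreeMap` children are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a trie; not the AutoText implementation.
class TrieSketch {

    private static final class Node {
        // TreeMap keeps children sorted, so completions come out alphabetically.
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    // Adds a word by walking (and creating) one node per character.
    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    // Returns up to max stored words that start with the given prefix.
    List<String> complete(String prefix, int max) {
        List<String> out = new ArrayList<>();
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return out; // no stored word has this prefix
            }
        }
        collect(node, new StringBuilder(prefix), max, out);
        return out;
    }

    // Depth-first walk below the prefix node, stopping once max words are found.
    private void collect(Node node, StringBuilder path, int max, List<String> out) {
        if (out.size() >= max) {
            return;
        }
        if (node.isWord) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, max, out);
            path.setLength(path.length() - 1);
        }
    }

    public static void main(String[] args) {
        TrieSketch trie = new TrieSketch();
        for (String w : new String[] {"quest", "question", "questions", "queue", "quiet"}) {
            trie.insert(w);
        }
        System.out.println(trie.complete("ques", 3)); // [quest, question, questions]
    }
}
```

Finding the prefix node costs one step per character of the prefix, regardless of how many words are stored, which is what makes the structure attractive for autocomplete.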

A BK-tree is a metric tree designed for efficient approximate string matching, which makes it a natural fit for autocorrect. In my implementation, I used Damerau–Levenshtein distance to calculate the edit distance between strings.
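The sketch below shows how a BK-tree and an edit distance fit together. It is an illustration under stated assumptions, not the code in this repo: the class `BkTreeSketch` and its method names are hypothetical, and the distance is the unrestricted Damerau–Levenshtein variant, chosen here because it satisfies the triangle inequality that BK-tree pruning relies on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a BK-tree; not the AutoText implementation.
class BkTreeSketch {

    private static final class Node {
        final String word;
        // Each child is keyed by its distance to this node's word.
        final Map<Integer, Node> children = new HashMap<>();
        Node(String word) { this.word = word; }
    }

    private Node root;

    // Unrestricted Damerau-Levenshtein distance: insertions, deletions,
    // substitutions, and transpositions of adjacent characters.
    static int distance(String a, String b) {
        int la = a.length(), lb = b.length(), inf = la + lb;
        int[][] d = new int[la + 2][lb + 2]; // shifted by one to hold a sentinel row/column
        d[0][0] = inf;
        for (int i = 0; i <= la; i++) { d[i + 1][0] = inf; d[i + 1][1] = i; }
        for (int j = 0; j <= lb; j++) { d[0][j + 1] = inf; d[1][j + 1] = j; }
        Map<Character, Integer> lastRow = new HashMap<>(); // last row where each char of a occurred
        for (int i = 1; i <= la; i++) {
            int lastCol = 0; // last column in this row where the characters matched
            for (int j = 1; j <= lb; j++) {
                int k = lastRow.getOrDefault(b.charAt(j - 1), 0);
                int l = lastCol;
                int cost = 1;
                if (a.charAt(i - 1) == b.charAt(j - 1)) { cost = 0; lastCol = j; }
                int best = Math.min(d[i][j] + cost, Math.min(d[i + 1][j] + 1, d[i][j + 1] + 1));
                int transposition = d[k][l] + (i - k - 1) + 1 + (j - l - 1);
                d[i + 1][j + 1] = Math.min(best, transposition);
            }
            lastRow.put(a.charAt(i - 1), i);
        }
        return d[la + 1][lb + 1];
    }

    // Walks down the edge labelled with the distance until a free slot is found.
    void insert(String word) {
        if (root == null) { root = new Node(word); return; }
        Node node = root;
        while (true) {
            int dist = distance(word, node.word);
            if (dist == 0) return; // word already stored
            Node child = node.children.get(dist);
            if (child == null) { node.children.put(dist, new Node(word)); return; }
            node = child;
        }
    }

    // Returns every stored word within maxDistance of the query.
    List<String> search(String query, int maxDistance) {
        List<String> matches = new ArrayList<>();
        Deque<Node> pending = new ArrayDeque<>();
        if (root != null) pending.push(root);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            int dist = distance(query, node.word);
            if (dist <= maxDistance) matches.add(node.word);
            // Triangle inequality: only edges in [dist - max, dist + max] can hold matches.
            for (Map.Entry<Integer, Node> e : node.children.entrySet()) {
                if (Math.abs(e.getKey() - dist) <= maxDistance) pending.push(e.getValue());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        BkTreeSketch tree = new BkTreeSketch();
        for (String w : new String[] {"lovely", "love", "lively", "question", "quest"}) {
            tree.insert(w);
        }
        System.out.println(tree.search("lovly", 1)); // [lovely]
    }
}
```

The pruning step is the point of the structure: subtrees whose edge label falls outside the allowed range are skipped without computing any distances inside them.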

Additionally, I created a simple GUI using Java Swing to go along with the program. I used a word list from @first20hours (20k.txt) to build the lexicons for the tree structures (I have not uploaded that file to this repo).

Usage

Please look at the javadoc comments for a more detailed usage description.

Here is an example of basic usage.

import main.java.kashiish.autotext.AutoText;

import java.util.ArrayList;

//Example is a placeholder class name
public class Example {

	public static void main(String[] args) {

		//Path to your word list file (placeholder value)
		String lexiconFileName = "20k.txt";

		//Create a new AutoText instance with a file of words to build a Trie, BKTree,
		//and lexicon for word validation
		AutoText autotext = new AutoText(lexiconFileName);
		/*
		 * Or you can use a different dictionary for each structure.
		 * AutoText autotext = new AutoText(lexiconFileName, trieFileName, bktreeFileName);
		 */
		ArrayList<String> corrections = autotext.autocorrect("lovly");
		//Set max autocomplete suggestions
		autotext.setMaxSuggestions(3);
		ArrayList<String> suggestions = autotext.autocomplete("ques");
		System.out.println(corrections);
		System.out.println(suggestions);
	}
}
Output:

[lovely]
[quest, question, questions]

Issues and Contribution

Please feel free to report or fix any bugs you may find in the program. It's greatly appreciated!

Current issues:

  • The NPath complexity of the method that calculates the distance between strings is very high.

License

MIT
