Skip to content
jpountz edited this page Mar 5, 2011 · 4 revisions

Goal

This project is an attempt to implement high-performance and memory-efficient tries in the Java programming language. Many implementations of tries in Java use a tree of nodes keeping references to each other for the implementation. Because of the memory overhead of an object allocation in Java (it depends on the JVM you use, but is usually 8 bytes, see http://www.javaworld.com/javatips/jw-javatip130.html), this leads to huge data structures. Moreover, the performance of such implementations is not always as good as expected (see http://bannister.us/weblog/2009/10/05/example-general-purpose-trie-in-java/ where the lookup performance is 5 to 6 times worse than a HashMap).

This work has been inspired by a Trie implementation in Java using backing arrays from Julien Lemoine, which you can find at http://wikipedia-clustering.speedblue.org/trieJava.php.

When (not) to use tries

Tries can sometimes replace String Maps or SortedMaps, especially if:

  • you need to treat char[] and Charsequences the same way,
  • you need to perform some operations tries are better at than Hash or Tree maps:
    • finding all the words which share a common prefix,
    • finding all words whose Levenshtein distance to a given word is at most X.

However, you can expect better results with a HashMap for most use cases, especially if you are working with immutable CharSequences which cache the value of the hash code (as String does).

For more details, look at :

Clone this wiki locally