v0.1.0

@hrs released this 23 May 18:22
· 42 commits to main since this release
474d74c

Changelog

  • 474d74c Add a build release task
  • 97208f8 Write a simple README
  • ed0c97d Don't include symlinks in the corpus
  • a8ae6a2 Split words manually instead of by regexp
  • cf33283 Check that this is a text file before opening
  • 45f20a4 Document use with non-English documents
  • 2b745c5 Replace deprecated use of ioutil with io
  • 71d71f7 Backfill stoplist tests
  • 911c082 Move corpus parsing out of main
  • 85c23ad Backfill similarity tests
  • 9b10bb2 If no query file's provided, read from STDIN
  • 0913d52 Add --no-stemming flag to skip stemming
  • b41825e Add --no-stoplist flag to skip stoplist
  • adb900a Default to searching current directory
  • 1fa579d License with the GPLv3
  • 68c9692 Add a simple Makefile
  • 69dee47 Include a simple manual page
  • 2be26f1 Move code into a lib directory
  • da14b80 Save memory by clearing term freqs after TF-IDF
  • 30caf51 Recursively search directories for files
  • 25f1c88 Add a --verbose flag
  • 750d5f2 Add --omit-target flag to skip target in results
  • f81720b Just print errors to stderr, don't log
  • 08612d3 Only search files that seem to contain text
  • 144d473 Add flags for sort order, limit, showing scores
  • 43be014 Sort results, low-to-high
  • 3cf6192 Display search results more readably
  • 27eff9a Search the corpus with a query document
  • 371c11d Maintain TF-IDF weights for each document
  • 471fcd1 Corpus stores its inverse document frequency
  • 4b7b1ac Documents store their term frequency
  • 5358a2d Maintain a corpus of documents
  • e023141 Track count of term occurrences
  • cc1a73d Memoize stemming results in a local cache
  • 765d3af Stem words with the Porter Stemmer
  • bee8fd0 Don't include words in a standard English stoplist
  • 7959b68 Split tokens but retain contractions
  • c5a08da Instantiate a document containing words
  • e36485e Get target file and search files from args
  • 8cfa20f Hello, docsim.