company-wordfreq

A company-backend for human language texts based on word frequency dictionaries.

Synopsis

company-wordfreq is a company backend intended for writing texts in a human language. The completions it proposes are words already used in the current (or another open) buffer and matching words from a word list file. This word list file is supposed to be a simple list of words ordered by the frequency the words are used in the language. So the first completions are words already used in the buffer followed by matching words of the language ordered by frequency.

Why not `company-ispell`?

company-ispell presents you the candidates in the alphabetical sequence, then maybe sorted by a company transformer such as company-prescient. That way it often happens that the word you are about to type appears somewhere in eighth place and it is actually easier to type the word manually.

company-wordfreq however uses word lists in which the words are ordered according to their frequency in the language. That way the probability that the word you want to type is among the first is higher.

The package is still somewhat in an experimental stage. There might be ways to even more optimize the behavior.

Word list files

company-wordfreq does not come with the word list files directly, but it can download the files for you for many languages from FrequencyWords. I made a fork of that repo just in case, the original changes all over sudden without my noticing.

The directory where the word list files reside is determined by the variable company-wordfreq-path, default wordfreq-dicts in your emacs home directory, usually like ~/.emacs.d/wordfreq-dicts. Their names must follow the pattern <language>.txt where language is the ispell-local-dictionary value of the current language.

Further requirements

You need grep in your $PATH as company-wordfreq uses it to grep into the word list files. Should be the case by default on any UNIX like systems. On Windows you might have to tweak it somehow.

Installation

Easiest way to install is from MELPA. If you have configured the MELPA sources you can just install the company-wordfreq package.

If you prefer straight.el, put the following lines into your init file.

(straight-use-package
 '(company-wordfreq :type git :host github :repo "johannes-mueller/company-wordfreq.el"))

Configuration

company-wordfreq is supposed to be the one and only company backend and company-mode should not transform or sort its candidates. This can be achieved by setting the variables company-backends and company-transformers buffer locally in text-mode buffers by

(add-hook 'text-mode-hook (lambda ()
                            (setq-local company-backends '(company-wordfreq))
                            (setq-local company-transformers nil)))

Usually you don’t need to configure the language picked to get the word completions. company-wordfreq uses the variable ispell-local-dictionary. It should work dynamically even if you use auto-dictionary-mode.

To download a word list use

M-x company-wordfreq-download-list

You are presented a list of languages to choose. For some languages the word lists are huge, which can lead to noticeable latency when the completions are build. Therefore you are asked if you want to use a word list with only the 50k most frequent words.

The file will then be downloaded, processed and put in place.

Status

This is basically the result of a week-end hack. So probably not everything will work under any circumstances. Bug reports and feedback welcome in the issue tracker. Pull requests also, of course.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.org

README.org

company-wordfreq

Synopsis

Why not `company-ispell`?

Word list files

Further requirements

Installation

Configuration

Status

Files

README.org

Latest commit

History

README.org

File metadata and controls

company-wordfreq

Synopsis

Why not company-ispell?

Word list files

Further requirements

Installation

Configuration

Status

Why not `company-ispell`?