From 3f8581bf8ac7163dfa42211e878dbd80044eaae8 Mon Sep 17 00:00:00 2001
From: Olly Betts
Date: Fri, 17 Dec 2021 17:08:52 +1300
Subject: [PATCH] CONTRIBUTING.rst: Clarify which charsets to list

---
 CONTRIBUTING.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index 1d5a2111..9a359a86 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -43,10 +43,10 @@ the first column. The columns are:
 * Algorithm name (needs to match the `.sbl` source without extension)
 * Encodings to support. Wide-character Unicode is always supported and
   doesn't need to be listed here. You should always include `UTF_8`, and
-  also `ISO_8859_1` if the stemmer only uses characters from that and the
-  language can be usefully written using it. We currently also have support
-  for `ISO_8859_2` and `KOI8_R`, but other single-byte character sets can be
-  supported quite easily if they are useful.
+  also any of `ISO_8859_1`, `ISO_8859_2` and `KOI8_R` which the language can
+  usefully be written using only characters from (in particular they need to
+  contain all the characters the stemmer explicitly uses). Support for other
+  single-byte character sets is easy to add if they're useful.
 * Names and ISO-639 codes for the language. Wikipedia has a handy list of
   `all the ISO-639 codes `_ -
   find the row for your new language and include the codes from the "639-1",