swedish: Remove -et or -en when stem ends in et
Removing -et and -en in general is problematic, as many words end in -et
or -en where this isn't a suffix, but very few end in -etet or -eten
where the last two letters aren't a suffix (and those that do don't seem
to suffer if we make the stem not have the -et).

Fixes #47
ojwb committed Sep 3, 2021
1 parent 000100d commit d6d9243
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions algorithms/swedish.sbl
@@ -1,5 +1,6 @@
 routines (
 mark_regions
+R1
 main_suffix
 consonant_pair
 other_suffix
@@ -33,6 +34,8 @@ define mark_regions as (

 backwardmode (

+define R1 as $p1 <= cursor
+
 define main_suffix as (
 setlimit tomark p1 for ([substring])
 among(
@@ -66,6 +69,7 @@ define stem as (
 do mark_regions
 backwards (
 do main_suffix
+do ( ['et' or 'en' R1 ] 'et' delete )
 do consonant_pair
 do other_suffix
 )
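
The added rule can be read in isolation as: if the word ends in -et or -en, that ending lies inside R1, and the letters just before it are also "et", delete the ending. Below is a minimal Python sketch of that one step only, not of the whole stemmer. The function names `r1_start` and `strip_doubled_et` are hypothetical. The body of mark_regions is not shown in this diff, so the R1 computation is an assumption taken from the usual Snowball description for Swedish (region after the first non-vowel that follows a vowel, with at least three letters before it).

```python
# Assumption: the Swedish vowel set used by the Snowball stemmer.
VOWELS = set("aeiouyäåö")


def r1_start(word: str) -> int:
    """Return the index where R1 begins (assumed definition, see lead-in)."""
    n = len(word)
    if n < 3:
        # Too short for the three-letter minimum: R1 is the empty region at the end.
        return n
    for i in range(1, n):
        # First non-vowel that directly follows a vowel.
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return max(i + 1, 3)
    return n


def strip_doubled_et(word: str) -> str:
    """Model of: do ( ['et' or 'en' R1 ] 'et' delete )"""
    p1 = r1_start(word)
    for suffix in ("et", "en"):
        if word.endswith(suffix):
            cut = len(word) - len(suffix)
            # R1 test in backward mode: p1 <= cursor, with the cursor
            # sitting just before the matched suffix.
            if cut >= p1 and word[:cut].endswith("et"):
                return word[:cut]
            break  # a word cannot end in both "et" and "en"
    return word


if __name__ == "__main__":
    for w in ("universitetet", "universiteten", "huset", "vatten", "etet"):
        print(w, "->", strip_doubled_et(w))
```

In this sketch "universitetet" and "universiteten" both reduce to "universitet", while "huset" and "vatten" are left alone because the letters before the ending are not "et". The short word "etet" is also left alone because the ending would start before R1. In the real stemmer main_suffix runs before this step, so the input seen here may already have been shortened.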