Fix preprocess allowed_symbols (#247)
* fixes #245; adds allowed_symbols="all" and removes non working "*" version
derNarr authored Nov 28, 2023
1 parent be6c481 commit 23463a3
Showing 2 changed files with 7 additions and 3 deletions.
8 changes: 6 additions & 2 deletions pyndl/preprocess.py
@@ -145,7 +145,7 @@ def process_occurrences(occurrences, outfile, *,
def create_event_file(corpus_file,
event_file,
*,
-allowed_symbols="*",
+allowed_symbols="all",
context_structure="document",
event_structure="consecutive_words",
event_options=(3,), # number_of_words,
@@ -175,11 +175,12 @@ def create_event_file(corpus_file,
automatically. If the corpus file contains these special symbols a warning
will be given.
+If you want to use all symbols use the special word ``all``.
These examples define the same allowed symbols::
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
'a-zA-Z'
-'*'
or a function indicating which characters to include. The function should
return `True`, if the passed character is a allowed symbol.
@@ -264,6 +265,9 @@ def filter_symbols(line, replace):
if not allowed_symbols(line[ii]):
line_copy[ii] = replace
return ''.join(line_copy)
+elif allowed_symbols == 'all':
+    def filter_symbols(line, replace):
+        return line
else:
not_in_symbols = re.compile(f"[^{allowed_symbols:s}]")
def filter_symbols(line, replace):
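For readers of this diff, the following is a minimal sketch of the three `allowed_symbols` branches touched above. It is not the pyndl source. The wrapper `make_filter` is a hypothetical helper introduced only for illustration, and the body of the regex branch (`not_in_symbols.sub(...)`) is an assumption, because that part of the function is elided in the diff. The last call suggests why the old default `"*"` likely did not work: inside a character class `*` is a literal asterisk, so `[^*]` matches every character except `*`.

```python
import re


def make_filter(allowed_symbols):
    """Return a filter_symbols(line, replace) function.

    Hypothetical helper for illustration; not part of pyndl.
    """
    if callable(allowed_symbols):
        # Function branch: keep a character only if the callable returns True.
        def filter_symbols(line, replace):
            return ''.join(char if allowed_symbols(char) else replace
                           for char in line)
    elif allowed_symbols == 'all':
        # New branch from this commit: every symbol is allowed.
        def filter_symbols(line, replace):
            return line
    else:
        # Regex branch: negated character class built from the string.
        not_in_symbols = re.compile(f"[^{allowed_symbols:s}]")

        def filter_symbols(line, replace):
            # Assumption: disallowed characters are substituted one by one.
            return not_in_symbols.sub(replace, line)
    return filter_symbols


keep_all = make_filter('all')
letters_only = make_filter('a-zA-Z')
old_default = make_filter('*')  # the removed "*" value

print(keep_all("Hi, you!", "#"))      # Hi, you!
print(letters_only("Hi, you!", "#"))  # Hi##you#
print(old_default("a*b", "#"))        # #*#
```

Under these assumptions, `"*"` would replace every character other than a literal asterisk, which is consistent with the commit message describing it as non working.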
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "pyndl"
-version = "1.2.0"
+version = "1.2.1"
description = "Naive discriminative learning implements learning and classification models based on the Rescorla-Wagner equations."

license = "MIT"
