diff --git a/doc/source/about.rst b/doc/source/about.rst
index 5d5972a..79fe21d 100644
--- a/doc/source/about.rst
+++ b/doc/source/about.rst
@@ -1,7 +1,7 @@
 About: Why another tool for parsing?
 ====================================

-RE|PARSE is simply a tool for combining regular expressions together
+Reparse is simply a tool for combining regular expressions together
 and using a regular expression engine to scan/search/parse/process input for certain tasks.

 Larger parsing tools like YACC/Bison, ANTLR, and others are really
@@ -9,27 +9,27 @@ good for structured input like computer code or xml. They aren't specifically
 designed for scanning and parsing semi-structured data from unstructured
 text (like books, or internet documents, or diaries).

-RE|PARSE is designed to work with exactly that kind of stuff, (and is completely
+Reparse is designed to work with exactly that kind of stuff, (and is completely
 useless for the kinds of tasks any of the above is often used for).

 Parsing Spectrum
 ----------------

-RE|PARSE isn't the first parser of it's kind. A hypothetical spectrum
+Reparse isn't the first parser of it's kind. A hypothetical spectrum
 of parsers from pattern-finding only all the way to highly-featured, structured grammars
 might look something like this::

-                  v- RE|PARSE         v- YACC/Bison
+                  v- Reparse          v- YACC/Bison
     UNSTRUCTURED |-------------------------| STRUCTURED
                   ^- Regex       ^- Parboiled/PyParsing

-RE|PARSE is in fact very featureless. It's only a little better
+Reparse is in fact very featureless. It's only a little better
 than plain regular expressions. Still, you might find it ideal
 for the kinds of tasks it was designed to deal with
 (like dates and addresses).

-What kind of things might RE|PARSE be useful for parsing?
----------------------------------------------------------
+What kind of things might Reparse be useful for parsing?
+--------------------------------------------------------

 Any kind of semi-structured formats:

@@ -41,17 +41,17 @@ Any kind of semi-structured formats:
 - Addresses
 - Phone numbers

-Or in other words, anything you might consider parsing with Regex, might consider RE|PARSE,
+Or in other words, anything you might consider parsing with Regex, might consider Reparse,
 especially if you are considering combining multiple regular expressions together.

 Why Regular Expressions
---------------------------------
+-----------------------

 PyParsing (Python) and Parboiled (JVM) also have use-cases very similar
-to RE|PARSE, and they are much more feature-filled. They have their own (much more powerful)
+to Reparse, and they are much more feature-filled. They have their own (much more powerful)
 DSL for parsing text.

-RE|PARSE uses Regular Expressions which has some advantages:
+Reparse uses Regular Expressions which has some advantages:

 - Short, minimal Syntax
 - Universal (with some minor differences between different engines)
@@ -59,20 +59,20 @@ RE|PARSE uses Regular Expressions which has some advantages:
 - Moderately Easy-to-learn (Though this is highly subjective)
 - Many programmers already know the basics
 - Skills can be carried else where
-- **Regular Expressions can be harvested elsewhere and used within RE|PARSE**
+- **Regular Expressions can be harvested elsewhere and used within Reparse**
 - Decent performance over large inputs
 - Ability to use fuzzy matching regex engines

-Limitations of RE|PARSE
--------------------------
+Limitations of Reparse
+----------------------

 Regular Expressions have been known to catch input that was unexpected, or miss input that was expected
 due to unforeseen edge cases.
-RE|PARSE provides tools to help alleviate this by checking the expressions against expected matching
+Reparse provides tools to help alleviate this by checking the expressions against expected matching
 inputs, and against expected non-matching inputs.
 This library is very limited in what it can parse, if you realize you need something like a
 recursive grammar, you might want to try PyParsing or something greater
-(though RE|PARSE might be helpful as a 'first step' matching and transforming the parse-able data before it is properly
+(though Reparse might be helpful as a 'first step' matching and transforming the parse-able data before it is properly
 parsed by a different library).
\ No newline at end of file
diff --git a/doc/source/best_practices.rst b/doc/source/best_practices.rst
index 2404497..330d82c 100644
--- a/doc/source/best_practices.rst
+++ b/doc/source/best_practices.rst
@@ -13,7 +13,7 @@ they can have a long productive life without getting out of control:
 - Never let a regex become too big to be easily understood.
   Split up big regex into smaller expressions. (Sensible splits won't hurt them).
 - Maintain a Matches and Non-Matches
-  - RE|PARSE can use this to test your Regex to make sure they are matching properly
+  - Reparse can use this to test your Regex to make sure they are matching properly
   - It helps maintainers see which regular expressions match what quickly
   - It helps show your intention with each expression, so that others can confidently improve or modify them
 - Maintain a description which talks about what you are trying to match with each regex,
diff --git a/doc/source/howto.rst b/doc/source/howto.rst
index 01f6817..756a53f 100644
--- a/doc/source/howto.rst
+++ b/doc/source/howto.rst
@@ -1,5 +1,5 @@
-Howto: How to use RE|PARSE
-==========================
+Howto: How to use Reparse
+=========================

 You will need

@@ -10,15 +10,15 @@ You will need
 #. Some example texts that you will want to parse and their solutions.
    This will be useful to check your parser and will help you put together the expressions and patterns.

-1. Setup Python & RE|PARSE
---------------------------
+1. Setup Python & Reparse
+-------------------------

-See :ref:`installation-howto` for instructions on how to install RE|PARSE
+See :ref:`installation-howto` for instructions on how to install Reparse

-2. Layout of an example RE|PARSE parser
-------------------------------------
+2. Layout of an example Reparse parser
+--------------------------------------

-RE|PARSE needs 3 things in its operation:
+Reparse needs 3 things in its operation:

 1. Functions: A dictionary with String Key -> Function Value mapping.
@@ -113,7 +113,7 @@ in expressions and merely *combined* in patterns.
         Order: 2
         # I could have used instead to use a pattern inside a pattern but it wouldn't have made a difference really (just an extra function call).

-The order field tells RE|PARSE which pattern to pick if multiple patterns match.
+The order field tells Reparse which pattern to pick if multiple patterns match.
 Generally speaking, the more specific patterns should be ordered higher than the lower ones
 (you wouldn't want someone to try and call a fax machine!).

@@ -129,9 +129,9 @@ Done this way, I could have had 3 different formats for Area Code and the patter
 on any of them. I didn't here because that'd be overkill for phone numbers.

 5. Writing your functions.py file
-----------------------------------
+---------------------------------

-RE|PARSE matches text and also does some parsing using functions.
+Reparse matches text and also does some parsing using functions.

 The order in which the functions are run and results passed are as follows:

@@ -179,7 +179,7 @@ I used namedtuples here, but you can parse your output anyway you want to.
 6. Combining it all together!
 -----------------------------

-The builder.py module contains some functions to build a RE|PARSE system together.
+The builder.py module contains some functions to build a Reparse system together.
 Here's how I'd put together my phone number parser:

 .. code-block:: python
diff --git a/doc/source/modules.rst b/doc/source/modules.rst
index ff29990..c06e2d6 100644
--- a/doc/source/modules.rst
+++ b/doc/source/modules.rst
@@ -1,4 +1,4 @@
-Here lies the embedded docblock documentation for the various parts of RE|PARSE.
+Here lies the embedded docblock documentation for the various parts of Reparse.

 expression
 =========
diff --git a/examples/colortime/colortime.py b/examples/colortime/colortime.py
index ca5c9da..04534f8 100644
--- a/examples/colortime/colortime.py
+++ b/examples/colortime/colortime.py
@@ -11,7 +11,7 @@
 """
 # Example stuff -----------------------------------------------------
 # Have to add the parent directory just in case you
-# run this file in the demo directory without installing RE|PARSE
+# run this file in the demo directory without installing Reparse
 import sys
 sys.path.append('../..')

@@ -24,7 +24,7 @@ path += "/"


-# RE|PARSE ----------------------------------------------------------
+# Reparse ----------------------------------------------------------
 from examples.colortime.functions import functions
 import reparse

diff --git a/examples/colortime/functions.py b/examples/colortime/functions.py
index f2c3335..581f7d5 100644
--- a/examples/colortime/functions.py
+++ b/examples/colortime/functions.py
@@ -18,7 +18,7 @@ def color_time(Color=None, Time=None):
     return Color, Time

 # --------------- Function list ------------------
-# This is the dictionary that is used by the RE|PARSE
+# This is the dictionary that is used by the Reparse
 # expression builder. The key is the same value used in the patterns.yaml
 # file under ``Function: ``. The value is a reference to function.
diff --git a/examples/phone/functions.py b/examples/phone/functions.py
index 5b55605..08136d1 100644
--- a/examples/phone/functions.py
+++ b/examples/phone/functions.py
@@ -25,7 +25,7 @@ def fax_phone(p):
     return p._replace(fax=True)

 # --------------- Function list ------------------
-# This is the dictionary that is used by the RE|PARSE
+# This is the dictionary that is used by the Reparse
 # expression builder. The key is the same value used in the patterns.yaml
 # file under ``Function: ``. The value is a reference to function.

diff --git a/examples/phone/phone.py b/examples/phone/phone.py
index e86ed93..96509f7 100644
--- a/examples/phone/phone.py
+++ b/examples/phone/phone.py
@@ -7,7 +7,7 @@
 """
 # Example stuff -----------------------------------------------------
 # Have to add the parent directory just in case you
-# run this file in the demo directory without installing RE|PARSE
+# run this file in the demo directory without installing Reparse
 import sys
 sys.path.append('../..')

@@ -20,7 +20,7 @@ path += "/"


-# RE|PARSE ----------------------------------------------------------
+# Reparse ----------------------------------------------------------
 from examples.phone.functions import functions
 import reparse

diff --git a/examples/readme.rst b/examples/readme.rst
index 8fc3cf8..c559b56 100644
--- a/examples/readme.rst
+++ b/examples/readme.rst
@@ -1,4 +1,4 @@
-These examples shows a very basic RE|PARSE setup to help you get started.
+These examples shows a very basic Reparse setup to help you get started.
 Under each directory there are files like this::

     expressions.yaml -- Contains the regular expression building blocks
diff --git a/readme.rst b/readme.rst
index e60e801..4fb51e4 100644
--- a/readme.rst
+++ b/readme.rst
@@ -1,5 +1,5 @@
-RE|PARSE
-========
+Reparse
+=======

 *Python library/tools for combining and parsing using Regular Expressions in a maintainable way*

@@ -28,7 +28,7 @@ So you want to get (color and time) or ``[('green', datetime.time(23, 0))]`` out
     blah blah blah go to the store to buy green at 11pm! blah blah

 If you need scan/search/parse/transform some unstructured input and get some semi-structured data
-out of it RE|PARSE might be able to help.
+out of it Reparse might be able to help.

 First structure some Regular Expressions (Here, in Yaml)
 --------------------------------------------------------
@@ -105,9 +105,9 @@ Result

 Cool!

-Intrigued? Learn more how to make the magic happen in `Howto: How to use RE|PARSE`_.
+Intrigued? Learn more how to make the magic happen in `Howto: How to use Reparse`_.

-Want to read more about what RE|PARSE is and what it can do? More info in `About: Why another tool for parsing?`_
+Want to read more about what Reparse is and what it can do? More info in `About: Why another tool for parsing?`_

 Info
 ====
@@ -127,7 +127,7 @@ manually
 ~~~~~~~~

 1. If you don't have them already,
-   RE|PARSE depends on REGEX_, and PyYaml_.
+   Reparse depends on REGEX_, and PyYaml_.
    Download those and ``python setup.py install`` in their directories.
    If you are on windows, you may have to find binary installers for these, since they
    contain modules that have to be compiled.
@@ -157,7 +157,7 @@ Send me suggestions, issues, and pull requests and I'll gladly review them!
 Versions
 --------

-- *3.0* InvalidPattern Exception, Allow monkey patching regex arguments
+- *3.0* InvalidPattern Exception, Allow monkey patching regex arguments. RE|PARSE -> Reparse.
 - *2.1* Change `yaml.load` to `yaml.safe_load` for security
 - *2.0* Major Refactor, Python 3, Better Parser builders
 - *1.1* Fix setup.py
@@ -177,7 +177,7 @@ MIT Licensed! See LICENSE file for the full text.

 .. _Docs at Readthedocs: https://reparse.readthedocs.org/en/latest/

-.. _`Howto: How to use RE|PARSE`: https://reparse.readthedocs.org/en/latest/howto.html
+.. _`Howto: How to use Reparse`: https://reparse.readthedocs.org/en/latest/howto.html

 .. _`About: Why another tool for parsing?`: https://reparse.readthedocs.org/en/latest/about.html

diff --git a/reparse/__init__.py b/reparse/__init__.py
index 892b046..34f07bd 100644
--- a/reparse/__init__.py
+++ b/reparse/__init__.py
@@ -1,4 +1,4 @@
-""" RE|PARSE
+""" Reparse
 """

 from reparse.parsers import *
diff --git a/reparse/parsers.py b/reparse/parsers.py
index 40cee8a..d51cea6 100644
--- a/reparse/parsers.py
+++ b/reparse/parsers.py
@@ -62,7 +62,7 @@ def output():


 def parser(parser_type=basic_parser, functions=None, patterns=None, expressions=None, patterns_yaml_path=None, expressions_yaml_path=None):
-    """ A RE|PARSE parser description.
+    """ A Reparse parser description.
     Simply provide the functions, patterns, & expressions to build.
     If you are using YAML for expressions + patterns, you can use
     ``expressions_yaml_path`` & ``patterns_yaml_path`` for convenience.
@@ -77,9 +77,9 @@ def _load_yaml(file_path):
         with open(file_path) as f:
             return yaml.safe_load(f)

-    assert expressions or expressions_yaml_path, "RE|PARSE can't build a parser without expressions"
-    assert patterns or patterns_yaml_path, "RE|PARSE can't build a parser without patterns"
-    assert functions, "RE|PARSE can't build without a functions"
+    assert expressions or expressions_yaml_path, "Reparse can't build a parser without expressions"
+    assert patterns or patterns_yaml_path, "Reparse can't build a parser without patterns"
+    assert functions, "Reparse can't build without a functions"

     if patterns_yaml_path:
         patterns = _load_yaml(patterns_yaml_path)