luapatt

A Python 3.5+ implementation of the Lua language’s pattern matching functions. Lua’s pattern matching is simpler than regular expressions and lacks several features that regexes have, such as | for alternation, but also contains some features difficult or impossible to duplicate in most regex flavors, such as the ability to easily match a balanced pair of parentheses (or any two other characters).
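To make the balanced-pair feature concrete, here is a minimal pure-Python sketch of the idea behind Lua's `%bxy` pattern item. It is an illustration only, not this library's implementation, and the name `match_balanced` is hypothetical.

```python
def match_balanced(s, start, open_ch="(", close_ch=")"):
    """Return the index just past the balanced span that begins at s[start].

    Returns None if s[start] is not open_ch or the span never closes.
    Hypothetical helper for illustration; not part of luapatt.
    """
    if start >= len(s) or s[start] != open_ch:
        return None
    depth = 1  # the opening character at s[start] is already counted
    for i in range(start + 1, len(s)):
        # Check the closing character first, so that a pair made of two
        # identical characters (e.g. quotes) still terminates.
        if s[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i + 1
        elif s[i] == open_ch:
            depth += 1
    return None


text = "f(a(b)c) + g(x)"
end = match_balanced(text, 1)
print(text[1:end])
```

Matching nested pairs like this needs a counter, which is why most regex flavors cannot express it directly.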

Installation

pip install luapatt

Documentation

For documentation on how pattern matching works, see the Lua reference manual. This library differs from stock Lua in the following ways:

  • Character classes that depend on what a character means (letter, digit, and so on) are evaluated with Python’s str.is* family of methods or Python’s unicodedata module, and so follow the Unicode definition of each class.
  • String positions are zero-based instead of one-based, because Python indexes from zero whereas Lua indexes from one. This affects position captures and the indexes returned as the first two results from find().
  • Function return values are combined into a tuple, as is standard with Python. However, singleton tuples are not returned; the single value is returned directly instead.
  • gsub() does not return the number of substitutions by default, instead returning only the new string. To get the count, pass the named argument count=True to the call (which will result in a 2-tuple of the new string and the count).
  • An extra function, set_escape_char(), is provided to change the escape character. It takes one argument: the new escape character, which must be a str object of length 1. The escape character cannot be set to any of the other special characters. While it is possible to set it to a letter or number, this is not recommended as it may interfere with other aspects of pattern matching, and doing so may be disallowed in the future.
    • NOTE: Because set_escape_char modifies global state, it is not thread-safe.
  • Unlike Lua, which has no notion of a Unicode string and assumes all characters are one byte in length, this library operates on full Unicode strings (i.e. str objects). If you pass bytes objects to this library, the behavior is undefined.
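Two of the points above can be illustrated with the standard library alone: Unicode-aware character classification, and the convention of returning a lone value bare instead of in a one-element tuple. This is a sketch under stated assumptions. It does not call luapatt, the helper names `classify` and `pack_results` are hypothetical, and how the library maps each Lua class onto Unicode categories is not specified here.

```python
import unicodedata


def classify(ch):
    """Report how Python classifies a single character.

    Returns (is_letter, is_digit, unicode_category). Hypothetical helper;
    it only shows the kind of information str.is* and unicodedata expose.
    """
    return ch.isalpha(), ch.isdigit(), unicodedata.category(ch)


def pack_results(values):
    """Mirror the return convention described above (hypothetical helper).

    Several values come back as a tuple; exactly one value comes back bare.
    The empty case is not covered by the text above; this sketch simply
    returns an empty tuple for it.
    """
    values = tuple(values)
    return values[0] if len(values) == 1 else values


# Non-ASCII letters and digits are recognized under the Unicode definitions.
for ch in ["a", "é", "٣", "!"]:
    print(repr(ch), classify(ch))

print(pack_results(["only"]))   # a bare value, not a 1-tuple
print(pack_results([0, 3]))     # a tuple of two values
```

Code that needs uniform handling of results can check `isinstance(result, tuple)` before unpacking, since a single result is not wrapped.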

Licensing

As with Lua itself, this library is released under the MIT License.