Links
Code | Meaning
\d | a digit
\D | a non-digit
\s | whitespace (tab, space, newline, etc.)
\S | non-whitespace
\w | a word character (alphanumeric or underscore)
\W | a non-word character (anything other than alphanumeric or underscore)
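As a quick illustration of the character classes above, here is a minimal sketch. The sample string and variable names are made up for the example.

```python
import re

# Hypothetical sample string, chosen only to exercise each class.
sample = "Order 66: ship_to=Bay 7\tASAP"

digits = re.findall(r"\d+", sample)        # runs of digits
words = re.findall(r"\w+", sample)         # runs of letters, digits and underscores
fields = re.split(r"\s+", sample)          # split on any run of whitespace
symbols = re.findall(r"[^\w\s]", sample)   # neither word character nor whitespace

print(digits)
print(words)
print(fields)
print(symbols)
```

Note that ship_to comes back as a single word, because \w includes the underscore.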
Anchoring
Code | Meaning
^ | start of string; with MULTILINE, also the start of each line
$ | end of string (or just before a trailing newline); with MULTILINE, also the end of each line
\A | start of string
\Z | end of string
\b | empty string at the beginning or end of a word
\B | empty string not at the beginning or end of a word
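The difference between the anchors is easiest to see by looking at where matches start. The sketch below uses a made-up three-line string and a hypothetical helper, starts(), that returns match offsets.

```python
import re

# Hypothetical sample: "cat" at offset 0, "catalog" at 4, "bobcat" at 12.
text = "cat\ncatalog\nbobcat"

def starts(pattern, flags=0):
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text, flags)]

start_default = starts(r"^cat")               # ^ matches only at offset 0
start_multi = starts(r"^cat", re.MULTILINE)   # ^ also matches after each newline
start_abs = starts(r"\Acat", re.MULTILINE)    # \A is unaffected by MULTILINE
end_multi = starts(r"cat$", re.MULTILINE)     # $ matches before each newline and at the end
end_abs = starts(r"cat\Z", re.MULTILINE)      # \Z matches only at the very end
whole_word = starts(r"\bcat\b")               # "cat" as a complete word
inside_word = starts(r"\Bcat")                # "cat" not at the start of a word

print(start_default, start_multi, start_abs)
print(end_multi, end_abs)
print(whole_word, inside_word)
```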
Syntax | Description
(?P<name>...) | Capture group with name "name". To refer to it in the same regex, use (?P=name); to refer to it in a substitution, use \g<name>.
(?=...) | Positive lookahead assertion: matches if ... matches next, but doesn't consume any of the string.
(?!...) | Negative lookahead assertion: matches if ... does not match next, but doesn't consume any of the string.
(?<=...) | Positive lookbehind assertion: succeeds only when the current position is preceded by a match for ... . The contained ... must only match strings of some fixed length, meaning that abc or a|b are allowed, but a* and a{3,4} are not.
(?<!...) | Negative lookbehind assertion: like the positive lookbehind assertion, but requires that ... does not precede the current position in the string.
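The four assertions can be tried out on a small example. This is a sketch only: the price string is invented, and the negative forms are written defensively because a bare \d+ can backtrack to a shorter run of digits that still satisfies the assertion.

```python
import re

# Hypothetical sample string.
prices = "apple $30, pear 45 EUR, plum $7"

# Positive lookbehind: digits preceded by "$" (the "$" is not part of the match).
usd = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: digits followed by " EUR" (the suffix is not consumed).
eur = re.findall(r"\d+(?= EUR)", prices)

# Negative lookbehind: digits preceded by neither "$" nor another digit.
# Excluding \d as well stops the "0" of "$30" from matching on its own.
not_usd = re.findall(r"(?<![\$\d])\d+", prices)

# Negative lookahead: whole numbers not followed by " EUR".
# The \b anchors stop \d+ from backtracking to "4" in "45 EUR".
not_eur = re.findall(r"\b\d+\b(?! EUR)", prices)

print(usd, eur, not_usd, not_eur)
```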
Named capture groups
- In a regular expression:
  - to define a group that matches a pattern: (?P<name>pattern)
  - to backreference (match the same text as) a previously defined group: (?P=name)
- In a replacement string: \g<name>
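The three forms listed above can be sketched together as follows. The date and doubled-word patterns and their group names are illustrative choices, not part of the original notes.

```python
import re

# (?P<name>pattern): define named groups.
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date_re.search("released 2024-03-15")
year = m.group("year")
parts = m.groupdict()

# \g<name>: refer to the groups in a replacement string.
reordered = date_re.sub(r"\g<day>/\g<month>/\g<year>", "released 2024-03-15")

# (?P=name): backreference inside the same regex, here to find a doubled word.
doubled_re = re.compile(r"\b(?P<word>\w+) (?P=word)\b")
fixed = doubled_re.sub(r"\g<word>", "it is is fine")

print(year, parts)
print(reordered)
print(fixed)
```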
Specifying flags / compilation options.
Ref: Compilation Flags
- To specify flags inline in the regex, prefix the regex with (?FLAGS).
- e.g. For case-insensitive matching, prefix with (?i).
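A small sketch of the inline form next to the equivalent flag argument. The sample strings are invented for the example.

```python
import re

text = "Python, PYTHON and python"

# Inline flag at the start of the pattern...
inline_hits = re.findall(r"(?i)python", text)
# ...behaves like passing the flag constant explicitly.
explicit_hits = re.findall(r"python", text, re.IGNORECASE)

# Several flag letters can share one prefix: (?im) is IGNORECASE plus MULTILINE.
p_lines = re.findall(r"(?im)^p\w+", "Perl\npython\nRuby")

print(inline_hits, explicit_hits, p_lines)
```

The inline form is handy when the pattern is passed around as a plain string, for example in a config file, and there is no place to supply a flags argument.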
Flag | Meaning
DOTALL, S | Make . match any character, including newlines
IGNORECASE, I | Do case-insensitive matches
LOCALE, L | Do a locale-aware match
MULTILINE, M | Multi-line matching, affecting ^ and $
VERBOSE, X | Enable verbose REs, which ignore most whitespace and allow # comments, so they can be organized more cleanly and understandably
UNICODE, U | Make escapes such as \w, \b, \s and \d depend on the Unicode character database (already the default for str patterns in Python 3)
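To see what a couple of these flags change in practice, here is a minimal sketch with made-up inputs. Flags are combined with the bitwise OR operator.

```python
import re

html = "<p>first\nsecond</p>"

# Without DOTALL, "." stops at the newline, so the pattern cannot span both lines.
no_dotall = re.search(r"<p>(.*)</p>", html)
# With DOTALL, "." also matches "\n".
with_dotall = re.search(r"<p>(.*)</p>", html, re.DOTALL).group(1)

lines = "Alpha\nbeta\nGAMMA"

# MULTILINE alone: only the all-lowercase line matches ^[a-z]+$.
lower_only = re.findall(r"^[a-z]+$", lines, re.MULTILINE)
# MULTILINE | IGNORECASE: every line matches.
any_case = re.findall(r"^[a-z]+$", lines, re.MULTILINE | re.IGNORECASE)

print(no_dotall, repr(with_dotall))
print(lower_only, any_case)
```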
Snippets
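import itertools
import re
text = "There are 24 hours in a day, 7 days in a week, 4 weeks in a month"
r = re.compile(r'\d+')
m = r.search(text)  # search the variable, not the literal "text"; m.group() is '24'
# Replace each number with a running counter (1, 2, 3, ...).
c = itertools.count(1)
re.sub(r'\d+', lambda m: str(next(c)), text)
# Renumber only the values that follow "index = "; in_this_text is a sample input.
in_this_text = "index = 7, index = 7"
re.sub(r'index = (?P<counter>\d+)', lambda m: "index = {0}".format(next(c)), in_this_text)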
# Verbose / multiline regex: matches $name or ${expression}.
regex = re.compile(r"""
\$ (?:
(?P<name>\w+) |
# Known limitation: [^}]+ stops at the first }, so an expression containing } is cut short.
\{(?P<expression>[^}]+)\}
)
""", re.VERBOSE)
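# You can also use the inline (?x) flag instead of passing re.VERBOSE.
# It must be the first thing in the pattern: Python 3.11+ rejects global flags anywhere else.
regex = re.compile(r"""(?x)
Verbose multiline regex # comment
""")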