Regex in Python
Python's re module provides Perl-style regex support. Patterns are typically written as raw strings (r'pattern') to avoid backslash escaping issues. Python 3.11+ added atomic groups and possessive quantifiers.
Code Examples
Basic match and search
import re
# search() finds the first match anywhere in the string
match = re.search(r'\d{3}-\d{4}', 'Call 555-1234 today')
if match:
    print(match.group())  # "555-1234"

# match() only matches at the beginning of the string
result = re.match(r'\d+', '123abc')
print(result.group())  # "123"

re.search() scans the entire string for a match. re.match() only checks from the start. Both return a Match object or None.
Find all matches
import re
text = "Emails: alice@example.com, bob@test.org"
emails = re.findall(r'[\w.+-]+@[\w-]+\.[\w.]+', text)
print(emails) # ['alice@example.com', 'bob@test.org']
# finditer() returns Match objects with position info
for m in re.finditer(r'\w+@\w+', text):
    print(f"{m.group()} at position {m.start()}-{m.end()}")

findall() returns a list of matched strings (or of the captured groups, if the pattern contains any). finditer() returns an iterator of Match objects; use it when you need match positions or groups.
Named groups and groupdict()
import re
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.search(pattern, 'Date: 2026-03-08')
print(match.group('year')) # "2026"
print(match.groupdict())  # {'year': '2026', 'month': '03', 'day': '08'}

Python named groups use (?P<name>...) syntax (not the (?<name>...) syntax used in JavaScript). groupdict() returns all named groups as a dictionary.
Substitution with re.sub()
import re
# Simple replacement
result = re.sub(r'\bfoo\b', 'bar', 'foo is not foobar')
print(result) # "bar is not foobar"
# Replacement with a function
def censor(match):
    return '*' * len(match.group())

print(re.sub(r'\b\w{4,}\b', censor, 'hide long words'))
# "**** **** *****"

re.sub() replaces every non-overlapping match. The replacement can be a string (with \1 or \g<name> backreferences) or a callable that receives each Match object and returns the replacement text.
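The string form of the replacement with backreferences is not shown above, so here is a minimal sketch. The sample date and the variable names (iso, us_style, dotted) are illustrative only.

```python
import re

iso = "2026-03-08"

# Numbered backreferences: \1, \2, \3 refer to the capture groups in order.
us_style = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1", iso)
print(us_style)  # 03/08/2026

# Named backreferences: \g<name> refers to a (?P<name>...) group.
pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
dotted = re.sub(pattern, r"\g<day>.\g<month>.\g<year>", iso)
print(dotted)  # 08.03.2026
```

Writing the replacement as a raw string keeps \1 from being read as an octal escape by Python before re.sub() sees it.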
Compile patterns for reuse
import re
# Compile once, reuse many times
ip_pattern = re.compile(
    r'(\d{1,3}\.){3}\d{1,3}'
)
logs = ["192.168.1.1 - GET /", "10.0.0.5 - POST /api"]
for line in logs:
    match = ip_pattern.search(line)
    if match:
        print(match.group())

re.compile() pre-compiles a pattern into a reusable regex object. The re module already caches recently used patterns, so the speed gain is usually small; the main benefit is cleaner code when the same pattern is used repeatedly.
Verbose mode for readable patterns
import re
email_pattern = re.compile(r"""
    ^[\w.+-]+   # local part
    @           # @ separator
    [\w-]+      # domain name
    \.          # dot
    [\w.]+$     # TLD (may contain dots)
""", re.VERBOSE | re.IGNORECASE)
print(email_pattern.match("user@example.com"))  # Match object

The re.VERBOSE (re.X) flag lets you write multi-line patterns with comments. Whitespace is ignored unless escaped or inside a character class.
Note
Always use raw strings (r'...') for regex patterns in Python — without them, \b means backspace instead of a word boundary. Python 3.11 added atomic groups (?>...) and possessive quantifiers (++, *+, ?+). The regex module (pip install regex) provides additional features like fuzzy matching, Unicode categories, and variable-length lookbehinds.
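To illustrate what possessive quantifiers and atomic groups change, the sketch below compares them with an ordinary greedy quantifier. The sample string and variable names are illustrative. The version check is there because the newer syntax raises an error on interpreters older than 3.11.

```python
import re
import sys

text = "aaab"

# Greedy a+ first grabs "aaa", then gives one "a" back so that "ab" can match.
greedy = re.fullmatch(r"a+ab", text)
print(greedy is not None)

if sys.version_info >= (3, 11):
    # Possessive a++ keeps all three "a"s and never backtracks,
    # so the trailing "ab" has nothing left to match.
    possessive = re.fullmatch(r"a++ab", text)
    # The atomic group (?>a+) behaves the same way.
    atomic = re.fullmatch(r"(?>a+)ab", text)
    print(possessive is None, atomic is None)
else:
    # Syntax not available in this interpreter's re module.
    possessive = atomic = None
```

Because neither form backtracks into what it has already consumed, both can help limit catastrophic backtracking in patterns with nested repetition.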
Frequently Asked Questions
What is the difference between re.match() and re.search()?
re.match() only checks for a match at the beginning of the string. re.search() scans the entire string and returns the first match anywhere. Use re.fullmatch() (Python 3.4+) to check if the entire string matches the pattern.
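A short side-by-side comparison of the three functions, using an illustrative \d+ pattern and made-up input strings:

```python
import re

pattern = r"\d+"

at_start = re.match(pattern, "123abc")      # matches "123" at position 0
not_at_start = re.match(pattern, "abc123")  # None: the string does not start with a digit
anywhere = re.search(pattern, "abc123")     # matches "123" at position 3
partial = re.fullmatch(pattern, "123abc")   # None: "abc" is left over
whole = re.fullmatch(pattern, "123")        # matches the entire string

print(at_start.group(), not_at_start, anywhere.start(), partial, whole.group())
# 123 None 3 None 123
```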
Why do I need raw strings (r'...') for regex in Python?
Without raw strings, Python interprets backslash sequences before the regex engine sees them. For example, '\b' is a backspace character, but r'\b' is the literal characters \b — which the regex engine interprets as a word boundary. Raw strings pass backslashes through unchanged.
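The difference is easy to see by comparing the two string literals directly. This is a minimal sketch; the word "cat" and the sample sentence are arbitrary.

```python
import re

plain = "\bcat\b"   # Python turns each \b into a backspace character (5 characters total)
raw = r"\bcat\b"    # backslashes are kept as written (7 characters total)

print(len(plain), len(raw))  # 5 7

# The plain string looks for backspace + "cat" + backspace, which is not in the text.
plain_result = re.search(plain, "a cat sat")
# The raw string reaches the regex engine as \bcat\b, a whole-word match.
raw_result = re.search(raw, "a cat sat")

print(plain_result)        # None
print(raw_result.group())  # cat
```

Python raises no error or warning for the plain version, because \b is a valid string escape, so the mistake tends to show up only as a pattern that silently fails to match.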
How do I match across multiple lines in Python?
Use the re.DOTALL (re.S) flag to make . match newline characters. Use the re.MULTILINE (re.M) flag to make ^ and $ match at the start and end of each line, in addition to the start and end of the whole string. You can combine flags: re.DOTALL | re.MULTILINE.
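A small sketch of both flags on a two-line string (the text and variable names are illustrative):

```python
import re

text = "first line\nsecond line"

# Without flags, ^ anchors only at the very start of the string.
default_anchor = re.findall(r"^\w+", text)
# With re.MULTILINE, ^ also anchors right after each newline.
multiline_anchor = re.findall(r"^\w+", text, re.MULTILINE)
print(default_anchor, multiline_anchor)  # ['first'] ['first', 'second']

# Without flags, . does not match "\n", so the pattern cannot cross lines.
default_dot = re.search(r"first.*second", text)
# With re.DOTALL, . matches newlines too.
dotall_dot = re.search(r"first.*second", text, re.DOTALL)
print(default_dot, dotall_dot is not None)  # None True
```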
What is the difference between re and the regex module?
The built-in re module covers standard regex needs. The third-party regex module (pip install regex) adds features like fuzzy matching, Unicode category support (\p{L}), variable-length lookbehinds, atomic groups (in Python < 3.11), and better Unicode handling. It is designed to be backwards-compatible with re, so in most code it works as a drop-in replacement.