Flags
file_re accepts the same flag constants as re, combined into the same
bitmask. Every entry point that takes a flags keyword translates the
mask and forwards the result to the underlying Rust regex engine.
Because file_re is built on the Rust regex crate, a few flags behave
differently from their CPython re counterparts; those differences are
documented here.
Supported flags
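| Python flag | Rust handling | Notes |
|---|---|---|
| re.IGNORECASE | | Full parity with re. |
| re.MULTILINE | | |
| re.DOTALL | | |
| re.VERBOSE | | Full parity with re. |
| re.UNICODE | Default-on; no-op. | Unicode character classes are always on in Rust’s regex. |
| re.ASCII | unicode(false) | Divergent; see below. |
| re.DEBUG | Ignored; emits UserWarning. | Rust regex has no equivalent of re’s debug print. |
| re.LOCALE | Raises ValueError. | No Rust equivalent; see below. |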
Flags combine the same way as in re:
```python
import re
from file_re import file_re

match = file_re.search(
    r"^error:.*$",
    "server.log",
    flags=re.IGNORECASE | re.MULTILINE,
)
```
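Because the flag values are ordinary re.RegexFlag members, a combined mask can be built and inspected with bitwise operators before it is handed to file_re. The sketch below uses only the standard library; the helper name describe_flags is hypothetical and is not part of file_re.

```python
import re


def describe_flags(flags: int) -> list[str]:
    """Return the names of the individual re flags set in a combined mask."""
    candidates = [
        re.IGNORECASE, re.MULTILINE, re.DOTALL, re.VERBOSE,
        re.ASCII, re.UNICODE, re.DEBUG, re.LOCALE,
    ]
    # A flag is present when the bitwise AND with the mask is non-zero.
    return [flag.name for flag in candidates if flags & flag]


combined = re.IGNORECASE | re.MULTILINE
names = describe_flags(combined)
print(names)
```

A check like this can be useful for logging which flags a call site actually passes before migrating it from re to file_re.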
Rust-vs-Python divergences
re.ASCII
In CPython, re.ASCII restricts \w, \W, \b, \B, \d, \D, \s, and \S to
ASCII-only matching; explicit non-ASCII literals and ranges elsewhere
in the pattern keep working. The Rust regex crate’s builder has no
option with that scope: its unicode(false) switch is all-or-nothing
and turns off Unicode interpretation for the entire pattern. file_re
translates re.ASCII to unicode(false), which is broader than what re
does.
In practice the results are usually equivalent for ASCII-only input,
but patterns that mix ASCII and non-ASCII character classes may not
behave identically. To surface this, file_re emits a
UserWarning the first time re.ASCII is used in a process:
```text
UserWarning: re.ASCII in file_re disables Unicode character class
matching entirely (Rust regex semantics); this is broader than
Python's re.ASCII.
```
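To make the CPython side of this comparison concrete, the following sketch shows what re.ASCII changes in the standard library. It does not call file_re, and it makes no claim about the exact result file_re would return for the mixed pattern; that case is simply the kind of pattern the warning is about.

```python
import re

text = "na\u00efve caf\u00e9 42"  # "naïve café 42"

# Default (Unicode) semantics: \w matches accented letters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w only matches [a-zA-Z0-9_], so words split at ï and é.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

# A pattern mixing an explicit Greek range with a shorthand class.
# CPython keeps the range working and only narrows \d to ASCII digits,
# so the second token (ending in an Arabic-Indic digit) is not matched.
greek_text = "\u03b1\u03b2\u03b31 \u03b4\u03b5\u0663"
mixed = re.findall(r"[\u03b1-\u03c9]+\d", greek_text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
print(mixed)
```

Patterns like the last one are worth testing explicitly when moving code that relies on re.ASCII over to file_re.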
re.LOCALE
CPython implements re.LOCALE by consulting the current C locale
at match time for byte patterns. Rust’s regex has no locale
concept and file_re does not attempt to emulate one. Passing
re.LOCALE raises ValueError:
```python
import re
from file_re import file_re

file_re.search(r"\w+", "data.txt", flags=re.LOCALE)
# ValueError: re.LOCALE is not supported by file_re.
```
If you have existing re code that uses re.LOCALE, note
that re itself deprecated the flag for str patterns in Python 3.5 and
has rejected it with a ValueError since Python 3.6; it remains valid
only for bytes patterns. The usual migration is to drop the flag and
rely on the default Unicode semantics.
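The migration can be sketched with the standard library alone. The helper drop_locale is a hypothetical name used for illustration; it clears the re.LOCALE bit from a mask and leaves the other flags untouched.

```python
import re


def drop_locale(flags: int) -> int:
    """Clear the re.LOCALE bit, leaving every other flag as it was."""
    return flags & ~re.LOCALE


legacy_flags = re.LOCALE | re.IGNORECASE
migrated = drop_locale(legacy_flags)

# CPython itself refuses re.LOCALE on a str pattern (Python 3.6+).
try:
    re.compile(r"\w+", legacy_flags)
    rejected = False
except ValueError:
    rejected = True

# After migration the pattern compiles with default Unicode semantics.
pattern = re.compile(r"\w+", migrated)
print(rejected, pattern.fullmatch("caf\u00e9") is not None)
```

The migrated mask can then be passed to a flags keyword without triggering the ValueError described above.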
re.DEBUG
The Rust regex crate does not have an equivalent of re’s
debug print. file_re ignores re.DEBUG and emits a
UserWarning so the loss of behavior is visible:
```text
UserWarning: re.DEBUG is not supported by file_re and is ignored.
```
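The three divergences can be summarized in a small translation sketch. This is not file_re's actual implementation: translate_flags and the module-level guard are hypothetical, and the dictionary keys borrow RegexBuilder method names on the assumption that each Python flag maps to the builder option of the same meaning. Only the re.ASCII to unicode(false) mapping and the warning and error messages come from this page.

```python
import re
import warnings

_ascii_warned = False  # guard so the re.ASCII warning fires once per process


def translate_flags(flags: int) -> dict:
    """Hypothetical translation of an re flag mask into builder options."""
    global _ascii_warned

    # re.LOCALE has no Rust equivalent, so it is rejected outright.
    if flags & re.LOCALE:
        raise ValueError("re.LOCALE is not supported by file_re.")

    # re.DEBUG is dropped, but with a visible warning.
    if flags & re.DEBUG:
        warnings.warn(
            "re.DEBUG is not supported by file_re and is ignored.",
            UserWarning,
            stacklevel=2,
        )

    # re.ASCII maps to unicode(false); warn only on first use.
    if flags & re.ASCII and not _ascii_warned:
        _ascii_warned = True
        warnings.warn(
            "re.ASCII in file_re disables Unicode character class "
            "matching entirely (Rust regex semantics); this is broader "
            "than Python's re.ASCII.",
            UserWarning,
            stacklevel=2,
        )

    # re.UNICODE needs no entry: Unicode mode is already the default.
    return {
        "case_insensitive": bool(flags & re.IGNORECASE),
        "multi_line": bool(flags & re.MULTILINE),
        "dot_matches_new_line": bool(flags & re.DOTALL),
        "ignore_whitespace": bool(flags & re.VERBOSE),
        "unicode": not (flags & re.ASCII),
    }


options = translate_flags(re.IGNORECASE | re.MULTILINE)
print(options)
```

Keeping the rejection, the warnings, and the option mapping in one function means every entry point that accepts flags sees the same behavior.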
Inline directives
Inline flags such as (?i), (?m), (?s), and (?x) are
parsed by the Rust regex crate directly and are always supported,
regardless of the flags keyword. The underlying compiler uses
RegexBuilder, so the keyword flags set the defaults for the whole
pattern, and an inline directive can then switch an option on or off
for the part of the pattern it covers.
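The relationship between keyword flags and inline directives can be illustrated with the standard library, since re uses the same (?i) and (?m) notation. This sketch does not exercise file_re; that file_re returns the same matches for these simple directives is an assumption, and the two engines differ in other syntax details.

```python
import re

log = "Error: disk full\nok\nERROR: timeout"

# Keyword form: both options supplied through flags.
by_keyword = re.findall(
    r"^error:.*$", log, flags=re.IGNORECASE | re.MULTILINE
)

# Mixed form: MULTILINE from the keyword, case-insensitivity from (?i).
mixed_form = re.findall(r"(?i)^error:.*$", log, flags=re.MULTILINE)

# Fully inline form: no flags keyword at all.
inline_form = re.findall(r"(?im)^error:.*$", log)

print(by_keyword == mixed_form == inline_form)
```

Putting the directives in the pattern itself keeps the pattern portable between call sites that do and do not pass flags.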