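Do Python raw strings disable the \w/\d metacharacters? Explaining the difference between two re.findall statements

Common questions about Python raw strings and regex metacharacters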

Hey there! Here are answers to two common questions about Python raw strings and regex metacharacters. They're easy to mix up when you're just starting out, so let's clear them up.

1. Do raw strings disable regex metacharacters like \w and \d, the same way they disable escape sequences like \n?

Short answer: Absolutely not.

Raw strings (the ones prefixed with r) only affect how Python parses backslashes in a string literal. They don't change how the regex engine interprets the resulting string. Let's separate the two kinds of "escaping" here to avoid confusion:

  • Python string-level escaping: Sequences like \n (newline), \t (tab), or \" (double quote) are defined by Python. When you don’t use a raw string, Python converts these sequences into their corresponding special characters.
  • Regex engine-level escaping: Sequences like \d (match digits) or \w (match letters/numbers/underscores) are defined by the regex engine. The backslash here is for the regex engine, not Python.

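Since \d and \w aren't recognized Python escape sequences, Python leaves the backslash in place. Whether you use a raw string or not, the regex engine receives the exact sequence \d or \w and recognizes it as a metacharacter like usual. One caveat: unrecognized escapes in non-raw literals are deprecated. Python 3.6+ issues a DeprecationWarning for them and Python 3.12+ a SyntaxWarning, so the raw form is the safer habit.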
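To see the two levels separately, here is a small sketch. The sample strings and variable names are only illustrative. It first inspects what Python stores for each literal, then hands a raw pattern to the regex engine:

```python
import re

# Python-level escape: "\n" is ONE character (a newline),
# while the raw literal r"\n" is TWO characters (backslash, n).
newline = "\n"
raw_newline = r"\n"
print(len(newline), len(raw_newline))  # 1 2

# \d is not a Python escape, so the literal simply holds backslash + d.
digit_pattern = r"\d"
print(list(digit_pattern))  # ['\\', 'd']

# The regex engine still sees \d and \w and treats them as metacharacters.
digits = re.findall(digit_pattern, "a1b2")
words = re.findall(r"\w+", "hi there_1!")
print(digits)  # ['1', '2']
print(words)   # ['hi', 'there_1']
```

The raw prefix changed what Python stored for \n, but it did nothing to stop \d and \w from working as regex metacharacters.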

2. Why do re.findall(r"\d+", "i am aged 35") and re.findall("\d+", "i am aged 35") give the same result?

This ties directly into the first explanation. Let’s walk through each case:

  • When you write "\d+": Python checks the string for valid escape sequences. Since \d isn’t one of them, Python leaves it as-is, passing the string "\d+" (characters \, d, +) straight to the regex engine.
  • When you write r"\d+": The r prefix tells Python to ignore all escape processing. So Python also passes the exact same string "\d+" to the regex engine.

Since the regex engine gets identical input in both cases, it produces the same result: both calls return ['35'] for your example.
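A quick way to check this yourself is to compare the two pattern strings before they ever reach the regex engine. In this sketch the non-raw literal is built with eval from its source text. That is just a device to keep the example quiet on newer Python versions, which warn about the unrecognized escape \d. Typing "\d+" directly in a script yields the same string.

```python
import re
import warnings

text = "i am aged 35"

# Evaluate the source text "\d+" as an ordinary (non-raw) string literal,
# suppressing the warning newer Pythons emit for the unrecognized escape.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    non_raw = eval(r'"\d+"')

raw = r"\d+"

print(non_raw == raw)  # True
print(list(raw))       # ['\\', 'd', '+']

non_raw_result = re.findall(non_raw, text)
raw_result = re.findall(raw, text)
print(non_raw_result)  # ['35']
print(raw_result)      # ['35']
```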

When do raw strings matter for regex?

Raw strings shine when you need to match literal backslashes or avoid Python accidentally interpreting your regex escape as a string escape. For example:

  • To match a literal backslash in text such as a\b (written as the Python literal "a\\b", because a bare "a\b" would contain a backspace character):
    • Without raw strings: You'd need to write re.findall("\\\\", "a\\b"). That is four backslashes in the pattern: Python turns each pair into one backslash, leaving two, and the regex engine reads those two as one literal backslash.
    • With raw strings: Just write re.findall(r"\\", "a\\b"), which is cleaner and easier to read.
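The backslash counting is easier to trust once you see it run. A minimal sketch (variable names are illustrative):

```python
import re

subject = "a\\b"   # three characters: a, backslash, b
print(len(subject))  # 3

non_raw_pattern = "\\\\"  # Python stores two backslashes
raw_pattern = r"\\"       # the same two backslashes, written without doubling

print(non_raw_pattern == raw_pattern)  # True
print(len(raw_pattern))                # 2

# The regex engine reads the two stored backslashes as ONE literal backslash.
matches = re.findall(raw_pattern, subject)
print(matches)          # ['\\']  (a list holding a single backslash character)
print(len(matches[0]))  # 1
```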

Another example: If you wanted to match the literal sequence \n (not a newline character), raw strings let you write r"\\n" instead of the messy "\\\\n".
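The sketch below checks the literal \n case and adds one more illustration of my own that is not in the original question: \b, which Python treats as a backspace character in a normal string but which the regex engine treats as a word boundary. The sample text is made up for the demo.

```python
import re

# The text holds the two characters backslash + n, and also one real newline.
text = "line1\\nline2\nend"

literal_hits = re.findall(r"\\n", text)  # regex \\n: a backslash followed by n
newline_hits = re.findall(r"\n", text)   # regex \n: a real newline character
print(literal_hits)  # ['\\n']  (two characters: backslash, n)
print(newline_hits)  # ['\n']   (one character: newline)

# \b clash: in a non-raw literal Python converts \b to a backspace (0x08)
# before the regex engine ever sees it, so the word-boundary meaning is lost.
non_raw_hits = re.findall("\bcat\b", "cat concat")
raw_hits = re.findall(r"\bcat\b", "cat concat")
print(non_raw_hits)  # []
print(raw_hits)      # ['cat']
```

This is the situation where the r prefix actually changes behavior: \b and \n are valid Python escapes, so without the prefix Python rewrites them before the regex engine gets a chance.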


The question in this content comes from Stack Exchange; it was asked by Pankaj Kulkarni.
