How to extract the substring between two given marker strings in Python

Ahua AIGC Lab (阿华AIGC实验室)

2026-5-25

Extract Substring Between Two Markers in Python

Hey there! Extracting text between two specific markers is a super common task in Python, and there are a few straightforward ways to pull it off depending on your needs. Let’s walk through them using your example:

Original string: "hey I'm using python now"
Desired output: "I'm using python"

Method 1: Using `find()` + String Slicing

This is the most intuitive approach for simple, non-overlapping markers. We’ll locate the positions of both markers, then slice the string between them:

original_str = "hey I'm using python now"
start_marker = "hey"
end_marker = "now"

# Find the end index of the start marker
start_pos = original_str.find(start_marker) + len(start_marker)
# Find the start index of the end marker (starting from start_pos to skip earlier matches)
end_pos = original_str.find(end_marker, start_pos)

# Extract and clean up the substring (strip removes extra whitespace)
result = original_str[start_pos:end_pos].strip()
print(result)  # Output: I'm using python

Pro Tip: Handle Edge Cases

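What if one of the markers is missing? `find()` returns -1 in that case instead of raising an error, so the slice above would silently return the wrong text. Check for -1 explicitly: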

original_str = "hey I'm using python now"
start_marker = "hey"
end_marker = "now"

start_pos = original_str.find(start_marker)
if start_pos == -1:
    print("Start marker not found in the string!")
else:
    start_pos += len(start_marker)
    end_pos = original_str.find(end_marker, start_pos)
    if end_pos == -1:
        print("End marker not found after the start marker!")
    else:
        result = original_str[start_pos:end_pos].strip()
        print(result)
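
If you need this in more than one place, the checks above can be wrapped in a small function. This is only a sketch: the name `extract_between` and the choice to return `None` when a marker is missing are assumptions for illustration, not part of the original question.

```python
from typing import Optional


def extract_between(text: str, start_marker: str, end_marker: str) -> Optional[str]:
    """Return the stripped text between the first start_marker and the
    next end_marker, or None if either marker is missing.

    Hypothetical helper: the name and the None convention are assumptions.
    """
    start_pos = text.find(start_marker)
    if start_pos == -1:
        return None  # start marker not found
    start_pos += len(start_marker)  # move past the start marker

    # Search for the end marker only after the start marker
    end_pos = text.find(end_marker, start_pos)
    if end_pos == -1:
        return None  # end marker not found after the start marker
    return text[start_pos:end_pos].strip()


print(extract_between("hey I'm using python now", "hey", "now"))  # I'm using python
print(extract_between("I'm using python now", "hey", "now"))      # None
```

Returning `None` lets the caller decide how to react; raising `ValueError` would be an equally reasonable design if a missing marker should count as a bug.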

Method 2: Using `split()`

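If you prefer a concise one-liner, string splitting works well for input you know contains both markers. Be aware that it raises an IndexError when the start marker is missing, and silently returns everything after the start marker when the end marker is missing: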

original_str = "hey I'm using python now"
result = original_str.split("hey")[1].split("now")[0].strip()
print(result)  # Output: I'm using python

Here’s the breakdown:

`split("hey")` splits the string into `["", " I'm using python now"]`.
We take the second element (`[1]`), then split it again on `"now"` to get `[" I'm using python ", ""]`.
Grabbing the first element (`[0]`) and stripping whitespace gives us the desired text.
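
If you like the splitting style but want to detect missing markers, `str.partition()` is a close relative worth knowing. It always returns a three-part tuple and leaves the separator slot empty when the marker was not found. The sketch below assumes a hypothetical helper name, `between_partition`, and returns `None` for a missing marker.

```python
def between_partition(text, start_marker, end_marker):
    """Hypothetical helper: text between two markers via str.partition()."""
    # partition() returns (before, separator, after); the separator is an
    # empty string when the marker does not occur in the text.
    _, found_start, rest = text.partition(start_marker)
    if not found_start:
        return None  # start marker missing

    middle, found_end, _ = rest.partition(end_marker)
    if not found_end:
        return None  # end marker missing after the start marker
    return middle.strip()


print(between_partition("hey I'm using python now", "hey", "now"))  # I'm using python
print(between_partition("hey there", "hey", "now"))                 # None
```

Like `find()`, `partition()` only looks at the first occurrence of each marker, so the result matches Method 1.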

Method 3: Regular Expressions (For Complex Cases)

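Regex is the better fit when you need flexible matching, such as collecting every occurrence or tolerating variable text around the markers. Note that markers containing regex metacharacters (such as `.`, `(`, or `[`) must be passed through `re.escape()` first, and that `.` does not match newlines unless you pass `re.DOTALL`. Use `re.search()` to capture the text between the markers: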

import re

original_str = "hey I'm using python now"
# The parentheses () capture the text between "hey" and "now"
match = re.search(r'hey(.*)now', original_str)
if match:
    result = match.group(1).strip()
    print(result)  # Output: I'm using python
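
When the markers come from variables or contain characters that mean something in regex syntax, it is safer to build the pattern with `re.escape()`. The example below is a sketch with made-up markers (`[start]` and `[end]`) and a hypothetical helper name. Without escaping, `[start]` would be read as a character class rather than literal text.

```python
import re


def extract_between_regex(text, start_marker, end_marker):
    """Hypothetical helper: regex extraction with literal (escaped) markers."""
    # re.escape() makes each marker match literally.
    # (.*?) captures as little text as possible between the two markers.
    pattern = re.escape(start_marker) + r"(.*?)" + re.escape(end_marker)
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


sample = "price: [start] 42 USD (approx.) [end] trailing"
print(extract_between_regex(sample, "[start]", "[end]"))  # 42 USD (approx.)
```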

Non-Greedy Matching

By default `.*` is greedy, so when the end marker appears several times it runs all the way to the last one. Use the non-greedy `.*?` to stop at the first occurrence:

original_str = "hey first part now hey second part now"
match = re.search(r'hey(.*?)now', original_str)
if match:
    result = match.group(1).strip()
    print(result)  # Output: first part
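
Two related cases come up often: collecting every marked section rather than just the first, and text that spans several lines. The sketch below uses `re.findall()` for the former and the `re.DOTALL` flag for the latter; the sample strings are invented for illustration.

```python
import re

text = "hey first part now hey second part now"

# findall() returns the captured group for every non-overlapping match
parts = [part.strip() for part in re.findall(r"hey(.*?)now", text)]
print(parts)  # ['first part', 'second part']

multiline = "hey line one\nline two now"

# Without re.DOTALL, "." does not match the newline, so nothing is found
print(re.search(r"hey(.*?)now", multiline))  # None

# With re.DOTALL, "." also matches newlines
match = re.search(r"hey(.*?)now", multiline, re.DOTALL)
spanning = match.group(1).strip()
print(repr(spanning))  # 'line one\nline two'
```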

The question behind this article comes from Stack Exchange; it was asked by Fred F.