How do you extract English number words from a jumbled string and convert them to Arabic numerals?

阿华AIGC实验室

2026-5-21

Solution to Extract Numbers from Jumbled String

Got it, let's build out your initial idea into a solid, working solution. The key here is to identify each target number word by a character that no other target word contains. This avoids conflicts when multiple words share common characters (like "E" in both "ONE" and "THREE").

Step-by-Step Approach

First, map each target number to its word, and pick for each word a character that appears in no other target word (this is critical for accuracy):
- "ONE" is identified by N (O does not work, because "TWO" also contains O)
- "TWO" is identified by W
- "THREE" is identified by H
Next, count the occurrences of each character in your input string. This lets us easily track and deduct characters as we match words.
Iterate through the number-word list: if a word's unique character is present in the count dictionary, record the corresponding number, then subtract each character of the word from the counts.
Finally, sort the collected numbers and output them as space-separated digits.
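
The choice of identifying characters can be checked rather than trusted by eye. The helper below is only an illustrative sketch (the name unique_letters is hypothetical, not part of the original answer): for each word it lists the letters that occur in no other word. For these three words it shows that O is shared by "ONE" and "TWO", while N, W, and H each belong to a single word.

```python
def unique_letters(words):
    """Map each word to the sorted letters that appear in no other word."""
    result = {}
    for word in words:
        # Collect every letter used by the other words
        others = set()
        for other in words:
            if other != word:
                others.update(other)
        # Keep only the letters this word does not share
        result[word] = sorted(set(word) - others)
    return result


print(unique_letters(["ONE", "TWO", "THREE"]))
# {'ONE': ['N'], 'TWO': ['W'], 'THREE': ['H', 'R']}
```

A word with an empty list would have no safe identifier in that word set, and a different strategy would be needed for it.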

Example Code (Python)

def extract_numbers(jumbled_str):
    # Target numbers and their words
    num_word_map = [
        (1, "ONE"),
        (2, "TWO"),
        (3, "THREE")
    ]
    # Count character occurrences in the input string
    char_count = {}
    for char in jumbled_str:
        char_count[char] = char_count.get(char, 0) + 1

    result = []
    # A character that appears in exactly one target word.
    # ONE uses 'N' rather than 'O', because 'O' also appears in TWO.
    unique_chars = {1: 'N', 2: 'W', 3: 'H'}

    for num, word in num_word_map:
        if char_count.get(unique_chars[num], 0) > 0:
            result.append(num)
            # Deduct each character of the matched word from the count.
            # .get() avoids a KeyError when the input has fewer letters than
            # the matched words need (the test string below shares letters).
            for c in word:
                char_count[c] = char_count.get(c, 0) - 1
    # Sort to ensure the output is in 1 2 3 order
    result.sort()
    return ' '.join(map(str, result))

# Test with your input string
input_str = "OTNWEHRE"
print(extract_numbers(input_str))  # Output: 1 2 3

Why This Works

Using character counts is more efficient than repeatedly scanning the original string: we process each input character once up front, and every later check is a dictionary lookup.
Matching on a character that belongs to a single word means shared characters never count as evidence for the wrong word. For example, if we treated O as the marker for "ONE", the input "TWO" would be wrongly reported as containing 1, because "TWO" also has an O. N occurs only in "ONE", so no other word can trigger it.
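
The question does not say whether a word can occur more than once in the input. If it can, the count of the identifying character gives the number of copies. The following is a minimal sketch under that assumption; extract_all and WORDS are hypothetical names, and the identifiers N, W, and H are the ones discussed above.

```python
from collections import Counter

# (digit, word, letter that no other target word contains)
WORDS = [(1, "ONE", "N"), (2, "TWO", "W"), (3, "THREE", "H")]


def extract_all(jumbled_str):
    """Return every digit whose word occurs, once per occurrence."""
    counts = Counter(jumbled_str)
    digits = []
    for digit, word, marker in WORDS:
        copies = counts[marker]  # Counter returns 0 for missing letters
        digits.extend([digit] * copies)
        # Remove the letters consumed by those copies of the word
        counts.subtract(Counter(word * copies))
    return " ".join(map(str, sorted(digits)))


print(extract_all("OWTENOENO"))  # Output: 1 1 2
```

Here the input is a shuffle of "ONE", "ONE", and "TWO", so N appears twice and W once.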

The question comes from Stack Exchange; it was asked by Amr4AOT.