如何用Python基于Excel B列单词高亮A列段落中的对应文本?
Your current code applies the font style to the entire cell because you're setting a2.font = ft—this changes the font for all text in the cell. To highlight only the target word(s), you need to use rich text formatting in openpyxl, which lets you apply styles to individual parts of the cell's content.
Here's the corrected code that handles case-insensitive matches and highlights all occurrences of the target word in each corresponding row (A column paragraph with B column word):
from openpyxl import load_workbook from openpyxl.styles import Font, colors from openpyxl.cell.text import RichText import re # Load the workbook and active worksheet wb = load_workbook('sample_sentence.xlsx') ws = wb.active # Define the highlight style highlight_font = Font(color=colors.GREEN, bold=True) # Iterate over each row (assuming no headers; adjust min_row if you have headers) for row in ws.iter_rows(min_row=1, max_col=2, values_only=False): a_cell = row[0] # Column A: paragraph text b_cell = row[1] # Column B: target word to highlight # Skip rows where either cell is empty if not a_cell.value or not b_cell.value: continue original_text = a_cell.value target_word = b_cell.value # Prepare segments for rich text rich_text_segments = [] prev_position = 0 # Find all case-insensitive occurrences of the target word # Use re.escape to handle special regex characters in the target word for match in re.finditer(re.escape(target_word), original_text, flags=re.IGNORECASE): # Add the text before the match (uses default font) rich_text_segments.append(original_text[prev_position:match.start()]) # Add the matched word with highlight font rich_text_segments.append((match.group(), highlight_font)) # Update the previous position to the end of the match prev_position = match.end() # Add any remaining text after the last match rich_text_segments.append(original_text[prev_position:]) # Assign the rich text to the cell a_cell.value = RichText(*rich_text_segments) # Save the modified workbook wb.save('sample_sentence_highlighted.xlsx')
Key Improvements:
- RichText Usage: Instead of setting the entire cell's font, we build a
RichTextobject that combines normal text segments and highlighted word segments. - Case-Insensitive Matching: Uses
re.IGNORECASEto find matches regardless of uppercase/lowercase differences. - Handle Multiple Occurrences: Highlights every instance of the target word in the paragraph, not just the first one.
- Escape Special Characters: Uses
re.escape()to ensure the target word is treated as literal text (even if it contains regex symbols like.or*).
If You Need to Highlight All B Column Words in All A Column Paragraphs:
If your requirement is to highlight every word from column B in every paragraph in column A (not just row-wise), modify the loop like this:
# Collect all target words from column B (skip empty cells) target_words = [cell.value for cell in ws['B'] if cell.value] for a_cell in ws['A']: if not a_cell.value: continue original_text = a_cell.value rich_text_segments = [] prev_position = 0 # Sort words by length descending to avoid partial matches (e.g., "pest" before "pes") sorted_words = sorted(target_words, key=len, reverse=True) pattern = re.compile('|'.join(re.escape(word) for word in sorted_words), flags=re.IGNORECASE) for match in pattern.finditer(original_text): rich_text_segments.append(original_text[prev_position:match.start()]) rich_text_segments.append((match.group(), highlight_font)) prev_position = match.end() rich_text_segments.append(original_text[prev_position:]) a_cell.value = RichText(*rich_text_segments)
This version collects all words from column B, sorts them by length (to prevent shorter words from matching parts of longer ones), and highlights every occurrence of any of those words in each column A cell.
内容的提问来源于stack exchange,提问作者Abdul Raoof




