How do I extract specific text from a PDF by coordinates? Help with Python script development
Got it, let's tackle this reverse problem: you need to pull text from a specific coordinate area in a PDF, and the tools you've tried so far only go the other way (finding coordinates from text). Good news: PyMuPDF (fitz) has exactly what you need; you just need a different method than searchFor() (named search_for() in current PyMuPDF releases).
The Fix: Use page.get_text("words")
This method returns every word on the page along with its bounding box coordinates. Each entry is a tuple formatted like:
(x0, y0, x1, y1, text, block_no, line_no, word_no)
Where:
- x0, y0: Top-left corner of the word's bounding box
- x1, y1: Bottom-right corner
- text: The actual word itself
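To make the tuple layout concrete, here is a small sketch that wraps one entry in a named tuple so the fields can be read by name instead of by index. The Word type and the sample values are made up for illustration; only the field order is taken from the description above.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical wrapper; field order mirrors one get_text("words") entry."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str
    block_no: int
    line_no: int
    word_no: int


# Made-up sample entry in the same shape as one item of page.get_text("words")
raw = (90.0, 145.85, 142.13, 156.50, "Invoice", 0, 0, 0)
word = Word(*raw)

# Width and height of the word's bounding box
width = word.x1 - word.x0
height = word.y1 - word.y0
print(word.text, round(width, 2), round(height, 2))
```

Because a named tuple is still a tuple, index access such as word[4] keeps working alongside word.text.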
Example Implementation
Here's how to use it to extract text from a target coordinate region:
    import fitz  # PyMuPDF

    # Load your PDF
    doc = fitz.open("pdf_name.pdf")
    page = doc.load_page(0)  # Load first page (index 0)

    # Define your target coordinate area (adjust these values to your needs)
    # Let's use the sample coordinates you mentioned: (90.0, 145.85) to (142.13, 156.50)
    target_rect = fitz.Rect(90.0, 145.85, 142.13, 156.50)

    # Get all words with their coordinates
    all_words = page.get_text("words")

    # Collect words that lie inside or overlap with the target rectangle
    extracted_text = []
    for word in all_words:
        word_rect = fitz.Rect(word[0], word[1], word[2], word[3])
        # Check if the word's rectangle intersects with our target area
        if word_rect.intersects(target_rect):
            extracted_text.append(word[4])

    # Join the words into a single string
    final_text = " ".join(extracted_text)
    print(final_text)
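If the target area spans more than one line, joining everything with a single space flattens the line breaks. The sketch below is one possible way to keep them, assuming the block_no, line_no and word_no fields reflect the reading order you want (that depends on how the PDF was produced). The join_by_line helper and the sample data are hypothetical, not part of PyMuPDF.

```python
from itertools import groupby


def join_by_line(words):
    """Join word tuples into text, one output line per (block_no, line_no).

    `words` is a list of tuples shaped like the entries of
    page.get_text("words"): (x0, y0, x1, y1, text, block_no, line_no, word_no).
    Illustrative helper only; it is not a PyMuPDF function.
    """
    # Sort by block, line, then word index to restore reading order
    ordered = sorted(words, key=lambda w: (w[5], w[6], w[7]))
    lines = []
    for _, group in groupby(ordered, key=lambda w: (w[5], w[6])):
        lines.append(" ".join(w[4] for w in group))
    return "\n".join(lines)


# Made-up sample data, deliberately out of order
sample = [
    (130.0, 160.0, 170.0, 170.0, "due", 0, 1, 1),
    (90.0, 145.0, 120.0, 156.0, "Invoice", 0, 0, 0),
    (90.0, 160.0, 125.0, 170.0, "Amount", 0, 1, 0),
    (125.0, 145.0, 150.0, 156.0, "42", 0, 0, 1),
]
print(join_by_line(sample))
```

In the script above, you would collect the matching word tuples themselves (rather than only word[4]) and pass that list to the helper.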
Customize the Matching Logic
Depending on your needs, you can tweak how you check the word's position:
- Use word_rect in target_rect if you want only words fully contained within the target area
- Use target_rect in word_rect if you want words that fully contain the target area (useful if your target is a small point inside a larger text block)
- intersects() works for partial overlaps, which is often the most flexible option
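To see how the three checks differ, here is a rough sketch that re-implements them with plain tuple comparisons, so it runs without a PDF. The match_words function, its mode names and the sample words are assumptions made for illustration; PyMuPDF's own Rect methods may treat edge cases (touching edges, empty rectangles) differently.

```python
def match_words(words, target, mode="intersects"):
    """Return the texts of word tuples that match `target` = (x0, y0, x1, y1).

    mode: "intersects" (any overlap), "inside" (word fully within target),
    or "covers" (word fully contains target).
    """
    tx0, ty0, tx1, ty1 = target
    out = []
    for x0, y0, x1, y1, text, *_ in words:
        if mode == "inside":
            hit = tx0 <= x0 and ty0 <= y0 and x1 <= tx1 and y1 <= ty1
        elif mode == "covers":
            hit = x0 <= tx0 and y0 <= ty0 and tx1 <= x1 and ty1 <= y1
        elif mode == "intersects":
            # Overlap with non-zero extent on both axes
            hit = x0 < tx1 and tx0 < x1 and y0 < ty1 and ty0 < y1
        else:
            raise ValueError(f"unknown mode: {mode}")
        if hit:
            out.append(text)
    return out


# Made-up words around the sample target area
target = (90.0, 145.85, 142.13, 156.50)
words = [
    (92.0, 146.0, 120.0, 156.0, "Total", 0, 0, 0),    # fully inside the target
    (125.0, 146.0, 160.0, 156.0, "amount", 0, 0, 1),  # sticks out on the right
    (200.0, 146.0, 230.0, 156.0, "EUR", 0, 0, 2),     # completely outside
    (80.0, 140.0, 150.0, 160.0, "Header", 0, 0, 3),   # larger than the target
]
for mode in ("intersects", "inside", "covers"):
    print(mode, match_words(words, target, mode))
```

The "inside" mode is the strictest, so a target rectangle that is slightly too tight can return nothing; widening it by a point or two, or falling back to "intersects", is a common workaround.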
Why This Works
Compared with PyPDF2 or pdfminer.six, PyMuPDF makes the spatial data of every text element easy to reach: a single get_text("words") call returns each word with its bounding box. By iterating through all words and comparing their bounding boxes to your target coordinates, you can extract exactly the text that falls in the area you care about.
The question originates from Stack Exchange; asked by Damiano Shehaj.




