YOLO Object Detection: Help Converting Mark Jay's Webcam Detection Code to Windows Window Detection
Hey Tobias, no worries—let's get your modified YOLO code working for detecting objects in a Windows window (like pygta5) instead of a webcam. I'll break this down into simple, actionable steps since you're new to this.
First: Understand the Core Change
Your original code uses cv2.VideoCapture(0) to pull frames from a webcam. We need to replace that with code that captures frames directly from a specific Windows window. The easiest way to do this (for beginners) is using the mss library for fast screenshots, plus pygetwindow to locate your pygta5 window.
Step 1: Install Required Libraries
First, install the tools we'll need if you haven't already:
pip install mss pygetwindow opencv-python ultralytics
(Note: I'm assuming you're using Ultralytics YOLO. If your code is built on a different YOLO wrapper, keep your existing model-loading and prediction calls and only swap in the capture part below.)
Step 2: Rewrite the Capture & Detection Code
Here's a modified version of the code that targets your pygta5 window, with comments explaining each change:
import sys

import cv2
import mss
import numpy as np
import pygetwindow as gw
from ultralytics import YOLO

# Load your YOLO model (use the same one from Mark Jay's tutorial)
model = YOLO('yolov8n.pt')  # Swap this with your custom model if you have one

# Locate the pygta5 window by its title
try:
    # Replace 'pygta5' with text from your window's actual title (check taskbar hover text!)
    window = gw.getWindowsWithTitle('pygta5')[0]
except IndexError:
    print("Error: Couldn't find the pygta5 window!")
    print("Tip: Use gw.getAllTitles() to list all open window titles and find the correct one.")
    sys.exit(1)

# Set up mss for fast window screenshots
sct = mss.mss()

# Define the area to capture (matches the window's position/size)
monitor = {
    "top": window.top,
    "left": window.left,
    "width": window.width,
    "height": window.height,
}

# Main detection loop
while True:
    # Capture a screenshot of the window
    sct_screenshot = sct.grab(monitor)

    # Convert the screenshot to a format YOLO/OpenCV can use
    # mss returns BGRA; drop the alpha channel to get the 3-channel BGR frame OpenCV expects
    frame = np.array(sct_screenshot)
    frame = cv2.cvtColor(frame, cv2.COLOR_BGRA2BGR)

    # Run YOLO detection (same as your original code)
    results = model(frame)

    # Draw detection boxes/labels on the frame
    annotated_frame = results[0].plot()

    # Show the result window
    cv2.imshow('YOLO Window Detection', annotated_frame)

    # Exit when you press 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up resources when done
sct.close()
cv2.destroyAllWindows()
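One limitation of the loop above: the capture region is read once before the loop starts, so it goes stale if you move or resize the window while the script runs. Here is a minimal sketch of a helper that rebuilds the region on demand. The name region_from_window is hypothetical, and the sketch assumes pygetwindow's left/top/width/height and isMinimized properties reflect the window's current state each time they are read.

```python
from types import SimpleNamespace


def region_from_window(win):
    """Build an mss capture region from an object with left/top/width/height.

    Returns None when the window has no usable area, e.g. while minimized.
    """
    # isMinimized is read defensively so plain test objects work too.
    if getattr(win, "isMinimized", False):
        return None
    if win.width <= 0 or win.height <= 0:
        return None
    return {
        "left": int(win.left),
        "top": int(win.top),
        "width": int(win.width),
        "height": int(win.height),
    }


# Stand-in for a pygetwindow window so the helper can be tried anywhere.
fake_window = SimpleNamespace(left=100, top=50, width=800, height=600, isMinimized=False)
print(region_from_window(fake_window))
```

Inside the detection loop you could call monitor = region_from_window(window) at the top of each iteration and skip the frame when it returns None, instead of building the monitor dict once up front.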
Step 3: Fix Common Errors You Might Be Seeing
If you're getting errors, here's how to troubleshoot the most common issues:
- "Window not found" error:
  - The title text you used doesn't match any open window. Run print(gw.getAllTitles()) to see all open window titles, then copy-paste the one for pygta5.
- Black/garbled frame, or YOLO isn't detecting anything:
  - Make sure the pygta5 window is visible (not minimized or covered by another window). mss can only capture what's on screen.
  - Double-check the color conversion step: mss captures in BGRA format, so use cv2.COLOR_BGRA2BGR to drop the alpha channel and hand YOLO the 3-channel BGR frame it expects. Using cv2.COLOR_RGBA2BGR here swaps the red and blue channels, which can hurt detection.
- Slow performance:
  - Try using a smaller YOLO model (like yolov8n.pt instead of yolov8x.pt) or resize the pygta5 window to a smaller size.
  - Add verbose=False to the model call: results = model(frame, verbose=False) to reduce console output overhead.
- Detection boxes are offset:
  - Some Windows apps have a title bar or border that's included in the window's top/left coordinates. If boxes are off, you can adjust the monitor values (e.g., top: window.top + 30 to skip the title bar; adjust the number based on your system's title bar height).
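For the window lookup, one more thing to watch: as far as I know, gw.getWindowsWithTitle does a substring match, so an editor or browser tab with "pygta5" in its title can be picked up instead of the game window. Below is a rough sketch of a stricter lookup that prefers an exact title and only then falls back to a partial match. Both pick_title and find_window are hypothetical helper names, not part of pygetwindow.

```python
def pick_title(titles, wanted):
    """Return the best title for `wanted`, or None if nothing matches.

    An exact (case-insensitive) match wins; otherwise the first title
    that merely contains `wanted` is returned.
    """
    wanted_lower = wanted.strip().lower()
    candidates = [t for t in titles if t.strip()]  # skip blank titles
    for title in candidates:
        if title.strip().lower() == wanted_lower:
            return title
    for title in candidates:
        if wanted_lower in title.lower():
            return title
    return None


def find_window(wanted):
    """Look up the window with pygetwindow (imported lazily, Windows only)."""
    import pygetwindow as gw

    title = pick_title(gw.getAllTitles(), wanted)
    if title is None:
        return None
    # Keep only windows whose full title equals the one we picked.
    matches = [w for w in gw.getWindowsWithTitle(title) if w.title == title]
    return matches[0] if matches else None
```

You would then use window = find_window('pygta5') and treat a None result the same way as the IndexError case in the main script.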
Optional: More Precise Window Capture (For Advanced Cases)
If pygetwindow isn't giving you accurate coordinates (e.g., for borderless windows), you can use pywin32 to get the client area (the actual content of the window, excluding title bars/borders). But this is more complex—stick with mss + pygetwindow first since it's simpler for beginners.
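If you do end up needing the client area, here is a minimal sketch of what that could look like with pywin32 (pip install pywin32). It assumes win32gui.FindWindow is given the window's exact title and returns 0 when nothing is found, which may differ between pywin32 versions. The helper names client_region and get_client_region are hypothetical, and DPI scaling is not handled.

```python
def client_region(origin, client_rect):
    """Combine the client area's screen origin and size into an mss region.

    origin      -- (x, y) of the client area's top-left corner in screen coordinates
    client_rect -- (left, top, right, bottom) as returned by GetClientRect
    """
    x, y = origin
    left, top, right, bottom = client_rect
    return {"left": x, "top": y, "width": right - left, "height": bottom - top}


def get_client_region(title):
    """Windows only: find a window by exact title and return its client region."""
    import win32gui  # provided by the pywin32 package

    hwnd = win32gui.FindWindow(None, title)
    if not hwnd:
        return None
    # GetClientRect is relative to the client area, so left/top are 0.
    rect = win32gui.GetClientRect(hwnd)
    # Translate the client area's top-left corner into screen coordinates.
    origin = win32gui.ClientToScreen(hwnd, (rect[0], rect[1]))
    return client_region(origin, rect)
```

The dict returned by get_client_region can be passed to sct.grab() in place of the monitor dict, which removes the need to guess a title bar offset.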
Question source: Stack Exchange, asked by Tobias




