YOLO object detection: help adapting Mark Jay's webcam detection code to detect objects in a Windows window

Fixing YOLO Detection for a Windows Window (Like pygta5)

Hey Tobias, no worries. Let's get your modified YOLO code detecting objects in a Windows window (like pygta5) instead of a webcam feed. I'll break this down into simple, actionable steps since you're new to this.

First: Understand the Core Change

Your original code uses cv2.VideoCapture(0) to pull frames from a webcam. We need to replace that with code that grabs frames from the part of the screen your target window occupies. The easiest way to do this (for beginners) is the mss library for fast screenshots, plus pygetwindow to find where the pygta5 window sits on screen.
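To make the swap concrete, here is a minimal sketch of a stand-in for cv2.VideoCapture. The class name WindowCapture and the injected grab callable are my own hypothetical names, not part of OpenCV or mss. The idea is that the rest of your loop can keep calling read() exactly as before.

```python
from typing import Any, Callable, Optional, Tuple


class WindowCapture:
    """Hypothetical stand-in that mimics the parts of cv2.VideoCapture a loop uses.

    `grab` is any zero-argument callable that returns one frame,
    or None when the capture failed.
    """

    def __init__(self, grab: Callable[[], Optional[Any]]) -> None:
        self._grab = grab
        self._open = True

    def isOpened(self) -> bool:
        return self._open

    def read(self) -> Tuple[bool, Optional[Any]]:
        # Mirror cv2.VideoCapture.read(): (success flag, frame).
        if not self._open:
            return False, None
        frame = self._grab()
        return frame is not None, frame

    def release(self) -> None:
        self._open = False


# Real use (assumes mss, numpy and cv2 are installed and `monitor` is defined):
#   sct = mss.mss()
#   cap = WindowCapture(
#       lambda: cv2.cvtColor(np.array(sct.grab(monitor)), cv2.COLOR_BGRA2BGR)
#   )
#   ret, frame = cap.read()
```

Passing the grab function in keeps the class free of any screenshot library, so you can try it with a dummy callable before wiring in mss.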

Step 1: Install Required Libraries

First, install the tools we'll need if you haven't already:

pip install mss pygetwindow opencv-python ultralytics

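(Note: I'm assuming you're using Ultralytics YOLO. If your code follows a tutorial built on an older YOLO library, keep your own model-loading and prediction lines; only the frame-capture part needs to change.)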
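If you want to confirm the install worked before running anything heavier, a small check like the one below may help. It is only a sketch: find_missing and REQUIRED_MODULES are names I made up, and the list assumes the import names that go with the pip packages above (opencv-python is imported as cv2, and numpy comes in as a dependency).

```python
import importlib.util

# Import names for the pip packages above (opencv-python is imported as cv2).
REQUIRED_MODULES = ["mss", "pygetwindow", "cv2", "numpy", "ultralytics"]


def find_missing(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```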

Step 2: Rewrite the Capture & Detection Code

Here's a modified version of the code that targets your pygta5 window, with comments explaining each change:

import mss
import pygetwindow as gw
import cv2
import numpy as np
from ultralytics import YOLO

# Load your YOLO model (use the same one from Mark Jay's tutorial)
model = YOLO('yolov8n.pt')  # Swap this with your custom model if you have one

# Locate the pygta5 window - critical: use the EXACT window title
try:
    # Replace 'pygta5' with your window's actual title (check taskbar hover text!)
    window = gw.getWindowsWithTitle('pygta5')[0]
except IndexError:
    print("Error: Couldn't find the pygta5 window!")
    print("Tip: Use gw.getAllTitles() to list all open window titles and find the correct one.")
    raise SystemExit(1)  # stop with a non-zero exit code; works even where the exit() helper is unavailable

# Set up mss for fast window screenshots
sct = mss.mss()

# Define the area to capture (matches the window's position/size)
monitor = {
    "top": window.top,
    "left": window.left,
    "width": window.width,
    "height": window.height
}

# Main detection loop
while True:
    # Capture a screenshot of the window
    sct_screenshot = sct.grab(monitor)
    # Convert the screenshot to a format YOLO/OpenCV can use
    # mss returns 4-channel BGRA pixels; OpenCV and YOLO expect 3-channel BGR, so drop the alpha channel
    frame = np.array(sct_screenshot)
    frame = cv2.cvtColor(frame, cv2.COLOR_BGRA2BGR)

    # Run YOLO detection (same as your original code)
    results = model(frame)

    # Draw detection boxes/labels on the frame
    annotated_frame = results[0].plot()

    # Show the result window
    cv2.imshow('YOLO Window Detection', annotated_frame)

    # Exit when you press 'q'
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up resources when done
sct.close()
cv2.destroyAllWindows()
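One thing to be aware of: the script above builds the monitor dict once, so if you move or resize the window while it runs, the capture area goes stale. Below is a rough sketch of a helper that rebuilds the region on demand. The function name region_from_window and the inset_* parameters are hypothetical, and the isMinimized check assumes the window object exposes that attribute (pygetwindow windows do on Windows, as far as I know).

```python
from types import SimpleNamespace


def region_from_window(window, inset_top=0, inset_side=0, inset_bottom=0):
    """Build an mss-style region dict from an object with left/top/width/height.

    The inset_* parameters are hypothetical knobs for trimming the title bar
    and borders; sensible values depend on your Windows theme and scaling.
    Returns None when the window is minimized or the resulting area is empty.
    """
    if getattr(window, "isMinimized", False):
        return None
    width = window.width - 2 * inset_side
    height = window.height - inset_top - inset_bottom
    if width <= 0 or height <= 0:
        return None
    return {
        "top": window.top + inset_top,
        "left": window.left + inset_side,
        "width": width,
        "height": height,
    }


# Stand-in for a pygetwindow window so the helper can be tried without a GUI.
fake_window = SimpleNamespace(left=100, top=50, width=800, height=600)
print(region_from_window(fake_window, inset_top=30))
```

Inside the detection loop you could call monitor = region_from_window(window) at the top of each iteration and skip the frame when it returns None.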

Step 3: Fix Common Errors You Might Be Seeing

If you're getting errors, here's how to troubleshoot the most common issues:

  • "Window not found" error:
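    • No open window title contains the text you passed. Run print(gw.getAllTitles()) to see all open window titles, then copy-paste the one for pygta5. Also note that getWindowsWithTitle matches on part of the title, so if several windows match (e.g., an editor with pygta5 in its title), [0] may pick the wrong one.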
  • Black/garbled frame, or YOLO isn't detecting anything:
    • Make sure the pygta5 window is visible (not minimized or covered by another window). mss can only capture what's on screen.
    • Double-check the color conversion step: mss captures in BGRA format, so use cv2.COLOR_BGRA2BGR to drop the alpha channel. cv2.COLOR_RGBA2BGR would swap red and blue, which distorts the colors and can hurt detection accuracy.
  • Slow performance:
    • Try using a smaller YOLO model (like yolov8n.pt instead of yolov8x.pt) or resize the pygta5 window to a smaller size.
    • Add verbose=False to the model call: results = model(frame, verbose=False) to reduce console output overhead.
  • Detection boxes are offset:
    • Some Windows apps have a title bar or border that's included in the window's top/left coordinates. If boxes are off, you can adjust the monitor values (e.g., "top": window.top + 30 together with "height": window.height - 30 to skip the title bar; tune the number to your system's title bar height).
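For the "window not found" case, it can help to preview which titles would match before you commit to one. This is a small sketch with a made-up helper name (match_titles) and made-up sample titles. In real use you would pass gw.getAllTitles() as the first argument.

```python
def match_titles(titles, needle):
    """Return non-empty titles containing `needle`, ignoring case."""
    needle = needle.casefold()
    return [t for t in titles if t.strip() and needle in t.casefold()]


# Sample data standing in for gw.getAllTitles(); these titles are made up.
sample_titles = [
    "",
    "Grand Theft Auto V",
    "pygta5 - Visual Studio Code",
    "Task Manager",
]
print(match_titles(sample_titles, "PYGTA5"))
```

If more than one title comes back, pass a longer, more specific string to gw.getWindowsWithTitle so the first match is the window you actually want.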

Optional: More Precise Window Capture (For Advanced Cases)

If pygetwindow isn't giving you accurate coordinates (e.g., for borderless windows), you can use pywin32 to get the client area (the actual content of the window, excluding title bars/borders). This is more complex, so stick with mss + pygetwindow first since it's simpler for beginners.
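If you do end up going the pywin32 route, the rough shape is something like this. Treat it as a sketch under assumptions: it is Windows only, needs pip install pywin32, and the function names are mine. Unlike the pygetwindow lookup, FindWindow here needs the full window title, and I haven't accounted for display scaling.

```python
def client_area_to_region(client_rect, screen_origin):
    """Convert a client rect and its on-screen origin to an mss-style dict.

    client_rect: (left, top, right, bottom) in client coordinates.
    screen_origin: (x, y) screen position of the client area's top-left corner.
    """
    left, top, right, bottom = client_rect
    x, y = screen_origin
    return {"top": y, "left": x, "width": right - left, "height": bottom - top}


def get_client_region(title):
    """Look up a window by its full title and return its client-area region.

    Windows only; assumes pywin32 is installed. Returns None if no window
    with that title is found.
    """
    import win32gui  # imported lazily so the helper above works on any OS

    try:
        hwnd = win32gui.FindWindow(None, title)
    except Exception:
        # Depending on the pywin32 version, a failed lookup may raise.
        return None
    if not hwnd:
        return None
    client_rect = win32gui.GetClientRect(hwnd)
    screen_origin = win32gui.ClientToScreen(hwnd, (0, 0))
    return client_area_to_region(client_rect, screen_origin)
```

The dict it returns has the same keys as the monitor dict in Step 2, so it can be passed straight to sct.grab().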

This question comes from Stack Exchange; the asker is Tobias.
