BodyPix with TensorFlow.js: Labeling Body-Part Segmentation with Numbers/Letters Instead of Colors

阿华AIGC实验室

2026-5-12

How to Label Body Parts with Numbers/Letters Instead of Colors in BodyPix Video Overlays

You can replace BodyPix's colored segmentation masks with a number or letter for each body part by reading the raw part-segmentation data and drawing text directly on a canvas overlay. Here's a step-by-step guide with code examples:

1. Understand the BodyPix Part Segmentation Output

First, when you call BodyPix's segmentPersonParts() method, it resolves to a PartSegmentation object with width, height, and a data array (an Int32Array with one entry per pixel). Each value in this array is that pixel's body part ID: -1 for background, then 0 = left face, 1 = right face, 2 = left upper arm (front), and so on up to 23 = right foot, for 24 parts in total. You can map these IDs to any labels you want: numbers, letters, or abbreviations.
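
To make the data layout concrete, here is a minimal sketch that tallies how many pixels belong to each part ID. The function name countPartPixels and the hand-built 4x2 array are illustrative only; in a real page the array would come from the data field of the segmentPersonParts() result.

```javascript
// Hypothetical helper: count pixels per body part ID in a BodyPix-style
// part-segmentation array (-1 = background, 0..23 = body parts).
function countPartPixels(data) {
  const counts = new Map();
  for (const partId of data) {
    if (partId < 0) continue; // skip background
    counts.set(partId, (counts.get(partId) || 0) + 1);
  }
  return counts;
}

// Synthetic 4x2 frame standing in for segmentation.data:
const sample = new Int32Array([
  -1, 0, 0, 1,
  -1, 0, 12, 12,
]);
const counts = countPartPixels(sample);
console.log([...counts.entries()]); // entries: [0, 3], [1, 1], [12, 2]
```

Parts that are not visible in the frame simply never appear in the result, which is why the labeling code only draws parts with a non-zero pixel count.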

2. Set Up Your HTML Structure

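You'll need a video element and a canvas element stacked on top of each other. The canvas will be where we draw our text labels. The page also needs TensorFlow.js and BodyPix loaded before your script runs; the script tags below use the jsDelivr CDN builds referenced in the BodyPix README, so pin whichever versions you have tested:

<!-- Load TensorFlow.js, then BodyPix (exposes the global bodyPix) -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.2"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/body-pix@2.0"></script>

<div style="position: relative; width: 640px; height: 480px;">
  <video id="video" width="640" height="480" autoplay muted playsinline></video>
  <canvas id="overlay" width="640" height="480" style="position: absolute; top: 0; left: 0;"></canvas>
</div>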

3. Load the Model and Process Video Frames

Here's the core JavaScript code. We'll load the BodyPix model, run segmentation on each video frame, calculate the center of each body part, then draw the corresponding number/letter at that center:

async function setupBodyPixLabeling() {
  // Load BodyPix model (balance speed/accuracy with these params)
  const model = await bodyPix.load({
    architecture: 'MobileNetV1',
    outputStride: 16,
    multiplier: 0.75,
    quantBytes: 4
  });

  const video = document.getElementById('video');
  const canvas = document.getElementById('overlay');
  const ctx = canvas.getContext('2d');

  // Get user camera access
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  video.srcObject = stream;

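  // Customize this mapping to use numbers, letters, or abbreviations.
  // Keys are BodyPix part IDs (0-23); background is -1 and gets no label.
  const partLabels = {
    0: 'A',  // Left face
    1: 'B',  // Right face
    2: 'C',  // Left upper arm (front)
    3: 'D',  // Left upper arm (back)
    4: 'E',  // Right upper arm (front)
    5: 'F',  // Right upper arm (back)
    6: 'G',  // Left lower arm (front)
    7: 'H',  // Left lower arm (back)
    8: 'I',  // Right lower arm (front)
    9: 'J',  // Right lower arm (back)
    10: 'K', // Left hand
    11: 'L', // Right hand
    12: 'M', // Torso front
    13: 'N', // Torso back
    14: 'O', // Left upper leg (front)
    15: 'P', // Left upper leg (back)
    16: 'Q', // Right upper leg (front)
    17: 'R', // Right upper leg (back)
    18: 'S', // Left lower leg (front)
    19: 'T', // Left lower leg (back)
    20: 'U', // Right lower leg (front)
    21: 'V', // Right lower leg (back)
    22: 'W', // Left foot
    23: 'X'  // Right foot
  };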

  // Animation loop to process each video frame
  function renderFrame() {
    model.segmentPersonParts(video).then(segmentation => {
      // Clear previous labels
      ctx.clearRect(0, 0, canvas.width, canvas.height);

      // Track pixel positions to calculate part centers
      const partCenters = {};
      const partPixelCounts = {};

      // Initialize counters for each part
      Object.keys(partLabels).forEach(id => {
        partCenters[id] = { x: 0, y: 0 };
        partPixelCounts[id] = 0;
      });

      // Iterate over all pixels to sum positions for each body part.
      // Use the segmentation's own dimensions; background pixels are -1.
      const { data, width, height } = segmentation;
      for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
          const partId = data[y * width + x];
          // Skip background and any part ID without a label entry
          if (partId >= 0 && partCenters[partId]) {
            partCenters[partId].x += x;
            partCenters[partId].y += y;
            partPixelCounts[partId]++;
          }
        }
      }

      // Draw labels for parts that have detected pixels
      Object.keys(partLabels).forEach(id => {
        const count = partPixelCounts[id];
        if (count > 0) {
          // Calculate average center of the part
          const centerX = partCenters[id].x / count;
          const centerY = partCenters[id].y / count;

          // Style text for readability (tweak these values!)
          ctx.font = '24px Arial';
          ctx.fillStyle = 'white';
          ctx.strokeStyle = 'black';
          ctx.lineWidth = 2;
          ctx.textAlign = 'center';
          ctx.textBaseline = 'middle';

          // Draw text with a black stroke to stand out
          const label = partLabels[id];
          ctx.strokeText(label, centerX, centerY);
          ctx.fillText(label, centerX, centerY);
        }
      });

      // Request next frame to keep the overlay updated
      requestAnimationFrame(renderFrame);
    });
  }

  // Start rendering once the first frame is available; TensorFlow.js
  // cannot read pixels from a video that has not loaded data yet
  video.addEventListener('loadeddata', () => {
    renderFrame();
  });
}

// Kick off the labeling system
setupBodyPixLabeling();
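
If you would rather not type the mapping by hand, the labels can be generated. The sketch below builds either a numeric or a letter mapping for the 24 part IDs (0 to 23) that BodyPix defines. buildPartLabels and its style argument are illustrative names, not part of the BodyPix API.

```javascript
// Hypothetical helper: generate a label for each BodyPix part ID (0..23).
// style 'numbers' -> '1'..'24', style 'letters' -> 'A'..'X'.
const NUM_PARTS = 24;

function buildPartLabels(style = 'numbers') {
  const labels = {};
  for (let id = 0; id < NUM_PARTS; id++) {
    labels[id] = style === 'letters'
      ? String.fromCharCode('A'.charCodeAt(0) + id)
      : String(id + 1);
  }
  return labels;
}

const numericLabels = buildPartLabels('numbers');
const letterLabels = buildPartLabels('letters');
console.log(numericLabels[0], letterLabels[0]);   // 1 A
console.log(numericLabels[23], letterLabels[23]); // 24 X
```

Either result can be used in place of the partLabels object literal inside setupBodyPixLabeling().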

4. Customization & Optimization Tips

Label Mapping: Swap out the letters in partLabels for numbers (e.g., 0: '1') or short abbreviations (e.g., 0: 'LF' for left face) based on your needs.
Text Style: Adjust the font, fillStyle, or strokeStyle to make labels pop against different backgrounds. For example, use a bright yellow fill with a thick black stroke for high contrast.
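Performance: If you notice lag, try decreasing multiplier (to 0.50) or quantBytes (to 2) when loading the model, or pass a lower internalResolution (for example { internalResolution: 'low' }) as the second argument to segmentPersonParts(). A larger outputStride is also faster but less accurate; which strides are available depends on the architecture, so check the BodyPix README before changing it.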
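Multiple People: To handle multiple people, use segmentMultiPersonParts() instead. It resolves to an array with one part segmentation per detected person, so compute the part centers separately for each array element; otherwise matching parts from two people would be averaged into a single misplaced label.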
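
As a rough sketch of the multi-person case: since segmentMultiPersonParts() resolves to one part segmentation per person, the centers can be computed per array element and each label prefixed with the person's index. The helper names (partCentroids, labelsForPeople) and the "P1-A" label format are assumptions for illustration, and the synthetic input mimics only the width, height, and data fields.

```javascript
// Centroid (mean x/y) of every part ID present in one segmentation.
function partCentroids({ data, width, height }) {
  const sums = new Map(); // partId -> { x, y, n }
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const partId = data[y * width + x];
      if (partId < 0) continue; // -1 = background
      const s = sums.get(partId) || { x: 0, y: 0, n: 0 };
      s.x += x;
      s.y += y;
      s.n += 1;
      sums.set(partId, s);
    }
  }
  const centroids = new Map();
  for (const [partId, s] of sums) {
    centroids.set(partId, { x: s.x / s.n, y: s.y / s.n });
  }
  return centroids;
}

// One entry per (person, part), e.g. { text: 'P1-A', x, y }.
function labelsForPeople(segmentations, partLabels) {
  const labels = [];
  segmentations.forEach((seg, personIndex) => {
    for (const [partId, c] of partCentroids(seg)) {
      const name = partLabels[partId];
      if (!name) continue; // no label configured for this part
      labels.push({ text: `P${personIndex + 1}-${name}`, x: c.x, y: c.y });
    }
  });
  return labels;
}

// Two synthetic 4x2 "people" standing in for the model output:
const people = [
  { width: 4, height: 2, data: new Int32Array([0, 0, -1, -1, 0, 0, -1, -1]) },
  { width: 4, height: 2, data: new Int32Array([-1, -1, -1, 12, -1, -1, -1, 12]) },
];
const labels = labelsForPeople(people, { 0: 'A', 12: 'M' });
console.log(labels);
```

In the render loop, each returned entry would be drawn with ctx.strokeText() and ctx.fillText() just as in the single-person code.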