
How to Optimize a Non-Recursive Data Flattening Algorithm in Google Apps Script

Optimizing Non-Recursive Flattening for Google Apps Script Data Studio Connectors

Hey there! I see you're tackling a performance bottleneck with flattening nested API responses for Google Data Studio connectors in Apps Script. Going with a non-recursive stack approach is a reasonable call, since deep recursion can hit call-stack limits on heavily nested responses. Let's break down how to optimize your existing stackIt function to speed things up.

What's Slowing Down Your Current Code?

Looking at your implementation, a few key patterns are eating up performance, especially with big datasets:

  • Excessive array copies: Each slice(0) call used to clone the parent fields allocates a new array and copies every element, so the cost grows with both the number of nodes and the number of parent fields.
  • Redundant field handling: You're maintaining multiple arrays (data_fields, current_fields) with overlapping logic, which adds extra loop iterations and data shuffling.
  • Unnecessary variable initialization: Flags like pushing, and arrays that are reset on every loop iteration, add avoidable work inside the hot loop.
  • Order inversion: pop() takes the most recently pushed item first, so children pushed in their original order come back out reversed. That may not be what you intended.
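
To make the order inversion concrete, here is a minimal sketch. The popOrder helper and the 'ad1'..'ad3' labels are hypothetical and exist only for illustration; they are not part of your code. It shows what a LIFO stack yields when children are pushed forward versus in reverse.

```javascript
// Hypothetical helper: returns the order in which a LIFO stack yields items.
function popOrder(items, pushReversed) {
  const stack = [];
  if (pushReversed) {
    // Push last child first so the first child ends up on top of the stack
    for (let i = items.length - 1; i >= 0; i--) stack.push(items[i]);
  } else {
    for (let i = 0; i < items.length; i++) stack.push(items[i]);
  }
  const visited = [];
  while (stack.length > 0) visited.push(stack.pop());
  return visited;
}

const children = ['ad1', 'ad2', 'ad3'];
const forwardPush = popOrder(children, false);  // ['ad3', 'ad2', 'ad1']
const reversedPush = popOrder(children, true);  // ['ad1', 'ad2', 'ad3']
```

Pushing in reverse costs nothing extra per child; it only changes the loop direction.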

Optimized Implementation

Here's a revamped version of your function with targeted fixes, plus explanations of each improvement:

function optimizedStackIt(data) {
  const totalData = [];
  // Initialize stack with clear, concise properties
  const stack = [{ node: data, parentValues: [] }];
  let expectedFieldCount = null;

  while (stack.length > 0) {
    const { node, parentValues } = stack.pop();
    if (!node) continue;

    let childArray = null;
    const currentValues = [];
    // Object.entries returns only the node's own enumerable properties (needs the V8 runtime)
    const fieldEntries = Object.entries(node);

    for (const [key, value] of fieldEntries) {
      // Prioritize finding the child array first (per your "no sibling nesting" rule)
      if (Array.isArray(value) && value.length > 0 && !childArray) {
        childArray = value;
      }
      // Capture non-object values directly
      else if (typeof value !== 'object' || value === null) {
        currentValues.push(value);
      }
      // We ignore non-array objects since you stated no sibling nesting
    }

    if (childArray) {
      // Combine parent and current values ONCE instead of repeated copies
      const combinedParent = [...parentValues, ...currentValues];
      // Push children in reverse order to preserve original sequence when popped
      for (let i = childArray.length - 1; i >= 0; i--) {
        const child = childArray[i];
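        // Skip missing, non-object, and errored entries before pushing
        // (the typeof guard matters: 'errors' in child throws on a primitive)
        if (child && typeof child === 'object' && !('errors' in child)) {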
          stack.push({ node: child, parentValues: combinedParent });
        }
      }
    } else {
      // Build the final row in one step
      const row = [...parentValues, ...currentValues];
      // Set expected field count only once (avoids repeated checks)
      if (expectedFieldCount === null) {
        expectedFieldCount = row.length;
      }
      // Rows whose length differs from the first row are silently dropped;
      // remove this check if you don't need that safety net
      if (row.length === expectedFieldCount) {
        totalData.push(row);
      }
    }
  }

  return totalData;
}
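
As a quick sanity check, the sketch below pairs a small sample response with the rows the function above should produce for it. The sample data, the expectedRows array, and the checkFlattener helper are all hypothetical; the account → campaigns → ads shape is an assumption, not your actual API.

```javascript
// Hypothetical sample response: one account -> campaigns -> ads.
const sampleResponse = {
  account: 'acct-1',
  campaigns: [
    { name: 'Spring', ads: [{ id: 'a1', clicks: 10 }, { id: 'a2', clicks: 4 }] },
    { name: 'Summer', ads: [{ id: 'a3', clicks: 7 }] },
  ],
};

// One row per leaf: parent values first, then the leaf's own values,
// in the same order the children appear in the response.
const expectedRows = [
  ['acct-1', 'Spring', 'a1', 10],
  ['acct-1', 'Spring', 'a2', 4],
  ['acct-1', 'Summer', 'a3', 7],
];

// Hypothetical helper: runs any flattening function against the sample
// and reports whether its output matches the expected rows.
function checkFlattener(flatten) {
  return JSON.stringify(flatten(sampleResponse)) === JSON.stringify(expectedRows);
}
```

Calling checkFlattener(optimizedStackIt) in the Apps Script editor should return true if the function behaves as described.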

Key Improvements Explained

  1. Cut down on array copies:

    • Instead of cloning parent fields every time we process a node, we merge parent and current values once per parent and share that single array across all of its children on the stack. Nothing mutates it afterwards, so sharing is safe, and the repeated slice(0) calls go away.
    • We use the spread operator ([...a, ...b]) to merge arrays only where a new array is actually needed.
  2. Faster field traversal:

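    • Object.entries() returns only the node's own enumerable properties, so inherited prototype properties never enter the loop, and it hands us key-value pairs without a separate lookup. For plain parsed JSON the speed difference versus for...in is likely small, so measure before relying on it. Note that Object.entries(), destructuring, and spread syntax all require the V8 runtime in Apps Script.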
  3. Stack order preservation:

    • By pushing child elements in reverse order, we ensure that when we pop them from the stack, they're processed in the original sequence (your original code was reversing child order inadvertently).
  4. Simplified logic:

    • Removed redundant variables like pushing and totalFields, replacing them with a single expectedFieldCount that's set once.
    • Streamlined row construction to avoid multiple array splices and concatenations.
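
The difference between for...in and Object.entries() can be sketched as follows. The objects here are hypothetical and built only to expose the behavior; a response produced by JSON.parse normally has no inherited enumerable properties, so both approaches visit the same keys there.

```javascript
// Hypothetical node that inherits an enumerable property from its prototype.
const base = { inherited: 'from prototype' };
const node = Object.create(base);
node.id = 'a1';
node.clicks = 10;

// for...in walks own enumerable keys, then inherited enumerable ones.
const forInKeys = [];
for (const key in node) forInKeys.push(key);

// Object.entries only returns own enumerable properties.
const entryKeys = Object.entries(node).map(([key]) => key);

// A plain parsed JSON object: both approaches see the same keys.
const parsed = JSON.parse('{"id":"a1","clicks":10}');
const parsedForInKeys = [];
for (const key in parsed) parsedForInKeys.push(key);
const parsedEntryKeys = Object.entries(parsed).map(([key]) => key);
```

Here forInKeys includes 'inherited' while entryKeys does not, and the two parsed lists are identical.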

Bonus Performance Tips

  • Drop logging calls: console.log or Logger.log calls inside the loop add per-iteration overhead that grows with the dataset, so remove them from production code.
  • Pre-convert data types: If your API returns numeric values as strings, convert them to numbers once, early on, so each value isn't re-parsed or coerced every time it is used later.
  • Skip generic logic if possible: If your API structure has a predictable number of levels (e.g., always campaigns → ads), you could write a direct nested loop instead of a generic stack algorithm. This would cut out the overhead of dynamic node processing entirely.
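
If the structure really is fixed, the direct version could look like the sketch below. The flattenCampaigns name and the campaigns, ads, name, id, and clicks fields are assumptions made for illustration; substitute your API's actual field names. It also converts a numeric string to a number while building each row.

```javascript
// Assumed shape: { campaigns: [{ name, ads: [{ id, clicks }] }] }.
// All field names here are hypothetical.
function flattenCampaigns(response) {
  const rows = [];
  const campaigns = response.campaigns || [];
  for (let i = 0; i < campaigns.length; i++) {
    const campaign = campaigns[i];
    const ads = campaign.ads || [];
    for (let j = 0; j < ads.length; j++) {
      const ad = ads[j];
      // Number() turns a numeric string such as "10" into 10 once, up front
      rows.push([campaign.name, ad.id, Number(ad.clicks)]);
    }
  }
  return rows;
}

const demoRows = flattenCampaigns({
  campaigns: [
    { name: 'Spring', ads: [{ id: 'a1', clicks: '10' }, { id: 'a2', clicks: 4 }] },
    { name: 'Summer', ads: [] },
  ],
});
// demoRows is [['Spring', 'a1', 10], ['Spring', 'a2', 4]]
```

The trade-off is flexibility: this version has no stack and no per-node type checks, but it has to be rewritten whenever the response shape changes.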

The question comes from Stack Exchange; asked by Alex Escobedo.
