How to Optimize a Non-Recursive Data Flattening Algorithm in Google Apps Script
Optimizing Non-Recursive Flattening for Google Apps Script Data Studio Connectors
Hey there! I see you're tackling a performance bottleneck with flattening nested API responses for Google Data Studio connectors in Apps Script. Going with a non-recursive stack approach is a good call, since a recursive version can run into the call-stack limit on deeply nested responses. Let's break down how to optimize your existing stackIt function to speed things up.
What's Slowing Down Your Current Code?
Looking at your implementation, a few key patterns are eating up performance, especially with big datasets:
- Excessive array copies: The repeated `slice(0)` calls to clone parent fields add unnecessary memory overhead, and Apps Script isn't great with frequent memory operations.
- Redundant field handling: You're maintaining multiple arrays (`data_fields`, `current_fields`) with overlapping logic, which adds extra loop iterations and data shuffling.
- Unnecessary variable initialization: Variables like `pushing` and redundant array resets inside loops create avoidable overhead.
- Order inversion: Using `pop()` on the stack reverses the order of your child elements, which might not be intentional (and fixing it cleanly can also help with performance).
Optimized Implementation
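Here's a revamped version of your function with targeted fixes, plus explanations of each improvement. It relies on destructuring, spread syntax, and Object.entries, so the script needs to run on the V8 runtime: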
```javascript
function optimizedStackIt(data) {
  const totalData = [];
  // Each stack entry holds a node plus the values inherited from its ancestors
  const stack = [{ node: data, parentValues: [] }];
  let expectedFieldCount = null;

  while (stack.length > 0) {
    const { node, parentValues } = stack.pop();
    if (!node) continue;

    let childArray = null;
    const currentValues = [];

    // Object.entries returns only the node's own enumerable properties
    for (const [key, value] of Object.entries(node)) {
      if (Array.isArray(value) && value.length > 0 && !childArray) {
        // The first non-empty array is the child level (per your "no sibling nesting" rule)
        childArray = value;
      } else if (typeof value !== 'object' || value === null) {
        // Capture primitive values (and nulls) directly
        currentValues.push(value);
      }
      // Other objects and arrays are ignored since you stated there is no sibling nesting
    }

    if (childArray) {
      // Combine parent and current values ONCE and share the result across all children
      const combinedParent = [...parentValues, ...currentValues];
      // Push children in reverse order so pop() yields them in their original sequence
      for (let i = childArray.length - 1; i >= 0; i--) {
        const child = childArray[i];
        // Skip errored entries early; the typeof check matters because
        // the `in` operator throws a TypeError on primitive values
        if (child && typeof child === 'object' && !('errors' in child)) {
          stack.push({ node: child, parentValues: combinedParent });
        }
      }
    } else {
      // Leaf node: build the final row in one step
      const row = [...parentValues, ...currentValues];
      // The first row fixes the expected field count
      if (expectedFieldCount === null) {
        expectedFieldCount = row.length;
      }
      // Rows of a different length are dropped (keep only if you need this safety check)
      if (row.length === expectedFieldCount) {
        totalData.push(row);
      }
    }
  }
  return totalData;
}
```
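To sanity-check the function against your own payloads, something like the following could work. The response shape and field names here (`account`, `campaigns`, `ads`, `clicks`) are made up for illustration, not taken from your API, and `matchesExpected` is a hypothetical helper that takes the flattening function as an argument.

```javascript
// Hypothetical sample response: three levels deep, plus one errored entry
const sampleResponse = {
  account: 'acme',
  campaigns: [
    {
      campaign: 'spring',
      ads: [
        { ad: 'a1', clicks: 10 },
        { ad: 'a2', clicks: 7 }
      ]
    },
    {
      campaign: 'summer',
      ads: [
        { ad: 'b1', clicks: 3 },
        { errors: ['quota exceeded'] } // entries with an "errors" key are skipped
      ]
    }
  ]
};

// One row per leaf, with ancestor values first, in the original child order
const expectedRows = [
  ['acme', 'spring', 'a1', 10],
  ['acme', 'spring', 'a2', 7],
  ['acme', 'summer', 'b1', 3]
];

// Hypothetical helper: returns true when flattenFn produces the expected rows
function matchesExpected(flattenFn) {
  return JSON.stringify(flattenFn(sampleResponse)) === JSON.stringify(expectedRows);
}

// Usage, assuming optimizedStackIt from above is in scope:
// matchesExpected(optimizedStackIt)
```

Comparing against a small hand-written expectation like this is a cheap way to confirm that a performance change did not alter row order or content.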
Key Improvements Explained
Cut down on array copies:
- Instead of cloning parent fields every time we process a node, we only combine parent and current values once when pushing child nodes to the stack. This eliminates repeated `slice(0)` calls that were eating up memory.
- We use the spread operator (`[...a, ...b]`) for clean, efficient array merging where needed.
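To make the difference concrete, here's a rough sketch of the two strategies side by side. Both function names are hypothetical, and `copyPerChild` is only my guess at what the `slice(0)` pattern looked like in your original code.

```javascript
// Assumed shape of the old pattern: a fresh clone and concat for every child
function copyPerChild(parentValues, currentValues, childCount) {
  const entries = [];
  for (let i = 0; i < childCount; i++) {
    const cloned = parentValues.slice(0).concat(currentValues);
    entries.push({ parentValues: cloned });
  }
  return entries;
}

// Shared pattern: merge once, then let every child entry reference the same array
function shareAcrossChildren(parentValues, currentValues, childCount) {
  const combined = [...parentValues, ...currentValues];
  const entries = [];
  for (let i = 0; i < childCount; i++) {
    entries.push({ parentValues: combined });
  }
  return entries;
}

// Usage:
// const shared = shareAcrossChildren(['acme'], ['spring'], 3);
// shared[0].parentValues === shared[1].parentValues  // same array instance
```

Sharing one array is only safe because nothing mutates it after it is pushed. If later code ever appends to `parentValues` in place, each child would need its own copy again.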
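Faster field traversal:
- `Object.entries()` returns only an object's own enumerable properties, whereas a `for...in` loop also walks enumerable properties inherited through the prototype chain, which we don't need here. It also gives us direct access to key-value pairs without extra lookups. Whether it is actually faster depends on the engine and the data, so measure it on your own payloads.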
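A quick way to see the difference between the two traversal styles is to give an object an enumerable property on its prototype. The `proto` and `node` objects below are made up purely for the demonstration.

```javascript
// Hypothetical node whose prototype carries an extra enumerable property
const proto = { inherited: 'from prototype' };
const node = Object.create(proto);
node.id = 1;
node.name = 'ad';

// for...in visits own keys first, then inherited enumerable keys
const forInKeys = [];
for (const key in node) {
  forInKeys.push(key);
}

// Object.entries only sees the object's own enumerable keys
const entryKeys = Object.entries(node).map(([key]) => key);

// forInKeys -> ['id', 'name', 'inherited']
// entryKeys -> ['id', 'name']
```

Objects produced by `JSON.parse` have no enumerable inherited properties, so on typical API responses both styles should visit the same keys. The practical gain is mostly that no `hasOwnProperty` guard is needed.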
Stack order preservation:
- By pushing child elements in reverse order, we ensure that when we pop them from the stack, they're processed in the original sequence (your original code was reversing child order inadvertently).
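The ordering effect is easy to reproduce in isolation. `popOrder` below is a hypothetical helper, not part of your code, that pushes a list onto a stack either as-is or reversed and then records the order in which items come back off.

```javascript
// Returns the order in which children are visited when popped off a stack
function popOrder(children, reverseBeforePush) {
  const stack = [];
  if (reverseBeforePush) {
    // Last child goes in first, so the first child ends up on top
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push(children[i]);
    }
  } else {
    for (const child of children) {
      stack.push(child);
    }
  }
  const visited = [];
  while (stack.length > 0) {
    visited.push(stack.pop());
  }
  return visited;
}

// popOrder(['a', 'b', 'c'], false) -> ['c', 'b', 'a']
// popOrder(['a', 'b', 'c'], true)  -> ['a', 'b', 'c']
```

Iterating backwards while pushing avoids both a separate `reverse()` call and the cost of `shift()`, which has to re-index the array on every removal.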
Simplified logic:
- Removed redundant variables like `pushing` and `totalFields`, replacing them with a single `expectedFieldCount` that's set once.
- Streamlined row construction to avoid multiple array splices and concatenations.
Bonus Performance Tips
- Drop console logs: Any `console.log` statements left in production code add overhead on every call, which adds up quickly with large datasets, so make sure to remove them.
- Pre-convert data types: If your API returns numeric values as strings, convert them to numbers early (before flattening) to avoid type coercion overhead later.
- Skip generic logic if possible: If your API structure has a predictable number of levels (e.g., always campaigns → ads), you could write a direct nested loop instead of a generic stack algorithm. This would cut out the overhead of dynamic node processing entirely.
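For the fixed-depth case, a direct version might look roughly like this. It assumes a hypothetical two-level shape, `{ campaigns: [{ name, ads: [{ id, clicks }] }] }`. Those field names are placeholders, so swap in whatever your API actually returns.

```javascript
// Sketch for a fixed campaigns -> ads structure (field names are assumptions)
function flattenCampaigns(response) {
  const rows = [];
  for (const campaign of response.campaigns || []) {
    for (const ad of campaign.ads || []) {
      // Same skip rule as the generic version: ignore missing or errored entries
      if (!ad || typeof ad !== 'object' || 'errors' in ad) continue;
      // Number() converts numeric strings such as "12" while the row is built
      rows.push([campaign.name, ad.id, Number(ad.clicks)]);
    }
  }
  return rows;
}

// Usage:
// flattenCampaigns({
//   campaigns: [{ name: 'spring', ads: [{ id: 'a1', clicks: '12' }] }]
// });
// -> [['spring', 'a1', 12]]
```

The trade-off is that this version has to be edited whenever the API adds a level or renames a field, whereas the stack-based function adapts on its own. Note also that `Number()` yields `NaN` for missing or non-numeric values, so you may want a fallback there.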
The question comes from Stack Exchange; asked by Alex Escobedo.




