Troubleshooting Python Memory Exhaustion: Why Available Memory Keeps Dropping in Multiprocessing Code

Ahua AIGC Lab

2026-5-15

Memory Leak with multiprocessing.Pool, Pickle, and Numpy

Let's break down why your available memory is steadily dropping and how to fix this issue step by step:

What's Causing the Memory Leak?

Frequent Pool Creation (and Improper Cleanup)
You’re spawning a new mp.Pool(10) in every loop iteration, then only calling p.close() afterward. close() stops the pool from accepting new tasks, but it does not wait for the worker processes to exit; that is what join() is for. Without join(), nothing guarantees that the workers from one iteration are gone before the next pool starts, so leftover processes and the memory they hold can overlap across iterations and eat into your system’s available RAM.
Lingering References to Numpy Data
Even with gc.collect(), results and temp_list keep their arrays alive until the names are rebound or deleted. CPython frees a numpy array’s buffer as soon as its last reference disappears; gc.collect() only helps with reference cycles. Freed memory is also not always handed back to the operating system right away, so available RAM can lag behind what Python has actually released.
Pickle’s Inefficiency for Numpy Arrays
With pickle protocols below 5, a numpy array is serialized by first building a bytes copy of its data, plus a small header. While a batch of results is being dumped, that temporary copy sits in memory next to the original, and the extra overhead can push you over the edge when you’re already low on available RAM.
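
To put a number on pickle's overhead, a minimal sketch can compare the size of an array's raw buffer with the size of its pickled form. The helper name pickle_overhead and the array size are illustrative assumptions, not part of the original question.

```python
import pickle

import numpy as np


def pickle_overhead(arr):
    """Return (raw_bytes, pickled_bytes) for a NumPy array."""
    # pickle.dumps builds the whole serialized payload in memory,
    # so the array's data exists twice while the payload is alive.
    payload = pickle.dumps(arr, protocol=pickle.HIGHEST_PROTOCOL)
    return arr.nbytes, len(payload)


# Small illustrative array; the arrays in the question are far larger.
arr = np.arange(100_000, dtype=np.int64)
raw, pickled = pickle_overhead(arr)
print(f"raw buffer: {raw} bytes, pickled: {pickled} bytes")
```

The two numbers should be close to each other, which suggests that the second copy of the data, rather than the header, is where the memory goes.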

Fixes to Stop the Leak

1. Reuse a Single Pool (With Automatic Cleanup)

Create your pool once before the loop, reuse it for all iterations, and use a with statement to handle cleanup automatically. This avoids spawning new child processes repeatedly and ensures they’re properly terminated:

import gc
import multiprocessing as mp
import pickle

import numpy as np
import psutil


def some_function(arr):
    # Placeholder: the question does not show the real worker function
    return arr


# The guard is required on platforms that start workers by re-importing this module
if __name__ == '__main__':
    test = [np.random.choice(range(1, 1000), 1000000) for el in range(1, 1000)]
    step_size = 10**4

    # Create the pool once outside the loop
    with mp.Pool(10) as p:
        for i in range(0, len(test), step_size):
            temp_list = test[i:i+step_size]
            results = p.map(some_function, temp_list)

            # Drop the temporary slice (the arrays themselves stay alive in test)
            del temp_list

            # Check memory (for debugging)
            mem = psutil.virtual_memory()
            print(f'Memory available in GB: {mem.available/(1024**3):.2f}')

            # Save results
            with open(f'file_to_store_{i // step_size}.pickle', 'wb') as f:
                pickle.dump(results, f)

            # Delete results before the next iteration
            del results
            gc.collect()

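When the with block exits, the pool’s terminate() method is called, which stops the worker processes so their memory can be reclaimed. Because p.map() blocks until every result is back, no work is lost at that point. If you manage the pool by hand instead, call close() followed by join().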

2. Use Numpy’s Native Serialization Instead of Pickle

If your results contain numpy arrays, numpy.save or numpy.savez is usually a better fit than pickle: the .npy format writes the array data to disk behind a short header, without first building a pickled copy in memory. Swap out the pickle dump for this:

# If results is a list of equal-shaped numpy arrays, stack them into a single array first
# (stacking makes one combined copy of the data)
np.save(f'file_to_store_{i // step_size}.npy', np.stack(results))

# Use savez to store the arrays separately (saved under the keys arr_0, arr_1, ...)
np.savez(f'file_to_store_{i // step_size}.npz', *results)

This can reduce memory spikes during serialization, which may well be what triggers the final MemoryError.
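
Reading the data back can also be kept cheap. The following is a rough sketch, assuming every array in a batch has the same shape; the tiny stand-in results list and the temporary directory are illustrative, not taken from the question.

```python
import os
import tempfile

import numpy as np

# Stand-in for one batch of results: equal-shaped arrays (an assumption;
# np.stack raises ValueError if the shapes differ).
results = [np.full(5, i, dtype=np.int64) for i in range(3)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_to_store_0.npy")
    np.save(path, np.stack(results))

    # mmap_mode="r" maps the file instead of reading it all into RAM.
    loaded = np.load(path, mmap_mode="r")
    first_row = np.array(loaded[0])  # copy only the slice that is needed
    shape = loaded.shape
    del loaded  # release the memory map before the directory is removed
```

With a memory-mapped load, only the rows that are actually touched need to be read from disk, which helps when a later step needs just part of a saved batch.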

3. Avoid Holding the Entire `test` List in Memory

Your test list is already large before any work starts: 999 arrays × 1,000,000 integers is roughly 7.4 GiB with numpy’s usual 64-bit integer dtype, or about 3.7 GiB where the default integer is 32-bit. If possible, generate or load chunks of test as you need them instead of precomputing the whole list upfront. For example, generate each chunk inside the loop instead of storing all of them in memory at once.
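
One way to do this is a generator that builds a chunk only when the loop asks for it. This is a minimal sketch: the name iter_chunks, its parameters, and the use of numpy's Generator API in place of np.random.choice are assumptions made for illustration.

```python
import numpy as np


def iter_chunks(n_arrays, chunk_size, array_len, seed=0):
    """Yield lists of random arrays, at most chunk_size arrays at a time."""
    rng = np.random.default_rng(seed)
    for start in range(0, n_arrays, chunk_size):
        count = min(chunk_size, n_arrays - start)
        # Only this chunk is created here; the previous one can be freed
        # once the caller drops its reference to it.
        yield [rng.integers(1, 1000, size=array_len) for _ in range(count)]


# Tiny sizes for illustration; the question uses 999 arrays of 1,000,000 values.
sizes = [len(chunk) for chunk in iter_chunks(n_arrays=10, chunk_size=4, array_len=5)]
print(sizes)  # [4, 4, 2]
```

In the main loop, each yielded chunk would be passed to p.map() in place of temp_list, so peak memory is bounded by one chunk plus its results rather than by the whole dataset.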

4. Monitor Your Process’s Memory Usage

Instead of checking system-wide available RAM, track your Python process’s memory usage to pinpoint the leak:

process = psutil.Process()
print(f'Current process memory usage (GB): {process.memory_info().rss/(1024**3):.2f}')

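This number covers the main process only; memory held by child processes is not included. If it stays flat while system-wide available RAM keeps falling, the memory is being held elsewhere, most likely by lingering child processes.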
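
To look at the workers as well, the children of the current process can be measured separately. The sketch below uses psutil's Process.children() and memory_info(); the helper name rss_breakdown is hypothetical.

```python
import psutil


def rss_breakdown(proc=None):
    """Return (parent_rss, children_rss) in bytes for proc and its descendants."""
    proc = proc or psutil.Process()
    parent_rss = proc.memory_info().rss
    children_rss = 0
    for child in proc.children(recursive=True):
        try:
            children_rss += child.memory_info().rss
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            # The worker exited (or became inaccessible) after it was listed.
            continue
    return parent_rss, children_rss


parent, children = rss_breakdown()
gib = 1024 ** 3
print(f"parent: {parent / gib:.2f} GiB, children: {children / gib:.2f} GiB")
```

Treat the children total as an upper bound: RSS counts pages shared between processes once per process, so forked workers can look heavier than they really are.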

Why These Fixes Work

Reusing the pool eliminates the overhead of spawning new child processes and ensures proper cleanup.
Explicitly deleting objects drops their last references, so their memory is released right away; gc.collect() additionally clears any reference cycles instead of waiting for an automatic collection.
Numpy’s native serialization reduces memory spikes during saving, which lowers the risk of hitting the final MemoryError.
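
The effect of del can be checked directly with a weak reference, which observes an object without keeping it alive. This is a small sketch assuming CPython's reference counting; the function name freed_by_del is made up for the example.

```python
import weakref

import numpy as np


def freed_by_del():
    """Return True if an array is released as soon as its last reference is deleted."""
    arr = np.zeros(1_000, dtype=np.int64)
    probe = weakref.ref(arr)  # a weak reference does not keep the array alive
    del arr                   # the last strong reference is gone
    # No gc.collect() call here: reference counting alone frees the array.
    return probe() is None


print(freed_by_del())
```

On CPython this prints True, which shows that del alone is enough for arrays that are not part of a reference cycle.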

The question discussed here comes from Stack Exchange; it was asked by user1700890.