How to Correctly Close SSE Connections During a Spring Graceful Shutdown, and How to Troubleshoot Related Exceptions

Hi there, let's break down why you're facing this graceful shutdown timeout and NPE issue with your SSE implementation, and walk through the fixes step by step.

🔍 Root Cause Analysis

From your logs and code, two key issues are causing the problem:

  1. Incorrect Shutdown Order
    When graceful shutdown is enabled, Spring first stops its Lifecycle beans, and one of them makes the web server (Undertow) wait for active requests to complete. Bean destruction callbacks such as @PreDestroy run only after that, so your @PreDestroy method is not invoked until the 30-second grace period has timed out and the server has begun forcing shutdown. By then, Undertow's internal context (such as its Deployment) has already been destroyed, so calling SseEmitter.complete() throws a NullPointerException.

  2. Incomplete SseEmitter Cleanup
    An open SSE connection counts as an active request until its emitter is completed, so any emitter left open keeps the web server waiting and triggers the graceful shutdown timeout. Separately, if your onCompletion/onTimeout/onError callbacks don't remove dead SseEmitter instances from the emitters map, the map keeps references to them: they leak memory, and the shutdown code later tries to complete emitters that are already dead.
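
To make the ordering problem concrete: Spring stops Lifecycle beans in descending phase order and only afterwards runs destruction callbacks such as @PreDestroy. The following is a minimal, Spring-free sketch of that rule. LifecycleBean and stopOrder are hypothetical names, and the phase numbers are illustrative rather than the exact values of any particular Spring Boot version.

```kotlin
// Hypothetical stand-in for a lifecycle bean; this is not a Spring type.
data class LifecycleBean(val name: String, val phase: Int)

// Mimics the stop rule: beans with the highest phase are stopped first.
fun stopOrder(beans: List<LifecycleBean>): List<String> =
    beans.sortedByDescending { it.phase }.map { it.name }

fun main() {
    // Illustrative phases only; check the constants of your Spring Boot version.
    val beans = listOf(
        LifecycleBean("webServerStartStop", Int.MAX_VALUE - 2048),
        LifecycleBean("sseService", Int.MAX_VALUE),
        LifecycleBean("webServerGracefulShutdown", Int.MAX_VALUE - 1024)
    )
    // Lifecycle stop happens first; @PreDestroy callbacks run only afterwards,
    // when the singletons are destroyed.
    val sequence = stopOrder(beans) + "@PreDestroy callbacks"
    println(sequence.joinToString(" -> "))
}
```

In this sketch the SSE service is stopped first and the @PreDestroy callbacks come last, which is why cleanup that lives only in @PreDestroy runs after the web server has already been torn down.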

🛠️ Step-by-Step Solutions

1. Use SmartLifecycle to Control Shutdown Order

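Instead of relying solely on @PreDestroy, implement SmartLifecycle so that your SSE connections are closed before the web server's grace period ends. Spring stops Lifecycle beans in descending phase order (highest phase first), so giving the service a phase at or above the web server's graceful-shutdown phase makes it stop no later than the point where the server begins waiting for active requests.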

Here's the updated SSEService code:

import org.springframework.context.SmartLifecycle
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.CopyOnWriteArrayList
import jakarta.annotation.PreDestroy
import org.slf4j.LoggerFactory

class SSEService : AutoCloseable, SmartLifecycle {
    private val logger = LoggerFactory.getLogger(javaClass)
    private val emitters: MutableMap<SSEEventType, MutableList<SseEmitter>> = ConcurrentHashMap()
    // Both flags are read and written from different threads during shutdown
    @Volatile private var running = true
    @Volatile private var isServiceRunning = false

    // --- SmartLifecycle Methods ---
    override fun isRunning(): Boolean = isServiceRunning

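    // Lifecycle beans stop in DESCENDING phase order, so the highest phase stops first.
    // DEFAULT_PHASE (Int.MAX_VALUE) is the highest possible value: depending on the Spring Boot
    // version, the web server's graceful-shutdown phase is either equal to it or slightly below it,
    // so the emitters are completed before, or together with, the start of the server's grace period.
    override fun getPhase(): Int = SmartLifecycle.DEFAULT_PHASE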

    override fun start() {
        isServiceRunning = true
        logger.info("SSE Service started")
    }

    override fun stop() {
        shutdownAllEmitters()
        isServiceRunning = false
    }

    override fun stop(callback: Runnable) {
        shutdownAllEmitters()
        isServiceRunning = false
        callback.run() // Notify Spring shutdown is complete
    }

    // --- Shutdown Logic ---
    private fun shutdownAllEmitters() {
        if (!running) return
        running = false
        logger.info("Shutting down SSE service... Closing all active emitters")

        // Snapshot the emitters first: complete() triggers onCompletion, which mutates the lists
        emitters.values.flatMap { it.toList() }.forEach { emitter ->
            try {
                // SseEmitter exposes no public state getters, so simply call complete()
                // and treat a failure on an already finished emitter as non-fatal
                emitter.complete()
            } catch (e: Exception) {
                logger.warn("Failed to complete SseEmitter gracefully", e)
            }
        }
        emitters.clear()
    }

    // Fallback cleanup if SmartLifecycle doesn't trigger (e.g., bean destruction)
    @PreDestroy
    override fun close() {
        shutdownAllEmitters()
    }

    // --- Emitter Management ---
    fun newEmitterFor(type: SSEEventType): SseEmitter {
        val emitter = SseEmitter(Long.MAX_VALUE)
        val emitterList = emitters.getOrPut(type) { CopyOnWriteArrayList() }
        emitterList.add(emitter)

        // Reusable cleanup logic to remove emitter from the map
        val cleanup = {
            emitterList.remove(emitter)
            if (emitterList.isEmpty()) {
                emitters.remove(type) // Clean up empty lists to save memory
            }
        }

        emitter.onCompletion {
            logger.debug("SSE emitter completed, cleaning up")
            cleanup()
        }

        emitter.onTimeout {
            logger.debug("SSE emitter timed out, cleaning up")
            cleanup()
            try {
                // Complete normally: the response is already committed, so an error
                // completion would only trigger a pointless error dispatch
                emitter.complete()
            } catch (e: Exception) {
                logger.warn("Failed to complete timed out emitter", e)
            }
        }

        emitter.onError { e: Throwable ->
            logger.error("SSE emitter encountered error", e)
            cleanup()
        }

        return emitter
    }

    // --- Your Existing Methods ---
    fun emitServiceStatus(statusList: List<ServiceStatusDTO>) {
        // Your existing emit logic here
    }
}
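
One detail worth tightening in the service above: shutdownAllEmitters() can be reached from both stop() and the @PreDestroy fallback, and a plain "check the flag, then set it" guard is not atomic. A minimal sketch of an atomic run-once guard follows. RunOnce is a hypothetical helper, not a Spring class, and is shown only to illustrate the compareAndSet pattern.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper, not part of Spring: runs the given action at most once,
// no matter how many callers or threads invoke it.
class RunOnce(private val action: () -> Unit) {
    private val done = AtomicBoolean(false)

    /** Returns true only for the single call that actually ran the action. */
    fun run(): Boolean {
        // compareAndSet flips false -> true atomically, so exactly one caller wins
        if (!done.compareAndSet(false, true)) return false
        action()
        return true
    }
}

fun main() {
    var closed = 0
    val shutdown = RunOnce { closed++ }
    shutdown.run() // e.g. called from SmartLifecycle.stop()
    shutdown.run() // e.g. called from the @PreDestroy fallback; ignored
    println(closed) // prints 1
}
```

In the service, the body of shutdownAllEmitters() would be the action, and both stop() and close() would call the guard instead of checking the flag themselves.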

2. Configure Spring Graceful Shutdown Correctly

Add these properties to your application.properties (or the equivalent keys in application.yml) to enable and tune graceful shutdown:

# Enable graceful shutdown for the embedded web server
server.shutdown=graceful

# Maximum time each shutdown phase may take; this is also the grace period
# the web server gives active requests (adjust as needed)
spring.lifecycle.timeout-per-shutdown-phase=30s
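
If you keep your configuration in application.yml instead, the two standard Spring Boot settings for graceful shutdown would look roughly like this. Treat it as a sketch: the 30s value simply mirrors the timeout used in this answer.

```yaml
server:
  shutdown: graceful            # let the web server wait for active requests
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s   # upper bound for each shutdown phase
```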

3. Verify Emitter Cleanup

Double-check that your cleanup logic in onCompletion/onTimeout/onError correctly removes emitters from the emitters map. The reusable cleanup lambda in the code above ensures this, preventing memory leaks and lingering active requests.
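
If you want to check the cleanup behaviour in isolation, a Spring-free sketch of the same registry pattern can help. EmitterRegistry is a hypothetical class: K stands in for SSEEventType and E for SseEmitter. Unlike the service above, this variant deliberately leaves empty lists in the map, which is an assumption made here to avoid a race between removing a list and a concurrent registration.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical generic registry; K plays the role of SSEEventType, E of SseEmitter.
class EmitterRegistry<K : Any, E : Any> {
    private val byType = ConcurrentHashMap<K, CopyOnWriteArrayList<E>>()

    /** Registers an emitter and returns the cleanup callback that would be
     *  wired into onCompletion / onTimeout / onError. */
    fun register(type: K, emitter: E): () -> Unit {
        // computeIfAbsent creates the list atomically on first use
        byType.computeIfAbsent(type) { CopyOnWriteArrayList<E>() }.add(emitter)
        // The (possibly empty) list stays in the map, so a concurrent register()
        // can never add to a list that was just dropped
        return { byType[type]?.remove(emitter) }
    }

    /** Number of emitters still tracked; handy for debug logging. */
    fun activeCount(): Int = byType.values.sumOf { it.size }
}

fun main() {
    val registry = EmitterRegistry<String, Any>()
    val cleanup = registry.register("status", Any())
    println(registry.activeCount()) // prints 1
    cleanup()
    cleanup() // a second callback (e.g. onError followed by onCompletion) is harmless
    println(registry.activeCount()) // prints 0
}
```

Because the cleanup callback is idempotent, it does not matter if more than one of the emitter callbacks fires for the same connection.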

🧪 Why This Works

  1. Early Shutdown Trigger: Because the service's phase is at least as high as the web server's graceful-shutdown phase, shutdownAllEmitters() runs no later than the moment Undertow starts waiting for active requests. Completing all emitters ends those requests, so the graceful shutdown finishes without timing out.
  2. Safe Context Access: Because the emitters are closed before the web server is forcibly stopped, Undertow's context is still active, so SseEmitter.complete() won't throw an NPE.
  3. Leak Prevention: Proper cleanup removes dead emitters from the map, so the SSEService bean can be destroyed cleanly without leaving references hanging.

📝 Additional Tips

  • Test Hot Code Replace: With the correct shutdown order, hot code replace should work again because the SSE connections are closed gracefully before the bean is reloaded.
  • Log Emitter Activity: Add debug logs for emitter creation/cleanup to track if any emitters are lingering.
  • Handle Edge Cases: In shutdownAllEmitters, each complete() call is wrapped in try/catch, so an emitter that is already finished or broken cannot prevent the remaining ones from being closed.

Let me know if you run into any issues after implementing these changes!
