Question: Why Equivalent Functions Differ in Performance in Scala 2.12, and How to Diagnose It

Ahua AIGC Lab (阿华AIGC实验室)

2026-5-15

Why isSimilarFast Runs 5x Faster Than isSimilarSlow in Scala 2.12

Great question. This performance gap comes from short-circuit evaluation rather than a Scala compiler bug, and JIT optimizations play at most a secondary role. Let’s break down exactly what’s going on and how you can confirm it:

The Core Difference: Short-Circuit vs. Forced Evaluation

First, let’s revisit your two functions to spot the key distinction:

Fast Version

def isSimilarFast(s1: Point, s2: Point): Boolean = { 
  Haversine.distance(s1.lat, s1.lon, s2.lat, s2.lon) <= 5.0 && 
  Levenshtein.distance(s1.label, s2.label) <= 2 
}

The && operator in Scala, as in Java, is short-circuiting: if the first condition (Haversine distance <= 5.0) is false, the second operand (the Levenshtein distance check) is never evaluated. In a tight loop this is a large win, especially since a Levenshtein edit distance typically takes time proportional to the product of the two string lengths and allocates working arrays, so it is likely much more expensive than the constant-time trigonometry in Haversine.

The JIT compiler does not have to discover this: scalac compiles && into a conditional branch, so the skip is already present in the bytecode. The JIT may additionally inline isSimilarFast and favor the frequently taken branch, but the bulk of the 5x speedup comes from simply not calling Levenshtein.distance whenever the Haversine check fails.

Slow Version

def isSimilarSlow(s1: Point, s2: Point): Boolean = { 
  val d = Haversine.distance(s1.lat, s1.lon, s2.lat, s2.lon) 
  val l = Levenshtein.distance(s1.label, s2.label) 
  d <= 5.0 && l <= 2 
}

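By assigning both results to vals upfront, you force both calls to run before && is reached: a val in Scala is evaluated strictly, at the point where it is defined. Both distance calculations therefore run every single time, even when the Haversine check alone already decides the result. The JIT could drop the unused Levenshtein call only if it could prove the call has no observable effects, which it generally cannot do for a non-trivial method that loops and allocates. So you pay the cost of the slower Levenshtein call on every iteration, hence the 175-second runtime.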
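To make the difference concrete, here is a minimal sketch with hypothetical stand-ins. The names cheap, expensive, fast, slow and expensiveCallsFor are invented for illustration and are not from your code; cheap plays the role of the Haversine check and expensive the role of the Levenshtein check. Each stand-in counts how often it runs.

```scala
object ShortCircuitDemo {
  // Hypothetical stand-ins for the cheap and the expensive check.
  // Each one counts its own invocations.
  var cheapCalls = 0
  var expensiveCalls = 0

  // Passes for one input in ten.
  def cheap(x: Int): Boolean = { cheapCalls += 1; x % 10 == 0 }
  def expensive(x: Int): Boolean = { expensiveCalls += 1; x % 20 == 0 }

  // Same shape as isSimilarFast: the right operand of && runs
  // only when the left operand is true.
  def fast(x: Int): Boolean = cheap(x) && expensive(x)

  // Same shape as isSimilarSlow: both vals are evaluated
  // before && is reached.
  def slow(x: Int): Boolean = {
    val a = cheap(x)
    val b = expensive(x)
    a && b
  }

  // Runs f over 1..n and returns how many times expensive() was called.
  def expensiveCallsFor(f: Int => Boolean, n: Int): Int = {
    cheapCalls = 0
    expensiveCalls = 0
    (1 to n).foreach(x => f(x))
    expensiveCalls
  }
}
```

With these stand-ins, ShortCircuitDemo.expensiveCallsFor(x => ShortCircuitDemo.fast(x), 100) returns 10, while the same call with slow returns 100, even though both functions return the same Boolean for every input.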

How to Diagnose This Yourself

Here are practical steps to confirm this is the root cause:

Count Levenshtein invocations: Add a simple counter inside Levenshtein.distance (or wrap it temporarily) to track how many times it’s called. You’ll see isSimilarFast only invokes it when Haversine passes, while isSimilarSlow calls it every single iteration, no matter what.
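Inspect JIT compilation logs: Run your app with -XX:+PrintCompilation, and add -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to see inlining decisions (PrintInlining is a diagnostic flag and must be unlocked first). The logs show whether isSimilarFast, isSimilarSlow and the two distance methods get compiled and inlined. Their main use here is to rule out a compilation difference, since the skip of the Levenshtein call is already decided in the bytecode.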
Test with edge-case inputs:
- If you run your loop with a dataset where all Haversine distances are >5, isSimilarFast will take roughly the time of the Haversine calls alone (it never runs Levenshtein), while isSimilarSlow will still take about as long as before (~175 seconds for the same number of iterations).
- If all Haversine distances are ≤5, both functions will have nearly identical runtime (since both run Levenshtein every time).
Decompile bytecode: Use javap -c on your compiled class files to compare the bytecode. isSimilarFast will have a conditional branch that skips the Levenshtein call, while isSimilarSlow executes both calls unconditionally.
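The invocation count and the edge-case test can be sketched together as follows. Your Point, Haversine and Levenshtein definitions are not shown, so everything below is a hypothetical stand-in: a case class Point, a textbook Haversine in kilometres (the unit is an assumption), a two-row dynamic-programming Levenshtein, and an added calls counter that exists only as temporary instrumentation.

```scala
import java.util.concurrent.atomic.AtomicLong

object InvocationCount {
  // Hypothetical stand-in for the Point type used in the question.
  final case class Point(lat: Double, lon: Double, label: String)

  object Haversine {
    private val EarthRadiusKm = 6371.0
    // Great-circle distance in kilometres (unit is an assumption).
    def distance(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double = {
      val dLat = math.toRadians(lat2 - lat1)
      val dLon = math.toRadians(lon2 - lon1)
      val a = math.pow(math.sin(dLat / 2), 2) +
        math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
          math.pow(math.sin(dLon / 2), 2)
      2 * EarthRadiusKm * math.asin(math.min(1.0, math.sqrt(a)))
    }
  }

  object Levenshtein {
    // Temporary instrumentation: counts every call to distance.
    val calls = new AtomicLong(0)

    // Two-row dynamic-programming edit distance.
    def distance(a: String, b: String): Int = {
      calls.incrementAndGet()
      var prev = Array.range(0, b.length + 1)
      for (i <- 1 to a.length) {
        val cur = new Array[Int](b.length + 1)
        cur(0) = i
        for (j <- 1 to b.length) {
          val cost = if (a(i - 1) == b(j - 1)) 0 else 1
          cur(j) = math.min(math.min(cur(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
        }
        prev = cur
      }
      prev(b.length)
    }
  }

  def isSimilarFast(s1: Point, s2: Point): Boolean =
    Haversine.distance(s1.lat, s1.lon, s2.lat, s2.lon) <= 5.0 &&
      Levenshtein.distance(s1.label, s2.label) <= 2

  def isSimilarSlow(s1: Point, s2: Point): Boolean = {
    val d = Haversine.distance(s1.lat, s1.lon, s2.lat, s2.lon)
    val l = Levenshtein.distance(s1.label, s2.label)
    d <= 5.0 && l <= 2
  }

  // Runs f over every pair and reports how many Levenshtein calls it caused.
  def countCalls(f: (Point, Point) => Boolean, pairs: Seq[(Point, Point)]): Long = {
    Levenshtein.calls.set(0)
    pairs.foreach { case (p, q) => f(p, q) }
    Levenshtein.calls.get()
  }

  // Edge-case datasets: every pair far apart, and every pair co-located.
  val origin: Point = Point(0.0, 0.0, "alpha")
  val farPairs: Seq[(Point, Point)] = Seq.fill(1000)((origin, Point(10.0, 10.0, "alpha")))
  val nearPairs: Seq[(Point, Point)] = Seq.fill(1000)((origin, Point(0.0, 0.0, "alpha")))
}
```

On farPairs, countCalls reports 0 Levenshtein calls for isSimilarFast and 1000 for isSimilarSlow; on nearPairs both report 1000. An AtomicLong is used so the count stays correct if your real loop runs on several threads; remove the counter before timing, since the instrumentation itself adds a small cost.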

A Middle Ground: Readable and Fast

If you prefer the readability of intermediate variables but don’t want to sacrifice performance, rewrite the slow version to retain short-circuit behavior:

def isSimilarOptimized(s1: Point, s2: Point): Boolean = { 
  val d = Haversine.distance(s1.lat, s1.lon, s2.lat, s2.lon) 
  d <= 5.0 && {
    val l = Levenshtein.distance(s1.label, s2.label) 
    l <= 2
  }
}

This way, you get the clarity of named variables and the speed benefit of skipping Levenshtein when it’s not needed.

The question discussed here comes from Stack Exchange; asked by mitchus.