Finding the 10 Nearest Neighbors of a Randomly Selected Individual in 3D Space with Python
Got it, let's work through this problem step by step to build a complete, functional Python implementation. We'll cover generating the dataset, picking a random person, calculating distances, and finding the closest neighbors.
Step 1: Set Up the Dataset with Names and GPS Coordinates
First, we need to pair each of the 750 entries with a unique name (your initial code only has coordinates). We'll simulate realistic GPS values (latitude: -90 to 90, longitude: -180 to 180, altitude: 0 to 5000 meters) for this example.
    import random
    from math import sqrt, radians, sin, cos, atan2

    # Generate 750 people with unique names and simulated GPS coordinates
    people = []
    for person_id in range(750):
        name = f"Person_{person_id + 1}"
        # Simulate GPS coordinates
        latitude = random.uniform(-90.0, 90.0)
        longitude = random.uniform(-180.0, 180.0)
        altitude = random.uniform(0.0, 5000.0)  # Altitude in meters
        people.append((name, latitude, longitude, altitude))
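If indexing into tuples by position gets hard to read, the same dataset can be sketched with a named tuple. The `Person` class, the `generate_people` helper, and the `seed` parameter are illustrative names that are not part of the original code; the seed is there only so the simulated data can be reproduced between runs.

```python
import random
from typing import NamedTuple


class Person(NamedTuple):
    """One record: a name plus simulated GPS coordinates."""
    name: str
    latitude: float   # degrees, -90 to 90
    longitude: float  # degrees, -180 to 180
    altitude: float   # meters, 0 to 5000


def generate_people(count, seed=None):
    """Build `count` people with unique names and random coordinates."""
    rng = random.Random(seed)  # a fixed seed makes the data reproducible
    return [
        Person(
            name=f"Person_{i + 1}",
            latitude=rng.uniform(-90.0, 90.0),
            longitude=rng.uniform(-180.0, 180.0),
            altitude=rng.uniform(0.0, 5000.0),
        )
        for i in range(count)
    ]


people = generate_people(750, seed=42)
print(people[0].name, people[0].latitude)
```

Because a `NamedTuple` is still a tuple, positional access such as `person[0]` or `person[3]` keeps working, so the distance functions in this answer would accept these records unchanged.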
Step 2: Randomly Select a Target Person
Use random.choice() to pick one person from our dataset:
    # Pick a random person from the list
    target_person = random.choice(people)
    print(f"Randomly selected target: *{target_person[0]}*")
Step 3: Calculate Distances & Find Nearest Neighbors
Important Note on Distance Calculation
For real-world GPS data, the Haversine formula is the standard way to calculate the great-circle distance between two points on Earth's surface (ignoring altitude). If altitude matters, the surface distance can be extended to a 3D distance. This example shows both a simplified Euclidean option and a Haversine-based option that includes altitude.
Option 1: 3D Euclidean Distance (Simplified for Simulated Data)
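This works if your coordinates are normalized or you don't need real-world distances. Note that it treats degrees (latitude, longitude) and meters (altitude) as if they were the same unit, so with the ranges used above the altitude difference usually dominates the result: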
    def calculate_3d_euclidean(person_a, person_b):
        # Extract coordinates from the person tuples
        lat_a, lon_a, alt_a = person_a[1], person_a[2], person_a[3]
        lat_b, lon_b, alt_b = person_b[1], person_b[2], person_b[3]
        # Calculate squared differences and sum, then take square root
        lat_diff = lat_a - lat_b
        lon_diff = lon_a - lon_b
        alt_diff = alt_a - alt_b
        return sqrt(lat_diff**2 + lon_diff**2 + alt_diff**2)
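A small worked example may help show what adding degrees and meters together does to the ranking. The three records below are made up for illustration: one person sits 100 degrees of longitude away at the same altitude, the other stands at the same latitude and longitude but 150 m higher.

```python
from math import sqrt


def calculate_3d_euclidean(person_a, person_b):
    # Same calculation as above: degrees and meters are summed as-is
    lat_diff = person_a[1] - person_b[1]
    lon_diff = person_a[2] - person_b[2]
    alt_diff = person_a[3] - person_b[3]
    return sqrt(lat_diff**2 + lon_diff**2 + alt_diff**2)


# Hypothetical records: (name, latitude, longitude, altitude in meters)
target = ("Target", 0.0, 0.0, 100.0)
far_same_height = ("Far_Same_Height", 0.0, 100.0, 100.0)  # 100 degrees east
near_but_higher = ("Near_But_Higher", 0.0, 0.0, 250.0)    # same spot, 150 m up

d_far = calculate_3d_euclidean(target, far_same_height)
d_near = calculate_3d_euclidean(target, near_but_higher)

print(d_far)   # 100.0
print(d_near)  # 150.0
```

The person thousands of kilometers away scores 100.0 and so ranks as "closer" than the person 150 m overhead, who scores 150.0. That is acceptable for a toy dataset, but it is the main reason to prefer the second option for real coordinates.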
Option 2: 3D Haversine Distance (Real-World GPS Accuracy)
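Use this if you need realistic distances for actual GPS data. It returns kilometers and combines the great-circle surface distance with the altitude difference using the Pythagorean theorem, which is an approximation: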
    def calculate_3d_haversine(person_a, person_b):
        # Convert degrees to radians (required for trigonometric functions)
        lat_a, lon_a = radians(person_a[1]), radians(person_a[2])
        lat_b, lon_b = radians(person_b[1]), radians(person_b[2])
        # Haversine formula for great-circle distance (surface of Earth)
        delta_lat = lat_b - lat_a
        delta_lon = lon_b - lon_a
        a = sin(delta_lat / 2)**2 + cos(lat_a) * cos(lat_b) * sin(delta_lon / 2)**2
        c = 2 * atan2(sqrt(a), sqrt(1 - a))
        # Earth radius in kilometers
        earth_radius = 6371.0
        surface_distance = earth_radius * c
        # Add altitude component (convert meters to kilometers)
        alt_diff_km = (person_a[3] - person_b[3]) / 1000
        total_distance = sqrt(surface_distance**2 + alt_diff_km**2)
        return total_distance
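Before trusting the function on real data, it can be checked against cases whose answers are known in advance. The sketch below repeats the function so it runs on its own; the test points are made up. Two points on the equator 90 degrees of longitude apart are a quarter of a great circle apart, which is 6371 × π / 2, or about 10007.5 km with the radius used here.

```python
from math import sqrt, radians, sin, cos, atan2, pi, isclose


def calculate_3d_haversine(person_a, person_b):
    # Great-circle surface distance in km, extended with altitude
    lat_a, lon_a = radians(person_a[1]), radians(person_a[2])
    lat_b, lon_b = radians(person_b[1]), radians(person_b[2])
    delta_lat = lat_b - lat_a
    delta_lon = lon_b - lon_a
    a = sin(delta_lat / 2)**2 + cos(lat_a) * cos(lat_b) * sin(delta_lon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    surface_distance = 6371.0 * c
    alt_diff_km = (person_a[3] - person_b[3]) / 1000
    return sqrt(surface_distance**2 + alt_diff_km**2)


# Hypothetical check points: (name, latitude, longitude, altitude in meters)
origin = ("Origin", 0.0, 0.0, 0.0)
quarter_turn = ("Quarter_Turn", 0.0, 90.0, 0.0)  # 90 degrees east on the equator
overhead = ("Overhead", 0.0, 0.0, 1000.0)        # same spot, 1000 m higher

quarter = calculate_3d_haversine(origin, quarter_turn)
vertical = calculate_3d_haversine(origin, overhead)

# A quarter of a great circle on a sphere of radius 6371 km
assert isclose(quarter, 6371.0 * pi / 2, rel_tol=1e-9)
# With no horizontal separation, only the altitude difference remains (in km)
assert isclose(vertical, 1.0, rel_tol=1e-9)
print(round(quarter, 1), vertical)
```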
Find the 10 Closest Neighbors
Now we'll compute distances from the target to everyone else, sort them, and pick the top 10 (excluding the target themselves):
    # Choose which distance function to use (swap between the two options above)
    distance_function = calculate_3d_euclidean

    # Calculate distances to all other people
    distance_list = []
    for person in people:
        if person == target_person:
            continue  # Don't include the target in their own neighbor list
        distance = distance_function(target_person, person)
        distance_list.append((distance, person[0]))

    # Sort by distance (ascending order) and take the first 10 entries
    distance_list.sort()
    top_10_neighbors = [name for (dist, name) in distance_list[:10]]

    # Print the result
    print(f"\n10 Nearest Neighbors of *{target_person[0]}*:")
    for rank, name in enumerate(top_10_neighbors, 1):
        print(f"{rank}. {name}")
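Sorting every distance is fine for 750 people. As an alternative that avoids a full sort, the standard library's `heapq.nsmallest` can pick the 10 smallest directly. This is a sketch rather than a replacement for the code above: it regenerates a small seeded dataset and reuses the Euclidean function so that it runs on its own, and the `rng` and `others` names are illustrative.

```python
import heapq
import random
from math import sqrt


def calculate_3d_euclidean(person_a, person_b):
    lat_diff = person_a[1] - person_b[1]
    lon_diff = person_a[2] - person_b[2]
    alt_diff = person_a[3] - person_b[3]
    return sqrt(lat_diff**2 + lon_diff**2 + alt_diff**2)


# Seeded generator so the example gives the same people on every run
rng = random.Random(0)
people = [
    (f"Person_{i + 1}", rng.uniform(-90.0, 90.0),
     rng.uniform(-180.0, 180.0), rng.uniform(0.0, 5000.0))
    for i in range(750)
]
target_person = rng.choice(people)

# Everyone except the target
others = [p for p in people if p != target_person]

# nsmallest keeps only the 10 best candidates instead of sorting all of them
nearest = heapq.nsmallest(
    10, others, key=lambda p: calculate_3d_euclidean(target_person, p)
)
top_10_neighbors = [p[0] for p in nearest]

for rank, name in enumerate(top_10_neighbors, 1):
    print(f"{rank}. {name}")
```

The result is returned in ascending order of distance, so it can be printed exactly like the sorted list.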
Key Notes
- You can easily swap between the two distance functions depending on your use case.
- If you have a pre-existing dataset of names and coordinates, replace the people list generation code with your actual data (e.g., loading from a CSV or database).
- For larger datasets, consider using spatial indexing libraries like scipy.spatial.KDTree to speed up nearest neighbor searches.
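As a rough sketch of the KDTree suggestion, assuming NumPy and SciPy are installed: a k-d tree works on Euclidean distances, so the latitude, longitude, and altitude are first converted to Cartesian coordinates in kilometers on a spherical Earth of radius 6371 km. The `to_cartesian_km` helper and the seeded data are illustrative and not part of the original code. Distances from this approach are straight-line (chord) distances through space rather than great-circle distances.

```python
import numpy as np
from scipy.spatial import KDTree

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def to_cartesian_km(lat_deg, lon_deg, alt_m):
    """Convert latitude/longitude (degrees) and altitude (m) to x, y, z in km."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    r = EARTH_RADIUS_KM + np.asarray(alt_m) / 1000.0
    return np.column_stack((
        r * np.cos(lat) * np.cos(lon),
        r * np.cos(lat) * np.sin(lon),
        r * np.sin(lat),
    ))


# Seeded simulated data, same ranges as the dataset above
rng = np.random.default_rng(0)
n = 750
names = [f"Person_{i + 1}" for i in range(n)]
points = to_cartesian_km(
    rng.uniform(-90.0, 90.0, n),
    rng.uniform(-180.0, 180.0, n),
    rng.uniform(0.0, 5000.0, n),
)

tree = KDTree(points)
target_index = int(rng.integers(n))

# Ask for 11 hits because the target itself comes back at distance 0
distances, indices = tree.query(points[target_index], k=11)
neighbor_indices = [int(i) for i in indices if int(i) != target_index][:10]

print(f"10 nearest neighbors of {names[target_index]}:")
for rank, i in enumerate(neighbor_indices, 1):
    print(f"{rank}. {names[i]}")
```

Building the tree once pays off when many queries are run against the same dataset; for a single query over 750 points, the plain loop above is just as practical.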
This question originally comes from Stack Exchange; it was asked by Tonikami04.




