Implements mean shift clustering from scratch: each point is shifted iteratively toward its local density maximum (mode) until it converges, and points that reach the same mode form a cluster. K-d trees accelerate the neighbor lookups needed for kernel density estimation, and GPU batching makes the bandwidth search tractable over large point sets.
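
As a rough illustration of the iteration described above, here is a minimal pure-Python sketch. It assumes a Gaussian kernel and brute-force neighbor weighting (no k-d tree, no GPU). The names `shift_point`, `mean_shift`, `label_modes`, and the parameters `tol`, `max_iter`, and `merge_radius` are illustrative choices, not taken from this project's code.

```python
import math


def gaussian_weight(sq_dist, bandwidth):
    """Gaussian kernel weight for a squared distance (assumed kernel choice)."""
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def shift_point(x, data, bandwidth):
    """One mean shift step: the kernel-weighted mean of `data` around `x`."""
    dim = len(x)
    total = 0.0
    acc = [0.0] * dim
    for p in data:
        sq = sum((a - b) ** 2 for a, b in zip(x, p))
        w = gaussian_weight(sq, bandwidth)
        total += w
        for i in range(dim):
            acc[i] += w * p[i]
    return [v / total for v in acc]


def mean_shift(data, bandwidth, tol=1e-5, max_iter=200):
    """Shift each point until it moves less than `tol`; return one mode per point."""
    modes = []
    for x in data:
        x = list(x)
        for _ in range(max_iter):
            nxt = shift_point(x, data, bandwidth)
            moved = math.dist(x, nxt)
            x = nxt
            if moved < tol:
                break
        modes.append(x)
    return modes


def label_modes(modes, merge_radius):
    """Assign the same label to points whose modes lie within `merge_radius`."""
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if math.dist(m, c) < merge_radius:
                labels.append(idx)
                break
        else:
            # No existing center is close enough, so this mode starts a new cluster.
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels
```

Usage might look like `centers, labels = label_modes(mean_shift(points, bandwidth=0.5), merge_radius=0.5)`. The number of clusters falls out of how many distinct modes survive the merge step; it is never passed in.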

Key ideas covered:
- Mean shift: kernel density gradient ascent, no cluster count needed
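- K-d trees: spatial partitioning for fast neighbor queries (O(log n) average-case nearest-neighbor lookups; range queries also scale with the number of points returned)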
- GPU batching: parallelizing the shift updates across the full dataset
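
To make the k-d tree idea concrete, the sketch below builds a tree by median splits and answers a fixed-radius query, which is the kind of lookup a flat or truncated kernel needs at each shift step. It is a simplified pure-Python illustration under assumed names (`KDNode`, `build_kdtree`, `radius_query`), not this project's actual data structure.

```python
import math


class KDNode:
    """A single k-d tree node holding one point and its splitting axis."""
    __slots__ = ("point", "axis", "left", "right")

    def __init__(self, point, axis, left, right):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points, depth=0):
    """Recursively split on the median, cycling through the axes by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(
        points[mid],
        axis,
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def radius_query(node, center, radius, found=None):
    """Collect every stored point within `radius` of `center`."""
    if found is None:
        found = []
    if node is None:
        return found
    if math.dist(node.point, center) <= radius:
        found.append(node.point)
    # Signed distance from the query center to this node's splitting plane.
    diff = center[node.axis] - node.point[node.axis]
    # Descend into a side only if the query ball can reach it.
    if diff <= radius:
        radius_query(node.left, center, radius, found)
    if diff >= -radius:
        radius_query(node.right, center, radius, found)
    return found
```

For example, `radius_query(build_kdtree(points), center, bandwidth)` returns the neighbors that would contribute to one shift update. Subtrees whose half-space the query ball cannot reach are skipped entirely, which is where the speedup over a brute-force scan comes from.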