Approximating Nearest Neighbor Distances
Michael B. Cohen, Brittany Terese Fasy, Gary L. Miller, Amir Nayyeri, Donald R. Sheehy, and Ameya Velingker
WADS: Algorithms and Data Structures Symposium 2015
Several researchers have proposed using non-Euclidean metrics on point sets in Euclidean space for clustering noisy data. Almost always, the desired distance function recognizes the closeness of points in the same cluster, even if the Euclidean cluster diameter is large. It is therefore preferable to assign smaller costs to paths that stay close to the input points. In this paper, we consider a natural metric with this property, which we call the nearest neighbor metric. Given a point set~$P$ and a path~$\gamma$, the length of~$\gamma$ under this metric is the integral of the distance to $P$ along $\gamma$; the distance between two points is the infimum of this length over all paths connecting them. We describe a $(3+\varepsilon)$-approximation algorithm and a more intricate $(1+\varepsilon)$-approximation algorithm to compute the nearest neighbor metric. Both approximation algorithms run in near-linear time. The former uses shortest paths on a sparse graph defined over the input points. The latter uses a sparse sample of the ambient space to find good approximate geodesic paths.
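To make the definition concrete, here is a minimal sketch that evaluates the integral of the distance to $P$ along one fixed polyline path by brute-force numerical quadrature. It is not either of the approximation algorithms from the paper: it does not search over paths, and the names `dist_to_set`, `nn_length`, and `steps_per_segment` are hypothetical, chosen only for this illustration.

```python
import math

def dist_to_set(x, points):
    """Euclidean distance from x to its nearest neighbor in points."""
    return min(math.dist(x, p) for p in points)

def nn_length(path, points, steps_per_segment=1000):
    """Approximate the integral of dist_to_set along a polyline.

    path is a list of vertices; each segment is integrated with the
    midpoint rule. This is an illustrative quadrature, not the paper's method.
    """
    total = 0.0
    for a, b in zip(path, path[1:]):
        h = math.dist(a, b) / steps_per_segment  # arc length per step
        for i in range(steps_per_segment):
            t = (i + 0.5) / steps_per_segment
            x = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
            total += dist_to_set(x, points) * h
    return total

segment = [(0.0, 0.0), (2.0, 0.0)]

# Two input points at Euclidean distance 2: the straight segment
# between them has nearest neighbor length approximately 1.0.
sparse = [(0.0, 0.0), (2.0, 0.0)]
print(nn_length(segment, sparse))

# Adding an input point at the midpoint makes the same path cheaper
# (approximately 0.5), since it now stays closer to the point set.
dense = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(nn_length(segment, dense))
```

The second call shows the property described above: the Euclidean length of the path is unchanged, but its cost drops once it passes nearer to the input points.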
@inproceedings{cohen15approximating,
Title = {Approximating Nearest Neighbor Distances},
Author = {Michael B. Cohen and Brittany Terese Fasy and Gary L. Miller and Amir Nayyeri and Donald R. Sheehy and Ameya Velingker},
Booktitle = {Proceedings of the Algorithms and Data Structures Symposium},
Year = {2015}}