Well-Separated Pair Decomposition for the Unit-Disk Graph Metric and its ApplicationsJie Gao∗Li Zhang†January 30, 2005AbstractWe extend the classic notion of well-separated pair decomposition [10] to the (weighted) unit-disk graph metric: the shortest path distance metric induced by the intersection graph of unit disks. We show that for the unit-disk graph metric of n points in the plane and for any constant c ≥ 1, there exists a c-well- separated pair decomposition with O(nlogn) pairs, and the decomposition can be computed in O(nlogn) time. We also show that for the unit-ball graph metric in k dimensions where k ≥ 3, there exists a c-well-separated pair decomposition with O(n2−2/k) pairs, and the bound is tight in the worst case. We present theapplication of the well-separated pair decomposition in obtaining efficient algo- rithms for approximating the diameter, closest pair, nearest neighbor, center, median, and stretch factor, all under the unit-disk graph metric.Keywords well separated pair decomposition, unit-disk graph, approximation al- gorithmAMS Subject Classifications. 68W25, 68W40, 68R10, 05C851IntroductionWell-separated pair decomposition, introduced by Callahan and Kosaraju[10], has found numerous applications in solving proximity problems for points in the Eu- clidean space[8, 10, 9, 5, 4, 28, 24, 18, 14].A pair of point sets (A,B) is c-well- separated if the distance between A,B is at least c times the diameters of both A and B. A well-separated pair decomposition of a point set consists of a set of well- separated pairs that “cover” all the pairs of distinct points, i.e. any two distinctpoints belong to the different sets of some pair. In[10], Callahan and Kosaraju showed that for any point set in an Euclidean space and for any constant c ≥ 1, there always exists a c-well-separated pair decomposition with linearly many pairs. This fact has been very useful in obtaining nearly linear time algorithms for many∗Department of Computer Science,Stanford University, Stanford,CA 94305.E-mail:

[email protected] †Hewlett-Packard Labs, 1501 PageMill Road, Palo Alto, CA 94304. E-mail:

[email protected] such as computing k-nearest neighbors, N-body potential fields, geometric spanners, approximate minimum spanning trees etc. Well-separated pair decompo-sition is also shown very useful in obtaining efficient dynamic, parallel, and external memory algorithms[8, 9, 10, 7, 17].The definition of well-separated pair decomposition can be naturally extended to any metric space. Curiously enough however, there has been no work for such an extension. One reason is probably that a general metric space may not admit a well- separated pair decomposition with a sub-quadratic size. Indeed, even for the metric induced by a star tree with unit weight on each edge1, any well-separated pair de- composition requires quadratically many pairs. This makes the well-separated pair decomposition useless for such a metric. In this paper, we will show that for a certain metric, there do exist well-separated pair decompositions with almost linear size, andtherefore many proximity problems under that metric can be solved efficiently. The metric we investigate is the so called unit-disk graph metric. At the same time we call the well-separated pair decomposition in the Euclidean space the geometric well separated pair decomposition, to be distinguished from the decomposition in graph metrics. For a point set S in the plane, its unit-disk graph[12]is formed by connecting two points p,q in S if the Euclidean distance d(p,q) is at most 1.A unit-disk graph can also be viewed as the intersection graph of a set of unit disks centered at the points in S.We consider the weighted unit-disk graphs where each edge (p,q) receives the weight d(p,q).Similarly, unit-ball graphs can be defined for points in high dimensions. Such graphs have been used extensively to model thecommunication or influence between objects[26, 20]and studied in many different contexts[12, 6, 21, 15]. For an example, wireless ad hoc networks can be modeled by unit-disk graphs[22, 29, 30], as two wireless nodes can directly communicate with each other only if they are within certain distance. In unsupervised learning, given a dense sampling of points from some unknown manifold, the length of the shortest path on the unit-ball graph is a good approximation of the geodesic distance on the underlying (unknown) manifold if radius is chosen appropriately [?, ?]. In this paper, we show that the all pairs shortest path lengths of nodes in unit-diskgraphs (or unit-ball graphs) can be compactly encoded and efficiently estimated by a subquadratic size well-separated pair decomposition.2OverviewIn this paper, we show that for the metric induced by the unit-disk graph on n points and for any constant c ≥ 1, there does exist a c-well-separated pair decomposition with O(nlogn) pairs, and such a decomposition can be computed in O(nlogn) time. We also show that the bounds can be extended to higher dimensions: for k ≥ 3, there always exists a c-well-separated pair decomposition with size O(n2−2/k) for1A metric induced by a graph (with positive edge weights) is the shortest path distance metricof the graph.2the unit-ball graph metric on n points, and the bound is tight in the worst case. The construction time is O(n4/3polylogn) for k = 3 and O(n2−2/k) for k ≥ 4.The difficulty in obtaining a well-separated pair decomposition for unit-disk graph metric is that two points that are close in the space are not necessarily closeunder the graph metric. We first prove the bound for the point set with constant- bounded density, i.e. a point set where any unit disk covers only a constant number of points in the set, by using a packing argument similar to the one in[19]. For a point set with unbounded density, we apply the clustering technique similar to the one used in[16]to the point set and obtain a set of “clusterheads” with a bounded density. We then apply the result for bounded density point set on those clusterheads.Then, by combining the well-separated pair decomposition for the bounded density point sets and for the Euclidean metric, we are able to show that the bound holds for any point sets.For a pair of well-separated sets, the distance between two points from different sets can be approximated by the “distance” between the two sets or the distancebetween any pair of points in different sets. In other words, a well-separated pair decomposition can be thought as a compressed representation to approximate the Θ(n2) pairwise distances. Many problems that require to check the pairwise short- est distances can therefore be approximately solved by examining those distances between the well-separated pairs of sets. When the size of the well-separated pairdecomposition is sub-quadratic, it often gives us more efficient algorithms than examining all the pairwise distances. Indeed, this is the intuition behind many ap- plications of the geometric well-separated pair decomposition. By using the same intuition, we show the application of well-separated pair decomposition in several proximity problems under the unit-disk graph metric.Specifically, we consider the following natural proximity problems. Assume that S1⊆ S.• Furthest neighbor, diameter, center. The furthest neighbor of p ∈ S1is the point in S1that maximizes the distance to p. Related problems include computing the diameter, the maximum pairwise shortest distance for points in S1, and the center, the point that minimizes the maximum distance to all the other points.• Nearest neighbor, closest pair. The nearest neighbor of p ∈ S1is the point in S1with the minimum distance to p. Related problems include computing the closest pair, the pair with minimum shortest distance, and the bichromaticclosest pair, the pair that minimizes distance between points from two different sets.• Median. The median of S is the point in S that minimizes the average (or total) distance to all the other points.• Stretch factor. For a graph G defined on S, its stretch factor with respect tothe unit-disk graph metric is defined to be the maximum ratio πG(p,q)/π(p,q)3where πG,π are the distances induced by G and by the unit-disk graph, re- spectively.All the above problems can be solved or approximated efficiently for points in the Euclidean space. However, for the metric induced by a graph, even for planar graphs, very little is known other than solving the expensive all-pairs shortest path problem. For computing diameter, there is a simple linear time method that achieves a 2- approximation and a 4/3-approximate algorithm with running time O(m√nlogn+ n2logn), for a graph with n vertices and m edges, by Aingworth et al.[2].By using the powerful tool of the well-separated pair decomposition, we are able to obtain, for all the above problems, nearly linear time algorithms for computing 2.42-approximation2and O(n√nlogn/ε3) time algorithms for computing (1 + ε)- approximation for any ε > 0. In addition, the well-separated pair decomposition can be used to obtain an O(nlogn/ε4) space distance oracle so that any (1 + ε) distance query in the unit-disk graph can be answered in O(1) time. While the existence of almost linear size well-separated pair decomposition has reduced the number of pairs needed to examine when solving many proximity prob- lems, we still need good approximation of the distances between those pairs. Our construction algorithm only produces well separated pair decompositions without knowing an accurate approximation of the distances. For approximation algorithms, we need accurate estimation of shortest distances between O(nlogn) pairs of points in the unit-disk graph. Indeed, the approximation ratio and the running time of ouralgorithms are dominated by the efficiency of such algorithms. Once the distance estimation has been made, the rest of computation only takes almost linear time. For a general graph, it is unknown whether O(nlogn) pairs shortest path dis-tances can be computed significantly faster than all pairs shortest path distances. For the planar graph, one can compute O(nlogn) pairs shortest path distance in O(n√nlogn) time by using separators with O(√n) size[3]. This method extends to the unit-disk graph with constant bounded density since such graphs enjoy similar separator property as the planar graphs[27, 31]. As for approximation, Thorup[33] recently discovered an algorithm for planar graphs that can answer any (1 + ε)- shortest distance query in O(1/ε) time after almost linear time preprocessing. Un- fortunately, Thorup’s algorithm uses balanced shortest-path separators in planar graphs which do not obviously extend to the unit-disk graphs. On the other hand, it is known that there does exist planar 2.42-spanner for a unit-disk graph[25]. By applying Thorup’s algorithm to that planar spanner, we can compute 2.42- approximate shortest path distance for O(nlogn) pairs in almost linear time. Another application of well-separated pair decomposition is that we are able to obtain an almost linear size data structure to answer (1 + ε)-approximate shortest path query in O(1) time. Approximate distance oracles have been studied where the emphasis is often on the size of the oracles (for a survey, see[36]). For general graphs,2For a minimization problem, a quantityˆ‘ is a c-approximation of ‘ if ‘ ≤ˆ‘ ≤ c‘. An object ˆO is a c-approximation of O with respect to a cost function f if f(O) ≤ f(ˆO) ≤ cf(O). For a maximization problem,ˆ‘ is a c-approximation of ‘ if ‘/c ≤ˆ‘ ≤ ‘, andˆO is a c-approximation of O if f(O)/c ≤ f(ˆO) ≤ f(O).4it has been shown that it is possible to construct a (2k − 1)-approximate distance oracle with size O(kn1+1/k)[34]. It is also shown in[34]that this bound is tight for some small k’s and is conjectured to be tight for all the k’s. For planar graphs, Thorup[33]and Klein[23]have shown that there exists (1+ε)-approximate distance oracle by using almost linear space for any ε > 0. As mentioned before, their results do not extend to the unit-disk graph. In addition, the query time of their algorithm is O(1/ε). Recently, Gudmundsson et al. showed that when a geometric graph is an Euclidean spanner, there does exist an almost linear time (and therefore almost linear space) method to construct (1+ε)-approximate and O(1) query time distance oracles[18]. Again, a unit-disk graph is not necessarily an Euclidean spanner with bounded stretch factor, and their technique does not extend.3DefinitionsUnit-disk graphs. Denote by d(·,·) the Euclidean metric. For a set of pointsS in the plane, the unit-disk graph I(S) = (S,E) is defined to be the weighted graph where an edge e = (p,q) is in the graph if d(p,q) ≤ 1, and the weight of eis d(p,q). Likewise, we can define the unit-ball graph for points in higher dimensions.Metric space. Suppose that (S,π) is a metric space where S is a set of elementsand π the distance function defined on S × S.For any subset S1⊆ S, the di- ameter Dπ(S1) (or D(S1) when π is clear from the context) of S is defined to be maxs1,s2∈S1π(s1,s2). The distance π(S1,S2) between two sets S1,S2⊆ S is defined to be mins1∈S1,s2∈S2π(s1,s2). In this paper, we are interested in the unit-disk graph metric π = πI(S)induced by the unit-disk graph of a set of points S in the plane, where the distance betweenany two nodes is defined to be the length of the shortest path between them.Well-separated pair decomposition. For a metric space (S,π), two non-empty subsets S1,S2⊆ S are called c-well-separated if π(S1,S2) ≥ c·max(Dπ(S1),Dπ(S2)).Following the definition in[10], for any two sets A and B, a set of pairs P = {P1,P2,··· ,Pm}, where Pi= (Ai,Bi), is called a pair decomposition of (A,B) (or of A if A = B) if• for all the i’s, Ai⊆ A, and Bi⊆ B;• Ai∩ Bi= ∅;• for any two elements a ∈ A and b ∈ B, there exists a unique i such that a ∈ Ai, and b ∈ Bi. We call (a,b) is covered by the pair (Ai,Bi).If in addition, every pair in P is c-well-separated, P is called a c-well-separated pair decomposition (or c-WSPD in short). Clearly, any metric space admits a c- WSPD with quadratic size by using the trivial family that contains all the pairwise elements.54Well-separated pair decomposition for unit-disk graph metricWe start with the point set with constant bounded density. Then, by combining with geometric well-separated pair decomposition, we show the extension of the result to arbitrary point sets. We will focus our discussion on points in the plane, but most results extend to higher dimensions, resulting sub-quadratic size well-separated pair decomposition. We also show that our bounds in IRkfor k ≥ 3 are tight.4.1Point sets with constant bounded densityThe density α of a point set S is defined to be the maximum number of points in S covered by a unit disk. S has constant bounded density if its density is O(1). We assume that the unit-disk graph on S is connected; otherwise, we can consider each connected component separately.To construct a w