R-Trees, Oct-Trees, and similar spatial trees can quickly find close points for arbitrary sets of points. The search goes roughly like this (a code sketch follows the steps):
- Find the leaf containing your point (or the leaf closest to it).
- Within that leaf, find the closest contained point.
- Search the tree again for any leaf intersecting the volume centred on your point and reaching out to the closest point just found.
- Excluding the already-searched leaf, search those intersecting leaves for a closer point.
- You now have the closest point.
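Below is a minimal sketch of those steps in Python. It uses a uniform grid of buckets as a stand-in for the tree's leaves (the cell size, 2-D points, and a non-empty point set are assumptions here); the candidate-then-refine logic is the same as with an R-Tree or Oct-Tree:

```
import math
from collections import defaultdict

CELL = 1.0  # assumed bucket size, standing in for a leaf's extent

def build_grid(points, cell=CELL):
    """Bucket 2-D points by integer cell coordinates (the stand-in "leaves")."""
    grid = defaultdict(list)
    for x, y in points:
        grid[(math.floor(x / cell), math.floor(y / cell))].append((x, y))
    return grid

def nearest(grid, q, cell=CELL):
    """Find a candidate in the query's cell, then re-search every cell
    intersecting the circle through that candidate for anything closer."""
    dist = lambda p: math.hypot(p[0] - q[0], p[1] - q[1])
    home = (math.floor(q[0] / cell), math.floor(q[1] / cell))

    # Steps 1-2: locate the cell containing the query (expanding outwards if
    # it is empty) and take the closest point found so far as the candidate.
    ring, candidates = 0, []
    while not candidates:
        candidates = [p for cx in range(home[0] - ring, home[0] + ring + 1)
                        for cy in range(home[1] - ring, home[1] + ring + 1)
                        for p in grid.get((cx, cy), [])]
        ring += 1
    best = min(candidates, key=dist)
    r = dist(best)

    # Steps 3-5: re-search every cell intersecting the radius-r circle
    # around the query for a closer point.
    span = math.ceil(r / cell)
    for cx in range(home[0] - span, home[0] + span + 1):
        for cy in range(home[1] - span, home[1] + span + 1):
            for p in grid.get((cx, cy), []):
                if dist(p) < r:
                    best, r = p, dist(p)
    return best

grid = build_grid([(0.2, 0.3), (2.5, 2.5), (0.9, 1.1)])
print(nearest(grid, (1.0, 1.0)))   # -> (0.9, 1.1)
```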
If the data changes, just use an R-Tree/Oct-Tree/spatial tree. It will give you O(log N) lookups.
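For example, using the third-party `rtree` package (a binding to libspatialindex); the point coordinates here are just made-up sample data:

```
from rtree import index

idx = index.Index()
points = [(2.0, 3.0), (5.0, 1.0), (9.0, 7.0)]
for i, (x, y) in enumerate(points):
    idx.insert(i, (x, y, x, y))        # store each point as a zero-area box

qx, qy = 4.0, 2.0
nearest_id = next(idx.nearest((qx, qy, qx, qy), 1))   # nearest-neighbour query
print(points[nearest_id])              # -> (5.0, 1.0)

# The index is dynamic, so entries can be removed/added as the data changes.
x, y = points[nearest_id]
idx.delete(nearest_id, (x, y, x, y))
```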
But if the data does not change, or re-indexing is cost-effective, you can pre-index the data.
One way is to simply create a hash table over all known data points plus a sub-sampling of "grid points". The value in the hash points to the tree node(s) the point is located in. The grid points are there so that novel data points can be mapped to their closest grid point to start the search off. Depending on how the hash is calculated this can speed up searching the data significantly: O(log N) -> amortized O(1).
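A rough sketch of that hash-table idea, assuming a hypothetical `leaf_of(point)` helper that returns the tree leaf containing a point, and an assumed spacing for the sub-sampled grid points:

```
GRID = 0.5  # assumed grid spacing for the sub-sampled grid points

def snap(p, grid=GRID):
    """Closest grid point to p."""
    return (round(p[0] / grid) * grid, round(p[1] / grid) * grid)

def build_point_index(points, leaf_of, grid=GRID):
    """Hash every known data point, plus the grid point next to it, to its leaf.
    `leaf_of` is a hypothetical helper that finds the tree leaf for a point."""
    table = {}
    for p in points:
        table[p] = leaf_of(p)                # known data points
        g = snap(p, grid)
        table.setdefault(g, leaf_of(g))      # grid point covering this neighbourhood
    return table

def start_leaf(table, q, grid=GRID):
    """Map a (possibly novel) query to a starting leaf in amortized O(1)."""
    if q in table:
        return table[q]
    return table.get(snap(q, grid))          # None -> fall back to a full tree search
```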
The next way is to flatten the trees themselves out into arrays of segments, each segment being a bound on that dimension, e.g. [0 <= _ < 3, 3 <= _ <= 8, 8 < _ <= 9, ...]
(In fact you can optimise this further: each pair of adjacent cells shares a boundary number, and if <= 9 is the same as < 10 you can flatten even further by converting every bound to the same operator and moving the number, leaving just a sorted list of boundaries.)
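For example, one flattened dimension becomes a sorted list of boundaries and a binary search; this sketch assumes everything has already been converted to the half-open [lo, hi) convention:

```
import bisect

# Boundaries [3, 8, 9] define segments (-inf, 3), [3, 8), [8, 9), [9, +inf).
boundaries = [3, 8, 9]

def segment_of(x, bounds=boundaries):
    """Index of the segment containing x."""
    return bisect.bisect_right(bounds, x)

print(segment_of(2))   # 0 -> (-inf, 3)
print(segment_of(8))   # 2 -> [8, 9)
```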
To create the segment arrays, iterate through the points and add each one into its own segment. For the moment, keep them in a list wherever multiple points share the same coordinate on that dimension.
Go through your segments, and where a segment holds more than one element, pick another dimension and repeat for just those listed elements.
At worst you will have an N-dimensional lookup. If you pick the same dimensions in the same order, lookups are easier to perform. If you can pick a different dimension on each dereference you can reduce the overall depth, at the cost of complexity.
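Here is a rough sketch of that construction and lookup, assuming distinct points given as tuples and dimensions picked in the same fixed order on every level:

```
import bisect

def build_segments(points, dim=0, ndims=2):
    """One bound per distinct coordinate on `dim`; wherever several points
    share that coordinate, recurse on the next dimension."""
    bounds = sorted({p[dim] for p in points})
    cells = []
    for b in bounds:
        shared = [p for p in points if p[dim] == b]
        cells.append(shared[0] if len(shared) == 1
                     else build_segments(shared, (dim + 1) % ndims, ndims))
    return {"dim": dim, "bounds": bounds, "cells": cells}

def lookup(node, q):
    """Dereference one segment array per dimension until a point is reached."""
    while isinstance(node, dict):
        i = max(bisect.bisect_right(node["bounds"], q[node["dim"]]) - 1, 0)
        node = node["cells"][i]
    return node

index = build_segments([(1, 5), (1, 9), (4, 2)])
print(lookup(index, (1, 8.5)))   # -> (1, 5): the segment starting at 5 on dim 1
```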