
I need to create a synthetic spatial dataset to test my spatial query. The dataset should cover an area of 10000 x 10000 and contain 20000+ rectangles (MBRs) and 20000+ data points.

The points may lie outside the rectangles or on their boundaries, but not inside any rectangle.

Should I generate random rectangles and random points and then check the condition, or is there a more logical way to accomplish this task?

How can I use the random function to generate float values in the range 0-10000 in C++? As I said, I have to generate 20000+ points and rectangles, so I need a good incremental strategy for generating float values.


Tim Bergel
web2dev
  • A random float value within range 0-10000: `float f = (float)rand()/RAND_MAX*10000`. – barak manos May 13 '14 at 19:15
  • Put the rectangles (or points) in order from left to right using the left edge as the key. Verify that for each rectangle `n` (moving from left to right): if there are any `other` rectangles after `n` such that `n.right >= other.left` then we should also check that the vertical bounds between `n.top | n.bottom` don't intersect with `other.top | other.bottom` (same goes for your points as well). For your sort algorithm, using a modified insertion sort is probably not a bad idea so you don't have to re-sort for each insertion. – sircodesalot May 13 '14 at 19:17
  • There have got to be some additional constraints on the rectangles. For instance, if the first rectangle you generate just happens to fill the space there is no feasible solution. – pjs May 13 '14 at 19:29
  • The rectangle height and width can be the same for all rectangles. I don't want to generate rectangles so big that they cover the whole space; as I said, I need 20000+ rectangles. – web2dev May 13 '14 at 19:33
  • What language and language standard is used? If this is C++ and if it is allowed, consider using a `map` and `set` container for all rectangles, where each key is a `tuple` or a `std::string`, and each value consists of the set of four points of a given rectangle. The `map` will sort the elements as you define it for the key. If using a `std::string`, then it will be necessary to create a comparison, such that the elements are sorted as you want. – CPlusPlus OOA and D May 13 '14 at 19:37
  • It is not that simple algorithm-wise. First you have to prove that you can fit 2000+ rectangles in a 1000x1000 where each rectangle is min/max LxW. – Moataz Elmasry May 13 '14 at 19:38
  • @sircodesalot: Can you give a small example? Why do I need to use a sort algorithm? How do I generate the points? – web2dev May 13 '14 at 19:39
  • Further, you have to index the rectangles somehow so that the next random rectangle fits into the grid with the least number of iterations (ideally O(1)); the same goes for points. C++ or whatever language is the last of your worries. – Moataz Elmasry May 13 '14 at 19:40
  • @MoatazElmasry: And if I consider intersecting rectangles, then what? – web2dev May 13 '14 at 19:42
  • Yes, you definitely have to check for intersecting rectangles, but you don't want to compare each new rectangle with all other rectangles, or you get at least O(N^2) runtime, if not more. You can get a good idea by reading http://en.wikipedia.org/wiki/Binary_space_partitioning, or about point clustering algorithms, and try to use similar approaches for rectangles. – Moataz Elmasry May 13 '14 at 19:51
  • An alternative to BSP is a quadtree; here's an example for spatial data: http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves – Moataz Elmasry May 13 '14 at 19:53
  • One more thing: can you define some constraints, e.g. that a grid with a given max length and width can hold at most x rectangles where each rectangle has a max length/width of x,y, and prove that this holds? Otherwise your use case might work sometimes and sometimes not, depending on the input. – Moataz Elmasry May 13 '14 at 20:03
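
The suggestions in the comments (a proper random float generator, plus some spatial index so that a point is not tested against every rectangle) could be combined roughly as follows. This is only a sketch under stated assumptions, not a definitive solution: it uses `std::mt19937` with `std::uniform_real_distribution` instead of `rand()`, swaps the BSP/quadtree idea for a simpler uniform grid, and assumes rectangles are allowed to overlap each other, since the question only restricts the points. The names `Rect`, `Point`, `Dataset`, `strictlyInside` and `generate`, and the parameters `maxSide` and `cell`, are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Axis-aligned rectangle (MBR): lower-left (x1, y1), upper-right (x2, y2).
struct Rect { float x1, y1, x2, y2; };
struct Point { float x, y; };
struct Dataset { std::vector<Rect> rects; std::vector<Point> points; };

// True only if p is strictly inside r; points on the boundary are allowed.
bool strictlyInside(const Point& p, const Rect& r) {
    return p.x > r.x1 && p.x < r.x2 && p.y > r.y1 && p.y < r.y2;
}

// Assumed parameters (not from the question): maxSide is the largest
// rectangle side, cell is the side of one grid cell.
Dataset generate(std::size_t nRects, std::size_t nPoints, unsigned seed,
                 float extent = 10000.0f, float maxSide = 50.0f,
                 float cell = 100.0f) {
    assert(maxSide > 1.0f && maxSide < extent && cell > 0.0f);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> side(1.0f, maxSide);
    std::uniform_real_distribution<float> coord(0.0f, extent);

    // Uniform grid: each cell stores the indices of rectangles touching it.
    const int cells = static_cast<int>(extent / cell) + 1;
    auto cellOf = [&](float v) {
        return std::min(cells - 1, static_cast<int>(v / cell));
    };
    std::vector<std::vector<std::size_t>> grid(
        static_cast<std::size_t>(cells) * cells);

    Dataset d;
    d.rects.reserve(nRects);
    for (std::size_t i = 0; i < nRects; ++i) {
        const float w = side(gen), h = side(gen);
        // Pick the lower-left corner so the rectangle stays in the area.
        const float x = std::uniform_real_distribution<float>(0.0f, extent - w)(gen);
        const float y = std::uniform_real_distribution<float>(0.0f, extent - h)(gen);
        const Rect r{x, y, x + w, y + h};
        d.rects.push_back(r);
        for (int cy = cellOf(r.y1); cy <= cellOf(r.y2); ++cy)
            for (int cx = cellOf(r.x1); cx <= cellOf(r.x2); ++cx)
                grid[static_cast<std::size_t>(cy) * cells + cx].push_back(i);
    }

    // Rejection sampling: keep a point only if no rectangle registered in
    // its grid cell strictly contains it. The attempt cap avoids an endless
    // loop when the rectangles cover (almost) the whole area.
    d.points.reserve(nPoints);
    const std::size_t maxAttempts = 100 * nPoints + 100;
    std::size_t attempts = 0;
    while (d.points.size() < nPoints && attempts++ < maxAttempts) {
        const Point p{coord(gen), coord(gen)};
        const auto& bucket =
            grid[static_cast<std::size_t>(cellOf(p.y)) * cells + cellOf(p.x)];
        const bool inside = std::any_of(
            bucket.begin(), bucket.end(),
            [&](std::size_t i) { return strictlyInside(p, d.rects[i]); });
        if (!inside) d.points.push_back(p);
    }
    return d;
}
```

A call such as `generate(20000, 20000, 42u)` would then produce the rectangles and points in one pass. Because of the attempt cap, the function may return fewer points than requested if the rectangles cover most of the space, which is the feasibility concern raised in the comments. If the rectangles must also be disjoint, the same grid could be queried before accepting each new rectangle.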
