Problem Definition:

Please consider joining the mailing list to directly receive updates related to the competition.

Input: A set of possibly overlapping regions (polygons with and without interior rings) and a set of points.

Output: Pair each input point with one or more of the input polygons such that a given spatial condition is satisfied between the polygon and the point.

Description: A region can be any polygon (with or without interior rings) that is made up of straight line segments. A set of these regions that can possibly overlap with each other are provided as part of the input. The number of the regions will be no more than a 500. Additionally, a set of input points are given. The number of these points will be no more than 1 Million. The objective is to match each of these points with one or more of the polygons from the input based on a spatial predicate. We have two spatial predicates, explained as follows:

  1. INSIDE: which evaluates to TRUE if a point is inside the polygon. Notice that the point can be associated with one or more polygons (i.e., two overlapping polygons can contain the same point). A point can be INSIDE any polygon with a sequence number (i.e., timestamp) less than the sequence number of the point where only the latest position of each polygon (up to that sequence number) is considered.
  2. WITHIN n (e.g., WITHIN a distance of 1000 units): which evaluates to true if the point is at less than 1000 units distance from the polygon. Notice that each point may be associated with one or more of the polygons, as the same point can be within 1000 units distance from several polygons.
The input points are associated with an ID and the same point might appear multiple times in the input. So these input points can be thought of as moving points. Each instance of the input point may appear zero or more times in the output. If a point is not associated with any polygon, it does not appear in the output. Different instances of the input point may be associated with different polygons in the output. The polygons are also associated with a unique ID. And the location of a polygon can change over time: think of them as moving polygons. In the input sequence, the points should always look at the latest position of the polygon to find the right association. The shape of the polygon might change along with the location.

Assumptions about the data:

  • Data is assumed to be in the spherical Mercator Projection format (the projection used in Google, Bing and OpenStreet maps). That means, the calculations will be done in Cartesian space. We are limiting the scope to simple 2D Cartesian based system to keep the distance calculations simple. The distances calculated are not the actual distances on the globe, but are sufficient for the purpose of the competition. Here is an example of distance calculations in Cartesian space: http://math.info/Algebra/Distance_Cartesian_Plane/
  • Polygons are valid and are ordered CCW for outer-rings, and CW for inner rings. Polygons are valid according to the rules defined by OpenGIS Simple Features Specification for SQL, Revision1.1. This specification is available at: here . Note: CCW: Counter clockwise, CW: clockwise. And see the above document for more details on how holes are defined in a polygon.
  • The possible spatial predicates are: INSIDE and WITHIN_A_DISTANCE.
  • Number of polygons will vary for each test run. But the number will never be more than 500 polygons.
  • Number of points will vary for each test run. But the number will never be more than 1 Million.
  • The distance of a point to a polygon is the distance of the point to the closest point on the polygon. The resolution of the distance is two digits after decimal.
  • If the point is inside or on the boundary of the polygon then the distance is zero.
  • There is no positional uncertainty in the data.
  • Evaluation Rules:

  • The time to process all the input data will be the main criterion for the evaluation of the solution. So the solution with the best rate (points processed per minute) is the winner.
  • The accuracy of the result is also considered. If you process all the input points in 5 minutes but only 1000 points are accurately associated with the right polygon, then the rate will be 1000 points per 5 minutes.
  • For a test run, one of the predicates will be tested. There will be several test runs with different sets of input points and polygons. And the rate of processing of the points is averaged over all the test runs.
  • Input Data Format:
    Data input to the program is provided as two files. The first file contains the input points and the second file contains polygons used to describe the regions. Both the points and the polygons are in GML 2.1.1 format.

    Contestants need to provide an executable, named Geofence that can be invoked as follows:

    Geofence < "INSIDE" | "WITHIN n" > <PointInputFilePath> <PolygonInputFilePath> <OutputFilePath>

    The Geofence executable program takes four main arguments as input: (1) First, the predicate type must be specified (INSIDE or WITHIN n), (2) The path of the text file containing the data points information (PointInputFilePath), (3) The path of the text file containing the polygons information (PolygonInputFilePath), and (4) The path of the text file that contains the output (OutputFilePath).

    Points Format:
    POINT:ID:Sequence:<gml Point>

    Polygons Format:
    POLYGON:ID:Sequence:<gml Polygon>

    Output Data Format:
    Output should contain the Point ID, Sequence, Polygon ID, and Sequence.

    ID:Sequence:ID:Sequence

    Examples:
    Command

    Geofence INSIDE points.txt polygons.txt out.txt

    Points in the points.txt file:

    POINT:1:501:<gml:Point srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:coordinates decimal="." cs="," ts=" ">1.0,1.0 </gml:coordinates></gml:Point>

    POINT:2:502:<gml:Point srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:coordinates decimal="." cs="," ts=" ">100.0,1.0 </gml:coordinates></gml:Point>

    POINT:1:503:<gml:Point srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:coordinates decimal="." cs="," ts=" ">2.0,1.0 </gml:coordinates></gml:Point>

    POINT:1:504:<gml:Point srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:coordinates decimal="." cs="," ts=" ">11.0,1.0 </gml:coordinates></gml:Point>

    Polygons in the polygons.txt file:

    POLYGON:1:1: <gml:Polygon srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:outerBoundaryIs><gml:LinearRing><gml:coordinates decimal="." cs="," ts=" ">0.0,0.0 5.0,0.0 5.0,5.0 0.0,5.0 0.0,0.0 </gml:coordinates> </gml:LinearRing> </gml:outerBoundaryIs></gml:Polygon>

    POLYGON:2:2: <gml:Polygon srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:outerBoundaryIs><gml:LinearRing><gml:coordinates decimal="." cs="," ts=" ">20.0,20.0 25.0,20.0 25.0,25.0 20.0,25.0 20.0,20.0 </gml:coordinates></gml:LinearRing></gml:outerBoundaryIs><gml:innerBoundaryIs><gml:LinearRing><gml:coordinates decimal="."cs="," ts=" ">21.0,21.0 22.0,22.0 23.0,21.0 21.0,21.0 </gml:coordinates></gml:LinearRing></gml:innerBoundaryIs></gml:Polygon>

    POLYGON:1:503: <gml:Polygon srsName="EPSG:" xmlns:gml="http://www.opengis.net/gml"><gml:outerBoundaryIs><gml:LinearRing><gml:coordinates decimal="." cs="," ts=" ">10.0,0.0 15.0,0.0 15.0,5.0 10.0,5.0 10.0,0.0 </gml:coordinates></gml:LinearRing></gml:outerBoundaryIs></gml:Polygon>

    Correct Output in out.txt file

    1:501:1:1

    1:503:1:1

    1:504:1:503

    Note that the third instance of the first point is now associated with the second instance of the Polygon 1 (with ID 503). The second point is not inside any of the polygons, so this point does not appear in the output. The sequence numbers are generated so that each instance of the data in a file gets a unique number. Here, we consider the sequence numbers to be logical timestamps as recorded by a system. So no two points will have the same timestamp and no two polygons will have the same timestamp. When considering the association of points and polygons, the timestamp valuesshould be taken into account. A point can never be associated with a polygon that has a higher (or equal) timestamp value. When a polygon gets a new timestamp value, the old instance of that polygon ceases to exist. In our example, after the timestamp 503 when Polygon 1 gets a new instance, the old instance of Polygon 1 ceases to exist. So points with timestamp value 504 or higher can only see the new instance of the polygon with the timestamp value 503.

    GIS Cup Sample Data Set