Contest: SpaceNet Challenge
Problem: BuildingDetector

Problem Statement


Prize Distribution

              Prize             USD
  1st                         $10,000
  2nd                          $8,000
  3rd                          $6,500
  4th                          $5,500
  5th                          $4,500
  Docker Incentive               $500
  Total Prizes                $35,000

Docker Incentive - $500

We will ask the 5 winners to submit a Docker file for their solution. We will pay the winners who submit a Docker file an additional $100 each for their efforts.


The commercialization of the geospatial industry has led to an explosive amount of data being collected to characterize our changing planet. One area for innovation is the application of computer vision and deep learning to extract information from satellite imagery at scale. DigitalGlobe, CosmiQ Works, and NVIDIA have partnered to release the SpaceNet data set to the public to enable developers and data scientists.

Today, map features such as roads, building footprints, and points of interest are primarily created through manual techniques. We believe that advancing automated feature extraction techniques will serve important downstream uses of map data. For example, this type of map data is needed for planning international events like the 2016 Rio Olympics, as well as humanitarian and disaster response as recently observed in Haiti to aid the response to Hurricane Matthew. Furthermore, we think that solving this challenge is an important stepping stone to unleashing the power of advanced computer vision algorithms applied to a variety of remote sensing data applications in both the public and private sector.

Further background information about the contest can be found at the SpaceNet Challenge Minisite.


Can you help us automate mapping? In this challenge, competitors are tasked with developing automated methods for extracting map-ready building footprints from high-resolution satellite imagery. Moving towards more accurate fully automated extraction of buildings will help bring innovation to computer vision methodologies applied to high-resolution satellite imagery, and ultimately help create better maps where they are needed most.

Your task will be to extract polygonal areas that represent buildings from satellite images. The polygons your algorithm returns will be compared to ground truth data, and the quality of your solution will be judged by a combination of precision and recall; see Scoring for details.

Input Files

Satellite images

Two types of images are available for the target area: traditional 3-band RGB images and 8-band images. The images were collected by the DigitalGlobe Worldview-2 satellite over Rio de Janeiro, Brazil. The training data contains more than 7,000 images, each image covers 200m x 200m on the ground. 3-band images have a pixel resolution of ~50cm, 8-band images have a resolution of ~2m. Worldview-2 is sensitive to light in a wide range of wavelengths. 3-band Worldview-2 images are standard natural color images, which means they have three channels containing reflected light intensity in thin spectral bands around the red, green and blue light wavelengths (659, 546 and 478 nanometres (nm) respectively). The 8-band multispectral images contain spectral bands for coastal blue, blue, green, yellow, red, red edge, near infrared 1 (NIR1) and near infrared 2 (NIR2) (corresponding to center wavelengths of 427, 478, 546, 608, 659, 724, 833 and 949 nm, respectively). This extended range of spectral bands allows Worldview-2 8-band imagery to be used to classify the material that is being imaged.

Images are provided in GeoTiff format. 3-band images can be viewed by most image viewer applications and can be processed by most imaging libraries in all programming languages. Processing 8-band images is more involved; we provide tools to extract individual bands into separate images, see the Visualizer description for details.

Building footprints

The location and shape of known buildings are referred to as ‘ground truth’ in this document. These data are described in CSV files using the following format:


AOI_1_RIO_img1,1,"POLYGON ((103.7 205.4 0,107.8 201.9 0,
100.5 203.2 0,94.4 208.0 0,93.0 215.7 0,92.1 226.1 0,89.6 228.8 0,
95.0 233.9 0,92.4 236.6 0,95.4 239.8 0,116.7 221.0 0,116.7 216.1 0,
103.7 205.4 0))","POLYGON ((-43.681699199999969 -22.981289 0, ... 
[truncated for brevity]...))"


AOI_1_RIO_img3,1,"POLYGON ((213 269 0,184 221 0,
130 249 0,154 293 0,162 288 0,165 292 0,169 290 0,180 285 0,
213 269 0),(151 253 0,177 242 0,189 263 0,164 275 0,151 253 0))"
,"POLYGON ((...[truncated for brevity]...))"

(Each sample record above is a single line of text; extra line breaks are added only for readability.)

  • ImageId is a string that uniquely identifies the image.
  • BuildingId is an integer that identifies a building in the image, it is unique only within an image. A special id of -1 is used to mean that there are no buildings in the image.
  • PolygonWKT_Pix specifies the points of the shape that represents the building in Well Known Text (WKT) format. Only the POLYGON object type of the WKT standard is supported. If BuildingId is -1 then the POLYGON EMPTY construct is used in this column. The coordinate values represent pixels; the origin of the coordinate system is at the top left corner of the image, the first coordinate is the x value (positive is right), the second is the y value (positive is down), and the third is always 0. Note that polygons must be closed: the first and last entries in their point list must be the same.
  • PolygonWKT_Geo specifies the points of the same shape in geographical coordinates, as {longitude, latitude, 0} triplets (note the longitude-first order, as in the sample above).

Usually a building is defined by a single polygon that represents the exterior boundary of its shape. Sometimes this is not enough; see building #1 on image AOI_1_RIO_img1194 for an example shape that contains a hole. In such cases two (or more) polygons are needed: the first always defines the exterior edge, and the second (third, fourth, etc.) define the interior rings (holes). (See the second sample record above for the required syntax in this case.)
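As an illustration, the pixel-space WKT above can be handled with nothing but the standard library. The sketch below assumes the simple, non-nested POLYGON syntax used in these files (it is not a general WKT parser) and computes the net building area with the shoelace formula, subtracting any hole rings:

```python
import re

def parse_wkt_polygon(wkt):
    """Parse a WKT POLYGON with optional interior rings (holes) into a
    list of rings, each a list of (x, y) tuples. The trailing 0 of each
    coordinate triplet is dropped. Returns [] for POLYGON EMPTY."""
    if "EMPTY" in wkt:
        return []
    rings = []
    for ring_text in re.findall(r"\(([^()]+)\)", wkt):
        ring = []
        for triplet in ring_text.split(","):
            x, y, _z = (float(v) for v in triplet.split())
            ring.append((x, y))
        rings.append(ring)
    return rings

def shoelace_area(ring):
    """Unsigned area of a closed ring via the shoelace formula."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def building_area(rings):
    """Net area: exterior ring minus any holes (per the convention above)."""
    if not rings:
        return 0.0
    return shoelace_area(rings[0]) - sum(shoelace_area(r) for r in rings[1:])

# The sample record for AOI_1_RIO_img3: an exterior ring with one hole.
wkt = ("POLYGON ((213 269 0,184 221 0,130 249 0,154 293 0,162 288 0,"
       "165 292 0,169 290 0,180 285 0,213 269 0),"
       "(151 253 0,177 242 0,189 263 0,164 275 0,151 253 0))")
rings = parse_wkt_polygon(wkt)
assert rings[0][0] == rings[0][-1]          # rings must be closed
print(round(building_area(rings), 1))       # 2678.5
```

The same area computation is useful later for the minimum-area filter applied during scoring.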

As the resolution of the 3-band and 8-band imagery is different, building footprints are given separately for both image types. The ImageId, BuildingId and PolygonWKT_Geo fields of these two CSV files are the same, but the PolygonWKT_Pix coordinates differ.


  • The difference in PolygonWKT_Pix coordinate values between the 3-band and 8-band CSV files is a simple linear scaling, so in theory it would be enough to provide only one of the two.
  • The way holes are represented differs from the WKT standard. The standard mandates that the points of a polygon be enumerated in anti-clockwise order if the polygon represents a shape with positive area (exterior rings), and in clockwise order if it represents a shape with negative area (interior rings). However, what appears clockwise in the latitude/longitude-based geographic coordinate system appears anti-clockwise in image space, where the y axis points downwards. So, in the hope of causing less confusion, we chose a simpler approach: the first polygon is always positive, the rest are always negative.
  • It is known that some of the 8-band images are erroneous: they are either empty (all pixel values are 0) or contain unrealistically large values in the 2nd band. Your algorithm should be able to handle these and similar errors.
  • The ground truth data was created with a combination of manual and semi-automatic processes and you should be aware that it does contain some anomalies. Future editions of this contest will feature cleaner, better quality data, but we believe that despite the known anomalies in the current data the quality is good enough to make machine learning possible. The most frequent data anomalies include:
    • Misaligned buildings - These anomalies are typically a result of projection issues or mislabeling.
    • Missing buildings - These are generally labeling errors where the building was simply missed due to the scale of the data set, or the building was indistinguishable.
    • Errant polygons that do not cover buildings - Although we tried to remove as many of these polygons as possible, occasionally a polygon was generated in error and not removed.
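The linear scaling between 8-band and 3-band pixel coordinates mentioned above can be sketched as below. The factor of roughly 4 follows from the ~50 cm versus ~2 m ground resolutions, but it is an assumption here; in practice the factor should be derived from the actual image dimensions rather than hard-coded:

```python
def scale_8band_to_3band(rings_8band, scale_x, scale_y):
    """Map pixel coordinates from 8-band image space to 3-band image space.
    The scale factors should be derived from the actual image sizes, e.g.
    scale_x = width_3band / width_8band (roughly 4 given the ~50 cm vs
    ~2 m ground resolutions, but do not hard-code it)."""
    return [[(x * scale_x, y * scale_y) for (x, y) in ring]
            for ring in rings_8band]

# A corner at (10, 5) on an 8-band image maps to (40, 20) on the
# corresponding 3-band image with a factor of 4.
print(scale_8band_to_3band([[(10.0, 5.0)]], 4.0, 4.0))  # [[(40.0, 20.0)]]
```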


Input files are available for download from the spacenet-dataset AWS bucket. A separate guide is available that details the process of obtaining the data. Note especially the final chapter that describes where the training and testing data is within the bucket.

This Python script also helps you to download the files from AWS.

Output File

Your output must be a CSV file in a format almost identical to that of the building footprint definition files:

ImageId,BuildingId,PolygonWKT_Pix,Confidence

Your output file may or may not include the above header line. The rest of the lines should specify the buildings your algorithm extracted, one per line.

The required fields are:

  • ImageId is a string that uniquely identifies the image.
  • BuildingId is an integer that identifies a building in the image, it should be unique within an image and must be positive unless the special id of -1 is used. -1 must be used to signal that there are no buildings in the image.
  • PolygonWKT_Pix specifies the points of the shape that represents the building you found. The format is exactly the same as given above in the Input files section. It is important to know that the coordinates must be given in the scale of the 3-band images. So if you find a building that has a corner at (40, 20) on the 3-band image and at (10, 5) on the corresponding 8-band image, then your output file should list a (40 20 0) coordinate triplet in the shape definition.
  • Confidence is a positive real number, higher numbers mean you are more confident that this building is indeed present. See the details of scoring for how this value is used.

Your output must be a single file with .csv extension. Optionally the file may be zipped, in which case it must have .zip extension. The file must not be larger than 150MB and must not contain more than 2 million lines.
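The rules above can be sketched with the standard csv module. The detections dictionary and its contents below are hypothetical placeholders; the field order and the BuildingId -1 / POLYGON EMPTY convention follow the spec:

```python
import csv

def write_submission(path, detections):
    """Write a submission CSV. `detections` maps ImageId -> list of
    (wkt_pix, confidence) tuples; an empty list means no buildings were
    found in that image and is encoded with BuildingId -1 and
    POLYGON EMPTY, as required. The header line is optional."""
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["ImageId", "BuildingId", "PolygonWKT_Pix", "Confidence"])
        for image_id, buildings in detections.items():
            if not buildings:
                w.writerow([image_id, -1, "POLYGON EMPTY", 1])
                continue
            # BuildingIds must be positive and unique within an image.
            for bid, (wkt, conf) in enumerate(buildings, start=1):
                w.writerow([image_id, bid, wkt, conf])

write_submission("solution.csv", {
    "AOI_1_RIO_img1": [("POLYGON ((0 0 0,10 0 0,10 10 0,0 10 0,0 0 0))", 0.9)],
    "AOI_1_RIO_img2": [],  # no buildings detected in this image
})
```

The csv module quotes the comma-containing WKT fields automatically, matching the quoting seen in the ground truth files.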


This match uses the result submission style, i.e. you will run your solution locally using the provided files as input, and produce a CSV or ZIP file that contains your answer.

In order for your solution to be evaluated by Topcoder’s marathon system, you must implement a class named BuildingDetector, which implements a single function: getAnswerURL(). Your function will return a String corresponding to the URL of your submission files. You may upload your files to a cloud hosting service such as Dropbox or Google Drive, which can provide a direct link to the file.

To create a direct sharing link in Dropbox, right click on the uploaded file and select share. You should be able to copy a link to this specific file which ends with the tag "?dl=0". This URL will point directly to your file if you change this tag to "?dl=1". You can then use this link in your getAnswerURL() function.
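The "?dl=0" to "?dl=1" change above is a one-line string edit (the share URL below is hypothetical):

```python
# Hypothetical Dropbox share link, as copied from the share dialog.
shared = "https://www.dropbox.com/s/abc123/solution.zip?dl=0"
# Changing the tag makes the URL point directly to the file stream.
direct = shared.replace("?dl=0", "?dl=1")
print(direct)  # https://www.dropbox.com/s/abc123/solution.zip?dl=1
```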

If you use Google Drive to share the link, then please use the following format: "https://drive.google.com/uc?export=download&id=" + id

Note that Google has a file size limit of 25MB and can’t provide direct links to files larger than this. (For larger files the link opens a warning message saying that automatic virus checking of the file is not done.)

You can use any other way to share your result file, but make sure the link you provide opens the filestream directly, and is available for anyone with the link (not only the file owner), to allow the automated tester to download and evaluate it.

An example of the code you have to submit, using Java:

public class BuildingDetector {
  public String getAnswerURL() {
    // Replace the returned String with your submission file's URL
    return "";
  }
}
Keep in mind that your complete code that generates these results will be verified at the end of the contest if you achieve a score in the top 10, as described later in the “Requirements to Win a Prize” section, i.e. participants will be required to provide fully automated executable software to allow for independent verification of software performance and the metric quality of the output data.


A full submission will be processed by the Topcoder Marathon test system, which will download, validate and evaluate your submission file.

Any malformed or inaccessible file, or one that exceeds the maximum file size (150 MB) or the maximum number of lines (2 million) will receive a zero score.

If your submission is valid, your solution will be scored using the following algorithm.

  1. Sort the building polygons you returned in decreasing order of confidence.
  2. For each polygon find the best matching one from the set of the ground truth polygons. Loop over the truth polygons, and:
    1. Skip this truth polygon if it was already matched with another solution polygon.
    2. Skip this truth polygon if it belongs to a different ImageId than the solution polygon.
    3. Otherwise calculate the IOU (Intersection over Union, Jaccard index) of the two polygons.
    4. Note the truth polygon which has the highest IOU score if this score is higher than 0.5. Call this the ‘matching’ polygon.
  3. If there is a matching polygon found above, increase the count of true positives by one (TP).
  4. If there is no matching polygon found, increase the count of false positives by one (FP).
  5. When all solution polygons have been processed, for each truth polygon that is left unmatched, increase the count of false negatives by one (FN).

The precision and recall of your algorithm are defined as

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

The F-score of your algorithm is defined as 0 if either Precision or Recall is 0. Otherwise:

F_score = 2 * Precision * Recall / (Precision + Recall)

Your overall score is calculated as 1000000 * F_score.

Note that your returned confidence values are used only to sort the building footprints you extract, so that buildings with higher confidence values are matched first. The exact values don't matter; they only establish a ranking among the buildings.

Note also that, due to the clipping of building footprints at image chip boundaries, very small buildings may appear. As it is not realistic to expect your algorithm to find such buildings, all buildings with an area of less than 20 (measured in pixels on the 3-band imagery) will be ignored, both in the ground truth and in your solution.
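The matching procedure can be sketched as follows. For brevity, the IOU helper below works on axis-aligned bounding boxes rather than arbitrary polygons, which is a simplification (the real scorer computes IOU on full polygon geometry), but the greedy confidence-ordered matching and the precision/recall/F-score arithmetic follow the steps above:

```python
def iou_box(a, b):
    """IOU of two axis-aligned boxes (xmin, ymin, xmax, ymax). A
    stand-in for polygon IOU, to keep this sketch dependency-free."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union > 0 else 0.0

def score(solution, truth):
    """solution: list of (image_id, box, confidence); truth: list of
    (image_id, box). Returns (precision, recall, f_score)."""
    matched = set()
    tp = fp = 0
    # Step 1: try high-confidence detections first.
    for image_id, box, _conf in sorted(solution, key=lambda s: -s[2]):
        best_iou, best_idx = 0.5, None   # a match requires IOU > 0.5
        for i, (t_image_id, t_box) in enumerate(truth):
            # Steps 2.1-2.2: skip already-matched or wrong-image truth.
            if i in matched or t_image_id != image_id:
                continue
            iou = iou_box(box, t_box)
            if iou > best_iou:
                best_iou, best_idx = iou, i
        if best_idx is not None:         # steps 3-4: count TP or FP
            matched.add(best_idx)
            tp += 1
        else:
            fp += 1
    fn = len(truth) - len(matched)       # step 5: unmatched truth -> FN
    if tp == 0:
        return 0.0, 0.0, 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

truth = [("img1", (0, 0, 10, 10)), ("img1", (20, 20, 30, 30))]
solution = [("img1", (1, 0, 10, 10), 0.9),    # IOU 0.9 -> true positive
            ("img1", (50, 50, 60, 60), 0.8)]  # no overlap -> false positive
print(score(solution, truth))  # (0.5, 0.5, 0.5)
```

With one true positive, one false positive and one missed truth polygon, precision and recall are both 0.5, so the overall score would be 1000000 * 0.5 = 500000.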

For the exact algorithm of scoring see the visualizer source code.

Example submissions can be used to verify that your chosen approach to upload submissions works. The tester will verify that the returned String contains a valid URL, its content is accessible, i.e. the tester is able to download the file from the returned URL. If your file is valid, it will be evaluated, and precision, recall and F-score values will be available in the test results. The example evaluation is based on a small subset of the training data: 17 images and corresponding ground truth that you can find in the ./competition1/spacenet_TrainData directory of the downloaded data package (ImageIds in the range [AOI_1_RIO_img2000 ... AOI_1_RIO_img2016]).

Full submissions must contain in a single file all the extracted building polygons that your algorithm found in all images of the ./competition1/spacenet_TestData directory.

Final Scoring

The top 10 competitors according to the provisional scores will be given access to an AWS VM instance within 5 days of the end of the submission phase. If your score qualifies you as a top 10 competitor, you will need to load your code to your assigned VM, along with two scripts for running it:

  • should perform any necessary compilation of the code so that it can be run.
  • [Path to training data folder] [Path to testing data folder] [Output file] should run the code and generate an output file in one of the allowed formats (CSV or ZIP). The allowed time limit for the script is 8 hours. The training data folder contains the same data in the same structure as in the downloaded training data. The testing data folder contains similar data in the same structure as in the downloaded testing data. Both these folders contain the corresponding tar.gz data files already extracted in place (e.g. the training folder contains a ‘3band’ subfolder containing all the 3-band training images, etc).

Your solution will be subjected to three tests:

First, your solution will be validated, i.e. we will check if it produces the same output file as your last submission, using the same input files used in this contest.

Second, your solution will be tested against a set of new image files. The number and size of this new set of images will be similar to the testing data you downloaded, and the scene content will also be similar.

Third, the resulting output from the steps above will be validated and scored. The final rankings will be based on this score alone.

Competitors who fail to provide their solution as expected will receive a zero score in this final scoring phase, and will not be eligible to win prizes.

Additional Resources

  • A visualizer is available here that you can use to test your solution locally. It displays your extracted building footprints, the expected ground truth, and the difference between the two. It also calculates precision, recall and F-score values, so it serves as an offline tester. (But note that the visualizer does not enforce the limits on allowed file size and number of lines.) The visualizer code also contains utilities to extract a single image band from an 8-band GeoTiff file.
  • The SpaceNet Challenge Asset Library contains plenty of reading material and tools that make it easier to get started.

General Notes

  • This match is rated.

  • In this match you may use any programming language and libraries, including commercial solutions, provided Topcoder is able to run it free of any charge. You may also use open source languages and libraries, with the restrictions listed in the next section below. If your solution requires licenses, you must have these licenses and be able to legally install them in a testing VM (see “Requirements to Win a Prize” section). Submissions will be deleted/destroyed after they are confirmed. Topcoder will not purchase licenses to run your code. Prior to submission, please make absolutely sure your submission can be run by Topcoder free of cost, and with all necessary licenses pre-installed in your solution. Topcoder is not required to contact submitters for additional instructions if the code does not run. If we are unable to run your solution due to license problems, including any requirement to download a license, your submission might be rejected. Be sure to contact us right away if you have concerns about this requirement.

  • You may use open source languages and libraries provided they are equally free for your use, use by another competitor, or use by the client.

  • If your solution includes licensed software (e.g. commercial software, open source software, etc), you must include the full license agreements with your submission. Include your licenses in a folder labeled “Licenses”. Within the same folder, include a text file labeled “README” that explains the purpose of each licensed software package as it is used in your solution.

  • The usage of any external resources, like maps, real estate data, additional images, etc about the focus region are NOT allowed. Your solution should rely only on the provided satellite images and building footprint data files to generate your output. NOTE: this requirement is deprecated. See the contest forum for up to date information on external data usage.

  • Use the match forum to ask general questions or report problems, but please do not post comments and questions that reveal information about the problem itself or possible solution techniques.

  • The stakeholders of this competition are especially interested in how the presence of 8-band imagery improves the quality of results. They believe that the non-standard multispectral bands (beyond RGB) provide extra information that your algorithms can incorporate into their decisions, potentially even better than a trained human could.

Requirements to Win a Prize

In order to receive a prize, you must do all the following:

Achieve a score in the top 5, according to system test results. See the "Final scoring" section above.

Within 3 days of the match end, the top 10 finishers must provide the copilot and administrator with the VM requirements for their solution. Within 5 days of receiving the VM, the competitor must set it up with the solution so Topcoder can easily validate and run it. Once the final scores are posted and winners are announced, the top 5 winners have 7 days to submit a report outlining their final algorithm, explaining the logic behind their approach and the steps it takes. The report must be at least 2 pages long and should contain:

  • Your Information: first and last name, Topcoder handle and email address.

  • Approach Used: a detailed description of your solution, which should include: approaches considered and ultimately chosen; advantages, disadvantages and potential improvements of your approach; processing run times; detailed comments on libraries and open source resources used.

  • Local Programs Used: If any data (including pre-trained models and runtime parameters) were obtained from the provided training data, you will also need to provide the program(s) used to generate these data. The complete process of generating results from the input data must be reproducible.

  • Actual Source Code Used: See "Final Scoring" above for details.

If you place in the top 5 but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all of the above.



Method signature: String getAnswerURL()
(be sure your method is public)



This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2010, TopCoder, Inc. All rights reserved.