 This match is a followup to SensorFusion marathon match. The problem to be solved is very similar to the original SensorFusion problem. The changes include a different data set and a revised scoring approach. Additionally, you are given an access to above 80 articles devoted to this and related problems. Finally, you are allowed to reuse implementations and/or ideas of winners of the original marathon match.
GPS does not work indoors and even in an outdoor environment is usually accurate to no more than 10 meters of actual path taken. Modern smartphone are equipped with accelerometers, gyroscopes, magnetometers and other sensors that can be used to approximate human motion and therefore determine an approximate location from a known start point, otherwise known as dead reckoning. The problem is that these consumer grade sensors are typically low quality and drift over time meaning that error will be introduced over time and eventually compound to the point where the derived position is no longer within the tolerance for accuracy. This is compounded by the fact that the user may hold the phone in an unknown and inconsistent orientation during movement.
This challenge asks you to try to deal with the mentioned problem. You will need to implement an online algorithm that will receive the data from a smartphone's sensors. At the end of each second the algorithm will need to calculate as precise estimate of the current user's location as possible.
Training data
The data in this problem consists of 22 routes. They are randomly distributed into training, provisional and final test sets with 10, 4 and 8 routes, respectively. Additionally, all 17 routes used for the original marathon match can also be used for training purposes.
The entire corpus of training data can be downloaded here. Each folder corresponds to one route. Routes with numbers 1 to 17 are from the original match. Files ACCELEROMETER.txt, GYROSCOPE.txt, LIGHT.txt, LINEAR_ACCELERATION.txt, MAGNETIC_FIELD.txt, ORIENTATION.txt contain data collected from various sensors. First one or several lines explain the data, then the data itself follows, one entry per line. Some additional documentation about sensors is located here. File LOCATION.txt contains GPS information collected along the route. It will be used to evaluate the quality of your algorithm. (There is some inaccuracy in this data, but this is the most accurate representation of these routes that is available.) File LOCATION_KML.kml contains the same information, but in KML format. You can directly open it in Google Earth and see what the route looks like. Finally, files START_LOCATION.txt and START_LOCATION_KML.kml contain only the starting point of the route, i.e., the dead reckoning (in other words, this is the first data line from LOCATION.txt and LOCATION_KML.kml).
All data was collected on a Samsung Galaxy S2 smartphone using the Data Recording application located here.
Implementation
You will need to implement two methods:
 init. It will be called in the beginning of your program. The parameter startLocation contains the single data line from START_LOCATION.txt file. You can return any int from this method, the return value will be ignored.
 estimateLocation. It will be called once per each second of the route (in other words, once for each data line of LOCATION.txt file, except the start location line). The parameters accelerometer, gyroscope, light, linearAcceleration, magneticField, orientation will contain all the data from files ACCELEROMETER.txt, GYROSCOPE.txt, LIGHT.txt, LINEAR_ACCELERATION.txt, MAGNETIC_FIELD.txt, ORIENTATION.txt that was collected during this second, one data line per element. The return value should be an estimate of the user's location at the end of this second. It should be a double[] containing three elements in this order: longitude (degree, in 180..180 range), latitude (degrees, in 90..90 range) and altitude (meters, in 10000..10000 range).
Scoring
Let's assume the route contains N+1 points (M[0], M[1], ..., M[N]) and your algorithm estimates the route as (P[0] = M[0], P[1], ..., P[N]). Let dist(A, B) be the Euclidean distance between points A and B (in meters).
Let us define PathLen[i] := dist(M[0], M[1]) + dist(M[1], M[2]) + ... + dist(M[i1], M[i]). This is a distance traveled along the path up to the ith point.
Now, let EstimateErr[i] be equal to max{0.0, (dist(M[i], P[i])  10) / PathLen[i]}. This is the error of your ith estimated point (relative to distance traveled along the path). The "minus 10" part takes care of possible GPS error (which we assume can reach up to 10 meters).
The estimate error is converted to estimate score as follows:
EstimateScore := 1.0, if EstimateErr <= 0.02,
1.0  10.0 / 3.0 * (EstimateErr  0.02), if 0.02 <= EstimateErr <= 0.05,
0.9  8.0 * (EstimateErr  0.05), if 0.05 <= EstimateErr <= 0.15,
0.00225 / EstimateErr^2, if EstimateErr >= 0.15.
Your score for a single test is equal to the arithmetic average of EstimateScore over all N points of the route. Your overall score on a set of test cases is equal to 1,000,000 * (arithmetic average of score on all test cases from the set).
Scoring implementation notes:
 Earth is modeled as a precise sphere with a radius of 6371 kilometers.
 GPS sometimes fails (in most of test cases fairly rarely, but in some cases quite often). We use the following criteria to detect that: if GPS returns absolutely the same coordinates at seconds i and i1, we assume that coordinates for second i are not trustworthy. The values of EsimateScore are only computed for seconds where we have a GPS location which is trustworthy.
 Failure of any kind (time/memory limit, crash of your solution, invalid return value) results in a score of 0 for this test case.
A Java implementation of testing and scoring method can be downloaded here. It is as close as possible to the one which is actually used for serverside testing and scoring.
Additional data
The following archive contains above 80 articles devoted to problems similar or related to this one. You can freely use all ideas in these articles in your submissions. The Excel spreadsheet within the archive contains the summary of each article, including its topic. The stakeholder of this match feels that step estimation and implementation papers are most useful.
You are also allowed to reuse any code and/or ideas from the best 5 solutions of the original SensorFusion marathon match.
Special conditions
In order to be eligible for a prize, your solution must satisfy to the following 3 criteria:
 It must not be based on an assumption that phone's holder always moves with a constant speed.
 It must not be based on an assumption that phone always has a constant orientation.
 It must perform better (get a strictly higher system test score) than alegro's solution that won the original SensorFusion match. This solution is submitted to this match under mm_tester's account.
You can include open source code in your submission provided that it is clearly identified in your code and the following conditions are all fulfilled:
 Such inclusion does not violate the match rules.
 Your solution can be used for commercial purposes.
 If your solution is included into a software product, this product can be distributed with undisclosed source code.
In order to receive the prize money, you will need to fully document your algorithm and the derivation of all parameters internal to it. If these parameters were obtained from the training data set, you will also need to provide the program used to generate these training parameters. There is no restriction on the programming language used to generate these training parameters. Note that all these data should NOT be submitted anywhere during the coding phase. Instead, if you win a prize, a TopCoder representative will contact you directly in order to collect these data. The description must be submitted within 7 days after the contest results are published.
We reserve the right not to release the source code of all solutions after the end of the match. We also reserve the right to restrict the discussion of possible solution approaches at forums after the end of the match.
