Match Editorial

SRM 212
Saturday, September 25, 2004

Match summary

SRM 212 was a bit easier than average, giving coders still in the TopCoder Open a nice break before Round 4, and giving coders eliminated from the TCO a chance to build their confidence back up. The most interesting difficulty raised by this problem set was not in the 1000-point problems, but rather a tricky precision issue in the Division I easy and Division II medium problem.

Eryx finished in the lead for Division I, after jumping from third to first place with one of the few successful challenges in the match. jshute, with the fastest time on the Division I hard problem, came in second, and ploh finished third. Two yellow coders, dmks and riveria, came in fourth and fifth -- I expect they will soon be joining the ranks of the red coders.

In Division II, newcomer zahni took the top spot with the fastest time on the 1000-point problem and one successful challenge. He beat out second-place finisher zartheenumerato by almost 250 points. Rounding out the top 5 were coders HiltonLange, pingus, and tilps_kilm.

The Problems

YahtzeeScore

Used as: Division Two - Level One:

Value 250

Submission Rate 255 / 265 (96.23%)

Success Rate 237 / 255 (92.94%)

High Score Sabur for 249.94 points (0 mins 25 secs)

Average Score 232.44 (for 237 correct submissions)

To solve this problem, you must choose which die value yields the highest score, and then count the number of dice with that value in order to calculate that score. Since there are only 6 possible values, it is feasible to loop over all 6 values, and calculate the score for each. To calculate the score for a given value, you simply count the number of elements in the input equal to that value, and multiply the number by the value. In C++:

int maxPoints(vector<int> toss) {
     int max = 0;
     for (int i=1; i<=6; i++) {
         int score = 0;
         for (int j=0; j<5; j++)
             if (toss[j] == i)
                 score += i;
         if (score > max)
             max = score;
     }
     return max;
}

WinningRecord

Used as: Division Two - Level Two:

Value 500

Submission Rate 221 / 265 (83.40%)

Success Rate 125 / 221 (56.56%)

High Score zartheenumerato for 476.67 points (6 mins 20 secs)

Average Score 345.87 (for 125 correct submissions)

Used as: Division One - Level One:

Value 250

Submission Rate 187 / 192 (97.40%)

Success Rate 147 / 187 (78.61%)

High Score antimatter for 246.51 points (3 mins 23 secs)

Average Score 210.41 (for 147 correct submissions)

The concept behind solving this problem was simple: loop from 3 to N, calculate the fraction of winning games, keep track of the highest and lowest, and break ties by preferring a fraction with a larger denominator over an equal fraction with a smaller denominator. It was this tiebreaking policy that gave people trouble, with only a 57% success rate in Division II and a 79% success rate in Division I.

The main reason for failures was the use of floating-point arithmetic. It is important to understand that a computer's representation of floating-point numbers is not exact, and arithmetic on floating-point numbers is only approximate. For example, if you write the line of code "a = 1.0 / 3.0", the variable "a" will not hold exactly one third, but rather the closest representable floating-point number to that value.

One should be very careful about writing code that tests for equality of floating-point values, such as:

     double a = 2.0 / 7.0;
     double b = 14.0 / 49.0;
     if (a == b) {
         ...
     }

Now it turns out that the code above will behave as expected, and the problem that caught many coders off guard was more subtle than that. It turns out than on some modern processors, floating-point arithmetic is done in 80 bits of precision as long as the values stay in registers. But once they are spilled to memory, they are reduced to 64 bits. So, depending on the code the compiler creates for your method, you may end up comparing a 64-bit and 80-bit version of the same number, which is likely to fail, producing unexpected results. Some coders reported that simply adding a printf() statement changed how their solution behaved, most likely because this affected when registers were being spilled to memory, forcing a 80-bit to 64-bit conversion. See the round tables for a continued discussion of floating-point precision on modern processors.

Java appears to be immune to such issues, as the vast majority of the failing solutions were written in C++.

By now you may be wondering if this problem was fair. Why should we have to understand obscure compiler-dependent precision issues to successfully solve this problem? The answer is: you don't. Good programmers should develop a healthy fear of floating-point arithmetic, or at least an understanding of the pitfalls associated with it. Your first reaction to a problem that appears to require floating-point arithmetic should be "Can I solve this any other way?" In this case, you can.

Rather than storing floating-point versions of the best and worst willing records found so far, you can store the integer numerator and denominator of each of the two fractions. When comparing two fractions, rather than dividing like this:

     if (double(a)/b >= double(c)/d) {
         ...
     }

you can do an equivalent comparison by cross multiplying, like this:

     if (a*d >= c*b) {
         ...
     }

The significant difference here is, of course, the absence of floating-point arithmetic. For a good example of a complete solution using this method, see ploh's solution (Division I, 250-point problem).

The lesson to be learned here is to avoid precision problems by using integer arithmetic whenever possible. A well-written problem will allow for some slight error if you are expected to use floating-point arithmetic in your solution. If you don't see a statement to that effect, that is a good sign that there probably exists an integer-only solution. This isn't the first TopCoder problem in which the unnecessary use of floating-point arithmetic can get you in to trouble, and I guarantee it won't be the last.

LargestCircle

Used as: Division Two - Level Three:

Value 1000

Submission Rate 52 / 265 (19.62%)

Success Rate 16 / 52 (30.77%)

High Score zahni for 786.42 points (15 mins 43 secs)

Average Score 515.57 (for 16 correct submissions)

This problem boils down to finding which grid squares a circle passes through. We just need to find the circle with the largest radius that does not pass through any of the marked squares, and return the radius of that circle. The simplest solution uses 5 nested loops:

   for each radius
     for each center x coordinate
       for each center y coordinate
         for each x in the grid
           for each y in the grid
             test if circle passes through this square

The largest possible radius is the minimum of half the width and half the height of the grid (rounding down). If we start the loop at the largest possible radius and work down to 1, then the program can just return the radius as soon as it finds any circle that fits. The range of x coordinates is [radius, width-radius], and similarly the range of y coordinates is [radius, height-radius]. The 5 loops above become:

   for radius = MIN(width/2, height/2) to 1
     for cx = radius to (width-radius)
       for cy = radius to (height-radius)
         for x = 0 to (width-1)
           for y = 0 to (width-1)
             test if circle passes through this square

Now, one easy way to test if the circle passes through a given square is to compute the distance from the center of the circle to each of the 4 corners. If all 4 distances are less than or equal to the radius, then the square is inside the circle. If all 4 distances are greater than or equal to the radius, then the square is outside of the circle. Otherwise, the circle passes through the square.

If you wish to follow the advice I given for the problem WinningRecord, and avoid floating point, you can do so. Rather than compute the distances to the four corners of the square, compute the distance squared (which is an integer), and compare that to the square of the radius.

For an implementation of this algorithm see zahni's solution, or the writer's solution in the practice room.

SumOfSquareRoots

Used as: Division One - Level Two:

Value 500

Submission Rate 161 / 192 (83.85%)

Success Rate 147 / 161 (91.30%)

High Score antimatter for 488.76 points (4 mins 19 secs)

Average Score 363.14 (for 147 correct submissions)

A sum of square roots can be simplified by factoring out all perfect squares, collecting terms, and then squaring the integer coefficients to put them back underneath the square root sign. I expected this problem to give people more trouble than it did, and for coders to get hung up on trying to prove that this really does result in the shortest list of integers that yeild the same result when their square roots are summed. Either people generalized the example given in the problem statement, guessing (correctly) that it would always give the correct result, or TopCoder members know much more number theory than I anticipated.

jms137 coded a particularly concise solution to this problem.

Arcs

Used as: Division One - Level Three:

Value 1000

Submission Rate 22 / 192 (11.46%)

Success Rate 14 / 22 (63.64%)

High Score jshute for 542.47 points (32 mins 35 secs)

Average Score 461.66 (for 14 correct submissions)

This problem can be solved with a simple breadth-first search. The nodes are coordinate pairs on the grid, and the edges are arcs between nodes that do not intersect blocked squares. All edges have a weight of 1. The interesting part of this problem is determining the legal edges.

There can only be an arc between two points if they are connected by a 45-degree line. And if so, there are two potential arcs between those two points. If either of those arcs passes through only unblocked squares, then there exists an edge between the points.

One way to determine which squares are intersected by a given arc is to simply test each square in the bounding box of the arc. This part is the same as the core of LargestCircle, the Division II 1000-point problem. For each square, compute the distance from the center of the circular arc to the 4 corners of the square. If all four distances are less than or equal to the radius if the arc, or all four distances are greater than or equal to the radius of the arc, then the arc does not intersect that square. Otherwise, the arc does intersect the square, and if that square is blocked then this arc is not allowed.

jms137's solution is particularly easy to read. Method add() determines the two possible arcs between two points, method block() checks if a single arc is legal, and method ok() checks the intersection of an arc with a single square in the grid.

By legakis
TopCoder Member