Match Editorial
SRM 185
Monday, March 1, 2004
Match summary
Most Division 1 coders breezed through the Easy and Medium problems,
only to fall into a bottomless pit of a Hard. Only SnapDragon
and John Dethridge successfully completed all three problems,
with both of their submissions on the Hard coming in the last
few minutes.
In Division 2, there was a minor bloodbath on the Easy problem when
many coders rounded down instead of up. marek.cygan eventually
prevailed by less than a challenge over runner-up king. First-timer
villiros would have come in third if not for a mistaken challenge.
The Problems
PassingGrade
Used as: Division Two - Level One:
Value: 250
Submission Rate: 233 / 262 (88.93%)
Success Rate: 135 / 233 (57.94%)
High Score: generator for 245.20 points (3 mins 59 secs)
Average Score: 193.52 (for 135 correct submissions)
The fact that you know the scores on individual assignments is a
distraction. All you really need is three numbers: the total points
possible in the course (totalPoss), the total points you've
earned so far (earned), and the maximum points you can score
on the final exam (final). You can calculate
totalPoss by summing the points possible on each assignment,
plus the points possible on the final exam. Similarly, you can calculate
earned by summing the points earned on each assignment.
You can then loop over all integers from 0 to final, inclusive,
and return the smallest one that gives you an overall percentage of 65% or
better. If no such number exists, return -1.
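For concreteness, here is a sketch of that loop in Java. The variable
names follow the write-up, except that finalExam stands in for
final, which is a reserved word in Java; the 65% test is done in
integer arithmetic to avoid rounding trouble:
static int bestFinalScore(int totalPoss, int earned, int finalExam) {
    // smallest final-exam score that brings the overall grade to 65% or better
    for (int pts = 0; pts <= finalExam; pts++)
        if (100 * (earned + pts) >= 65 * totalPoss)  // integer-safe 65% test
            return pts;
    return -1;  // passing is impossible even with a perfect final
}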
Alternatively, you can calculate the points you need directly as
pointsNeeded = (65*totalPoss+99)/100 - earned
The "+99" ensures that you round up. Then you just have to check
that
pointsNeeded is between 0 and
final. If it's
less than 0, return 0. If it's greater than
final, return -1.
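The same check, sketched in Java with the direct formula:
static int bestFinalScoreDirect(int totalPoss, int earned, int finalExam) {
    int needed = (65 * totalPoss + 99) / 100 - earned;  // 65% mark, rounded up
    if (needed < 0) return 0;           // already passing without the final
    if (needed > finalExam) return -1;  // can't pass even with a perfect final
    return needed;
}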
The most common problem involved rounding pointsNeeded incorrectly.
If you accidentally rounded down then you would occasionally return an
answer that would leave you just shy of the 65% mark.
Another easy mistake to make was to write
for (int pts = 0; pts < finalExam; pts++)
...
instead of
for (int pts = 0; pts <= finalExam; pts++)
...
Then you would fail to detect cases where you could only pass with a perfect
score on the final.
TandemRepeats
Used as: Division Two - Level Two:
Value: 500
Submission Rate: 149 / 262 (56.87%)
Success Rate: 101 / 149 (67.79%)
High Score: villiros for 463.23 points (8 mins 8 secs)
Average Score: 326.58 (for 101 correct submissions)
Used as: Division One - Level One:
Value: 250
Submission Rate: 233 / 238 (97.90%)
Success Rate: 204 / 233 (87.55%)
High Score: ZorbaTHut for 247.04 points (3 mins 7 secs)
Average Score: 213.01 (for 204 correct submissions)
Brute force is the way to go here. Try every neighboring pair of sequences
and count how many characters are different. Keep track of the length of
the longest base sequence for which the number of errors is small enough.
The biggest variations in this problem are in the loops used to generate
all neighboring pairs of sequences. In the following code snippets, let N
be the length of the overall sequence. The first variation is to generate
all possible start points for the two sequences, making sure that there is
enough room for the second sequence:
for (int i = 0; i < N; i++)
for (int j = i+1; j+(j-i)-1 < N; j++)
...
Notice that j-i is the length of the first sequence, so j+(j-i)-1 is the
index of the last position in the second sequence.
As a side note, many people would write the bound on the outer loop as
N-1 instead of N. I prefer to write it as N because it is quicker to
type and requires less thought. It'll have the same effect in the end
because when i is N-1, the inner loop will exit immediately.
The second major approach to writing these loops is to loop over the
possible sequence lengths and the possible start points of the first
sequence, as in
for (int len = 1; len <= N/2; len++)
for (int i = 0; i+2*len-1 < N; i++)
...
where i+2*len-1 is again the index of the last position in the second
sequence (which starts at index i+len).
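Putting the pieces together, a brute-force solution might look like the
following Java sketch. The method name and the maxErrors parameter are
placeholders for whatever the actual signature was:
static int longestTandem(String s, int maxErrors) {
    int N = s.length(), best = 0;
    for (int len = 1; len <= N / 2; len++)           // length of the base sequence
        for (int i = 0; i + 2 * len - 1 < N; i++) {  // start of the first sequence
            int errors = 0;
            for (int p = 0; p < len; p++)            // compare the two halves
                if (s.charAt(i + p) != s.charAt(i + len + p)) errors++;
            if (errors <= maxErrors) best = Math.max(best, len);
        }
    return best;
}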
Hilbert
Used as: Division Two - Level Three:
Value: 900
Submission Rate: 20 / 262 (7.63%)
Success Rate: 11 / 20 (55.00%)
High Score: czh02 for 503.12 points (30 mins 57 secs)
Average Score: 411.70 (for 11 correct submissions)
David Hilbert invented the Hilbert curve in 1891. It is now
the most famous of the so-called space-filling curves, although
it was not the first (that honor goes to Peano).
A quick web search might have netted you several dozen implementations of
Hilbert curves, but they wouldn't have helped much because the constraints
allowed grids containing about a billion points. You didn't have the luxury
of generating the entire curve, so you needed to come up with a way
to skip over large chunks of it at a time.
The key insight is to realize that identifying the quadrant of the
target point allows us to account for all the previous quadrants
without actually stepping through them. For example,
if we know that the target point is in the third quadrant, then we merely
include 4^(k-1) steps for each of the previous two quadrants.
All that remains is to determine the number of steps within the current
quadrant, which can be calculated recursively.
The tricky part is accounting for the different orientations of
the four quadrants. When making a recursive call, we need to
modify the coordinates of the target point relative to the starting
point and orientation of that quadrant. For the first quadrant,
this boils down to swapping the x and y coordinates. For the second
and third quadrants, you simply subtract an offset from one or both
coordinates. For the fourth quadrant...well, yes, then there's the
fourth quadrant. I could try to explain the formula for the fourth
quadrant, but an explanation wouldn't help. The only way to understand
the formula for that case is to draw a couple pictures and play with
them for a while. (You do keep a pad of paper next to you during
SRMs for drawing pictures, don't you?)
Altogether, the code is surprisingly short:
define steps(k,x,y) is
    if k == 0 then return 0
    dim = 2^(k-1)    // width of a quadrant
    quad = dim*dim   // steps in a quadrant
    if (x,y) is in upper left quadrant then
        return steps(k-1,y,x)
    if (x,y) is in lower left quadrant then
        return quad + steps(k-1,x,y-dim)
    if (x,y) is in lower right quadrant then
        return 2*quad + steps(k-1,x-dim,y-dim)
    if (x,y) is in upper right quadrant then
        return 3*quad + steps(k-1,dim-y+1,2*dim-x+1)
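Translated into Java, and using long arithmetic because the largest grids
contain about a billion points, the same recursion looks like this. This
is a sketch that assumes 1-based coordinates with y growing downward; the
problem's exact conventions may differ:
static long steps(int k, long x, long y) {
    if (k == 0) return 0;
    long dim = 1L << (k - 1);  // width of a quadrant
    long quad = dim * dim;     // steps in a quadrant
    if (x <= dim && y <= dim)  // upper left: recurse with x and y swapped
        return steps(k - 1, y, x);
    if (x <= dim)              // lower left
        return quad + steps(k - 1, x, y - dim);
    if (y > dim)               // lower right
        return 2 * quad + steps(k - 1, x - dim, y - dim);
    // upper right: the rotated orientation discussed above
    return 3 * quad + steps(k - 1, dim - y + 1, 2 * dim - x + 1);
}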
EulerianRace
Used as: Division One - Level Two:
Value: 550
Submission Rate: 171 / 238 (71.85%)
Success Rate: 150 / 171 (87.72%)
High Score: SnapDragon for 508.20 points (8 mins 17 secs)
Average Score: 306.98 (for 150 correct submissions)
Leonhard Euler invented graph theory in about 1736 to solve a
puzzle about the bridges of Konigsberg (which also
happens to be the birthplace of David Hilbert, about a century
later). Euler showed that you could not stroll through the town along
a route that crossed each of the town's seven bridges once and only
once, but that you could do so if each land mass touching the
bridges was connected to a positive, even number of bridges, and
furthermore that such a configuration allows you to return to your
starting point. This problem is essentially a constructive proof of
this theorem.
Both the proof and the algorithm hinge on the ability to construct
cycles. Suppose all checkpoints are connected to an even
number of unused bridges. If you start at a checkpoint K and leave across
an unused bridge, then there is always a way to get back to K along some other
unused bridge. Why? Because it is impossible to get "stuck" someplace other
than K. To get "stuck" you would have to enter a checkpoint but not be
able to leave it. But because each checkpoint is connected to an even
number of unused bridges, you know that if there was an unused bridge that
you could use to enter the checkpoint, then there is another unused bridge that
you can use to leave the checkpoint. Similarly, because there was an unused
bridge that you could use to leave checkpoint K, there will be another unused
bridge that you can use to re-enter checkpoint K.
The algorithm keeps building such cycles and splicing them together
until you run out of bridges. The problem statement specified a certain
order in which to build the cycles and splice them together, so as long
as you followed those directions, you were probably okay.
See TAG's submission for a particularly clear implementation of this algorithm.
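For reference, here is a generic Java sketch of the cycle-splicing idea
(an iterative form of Hierholzer's algorithm). It assumes checkpoints
numbered 0 to n-1 and bridges given as endpoint pairs, starts from
checkpoint 0, and ignores the specific build-and-splice order the problem
statement prescribed, so it illustrates the technique rather than the
exact required solution:
import java.util.*;

static List<Integer> eulerCircuit(int n, int[][] bridges) {
    List<int[]>[] adj = new List[n];  // adj[v] holds {neighbor, bridgeId} pairs
    for (int v = 0; v < n; v++) adj[v] = new ArrayList<>();
    for (int id = 0; id < bridges.length; id++) {
        adj[bridges[id][0]].add(new int[]{bridges[id][1], id});
        adj[bridges[id][1]].add(new int[]{bridges[id][0], id});
    }
    boolean[] used = new boolean[bridges.length];
    Deque<Integer> stack = new ArrayDeque<>();
    List<Integer> tour = new ArrayList<>();
    stack.push(0);
    while (!stack.isEmpty()) {
        int v = stack.peek();
        while (!adj[v].isEmpty() && used[adj[v].get(adj[v].size() - 1)[1]])
            adj[v].remove(adj[v].size() - 1);  // discard bridges already crossed
        if (adj[v].isEmpty()) {
            tour.add(stack.pop());  // v is finished; splice it into the tour
        } else {
            int[] e = adj[v].remove(adj[v].size() - 1);
            used[e[1]] = true;      // cross this bridge exactly once
            stack.push(e[0]);
        }
    }
    return tour;  // the circuit, in reverse order of traversal
}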
CounterfeitCoin
Used as: Division One - Level Three:
Value: 1100
Submission Rate: 5 / 238 (2.10%)
Success Rate: 2 / 5 (40.00%)
High Score: John Dethridge for 457.21 points (53 mins 19 secs)
Average Score: 455.88 (for 2 correct submissions)
This problem was brutal, with the only two successful submissions
taking nearly an hour each to code. Lots of people found it relatively
easy to get started, but then you type...and type...and type...and type.
And even then you still have another three corner cases to deal with!
SnapDragon nominated it as the hardest SRM problem ever, but
the sentimental favorite in the ensuing discussion was Cassette.
To get anywhere on this problem, you had to realize that,
with a maximum of 26 coins, it is utterly hopeless to try all possible
weighings, even using dynamic programming or memoization. The trick is
to classify coins into four categories:
- Category A: those about which nothing is known
- Category B: those that might be heavy, but are known not to be light
- Category C: those that might be light, but are known not to be heavy
- Category D: those that are known to be good
We can always swap coins in the same category without affecting the
maximum number of weighings that might be required, so for most of the problem
we can work with the numbers of coins in each category, without worrying
about the identities of those coins. Let #A, #B, #C, and #D
represent the number of coins in each category.
The algorithm breaks into three stages:
- Calculate #A, #B, #C, and #D from the results of the previous weighings.
- From #A, #B, #C, and #D, determine the optimal number of weighings needed
to find the counterfeit coin and whether it is heavy or light.
- Choose the best coins to use in the next weighing from among the
possible arrangements that achieve the optimal number of weighings.
In stage 1, we keep track of the category for each individual coin.
Initially, all coins are in category A. For each "even" weighing, we
move all the coins involved in that weighing to category D. For each
"left heavy" or "right heavy" weighing, we move all the coins
not involved in the weighing to category D, along with all
category C coins from the heavy side and all category B coins from the
light side. In addition, we move all category A coins on the heavy
or light side to category B or C, respectively.
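In Java, the update for a single weighing might look like this. It's a
sketch with assumed conventions: cat[i] holds the category 'A' through
'D' of coin i, left and right list the coin indices on each pan, and
result is negative, zero, or positive for left-heavy, even, or
right-heavy:
static void applyWeighing(char[] cat, int[] left, int[] right, int result) {
    boolean[] onScale = new boolean[cat.length];
    for (int i : left) onScale[i] = true;
    for (int i : right) onScale[i] = true;
    if (result == 0) {  // even: every coin on the scale is good
        for (int i = 0; i < cat.length; i++) if (onScale[i]) cat[i] = 'D';
        return;
    }
    int[] heavy = result < 0 ? left : right;
    int[] light = result < 0 ? right : left;
    for (int i = 0; i < cat.length; i++)
        if (!onScale[i]) cat[i] = 'D';  // the counterfeit must be on the scale
    for (int i : heavy) {
        if (cat[i] == 'C') cat[i] = 'D';       // can't be light on the heavy side
        else if (cat[i] == 'A') cat[i] = 'B';  // might be heavy
    }
    for (int i : light) {
        if (cat[i] == 'B') cat[i] = 'D';       // can't be heavy on the light side
        else if (cat[i] == 'A') cat[i] = 'C';  // might be light
    }
}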
From #A, #B, #C, and #D, we can check for two special
cases. If #A+#B+#C is 0, then there has been an error in the
previous weighings. If #A is 0 and #B+#C is 1, then we have
already found the counterfeit coin (and whether it is heavy or light).
In stage 2, we build a table W[a][b][c] of the numbers of weighings
required for the various combinations of #A, #B, and #C. (Note that we
don't need W[a][b][c][d], because #D is fully determined by the other three values
as #D = N-#A-#B-#C.) To build this table,
we can use dynamic programming or memoization, but I found memoization to
be significantly easier to code for this problem. Memoization has the
further advantage that it ignores those configurations that are impossible
to achieve. For example, if N = 26, it is impossible to come up with
a set of weighings that would leave you with #A = 25.
For each configuration of #A, #B, and #C (which together imply #D),
we try all possible ways of arranging the coins on the scale.
There are various tricks you could
use to prune the number of arrangements to consider in this step, but
the only one I bothered with was to insist that category D coins could
only be used on the right-hand side of the scale. I used six (!) nested
loops to generate the different possibilities for the numbers of
category A/B/C coins on the left/right side, adding category D coins
to the right-hand side at the end to even out the numbers of coins
on each side. (Incidentally, this beats by one my previous personal record for deeply
nested loops in a TopCoder problem—and on that problem I had
four people unsuccessfully challenge me because they couldn't believe five
nested loops would run fast enough.)
For each arrangement, you figure out what the new values of #A, #B, and #C
would be for each possible result (even, left-heavy, and right-heavy).
Then you recursively check the optimal number of weighings after each of those
three results, and keep the maximum, plus one for the current weighing.
That is the best you can do for this arrangement of coins.
As we loop through all possible arrangements, we keep the minimum of
all these maximums.
There are two corner cases to worry about here. The first is that not
all results are possible on all weighings. For example, if we have
determined which coin is counterfeit but still need to find whether it
is heavy or light, then we weigh it against one of the good coins. It
is impossible for the result of this weighing to be "even". One
way to handle such cases is to initialize W[0][0][0] to be some negative
value, so that it will never be chosen as the maximum.
The other corner case to worry about is weighings that give you
no new information. For example, suppose #A is 0. Then, if you
put all the category B coins on one side and all the category C
coins on the other side, you know the category B side will be
heavier. If you're not careful, you can easily end up with an
infinite loop here. The simplest solution is to test for such cases
and skip over them.
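A rough Java sketch of the stage 2 recursion follows; the names are
hypothetical. Instead of seeding W[0][0][0] with a negative value, this
version simply skips impossible outcomes, and it handles the
no-new-information case by rejecting any weighing with an outcome
identical to the current state:
static int n;             // total number of coins
static int[][][] memo;    // create as new int[n+1][n+1][n+1], filled with -1
static final int IMPOSSIBLE = 1000;  // sentinel: no strategy exists

static int solve(int a, int b, int c) {
    if (a == 0 && b + c <= 1) return 0;  // coin and direction pinned down
    if (memo[a][b][c] >= 0) return memo[a][b][c];
    int best = IMPOSSIBLE;
    int d = n - a - b - c;  // category D coins, used only as right-pan filler
    // six nested loops: counts of A/B/C coins on the left and right pans
    for (int aL = 0; aL <= a; aL++) for (int aR = 0; aL + aR <= a; aR++)
    for (int bL = 0; bL <= b; bL++) for (int bR = 0; bL + bR <= b; bR++)
    for (int cL = 0; cL <= c; cL++) for (int cR = 0; cL + cR <= c; cR++) {
        int left = aL + bL + cL, dR = left - (aR + bR + cR);
        if (left == 0 || dR < 0 || dR > d) continue;  // pans must balance
        int[][] outcome = {
            {a - aL - aR, b - bL - bR, c - cL - cR},  // even
            {0, aL + bL, aR + cR},                    // left side heavy
            {0, aR + bR, aL + cL},                    // right side heavy
        };
        int worst = 0;
        boolean useless = false;
        for (int[] o : outcome) {
            if (o[0] + o[1] + o[2] == 0) continue;    // this outcome can't occur
            if (o[0] == a && o[1] == b && o[2] == c) { useless = true; break; }
            worst = Math.max(worst, 1 + solve(o[0], o[1], o[2]));
        }
        if (!useless) best = Math.min(best, worst);
    }
    return memo[a][b][c] = best;
}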
Finally, in stage 3, we decide which coins to use in the next weighing.
By now, we know the optimum number of weighings.
We can generate all possible ways of arranging
the various categories of coins using the same six nested loops from Stage 2,
and look for those arrangements that achieve the optimum.
For each such arrangement, we find the best (i.e., shortest and
lexicographically earliest) way to assign the actual coin labels to this
arrangement. For example, we can loop through the coin labels
in order, and assign each coin to the left-hand side (if that side
still needs a coin of the appropriate category) or to the right-hand side (if
that side still needs a coin of the appropriate category). Remember that
we know the category of each coin from stage 1. There was a slight
gotcha here: if you insisted in stage 2 that category D coins
only go on the right-hand side, you still have to consider in this stage
the possibility that the category D coins should go on the left instead,
because that might produce a better string.
Fortunately, the preference
for short strings means we don't have to pad the answers with category
D coins on both sides.
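A sketch of that greedy assignment in Java, heavily hypothetical: it
assumes coins are labeled 'A' onward, it handles only a fixed arrangement
(the real solution must also try category D coins on the left, as just
noted), and the output format is invented:
static String assignLabels(char[] cat, int[] needLeft, int[] needRight) {
    // needLeft[c] and needRight[c] say how many coins of category c
    // ('A'=0 .. 'D'=3) the chosen arrangement puts on each pan
    StringBuilder left = new StringBuilder(), right = new StringBuilder();
    for (int i = 0; i < cat.length; i++) {  // coin labels in increasing order
        int c = cat[i] - 'A';
        if (needLeft[c] > 0)       { needLeft[c]--;  left.append((char) ('A' + i)); }
        else if (needRight[c] > 0) { needRight[c]--; right.append((char) ('A' + i)); }
    }
    return left + " " + right;  // hypothetical output format
}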
By
vorthys
TopCoder Member