|
Match Editorial |
2002 TopCoder Invitational
Online Round #4Wednesday, October 30, 2002
Summary
And there was blood in the water.
Round 4 of the 2002 Topcoder Invitational, starting with the best and
luckiest 64, and narrowing down all the way to the top 16. A relatively easy
Level One problem, a Level Two problem with a few nasty bits, and a truly
devastating Level Three problem later, the results are in and complete.
SnapDragon blitzed the round with 1279.29 points and all three problems.
John Dethridge pulled off a similar feat, completing all three problems for
1024.74 points. ZorbaTHut got 837.82 points, from a Level One, Level Two,
and a pair of challenges.
The lowest seed remaining in the top 16 is, FunOrPain, who started at
position 228. FunOrPain set the cutoff at 659.13 points, the lowest score to
get to the onsite. tjq was the second lowest seed and the second lowest
score, at 666.15 points - less than a quarter of a point behind reid, the
second *highest* seed. The other semifinalists are, in order of score,
ambrose, dgarthur, dmwright, radeye, malpt,
obfuscator, lars, WishingBone,
NDBronson, and DjinnKahn. Some of us have been there before, but quite a few
haven't - it's going to be nice to see some new faces.
And on the unfortunate side of things, the highest seed to drop out was
bigg_nate, seed 7, a particularly bad round leaving him with only 322.30
points. Scores this round ranged all the way down to -100 (surprisingly one
of the higher seeds, and I'm not saying who it was), with seven people
getting zero points or below. And on the front of "it can't have been an
accident", one competitor finished with 7.81 points, and no, that's not a
typo.
Since I know you're probably sick of my rambling and just want to see how to
solve the problems, here they are.
The Problems
GameOfLife
Used as: Level 1
Div-I
|
Value
|
300 points |
Submission Rate
|
63 / 64 (98.43%)
|
Success Rate
|
53 / 63 (84.13%)
|
High Score
|
uglyfool for 282.61 points
|
Average Points
|
246.46
|
Nothing more complex than brute-force here. At most we've got a 50x50 board,
and at most we've got 1000 iterations. Yes, that's 2.5 million, which is a
bit painful, but I don't even want to think about trying to do it any other
way.
The easiest method was to make a big 3d array, [ 1001 ][ 50 ][ 50 ]. Fill in
[ 0 ][ x ][ y ] using the input string and then just iterate over it
repeatedly to fill in each next generation. You'll want to make a get()
function that handles negative values and out-of-bound values by just
wrapping around the "edges". From there it's pretty simple - count the
number of living cells, make a switch() based off the provided rule, and
update the new array. When you get to the end, count the living cells and
call it a day.
If you worried about memory consumption, or wanted to take advantage of
caching a little better (both valid concerns) you could also use only two
arrays, one to store the next generation and one to store the last
generation. This is the approach I took.
The mistakes I'm noticing basically come down to serious conceptual flaws,
speed issues, and simple bugs. Most of the conceptual flaws I've seen would
have been found by doing the example cases, and I think I've beaten that
dead horse quite enough. Speed issues appear to be being caused by people
leaving debug information in inner loops or passing large structures by
value in C++ (use const reference, or make them global! And always test the
worst computational case if there's any chance of it crashing and burning.)
As for bugs - some people's code failing on 0 generations, for example - all
I can say is, look for the boundary cases.
Socialize
Used as: Level 2
Div-I
|
Value
|
550 points |
Submission Rate
|
50 / 64 (78.13%)
|
Success Rate
|
27 / 50 (54.00%)
|
High Score
|
ante for 532.98 points
|
Average Points
|
391.54
|
This was actually very similar to the Doorknobs problem from Round 2, with a
lot of the same efficiency considerations, just far more of them. I wouldn't
be surprised in the least if Doorknobs inspired this problem.
The best way, by far, to do this was to scan the entire array for people.
For each person, do a floodfill (in a different array) of all the distances
from them. Then scan through it again - yes, inside your big loop - and for
each person, add to a distance accumulator and a distance counter. The worst
case efficiency-wise is a 50x50 matrix of people, and it comes out to be an
O( n^4 ) algorithm (n^2 for each person, and another n^2 because for each
person you're checking *each person*), which comes out to 50^4 or 6.25
million. Painful. Very painful. But if you're efficient, doable. Note that
we're counting each pair of people *twice* (starting once from each side),
but that's okay, because we're counting *every* pair of people twice, so the
average is unaffected.
But you have to be careful.
For one thing, finding the shortest-path between each two people
independently is going to crash and burn. I don't know if there's a
super-efficient algorithm possible on a grid like this - A* might be
tolerable - but with the naive n^2 floodfill algorithm that most people did,
that comes out to an O( n^6 ) algorithm, which certainly doesn't work with
n=50. (Approximately 15.6 billion, if you're curious, and an empty for loop
gets under a billion iterations before timeout in both C++ and Java.)
Luckily your floodfill algorithm will give you *all* the shortest distances,
if you just extract the information before getting rid of it.
Another thing that tripped up a lot of people is using depth-first
floodfill. Someone mentioned that depth-first in this case becomes an
exponential-time algorithm, which I can believe, and which will fail
dramatically in this case. In the case of a 50x50 grid, depth-first will
zigzag all the way down the grid before backing up to the most recent zig
and redoing every single location beneath it . . . once, for every single
location *above* it. Painfully slow. The only real way to do it was with
some sort of breadth-first algorithm. The implementation I used had two
arrays, one for "this pass", one for "next pass" - I saw someone else who
had a list of queues, and I expect this would work fine also.
Once we got past that, it was just speed.
Symmetry
Used as: Level 3
Div-I
|
Value
|
1000 points |
Submission Rate
|
21 / 64 (32.81%)
|
Success Rate
|
3 / 21 (14.29%)
|
High Score
|
dmwright for 501.48 points
|
Average Points
|
419.13
|
If you want a way to give many coders nightmares, computational geometry is
a pretty good one. Some people handle it easily - unfortunately I am not one
of them, and I bombed this problem pretty horribly. My algorithm was sound,
but my implementation was enormously flawed, so all I'm going to be able to
do here is give you an idea of how it *should* be done and let you pore over
the few surviving solutions for clues.
The weird bound of 200 points also serves to nicely eliminate an O( n^4 )
solution - in fact, this was its design. Luckily it's doable in O( n^3*log
n ) time without much trouble, probably much faster if you really know what
you're doing. SnapDragon's solution appears to be O( n^2*log n ), so there
ya go.
The basic algorithm I used was to take each pair of points seperately,
figure out the line of reflection centered between the two, then check to
make sure each point has a match on the other side. If they all do, add it
to a set (the C++ STL set<> structure guarantees uniqueness - you
could do something similar in other languages) in some canonical form.
Return the size of the set. A glitch that I didn't foresee that would have
killed me even if I *had* fixed all the other bugs is the situation that
happens when all the points are colinear, but that's easily fixed by adding
an identical check for a line of reflect passing through the two points.
Implementation is tricky.
What I'd done is try to first translate the entire field so that the line of
reflection passed through the origin (find the average of the two points,
subtract it from each point in series). Then rotate the field so the line of
reflect was vertical. Invert the X coordinate, un-rotate, un-translate, and
check for a match - easy to do if you have a C++ set<>, and I imagine
not hard in any other language. Java's HashTable will do the job nicely, for
example. I was trying to use matrices to rotate, and was sticking to
integers as much as possible - doing both rotations using the rise and the
run (and a distinctly non-unit vector) then doing it again and scaling it
down by the appropriate factor, checking first to make sure it was
integral - if it wasn't, it's all finished anyway, since the points are all
at integral locations.
It all seems pretty straightforward, but I still don't know where my bugs
were, so I'm not going to be much help at this point :P
The theory behind my implementation was that for any line of symmetry, a
point on the graph *must* map to a different point (except for that whole
colinear thing, but then we have a line that must go through two points). By
checking each combination of two points, you might get a few duplicates, but
you *would* get every line of reflection, and the canonical form would get
rid of any duplicates.
SnapDragon seems to have set something very similar up with vertices 0 and
n, for all n >= 1, which makes sense, because vertex 0 is going to have
to map to something also. He then does the same thing with 1 and n for all n
>= 2, first checking to see if points 0, 1, and n are all co-linear, then
doing the normal check. This makes some sense to me, but not enough so I can
explain it.
I think I'll just leave it at this.
Good luck all. I'll be seeing some of you in a month, and probably most of
you next week.
By
ZorbaTHut
TopCoder Member