|
Match Editorial |
SRM 348Wednesday, May 9, 2007
Match summary
The turnout for this match -- 839 competitors -- was the lowest since January, however
that couldn't stop it from being very exciting, as usual. Division 1 was
won by Petr, with
a 200+ point gap over second place marek.cygan
and third place tomek.
Relatively standard easy and medium problems led to many coders being able
to approach the hard, and to very tight standings at the top after the coding
phase. However, as most of the hards failed during either the challenge or
the system test, only 12 coders got the hard right, and only 8 of them
got all three problems. The challenge phase was quite eventful as some of the
competitors tried to improve their ranking, but the most notable change was
tomek losing his (ultimately)
second place due to an unsuccessful challenge.
Division 2 was won by newcomer
FHead,
jumping well into division 1 for the next match. He was followed
by karthik123
and macs.
The Problems
OptimalList Used as: Division Two - Level One:
Value
|
250
|
Submission Rate
|
401 / 434 (92.40%)
|
Success Rate
|
361 / 401 (90.02%)
|
High Score
|
Samlik for 247.72 points (2 mins 44 secs)
|
Average Score
|
200.23 (for 361 correct submissions)
|
This problem is strongly tied with the notion of the
Cartesian coordinate system.
That is, no matter how complicated the sequence of instructions is, the location of the
grandma's house is uniquely determined by two integer numbers, called coordinates — how many
blocks should Billy walk horizontally (east or west) and how many blocks should he
walk vertically (north or south). Let's call the first number dx, and the
second number dy.
We'll make dx positive if Billy needs to
walk east and negative if he needs to walk west (i.e., dx=-3 means
he needs to go 3 blocks west); positive dy will correspond to walking north.
To calculate dx then we need just to subtract the number of 'W's in the
input string from the number of 'E's in it, dy will be the number of 'N's
minus the number of 'S's.
Once we've determined the coordinates of grandma's house, it becomes easier to
find the shortest path: we know the moves we should do horizontally (if dx
is positive, we need to walk only east, and not west, or else the path would not
be shortest; a similar argument applies in other cases) and vertically, so we're
practically there. The only thing left to do is to find the alphabetically first
sequence of moves — but that's straightforward: 'E's go before 'N's and 'S's,
'W's go after 'N's and 'S's.
See dplass' solution for
an example of implementing this approach.
This approach can also be rephrased as the following greedy one: while
the current instructions string contains both 'W's and 'E's, remove one of each,
do the same for 'N's and 'S's. Sort the result and return as an answer. For an
implementation, see m4risU's code.
LostParentheses
Used as: Division Two - Level Two:
Value
|
500
|
Submission Rate
|
270 / 434 (62.21%)
|
Success Rate
|
188 / 270 (69.63%)
|
High Score
|
mi6091 for 486.14 points (4 mins 49 secs)
|
Average Score
|
310.43 (for 188 correct submissions)
|
Used as: Division One - Level One:
Value
|
250
|
Submission Rate
|
389 / 401 (97.01%)
|
Success Rate
|
344 / 389 (88.43%)
|
High Score
|
gevak for 248.75 points (2 mins 1 secs)
|
Average Score
|
197.36 (for 344 correct submissions)
|
This problem had several possible correct approaches. Arguably the easiest of
those was the following: the numbers before the first minus sign should
be added to the result, and the numbers after should be subtracted, as
coded
by Quelloquialism. To understand
why this approach works, we need to prove two facts: that this result is
achievable, and that it is minimal possible. We can achieve that number by
putting brackets between consecutive minuses, as follows: 3+5+7-(2+3+10)-(99+57)-(22)-(2)-(3+4)
— that way all the numbers in the brackets are subtracted. We can't reduce
the result even further, as the numbers before the first minus get added to
the final result no matter how we put the brackets.
If you've solved a lot of similar 'add-brackets' problems before, you
could easily skip this easy solution and go for a dynamic programming one.
That is, let's find the minimal possible and maximal possible values for
every substring of the input string that starts and ends with a number.
To find the minimum for a particular substring, we iterate over all
possibilities of which operator will be evaluated last. If that operator
is a '+', we need to take the minimum possible values of the strings
to the left of it and to the right of it, and add them. If it's a '-', we
need to take minimum on the left and maximum on the right. The maximum is computed
similarly. We can evaluate the minima and maxima in the order
of increasing substring length to be able to use previously computed
values, or we can implement the memoizing approach, like
marek.cygan
did.
For more information on dynamic programming, see the tutorial.
In fact, this problem allowed even some seemingly incorrect approaches
to pass. For example, you could make a mistake in the DP and not compute
the maxima, assuming the minimum can always be obtained by adding or
subtracting two minima, and still succeed because the method of putting brackets
that was discussed above when proving the greedy algorithm can indeed
be obtained by considering only minima.
IncreasingSubsequences
Used as: Division Two - Level Three:
Value
|
1000
|
Submission Rate
|
47 / 434 (10.83%)
|
Success Rate
|
21 / 47 (44.68%)
|
High Score
|
macs for 784.26 points (15 mins 50 secs)
|
Average Score
|
572.72 (for 21 correct submissions)
|
To solve this problem, one needed to think about how to tell if a
given increasing subsequence is maximal. The answer is not so complicated,
and is obtained by trying to unerase any other number: an increasing
subsequence is maximal,
if all the numbers erased before its first number are not less than it,
all numbers erased after its last number are not greater than it,
and all numbers erased between any two its consecutive numbers are either
not greater than the first of those two, or not less than the second.
One could then test all increasing subsequences, or even try to
write some backtracking solution, but both these methods will probably time out,
because the answer can be as high as about 86 million (try the sequence
3,2,1,6,5,4,9,8,7,...).
But the nature of the problem — the fact that the maximality test can be
done separately for each segment between the two consecutive numbers of the
sequence — leads us to a dynamic programming solution. Let's define
am[i] as the number of maximal increasing subsequences of the
first i elements of our sequence that end with exactly i-th
element. To compute am[i], we first check if the i-th element
can start a maximal subsequence, by checking that no element before is less than it. If
it can, we initialize am[i] with one, otherwise with zero. We then
check all the possible positions j of the previous element of the subsequence,
and if i-th element can follow j-th (i.e., the former is greater than the latter and no element
can be inserted in the middle), we increase am[i] by am[j].
To compute the final answer, we sum all the am's for the positions that can
end a maximal subsequence — that is, for which no following number is greater.
See FHead's solution
for an implementation of this approach.
RailwayTickets
Used as: Division One - Level Two:
Value
|
500
|
Submission Rate
|
249 / 401 (62.09%)
|
Success Rate
|
122 / 249 (49.00%)
|
High Score
|
abikbaev for 475.20 points (6 mins 33 secs)
|
Average Score
|
313.73 (for 122 correct submissions)
|
The first thing to understand in this problem is that we should not
be overconcerned with assigning seats to passengers. The only thing that really
matters is that, at any time, there should be no more than seats than passengers
inside the train. To prove that, we'll simply allow every entering passenger that
has his request accepted
to sit at any unoccupied seat. Obviously this strategy will fail only if there
are more than seats than passengers traveling simultaneously.
So, long story short, we have a set of segments on a line and need to choose
a subset such that every point is covered by at most seats segments. In case
seats is equal to 1, this problem is called an "Activity Selection Problem", and
its solution is described in the tutorial
on greedy algorithms. Try to read and understand the first part of that article (until "BioScore" caption)
before continuing with this editorial.
How to adapt that solution to work with a train with multiple seats? It turns out that
the same greedy approach works: we sort the passengers in the increasing
order of their exit (and not enter!) time, and then process them in the obtained
order and assign any passenger a seat while it's possible. Why does it work?
The proof is also similar to the one described in the above tutorial. Assume that at
some point we take passenger i, but not taking her could lead us to a better
solution — and the seat we give to passenger i is given to passenger j
instead. Let's consider the first such moment, i.e., let's assume that all passengers
that we placed earlier retain their seats. But since passenger i has exit time not later than passenger
j and all other remaining passengers, we can replace passenger j
with passenger i without affecting the passengers placed later, and we don't affect
the passengers placed earlier by the choice of i, so we've obtained a correct solution
of the same size but still containing passenger i. So there can be no error in
following our greedy algorithm.
The easiest way to check if adding the next passenger is possible, is to keep
the number of occupied seats for each unit segment of time (the segment from time x
to time x+1). For an implementation of this approach, see
RAVEman's solution.
This problem also allows for a totally different greedy approach. At first,
we will accept all the requests, and try to simulate the train's travel.
At any moment of time, we will remember the exit times of the passengers
that are on the train. When some passenger enters the train, we will add his exit time
to the list. When he exits, we remove it. The only possible problem is that
after some passenger enters we might get more than seats passengers
onboard — that's when we need to reject some request. But whose request
we will reject? After some thinking we can deduce that we should reject the request
with the latest exit time of those currently onboard; the formal proof of
this fact is left to the reader. A solution implementing this approach
will also be quite simple, like
that
of abikbaev.
The main difficulty in this problem is that it also has a lot of incorrect
greedy approaches. For example, sorting by the starting time or choosing the shortest trip are
incorrect. To avoid falling into that trap, one needed to be either lucky or
capable of proving her solution. This is really a situation where one can't
be too careful.
NormalizingTrees
Used as: Division One - Level Three:
Value
|
1000
|
Submission Rate
|
55 / 401 (13.72%)
|
Success Rate
|
12 / 55 (21.82%)
|
High Score
|
Petr for 770.47 points (16 mins 34 secs)
|
Average Score
|
549.84 (for 12 correct submissions)
|
The first, straightforward step in the solution of this problem is to construct
the tree (the list of adjacent vertices for each vertex) given the
input array.
Then, let's try all possibilities for the root vertex. Assuming we've
chosen the root, we need to renumber all the vertices in such a way that
the resulting vector is smallest lexicographically.
First, the root should obviously be labeled 0, as it is the only possibility for
the resulting vector to have -1 at the 0th position. Then, all the root's children will
correspond to zeroes in the resulting vector, and thus we need to label them with
1 to s in some order — it is the only possibility for the resulting
vector to have zeroes at next s positions, in all other cases some of
these numbers will be positive. Using the same argument we find that
we need to label the children of vertex 1 with next consecutive numbers, then children of vertex 2, etc.
The only freedom left is choosing the order of children at each step.
Consider the children of vertex 0. Their zeroes will be followed by the ones
of the children of vertex 1, and then by larger numbers. Thus, we need the number
of ones to be as big as possible, choosing the vertex with the greatest number of children
as the first child. Or, in other words, we need to sort the children of vertex 0
in the decreasing order of the number of their children. But what to do if several
of those numbers are equals? Then we need to check the number of children of the first child, etc.
All this sorting seems too complicated. But it turns out that we can
formulate the sorting criterion in a simple way. Assume our function
to find the lexicographically smallest description is called calc. Let's sort
the children in the lexicographical order of the descriptions obtained
by calling calc recursively for them, but with a slight modification: if one
of such descriptions is a starting segment of the other, the former should
be treated to be greater than the latter, and usually (for example,
like vectors are compared in C++ by default) it's the other way around. To achieve
that modification, we can append infinity to every description, then our sorting becomes
just the sorting of vectors.
It's not at all obvious why the above sorting works, but we'll omit the formal
proof because of its length and tediousness. To prove that for yourself,
try to understand step-by-step how the sequence for the parent is obtained from
the sequences for the children.
tomek's solution
is an excellent implementation of the above approach, refer to it for the details.