Match Editorial

SRM 134
Saturday, February 8, 2003

Match summary

This problem set had two rather tough problems. The Division-I Level 1 and Division-II Level 2 problems were the same, and had very low success rates. Division-I coders also had to deal with a very difficult Level 3 problem, which only Yarin solved. No one in Division-I was able to solve all three problems except for Yarin, and no one in Division-II was able to solve all three problems except for Eeyore.

The Problems

Defragment

Used as: Division-II, Level 1:

Value 250

Submission Rate 197 / 215 (91.63%)

Success Rate 152 / 197 (77.16%)

High Score stigant for 240.80 points

Implementation

The solution to this problem consists of performing the steps of defragmentation as described. First, we make sure we have the FAT stored in some sort of mutable array (in C++, string is already mutable, but in Java you will want to convert the String to a char[] using toCharArray or to a StringBuffer).

We then walk the array from the end. Whenever we encounter a 'C', we find the first occurrence of the '.' character in the string. If this occurrence is to the left of the 'C' we just found, we swap the two characters and continue. Otherwise, the FAT is already fully defragmented and we can go ahead and return.

ObjectCounter

Used as: Division-II, Level 2:

Value 500

Submission Rate 82 / 215 (38.14%)

Success Rate 9 / 82 (10.98%)

High Score XooleX for 406.29 points

Used as: Division-I, Level 1:

Value 250

Submission Rate 129 / 148 (87.16%)

Success Rate 43 / 129 (33.33%)

High Score Yarin for 243.30 points

Implementation

The easiest way to solve this problem is probably to iterate through all the different combinations. For instance, we can have a for-loop that tries each number of red cubes from 0 to 25. Inside this for-loop we'd have another one that does the same thing for red spheres. We continue in the same fashion for blue cubes and blue spheres.

For each combination we generate in this manner, we see if it contradicts the information we already know. For instance, the total number of shapes must exactly match the total if known. The total numbers of spheres and cubes each must match the given totals (if given). And if the number of a particular type of object is given, then of course it must match the number generated in our combination. As soon as we find a valid combination, we can determine the value of the entry in the array that is requested and return it. This is because of the constraint that explicitly specifies that there will be exactly one possible combination.

This problem was used in both Divisions, and in both Divisions the success rate was dismal. This probably was a major factor causing the submission rate for the Division-I Level 3 problem to be so low.

StoreDB

Used as: Division-II, Level 3:

Value 1000

Submission Rate 57 / 215 (26.51%)

Success Rate 5 / 57 (8.77%)

High Score frypan for 551.22 points

Implementation

We need to determine the implicit ordering of brands made by the customer by examining the shopping history. We know that in each entry of the history, exactly one brand is marked as bought, and that brand can be considered the most preferred brand among all available brands in the history.

A strict ordering can be represented by the transitive closure of a directed graph. A path from one vertex to another indicates that the former vertex is greater than the latter. So, as we process the history data, we build a directed graph.

Initially this graph should be empty. We iterate through the entries of the history array. For each entry, we locate the '$' character. We then iterate through the brands in that entry. For each occurrence of the '.' character, we add an edge from the brand that was bought to the brand that was available.

Once this graph is built, we should generate the transitive closure of the graph. The transitive closure is just the set of all paths in the graph. That is, for each pair of vertices in the graph there is a boolean value representing the presence of a path from one to the other. A very simple algorithm for generating this is Warshall's algorithm.

Assume that the graph is available as an adjacency matrix named adj (e.g., if adj[i][j] is true, then an edge directly connects vertex i to vertex j). We will populate the transitive closure matrix of the same dimensions and type, named paths.

                                        for(int i = 0; i < n; i++) {
                                            for(int j = 0; j < n; j++) {
                                                paths[i][j] = adj[i][j];
                                            }
                                        }
                                        for(int i = 0; i < n; i++) {
                                            for(int j = 0; j < n; j++) {
                                                if(path[i][j]) {
                                                    for(int k = 0; k < n; k++) {
                                                        if(path[j][k]) {
                                                            path[i][k] = true;
                                                        }
                                                    }
                                                }
                                            }
                                        }

This algorithm is very similar to Floyd's algorithm for finding all of the shortest paths in a graph. Now, once we have this data, we need to examine the current available brands to find the available brand that is more preferred than any other available brand. Just iterate through the available brand array. For each occurrence of '.', we check to see if it's the maximum available brand. We iterate through the available brand array again, and for each '.' we check our transitive closure matrix to see if there is a path from the latter to the former. If there is, we can simply repeat this process with the latter brand (or we could do it the slow and easy way and just continue iterating through the brands in the same way). Whenever we encounter a brand that is not known to be preferred less than any other available brand, we add it to the return array.

QuickCount

Used as: Division-I, Level 2:

Value 500

Submission Rate 83 / 148 (56.08%)

Success Rate 41 / 83 (49.4%)

High Score bmerry for 483.83 points

Implementation

The first step is to modify the lower and upper bounds so that they are both divisible by the base in which we are counting. We do this by increment the lower bound by 1 repeatedly until lower % base == 0. Whenever lower % base != 0, we increment the number of times we count lower % base by 1. We do exactly the same with the upper bound, except that we decrement the upper bound until upper % base == 0.

Now we can find the rest of the answer inductively. We simply chop off a trailing zero from the upper and lower bounds. The difference between these two new bounds will then be the number of times that each digit should be counted. We then go back to our first step above using these new bounds.

We repeat this process for as long as the lower bound is less than the upper bound. Once we get to this point we have our answer.

ExpertSystem

Used as: Division-I, Level 3:

Value 1000

Submission Rate 4 / 148 (2.7%)

Success Rate 1 / 4 (25%)

High Score Yarin for 455.18 points

Implementation

This probably basically stumped most of Division-I. The only person to successfully solve it was Yarin, and so this explanation will be based on his particular implementation.

For any pair of terms, there are certain states that the relation between the first and second can be in. The relation could be less than, greater than, equal, or unequal. For instance, for the statement "a <= b" the relation is either less than or equal. We will have a table of relations between terms, in which we will eliminate possible states that each relation can be in. Initially we have no data, and every state is possible. We will represent combinations of states as integers, where each of the first four bits represent the states listed above. Thus a value of 15 will represent the combination of all four states, and will mean "UNKNOWN."

After we declare and initialize an array that can hold the state of relations between all possible pairs of terms (including space for any numeric literals we encounter), we begin processing the input. For each input rule we parse all the terms and relations. For each term encountered, we will gain knowledge of the relation between it and each of the terms that follow it in that rule. We will store that knowledge by performing a bitwise and on the set of possible states of each relation with the set of states represented by that relation. For instance, "<=" represents LESS_THAN | EQUAL, while "!=" represents LESS_THAN | GREATER_THAN | UNEQUAL. We will treat numeric values just the same as variables at this point.

Next we do some special processing of numeric terms. We have a priori knowledge of the relation between each pair of numeric terms. So we iterate through our terms, finding all numeric ones, and then iterate through all pairs of numeric terms. For each pair, we will set the relation between the two to either LESS_THAN | UNEQUAL, GREATER_THAN | UNEQUAL, or EQUAL, as appropriate. We also must populate the matrix with EQUAL for the relation between identical terms (e.g. rel[i][i] = EQUAL for all terms i).

Next begins the process of making all the inferences we can possibly make. We can do this in a loop, iterating through all pairs and then all triples of terms. For each pair, we can see if we can infer anything reflexive. E.g., if we know that x may be less than y, then we know that y may not be less than x. Also, for each pair, if one is greater than or equal to the other and less than or equal to the other, then we know the relation between the two cannot be unequal. If one is not greater than the other, then the other cannot be less than the former. For each triple, we can see if there's a relation we can infer between the first and last term due to the existence of a relation between the first and second and a relation between the second and third. For instance, if x is less than y and y is less than z, then we know that x cannot be greater than z. Note how each of these inferences only involves generating negatives. There are 13 inferences that Yarin attempts to make in his solution, which can be assumed to be exhaustive.

We repeat this loop for as long as we are able to generate an inference. When we can no longer generate inferences, then the knowledge we currently have is as specific as it can get. First we check to see if there are any contradictions in our knowledge, which are indicated by there being no possible states for a relation between any pair of terms. If there are no contradictions, we can then look at the possible states for the relation between the requested terms. If all the states are possible, then the answer is "UNKNOWN". Otherwise the states should be one of the following:

State	Relation
`LESS_THAN \| UNEQUAL`	`<`
`GREATER_THAN \| UNEQUAL`	`>`
`EQUAL`	`=`
`LESS_THAN \| GREATER_THAN \| UNEQUAL`	`!=`
`LESS_THAN \| EQUAL \| UNEQUAL`	`<=`
`GREATER_THAN \| EQUAL \| UNEQUAL`	`>=`

These are basically taken directly from Yarin's solution. If anything about this description does not make sense, examine Yarin's code and you should be able to figure it out.

By Logan
TopCoder Member