JOIN
Get Time
long_comps_topcoder  Problem Statement
Contest: Silverline Media Advertisement Placement Challenge ROUND
Problem: AdOptimizer

Problem Statement

    

Prize Distribution

              Prize             USD
  1st                           $8,500
  2nd                           $6,000
  3rd                           $4,500
  4th                           $3,500
  5th                           $2,500
Total Prizes                   $25,000

Objective

In this challenge you are asked to simulate the task of placing ads to available time slots on TV channels in order to maximize revenue. You need to create an algorithm that distributes ads specified in advertisement deals daily in an optimal way: the more of them you manage without violating constraints on the placements, the better your score is.

Assigning advertisement deals can be a complicated task. Some deals are based only on the number of people who see it (impression-based). Other deals are based only on the times of day and channels on which they are shown (spot-based).

Input data

In this challenge we will use simulated data that resembles real life as much as possible. These are the most important kinds of data your algorithm must work with. The sections below define the universe of network advertising revenue.

SilverLine Media is a make-believe cable television network. SilverLine provides many services to its customers, and displays many public and private networks in its own network to its customers. SilverLine provides over 160 separate networks of programming, but in our contest we will focus on 13.

Customers are viewers who subscribe to SilverLine and watch content from their homes and offices. There are two types of customers. Some customers have "DVR" hardware from SilverLine, which allows them to record programs in realtime and watch them later. Other customers lack this hardware, and watch programming in realtime only.

When customers have DVR hardware, SilverLine is able to pre-position ad content on the hardware that is tailored to the customer's specific demographics (age, gender, location). This is an important feature, which is explored in more detail below as "Addressable ads".

Advertisers buy time from SilverLine on their network, generating revenue for the company. Advertisers create ads of varying lengths and pay networks like SilverLine to display them.

Deals are purchases made by advertisers to place ads on SilverLine. Deals are Linear - Guaranteed, Linear - Non-guaranteed, or Addressable. The network sales staff sells deals. SilverLine displays the ads to make money.

Channels are network TV (or cable) channels. Channels stand alone, and can also be clustered (all sports channels, for example). Deals can be purchased for specific channels or channel clusters.

Impressions (and CPM or Cost per thousand impressions, which is the payment rate tied to impressions) are a critical metric for SilverLine. An Impression is achieved when a viewer sees an advertisement. The method for measuring Impressions is dealt with later.

Slots and Spots: TV "slots" are the gaps in programming established by networks into which SilverLine is able to schedule ad "spots". Slots vary in length. There are typically two, separated, one minute slots per hour of programming, but the networks providing the programming will vary these according to the program. The minimum length for any slot is 30 seconds. The maximum is 120 seconds. Slots can be of two types: Linear, or Addressable. Linear slots must be filled with linear deals (see below for the definition of these), currently this assignment is done manually, improving this process is the main goal of this challenge. Addressable slots will be filled by addressable deals automatically by the DVR hardware.

Linear deals earn revenue by showing a spot an agreed-on number of times within a schedule. Linear deals can be guaranteed or non-guaranteed.

Guaranteed Linear deals commit SilverLine display an ad enough times to meet a minimum number of impressions. The deal supply includes the number of times a customer wishes a spot to air per channel, according to estimated or expected impressions for that slot. SilverLine must keep airing the spot until the required number of impressions is met. This can be a challenging metric to manage.

Non-Guaranteed Linear deals: SilverLine is paid for the number of times the ad is shown. Impressions are tracked, but SilverLine is not required to meet a minimum number of impressions; SilverLine only counts the number of times the ad is shown, and is paid each time the ad is shown.

Today, SilverLine prioritizes airing Guaranteed Linear deals in order to retire them quickly.

Addressable deals are for DVR customers, and involve storing ad content on their systems. Addressable ads are highly tailored to the specific profile of the customer, which allows SilverLine to command higher deal rates from advertisers for these ads. Ads for addressable slots are electronically sent by SilverLine to customer DVRs for storage well in advance of the times they may be displayed. It is the DVR hardware that determines which stored ad to show, when an Addressable slot is encountered. In this challenge we simulate the logic that determines the placement of addressable ads in a way that is simpler than how it is done in real life. The simulation will prioritize ads that

  • have a targeted demography (distribution of age and sex of viewers) that is similar to that of the predicted demography of the slot in which the ad must be placed,
  • are more profitable (their total payout per number of viewers and per spot length is higher),
  • and are more fulfilled (they have already been shown to a higher ratio of the targeted number of viewers).

For the exact algorithm you may refer to the source code of the provided off-line tester tool, but note that scheduling addressable deals is not in the scope of this challenge. As far as addressable deals are concerned, your task is simply to determine which slots should support showing addressable deals.

Every Addressable slot also functions as a regular linear slot, called the "underlying slot", for customers who lack a DVR, or have a DVR but do not match the demographic, geography or programming requirements of the addressable ads that are stored on their DVR hardware when the slot arrives.

Note that for simplicity, in this challenge there is a one to one mapping between deals and spots, so you can treat the words deal, spot, ad and advertisement the same.

Summarizing input so far

To generate revenue, SilverLine must place ad spots. Spots can be placed only in the slots that are reserved for them. Spots are addressable, linear-guaranteed, or linear non-guaranteed. Addressable slots also function as linear slots, for those customers who either can't view addressable spots or who lack addressable inventory on their devices.

Guaranteed linear deals generate revenue when SilverLine has confirmed the required number of customer impressions for that spot.

Non-guaranteed linear deals generate revenue when SilverLine simply displays the spot.

Addressable deals generate revenue when a customer who matches the demographic requirements of the spot sees the spot.

Your task is to schedule linear spots and to tell which slots should be available for addressable spots.

Spots are subject to additional constraints, which are described below.

Constraints limit which deals can be assigned to a specific slot, may set priority for one type of spot or product type over another, or set rules for spot sequence. Constraints are based on the fields of the deal, see the list and details of fields below where the Deal object type is defined. The constraints are:

  • Deal Flight. Dates on which the order can air. This can be a series of specific dates or the full range of days of the simulated period.
  • Ordered Channel. The channel or channels on which the spot is ordered to air.
  • Allowable Time. The time range or time ranges of day that the spot can air.
  • Maximum number of same category per slot. Can't schedule this ad to a slot if it already has this number of ads from the same product category.
  • Time Separation. The minimum time between showing the same spot on the same channel again. Default of 15 minutes.
  • Slot length vs spot length. The total length of the spots in a slot should not exceed the length of the slot. The length of a slot can be 30, 60, 90 or 120 seconds, 60 being the most common. The length of spot can be 15, 30, 60, 90, 120, the most common are in this order: 30, 60, 15, 120, 90.
  • Maximum number of showing per day. The same spot cannot be shown more than this many times per day on the same channel. Default: 5

Gameboard

The gameboard is comprised of slots distributed on the simulated 13 channels. At the start of the simulation each slot is marked as either addressable or linear, but you can change this assignment later.

The actual number of impressions are tracked by Nielsen in real life, in the challenge they will be simulated (by applying random perturbation to the predicted values) and made available to your scheduler after some days of delay, see details later. (The actual viewership data lags by several days, because it takes time to collect it from the field.)

The challenge

In this challenge your goal is to develop a general purpose bot that optimizes deal scheduling for highest revenue. You will receive many gameboards each of which comprises 30 days of 13 channels of programming and includes slots. There are between 0 and 3 slots per hour.

The game simulation works like this:

  1. You get a list of deals. The list will contain all 3 deal types (addressable, guaranteed linear, non-guaranteed linear) and will represent a high variablility in all variables like pricing, products, constraints.
  2. You get a gameboard of 30 days. Slots are tagged as addressable or linear.
  3. You get a list of estimated viewership data broken down by slots and demographics for the simulated period.
  4. You get reports of actual linear viewership (impression) data (a single integer number per slot). The data is current to today minus 10 days (it always lags 10 days behind).
  5. You get reports of actual addressable viewership data (an array of integers per slot representing viewers in each age group). The data is current to today minus 5 days (it always lags 5 days behind).

Your bot must create the spot-to-slot assignment for the next day given the above data - without violating deal constraints. Your bot will run once for each day on the game board. After each run, you will receive new data:

  • #4 and #5 will be updated with actual viewership for the next day in sequence,
  • you will receive random changes to #1. A small percent of deals end, new deals may appear, parameters of existing deals change (only the number of targeted impressions, the allowed days and time ranges may change).

For each new day, your bot is expected to:

  • Assign linear spots for the day such that revenue is maximized over the whole simulated period.
  • Update slots to be addressable or linear in order to maximize the revenue. Note once again that addressable deals must not be scheduled. Changing the addressability flag of a slot will be effective after two days. (E.g. if you change from linear to addressable on day 5, the slot will be addressable from day 7 onwards.)

Data format

In this challenge your algorithm receives data and generates output in arrays of different object types represented as key-value pairs in line-based, textual format. The object types (like Channel, Deal, Placement) are defined below by showing a typical instance for each type, extended with comments (marked by the # character). Empty lines are allowed and ignored. Note that the comments are added only as explanation, they will not be present in the output data and need not be present in your output. However, please read the comments below carefully as they provide important information not given elsewhere in this document.

Channel

  id: 10 # unique id, integer
  
  # Slots are defined in blocks starting with the "slot_id" field
  slot_id: 13 # slot identifier, unique for this channel, integer
  type: A # slot type on day 1, "A" for Addressable, "L" for Linear
  # Day of week when this slot is available. A single number in [1..7], 1 is Monday.
  # Note this is different from the days field of a Deal.
  day: 1 
  time: 07:40 # start time in "hh:mm" format
  length: 120 # length in seconds
  linear_impressions: 12000 # predicted viewership, used for linear slot type

  # Demography data representing the predicted viewership data for the current
  # slot is given in two lines (men / women) containing a comma separated list of numbers per
  # age group. There are 15 groups for the following age ranges:
  #   2-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-24, 25-29, 
  #   30-34, 35-39, 40-44, 45-49, 50-54, 55-64, 65+
  # This is used only for addressable type. Note that the total count across all age groups is
  # usually less than linear_impressions, because only a fraction of households have
  # support for addressable slots.
  addressable_impressions_m: 0, 0, 0, 0, 0, 0, 0, 0, 0, 483, 536, 596, 662, 596, 536
  addressable_impressions_w: 0, 0, 0, 0, 0, 0, 0, 0, 0, 337, 375, 417, 463, 417, 375

  slot_id: 14 # a new slot begins here...

Deal

  id: 28 # unique id, integer
  type: A # "A", "LG", "LN" for Addressable, Linear-Guaranteed, Linear-Non-Guaranteed
  # The "closed: true" field is present only if the deal is prematurely closed
  # during the simulated period. If present, all other fields than "id" can be 
  # ignored, and this deal should not be scheduled any more.
  closed: true

  guaranteed_impressions: 150000 # This is present only if type="LG"
  
  # Targeted addressable impression counts per age groups. Only for "A" deal types
  addressable_impressions_m: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10000, 10000, 0, 0, 0
  addressable_impressions_w: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
  
  # Product category, one of "automotive", "pharmaceutical", "food", 
  # "insurance", "restaurants", "telecommunications", "entertainment", 
  # "beauty", "household", "health", "clothing", "services"
  category: food 
  
  length: 30 # spot length in seconds

  # For "LN" type "rate_per_s" is the fee of showing 1 second of 
  # the spot. For "LG" type "total_fee" is the fee of reaching the required
  # guaranteed number of impressions. For "A" type "total_fee" is the fee of reaching the
  # required number of impressions in all targeted age groups.
  # Only one of "rate_per_s" and "total_fee" is present.
  rate_per_s: 28.5
  total_fee: 20000

  # Some deal constraints are optional. If missing then 
  # their default value must be used.

  # Comma separated list of allowed days, values are in the [1..30] range. 
  # Optional. Default: all 30 days of simulation
  days: 5,6,10,11

  # Comma separated list of targeted channels, values are in the [1..13] range. 
  # Optional. Default: all channels.
  channels: 1,2,3

  # Comma separated list of allowed start time ranges in hh:mm-hh:mm format.
  # (Checked against the start time of the slot, not against the actual start
  # time of the ad, which can be 1-2 minutes later.)
  # Optional. Default: 00:00-23.59
  times: 08:00-11:59,16:30-18:29

  time_separation: 30 # Minimum delay in minutes. Optional. Default: 15.

  # Max number of ads per slot from the same category. Note it is a parameter
  # of the deal, not the slot. Optional. Default: 4
  max_no_per_category: 2

  # Max number of times this spot can be shown in a day on the same channel. 
  # Optional. Default: 5
  max_show_per_day: 2

Report (actual viewership data)

  day: 3 # actual data from this day
  channel_id: 10
  slot_id: 13
  # Two "addressable_impressions_X" fields and a single "linear_impressions" field,
  # representing actual viewership data for addressable and linear ads, respectively.
  # The "linear_impressions" fields may contain -1, it means there is no data 
  # available for linear ads for that day. Note that number of linear_impressions is correct
  # only for linear slots. If the slot is addressable then the correct number of 
  # linear impressions is smaller than the linear_impression count: you have to 
  # substract the sum of the addressable impression counts.
  addressable_impressions_m: 0, 0, 0, 0, 0, 0, 0, 0, 0, 483, 536, 596, 662, 596, 536
  addressable_impressions_w: 0, 0, 0, 0, 0, 0, 0, 0, 0, 337, 375, 417, 463, 417, 375
  linear_impressions: 12000

Placement

  channel_id: 10
  slot_id: 13
  # "deal_ids" holds a comma separated list of deal IDs that should be placed
  # (in the given order) into a linear slot or into an underlying linear slot of
  # an addressable slot.
  deal_ids: 6,66,123

SlotChange

  channel_id: 10
  slot_id: 13
  new_type: A # the required slot type, "A" for Addressable, "L" for Linear

Functions

In this challenge a 'task', or 'test case' is a certain combination of the data described previously: a given set of TV channels (with slots), and a set of deals. Based on these input data you need to provide a set of ad placements: which ads should be shown on which channels and when.

You need to implement a class called AdOptimizer that has two methods setupTask and schedule having the following specification:

public int setupTask(String[] channels, String[] deals)

will be called once at the beginning of each test case. The channels array contains 13 elements, in the format shown above (without the comments). The deals array contains the initial set of deals that you will need to place into slots. Note that deals may later change during the simulated time period. You may return any integer value.

public String[] schedule(int day, String[] deals, String[] reports)

will be called 30 times in each test case, once for each simulated day. The start of the simulation (day=1) is always a Monday.

  • The deals array may contain new deals (not known on previous days) and, additionally, it may specify changes to deals you already know of (having an id field that was seen already). In case of changes the new version of the deal will contain all fields, not only those that changed. Additionally, deals may be closed, these will have "closed:true" in their string representation.
  • The reports array will be empty in the first couple of days of simulation. Later, it will contain actual viewership data for an earlier day (5 and 10 days earlier, see details above).
  • Your return should be an array of Placement and SlotChange objects targeting the given day. Both types of objects can be added to the array, the scorer will determine the correct type for each entry.

The total time spent in your code in a test case should be less than 60 seconds.

Scoring

The quality of the ad placements your algorithm generates is measured by how much revenue it generates on average on a high number of tasks.

Your score for a test case will be -1 if any of these errors happen: running your algorithm takes longer than allowed; the returned strings are not formatted correctly (bad syntax, unknown field names or constant values, etc); unknown channel, slot or deal ids; the total length of ads placed into a slot exceeds the slot length; any deal constraints are violated (e.g. an ad is shown on a channel or at a time where or when it shouldn't, the same ad is shown on a channel more frequently than allowed, etc); an addressable or a closed deal is used in scheduling.

In case of no errors, the revenue on a test case is calculated as the sum of revenues achieved on the 3 deal types:

  • For linear non-guaranteed deals revenue = rate_per_s * length * N, where rate_per_s and length are as defined for the deal, N is the number of times the spot was shown.
  • For guaranteed linear deals revenue = total_fee * fulfillment_ratio, where fulfillment_ratio = actual_impressions / guaranteed_impressions, capped to [0...1].
  • For addressable deals revenue = total_fee * fulfillment_ratio, where fulfillment_ratio = sum_i ( min (actual_impressions_i, targeted_impressions_i)) / sum_i (targeted_impressions_i), where the summation is over the age groups.

The raw score for a test case is the total revenue calculated as above.

To calculate your overall score for the current test set (provisional or final) relative scoring will be used: for each test case where your score is not -1, you get score / max points; where score is your raw score as defined above, max is the best such score achieved by any of the contestants on the same task. Finally, the sum of points is multiplied by 1,000,000 and divided by the number of tasks.

Notes

  • The time limit is 60 seconds per test case (this includes only the time spent in your code). The memory limit is 1024 megabytes.
  • There is no explicit code size limit. The implicit source code size limit is around 1 MB (it is not advisable to submit codes of size close to that or larger). Once your code is compiled, the binary size should not exceed 1 MB.
  • The compilation time limit is 30 seconds. You can find information about compilers that we use and compilation options here.
  • There are 5 example test cases and 20 provisional test cases. There will be 100 test cases in the final testing. Example submissions can be used to verify that your implementation works according to the specifications and the online scorer can reproduce your local results. The scorer tool gives detailed error messages in case of errors during example testing. Though recommended, it is not mandatory to create example submissions. The scores you achieve on example submissions have no effect on your provisional or final ranking. Example submissions can be created using the "Test Examples" button on TopCoder's submission uploader interface. For provisional testing you have to submit full submissions using the "Submit" button on the submission uploader interface. The only feedback you will receive after making a full submission is your overall score on the provisional leaderboard. For final testing your last full submission will be used.
  • The match is rated.
  • An off-line tester tool is available here that you can use to develop your solution locally. See the source code of the tool for the exact algorithm of generating a test case and scoring. The tool also contains a baseline algorithm that you can use as a starting point. Feel free to reuse its code in your solution.
  • The set of parameters that drive the simulation (see the source code of the off-line tester tool) may be changed in the first week of the contest. Watch the contest forum for notifications.
  • Use the match forum to ask general questions or report problems, but please do not post comments and questions that reveal information about possible solution techniques.
 

Definition

    
Class:AdOptimizer
Method:setupTask
Parameters:String[], String[]
Returns:int
Method signature:int setupTask(String[] channels, String[] deals)
 
Method:schedule
Parameters:int, String[], String[]
Returns:String[]
Method signature:String[] schedule(int day, String[] deals, String[] reports)
(be sure your methods are public)
    
 

Examples

0)
    
Test case 1
1)
    
Test case 2
2)
    
Test case 3
3)
    
Test case 4
4)
    
Test case seed = 5

This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2010, TopCoder, Inc. All rights reserved.