Key Information

Register
Submit
The challenge is finished.

Challenge Overview

In this challenge, we want you to help to optimize the way we getting a project's latest activities in TopCoder Direct. This is done mainly by the query 'query_direct_latest_activities_replatforming' now. The project's latest activities are displayed in the project overview page like project_latest_activities.png 

Requirements Details

  • You need to read all the documents ( see Documentation section for details) we provided first to analysis and make an optimization plan. You can optimize from different aspects:
    • Optimize in the query level 
      • Optimize the current query by rewrite the joins or add / remove indexes for the related tables
      • Split the query it into multiple ones
      • Totally rewrite the query
    • Optimize in the code level - for example, remove unnessary calls, optimize the logic to improve performance etc.
  • You need to document your optimizations and why / how they improve the performance. Give measures on how the performance is improved. For example, the getting the latest activities of  project XXX which has 100 challenges and 20 active challenges takes 18 seconds before optimization, after optimization it only takes 1.5 seconds. The measurement should based on the same environment and same test data set. For test data and test environment, please refer to the Testing section.
  • You need to implement all the optimizations in your optimization document.

Testing

  • Use the test_data_generator.7z to generate a SQL file which inserts the test data into database. You can adjust build.properties to adjust the amout of data you want to generate.  For this challenge, you need to generate
    • Several direct projects (2 or 3)
    • Each Direct Project will have 20+ active challenges
    • Each Direct Project will have 20+ completed challenges in recent 30 days
    • Each Direct Project will have hunreds of challenges within the whole project
  • Insert the test data into the database on the VM assigned to you. Add the indexes which are existent in production database but missing in VM. Refer to production_database_tables.docx.
  • Run UPDATE STATISTICS for the related tables so the informix optimizer can capture the latest database information and do proper optimizations.
  • You need to figure out a way to measure the performance and 
    • Mesure the performance of the methods which use the old query on VM
    • Mesure the performance of the these methods after implementing your optimizations (assume these method signatures and return values do not change) on VM.

Documentation

The following documents are attached in the challenge forum: Documents Forum

  • production_database_tables.docx - the statistics like row number, existing indexes, number of unique rows for each column for all the related tables to my projects query.
  • explain-direct_latest_activities_replatforming.txt - the query explanation plan on the project latest activities query.  You can read http://www.ibm.com/developerworks/data/library/techarticle/dm-0409fan/ and http://www.ibm.com/developerworks/data/library/techarticle/dm-0410fan/ to know get details about what explanation is and how it can guide you to optimize the query and find out performance bottleneck.
  • query_direct_latest_activities_replatforming.sql - the SQL query content
  • test_data_generator.7z - the test data generator

Repository:

Direct : https://github.com/cloudspokes/direct-app/tree/challenge-30046749

Remote branch: challenge-30046749

DB Scripts SVN: https://coder.topcoder.com/internal/database/scripts/trunk (Revision 85346)

VM Environment

You need a TopCoder VM to work on this challenge

Information about VM can be found below:

VM specific information is found here: http://www.topcoder.com/wiki/display/docs/VM+Image+2.5. Upon registration as a submitter or reviewer you will need to request a VM based on the new TopCoder Cockpit/Direct image. To request your image, please use the forum. Before requesting your VM, you need to ensure that you have an SSH key created and in your member profile. Instructions to do so are here: http://www.topcoder.com/wiki/display/projects/Generate+SSH+Key, and instructions to connect afterwards are here: http://www.topcoder.com/wiki/display/projects/Connect+Using+SSH+Key.

For the direct on VM, please do not use the /home/direct/direct as your work base. Instead you should git clone from https://github.com/cloudspokes/direct-app into another direct for example /home/direct/direct-app.

  • Copy token.properties and topcoder_global.properties from the /home/direct/direct to /home/direct/direct-app
  • Update topcoder_global.properties to point to the new location  /home/direct/direct-app
    • ���Add a new line direct_service_libdir=${libdir}/tcs/ejb
  • Run 'export JAVA_HOME=/opt/jdk1.7.0_17' to use JDK7
  • You can run 'ant deploy' to deploy the direct after these steps


Final Submission Guidelines

  • A optimization plan document which documents
    • The optimization you make and why / how you make it
    • The improvments you gain by these optimization - time measurements.
  • The updated / added queries, java source codes, configurations etc.
  • The test data SQL you used.

ELIGIBLE EVENTS:

2015 topcoder Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

SHARE:

ID: 30046947