Competitors with the best 5 submissions in this contest according to system test results will receive the following prizes:
Bonus Prize - $100 per winning submission
We will ask the 5 winners to submit a Docker file for their solution and will provide guidance in that regard. We will pay the winners who submit a Docker file an additional $100 each for their efforts.
This contest builds upon one we ran recently. It uses the same format that allows submissions to employ any language or open source libraries that meet the licensing requirements specified below in the "Final Code Submission" section.
Requirements to Win a Prize
In order to receive a prize, you must do all the following:
- Achieve a score in the top 5, according to system test results calculated using the Contestant Test Data. See the "Data Description" and "Scoring" sections below.
- Create an algorithm that both reads in a single 10-second audio recording and outputs the results in under 60 seconds, running on an Amazon Web Services m4.xlarge virtual machine.
Within 7 days from the announcement of the contest winners, submit:
- A complete report at least 2 pages long, outlining your final algorithm and explaining the logic behind it and the steps in its approach. More details appear in the "Report" section below.
- All code used in your final algorithm in 1 appropriately named file (or tar or zip archive). The file should include your final model, saved so that it can make predictions without being retrained. Prospective winners will be contacted for this.
If you place in the top 5 but fail to do any of the above, then you will not receive a prize, and it will be awarded to the contestant with the next best performance who did all the above.
Faith Comes by Hearing ("FCBH") is dedicated to spreading the message of the Bible across the globe. Over the years, FCBH has observed that many people can't read or live in oral communities. The organization wishes to allow as many people as possible to hear the Bible in their native languages.
To enable this, FCBH seeks algorithms that can correctly identify the languages spoken in audio recordings. Given speech data from multiple languages, your algorithm must identify which languages are spoken.
You will receive both a training data set ("Contestant Training Data") and a test data set ("Contestant Test Data"). Both data sets contain recorded speech in 176 languages ("Possible Languages Spoken"). A separate .mp3 file stores each speech recording, and only 1 language is spoken in each file.
The Contestant Training Data contains:
- 66,176 files, each of which contains approximately 10 seconds of speech recorded in 1 of the 176 Possible Languages Spoken ("Speech Files").
- The identity of the actual language spoken in each Speech File ("Actual Language Spoken").
The Contestant Test Data contains:
- 12,320 different Speech Files, each containing approximately 10 seconds of speech recorded in 1 of the 176 Possible Languages Spoken.
There is no indication of which Possible Language Spoken in each Contestant Test Data file is the Actual Language Spoken in that file (i.e., the data are unlabeled).
Algorithm Output Format Requirement
Your algorithm must output 3 records for each Speech File. Each record must contain 3 comma-separated fields in the following order:
- The Speech File name ("speechFile")
- The Possible Language Spoken in the Speech File ("possibleLang")
- A rank from 1 to 3 indicating how likely it is that the Possible Language Spoken was the Actual Language Spoken in that Speech File, with 1 being most likely and 3 being least likely ("langRank"). You must use each rank only once for a given Speech File. Tied ranks are not allowed.
An example output appears below for 2 Speech Files:
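As an illustration of the record layout described above, the following Python sketch writes three ranked records per Speech File. The file names and language labels are hypothetical placeholders, not values from the contest data, and the helper name `format_records` is an assumption. Whether a header row is expected is not stated here, so the sketch writes none.

```python
import csv
import io


def format_records(predictions):
    """Return CSV text with three ranked records per Speech File.

    `predictions` maps a speech file name to a list of exactly three
    language labels, ordered from most likely to least likely.
    (Hypothetical helper; not part of the contest tooling.)
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for speech_file, ranked_langs in predictions.items():
        # Each rank 1..3 must be used exactly once per file. Requiring the
        # three languages to be distinct is an assumption made by this sketch.
        if len(ranked_langs) != 3 or len(set(ranked_langs)) != 3:
            raise ValueError(f"{speech_file}: need 3 distinct languages")
        for lang_rank, possible_lang in enumerate(ranked_langs, start=1):
            # Field order: speechFile, possibleLang, langRank
            writer.writerow([speech_file, possible_lang, lang_rank])
    return buffer.getvalue()


# Hypothetical file names and language labels, for illustration only.
example = {
    "file_0001.mp3": ["lang_a", "lang_b", "lang_c"],
    "file_0002.mp3": ["lang_d", "lang_a", "lang_e"],
}
csv_text = format_records(example)
print(csv_text, end="")
```

Building the ranks with `enumerate` guarantees that each of the ranks 1, 2, and 3 appears exactly once per file, so tied ranks cannot occur.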
During the contest, only your results will be submitted. You will submit code which implements only one function, getURL(). Your function will return a String corresponding to the URL of your answer .csv file. You may upload your .csv file to a cloud hosting service such as Dropbox which can provide a direct link to your .csv file.
To create a direct sharing link in Dropbox, right-click on the uploaded file and select Share. You should be able to copy a link to this specific file which ends with the tag "?dl=0". This URL will point directly to your file if you change this tag to "?dl=1". You can then use this link in your getURL() function.
Another common option is to use Google Drive for sharing the link. If you choose that, please use the following format to create a direct sharing link: "https://drive.google.com/uc?export=download&id=" + id;
You can share your result file in any other way, but make sure the link you provide opens the file stream directly.
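The link handling described above could be sketched in Python as follows. Only `getURL()` is named by the contest; the helpers `dropbox_direct_link` and `google_drive_direct_link`, the share link, and the file ID are hypothetical. How the function must be wrapped for submission (for example, inside a class) is not specified here, so a plain function is shown.

```python
def dropbox_direct_link(share_url):
    """Turn a Dropbox share link ending in '?dl=0' into a direct link ('?dl=1').

    Hypothetical helper for illustration.
    """
    suffix = "?dl=0"
    if not share_url.endswith(suffix):
        raise ValueError("expected a Dropbox share link ending in '?dl=0'")
    return share_url[: -len(suffix)] + "?dl=1"


def google_drive_direct_link(file_id):
    """Build a direct-download link from a Google Drive file ID.

    Uses the URL format given in the contest text.
    """
    return "https://drive.google.com/uc?export=download&id=" + file_id


def getURL():
    """Return the direct URL of the answer .csv file."""
    # Placeholder share link; replace it with the link to your own file.
    share_url = "https://www.dropbox.com/s/EXAMPLE_ID/answers.csv?dl=0"
    return dropbox_direct_link(share_url)


print(getURL())
```

Opening the returned URL in a private browser window is a quick way to check that it starts a download rather than showing a preview page.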
Your complete code that generates these results will be tested at the end of the contest.
The following links may provide helpful background information on previous, similar work. Sections "IV. Acoustic-Phonetic Approaches" and "V. Topics in System Developments" of Spoken Language Recognition: From Fundamentals to Practice by H. Li et al. (2013) could be particularly useful.
National Institute of Standards and Technology (NIST) Language Recognition Evaluation 2009
NIST Language Recognition Evaluation 2011
You may find the techniques, tools, and research papers in the following section useful.
Technique: Language Identification (Weka)
Type: Research Paper
Notes: Uses mel-frequency cepstral coefficients (MFCCs), pitch contour, and normalized pairwise variability index to generate coefficients that can be used in a feature space and algorithms implemented in Weka
Technique: Spoken Language Classification
Type: Research Paper
Notes: Uses MFCC, variance, and standard deviation to generate features for analysis, and support vector machines, neural networks, and Gaussian mixture models for language identification
Technique: Voice Pattern Designs
Type: Research Paper
Notes: Uses MFCC, Spectral Energy Peak (SEP), Spectral Band Energy (SBE), Spectral Flatness Measure (SFM), and Spectral Centroid (SC)
Technique: Audio Processing
Notes: Bob, developed at the Idiap Research Institute, is a Python-based framework for audio analysis
Technique: Machine Learning through Audio Analysis
Notes: pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks, including feature extraction, classification, segmentation, and visualization
Type: Links to National Institute of Standards and Technology (NIST) language recognition evaluation tools
Notes: NIST ran relevant language recognition evaluations in 2009 and 2011
We will run both provisional and system tests using the same submission, but you will not know which Speech Files are used for which test.
Your algorithm's performance will be quantified as follows.
For each Speech File,
If possibleLang = Actual Language Spoken and langRank = 1, then 1.0 points
Else if possibleLang = Actual Language Spoken and langRank = 2, then 0.40 points
Else if possibleLang = Actual Language Spoken and langRank = 3, then 0.16 points
Else 0.00 points
Score = 1,000 * Total points for all Speech Files in the Contestant Test Data
The maximum possible total scores for example, provisional, and system testing are 0, 3,520,000, and 8,800,000, respectively.
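For local validation against labeled data, the scoring rule above can be sketched in Python as follows. The function name `score_submission`, the input structures, and the sample file names and labels are assumptions made for illustration; the official scorer may differ in details such as how malformed records are handled.

```python
# Points awarded when the Actual Language Spoken appears at a given rank.
RANK_POINTS = {1: 1.0, 2: 0.40, 3: 0.16}


def score_submission(records, actual):
    """Compute Score = 1,000 * total points.

    `records` is an iterable of (speechFile, possibleLang, langRank) tuples.
    `actual` maps each speech file name to its Actual Language Spoken.
    (Hypothetical helper; not the official scorer.)
    """
    total_points = 0.0
    for speech_file, possible_lang, lang_rank in records:
        if actual.get(speech_file) == possible_lang:
            # Ranks outside 1..3 earn nothing in this sketch.
            total_points += RANK_POINTS.get(lang_rank, 0.0)
    return 1000 * total_points


# Hypothetical labels: the first file is correct at rank 1,
# the second at rank 2.
actual = {"file_0001.mp3": "lang_a", "file_0002.mp3": "lang_b"}
records = [
    ("file_0001.mp3", "lang_a", 1),
    ("file_0001.mp3", "lang_c", 2),
    ("file_0001.mp3", "lang_d", 3),
    ("file_0002.mp3", "lang_e", 1),
    ("file_0002.mp3", "lang_b", 2),
    ("file_0002.mp3", "lang_f", 3),
]
score = score_submission(records, actual)
print(score)
```

Because each Speech File earns at most 1.0 point, the stated maxima of 3,520,000 and 8,800,000 correspond to 3,520 and 8,800 Speech Files, which together account for the 12,320 files in the Contestant Test Data.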
If there are ties in the final system test results, then we will break them using algorithm run time on the Amazon Web Services m4.xlarge virtual machine described above. However, we will not measure algorithm run time or break any ties during the contest.
Your report must be at least 2 pages long, contain at least the following sections, and use the section names below.
- First Name
- Last Name
- Topcoder handle
- Email address
- Final code submission file name
Please describe your algorithm so that we know what you did even before seeing your code. Use line references to refer to specific portions of your code.
- Approaches considered
- Approach ultimately chosen
- Steps to approach ultimately chosen, including references to specific lines of code
- Advantages and disadvantages of the approach chosen
- Comments on libraries
- Special guidance given to algorithm based on training
- Potential Algorithm Improvements
Final Code Submission
You must submit all code used in your algorithm. You may use any programming language you like, provided it is free and open source for both you and the client to use. If you use something other than the standard Python/gcc/Java installation that our testers usually use, please include any relevant installation instructions for that language, and likewise for any additional libraries and packages.
You must submit evidence that your code runs successfully to completion. The code will be run on CentOS 6.5 x86_64 HVM.
Data File Downloads