[R] Data Mining Competition 2008

Xiaogang Su xiaosu at mail.ucf.edu
Fri Feb 15 22:03:50 CET 2008


Dear Colleagues, please help distribute the following announcement of a
data mining competition. Thanks, -XG



=============================
Data Mining Competition 2008
=============================

Website: http://dms.stat.ucf.edu/competition08/home.htm 

Department of Statistics & Actuarial Science
University of Central Florida


ANNOUNCEMENT

The Data Mining program at the University of Central Florida (UCF) is
announcing a data mining competition on marketing response analysis in
collaboration with BlueCross BlueShield of Florida (BCBSFL). The purpose
of this project is to develop a predictive model that can generate a
list of potential responders for a future promotional mailing campaign.
The response/target variable is binary (0-1), with the value 1
indicating a response in the previous mail campaign. Most of the
explanatory variables (inputs) used in this study come from census data;
the rest come from a list data vendor. All input variables have been
renamed X1, X2, ... for data security and privacy reasons.


DATASET DOWNLOADS

The datasets are available in two formats: SAS and comma-separated
values (CSV). After registering, please download whichever format is
more convenient for you.

Register to Download Dataset

        Training                         Test
SAS     training.sas7bdat (392.53 MB)    test.sas7bdat (43.89 MB)
CSV     training.csv (257.00 MB)         test.csv (28.55 MB)


PARTICIPATION AND AWARDS

This competition is open to anyone interested. Please review the
following rules carefully and contact us with any questions at
data.mining.2008 at gmail.com.

Please build your model using the training data set and use it to
obtain a predicted probability of response for each individual in the
test sample. Two deliverables must be submitted by 5:00 pm (Eastern
Time) on 3/31/2008 in order to participate in the contest.

— A data set with two columns: one is ID and the other is your
predicted probabilities of response (not 0-1 predicted outcomes).

— A one-page write-up that contains your contact information and a
brief description of your modeling methods and approaches. The contact
information should list the names, titles, academic degrees,
affiliations, and locations (city, state, and country, if international)
of all authors.
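
As an illustration of the first deliverable, here is a minimal Python
sketch that writes a two-column file of IDs and predicted
probabilities. The announcement does not specify a file format or
column names, so the CSV layout, the header names "ID" and "PROB", and
the helper name write_submission are all assumptions made for this
example.

```python
import csv
import io


def write_submission(ids, probabilities, out):
    """Write ID / predicted-probability pairs as CSV to a file-like object.

    The header names "ID" and "PROB" are hypothetical; the announcement
    only asks for two columns.
    """
    if len(ids) != len(probabilities):
        raise ValueError("ids and probabilities must have the same length")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ID", "PROB"])
    for record_id, p in zip(ids, probabilities):
        # The deliverable is a probability, not a 0-1 predicted outcome,
        # so every value must lie in the closed interval [0, 1].
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for ID {record_id}: {p}")
        writer.writerow([record_id, f"{p:.6f}"])


# Usage with an in-memory buffer; a real run would pass an open file.
buffer = io.StringIO()
write_submission([101, 102], [0.25, 0.9], buffer)
print(buffer.getvalue())
```

The range check is only a guard against accidentally submitting class
labels or unscaled scores.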


The three winners will be selected based on their predicted
probabilities for the test sample. All participants will be ranked
separately on each of the following two model performance measures.

— Criterion 1:    area under the receiver operating characteristic
(ROC) curve.
— Criterion 2:    percentage of responders caught among the first
10,000 individuals with the highest predicted response probabilities.
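
For participants who want to check their models on held-out training
data, the two criteria can be sketched in plain Python as below. This is
not the organizers' scoring code. The function names are hypothetical,
the AUC uses the standard Mann-Whitney rank-sum identity with midranks
for tied scores, and Criterion 2 is read here as the share of all
responders that fall in the top k (with k = 10,000 in the contest).
Because k and the test set are fixed, reading it instead as the share of
the top k who are responders would order participants the same way.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity."""
    n = len(labels)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs both responders and non-responders")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def capture_rate_at_k(labels, scores, k):
    """Share of all responders found among the k highest-scored individuals.

    Ties at the cutoff are broken by input order (an assumption; the
    announcement does not say how ties are handled).
    """
    total = sum(labels)
    if total == 0:
        raise ValueError("no responders in the sample")
    top = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in top) / total


# Toy data: labels are 0-1 responses, scores are predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))               # 0.75
print(capture_rate_at_k(labels, scores, 2))  # 0.5
```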

The final ranking is then determined by the sum of these two separate
ranks (a lower sum is better). In the case of a tie (e.g., Tom ranks
No. 1 on Criterion 1 and No. 3 on Criterion 2, while Jerry ranks No. 2
on both criteria), the participant with the better rank on Criterion 1
(i.e., Tom) wins.
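
The rank-sum rule and its tie-break can be sketched in a few lines of
Python. The function name final_ranking and the dictionary inputs are
assumptions made for this illustration; the example reuses the Tom and
Jerry case from the text.

```python
def final_ranking(rank_c1, rank_c2):
    """Order participants by rank sum, breaking ties by Criterion 1 rank.

    rank_c1 and rank_c2 map each participant to a rank (1 = best) on
    Criterion 1 and Criterion 2 respectively.
    """
    return sorted(rank_c1, key=lambda p: (rank_c1[p] + rank_c2[p], rank_c1[p]))


# Tom: ranks 1 and 3 (sum 4); Jerry: ranks 2 and 2 (sum 4).
# The sums tie, so the better Criterion 1 rank decides.
order = final_ranking({"Tom": 1, "Jerry": 2}, {"Tom": 3, "Jerry": 2})
print(order)  # ['Tom', 'Jerry']
```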

Cash prizes sponsored by BCBSFL will be awarded: $1,000 to the best
performer, $500 to the second, and $250 to the third. The three winning
individuals or teams will also be invited to present their results at
the Fourth Annual Business Intelligence Symposium in Orlando, FL on
April 11, 2008. Award plates will be presented to the winners during the
symposium. The work can be completed by an individual or a group, but
only one member of a winning team will be invited to present at the
Symposium.


IMPORTANT DATES
February 08, 2008     Competition Announced
March 31, 2008        Submissions for Competition due by 5:00 pm
                      (Eastern Time)
April 02, 2008        Announcement of Winners
April 11-12, 2008     Fourth Annual Business Intelligence Symposium
                      in Orlando, FL



================================
Xiaogang Su, Ph.D.
Associate Professor / Undergraduate Coordinator 
Department of Statistics and Actuarial Science
University of Central Florida
Orlando, FL 32816
(407) 823-2940 [O]
xiaosu at mail.ucf.edu
http://pegasus.cc.ucf.edu/~xsu/


