The experiment: This started as trying to answer the question "what do the decisions from one school tell me about other schools?" Basically I am playing around with machine learning models for classification. I also wanted to see if there is anything we can learn from the models themselves.
The data: 19271 myLSN users from the 05-06 cycle to the present who applied to and heard back from T13 schools. For every past cycle, applications still marked 'Pending' were treated as dings; current-cycle 'Pending' applications were ignored. I am looking at the initial decision, so both 'WL Accept' and 'WL Reject' are counted as WL.
Weka is used for the classification.
Evaluation: 10-fold cross-validation.
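For anyone unfamiliar with the evaluation scheme, here is a rough sketch of k-fold cross-validation. Weka handles this internally; the majority-class baseline below is just a stand-in for a real classifier, and the data is invented, not LSN data:

```python
# Minimal sketch of k-fold cross-validation using only the stdlib.
# A trivial majority-class "model" stands in for the real classifiers.
from collections import Counter

def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    fold_size, rem = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = fold_size + (1 if i < rem else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(labels, k=10):
    """Accuracy of a majority-class baseline under k-fold CV."""
    folds = k_fold_indices(len(labels), k)
    correct = 0
    for test in folds:
        test_set = set(test)
        # Train on everything outside the held-out fold.
        train = [labels[i] for i in range(len(labels)) if i not in test_set]
        majority = Counter(train).most_common(1)[0][0]
        # Score the held-out fold against the majority-class prediction.
        correct += sum(1 for i in test if labels[i] == majority)
    return correct / len(labels)
```

In a real run each fold's model is retrained from scratch, exactly as above, so no fold is ever scored by a model that saw it during training.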
First pass: I looked at OneR, which tries to find the single attribute whose one rule best classifies the data. This is very simplistic, but you always start with the simplest thing that could work, and I found the results interesting.
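As a rough sketch of what OneR does (my runs used Weka's implementation; the code and toy data below are illustrative only, with categorical attributes assumed already discretized):

```python
# Sketch of OneR: for each attribute, map each of its values to the
# most frequent class for that value, then keep the attribute whose
# rule makes the fewest errors on the training data.
from collections import Counter, defaultdict

def one_r(rows, target):
    """rows: list of dicts sharing the same keys; target: class attribute.
    Returns (best_attribute, value->class rule, training accuracy)."""
    best = None
    for attr in rows[0]:
        if attr == target:
            continue
        # Count class frequencies for each value of this attribute.
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[attr]][row[target]] += 1
        # The rule predicts the majority class for each attribute value.
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        correct = sum(1 for row in rows if rule[row[attr]] == row[target])
        acc = correct / len(rows)
        if best is None or acc > best[2]:
            best = (attr, rule, acc)
    return best

# Invented toy data, NOT the LSN numbers: predict Yale from Stanford.
toy = [
    {'stanford': 'Accept', 'yale': 'Accept'},
    {'stanford': 'Accept', 'yale': 'Accept'},
    {'stanford': 'Reject', 'yale': 'Reject'},
    {'stanford': 'Reject', 'yale': 'Reject'},
    {'stanford': 'Reject', 'yale': 'Accept'},
]
attr, rule, acc = one_r(toy, 'yale')
```

Weka's OneR also discretizes numeric attributes by bucketing them, which is how rules like the LSAT thresholds below come out of it.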
- Yale - Best predictor: Stanford's result. The rules 'Stanford Accept -> Yale Accept' and 'Stanford WL or Stanford Reject -> Yale Reject' correctly predict the Yale result for 78.4804% of users.
- Harvard - Best predictor: Yale's result. The rules 'Yale Accept or Yale WL -> Harvard Accept' and 'Yale Reject -> Harvard Reject' correctly predict the Harvard result for 67.9604% of users.
- Stanford - Best predictor: Yale's result. The rules 'Yale Accept or WL -> Stanford Accept' and 'Yale Reject -> Stanford Reject' correctly predict the Stanford result for 75.5683% of users.
- Columbia - Best predictor: NYU's result. The rules 'NYU Accept -> Columbia Accept', 'NYU WL -> Columbia WL', and 'NYU Reject -> Columbia Reject' correctly predict the Columbia result for 65.1188% of users.
- Chicago - Best predictor: NYU's result. The rules 'NYU Accept -> Chicago Accept', 'NYU WL -> Chicago WL', and 'NYU Reject -> Chicago Reject' correctly predict the Chicago result for 61.8695% of users.
- NYU - Best predictor: LSAT score. The rules 'LSAT < 170.5 -> NYU Reject' and 'LSAT >= 170.5 -> NYU Accept' correctly predict the NYU result for 66.0256% of users.
- Penn - Best predictor: UVA's result. The rules 'UVA Accept -> Penn Accept', 'UVA WL -> Penn WL', and 'UVA Reject -> Penn Reject' correctly predict the Penn result for 56.84% of users.
- Duke - Best predictor: LSAT score. The rules 'LSAT < 168.5 -> Duke Reject' and 'LSAT >= 168.5 -> Duke Accept' correctly predict the Duke result for 61.8403% of users.
- Berkeley - Best predictor: Harvard's result. The rules 'Harvard Accept or WL -> Berkeley Accept' and 'Harvard Reject -> Berkeley Reject' correctly predict the Berkeley result for 77.3727% of users.
- UVA - Best predictor: Penn's result. The rules 'Penn Accept -> UVA Accept', 'Penn WL -> UVA WL', and 'Penn Reject -> UVA Reject' correctly predict the UVA result for 54.14% of users.
- Michigan - Best predictor: Duke's result. The rules 'Duke Accept -> Michigan Accept', 'Duke WL -> Michigan WL', and 'Duke Reject -> Michigan Reject' correctly predict the Michigan result for 55.3852% of users.
- Northwestern - Best predictor: NYU's result. The rules 'NYU Accept or WL -> Northwestern Accept' and 'NYU Reject -> Northwestern Reject' correctly predict the Northwestern result for 59.6167% of users.
- Cornell - Best predictor: Michigan's result. The rules 'Michigan Accept or WL -> Cornell Accept' and 'Michigan Reject -> Cornell Reject' correctly predict the Cornell result for 60.693% of users.
Discussion: For almost every school, there is another school whose decision predicts at least 55% of that school's decisions, and in some cases much more (Harvard's decision predicts 77% of Berkeley results). It is interesting that the LSAT is a better predictor for NYU and Duke than any other school's decision is, but this shouldn't be read as saying that the LSAT is the only thing those schools care about: LSAT and GPA are correlated for the majority of applicants, and law school applicants generally have strong undergraduate GPAs.
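If anyone wants to check the LSAT/GPA correlation claim on their own export of the data, a plain Pearson correlation is enough; this sketch uses only the standard library, and the example numbers are invented:

```python
# Pearson correlation coefficient, stdlib only.
import math

def pearson(xs, ys):
    """Correlation in [-1, 1]; assumes equal-length, non-constant lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (LSAT, GPA) pairs, purely illustrative:
lsats = [165, 168, 170, 172, 175]
gpas = [3.4, 3.5, 3.7, 3.6, 3.9]
r = pearson(lsats, gpas)
```

A value of `r` near +1 would mean the two credentials move together, which is exactly why a single-attribute rule on LSAT can look deceptively strong.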
I haven't done any error analysis or dug into these results in depth, but I will on request.
Please let me know if there is anything you would like to see from the models.
There is more to come: the next set of results will be from Naive Bayes models (because they seem to work best) and from decision trees (because they are the most human-readable).