LSN raw data

Posted: Sun Jan 30, 2011 2:39 pm
by duckmoney
Does anyone know if it's possible to get raw data from LSN into an easily manipulable format (spreadsheet / excel / CSV)?

It seems like there's a lot of information out there that we've never delved into, but it's hard to see more than individual users and the graph.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:06 pm
by joebloe
Scraping the data would probably be fairly easy, but then you run into all those nasty problems you get with unauthorized scrapes. If you've got a legit, interesting use for the data, you might get what you want by asking.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:07 pm
by duckmoney
Any idea where contact info is for the LSN admins?

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:10 pm
by joebloe
duckmoney wrote:Any idea where contact info is for the LSN admins?
http://lawschoolnumbers.com/?about&a=contact

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:14 pm
by duckmoney
ty

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:27 pm
by ahduth
duckmoney wrote:Does anyone know if it's possible to get raw data from LSN into an easily manipulable format (spreadsheet / excel / CSV)?

It seems like there's a lot of information out there that we've never delved into, but it's hard to see more than individual users and the graph.
I'd be curious to see what I could get out of this too. For example, during that little mini-spat we had in the CLS thread, it seemed (at least to me) that LSN had 25/50/75 numbers that were off from the admitted class (they were higher). Read from that what you will, but if I could shove all this data in an Access database, I could crunch out any kind of report y'all wanted. Although the data needs to be scrubbed pretty heavily (missing LSAT/GPA/dates make a lot of the records invalid, even before you consider data submission inaccuracies).
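For what it's worth, the scrub-then-crunch step could look roughly like the sketch below. Everything in it is an assumption for illustration: the column names (`lsat`, `gpa`, `decision_date`), the sample rows, and the 4.33 GPA ceiling are made up, since nobody in the thread has seen what an LSN export actually looks like. It drops records with a missing or implausible LSAT/GPA/date, then computes 25/50/75 numbers on what's left.

```python
import csv
import io
from statistics import quantiles

# Hypothetical export: column names and values are invented for illustration.
SAMPLE = """user,lsat,gpa,decision_date
a,172,3.80,2010-12-01
b,,3.55,2011-01-10
c,168,,2011-01-15
d,175,3.91,
e,165,3.40,2010-11-20
f,170,3.70,2011-01-05
g,178,3.95,2010-12-18
"""

def clean_rows(text):
    """Keep only records with a plausible LSAT, a plausible GPA, and a date."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            lsat = int(row["lsat"])
            gpa = float(row["gpa"])
        except ValueError:
            continue  # missing or non-numeric LSAT/GPA
        if not row["decision_date"]:
            continue  # missing date
        # LSAT is scored 120-180; the 4.33 GPA ceiling is an assumption.
        if not (120 <= lsat <= 180 and 0.0 <= gpa <= 4.33):
            continue
        kept.append({"user": row["user"], "lsat": lsat, "gpa": gpa})
    return kept

def quartiles(values):
    """Return the (25th, 50th, 75th) percentiles of a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3

rows = clean_rows(SAMPLE)
print(len(rows), "usable records")
print("LSAT 25/50/75:", quartiles([r["lsat"] for r in rows]))
```

Note how quickly the record count shrinks: three of the seven made-up rows are thrown out for a single missing field, which is the scrubbing problem in miniature.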

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:30 pm
by duckmoney
That was basically my intention; there's potentially a wealth of data out there on the accuracy and reporting bias of LSN that could make its results much more useful to all of us. I emailed them and I'll let you know if I get anything back.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:35 pm
by r6_philly
ahduth wrote:
duckmoney wrote:Does anyone know if it's possible to get raw data from LSN into an easily manipulable format (spreadsheet / excel / CSV)?

It seems like there's a lot of information out there that we've never delved into, but it's hard to see more than individual users and the graph.
I'd be curious to see what I could get out of this too. For example, during that little mini-spat we had in the CLS thread, it seemed (at least to me) that LSN had 25/50/75 numbers that were off from the admitted class (they were higher). Read from that what you will, but if I could shove all this data in an Access database, I could crunch out any kind of report y'all wanted. Although the data needs to be scrubbed pretty heavily (missing LSAT/GPA/dates make a lot of the records invalid, even before you consider data submission inaccuracies).
Real number crunchers don't use Access ;)

I just don't have the energy/reason to do this. I can have the entire LSN dataset in a SQL database in 5 minutes.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:36 pm
by r6_philly
duckmoney wrote:That was basically my intention; there's potentially a wealth of data out there on the accuracy and reporting bias of LSN that could make its results much more useful to all of us. I emailed them and I'll let you know if I get anything back.
Read the user agreement and see what kind of licensing LSN uses. You may not need permission.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:39 pm
by ahduth
duckmoney wrote:That was basically my intention; there's potentially a wealth of data out there on the accuracy and reporting bias of LSN that could make its results much more useful to all of us. I emailed them and I'll let you know if I get anything back.
Right. For Columbia 2009, LSN has 56 people declaring their intent to attend for a class of 350+ (I forget the exact number). The degree to which it should be relied on for hard analysis of trends in schools' admissions patterns is a bit suspect in my mind.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:40 pm
by r6_philly
It's not a random sample.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:41 pm
by ahduth
r6_philly wrote:
ahduth wrote:
duckmoney wrote:Does anyone know if it's possible to get raw data from LSN into an easily manipulable format (spreadsheet / excel / CSV)?

It seems like there's a lot of information out there that we've never delved into, but it's hard to see more than individual users and the graph.
I'd be curious to see what I could get out of this too. For example, during that little mini-spat we had in the CLS thread, it seemed (at least to me) that LSN had 25/50/75 numbers that were off from the admitted class (they were higher). Read from that what you will, but if I could shove all this data in an Access database, I could crunch out any kind of report y'all wanted. Although the data needs to be scrubbed pretty heavily (missing LSAT/GPA/dates make a lot of the records invalid, even before you consider data submission inaccuracies).
Real number crunchers don't use Access ;)

I just don't have the energy/reason to do this. I can have the entire LSN dataset in a SQL database in 5 minutes.
I, sir, am a financial analyst. I have had the pro-/anti-Access fight many, many times, and am forever willing to defend its utility to non-programmers such as myself. En garde!

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:42 pm
by ahduth
r6_philly wrote:It's not a random sample.
That's the point though - it's often quoted on TLS as though it is.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:46 pm
by r6_philly
ahduth wrote:
r6_philly wrote:It's not a random sample.
That's the point though - it's often quoted on TLS as though it is.
I don't think stats are required for every major. If the sample is relevant and representative, then the "most popular" law school list should correspond with the list of schools by total number of applications.

The dates are useful, and you can view people's decision making process. But those functions work well enough, so I don't see a point in remaking it. Using LSN for chance estimation is too much of a stretch.

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:50 pm
by duckmoney
I'd really like to see the actual extent of reporting bias - for instance, how many 170+s report as a fraction of the LSN population relative to the total population. And then you break this down by schools. And then you may have a better idea of your chances based on LSN.

I'd also like to see an analysis of how closely people's numbers actually correspond to their acceptance rates and see which schools are more numbers based and which tend to favor softs more (more accurately than by word of mouth anyway).
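To make the 170+ comparison concrete, here is a minimal sketch of the arithmetic. The LSN scores and the population share below are placeholders, not real figures, and the function names are just ones picked for this example. The idea is to compute the share of 170+ scorers in the LSN sample and divide it by the share in the whole applicant pool; a ratio well above 1 means high scorers are over-represented on LSN.

```python
def share_at_or_above(scores, cutoff):
    """Fraction of scores that are at or above the cutoff."""
    return sum(1 for s in scores if s >= cutoff) / len(scores)

def representation_ratio(sample_share, population_share):
    """How many times more common a group is in the sample than overall."""
    return sample_share / population_share

# Made-up LSN-style sample of reported LSAT scores.
lsn_scores = [172, 168, 175, 165, 170, 178, 160, 171]

# Placeholder share of all applicants scoring 170+; substitute a real figure.
population_share_170 = 0.025

sample_share_170 = share_at_or_above(lsn_scores, 170)
ratio = representation_ratio(sample_share_170, population_share_170)
print(f"LSN share 170+: {sample_share_170:.3f}")
print(f"Over-representation: {ratio:.1f}x")
```

Running the same two functions on each school's subset of users would give the per-school breakdown.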

Re: LSN raw data

Posted: Sun Jan 30, 2011 4:56 pm
by r6_philly
duckmoney wrote:I'd really like to see the actual extent of reporting bias - for instance, how many 170+s report as a fraction of the LSN population relative to the total population. And then you break this down by schools. And then you may have a better idea of your chances based on LSN.

I'd also like to see an analysis of how closely people's numbers actually correspond to their acceptance rates and see which schools are more numbers based and which tend to favor softs more (more accurately than by word of mouth anyway).
A biased distribution could be very, very far from the real distribution, especially when we don't understand the bias. No data is better than flawed data.

Re: LSN raw data

Posted: Sun Jan 30, 2011 5:06 pm
by duckmoney
r6_philly wrote:
duckmoney wrote:I'd really like to see the actual extent of reporting bias - for instance, how many 170+s report as a fraction of the LSN population relative to the total population. And then you break this down by schools. And then you may have a better idea of your chances based on LSN.

I'd also like to see an analysis of how closely people's numbers actually correspond to their acceptance rates and see which schools are more numbers based and which tend to favor softs more (more accurately than by word of mouth anyway).
A biased distribution could be very, very far from the real distribution, especially when we don't understand the bias. No data is better than flawed data.
This is true, but I'd like to be able to understand the bias. We know the real distribution, so if we can break down the sample distribution and compare it to the real one, we could begin to understand the LSN data.
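One way to sketch the "compare the sample to the real distribution" idea is simple reweighting by LSAT band: each LSN record gets a weight equal to its band's share of the real population divided by that band's share of the sample. The bands, the sample records, and the population shares below are all invented for illustration. This only corrects for imbalance on the variable you band by; whatever makes someone within a band more likely to post on LSN is left untouched.

```python
from collections import Counter

def band(lsat):
    """Assign an LSAT score to a coarse band (band edges are arbitrary)."""
    if lsat >= 170:
        return "170+"
    if lsat >= 160:
        return "160-169"
    return "<160"

# Made-up LSN-style records: (lsat, admitted)
sample = [
    (172, True), (175, True), (171, False),
    (168, True), (165, False), (162, False),
    (158, False), (155, False),
]

# Placeholder population shares per band; they must sum to 1.
population_share = {"170+": 0.1, "160-169": 0.3, "<160": 0.6}

# Weight = population share of the band / sample share of the band.
counts = Counter(band(lsat) for lsat, _ in sample)
weights = {b: population_share[b] / (counts[b] / len(sample)) for b in counts}

raw_rate = sum(admitted for _, admitted in sample) / len(sample)
weighted_rate = (
    sum(weights[band(lsat)] * admitted for lsat, admitted in sample)
    / sum(weights[band(lsat)] for lsat, _ in sample)
)

print(f"raw admit rate:      {raw_rate:.3f}")
print(f"weighted admit rate: {weighted_rate:.3f}")
```

With these invented numbers the weighted rate comes out well below the raw one, because the over-represented 170+ band is weighted down.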

Re: LSN raw data

Posted: Sun Jan 30, 2011 5:10 pm
by r6_philly
duckmoney wrote:
This is true, but I'd like to be able to understand the bias. We know the real distribution, so if we can break down the sample distribution and compare it to the real one, we could begin to understand the LSN data.
No one is going to understand the bias because we don't know how they are being referred to LSN, and what would make someone sign up for an LSN account and share the data. Think of political polling: it's only accurate when you don't use any subdivisions. Even then, a lot of cell phone users are cut out, which skews the data.

Why do you think all internet polls have a disclaimer that they're for entertainment purposes only? They are not representative of the whole population by any stretch.

Again, having flawed data (without really knowing why, or how flawed it is) is very, very bad.

Re: LSN raw data

Posted: Sun Jan 30, 2011 8:26 pm
by joebloe
r6_philly wrote:
duckmoney wrote:That was basically my intention; there's potentially a wealth of data out there on the accuracy and reporting bias of LSN that could make its results much more useful to all of us. I emailed them and I'll let you know if I get anything back.
Read the user agreement and see what kind of licensing LSN uses. You may not need permission.
I was actually going to point OP to that initially, but I couldn't even find a TOS at LSN. Maybe there's one when you sign up for an account, but I'm not seeing one browsing around.

Not that the absence of one implies permission to scrape the data.