After reading bits and pieces of the Dec. 13 waiting thread, I noticed that there was some debate as to how exactly the LSAT is equated. I'm more curious about the process in general than I am interested in obsessing over a speculative curve for this particular test, so bear with me for a moment of geekiness.

Historically, the December LSAT has always had a more lenient scale. A few people suggest this is because the pool of test takers is less competitive.
This explanation doesn't entirely make sense, though, because the test is generally thought to be equated *before* it is even administered (i.e., based on previously administered experimental sections).

With that said, it is quite possible that the test makers tweak a predetermined scale post-administration if there is a discrepancy between expected and actual scores. Discrepancies of this sort might be explained by:

a) Individual sections on the actual test are not composed of whole experimental sections from the past, but rather of a combination of questions. (E.g., a necessary assumption question from an experimental section administered in year A produced an appropriate ratio of right to wrong answers and was combined with a parallel question from an experimental section administered in year B that also produced an appropriate ratio.) If this "combining" process is assumed for an entire section, there is a risk that something superficial about the questions does not produce the expected results when they are combined in a different context than the one in which they were originally administered. For example, let's say a question that ends up on the actual test is about meteorites, and five questions earlier there was a confusing question on a similar topic. Test takers are then collectively "primed" with certain expectations/anxieties and end up doing worse on that question than was expected. This kind of effect could then prompt adjustments to the predetermined scale post-administration.

b) In the second scenario, tests are composed of a combination of *whole* past experimental sections. This could create a problem similar to the one described above, but on a broader level.
Ex. LR 1, LR 2, RC, and LG, each taken from the experimental section of a different administration, produce an appropriate *overall* scale when combined. However, something subjective about that combination throws test takers off (or benefits them), causing the scale to be adjusted post-administration.

Theoretically, if a pool of test takers *is* weak enough to produce statistically significant variations of this kind, the measured difficulty of the test can be slightly distorted, because the pool's generally lower skill level cannot be distinguished from the actual "hardness" of the test.


Here's how LSAT scores are determined:

The actual raw score ---> scaled score conversion is pre-equated, meaning that the conversion has already been decided before the test is administered publicly. Contrary to popular belief, the conversion itself has absolutely nothing to do with how the group of test takers did on the whole.

Percentile rank, on the other hand, is entirely based on how everyone did as a group. But it's based on the last 3 years of test takers, NOT just the ones who took that particular LSAT.
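
To make the distinction concrete, here is a rough Python sketch of the two steps described above. Everything in it is an assumption for illustration: the conversion table, the pooled scores, and the function names are made up, and the percentile definition used (percent of pooled scores strictly below yours) is just one common convention, not necessarily the exact one LSAC uses.

```python
from bisect import bisect_left

# HYPOTHETICAL conversion table, fixed before the test is administered.
# Each pair is (minimum raw score, scaled score); real tables differ.
CONVERSION = [(0, 120), (40, 140), (60, 150), (75, 160), (90, 170), (99, 180)]


def scaled_score(raw):
    """Look up a scaled score from the pre-set table.

    Note that no information about the test-day group is used here.
    """
    result = CONVERSION[0][1]
    for min_raw, scaled in CONVERSION:
        if raw >= min_raw:
            result = scaled
    return result


def percentile_rank(scaled, pooled_scores):
    """Percent of pooled scaled scores strictly below `scaled`.

    `pooled_scores` stands in for several years of test takers,
    not just the people who sat one particular administration.
    """
    ordered = sorted(pooled_scores)
    below = bisect_left(ordered, scaled)
    return 100.0 * below / len(ordered)


# HYPOTHETICAL pooled scaled scores from three years of administrations.
pool = [140, 150, 150, 160, 160, 160, 170, 170, 180, 140]

my_scaled = scaled_score(78)
print(my_scaled, percentile_rank(my_scaled, pool))
```

In this sketch, a weaker or stronger group on test day changes nothing in `scaled_score`; it would only nudge the percentile, and only to the extent that it shifts the multi-year pool.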

LSAC publishes a lot of research on their scoring system, and it's public knowledge that they pre-equate. It's just not common knowledge, unfortunately. (Most LSAT instructors I know don't actually know this is how it works.)

