adaptive scoring and half of questions wrong at any level?

This topic has 21 expert replies and 11 member replies
Ian Stewart GMAT Instructor
02 Jun 2008
Post Tue Jul 17, 2012 2:43 pm
I also wanted to reply to at least some of what tutorphd posted above, but nothing below will be of any value to GMAT test takers, so they should feel free to ignore this post!

A few points:

* First, the GMAT is not a test of pure mathematical ability, and no one claims that it is. The GMAT can best be described as a test of 'GMAT ability'. It turns out that GMAT ability is a very good predictor of success in MBA programs, which is why the test is used as part of the MBA admissions process. 'GMAT ability' is really a portfolio of skills - basic numeracy, logical reasoning, attention to detail, time management, ability to succeed under pressure, and so on.

* The conclusions you draw from your few experiments with GMATPrep seem to me to be substantially tainted by confirmation bias. I strongly disagree with many of your interpretations - for example, as I posted above, the fact that you can recover from a completely abysmal performance early in the test to achieve a Q47 score is evidence to me that the test offers you quite a bit of latitude to make errors on easy questions. You seem to have reached the opposite conclusion.

* In any case, you are making incorrect generalizations from isolated experiments. That is, you're drawing conclusions about the GMAT algorithm in general from a single trial using the GMATPrep exam - to paraphrase, you've said something like "I got several early questions wrong on this one GMATPrep test and got a low score, so getting early questions wrong on every GMAT test will always produce a low score". That is obviously not a logically sound conclusion to draw. The correct conclusion to draw is the following: if you get a lot of easy questions wrong on the GMAT, you will get a low score. It is irrelevant where those questions appear on the test - the algorithm has no idea if the easy questions you got wrong were early or late in the exam. Now, it is true that during the test, the algorithm uses an estimate of your ability as one factor in determining which question to give you next. So a bad performance early makes it more likely that you see easier questions later. But ability is just one variable that is used by the question selection algorithm. There are also content requirements (you need a certain number of questions on each topic) and security provisions (you won't be given a question which has been overused during a testing cycle), and because of these other factors, you can often see questions at any point in the test which are very far above or below your ability estimate. So even if you are doing very well, you can get an easy question late in your test, and that question can count toward your score. Getting that question wrong will hurt you just as much as getting an easy question wrong early in your test would.

* Similarly, your conclusions about strings of consecutive wrong answers are incorrect generalizations. It certainly is true that consecutive wrong answers normally are worse than 'spread out' wrong answers. That's because one wrong answer lowers your ability estimate, making it more likely your next question is easier, and getting that easier question wrong hurts you more than getting a hard question wrong. But because there are many variables at work in question selection, that conclusion is not universally true - sometimes if you get a question wrong, your next question might actually be harder, and getting that wrong won't hurt you so much. Or sometimes your next question will be experimental, and won't affect your score no matter what you do.

* Finally, while I agree with you that adaptive testing has its limitations, we cannot fairly discuss them without also acknowledging the limitations of classical testing. In its current form, the GMAT can produce a Quant scaled score in 1-point increments on a 6-51 scale. That is, it is able to produce scores on a 46-point scale. You might want to imagine how long a standard test, based only on your percentage of right and wrong answers, would need to be to produce that kind of resolution. Clearly you would need at least 46 questions, but on a multiple-choice test like the GMAT, where everyone should get 20% of questions right just by guessing, you actually need about 60 questions. Then if for each wrong answer you lose one point on the 6-51 scale, you could achieve the same resolution as the current GMAT. But then notice how sensitive the test becomes to careless mistakes. With just four wrong answers out of sixty questions, your score is capped at a 47. That is, if you use a classical testing paradigm, the test becomes far more sensitive to careless errors. You seem to think the opposite is true.

The great advantages of adaptive testing are that it is efficient - because the GMAT is adaptive, the test is less than half as long as a standard classical test - and that it offers much better resolution at the extreme ends of the scoring scale. An adaptive test can far more reliably distinguish between a Q48 test taker and a Q50 test taker than can a standard test.
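For anyone who wants to check the classical-test arithmetic in the last bullet above, here is a minimal sketch. It rests on one reading of that estimate, which is my assumption: only the share of questions above the 20% guessing floor helps separate test takers, so the test needs enough questions for that share to cover all 46 score levels. The function names are hypothetical and purely illustrative.

```python
import math

def classical_test_length(score_levels: int, n_choices: int) -> int:
    """Smallest question count whose above-chance range covers score_levels.

    Assumption: with n_choices options per question, a fraction 1/n_choices
    of the questions is answered correctly by pure guessing, so only the
    remaining (n_choices - 1)/n_choices of the test distinguishes examinees.
    """
    return math.ceil(score_levels * n_choices / (n_choices - 1))

def capped_score(wrong_answers: int, top_score: int = 51) -> int:
    """Best possible score if each wrong answer costs one scaled point."""
    return top_score - wrong_answers

levels = 51 - 6 + 1                         # the 6-51 scale has 46 distinct scores
needed = classical_test_length(levels, 5)   # five answer choices per question
print(needed)                               # 58, i.e. "about 60" questions
print(capped_score(4))                      # 47
```

Under that assumption the minimum comes out at 58 questions, which is in line with the "about 60" figure, and four careless errors cap the score at 47.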

If you are looking for online GMAT math tutoring, or if you are interested in buying my advanced Quant books and problem sets, please contact me at ianstewartgmat at gmail.com

lunarpower GMAT Instructor
03 Mar 2008
Post Tue Jul 17, 2012 10:50 pm
very well said.

You can ask Ron questions in Spanish
You can ask Ron questions in Italian
You can ask Ron questions in French
You can also ask Ron questions in Finnish


When you feel good in a piece of clothing, anything can happen. A good piece of clothing is a passport to happiness.

Yves Saint-Laurent


JPBarros010 Newbie | Next Rank: 10 Posts
14 Jul 2017
Post Fri Jul 14, 2017 3:19 am
OfficialGMAT wrote:
Hello! Our psychometrician team sent you this overview:

For each question, we estimate and scale the three parameters (location, shape, and pseudo-guessing) using the 3-parameter logistic model from examinees’ response data before the question is used operationally. At the test sites, after each question is administered on computer, we estimate the examinee’s ability using the maximum likelihood method, with the parameters of all the answered questions and the examinee’s response vector. The next question is selected to match the interim ability estimate in difficulty, and this continues until the end of the test. The final scores are the maximum likelihood estimates converted to our reporting scale.

If you have more questions, you can look for literature on computer adaptive testing in the measurement field.

Thank you!
Hi! Thanks for this information. Could you explain to us what "location, shape, and pseudo-guessing" mean exactly?
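
For context, in the standard 3-parameter logistic (3PL) model from the measurement literature, the three item parameters are usually called difficulty (b), discrimination (a), and guessing (c). Reading "location" as b, "shape" as a, and "pseudo-guessing" as c is my assumption about the overview quoted above, not something it states. The sketch below uses made-up item parameters and a plain grid search in place of whatever estimator is used operationally.

```python
import math

def p_correct(theta: float, a: float, b: float, c: float) -> float:
    """3PL probability that an examinee of ability theta answers correctly.

    b (location): the ability at which the curve sits halfway between c and 1.
    a (shape): how steeply the probability rises around b.
    c (pseudo-guessing): the floor a very weak examinee reaches by guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a response vector; a sum over the answered questions."""
    total = 0.0
    for (a, b, c), correct in zip(items, responses):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if correct else math.log(1.0 - p)
    return total

def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Maximum likelihood ability estimate by grid search (illustrative only)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical (a, b, c) values, not real GMAT item parameters.
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2), (1.5, 0.5, 0.2)]
responses = [True, True, False, True]

theta_hat = estimate_ability(items, responses)
# Same questions and answers, presented in the reverse order.
theta_reversed = estimate_ability(items[::-1], responses[::-1])
print(theta_hat, theta_reversed)
```

Because the log-likelihood is a sum over the answered questions, reordering the same questions and answers leaves the estimate essentially unchanged. In a real adaptive test the order does influence which questions are selected next, so this only shows that the estimate itself does not care where in the test a given question appeared.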


