Part 2 of 3. CAT FAQ: Intermediate

David is the Vice President of Research at Knewton. Learn more about the company's GMAT course or read Knewton articles on BTG.

This is a follow-up to my last piece, CAT FAQ: Beginner.

1. My score doesn’t seem to match my performance: I only got a few questions wrong, but my score isn’t as high as I thought it would be / I got a bunch of questions wrong, yet my score seems higher than it should be.

Most exams are linear assessments, like the SAT or your 10th grade history final. These are scored by counting the number of questions you answer correctly, and sometimes by penalizing for each question you answer incorrectly. The result, a raw score, is then converted to a scaled score, like the 600-2400 range for the SAT.

A computer-adaptive test (CAT) works very differently. It cares less about how many questions you get right or wrong than about which questions you get right and wrong. The CAT algorithm estimates your ability based on a variety of criteria, including the difficulty of each question. After each question, it evaluates your response and updates this estimate. When the test is over, the algorithm converts your quantitative and verbal ability estimates into the quantitative and verbal scaled scores. Separately, it combines those same two ability estimates to calculate the overall score.
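To make the "which questions, not how many" point concrete, here is a minimal sketch of an ability estimate. It assumes a simple Rasch (one-parameter logistic) model and a standard normal prior, which are common textbook choices and not the GMAT's actual, unpublished algorithm. The function names, difficulty values, and the two test takers are all hypothetical.

```python
import math

def p_correct(theta, difficulty):
    """Rasch-model probability of a correct answer (an assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_ability(responses):
    """Posterior-mean ability on a grid from -4 to 4, with a standard normal prior.

    responses: list of (difficulty, answered_correctly) pairs.
    """
    grid = [i / 10.0 for i in range(-40, 41)]
    weights = []
    for theta in grid:
        weight = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty)
            weight *= p if correct else 1.0 - p
        weights.append(weight)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# Hypothetical test takers: both answer 3 of 5 correctly, on different questions.
saw_hard_items = [(0.5, True), (1.0, True), (1.5, True), (2.0, False), (2.5, False)]
saw_easy_items = [(-2.5, True), (-2.0, True), (-1.5, True), (-1.0, False), (-0.5, False)]

print(round(estimate_ability(saw_hard_items), 2))
print(round(estimate_ability(saw_easy_items), 2))
```

Both test takers have the same raw score, but the one who earned it on harder questions ends up with the higher ability estimate.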

2. Do the first X number of questions matter more?

Many variables come into play when the CAT selects your next question. One of them is the CAT’s current estimate of your ability. It uses this estimate to select the questions that will be most useful in refining that estimate: if you’re a high-performing student, low-difficulty questions usually reveal less about your true ability than harder ones, and vice versa. What is important to remember is that you should not try to guess how you are doing by whether the question in front of you seems easy or difficult; every question deserves your full attention. With that understood, unless you have completely bombed the test, missing a couple of very hard questions late in the test will usually have a smaller effect on your final score than missing a couple of very easy questions earlier, not because of their position within the test but because of their levels of difficulty.
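One way to picture "most useful in refining that estimate" is the notion of item information from item response theory. The sketch below again assumes a Rasch model, under which a question is most informative when its difficulty is close to the current ability estimate. The real CAT weighs many other variables, so treat `pick_next_item` and the question pool as hypothetical illustrations only.

```python
import math

def p_correct(theta, difficulty):
    """Rasch-model probability of a correct answer (an assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_information(theta, difficulty):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = p_correct(theta, difficulty)
    return p * (1.0 - p)

def pick_next_item(theta_estimate, pool):
    """Return the difficulty in the pool that is most informative at the estimate."""
    return max(pool, key=lambda d: item_information(theta_estimate, d))

# Hypothetical pool of question difficulties.
pool = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(pick_next_item(1.8, pool))   # strong test taker: a hard question is chosen
print(pick_next_item(-1.2, pool))  # weaker test taker: an easier question is chosen
```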

3. How severe is the penalty for not finishing a section?

The penalty is significant. You can expect your scaled score to decrease by roughly 1 point for every question that you don’t answer. For example, if you correctly answer every question you encounter but fail to answer the last five, you generally won’t score higher than a 46.
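The arithmetic behind that example can be written out as a rough rule of thumb. This sketch assumes the 6-51 section scale and the "roughly 1 point per unanswered question" approximation; the function name is hypothetical and this is not the official scoring procedure.

```python
MAX_SECTION_SCORE = 51  # top of the 6-51 section scale
MIN_SECTION_SCORE = 6   # bottom of the 6-51 section scale

def approximate_score_ceiling(unanswered_questions):
    """Rough upper bound on a section score, at about 1 point lost per unanswered question."""
    return max(MIN_SECTION_SCORE, MAX_SECTION_SCORE - unanswered_questions)

print(approximate_score_ceiling(5))
```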

4. I took the GMAT and got a 710, 44q/44v/6 AWA. A friend of mine happened to take the test 6 days later and got the exact same quant/verbal scaled scores, but he got a 720. How could this happen?

Both the individual section scores and the overall score are calculated using an estimate of your Math and Verbal abilities derived from your performance on the CAT. Your overall score is not calculated from your section scores. Because your underlying ability estimate might be slightly different from your friend’s, your overall scores might be different. For example, there is a range of ability estimates that translates into a Verbal score of 40, and there is a range of ability estimates that translates into a Math score of 42. Depending on which specific estimate is calculated for you, your overall score could range from 660 to 680. Please note that the Standard Error of Measurement (SEM) on the overall score for the GMAT is 29 points, so scores of 660 – 680 all fall within the standard error.
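A toy illustration of how identical section scores can sit under different overall scores: the section scores are rounded from the underlying estimates, while the overall score is computed from the unrounded values. The linear formula in `toy_total_score` is purely hypothetical and chosen only to span 200 to 800; the real conversion is not public and uses a different internal scale. The two sets of estimates are made up as well.

```python
def section_scores(quant_estimate, verbal_estimate):
    """Reported section scores: the underlying estimates rounded to whole numbers."""
    return round(quant_estimate), round(verbal_estimate)

def toy_total_score(quant_estimate, verbal_estimate):
    """HYPOTHETICAL overall score: a linear map of the unrounded estimates onto
    200-800, reported in steps of 10. Not the real GMAT conversion."""
    raw = 200 + 600 * (quant_estimate + verbal_estimate - 12) / 90
    return int(round(raw / 10.0)) * 10

# Two made-up test takers whose estimates round to the same section scores.
you = (41.6, 39.6)
friend = (42.4, 40.4)

print(section_scores(*you), toy_total_score(*you))
print(section_scores(*friend), toy_total_score(*friend))
```

Both test takers report the same rounded section scores, yet the toy overall scores differ because they are built from the unrounded estimates.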

Check back next week for part 3. Until then, do your homework!

4 Comments

  1. Is GMATPrep Practice Test 1 and 2 CAT or just normal questions?

    • I don't think it's a CAT because the database of the test is too small to judge one's performance. Also, in the CAT (actual exam), the percentage of people who answered the question correctly is also taken into account, which cannot be done on the GMATPrep CD. Experts, plz correct me if I am wrong :)

    • It is indeed a CAT - the database of questions in GMATPrep is actually quite large, and the test uses an algorithm similar to (perhaps identical to) the real GMAT scoring algorithm.

  2. The mention of Standard Error in section 4 strikes me as a potentially confusing non sequitur. Test takers can achieve the same Quant and Verbal Scaled Scores, but can have different Total Scores out of 800, because Scaled Scores are rounded off to whole number values, whereas the Total Score is computed using non-rounded values. That is, a test taker with a 42Q/40V might have had 'true' scaled scores of (using just one decimal place) 41.7Q/39.7V, or of 42.3Q/40.3V, which could produce slightly different Total Scores out of 800 (the test actually doesn't use the 6-51 scale internally to measure Quant and Verbal ability - those Scaled Scores are translated from the internal values, which are on a different scale altogether, at the conclusion of the test - but I've ignored that fact for clarity).

    This has nothing to do with Standard Error of Measurement. Standard Error measures how close actual test scores are to the scores that would be produced by an 'ideal' test -- one that is infinitely long. To put it another way, if a test taker could take 10,000 GMATs in one day, the Standard Error of Measurement tells you how close together the test taker's scores would be. Actual GMAT scores are influenced by several factors unrelated to the test taker's 'true ability' - how lucky a test taker is when he or she guesses, for example, along with test day conditions. A longer test, and one with better questions (questions close in level to the ability of the test taker and with good 'discrimination') will have a lower Standard Error than a short test, or one with bad questions. The GMAT has a Standard Error of about 29 points, which means that (simplifying things) more than two thirds of the time, your score is within 29 points of what you 'deserve' - i.e. what you would get if luck and other factors did not affect your performance.

    The passage above seems to suggest (to my reading at least) that Standard Error is related to some kind of 'rounding error'; it's not. It's determined by the amount of information the 78 questions on the GMAT give the algorithm about the test taker's ability.

The author David Kuntz gets email notifications for all questions or replies to this post