adaptive scoring and half of questions wrong at any level?


GMAT/MBA Expert

lunarpower: GMAT Instructor; Posts: 3380; Joined: Mon Mar 03, 2008 1:20 am; Thanked: 2256 times; Followed by:1535 members; GMAT Score:800

by lunarpower » Sat Jul 14, 2012 12:21 am

tutorphd:

1/
your post contains much sound and (especially) fury. a great deal of this "sound" seems to indicate that you haven't read any of the literature about how "item response theory" actually works.
if you haven't, you should; you'll find many, possibly all, of the answers you're looking for.

2/
you are referring to percentiles as though those are assigned by the algorithm. they aren't; the algorithm just gives you a scaled score.
the percentiles, like all percentiles everywhere, are a function of the test-taking population. the gmat population happens to have a huge component of tech / engineer types whose first language isn't english (like the majority of posters right here on this board). therefore, the math percentiles are skewed downward; the verbal ones, upward.
as a result, it's not surprising at all that "1 in 12 people did better than" level xxxxxx -- just as it wouldn't be surprising if 1 in 12 people were over 6'6" in a sample of which 50% were college basketball players.

3/
you should stop for a moment and consider whether your posts are sufficiently professional in tone. (the other posts on this thread are a useful yardstick.)
the good news about the internet is that there's a "backspace" key -- oh how we all wish there were one in real life! the bad news, of course, is that it's all the more jarring when that backspace key should be used but isn't.

Ron has been teaching various standardized tests for 20 years.

--

You can ask Ron questions in Spanish
You can ask Ron questions in Italian
You can ask Ron questions in French
You can also ask Ron questions in Finnish

--

When you feel good in a piece of clothing, anything can happen. A good piece of clothing is a passport to happiness.

Yves Saint-Laurent

--


tutorphd: Master | Next Rank: 500 Posts; Posts: 126; Joined: Sun Jun 24, 2012 10:11 am; Location: Chicago, IL; Thanked: 36 times; Followed by:7 members

by tutorphd » Sat Jul 14, 2012 1:28 am

1. I don't have to read any 'literature' to know that a mathematical theory works in practice only when its assumptions are met. The assumptions in the 'adaptive scoring' are that errors are indicative of the test-taker's level and that there is a pristine correlation between a candidate's abilities in different math sub-areas.

As a practicing tutor I know that BOTH assumptions are completely wrong: (1) smart test takers do make dumb mistakes quite often because the GMAT is a rushed, nerve-racking exam and (2) often there is no correlation between abilities in different math areas tested on the GMAT.

So much for the 'nice theory of item response'. I guess the other tests like the GRE, SAT and ACT have not read the item response theory either, because they keep the old non-adaptive format.

2. You missed the point about the test with 92%. According to GMAT, 1 in 12 people will get a better score than that particular test I simulated. I am aware of what a percentile means (thanks for the lecture though). I question that percentile's truthfulness as a skill measure due to the way the score was obtained. My claim was that the test taker was penalized just because he made 2+2 errors in the beginning, something that happens pretty often in real life, and was not given a second chance later, while some of the 1 in 12 'better' test takers most probably made about 6-7 errors scattered around the middle and end of the test.

So my claim is that, due to the skewed adaptive algorithm, test takers who tend to make errors earlier in the test are underestimated in comparison to ones who don't. Now how probable is that to happen? From my experience, everybody underperforms in the beginning and end of the test. How probable is that to happen in two questions in a row? As probable as two questions not in a row LOL

In that light, the old 'myth' that a test taker should be especially careful with the first 10 problems is actually reality.

In a real truthful test, covering all topics uniformly, with problems from lowest to highest difficulty in each topic, such a problem would not arise. But who am I to argue with the 'item response theory', which is so much better for scoring because it is based on probabilistic assumptions that nobody has ever bothered to check. Yeah, let's use likelihood instead of a simple old-fashioned test, it's so much fancier LOL

3. I hope you aren't trying to tell me that YOUR posts are a useful yardstick for a professional tone? Because to me they sound like white noise and don't offer an actual scientific discussion of the facts presented but simply your condescending brain-washed opinion.

Skype / Chicago quant tutor in GMAT / GRE
https://gmat.tutorchicago.org/


GMAT/MBA Expert

lunarpower: GMAT Instructor; Posts: 3380; Joined: Mon Mar 03, 2008 1:20 am; Thanked: 2256 times; Followed by:1535 members; GMAT Score:800

by lunarpower » Sat Jul 14, 2012 2:38 am

tutorphd wrote: 1. I don't have to read any 'literature' to know that a mathematical theory works in practice only when its assumptions are met. The assumptions in the 'adaptive scoring' are that errors are indicative of the test-taker's level and that there is a pristine correlation between a candidate's abilities in different math sub-areas.

well ... yes!
on average, errors are indicative of "level" -- because that's actually how "level" is defined in the first place. they throw experimental questions at a large-ish population of test takers; the ones that people miss more often are called "harder", and the ones that people miss less often are called "easier".
it matters not whether the errors stem from genuine perplexity or from "stupid" mistakes. so, what you're referring to as an "easy" problem, IF enough people make mistakes on it, will be classified as a hard problem.
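
As a rough illustration of that definition, here is a minimal Python sketch. It is only an approximation under stated assumptions: the response data and the function name `rank_by_miss_rate` are invented for the example, and real calibration fits an item response model rather than simply counting misses.

```python
def rank_by_miss_rate(responses):
    """Return (question_id, miss_rate) pairs, hardest first.

    `responses` maps a question id to a list of booleans, one per test
    taker who saw the question (True = answered correctly).
    """
    miss_rates = {
        qid: 1 - sum(answers) / len(answers)
        for qid, answers in responses.items()
    }
    return sorted(miss_rates.items(), key=lambda item: item[1], reverse=True)


# Hypothetical experimental-question results, invented for this example.
pilot = {
    "q1": [True, True, True, False],    # missed by 1 of 4
    "q2": [False, False, True, False],  # missed by 3 of 4
    "q3": [True, False, True, False],   # missed by 2 of 4
}

ranking = rank_by_miss_rate(pilot)
print(ranking)  # the most-missed question comes first
```

Note that the ranking depends only on how often a question is missed, not on why it was missed, which is the point made above.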

tutorphd wrote: As a practicing tutor I know that BOTH assumptions are completely wrong: (1) smart test takers do make dumb mistakes quite often because the GMAT is a rushed, nerve-racking exam

right, well, that's actually a good thing, because the exam isn't meant to give the best scores to the "smartest" people -- it's meant to give the best scores to people who are smart and prepared, but also attentive. it is testing people's ability to focus just as much as it's testing people's "smart"-ness.
your mistake here is that you are dismissing errors from lack of organization, or from inattention, or from carelessness, as somehow less "authentic" than errors from lack of comprehension. they aren't; they're both full-fledged errors. and, frankly, the former type of error is much more consequential in the world outside the ivory tower of academe.

tutorphd wrote: and (2) often there is no correlation between abilities in different math areas tested on the GMAT.

the correlation is not perfect, but there is definitely not a complete lack of correlation.

tutorphd wrote: So much for the 'nice theory of item response'. I guess the other tests like the GRE, SAT and ACT have not read the item response theory either, because they keep the old non-adaptive format.

actually, the main reason why the ACT and SAT have not seriously considered an adaptive format is the logistics. each of those tests is taken by approximately 1.5 million students per year; the GMAT is taken by only about 200,000 per year. so, imagine the colossal network of test centers (and employees, etc.) you'd have to build.

tutorphd wrote: 2. You missed the point about the test with 92%. According to GMAT, 1 in 12 people will get a better score than that particular test I simulated. I am aware of what a percentile means (thanks for the lecture though). I question that percentile's truthfulness as a skill measure due to the way the score was obtained. My claim was that the test taker was penalized just because he made 2+2 errors in the beginning, something that happens pretty often in real life, and was not given a second chance later, while some of the 1 in 12 'better' test takers most probably made about 6-7 errors scattered around the middle and end of the test.

the 92nd-percentile score is 50, which is one point below the maximum quant score. (you cannot score higher than 51 on either component of this exam.) you shouldn't interpret it as "8 percentiles down", but, rather, as "one point below max". that's a significant difference.

tutorphd wrote: So my claim is that, due to the skewed adaptive algorithm, test takers who tend to make errors earlier in the test are underestimated in comparison to ones who don't. Now how probable is that to happen? From my experience, everybody underperforms in the beginning and end of the test.

everybody?

tutorphd wrote: 3. I hope you aren't trying to tell me that YOUR posts are a useful yardstick for a professional tone? Because to me they sound like white noise and don't offer an actual scientific discussion of the facts presented but simply your condescending brain-washed opinion.

ok, it seems that what needs to be said has been said, so it's time for me to exit this discussion. but, i will continue to ponder the concept of "white" (= toneless) noise that somehow has a tone.

as edward r. murrow said ... good night, and good luck.


David@VeritasPrep: GMAT Instructor; Posts: 2193; Joined: Mon Feb 22, 2010 6:30 pm; Location: Vermont and Boston, MA; Thanked: 1186 times; Followed by:512 members; GMAT Score:770

by David@VeritasPrep » Sat Jul 14, 2012 4:11 am

tutor -

Please remember why the GMAT is here in the first place. It is not a test of math! In fact, the Quant section only tests math because it needs some kind of subject. At a meeting with the officials at GMAC, the leading companies were basically told that the reason that math and sentence correction and so forth are even on the test is that they need something to be the subject. But math is the canvas; it is not the painting. The GMAT is a test of REASONING. It is a test that puts people under pressure and tries to cause them to make these avoidable errors!

Graduates of business school can hire math geniuses if they need them. The one thing that all MBAs need is the ability to make decisions. Please start to see the test through this lens. I, and many others, who have not spent as much time in pure math still outperform many mathematicians on the Quantitative Reasoning section. This is not unfair, it is what the GMAT is designed for; it is what business schools wanted. People who can reason, not a class full of "math jocks."

Now with that understood, it would be great if you could take a step back and take a look at what is being said on this thread.

We are all telling you that some of what you are saying is true, but that it is not a big deal. Yes, missing questions that are very low in difficulty early in the test does impact your score. Of course, one of the worst things that you can do is to miss several easy questions out of the first ten. We all understand that and we all agree with you. But that does not mean that the algorithm is flawed. If you had a tutoring student and you asked them to try 5 lower-difficulty problems from the Official Guide, and if the student missed several of those, you would probably be skeptical of that student's ability as well.

You mention the "First 10 questions" myth and say that it has some truth in it. Of course it does; most myths do. Here is the truth of it: the first 10 questions cannot make your score, even if you get all of them right, but they can lower your potential to score if you get lots of them wrong. We have been telling students this for years, because it is true.

You talk about the fact that silly mistakes lower a student's score on the test as if this were a revelation of some dark secret. Of course not. Much of the time that I spend with students is to help them to avoid these avoidable errors. This is a huge part of the test. As I said above every part of the GMAT is a test of decision-making and not a test where the mathematician is automatically given the higher score on the Quant section.

Veritas Prep | GMAT Instructor

Veritas Prep Reviews
Save $100 off any live Veritas Prep GMAT Course


David@VeritasPrep: GMAT Instructor; Posts: 2193; Joined: Mon Feb 22, 2010 6:30 pm; Location: Vermont and Boston, MA; Thanked: 1186 times; Followed by:512 members; GMAT Score:770

by David@VeritasPrep » Sat Jul 14, 2012 4:19 am

tutor -

Now to the point of the tone. If you look at the experts who have posted on this thread, you can see that both Ron and Ian have been here since 2008; I am the newest of the three, as I have only been here for 2.5 years. I have made over 1100 posts and been thanked nearly 600 times, while Ron and Ian have made more than 2000 posts each and been thanked hundreds of times each. Ironically for this discussion, Ian and Ron are known for understanding the scoring algorithm as well as anyone outside of GMAC and for their postings on quantitative topics.

The entire reason that Ian and Ron even joined this discussion was to share their knowledge with you. Knowledge that they have gained over many, many years of looking deeply into the questions that you are exploring.

In other words, they, who are in a way your competitors, since they teach the GMAT - as do I - have taken time out of their lives to help you to understand this subject a little better. I for one would appreciate it if you would adopt a slightly different tone. A more professional tone, if you like. No one is attacking you. We have been trying to share information with you. So please look at it in this manner and do not get defensive.

My tone here has been very professional, so if you respond to me in the same way that you have to Ron and Ian, we will know that you do not want to learn things so that you can help your students, but rather that you just want to pick a fight with the test itself and with those who teach it.


GMAT/MBA Expert

beatthegmat: Site Admin; Posts: 6773; Joined: Mon Feb 13, 2006 8:30 am; Location: Los Angeles, CA; Thanked: 1249 times; Followed by:994 members

by beatthegmat » Sat Jul 14, 2012 10:15 am

Thanks for the reminders about tone, guys. Let's keep Beat The GMAT friendly. For experts on this site, please refer to the special rules for you at the bottom of this page: https://www.beatthegmat.com/mba/community-rules

Beat The GMAT | The MBA Social Network
Community Management Team

Research Top GMAT Prep Courses:
https://www.beatthegmat.com/gmat-prep-courses

Research The World's Top MBA Programs:
https://www.beatthegmat.com/mba/school


GMAT/MBA Expert

beatthegmat: Site Admin; Posts: 6773; Joined: Mon Feb 13, 2006 8:30 am; Location: Los Angeles, CA; Thanked: 1249 times; Followed by:994 members

by beatthegmat » Sat Jul 14, 2012 10:18 am

Moved this thread to the GMAT Strategy forum. This conversation doesn't seem to fit the 'Ask the Test Maker' forum any more. Thanks.


sunman: Master | Next Rank: 500 Posts; Posts: 165; Joined: Thu Feb 17, 2011 5:05 am; Location: San Diego, CA; Thanked: 14 times; Followed by:9 members; GMAT Score:750

by sunman » Sun Jul 15, 2012 2:38 am

Tutors who bang their heads against the wall figuring out a method to "game the game" instead of actually assisting students with quantitative and verbal strategies and fundamentals immediately lose all credibility in my eyes.

For some of us, the GMAT is actually a tool to help us get into our dream business schools, not some Rubik's cube for nerds to obsessively dissect as a hobby.

"Never doubt that a small group of thoughtful, committed citizens can change the world. Indeed, it's the only thing that ever has" - Margaret Mead


thulsy: Senior | Next Rank: 100 Posts; Posts: 42; Joined: Thu Feb 02, 2012 2:02 am; Thanked: 9 times; Followed by:6 members; GMAT Score:760

by thulsy » Mon Jul 16, 2012 10:09 am

Since experts are here, let me ask a question (stemming from Ian's post saying "If the questions on which you guess are all very hard questions, that won't hurt your score much. If many of them are easy questions, that will hurt your score a lot.")

For a student who typically scores around 750, would the score be more negatively affected if he/she missed an easy question (say 500-600 level) or a hard question (say 700-800 level), assuming the question is not experimental?

My guess: if the student's overall performance is indicative of a 750 level, the CAT will treat an occasional mistake on a low-level question as an outlier, which does not count much. So missing an "on-level question" (a 700-800 level question) will be more severe. Just guessing....
If the answer is the opposite (missing easier questions weighs more), then the implication may be that it's more important for a 750-level student to avoid silly mistakes on easier questions than to instead use that extra proofreading time to tackle hard questions.

I believe you experts have personal experience with this issue. Your insights are greatly appreciated.


David@VeritasPrep: GMAT Instructor; Posts: 2193; Joined: Mon Feb 22, 2010 6:30 pm; Location: Vermont and Boston, MA; Thanked: 1186 times; Followed by:512 members; GMAT Score:770

by David@VeritasPrep » Mon Jul 16, 2012 12:11 pm

Thulsy - the folks at GMAC have actually answered this question just recently in their "ask the test maker" forum.

Here is what they said:

"Yes incorrectly answering easier questions will lower your score. The raw IRT score is a value that takes into consideration the item difficulties and the responses to all the questions that were administered. This IRT score is then transformed to the historic GMAT scaled score with a proportional deduction for questions not answered. This is not the same as a penalty for guessing. If a question is very hard (say a 750 question and you are really a 650 test taker) and you guess incorrectly, the question will not count much. If a question is very easy (say a 550 question and you are a 650 test taker) and you answer it incorrectly, again, the question will not have much weight. If however, you miss many 550 questions, then this set of questions will count more when all the responses and difficulties are evaluated."

I added the bold type to their posting as this is what really addresses your question.
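
To make the GMAC description a little more concrete, here is a toy sketch of IRT-style scoring in Python. It is not GMAC's algorithm: the logistic response curve, the `GUESS` and `SCALE` values, the 200-800 grid, and the made-up 16-question test are all assumptions chosen only for illustration. The level estimate is simply the grid value that makes the observed right/wrong pattern most likely.

```python
import math

GUESS = 0.2     # assumed chance of a lucky guess (five answer choices)
SCALE = 130.0   # assumed spread of the curve, in scaled-score points


def p_correct(level, difficulty):
    """Assumed probability that a test taker at `level` answers correctly."""
    return GUESS + (1 - GUESS) / (1 + math.exp(-(level - difficulty) / SCALE))


def estimate_level(responses, grid=range(200, 801, 10)):
    """Grid-search maximum-likelihood level.

    `responses` is a list of (difficulty, answered_correctly) pairs.
    """
    def log_likelihood(level):
        total = 0.0
        for difficulty, correct in responses:
            p = p_correct(level, difficulty)
            total += math.log(p if correct else 1 - p)
        return total

    return max(grid, key=log_likelihood)


# Hypothetical test: ten 650-level questions (6 right, 4 wrong) plus six
# 550-level questions with a varying number of misses.
on_level = [(650, True)] * 6 + [(650, False)] * 4


def with_easy_misses(misses):
    return on_level + [(550, False)] * misses + [(550, True)] * (6 - misses)


no_miss = estimate_level(with_easy_misses(0))
one_miss = estimate_level(with_easy_misses(1))
five_misses = estimate_level(with_easy_misses(5))
print(no_miss, one_miss, five_misses)
```

Under these assumptions, a single missed 550-level question moves the estimate only modestly, while missing most of the 550-level questions pulls it far lower, which is the pattern the quote describes.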

Here is what I said in response -- sort of an interpretation of this GMAC wording.

"This confirms that one of the most important things that a test taker can do is to focus on correctly answering the questions that she can/should get right. A single false positive or false negative does not matter much. A string of false positives (a student correctly answering a question far above her level) is very unlikely since that would mean that she is indeed capable at that level. It is a number of false negatives that can skew the score, since it is certainly possible for a student to make a number of errors and miss questions that are within her level of ability. For example, if a test-taker scores 100 points lower than expected then one explanation could be that the test-taker made several avoidable errors on questions below her level of difficulty."

And your implication is right: the first thing a test taker can do is to limit silly mistakes. You can only be hurt so much by missing several 95th percentile questions. After all, a 51 is 98th percentile, so missing questions at this level may take you down to a 50 scaled score (92nd percentile) or so.

But essentially, in the GMAC example above, if you miss lots of 550-level (50th percentile) questions when you are otherwise a 650-level (80th percentile) scorer, then, as the GMAC quote tells us, "this set of questions will count more when all the responses and difficulties are evaluated."

Does that help to answer the question?


GMAT/MBA Expert

lunarpower: GMAT Instructor; Posts: 3380; Joined: Mon Mar 03, 2008 1:20 am; Thanked: 2256 times; Followed by:1535 members; GMAT Score:800

by lunarpower » Mon Jul 16, 2012 3:02 pm

thulsy wrote: My guess: if the student's overall performance is indicative of a 750 level, the CAT will treat an occasional mistake on a low-level question as an outlier, which does not count much. So missing an "on-level question" (a 700-800 level question) will be more severe. Just guessing....

no, that's not how it works. missing easier questions is worse for everyone.
imagine you're a chess grandmaster, and you accidentally lose to some kid who barely knows how to play. are you going to "reject" that as an "outlier"? of course not, you're going to judge it very severely.

however, all of this palaver about difficulty levels misses the real point, which is that YOU SHOULD NOT BE THINKING ABOUT THIS KIND OF THING, EVER.
you will not have any idea of the "difficulty levels" of the questions you are answering, so all of these considerations are irrelevant to your actual test-day strategy.


David@VeritasPrep: GMAT Instructor; Posts: 2193; Joined: Mon Feb 22, 2010 6:30 pm; Location: Vermont and Boston, MA; Thanked: 1186 times; Followed by:512 members; GMAT Score:770

by David@VeritasPrep » Mon Jul 16, 2012 3:43 pm

Completely agree with Ron!

This will be my final word on the subject. I never have my students think about the difficulty level of any question. I only speak about these things in the context of making sure that you avoid "silly mistakes" on test day. Some people mistakenly hurry through questions that they think are "easy" in the search for "harder" questions --as if they can recognize those. My point is simply that students should take care to avoid silly mistakes that cause them to miss questions that they would otherwise get right.


tutorphd: Master | Next Rank: 500 Posts; Posts: 126; Joined: Sun Jun 24, 2012 10:11 am; Location: Chicago, IL; Thanked: 36 times; Followed by:7 members

by tutorphd » Tue Jul 17, 2012 9:07 am

Wow GMAC said it itself:

This confirms that one of the most important things that a test taker can do is to focus on correctly answering the questions that she can/should get right. A single false positive or false negative does not matter much. A string of false positives (a student correctly answering a question far above her level) is very unlikely since that would mean that she is indeed capable at that level. It is a number of false negatives that can skew the score, since it is certainly possible for a student to make a number of errors and miss questions that are within her level of ability. For example, if a test-taker scores 100 points lower than expected then one explanation could be that the test-taker made several avoidable errors on questions below her level of difficulty.

That was exactly what I was saying, despite the fact that some of the experts here were claiming that a string of false negatives is 'not very likely'. Notice GMAC did not say a string of false negatives is 'unlikely' like they said about a string of false positives.

Thulsy,

you should not worry about question difficulty on the exam BUT you should worry about it in your preparation. For some students, improving their score boils down to improving their accuracy (the ability to correctly solve a question at your level or below). Analyze your mistakes on GMATPrep tests and see if you make silly errors on questions below your level, and train strategies to avoid those.

Another thing you should know, that is not in the above quote, is that the scoring algorithm is more sensitive to a string of false negatives in the beginning of the test, when it is 'making up its mind' about your level. That suggests that you should train to be especially careful and focused with the initial 10-15 questions on the test. That doesn't mean that you can do whatever you want at the end of the test, but be especially careful in the beginning. A string of mistakes below your level in the beginning will swing the scoring algorithm badly towards giving you lower-level questions later, and you won't be able to earn your true level score even if you answer those correctly.

I've shown that effect with GMATPrep tests, despite the fact that the experts here were claiming that this is not true and that it doesn't really occur in real life. Many people taking the GMAT get scores quite a bit lower than their practice scores with GMATPrep exactly because they have higher anxiety on test day and they swing the scoring algorithm in the beginning of each section towards lower scores with a string of careless errors below their true level.


David@VeritasPrep: GMAT Instructor; Posts: 2193; Joined: Mon Feb 22, 2010 6:30 pm; Location: Vermont and Boston, MA; Thanked: 1186 times; Followed by:512 members; GMAT Score:770

by David@VeritasPrep » Tue Jul 17, 2012 9:33 am

The quote that you cited above is actually from me. The other quote in my earlier posting is from the GMAC psychometric team.


GMAT/MBA Expert

Ian Stewart: GMAT Instructor; Posts: 2621; Joined: Mon Jun 02, 2008 3:17 am; Location: Montreal; Thanked: 1090 times; Followed by:355 members; GMAT Score:780

by Ian Stewart » Tue Jul 17, 2012 1:53 pm

To respond to thulsy's question, the scoring algorithm is based on probabilities. For each question on the test, the algorithm knows the probability that, say, a 500-level test taker will answer correctly, and the probability that a 700-level test taker will answer correctly. A question which is a '500-level' question is one that a 500-level test taker should answer correctly about 60% of the time. A 700-level test taker should probably answer that question correctly about 80-90% of the time. That is, the test acknowledges that high-level test takers will occasionally answer low-level questions incorrectly, and you can certainly recover from a careless error on an easy question. But suppose you get two 400-level questions wrong in a row, and suppose those are questions a 700-level test taker should get right 90% of the time. The probability a 700-level test taker will get both of those questions wrong is (0.1)(0.1) = 0.01, or 1%. So a 700-level test taker will almost never get both of those questions wrong, and it becomes very hard to persuade the test that you are a 700-level test taker if you make a few mistakes on low-level questions.
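
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the two answers are independent, the helper name `prob_all_wrong` is made up, and the 50% figure for a lower-level test taker is a hypothetical comparison that does not come from the post.

```python
def prob_all_wrong(p_correct_each):
    """Probability of missing every question, assuming independent answers."""
    prob = 1.0
    for p in p_correct_each:
        prob *= 1.0 - p
    return prob


# Figures from the post: a 700-level test taker gets each 400-level
# question right 90% of the time.
p_700 = prob_all_wrong([0.9, 0.9])

# Hypothetical comparison (not from the post): a lower-level test taker
# who gets each of those questions right only 50% of the time.
p_low = prob_all_wrong([0.5, 0.5])

print(f"700-level misses both: {p_700:.2f}")    # 0.01
print(f"lower-level misses both: {p_low:.2f}")  # 0.25
print(f"likelihood ratio: {p_low / p_700:.0f}")  # 25
```

With these numbers, two misses in a row are 25 times more likely to come from the hypothetical lower-level test taker than from a 700-level one, which is why the test becomes hard to persuade.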

That's why I said above that careless errors on easy questions can hurt you a lot, and for test takers, this is the most important takeaway from this thread. If you get a 750-level question wrong, that only provides the test with evidence that you might not be a 750+ level test taker. Well, not many people are! But if you get a 400-level question wrong, that provides the test with evidence that you might be a sub-400 level test taker. That's not what you want. So when you decide how to apportion your time during the test, you ought to be spending a little longer making sure that you get those questions right that you know how to solve, because careless mistakes on questions below your level are very costly. You should be saving time by moving on quickly from questions that are too hard for you (and on an adaptive test, you're sure to see some questions above your level unless you're a 51-level test taker) since getting those questions wrong doesn't hurt you much at all - in fact, it's exactly what the test expects you to do.
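
The difference between the two kinds of miss can be sketched as a simple belief update. Everything in this Python sketch is an assumption made for illustration, not the real scoring algorithm: the logistic curve, the `GUESS` and `SCALE` constants (picked so that a question at your own level is answered correctly 60% of the time and one 200 points below your level about 86% of the time, in line with the figures quoted earlier in this post), and the flat starting belief over levels 200-800.

```python
import math

GUESS = 0.2     # assumed guessing floor
SCALE = 130.0   # assumed curve spread, in scaled-score points
LEVELS = list(range(200, 801, 10))  # candidate test-taker levels


def p_correct(level, difficulty):
    """Assumed probability that a test taker at `level` answers correctly."""
    return GUESS + (1 - GUESS) / (1 + math.exp(-(level - difficulty) / SCALE))


def posterior_after_miss(difficulty):
    """Belief over LEVELS after one wrong answer, starting from a flat prior."""
    weights = [1 - p_correct(level, difficulty) for level in LEVELS]
    total = sum(weights)
    return [w / total for w in weights]


def prob_at_least(posterior, threshold):
    """Total belief that the test taker's level is `threshold` or higher."""
    return sum(p for level, p in zip(LEVELS, posterior) if level >= threshold)


prior_700_plus = sum(1 for level in LEVELS if level >= 700) / len(LEVELS)
after_hard_miss = prob_at_least(posterior_after_miss(750), 700)
after_easy_miss = prob_at_least(posterior_after_miss(400), 700)

print(f"P(700+) before any answer:         {prior_700_plus:.3f}")
print(f"P(700+) after missing a 750-level: {after_hard_miss:.3f}")
print(f"P(700+) after missing a 400-level: {after_easy_miss:.3f}")
```

In this toy model, both misses lower the belief that you are a 700+ test taker, but the miss on the 400-level question lowers it much more than the miss on the 750-level question.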

For online GMAT math tutoring, or to buy my higher-level Quant books and problem sets, contact me at ianstewartgmat at gmail.com

ianstewartgmat.com