Hey everyone, wanted to weigh in on two things here.
Generally, GMATPrep is good practice, and we recommend it to our students. That said, it is common practice for testing companies to use old, discarded, and often sub-optimal questions in their test prep, since they need to reserve the best questions for the real test. That can affect the predictive validity of the GMATPrep tests.
Now, with the "initial X questions"... probably the most popular question / discussion around the GMAT CAT. Items used in a CAT have quite a few psychometric characteristics associated with them. The key ones are the discrimination, difficulty (or location), and guessing parameters. Difficulty is an easy one. The guessing parameter is the probability that a test-taker with very low ability will answer the item correctly, and isn't relevant here. Discrimination is a bit more difficult to understand, but it is critical in item selection and scoring, and for the question at hand. Let's investigate...
Discrimination, or the a-value in CAT terms, is a measurement of how well an item differentiates between test-takers of two ability levels. It's graphed in the form of a curve from the lower left to upper right. If it's a gradual curve, then the item has low discrimination and isn't terribly precise in differentiating test-takers of similar ability levels. On the other hand, if the curve is very steep (like a step), it would be incredibly precise in being able to differentiate between test-takers of very close ability levels.
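For anyone who wants to see the gradual vs. steep curve in numbers, here's a minimal sketch. It assumes the standard three-parameter logistic (3PL) model from item response theory, which uses exactly the three parameters mentioned above; whether the GMAT uses this exact form is my assumption, and the item values and the names `gradual` and `steep` are made up for illustration.

```python
import math

def p_correct(theta, a, b, c=0.0):
    """3PL curve: probability that a test-taker of ability theta answers correctly.

    a = discrimination, b = difficulty (location), c = guessing parameter.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Two hypothetical items: same difficulty and guessing, different discrimination.
gradual = dict(a=0.5, b=0.0, c=0.2)
steep = dict(a=2.5, b=0.0, c=0.2)

# Two test-takers of similar ability, just below and just above the item difficulty.
low, high = -0.5, 0.5

# How far apart are their chances of getting each item right?
gap_gradual = p_correct(high, **gradual) - p_correct(low, **gradual)
gap_steep = p_correct(high, **steep) - p_correct(low, **steep)

print(f"gradual item separates them by {gap_gradual:.3f}")
print(f"steep item separates them by   {gap_steep:.3f}")
```

With these made-up values the gradual item separates the two test-takers by about 0.10 in probability, while the steep one separates them by about 0.44, which is the "precision" difference described above.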
This value is important because the test needs to figure out how much weight to give each item. In the beginning of the test, each question can jump your estimated ability a good deal, as the test is just trying to find out which ballpark you are in. Since the test knows it has a lot of questions to ask you, there's no need to dole out very discriminating items in the beginning -- it would be a waste. It's just trying to figure out if you're closer to 65 or 80, or maybe 40 or 55, for example. As it goes along and gets a pretty good idea that you are around X ability level, it starts dealing out the questions with high information value, a.k.a. more discriminating, because it's now trying to differentiate between smaller ranges like 68 to 72.
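To put a number on "information value", here's a rough sketch using the textbook Fisher information formula for a 3PL item. This is standard item response theory, not anything confirmed about the GMAT's actual item selection, and the item values are hypothetical. The point it illustrates: a steep item tells you a lot when your ability is near its difficulty, and very little when the test's guess about you is still far off.

```python
import math

def p_correct(theta, a, b, c=0.0):
    """3PL probability of a correct answer at ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c=0.0):
    """Fisher information of a 3PL item at ability theta (textbook formula)."""
    p = p_correct(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

# Hypothetical items, both with difficulty 0 on the ability scale.
gradual = dict(a=0.5, b=0.0, c=0.2)
steep = dict(a=2.5, b=0.0, c=0.2)

# Ability right at the item difficulty vs. far away from it.
for theta in (0.0, 3.0):
    print(
        f"theta={theta}: gradual={item_information(theta, **gradual):.4f}, "
        f"steep={item_information(theta, **steep):.4f}"
    )
```

In this sketch the steep item carries far more information at theta = 0, but at theta = 3 the gradual item actually carries more, which fits the idea that highly discriminating items are wasted until the test knows your ballpark.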
So the algorithm smartly uses questions with different characteristics at different times to best do its job. The confusion comes when students think those bigger jumps in the beginning preclude any chance of proving the higher ability they more often than not hope to show. The algorithm is flexible enough to recover from some anomalies, even early on, which is why consistent performance and finishing the section are of paramount importance. The algorithm is even flexible enough to throw out a wrong answer or two if they are entirely inconsistent with the rest of your exam, as with a top-level student who makes a bad mistake on an easy question. After all, the goal is to determine true ability, not penalize for every careless mistake.
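Here's a toy illustration of the "recovering from an early anomaly" idea. To be clear about the assumptions: this is not the GMAT's scoring algorithm. It's a generic posterior-mean (EAP) ability estimate on a grid with a standard normal prior, a fixed (non-adaptive) 30-item form, and made-up item parameters. It also doesn't throw out any answers; it just shows how much one careless miss on an easy opener matters right after it happens vs. at the end of the section.

```python
import math

def p_correct(theta, a, b, c=0.0):
    """3PL probability of a correct answer at ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Ability grid from -4.0 to 4.0 and an (unnormalized) standard normal prior.
GRID = [i / 50.0 for i in range(-200, 201)]
PRIOR = [math.exp(-0.5 * t * t) for t in GRID]

def eap(items, responses):
    """Posterior-mean ability estimate from (a, b, c) items and 0/1 responses."""
    post = list(PRIOR)
    for (a, b, c), u in zip(items, responses):
        for i, t in enumerate(GRID):
            p = p_correct(t, a, b, c)
            post[i] *= p if u else (1.0 - p)
    return sum(t * w for t, w in zip(GRID, post)) / sum(post)

# Hypothetical form: one easy opener, then 29 items with difficulty from -2 to 3.
items = [(1.0, -1.0, 0.2)] + [(1.0, -2.0 + 5.0 * k / 28, 0.2) for k in range(29)]

# Stylized strong test-taker: right on everything easier than 1.5, wrong above.
clean = [1 if b < 1.5 else 0 for (_, b, _) in items]
# Same test-taker, but with a careless miss on the easy opener.
slipped = [0] + clean[1:]

# Cost of the miss right after question 1 vs. after the full section.
gap_after_first = eap(items[:1], clean[:1]) - eap(items[:1], slipped[:1])
gap_at_end = eap(items, clean) - eap(items, slipped)

print(f"estimate gap after question 1:  {gap_after_first:.3f}")
print(f"estimate gap after all 30:      {gap_at_end:.3f}")
```

The gap right after the first question is large because the estimate has almost nothing else to go on; by the end, the other 29 consistent answers dominate and the same miss costs much less. That's the intuition behind finishing the section and performing consistently.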
Anyway, hopefully that wasn't too technical / rambling, and hopefully it helps a bit!
