Stats: relationship among different pieces

Master | Next Rank: 500 Posts
Posts: 103
Joined: Sat Jun 02, 2012 9:46 pm
Thanked: 1 times
Can someone please summarize the relationship between mean, median, standard deviation and range? What does changing one or more of them tell us about the rest?

There's a lot of content on stuff like... if all numbers change by x%, then SD changes by x%. But I can't find a whole lot on the inter-relations between these measures themselves.


by topspin360 » Thu Aug 16, 2012 8:26 pm
In addition, can someone please explain the most efficient way to do the following three problems? I got them right by plugging in numbers but it took me ages... #8 is a crazy one! I actually didn't get that one right.


7. Which of the following data sets has the third largest standard deviation?
(A) {1, 2, 3, 4, 5}
(B) {2, 3, 3, 3, 4}
(C) {2, 2, 2, 4, 5}
(D) {0, 2, 3, 4, 6}
(E) {-1, 1, 3, 5, 7}

8. The table below represents three sets of numbers with their respective medians, means and standard deviations. The third set, Set [A+B], denotes the set that is formed by combining Set A and Set B.

               Median   Mean   Standard Deviation
Set A:           X       Y           Z
Set B:           L       M           N
Set [A + B]:     Q       R           S

If X - Y > 0 and L - M = 0, then which of the following must be true?
I. Z > N
II. R > M
III. Q > R
(A) I only
(B) II only
(C) III only
(D) I and II only
(E) None

9. E is a collection of four odd integers and the greatest difference between any two integers in E is 4. The standard deviation of E must be one of how many numbers?
(A) 3
(B) 4
(C) 5
(D) 6
(E) 7

Junior | Next Rank: 30 Posts
Posts: 13
Joined: Tue Jul 17, 2012 12:43 am
Location: London
Thanked: 2 times
GMAT Score:760

by willrc » Fri Aug 17, 2012 2:35 am
The relationships between mean, median, standard deviation and range are quite complex. A few generalisations can be offered:

1. Mean and median are similar but not identical measures. Intuitively, both are a measure of the "middle" of the data. In particular, if the data is distributed symmetrically about the mean (as in, for instance, an arithmetic sequence), the mean and median coincide. When the median is, for example, lower than the mean, this suggests that the mean is being 'pulled up' by some very large values, and vice versa.

2. Standard deviation and range share some features but are not the same. Both reflect, intuitively, a measure of the "spread" of data, or how much variability it contains. Range is the cruder measure: it is just the difference between the two extreme points (minimum and maximum). Standard deviation is a more practical measure which considers the spread of the data as a whole (not just the extreme points), so two sets with the same range can have different standard deviations.
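
For anyone who wants to see these two generalisations numerically, here is a minimal Python sketch using the standard library's statistics module. The four datasets are made up for illustration, and pstdev (population SD) is an assumption; sample SD would show the same pattern.

```python
from statistics import mean, median, pstdev

def summarize(data):
    """Return (mean, median, population SD, range) for a list of numbers."""
    return mean(data), median(data), pstdev(data), max(data) - min(data)

symmetric = [1, 2, 3, 4, 5]   # evenly spaced: mean and median coincide
skewed = [1, 2, 3, 4, 50]     # one very large value pulls the mean above the median

# Same range (4), but the second set has more values sitting at the extremes.
clustered = [1, 3, 3, 5]
polarized = [1, 1, 5, 5]

for name, data in [("symmetric", symmetric), ("skewed", skewed),
                   ("clustered", clustered), ("polarized", polarized)]:
    m, med, sd, rng = summarize(data)
    print(f"{name:10s} mean={m} median={med} sd={sd:.3f} range={rng}")
```

The skewed set keeps its median at 3 while its mean jumps to 12, and the last two sets share a range of 4 but differ in SD.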

Now for the questions.

7. Here we could use the standard deviation formula and plug in each dataset in turn. More intuitively, however, we can note that all five sets have a mean of 3 and look at how widely each is spread around it. E has "gaps" of 2 between each number and D stretches out to 0 and 6, so these two have the largest SDs. A has a consistent "gap" of 1, and B and C are more tightly clustered around 3 than A. Hence A has the middle, or third highest, SD.
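
If you want to confirm the ranking by brute force, a short Python sketch follows. It assumes population SD (pstdev); since every set has five elements, using sample SD instead would not change the order.

```python
from statistics import pstdev

datasets = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 3, 3, 3, 4],
    "C": [2, 2, 2, 4, 5],
    "D": [0, 2, 3, 4, 6],
    "E": [-1, 1, 3, 5, 7],
}

# Sort the labels from largest to smallest population SD.
ranked = sorted(datasets, key=lambda k: pstdev(datasets[k]), reverse=True)

for label in ranked:
    print(label, round(pstdev(datasets[label]), 3))

third_largest = ranked[2]
print("Third largest SD:", third_largest)
```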

8. Interpreting the question, we're told that in A the median is greater than the mean, whilst in B the two measures are equal. Going through the statements:
I -- we don't know anything about the standard deviations of either dataset, so we can't know which is greater.
II -- for this to be true, the mean of the combined set should be higher than the mean of set B. The combined mean R is a weighted average of Y and M, so this holds exactly when Y>M, but we don't know that this is the case.
III -- this is more difficult. Plugging in a few example numbers appears to be the easiest approach. If we take 0, 2 and 3 for A (ensuring X>Y) and 5 and 6 for B, we get X=2, Y=5/3, L=5.5, M=5.5. The combined set is 0, 2, 3, 5, 6, so Q=3 and R=16/5. R is slightly higher than Q in this example so statement III is not necessarily true.

Answer: E
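
The plugged-in numbers above can be checked with a few lines of Python. This is only a sketch of the counterexample approach: set_a and set_b are the example sets from the explanation, while wide_b is an extra hypothetical set added here to show that statement I can fail as well.

```python
from statistics import mean, median, pstdev

set_a = [0, 2, 3]   # median 2 > mean 5/3, so X - Y > 0
set_b = [5, 6]      # median = mean = 5.5, so L - M = 0
combined = set_a + set_b

X, Y = median(set_a), mean(set_a)
L, M = median(set_b), mean(set_b)
Q, R = median(combined), mean(combined)

# The example must satisfy both given conditions.
assert X > Y and L == M

statement_ii = R > M    # combined mean vs mean of B
statement_iii = Q > R   # combined median vs combined mean

# Hypothetical alternative for B (not from the post): still median == mean,
# but far more spread out than A, so Z > N fails.
wide_b = [0, 10]
statement_i_with_wide_b = pstdev(set_a) > pstdev(wide_b)

print("I   (Z > N) with wide_b:", statement_i_with_wide_b)
print("II  (R > M):", statement_ii)
print("III (Q > R):", statement_iii)
```

A single failing example per statement is enough to rule out "must be true".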

9. Here we need to establish what E could contain. Since the range is 4 we must (at least) have two odd numbers 4 apart, e.g. 1 and 5. The other two numbers in the set could each be 1, 3 or 5. To save time, we can see that being 5 is the same (in terms of impact on SD) as being 1. Hence the combinations we need to consider are:
1,1,1,5
1,1,3,5 or 1,3,1,5 (the same)
1,3,3,5

Note that the choice of endpoints (1 and 5) does not affect SD: shifting every number by the same amount leaves each number's distance from the mean unchanged (try it and see).
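
A quick way to "try it and see" is the Python sketch below. The shift of 10 and the particular base set are arbitrary choices for illustration, and pstdev (population SD) is assumed.

```python
import math
from statistics import pstdev

base = [1, 1, 3, 5]
shifted = [x + 10 for x in base]   # 11, 11, 13, 15: same spacing, different endpoints
mirrored = [1, 3, 5, 5]            # base reflected about 3, so a 1 becomes a 5

sd_base = pstdev(base)
sd_shifted = pstdev(shifted)
sd_mirrored = pstdev(mirrored)

print(sd_base, sd_shifted, sd_mirrored)
print(math.isclose(sd_base, sd_shifted), math.isclose(sd_base, sd_mirrored))
```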

Answer: A
London-based private tutor
www.gmatprolondon.com


by topspin360 » Fri Aug 17, 2012 8:41 pm
OA for #9 is B actually. I believe you forgot to consider 1,3,5,5

GMAT Instructor
Posts: 3225
Joined: Tue Jan 08, 2008 2:40 pm
Location: Toronto
Thanked: 1710 times
Followed by:614 members
GMAT Score:800

by Stuart@KaplanGMAT » Fri Aug 17, 2012 9:01 pm
topspin360 wrote: OA for #9 is B actually. I believe you forgot to consider 1,3,5,5

1,3,5,5 has the same SD as 1,1,3,5 (it's the mirror image).

Similarly, 1,5,5,5 has the same SD as 1,1,1,5

The missing set is 1,1,5,5
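
To double-check the count, here is a minimal enumeration sketch in Python. It assumes the endpoints 1 and 5 (any two odd integers 4 apart give the same result) and compares exact variances as Fractions, since two sets have equal SD exactly when they have equal variance.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from statistics import pvariance

low, high = 1, 5      # assumed endpoints: any two odd integers 4 apart would do
fillers = [1, 3, 5]   # odd values the other two members can take

distinct = {}  # exact variance -> list of sets that produce it
for pair in combinations_with_replacement(fillers, 2):
    members = sorted((low, high) + pair)
    # Fractions keep the variance exact, so equal SDs are grouped reliably.
    var = pvariance([Fraction(x) for x in members])
    distinct.setdefault(var, []).append(members)

for var, sets in sorted(distinct.items()):
    print(f"variance {var}: {sets}")

print("Distinct SD values:", len(distinct))
```

The six possible sets collapse into four distinct variances, which matches answer B.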

On a side note, you will NEVER need to calculate SD on the GMAT (and, accordingly, you don't need to know the SD formula) - you just might need to know what SD measures (how spread out the terms of a set are) and, for DS, what's required to calculate SD (the number of terms and the exact spacing of the set).

Stuart Kovinsky | Kaplan GMAT Faculty | Toronto
