Below you will find links to key chapters from this area of the course, Statistics, Descriptive and Applications. In addition, here is a growing collection of practice questions for you to use as a review exercise for the whole unit. Currently there are 5 exam-style questions worth 38 marks. They should be completed in about 40 minutes, before you look at the solutions.

*Shorter P1 style question - 6 marks*

The number of days of rain per month in a town in Louisiana was recorded for a period of 12 months. The data is shown below.

10, 8, 8, 7, 8, 12, 13, 12, 10, 8, *x*, 7

The median value for this data set is 8.5

a) Find the value of *x*

b) Find the mean number of days it rains per month

c) Find the standard deviation

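a) There are 12 values, so the median is the average of the 6th and 7th values when the data is in order. Without x, the ordered list is 7, 7, 8, 8, 8, 8, 10, 10, 12, 12, 13. For the median to be 8.5, the 6th value must be 8 and the 7th value must be 9. There is no 9 in the list, so x must be 9.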

b) From the GDC, the mean value is 9.33 (3sf)

c) From the GDC, the standard deviation is 1.97 (3sf)

Be aware: make sure your calculator is counting a frequency of 1 for each of the items in list 1 and not looking for a frequency in list 2.
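The GDC results above can be checked with a short script. This is only a sketch using Python's standard `statistics` module; it assumes x = 9 and that the GDC value quoted is the population standard deviation (dividing by n), which is the one that matches 1.97.

```python
import statistics

# Monthly rain-day counts from the question, with x = 9 substituted in.
rain_days = [10, 8, 8, 7, 8, 12, 13, 12, 10, 8, 9, 7]

# Median of 12 values: the average of the 6th and 7th values in order.
median = statistics.median(rain_days)

mean = statistics.mean(rain_days)

# pstdev divides by n (population standard deviation).
# statistics.stdev would divide by n - 1 and give a larger value.
sd = statistics.pstdev(rain_days)

print(median, round(mean, 2), round(sd, 2))  # 8.5 9.33 1.97
```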

*Shorter P1 style question - 6 marks*

Looking for something in the water around Cold Mountain, scientists measured the amount of dissolved oxygen in the water and the temperature of the water on different days to investigate how one changes with the other. Here are the results.

| Temperature | 18 | 7.7 | 4.2 | 23.3 | 14.4 | 22.1 | 11.2 |
|---|---|---|---|---|---|---|---|
| Dissolved O2 (mg/l) | 4.7 | 8.6 | 12.5 | 4 | 5 | 4 | 7 |

Using this data, find

a) The Pearson's product moment correlation coefficient

b) The equation of the regression line y on x

Using the equation of the regression line,

c) Estimate the concentration of dissolved oxygen when the temperature is 10°

a) From the GDC, r = -0.925 (3sf)

b) From the GDC, y = -0.402x + 12.3 (3sf)

c) y = -0.402 × 10 + 12.3 = 8.28 mg/l (3sf)
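For anyone who wants to see where the GDC values come from, the sketch below computes r and the regression line of y on x from the sums of squares. The function name `linear_fit` is just an illustrative helper, not something from the course.

```python
from math import sqrt

temperature = [18, 7.7, 4.2, 23.3, 14.4, 22.1, 11.2]   # x
dissolved_o2 = [4.7, 8.6, 12.5, 4, 5, 4, 7]            # y, in mg/l


def linear_fit(xs, ys):
    """Return (r, slope, intercept) for the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)          # Pearson's correlation coefficient
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = linear_fit(temperature, dissolved_o2)
estimate_at_10 = slope * 10 + intercept

print(round(r, 3), round(slope, 3), round(intercept, 1))  # -0.925 -0.402 12.3
```

Note that `estimate_at_10` uses the unrounded coefficients and comes out at about 8.32, slightly higher than the 8.28 obtained from the coefficients rounded to 3sf.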

*Shorter P1 style question - 6 marks*

In a prison, a sample of 310 inmates was surveyed about the number of books they had read in the last year. The results are shown in the table below.

| Number of books | Number of inmates |
|---|---|
| 0 to 4 | 60 |
| 5 to 9 | 106 |
| 10 to 14 | 93 |
| 15 to 19 | a |
| 20 to 24 | 9 |

a) What is the value of a?

b) In which class interval does the median value lie?

c) Work out an estimate for the mean number of books read by the inmates.

a) Because the total is 310 (given in the question) the missing frequency must be 42 to make that so.

b) There are 310 results, so the median lies between the 155th and 156th values. The cumulative frequencies are 60 and then 166, so both of these values are in the second class interval, 5 to 9.

c) Use the GDC, making sure that it is set up to read list 2 as a frequency. List 1 should contain the midpoints of the class intervals. In this case, with discrete data, there are 5 possible values in each interval, so the midpoint is the middle one. In the first interval, 0 to 4, the 5 possible values are 0, 1, 2, 3, 4, so the midpoint is 2. The midpoints are therefore 2, 7, 12, 17 and 22, which give an estimated mean of 2890 ÷ 310 = 9.32 (3sf).
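The three parts of this solution can be reproduced with the sketch below. It is an illustration rather than the GDC procedure: it finds the missing frequency from the total, walks through the cumulative frequencies to locate the median class, and uses the class midpoints for the mean estimate.

```python
# Inclusive class intervals and their frequencies; None marks the unknown a.
classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24)]
frequencies = [60, 106, 93, None, 9]
total = 310

# a) The frequencies must add up to the stated total.
a = total - sum(f for f in frequencies if f is not None)
frequencies[frequencies.index(None)] = a

# b) The median sits between the 155th and 156th values (position 155.5).
middle_position = (total + 1) / 2
running = 0
median_class = None
for interval, frequency in zip(classes, frequencies):
    running += frequency
    if running >= middle_position:
        median_class = interval
        break

# c) Estimate the mean using the midpoint of each class interval.
midpoints = [(low + high) / 2 for low, high in classes]
mean_estimate = sum(m * f for m, f in zip(midpoints, frequencies)) / total

print(a, median_class, round(mean_estimate, 2))  # 42 (5, 9) 9.32
```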

At a nursing home, the following observed frequencies were collected to see if the likelihood of speaking another language was dependent on age group.

| Age group | Speaks 1 language | Speaks more than 1 language |
|---|---|---|
| 71 - 80 | 13 | 18 |
| 81 - 90 | 43 | 28 |
| 91 - 100 | 21 | 14 |

a) Calculate the expected frequency for the number of people in the 71 - 80 age group who speak more than 1 language

b) Calculate the expected frequency for the number of people in the 91 - 100 age group who speak only one language.

c) Calculate the number of degrees of freedom for this test.

d) Calculate the p-value for this test

e) Conclude using a 10% significance level (enter A, for accept and R for reject)

a) \(\frac{\text{Total 71 - 80}}{\text{Total people}} \times \text{Total more than 1 language}\\ = \frac{31}{137} \times 60 = 13.57664234 = 13.6 \text{ (3sf)}\)

b) \(\frac{\text{Total 91 - 100}}{\text{Total people}} \times \text{Total 1 language}\\ = \frac{35}{137} \times 77 = 19.67153285 = 19.7 \text{ (3sf)}\)

c) \(\text{dof} = (\text{rows} - 1)(\text{columns} - 1)\\ = (3-1)(2-1)\\ = 2\)

d) From the GDC, p = 0.190 (rounded to 3sf)

e) We accept (A) the null hypothesis that speaking another language is independent of age group, because the p-value of 0.190 (19%) is higher than the significance level of 10%. This explanation will be expected.
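The whole test can also be worked through by hand, as in the sketch below. It relies on one shortcut that only holds here: with exactly 2 degrees of freedom, the upper-tail probability of the chi-squared distribution is exp(-x/2), so no statistics library is needed. For any other number of degrees of freedom, a GDC or a statistics package would be required.

```python
from math import exp

# Observed frequencies: rows are age groups 71-80, 81-90, 91-100;
# columns are "speaks 1 language" and "speaks more than 1 language".
observed = [
    [13, 18],
    [43, 28],
    [21, 14],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid only because dof == 2: P(chi-squared > x) = exp(-x / 2).
p_value = exp(-chi_squared / 2)

significance_level = 0.10
decision = "A" if p_value > significance_level else "R"

print(round(expected[0][1], 1), round(expected[2][0], 1), dof, f"{p_value:.3f}", decision)
```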

The height of a sample of men in 1935 was normally distributed with a mean average of 170cm and a standard deviation of 6cm.

What is the probability that a man chosen from that sample would be

a) more than 160 cm tall?

b) Between 180 cm and 190 cm tall

In 1999, a similar sample showed a normal distribution with an average of 178cm and a standard deviation of 8cm

For this sample, the probability of a randomly selected man being above 'x' cm tall is 0.3

c) What is the value of x?

The probability of a randomly selected person from the sample being between 'y' and 180 cm is also 0.3.

d) What is the value of y?

a) From GDC entering

\(\text{Lower limit} = 160\\ \text{Upper limit} = 10^{99} \text{ (or a similarly large number)}\\ \mu = 170\\ \sigma = 6\\ p = 0.952\)

b) From GDC entering

\(\text{Lower limit} = 180\\ \text{Upper limit} = 190\\ \mu = 170\\ \sigma = 6\\ p = 0.0474\)

c) Using the inverse normal calculator

\(\text{Tail left and area } (p) = 0.7 \text{ OR}\\ \text{Tail right and area } (p) = 0.3\\ \mu = 178\\ \sigma = 8\\ \text{Height} = 182.195204\\ = 182\text{cm (3sf)}\)

d) Two steps

Step 1 - Work out the probability of being less than 180cm

\(\text{Lower limit} = 0\\ \text{Upper limit} = 180\\ \mu = 178\\ \sigma = 8\\ p = 0.59870632 = 0.599 \text{ (3sf)}\)

Step 2 - The 0.3 area lies between y and 180, so the probability of being less than 'y' is 0.599 - 0.3 = 0.299. Using inverse normal

\(\text{Tail left and area } (p) = 0.299\\ \mu = 178\\ \sigma = 8\\ \text{Height} = 173.78177\\ = 174\text{cm (3sf)}\)
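As a cross-check on the GDC steps, the sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). It uses σ = 8 for the 1999 sample, which is the value entered in the worked solution. It also carries the unrounded probability into part d, so y differs very slightly from 173.78177 but still rounds to 174.

```python
from statistics import NormalDist

# 1935 sample: mean 170 cm, standard deviation 6 cm.
heights_1935 = NormalDist(mu=170, sigma=6)

# a) P(height > 160): no need for a large upper limit, use 1 - cdf.
p_over_160 = 1 - heights_1935.cdf(160)

# b) P(180 < height < 190).
p_180_to_190 = heights_1935.cdf(190) - heights_1935.cdf(180)

# 1999 sample: mean 178 cm; sigma = 8 as entered in the worked solution.
heights_1999 = NormalDist(mu=178, sigma=8)

# c) P(height > x) = 0.3, so the area to the left of x is 0.7.
x = heights_1999.inv_cdf(1 - 0.3)

# d) P(y < height < 180) = 0.3, so P(height < y) = P(height < 180) - 0.3.
p_below_180 = heights_1999.cdf(180)
y = heights_1999.inv_cdf(p_below_180 - 0.3)

print(round(p_over_160, 3), round(p_180_to_190, 4), round(x), round(y))
```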

This topic is all about these two related tools for helping us look at how a data set is spread out. Learn about filling in cumulative frequency tables, plotting the corresponding curves and using the curves to draw box plots and answer questions...


The normal distribution is a fascinating, naturally occurring phenomenon that has very relevant applications to understanding the world around us. When a data set is normally distributed it has some key properties that allow us to make predictions about th

The chi squared independence test is a widely used technique for looking for a relationship between variables that are categorical. We can use a scatter graph to look for a relationship between GDP and life expectancy, but what about GDP and...
