Statistics in Language Studies (d)

Gap-fill exercise

Fill in all the gaps using the AWL words in the list.
accessible, assume, available, chapter, clarify, constitute, constructed, construction, extract, incidentally, interval, investigation, notion, notional, obvious, options, random, region, regions, relevant, required, researchers, resources, selected, similar, sources, specify, theoretical, ultimately
4.3 The solution
It will perhaps help us to answer these questions if we introduce the _____ of a sampling frame by way of a non-linguistic example. This will _____ some of the difficulties we saw earlier in attempts to _____ populations.

Suppose _____ are interested in the birth weights of children born in Britain in 1984 (with a view to comparing birth weights in that year with those of 1934). As is usual with any _____, their _____ will only allow them to collect a subset of these measurements - but a fairly large subset. They have to decide where and how this subset of values is to be collected. The first decision they have to make concerns the _____ of their information. Maternity hospital records are the most _____ choice, but this leaves out babies born at home. Let us _____ that health visitors (who are _____ to visit all new-born children and their mothers) have records which can be used. What is now _____ is some well-motivated limits on these records, to _____ a sampling frame within which a sample of birth weights can be _____.

The most common type of sampling frame is a list (actual or _____) of all the subjects in the group to which generalisation is intended. Here, for example, we could _____ a list of all the babies with birth-dates in the _____ year from the records of all health visitors in Great Britain. We could then choose a simple _____ sample (_____ 5) of n of these babies and note the birth weights in their record. If n is large, the mean weight of the sample should be very _____ (_____ 7) to the mean for all the babies born in that year. At the very least we will be able to say how big the discrepancy is likely to be (in terms of what is known as a ‘confidence _____’ - see _____ 7).

The problem with this solution is that the _____ of the sampling frame would be extremely time-consuming and costly. Other _____ are _____. For example, a sampling frame could be _____ in two or more stages. The country (Britain) could be divided into large _____, Scotland, Wales, North-East, West Midlands, etc., and a few chosen from this first stage sampling frame. For each of the _____ a list of Health Districts can be drawn up (second stage) and a few Health Districts chosen, at _____, from each _____. Then it may be possible to look at the records of all the health visitors in each of the chosen Districts or perhaps a sample of visitors can be chosen from the list (third stage) of all health visitors in each district.