In the sciences, we collect data by sampling individuals from a population and then use that sample to draw inferences about the population as a whole. The picture to the right illustrates the process.
The ideal way to sample a population so that your data are statistically valid is to take a random sample, in which every individual in the population has an equal probability of being sampled. The best way to take a proper random sample is to number every individual in the population and then use a random number table or generator to pick which numbers to sample. In practice, a completely random sample is usually impossible, so we often use other methods to approximate a random sample, such as:
- select every nth individual encountered, where n is determined from a random number generator
- randomly select locations to sample within the population distribution
- divide the population into equal groups or equal ranges, then select random individuals within each one, using random locations or some other sampling method
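
The methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed procedure: the population is assumed to be a numbered list, the study area is assumed to be a rectangle, and the function names (`simple_random_sample`, `every_nth`, `random_locations`, `grouped_sample`) and their parameters are hypothetical names chosen for this example.

```python
import random

def simple_random_sample(population, k, rng):
    """Number every individual, then draw k distinct numbers at random."""
    indices = rng.sample(range(len(population)), k)
    return [population[i] for i in indices]

def every_nth(population, max_n, rng):
    """Select every nth individual encountered, with n drawn at random.

    max_n is an assumed upper limit on n (hypothetical parameter).
    """
    n = rng.randint(1, max_n)
    return population[n - 1::n]  # the nth, 2nth, 3nth, ... individual

def random_locations(width, height, k, rng):
    """Pick k random (x, y) locations inside an assumed rectangular area."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(k)]

def grouped_sample(population, groups, per_group, rng):
    """Divide the population into equal groups, then sample within each.

    Assumption: any leftover individuals that do not fill a whole
    group are ignored in this sketch.
    """
    size = len(population) // groups
    sample = []
    for g in range(groups):
        section = population[g * size:(g + 1) * size]
        sample.extend(rng.sample(section, per_group))
    return sample

# Example usage with a population of 100 numbered individuals.
rng = random.Random()  # pass a seed, e.g. random.Random(42), to repeat a draw
population = list(range(1, 101))
print(simple_random_sample(population, 10, rng))
print(every_nth(population, 10, rng))
print(random_locations(50.0, 20.0, 5, rng))
print(grouped_sample(population, 4, 3, rng))
```

Only the first function gives every individual an equal and independent chance of selection. The other three are the practical approximations described in the list.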
Several other kinds of samples are often called random but are not, and they should be referred to by their correct names:
- Haphazard sample: unsystematic selection of individuals to include in the sample, following no predetermined rules. This method is often called random, but it is not, because it is subject to the conscious or unconscious biases of the person making the selection
- Convenience sample: individuals that are readily available are included in the sample