Population and Sample
Population
A population is the complete set of individuals, objects, or measurements that share a characteristic the researcher wants to study.
- Population size: \(N\)
- Parameters: Fixed, usually unknown numerical characteristics of the population (e.g., the population mean \(\mu\) and standard deviation \(\sigma\))
- Examples: All students at a university, all voters in a country, all products from a factory
Sample
A sample is a subset of the population selected for study. It should be representative of the population to allow valid inferences.
- Sample size: \(n\)
- Statistics: Numerical characteristics calculated from the sample (e.g., the sample mean \(\bar{x}\) and sample standard deviation \(s\)); unlike parameters, they vary from sample to sample
- Purpose: To estimate population parameters and make inferences
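The distinction between parameters and statistics can be illustrated with a minimal sketch. The population below is synthetic (hypothetical exam scores generated for illustration only), and the variable names are assumptions rather than anything defined in these notes. It computes \(\mu\) and \(\sigma\) from all \(N\) members, then \(\bar{x}\) and \(s\) from a sample of size \(n\).

```python
import random
import statistics

# Hypothetical population: synthetic exam scores for N = 10,000 students.
rng = random.Random(42)
population = [rng.gauss(70, 10) for _ in range(10_000)]

# Parameters: computed from every member of the population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population SD divides by N

# Statistics: computed from a simple random sample of size n = 100.
sample = rng.sample(population, 100)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample SD divides by n - 1

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
```

In practice the population is rarely available in full, so only the second pair of values can be computed; the first pair is shown here only because the data are simulated.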
Sampling
Why Sample?
- Practicality: Studying entire populations is often impossible or impractical
- Cost-effectiveness: Samples require fewer resources than censuses
- Time efficiency: Faster data collection and analysis
- Accuracy: Well-designed samples can provide highly accurate estimates
Sampling Methods
Probability Sampling (Random)
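- Simple Random Sampling: Every member has an equal chance of selection, and every possible sample of size \(n\) is equally likely
- Stratified Sampling: The population is divided into strata (groups that share a characteristic), then a random sample is drawn within each stratum
- Cluster Sampling: The population is divided into clusters, a random set of clusters is selected, and the members of the selected clusters are studied
- Systematic Sampling: Selecting every \(k\)th member from a list, starting from a randomly chosen position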
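The four probability methods listed above can be sketched as follows. This is an illustrative sketch under stated assumptions: the sampling frame, the stratum and cluster assignments, and all function names are hypothetical, and the cluster version keeps every member of each selected cluster (one-stage cluster sampling).

```python
import random

rng = random.Random(0)

# Hypothetical sampling frame: 1,000 member IDs, each assigned to one of
# 4 strata and to one of 50 clusters of 20 members.
frame = list(range(1000))
stratum_of = {i: i % 4 for i in frame}
cluster_of = {i: i // 20 for i in frame}

def simple_random(frame, n):
    """Draw n members without replacement; every subset of size n is equally likely."""
    return rng.sample(frame, n)

def stratified(frame, stratum_of, n_per_stratum):
    """Draw a simple random sample of fixed size within each stratum."""
    strata = {}
    for member in frame:
        strata.setdefault(stratum_of[member], []).append(member)
    return [m for members in strata.values()
            for m in rng.sample(members, n_per_stratum)]

def cluster(frame, cluster_of, n_clusters):
    """Select whole clusters at random and keep every member of the chosen clusters."""
    chosen = set(rng.sample(sorted(set(cluster_of.values())), n_clusters))
    return [m for m in frame if cluster_of[m] in chosen]

def systematic(frame, k):
    """Take every k-th member after a random start within the first k positions."""
    start = rng.randrange(k)
    return frame[start::k]

srs = simple_random(frame, 40)
strat = stratified(frame, stratum_of, 10)
clus = cluster(frame, cluster_of, 2)
syst = systematic(frame, 25)
```

Each call here returns 40 members, which makes the designs easy to compare: the stratified sample is guaranteed to contain 10 members from every stratum, while the cluster sample is concentrated in only 2 of the 50 clusters.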
Non-Probability Sampling
- Convenience Sampling: Using readily available participants
- Purposive Sampling: Deliberately selecting individuals who meet criteria set by the researcher
- Snowball Sampling: Participants refer other participants
Sampling Error
Sampling error is the difference between a sample statistic and the corresponding population parameter. It occurs because we're studying a subset rather than the entire population.
- Reduced by: Larger sample sizes (for a sample mean, the standard error shrinks in proportion to \(1/\sqrt{n}\)), better sampling methods
- Quantified by: Standard error, confidence intervals
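A small simulation can make the effect of sample size on sampling error concrete. The sketch below assumes a synthetic population of uniformly distributed values and a hypothetical helper `empirical_se`; it draws many samples of each size, measures how much the sample mean varies, and compares that with the textbook value \(\sigma/\sqrt{n}\).

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical population of 20,000 synthetic measurements.
population = [rng.uniform(0, 100) for _ in range(20_000)]
sigma = statistics.pstdev(population)

def empirical_se(n, repeats=1000):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

results = {}
for n in (25, 100, 400):
    results[n] = empirical_se(n)
    theoretical = sigma / math.sqrt(n)
    print(f"n = {n:3d}  empirical SE = {results[n]:.2f}  sigma/sqrt(n) = {theoretical:.2f}")
```

Quadrupling the sample size should roughly halve the standard error, which is why gains in precision become more expensive as \(n\) grows.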
Representativeness
A sample is representative if its characteristics closely match those of the population. Key factors:
- Sampling method: Probability methods generally produce more representative samples
- Sample size: Larger samples reduce random variation, but they do not correct bias introduced by a flawed sampling method
- Response rate: High participation reduces the risk of nonresponse bias
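The interaction between sampling method and sample size can be illustrated with a hedged sketch. Everything here is hypothetical: the population is synthetic, and the "easy to reach" group is simply assumed to be the 3,000 members with the lowest values. A large convenience sample from that group is compared with a much smaller random sample from the whole population.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population of 10,000 synthetic values, sorted so that the
# assumed "easy to reach" members (the first 3,000) have the lowest values.
population = sorted(rng.gauss(100, 15) for _ in range(10_000))
mu = statistics.fmean(population)

# Convenience sample: 2,000 members drawn only from the easy-to-reach group.
convenience = rng.sample(population[:3000], 2000)

# Probability sample: 200 members drawn at random from the whole population.
random_sample = rng.sample(population, 200)

conv_error = abs(statistics.fmean(convenience) - mu)
rand_error = abs(statistics.fmean(random_sample) - mu)

print(f"convenience sample (n = 2000): error = {conv_error:.2f}")
print(f"random sample      (n =  200): error = {rand_error:.2f}")
```

Under these assumptions the convenience sample is ten times larger yet lands much further from \(\mu\), because its error comes from who was reachable rather than from chance.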
Relationship to Parameter Estimation
Sampling provides the foundation for [[Parameter Estimation]]. Through careful sampling, we obtain data that allows us to:
- Calculate sample statistics (\(\bar{x}\), \(s\), etc.)
- Use these statistics to estimate population parameters (\(\mu\), \(\sigma\), etc.)
- Quantify the uncertainty in our estimates
- Make valid inferences about the population
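The steps above can be sketched end to end for a single sample. This is a minimal illustration, not a definitive procedure: the data are synthetic, the variable names are assumptions, and the interval uses a normal approximation, which is reasonable for a sample of this size (a \(t\)-based interval would be the usual choice for small \(n\)).

```python
import math
import random
import statistics

rng = random.Random(7)

# Hypothetical sample of n = 200 measurements (synthetic stand-in for collected data).
sample = [rng.gauss(50, 8) for _ in range(200)]

n = len(sample)
x_bar = statistics.fmean(sample)    # point estimate of mu
s = statistics.stdev(sample)        # point estimate of sigma
se = s / math.sqrt(n)               # estimated standard error of x_bar

# Approximate 95% confidence interval for mu (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)   # about 1.96
ci = (x_bar - z * se, x_bar + z * se)

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, SE = {se:.2f}")
print(f"95% CI for mu: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval width, \(2 z \cdot s/\sqrt{n}\), is the quantified uncertainty: it narrows as the sample grows and widens as the data become more variable.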