# Simple random sample

Given a random variable X, a simple random sample of size n is a set of n independent and identically distributed random variables, each of which follows the same distribution as X.

This is the formal definition. The concept can also be stated more simply, but to understand it properly it is important to define it precisely.

Since the formal definition is dense, we will go through each part of it step by step.

## The Simple Random Sample Concept Step by Step

First, a simple random sample is a sample. As such, it is obtained from a random variable, which we have called X. An example of a random variable could be the math grade of high school students. The first part of the definition is therefore clear: a simple random sample is a sample obtained from some random variable.

The second part of the definition is more complex, mainly because of the phrase "independent and identically distributed random variables". Random means governed by chance: since the sample has been obtained randomly, the variables that make it up are random. Independent means that the observations are not related to each other. That is, choosing a given observation does not depend on the observations chosen before it or after it. Finally, identically distributed means that every variable in the sample follows the same probability distribution, namely that of X.
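As an illustration of "independent and identically distributed", the following minimal sketch draws several observations from one fixed distribution. The grade distribution (a normal distribution with mean 6 and standard deviation 1.5, clipped to the range 0 to 10) and the function name `draw_iid_sample` are assumptions made up for this example; they are not part of the definition.

```python
import random


def draw_iid_sample(n, seed=None):
    """Draw n observations from one hypothetical math-grade distribution."""
    rng = random.Random(seed)
    # Identically distributed: every draw uses the same distribution.
    # Independent: no draw looks at the values drawn before it.
    # The parameters (mean 6, std 1.5, clipped to [0, 10]) are illustrative.
    return [min(10.0, max(0.0, rng.gauss(6.0, 1.5))) for _ in range(n)]


sample = draw_iid_sample(5, seed=42)
print(sample)
```

Each element of `sample` plays the role of one of the random variables in the definition: they all come from the same distribution, and none of them influences the others.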

In summary, a simple random sample is a sample that has been obtained in a completely random way. As a result, the observations that make up the sample are not related to each other, and they inherit the characteristics of the population random variable X.

## Why is the simple random sample concept so important?

When we want to conduct research on certain characteristics of a data set, the quality of the sample is essential. For the calculated metrics, and therefore the research conclusions, to be reliable, we must have what is known as a representative sample. That is, a sample that adequately represents the characteristics of the total population.

One of the main characteristics of a representative sample is that it is random. Understanding the concept of a simple random sample is therefore vital if our study is to be accepted as valid by the scientific community.

## Simple random sample example

Suppose we want to carry out a study on the monthly salaries of the citizens of a country. Our random variable will be the monthly salary of the citizens.

The need for a sample arises because it is impractical to ask each and every citizen of a country. That would take too much time or too many financial resources. So instead of asking 50 million people, we decide to ask 50,000.

Once we have defined the variable we are going to work with and the population, we have to obtain the sample. There is an extensive literature on how to obtain a correct sample. However, since the aim of this article is to present the concept in a simple way, we will not go into that matter.

Simplifying a lot, we generally have two options: either we ask citizens completely at random, or we choose them through a deliberate selection process. For the sample to meet the "random" criterion, we must select it completely at random. We cannot pick particular cities, zones, neighborhoods, or anything else.

If we choose the selection process consciously, then our sample will likely be biased. The correct thing to do would be to use a tool that randomly extracts the names of citizens.
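A tool of this kind can be sketched in a few lines. The example below is only an illustration: the citizen list, the sizes, and the helper name `simple_random_sample` are hypothetical, and it assumes a complete list of the population is available. It uses `random.sample` from the Python standard library, which draws without replacement, so nobody is selected twice. When the population is much larger than the sample, such draws behave approximately like the independent draws of the formal definition.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Pick n distinct members of the population, each equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement: no member appears twice.
    return rng.sample(population, n)


# Hypothetical register of citizens (identifiers are made up).
citizens = [f"citizen_{i}" for i in range(1_000)]
chosen = simple_random_sample(citizens, 50, seed=7)
print(chosen[:5])
```

Because the selection is left entirely to the random number generator, no city, zone, or neighborhood is favored by the person running the study.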

Once we have our simple random sample, we have to work with the data. That is, we perform statistical inference, which allows us to draw conclusions from the study. For example, statements such as "the average monthly salary in Spain is 1,200 euros" or "the 5% of citizens with the highest salaries earn the equivalent of the poorest 30%."

All of these conclusions come with a margin of error, which statistical inference itself takes care of quantifying.
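To make the idea of an estimate with a margin of error concrete, here is a minimal sketch. The salary data are simulated with made-up parameters (they are not real figures for any country), the helper name `mean_with_margin` is hypothetical, and the margin uses the common normal approximation with z = 1.96 for a confidence level of roughly 95%, which is one of several possible approaches.

```python
import math
import random
import statistics


def mean_with_margin(sample, z=1.96):
    """Return the sample mean and an approximate 95% margin of error."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return mean, z * std_err


# Simulated monthly salaries (illustrative parameters, not real data).
rng = random.Random(0)
salaries = [rng.lognormvariate(7.0, 0.5) for _ in range(5_000)]

mean, margin = mean_with_margin(salaries)
print(f"estimated mean salary: {mean:.0f} +/- {margin:.0f}")
```

The margin shrinks as the sample size grows, which is why a well-drawn sample of tens of thousands of people can say something useful about a population of millions.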