Traditional Statistics

 

In my last post I discussed the value of data and how that value is extracted by processing raw data and packaging it, but what does this mean? Data in its purest form is just points of information. Without context or analysis, it doesn’t necessarily serve much use. One way in which human beings derive value from data is through statistics.

Traditional statistics is a tool for understanding and interpreting data. There are many ways in which this tool can be applied to data, with two main branches: descriptive and inferential. Both are ways to extract meaning from data but are used in different ways for different purposes.

Descriptive Statistics

Descriptive statistics are used to describe or summarise data in a meaningful way. They help us visualise what the data is showing and make it easier to recognise patterns that may emerge from the data. Descriptive statistics are applied to a sample and measure chosen properties of that sample to present the data in different ways.

For example, if we had 100 people and their annual incomes, we could apply descriptive statistics to present this data in a meaningful way. We could view the mean, median, and mode of this data to derive the typical income from the sample:

  • Mean – By adding the annual incomes of all 100 people in the sample and dividing by 100, we get the average annual income of the sample, its central point.
  • Median – If we order the sample from lowest to highest income and take the middle value, half the people in the sample earn less than this figure and half earn more. With an even number of people, such as 100, the median is the average of the two middle values.
  • Mode – By counting how often each annual income figure appears in the sample, we can find the most common level of income within the sample.
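The three measures above can be sketched in a few lines of Python using the standard library's statistics module. The income figures here are made up for illustration (and there are only ten of them rather than 100), with one very high earner included on purpose to show how the measures can disagree.

```python
import statistics

# Hypothetical annual incomes (made-up figures, not real data).
# One very high earner is included to show how the measures can differ.
incomes = [18000, 22000, 22000, 25000, 27000,
           30000, 31000, 35000, 40000, 150000]

mean_income = statistics.mean(incomes)      # sum of incomes / number of people
median_income = statistics.median(incomes)  # middle value of the sorted incomes
mode_income = statistics.mode(incomes)      # most frequently occurring income

print(f"Mean:   {mean_income}")
print(f"Median: {median_income}")
print(f"Mode:   {mode_income}")
```

In this made-up sample the single high earner pulls the mean up to 40000, while the median (28500) and the mode (22000) stay closer to what most people in the sample earn.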

These are all examples of measures of central tendency. Each measure can tell us different things about the sample or, more cynically, can be used to push different ideas based on the same sample of data.

Results of this kind of statistics are typically presented through graphs, tables, and statistical commentaries to present a certain view of the data. But that’s all it can do: make a judgement about the specific set of data. It cannot draw conclusions beyond the data sample that was analysed. This is the key difference between descriptive statistics and inferential statistics.

Inferential Statistics

Inferential statistics are used to make a wider estimation about the population a sample is drawn from, based on the measured properties of that sample. This kind of statistics is used when studying populations that would be impossible or impractical to measure in full. Because the sample is being used to make generalisations about the population it is drawn from, it is essential that the sample is representative of the population, otherwise the results will only reflect the sample. A larger sample size will help reduce the natural variation that occurs when sampling a population, but there will always be some sampling error, so it is important to keep this in mind when drawing a conclusion using inferential statistics.
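The effect of sample size on natural sampling variation can be illustrated with a small simulation. This is only a sketch under assumptions: the population of 1000 incomes is randomly generated (not real data), and the helper name spread_of_sample_means is my own. It repeatedly draws random samples of a given size and measures how much the sample means differ from one draw to the next.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1000 made-up annual incomes.
population = [rng.gauss(30000, 8000) for _ in range(1000)]


def spread_of_sample_means(sample_size, repeats=200):
    """Draw many random samples and measure how much their means vary."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(100)

print(f"Spread of sample means, n=10:  {spread_small:.0f}")
print(f"Spread of sample means, n=100: {spread_large:.0f}")
```

The means of the larger samples cluster more tightly around the population mean, but the spread never reaches zero unless the whole population is measured.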

Using our example from before, suppose the same 100 people are a sample of a wider population, for example a village of 1000 people. Using inferential statistics, we can use the data gathered from this sample to make inferences about the population of the village. Unlike with descriptive statistics, you could not simply view the data within this sample and say the average income of the population is exactly that of the sample, because of natural sampling error. But if you had a hypothesis about the average income of the population before the test, you could use the sample as evidence for or against the hypothesis using inferential statistics. Similarly, if you had two samples from the village, one receiving a financial benefit scheme and a comparison sample not receiving it, you could make inferential estimations about the scheme based on the results drawn from the two samples.
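Both ideas in this example can be sketched as simple tests in Python. This is one possible approach rather than the only one: it uses a normal approximation (a z-test), which is reasonable for samples of around 100 people, and a statistician might prefer a t-test or another method. The function names, the simulated village, and the hypothesised income of 30000 are all assumptions made for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def z_test_one_sample(sample, hypothesised_mean):
    """Compare a sample mean with a hypothesised population mean."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    z = (statistics.mean(sample) - hypothesised_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value


def z_test_two_samples(sample_a, sample_b):
    """Compare the means of two independent samples."""
    standard_error = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    z = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value


# Hypothetical village of 1000 made-up incomes, and a random sample of 100.
rng = random.Random(1)
village = [rng.gauss(30000, 8000) for _ in range(1000)]
sample = rng.sample(village, 100)

# Assumed hypothesis: the village's average income is 30000.
z, p_value = z_test_one_sample(sample, hypothesised_mean=30000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value counts as evidence against the hypothesis, while a large one means the sample is consistent with it. Neither result proves anything about the village with certainty, because the sample is still subject to sampling error.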

 

Descriptive statistics can be used to draw meaning from data sets but are limited in scope to those data sets, whereas inferential statistics can make wider estimations but will always be subject to sampling error. Each is a tool with a specific purpose, with its own uses and drawbacks.
