Traditional Statistics
In my last post I discussed the value of data and how that value is
extracted by processing raw data and packaging it, but what does this
mean? Data in its purest form is just points of information. Without
context or analysis it doesn’t necessarily serve much use. One way in
which human beings derive value from data is through statistics.
Traditional statistics is a tool for understanding and
interpreting data. There are many ways in which this tool can be applied to
data, with two main branches: descriptive and inferential. Both are
ways to extract meaning from data but are used in different ways for different purposes.
Descriptive Statistics
Descriptive statistics are used to describe or summarise
data in a meaningful way. They help us visualise what the data is showing and
make it easier to recognise patterns that may emerge from the data. Descriptive
statistics are applied to a sample and measure chosen properties of that sample
to present the data in different ways.
For example, if we had 100 people and their annual incomes,
we could apply descriptive statistics to present this data in a meaningful way.
We could view the mean, median, and mode of this data to derive the typical income
from the sample:
- Mean – By adding up the annual income of all 100 people in the sample and dividing by 100 we get the average annual income of the sample.
- Median – If we order the sample from highest to lowest income and take the middle value (with 100 people, the average of the two middle values), then half the people in the sample earn less than this annual income and half earn more.
- Mode – By counting how often each annual income figure appears in the sample we can see the most common level of income within the sample.
These are all examples of measures of central tendency. Each
way of presenting the sample can tell us different things about it or,
more cynically, can be used to push different ideas based on the same sample of
data.
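To make the three measures concrete, here is a minimal sketch using Python's built-in `statistics` module. The income figures are made up purely for illustration (a handful of values rather than the full 100), and the variable names are my own.

```python
import statistics

# Hypothetical annual incomes, invented for this example only.
incomes = [22000, 25000, 25000, 31000, 40000, 48000, 120000]

mean_income = statistics.mean(incomes)      # sum of incomes divided by the count
median_income = statistics.median(incomes)  # middle value once the list is sorted
mode_income = statistics.mode(incomes)      # most frequently occurring value

print(f"Mean:   {mean_income:.2f}")
print(f"Median: {median_income}")
print(f"Mode:   {mode_income}")
```

In this made-up sample the single very high income pulls the mean well above the median, which is one way the same data can appear to tell different stories depending on which measure is reported.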
Results of this kind of statistics are typically presented through
graphs, tables, and statistical commentaries to present a certain view of the
data. But that’s all it can do: describe the specific set of data. It
cannot draw conclusions beyond the sample that was analysed.
This is the key difference between descriptive statistics and inferential
statistics.
Inferential Statistics
Inferential statistics are used to make wider estimations
about the population a sample is drawn from, based on the
measured properties of that sample. This kind of statistics is used when studying
populations that would be impossible or impractical to measure in full. Because
the sample is being used to make generalisations about the population it is drawn
from, it is essential that the sample is representative of that population;
otherwise the results will only reflect the sample. A larger sample size will help reduce the natural
variation that occurs when sampling a population, but there will always be some sampling
error, so it is important to keep this in mind when drawing a conclusion
using inferential statistics.
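The effect of sample size can be seen with a small simulation. This is only a sketch under stated assumptions: the population of 1,000 incomes is randomly generated (the log-normal shape and its parameters are my own choice, not real data), and `spread_of_sample_means` is a hypothetical helper written for this example.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 1,000 incomes (assumed distribution, not real data).
population = [rng.lognormvariate(10.3, 0.5) for _ in range(1000)]

def spread_of_sample_means(population, sample_size, repeats=500):
    """Draw many random samples and return the standard deviation of their means."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(population, 10)
spread_large = spread_of_sample_means(population, 100)

print(f"Spread of sample means, n=10:  {spread_small:.0f}")
print(f"Spread of sample means, n=100: {spread_large:.0f}")
```

The means of the larger samples cluster more tightly around the population mean, but the spread does not reach zero, which is the sampling error that remains however careful the sampling is.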
Using our example from before, suppose we take the same 100
people and treat them as a sample of a wider population, for example a village
of 1000 people. Using inferential statistics, we can use the data gathered from
this sample to make inferences about the population of the village. Unlike
with descriptive statistics, you could not simply look at the data within this sample
and say the average income of the population is exactly that of the sample, because of natural
sampling error. But if you had a hypothesis about the average income of the population
before the test, you could use the sample as evidence for or against that
hypothesis. Likewise, if you had two samples from the
village, one receiving a financial benefit scheme and a control sample not
receiving it, you could make inferential estimations about the effect of the
scheme based on the results drawn from the two samples.
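As a rough illustration of the one-sample case, the sketch below estimates a 95% confidence interval for the village's average income and tests a hypothesised average against the sample. It rests on several assumptions of mine: the 100 incomes are randomly generated rather than real, the hypothesised figure of 32,000 is invented, and a normal approximation (a z-test) is used because the sample is reasonably large. A t-test or a finite population correction might be more appropriate in practice. The function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error

def z_test(sample, hypothesised_mean):
    """Two-sided z-test of the sample mean against a hypothesised population mean."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    z = (statistics.mean(sample) - hypothesised_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample of 100 incomes (generated, not real data).
rng = random.Random(0)
sample = [round(rng.gauss(30000, 8000)) for _ in range(100)]

lower, upper = confidence_interval(sample)
z, p_value = z_test(sample, hypothesised_mean=32000)  # assumed hypothesis

print(f"Sample mean: {statistics.mean(sample):.0f}")
print(f"95% confidence interval: {lower:.0f} to {upper:.0f}")
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value would count as evidence against the hypothesised average, while a large one would mean the sample is consistent with it. Either way the conclusion is an estimate about the village, not a certainty.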
Descriptive statistics can be used to draw meaning from data
sets but are limited in scope to the data set itself, whereas inferential statistics
can make wider estimations but will always be subject to sampling error. Each is
a tool with a specific purpose and has its uses and drawbacks.