Measuring Data Spread: Explained with Standard Deviation and Variance

Measuring the spread of a set of observations is essential for understanding the dispersion within a data set. Suppose you have the statistics test results of a group of students. The mean score gives a rough picture of the class performance, but what if the scores are widely dispersed?
In that case the mean alone is not enough. Variance and standard deviation describe how far the observations are spread around the mean. By understanding these concepts and how to calculate them, statisticians and data analysts can analyze and interpret data more effectively.
In this article, we will explore the definitions, formulas, and solved examples of standard deviation and variance.

What are Standard Deviation and Variance?

Standard deviation and variance are fundamental statistical measures of the spread, or dispersion, of the observations within a data set. Both are built from the deviations of the observations from the expected value (the mean), so they indicate how much the observations differ from one another. Let’s discuss both measures.

Standard Deviation

In statistics, the standard deviation is a measure of dispersion that shows the typical amount by which each observation in a data set differs from the expected value. It is the square root of the variance. Its basic aim is to show how the observations are spread out around the expected value. A smaller standard deviation indicates that the observations are closely clustered around the expected value, while a larger one indicates that they are more widely dispersed. The steps below show how to evaluate the standard deviation.

Step 1: Take the given observations and find the expected value of the data set.

Step 2: Subtract the expected value from each observation of the data set.

Step 3: Square the results of step 2.

Step 4: Add all the squared terms and find their average.

Step 5: Take the square root of the result of step 4.

The result will be the standard deviation of the given set.
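The steps above can be sketched in Python as follows. This is only a minimal illustration: the function name `population_std` and the `scores` list are made up for this example, and the function divides by the number of observations, as the steps describe.

```python
import math

def population_std(observations):
    """Standard deviation computed step by step (dividing by n)."""
    n = len(observations)
    mean = sum(observations) / n                    # step 1: expected value
    deviations = [x - mean for x in observations]   # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]          # step 3: square each deviation
    mean_squared = sum(squared) / n                 # step 4: average of the squares
    return math.sqrt(mean_squared)                  # step 5: square root

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
print(population_std(scores))  # 2.0
```

The value computed in step 4, before the square root is taken, is the variance of the same data.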

Variance

Variance is the other statistical measure used to calculate the spread of data. It is the average squared deviation of the observations from the expected value: the sum of the squared differences divided by the number of observations.

It is the square of the standard deviation and gives a numerical representation of the variability within the dataset. Smaller variance values indicate close clustering around the expected value, while larger values indicate a wider distribution and more variability.

The steps below show how to evaluate the variance.

Step 1: Take the given observations and find the expected value of the data set.

Step 2: Subtract the expected value from each observation of the data set.

Step 3: Square the results of step 2.

Step 4: Add all the squared terms and find their average.

The result will be the variance of the given set.

Formulas of Standard Deviation and Variance

Below are the basic formulas for finding the variance and standard deviation.

Standard deviation

Sample data: s = √[∑(xi – x̅)² / (n – 1)]

Population data: σ = √[∑(xi – μ)² / n]

Variance

Sample data: s² = ∑(xi – x̅)² / (n – 1)

Population data: σ² = ∑(xi – μ)² / n

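In these formulas, xi is each observation, x̅ is the sample mean, μ is the population mean, and n is the number of observations. The sample formulas divide by n – 1 instead of n, so they give slightly larger values for the same data.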
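To see the difference between the sample and population formulas in practice, here is a short sketch using Python's standard `statistics` module. The module is not mentioned in this article and the `data` list is hypothetical, so treat this as one possible way to apply the formulas.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

pop_var = statistics.pvariance(data)    # population variance: divides by n
pop_sd = statistics.pstdev(data)        # population SD: square root of pop_var
sample_var = statistics.variance(data)  # sample variance: divides by n - 1
sample_sd = statistics.stdev(data)      # sample SD: square root of sample_var

print("population:", pop_var, pop_sd)
print("sample:    ", sample_var, sample_sd)
```

For this data the sum of squared deviations is 32, so the population variance is 32/8 = 4 and the sample variance is 32/7, which is a little larger.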

How Do Standard Deviation and Variance Help Measure the Spread of Data?

In statistical analysis, standard deviation and variance provide valuable insights into the distribution, variability, and reliability of data. They help with tasks such as comparing datasets, assessing the spread of data, identifying outliers, and making informed decisions based on the characteristics of the data.
Standard deviation and variance are closely related, but each has its own role in analyzing the variability of a dataset. Variance is expressed in squared units, while standard deviation is in the same units as the data, which makes it easier to interpret. Both are useful measures of spread because they indicate how much the data points deviate from the expected value.
Here are some key reasons why standard deviation and variance are useful:

Quantifying Variability

Standard deviation and variance quantify the dispersion, or variability, within a data set. Both measure how far the observations lie from the expected value, which gives a better understanding of the spread of the data.

Comparing Datasets

Comparing the variances and standard deviations of different datasets lets you assess their relative variability. Datasets with smaller standard deviations or variances are more tightly clustered and less diverse, while larger values indicate greater variability and wider spreads.
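As a small illustration of such a comparison, the sketch below uses two made-up sets of test scores that share the same mean of 70 but differ in spread. The class names and numbers are hypothetical, and the population standard deviation from Python's `statistics` module is used.

```python
import statistics

# Hypothetical test scores for two classes with the same mean (70).
class_a = [68, 69, 70, 71, 72]
class_b = [50, 60, 70, 80, 90]

sd_a = statistics.pstdev(class_a)  # population SD of class A
sd_b = statistics.pstdev(class_b)  # population SD of class B

print(f"class A SD: {sd_a:.2f}")  # 1.41
print(f"class B SD: {sd_b:.2f}")  # 14.14
```

The means are identical, so only the standard deviations reveal that class B's scores are spread about ten times more widely than class A's.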

Outlier Detection

Outliers in a dataset can be identified with the help of the standard deviation. Outliers are data points that deviate significantly from the rest of the data. Values that lie several standard deviations away from the expected value are potential outliers that may require further investigation.
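One simple way to apply this idea is to flag every value whose distance from the mean exceeds a chosen number of standard deviations. In the sketch below, the function name `flag_outliers`, the default threshold of 2, and the `readings` data are all assumptions made for illustration; the appropriate threshold depends on the data.

```python
import statistics

def flag_outliers(observations, threshold=2.0):
    """Return the observations more than `threshold` SDs from the mean."""
    mean = statistics.mean(observations)
    sd = statistics.pstdev(observations)
    if sd == 0:
        return []  # all values are identical, so nothing stands out
    return [x for x in observations if abs(x - mean) / sd > threshold]

readings = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]  # hypothetical data
print(flag_outliers(readings))  # [50]
```

Keep in mind that an extreme value also inflates the standard deviation itself, so in a small dataset an outlier may sit fewer standard deviations from the mean than expected.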

Decision Making

Measures of spread have many applications in fields such as quality control, finance, and scientific research. Standard deviation and variance provide a basis for informed decision-making, risk assessment, and choosing an appropriate strategy.

Examples of Standard Deviation and Variance

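Here are two worked examples of standard deviation and variance. Both treat the data as a complete population, so the sum of squared deviations is divided by n.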

Example of Standard Deviation 

What is the standard deviation of the monthly sales data of a company for the past year, given the sales values [100, 120, 80, 140, 90, 135, 95, 110, 115, 85, 120, 75]?

Solution

Step 1: Find the expected value of the monthly sales of the company.

Monthly sales = 100, 120, 80, 140, 90, 135, 95, 110, 115, 85, 120, 75

Expected value = [100 + 120 + 80 + 140 + 90 + 135 + 95 + 110 + 115 + 85 + 120 + 75] / 12

Expected value = 1265/12

Expected value = 105.42

Step 2: Calculate the deviation of each monthly sale from the expected value

Deviation = [100 – 105.42, 120 – 105.42, 80 – 105.42, 140 – 105.42, 90 – 105.42, 135 – 105.42, 95 – 105.42, 110 – 105.42, 115 – 105.42, 85 – 105.42, 120 – 105.42, 75 – 105.42]

Deviation = [-5.42, 14.58, -25.42, 34.58, -15.42, 29.58, -10.42, 4.58, 9.58, -20.42, 14.58, -30.42]

Step 3: Now take the square of each deviation.

Square of Deviation = [(-5.42)², (14.58)², (-25.42)², (34.58)², (-15.42)², (29.58)², (-10.42)², (4.58)², (9.58)², (-20.42)², (14.58)², (-30.42)²]

Squared Deviation = [29.38, 212.58, 646.18, 1195.78, 237.78, 874.98, 108.58, 20.98, 91.78, 416.98, 212.58, 925.38]

Step 4: Now take the average of the squared deviations.

Sum of Squared Deviation = [29.38 + 212.58 + 646.18 + 1195.78 + 237.78 + 874.98 + 108.58 + 20.98 + 91.78 + 416.98 + 212.58 + 925.38]

Sum of Squared Deviation = 4972.96

Average of Squared Deviation = 4972.96/12

Average of Squared Deviation = 414.41

Step 5: To evaluate the standard deviation, take the square root of the average squared deviation.

Standard deviation = √414.41

Standard deviation ≈ 20.36

So the standard deviation of the monthly sales is about 20.36.
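The hand calculation can be cross-checked with a few lines of Python. This sketch uses the standard `statistics` module, which is not part of the original example. `pstdev` divides by n, matching the steps above, while `stdev` shows what the sample formula would give for the same data.

```python
import statistics

sales = [100, 120, 80, 140, 90, 135, 95, 110, 115, 85, 120, 75]

population_sd = statistics.pstdev(sales)  # divides by n, as in the steps above
sample_sd = statistics.stdev(sales)       # divides by n - 1

print(round(population_sd, 2))  # 20.36
print(round(sample_sd, 2))      # 21.26
```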

To avoid these lengthy calculations, you can also use an online standard deviation calculator.

Example of Variance

What is the variance of the daily temperatures in degrees Celsius recorded in a city over the course of two weeks, given the data set [34, 35, 36, 33, 32, 34, 38, 37, 35, 33, 34, 39, 40, 41]?

Solution

Step 1: Find the expected value of the city’s temperatures over the two weeks.

City’s temperatures = 34, 35, 36, 33, 32, 34, 38, 37, 35, 33, 34, 39, 40, 41

Expected value = [34 + 35 + 36 + 33 + 32 + 34 + 38 + 37 + 35 + 33 + 34 + 39 + 40 + 41] / 14

Expected value = 501/14

Expected value = 35.79

Step 2: Calculate the deviation of each day's temperature from the expected value

Deviation = [34 – 35.79, 35 – 35.79, 36 – 35.79, 33 – 35.79, 32 – 35.79, 34 – 35.79, 38 – 35.79, 37 – 35.79, 35 – 35.79, 33 – 35.79, 34 – 35.79, 39 – 35.79, 40 – 35.79, 41 – 35.79]

Deviation = [-1.79, -0.79, 0.21, -2.79, -3.79, -1.79, 2.21, 1.21, -0.79, -2.79, -1.79, 3.21, 4.21, 5.21]

Step 3: Now take the square of each deviation.

Square of Deviation = [(-1.79)², (-0.79)², (0.21)², (-2.79)², (-3.79)², (-1.79)², (2.21)², (1.21)², (-0.79)², (-2.79)², (-1.79)², (3.21)², (4.21)², (5.21)²]

Squared Deviation = [3.20, 0.62, 0.04, 7.78, 14.36, 3.20, 4.88, 1.46, 0.62, 7.78, 3.20, 10.30, 17.72, 27.14]

Step 4: Now take the average of the squared deviations.

Sum of Squared Deviation = [3.20 + 0.62 + 0.04 + 7.78 + 14.36 + 3.20 + 4.88 + 1.46 + 0.62 + 7.78 + 3.20 + 10.30 + 17.72 + 27.14]

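Sum of Squared Deviation = 102.30

Average of Squared Deviation = 102.30/14

Average of Squared Deviation ≈ 7.31

The variance of the daily temperatures is therefore about 7.31 °C² (7.311 if the deviations are not rounded along the way).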
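As a cross-check, the same variance can be computed with Python's standard `statistics` module. The module is not part of the original example; `pvariance` divides by n like the steps above, and `variance` shows the sample version for comparison.

```python
import statistics

temperatures = [34, 35, 36, 33, 32, 34, 38, 37, 35, 33, 34, 39, 40, 41]

population_variance = statistics.pvariance(temperatures)  # divides by n
sample_variance = statistics.variance(temperatures)       # divides by n - 1

print(round(population_variance, 2))  # 7.31
print(round(sample_variance, 2))      # 7.87
```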

Alternatively, an online variance calculator can be used to find the variance of the given data values and avoid the lengthy calculations above.

Conclusion

In this post, we covered the basics of measuring the spread of data with standard deviation and variance, including their definitions, formulas, and worked examples. With these, you can calculate and interpret both measures for your own data.