The branch of mathematics that deals with the analysis and manipulation of data and numbers is called statistics. Its core functions are collection, analysis, interpretation, presentation, and organization of data. The concepts of statistics have applications across multiple disciplines such as business and data analysis, banking and finance, business management and development, etc.
Fundamentals of Statistics
Measures of central tendency and measures of dispersion are among the basics of statistics. There are three measures of central tendency (mean, median, and mode) and two measures of dispersion (variance and standard deviation).
- Central Tendencies: The process of measuring the central tendency involves identifying the central point within a certain set of data. The three ways of finding the midpoint are:
- Mean: The most popular measure of central tendency is the mean. It is denoted by $\bar{x}$ and is calculated by adding up all the values in a given set of data and dividing the sum by the number of values in that set. To derive the formula, take a set of data whose elements are $x_1, x_2, x_3, x_4, \ldots, x_n$, where $n$ is the number of elements.
$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + \cdots + x_n}{n}$
$\bar{x} = \frac{\sum x}{n}$
*NOTE: Here, ‘$\Sigma$’ is a Greek capital letter, pronounced sigma, which means the sum of the given entities.
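The calculation of the mean can be sketched in Python. This is a minimal illustration rather than a definitive implementation; the function name `mean` and the sample data are assumptions made for the example.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical data set, chosen only for illustration.
data = [4, 8, 15, 16, 23, 42]
print(mean(data))  # 18.0
```

The sum of the six values is 108, and dividing by n = 6 gives 18.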
- Median: The median is the middle element of a set of data that is arranged in ascending order of magnitude. It is denoted by ‘M’. To find the median, first arrange the data in ascending order. There are then two cases:
- If the number of elements is odd, the middle element is the median. The formula for the median becomes
$M = \left(\frac{n+1}{2}\right)^{\text{th}}$ term.
For example, in the set of the first seven natural numbers, i.e., 1, 2, 3, 4, 5, 6, 7, the element ‘4’ is the median of the set: here $n = 7$, so the median is the $\frac{7+1}{2} = 4$th term.
- If the number of elements is even, the average of the middle two elements is taken as the median. The formula for the median becomes
$M = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2} + 1\right)^{\text{th}} \text{ term}}{2}$
For example, in the set of the first 10 even numbers, i.e., 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, the average of ‘10’ and ‘12’ will be the median.
Here, the median is $\frac{10 + 12}{2} = 11$.
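The odd and even cases can be sketched in Python as follows. This is an illustrative sketch under the definitions above; the function name `median` is an assumption, and the index arithmetic converts the 1-based term positions in the formulas to Python's 0-based list indices.

```python
def median(values):
    """Middle element of the sorted data, or the average of the two middle ones."""
    ordered = sorted(values)  # the data must be in ascending order first
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th term, which is index mid when counting from 0.
        return ordered[mid]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th terms.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 2, 3, 4, 5, 6, 7]))                  # 4
print(median([2, 4, 6, 8, 10, 12, 14, 16, 18, 20]))   # 11.0
```

Both calls reproduce the worked examples: the first seven natural numbers and the first 10 even numbers.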
- Mode: The element that has the highest frequency in a set of data is known as the mode or the modal value. Let’s understand this concept with the help of an example.
Set A = {2, 2, 2, 3, 7, 7, 9, 11}
The mode of this set is 2 because it occurs more often than any other element (three times).
*NOTE: There can be more than one mode in a set of data. For two and three modes in a set of data, the terms bimodal and trimodal are used. For more than three modes, the term multimodal is used.
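Finding the mode, including the case of several modes, can be sketched in Python. This is a minimal illustration; the function name `modes` and the second data set are assumptions made for the example. It returns a list so that bimodal and multimodal sets are handled the same way as sets with a single mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in ascending order."""
    counts = Counter(values)  # maps each value to its frequency
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([2, 2, 2, 3, 7, 7, 9, 11]))  # [2]
print(modes([1, 1, 2, 2, 3]))            # [1, 2]  (a bimodal set)
```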
- Dispersions: The concept of dispersion lets us determine how widely a set of data is spread out or how tightly it is clustered. It helps us interpret the variability of the given data. There are two major measures of dispersion: variance and standard deviation. The two are linked through powers and roots: the variance is the square of the standard deviation, and the standard deviation is the square root of the variance. Standard deviation measures the dispersion of a group of elements from the mean. It is denoted by ‘$\sigma$’. The variance shows how far the elements of a set of data lie from its mean, based on their squared distances. Since it is the square of the standard deviation, it is denoted by ‘$\sigma^2$’. Let’s study the formulas of dispersion.
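As a sketch, assuming the population definitions (which divide by $n$), the variance and standard deviation are commonly written as shown here. The sample versions divide by $n - 1$ instead.

```latex
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2
\qquad
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}
```

The symbols are: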
n = number of elements
$x_i$ = $i$th element
$\bar{x}$ = arithmetic mean
*NOTE: Here, $\sum_{i=1}^{n}$ denotes the sum of the terms that follow it, with $i$ ranging from 1 to $n$, so that every element $x_i$ of the set is included.
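Variance and standard deviation can be sketched in Python as follows. The sketch assumes the population forms, which divide by n; the function names and the data set are hypothetical and chosen only for illustration.

```python
import math


def population_variance(values):
    """Average of the squared distances of the elements from their mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance of an empty data set is undefined")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))  # 4.0
print(population_std_dev(data))   # 2.0
```

Python's standard `statistics` module offers `pvariance` and `pstdev` for these population forms, and `variance` and `stdev` for the sample forms that divide by n - 1.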