Python Statistics Mean

















































Article No:2
                               MEAN METHOD
NOTE:
The statistics module is introduced in the previous article (Article No:1).

MEAN INTRODUCTION

The statistical mean, or average, describes the central tendency of a data set. It is found by adding all the data points in the population and dividing the total by the number of points. The resulting number is the mean.


                    Mean = sum of quantities / number of quantities

                    mean = (x1 + x2 + x3 + ... + xn) / n

                    {Here x1, x2, x3, ..., xn are the data points and n is the number of quantities}
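The formula above can be checked directly in Python. The following is a minimal sketch: manual_mean is a hypothetical helper written for this illustration (it is not part of the statistics module). It sums the data points, divides by their count, and the result is compared with statistics.mean.

```python
import statistics as st


def manual_mean(data):
    """Hypothetical helper: sum of quantities / number of quantities."""
    if not data:
        raise ValueError("mean requires at least one data point")
    return sum(data) / len(data)


values = [1, 2, 3, 4, 5]
print(manual_mean(values))  # 3.0 (plain division always gives a float)
print(st.mean(values))      # 3
```

Both calls agree on the value; the only difference is that plain division returns a float, while statistics.mean returns an int here because every input is an int and the result is a whole number.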

Sample Program:

           >>> import statistics as st

           >>> st.mean([1, 2, 3, 4, 5])
           3

Sensitivity to Outliers:

Outlier Definition:

In statistics, an outlier is a data point that differs significantly from the other observations. An outlier may be due to variability in the measurement, or it may indicate experimental error; outliers of the latter kind are sometimes excluded from the data set.

-> The mean is very sensitive to outliers.

Example program for sensitivity to outliers

                    >>> import statistics as st

                    >>> st.mean([1, 2, 3, 4, 50])
                    12

Here, replacing the 5th item (5) of [1, 2, 3, 4, 5] with 50 raises the mean sharply, from 3 to 12.

How to overcome the problem of sensitivity to outliers

Trimmed mean:

The trimmed mean is computed by dropping the 'k' most extreme elements from each end of the sorted data before averaging. {Note that the same number of elements must be dropped from both ends.}

Example on Trimmed Mean:  

                    >>> import statistics as st

                    >>> st.mean([2, 3, 4])
                    3

Here, with '50' as the last element, the mean of [1, 2, 3, 4, 50] is '12'. Dropping one element from each end (k = 1) leaves [2, 3, 4], whose mean is 3. Thus the trimmed mean reduces the effect of the outlier.
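The trimming step can also be written as a small function instead of dropping the elements by hand. The following is a sketch under stated assumptions: trimmed_mean and its parameter k are hypothetical names chosen for this illustration, and the statistics module does not provide such a function. The data is sorted first so that the extreme values sit at the two ends, then k elements are dropped from each end.

```python
import statistics as st


def trimmed_mean(data, k):
    """Hypothetical helper: drop the k smallest and k largest values, then average."""
    if k < 0 or 2 * k >= len(data):
        raise ValueError("k must be non-negative and leave at least one value")
    ordered = sorted(data)                  # extremes end up at both ends
    trimmed = ordered[k:len(ordered) - k]   # drop k elements from each end
    return st.mean(trimmed)


values = [1, 2, 3, 4, 50]
print(st.mean(values))          # 12
print(trimmed_mean(values, 1))  # 3
```

With k = 0 nothing is dropped and the function returns the ordinary mean, so k controls how much of each tail is ignored.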





