Nowadays we find data, both quantitative and qualitative, everywhere, generated from various surveys and experiments. These data, which are primarily raw in nature, are seldom useful to an individual, society or decision-maker unless they are converted into information from which meaningful conclusions can be drawn. Knowledge of statistics plays a vital role in converting raw data into useful information, which is valuable for improving the decision-making process and, ultimately, for achieving the intended goals. There is hardly any discipline that does not use statistics for its growth and development. Individuals, governments, organizations and businesses all collect data to help them track progress, measure performance, analyze problems and prioritize avenues of growth. Statistics is not just numbers and facts; it is a body of knowledge and procedures that allows us to learn from data more effectively. Statistical knowledge helps us to use proper methods to collect data, employ correct analytical techniques and present the results effectively. More specifically, it helps to present facts in a definite form, simplify masses of data, facilitate comparison of data, formulate and test hypotheses, and help in predicting the future. Hence, even if statistics is not one’s primary field of study, it helps one to make an impact in one’s chosen field. The chances are very high that one will need a working knowledge of statistics to produce new findings in one’s own field and to understand the work of others. The world today produces more data than ever before, so a sound knowledge of statistics is required more than ever. Understanding statistics is therefore highly important these days, irrespective of one’s major discipline.
That is why there is, and will continue to be, a high demand for statistical skills in a wide variety of settings (universities, research labs, government, industry, etc.) and in a variety of disciplines (biological sciences, engineering sciences and social sciences). This book, written with the basic objective of infusing the knowledge of statistics into the backbone of every discipline one may care to mention, is expected to provide the basics and the applications of statistics for those who require them.
Dr. M. Thirunavukkarasu is the former Professor and Head, Dept. of Livestock Business Management, Madras Veterinary College, Tamil Nadu Veterinary and Animal Sciences University (TANUVAS), Chennai, Tamil Nadu (India). He graduated from Madras Veterinary College in 1986, completed his postgraduate studies in Agricultural Economics at Tamil Nadu Agricultural University, Coimbatore (1987-90), and earned his doctorate in Animal Husbandry Economics at TANUVAS, Chennai (1993-96). He joined TANUVAS, Chennai in 1989 as an Assistant Professor and was promoted to Associate Professor in 1997. He was elevated to Professor of Animal Husbandry Statistics and Computer Applications in 2000. In recognition of his achievements in education, research, extension, administration and institution building, he was promoted to Professor (Higher Grade) in 2011. He also held the posts of Controller of Examinations, TANUVAS (2012-15) and Dean, Veterinary College and Research Institute, Tirunelveli (2016-18), and served as Registrar, TANUVAS for a brief period from April to August 2018. He underwent advanced training programmes on Livestock Economics and Planning and on Quantitative Methods in Livestock Health and Production at the University of Reading, UK, in 1998, and training on Instructional Technology (eLearning guidelines, standards and protocols) at Michigan State University, USA, in 2009. He has also visited Uganda and Malawi for research and consultancy work. He has published 117 research papers in international and national journals, organised 3 international conferences, presented more than 65 research papers (including 5 lead/invited papers) at international and national conferences and seminars, edited 11 books and authored 18 books, teaching manuals and booklets.
He has received 19 awards, including the Best PG Student Award, Best Teacher Award, Best Scientist Award and Lifetime Achievement Award, in a professional career spanning more than 33 years. He has implemented 21 research projects funded by national and international agencies including ICAR, NAIP, FAO, USDA, GoI, GoTN, AWBI, etc. He has started four new postgraduate programmes: M.V.Sc., M.F.Sc. and M.Sc. in Biostatistics (2011-12) and PG Diploma in Animal Health Economics (2019-20). He has contributed significantly in the areas of Livestock Economics, Animal Health Economics, Biostatistics, IT Applications in Animal Sciences and Development of e-Courses for Veterinary Education.
1. INTRODUCTION
1.1 History of Statistics
1.2 Definition of Statistics
1.3 Functions of Statistics
1.4 Applications of Statistics
1.5 Classification of Statistics
1.6 Concepts in Statistics
1.7 Types of Data
2. COLLECTION, CLASSIFICATION AND TABULATION OF DATA
2.1 Collection of Data
2.1.1 Primary and Secondary Data
2.2 Classification of Data
2.2.1 Objectives (uses) of Classification
2.2.2 Types of Classification
2.3 Tabulation of Data
2.3.1 Objectives of Tabulation
2.3.2 Parts of a Table
2.3.3 General Structure of a Table
2.3.4 Types of Tables
2.3.5 Essentials of a Good Table
2.4 Classification of Data according to Class Interval (Frequency Distribution)
2.4.1 Formation of Frequency Distribution
2.4.2 Advantages of Presenting Data on a Table of Frequency Distribution
2.4.3 Relative Frequency Distribution
3. GRAPHICAL AND DIAGRAMMATIC PRESENTATION OF DATA
3.1 Types of Graphs
3.2 Types of Diagrams
3.3 Advantages and Limitations of Diagrams and Graphs
3.4 Graphical Presentation of Data
3.4.1 Description of Types of Graphs
3.5 Diagrammatic Presentation of Data
3.5.1 Description of Types of Diagrams
3.5.2 Limitations of Diagrammatic Presentation
4. MEASURES OF CENTRAL LOCATION / TENDENCY (AVERAGES)
4.1 Objectives of Calculating Averages
4.2 Types of Averages
4.2.1 Arithmetic Mean (AM) or Mean or Average
4.2.2 Geometric Mean (GM)
4.2.3 Harmonic Mean (HM)
4.2.4 Median
4.2.5 Mode
4.3 Choice of an Average
4.4 Situations where different Averages are used
4.5 Common Properties of Means
4.6 Exercises
5. MEASURES OF VARIABILITY (MEASURES OF DISPERSION)
5.1 Dispersion – Meaning and Measures
5.1.1 Absolute and Relative Dispersion
5.1.2 Different Measures of Dispersion
5.2 Range
5.3 Quartile Deviation (QD)
5.4 Mean Deviation (MD)
5.5 Standard Deviation (SD)
5.6 Variance
5.7 Coefficient of Variation (CV)
5.8 Standard Error (SE)
5.9 Degrees of Freedom
5.10 Exercises
5.11 Merits and Demerits of Dispersion Measures
5.12 Choice of Dispersion Measure
5.13 Skewness
5.13.1 Measures of Skewness
5.13.2 Exercises
5.14 Moments
5.15 Kurtosis
5.15.1 Measure of Kurtosis
5.16 Curves of Distribution
5.17 Commonly used Descriptive Statistics and Graphs
6. SAMPLING AND SAMPLING METHODS
6.1 Advantages and Disadvantages of Census and Sampling Methods
6.2 Need for Sampling
6.3 Types of Sampling Methods
6.3.1 Random (Probability) Sampling Methods
6.3.2 Non-random or Non-probability Sampling Methods
6.3.3 Differences between Probability Sampling and Non-Probability Sampling
6.4 Sampling Errors
7. PROBABILITY THEORY AND PROBABILITY DISTRIBUTION
7.1 Basic Concepts in Probability
7.2 Basic Laws of Probability
7.3 Probability Notations
7.4 Approaches to the Study of Probability
7.4.1 Classical or Mathematical or a Priori Probability Approach
7.4.2 Relative Frequency Approach
7.4.3 Subjective Approach
7.5 Theorems of Probability
7.5.1 Addition or Total Probability Theorem
7.5.2 Multiplication or Compound Probability Theorem
7.6 Conditional Probability
7.7 Probability Distribution
7.7.1 Discrete Probability Distribution
7.7.2 Continuous Probability Distribution
7.8 Probability of Compound Events
7.9 Exercises
7.10 Types of Probability Distributions
7.10.1 Binomial Distribution
7.10.2 Poisson Distribution
7.10.3 Normal Distribution
7.11 Exercises
8. TEST OF HYPOTHESIS AND TESTS OF SIGNIFICANCE
8.1 Procedure for Carrying out a Test of Significance
8.2 Different Tests of Significance
8.3 Hypothesis Testing
8.3.1 Critical Region or Rejection Region
8.3.2 Calculating Suitable Statistic
8.3.3 Choice of Probability Distribution
8.3.4 Reaching a Conclusion
8.3.5 One Sided and Two Sided Tests
8.3.6 Hypothesis Testing - Univariate Case (One Sample)
8.3.7 Test of Hypothesis - Z Test - Large Sample Test
8.3.8 Test of Hypothesis - t Test - Small Sample Test
8.3.9 Test of Hypothesis - χ² Test (Chi-Square Test of Significance)
8.3.10 Test of Hypothesis - F Test or Variance-Ratio Test
8.4 Non-Parametric Tests
9. DESIGN OF EXPERIMENTS (EXPERIMENTAL DESIGNS)
9.1 Concepts
9.2 Basic Principles of Experimental Designs/Principles of a Good Experiment
9.3 Analysis of Variance (ANOVA) – An Explanation
9.4 Steps in Design of Experiments
9.5 Criteria for Making Blocks
9.6 Major Experimental Designs
9.7 Completely Randomized Design (CRD) - Analysis of Variance (ANOVA)
9.7.1 Stepwise Procedure for a CRD
9.7.2 Exercises
9.8 Randomised (Complete) Block Design (RBD or RCBD) – Two Way ANOVA
9.8.1 Blocking Technique
9.8.2 Randomization
9.8.3 Stepwise Procedure for a RBD
9.8.4 Advantages of RBD
9.8.5 Disadvantages of RBD
9.8.6 Exercise
9.9 Latin Square Design (LSD) – Three-way ANOVA
9.9.1 Layout of LSD
9.9.2 Analysis and Interpretation of LSD Results
9.9.3 Exercise
9.9.4 Advantage of LSD
9.9.5 Disadvantages of LSD
9.10 Missing Plot Technique
9.10.1 Example of Data with One Missing Observation
9.10.2 Steps in Computation of ANOVA and Comparisons of Treatment Means
9.11 Repeated Measures Design (RMD) – A Note
9.12 Factorial Experiments
9.12.1 Terminologies used
9.12.2 Types of Factorial Experiments
9.12.3 Notations
9.12.4 Simple and Main Effects
9.12.5 Interaction
9.12.6 Computation of Main Effects and Interactions
9.12.7 Layout of Factorial Experiments
9.12.8 Analysis and Interpretation of Factorial Experiments
9.12.9 Exercise
9.12.10 Advantages of Factorial Experiments
9.12.11 Disadvantages of Factorial Experiments
9.13 Analysis of Covariance (ANACOVA)
9.13.1 Exercise
9.14 Split-Plot Design
9.14.1 Example
9.14.2 Advantages of Split-plot Designs
9.14.3 Disadvantages of Split-plot Designs
10. CORRELATION
10.1 Meaning of Correlation
10.2 Coefficient of Correlation
10.3 Real and Spurious Correlation
10.4 Types of Correlation
10.4.1 Simple, Partial and Multiple Correlations
10.4.2 Linear and Non-linear Correlations
10.5 Methods of Studying Correlation
10.5.1 Scatter Diagram
10.5.2 Correlation Graph
10.5.3 Karl Pearson’s Correlation Coefficient
10.5.4 Concurrent Deviation Method
10.5.5 Spearman’s Rank Correlation Method
10.5.6 Measuring Correlation from Covariance
10.5.7 Probable Error of Correlation Coefficient
10.5.8 Coefficient of Determination
11. REGRESSION
11.1 Objectives of Regression Analysis
11.2 Difference between Correlation and Regression
11.3 Types of Regression Analysis
11.4 Regression Line and Linear Regression
11.5 Properties of Regression Coefficient (b)
11.6 Coefficient of Determination (R²)
11.7 Multiple Linear Regression
11.7.1 Assumptions in a Multiple Linear Regression
11.7.2 Multiple Linear Regression Formula
11.7.3 An Example of Multiple Linear Regression
12. NON-PARAMETRIC TESTS
12.1 Parametric Tests vs. Non-Parametric Tests
12.2 Reasons to use Nonparametric Tests
12.3 Types of Non-parametric Tests
12.4 Approach (Hypothesis testing) to (in) a non-parametric test
12.5 Non-parametric Tests
12.5.1 Kruskal Wallis Test
12.5.2 Mann Whitney U Test
12.5.3 Wilcoxon Signed-Rank Test
12.5.4 Sign Test
12.5.5 Spearman Rank Correlation
12.6 Advantages of Non-parametric Tests
12.7 Disadvantages of Non-parametric Tests
13. CHOOSING A RIGHT STATISTICAL ANALYSIS / TEST / METHOD
13.1 Parametric and their Alternative Nonparametric Methods
13.2 Statistical Methods to Compare the Proportions
13.3 Statistical Analysis in Different Populations and Variables
13.4 Statistical Inference Test for Different Goals and Data Types
13.5 Statistical Test for Different Number and Types of Dependent and Independent Variables
14. BIOLOGICAL ASSAY
14.1 Principle of Bioassay
14.2 Advantages and Uses of Bioassay
14.3 Bioassay Methods (Types of Bioassays)
14.4 Bioassay Systems and Techniques
15. APPENDIX – STATISTICAL TABLES