Quantitative Biology : MSC Questions and Answers
Chapter 1: Descriptive statistics
Q. Find regression equation for following data:
x 6 2 10 4 8
y 9 11 5 8 7
-
Data:
+---+----+
| x | y |
+---+----+
| 6 | 9 |
| 2 | 11 |
|10 | 5 |
| 4 | 8 |
| 8 | 7 |
+---+----+
Calculations (x̄ = 30 / 5 = 6, ȳ = 40 / 5 = 8, dx = x - x̄, dy = y - ȳ):
+---+----+-------+-------+------+------+
| x | y | dx | dy | dx*dy | dx^2 |
+---+----+-------+-------+------+------+
| 6 | 9 | 0 | 1 | 0 | 0 |
| 2 | 11 | -4 | 3 | -12 | 16 |
|10 | 5 | 4 | -3 | -12 | 16 |
| 4 | 8 | -2 | 0 | 0 | 4 |
| 8 | 7 | 2 | -1 | -2 | 4 |
+---+----+-------+-------+------+------+
Summation:
Σ(dx*dy) = -26
Σ(dx^2) = 40
Slope (m) and intercept (b):
m = Σ(dx*dy) / Σ(dx^2) = -26 / 40 = -0.65
b = ȳ - m * x̄ = 8 - (-0.65 * 6) = 8 + 3.9 = 11.9
Regression Equation:
y=mx+b
y = -0.65x + 11.9
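The deviation method above can be checked with a short script. This is only a minimal sketch: the function name fit_line is illustrative and not part of the original answer.

```python
# Least-squares line y = m*x + b, computed from deviations about the means.
xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]

def fit_line(xs, ys):
    """Return slope m and intercept b of the regression of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_dxdy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - x_bar) ** 2 for x in xs)
    m = sum_dxdy / sum_dx2      # slope
    b = y_bar - m * x_bar       # intercept
    return m, b

m, b = fit_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")  # prints: y = -0.65x + 11.90
```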
Q. Write a short note on Poisson distribution.
-
Poisson Distribution:
The Poisson distribution is a probability distribution that is commonly used to model the number of events that occur within a fixed interval of time or space. It is often applied to situations where events occur randomly and independently at a constant average rate over time.
Key characteristics of the Poisson distribution include:
1) Average Rate: The distribution is determined by a single parameter, λ (lambda), which represents the average number of events per interval. λ is both the mean and the variance of the distribution.
2) Discrete Values: The Poisson distribution is discrete, meaning that it describes the probabilities of observing a specific number of events (0, 1, 2, 3, and so on) within the interval.
3) Independence: The events must occur independently of each other, with no influence from previous or future events.
4) Memorylessness: In the underlying Poisson process, the probability of an event occurring in a given interval is not influenced by the time since the last event occurred.
Q. Write a short note on level of significance.
- Significance Level:
The significance level, often denoted as α (alpha), is a critical component in hypothesis testing. It is the threshold used to decide whether to reject a null hypothesis: the null hypothesis is rejected when the p-value is less than or equal to α.
Key points about the significance level include:
1) Determining Statistical Significance: It helps determine if the observed results are statistically significant or simply due to random chance.
2) Commonly Used Values: The significance level is typically set at 0.05 (5%) or 0.01 (1%).
3) Type I Error: The significance level is the probability of committing a Type I error, which occurs when the null hypothesis is rejected even though it is true. A lower significance level reduces this probability.
4) Contextual Interpretation: The significance level should be chosen carefully, considering the consequences of both Type I and Type II errors in the specific context of the study.
Q. Following is the data recorded on two variables in population.
Calculate correlation and regression coefficient and comment on it.
X 9 8 10 9 10 12 9 11 13 9
Y 11 16 18 20 16 11 17 18 17 19
-
Means: x̄ = 100 / 10 = 10 and ȳ = 163 / 10 = 16.3
| X | Y | X - x̄ | Y - ȳ | (X - x̄)(Y - ȳ) | (X - x̄)² | (Y - ȳ)² |
|---|---|---|---|---|---|---|
| 9 | 11 | -1 | -5.3 | 5.3 | 1 | 28.09 |
| 8 | 16 | -2 | -0.3 | 0.6 | 4 | 0.09 |
| 10 | 18 | 0 | 1.7 | 0 | 0 | 2.89 |
| 9 | 20 | -1 | 3.7 | -3.7 | 1 | 13.69 |
| 10 | 16 | 0 | -0.3 | 0 | 0 | 0.09 |
| 12 | 11 | 2 | -5.3 | -10.6 | 4 | 28.09 |
| 9 | 17 | -1 | 0.7 | -0.7 | 1 | 0.49 |
| 11 | 18 | 1 | 1.7 | 1.7 | 1 | 2.89 |
| 13 | 17 | 3 | 0.7 | 2.1 | 9 | 0.49 |
| 9 | 19 | -1 | 2.7 | -2.7 | 1 | 7.29 |
Sum of (X - x̄)(Y - ȳ): 5.3 + 0.6 + 0 - 3.7 + 0 - 10.6 - 0.7 + 1.7 + 2.1 - 2.7 = -8.0
Sum of (X - x̄)²: 1 + 4 + 0 + 1 + 0 + 4 + 1 + 1 + 9 + 1 = 22
Sum of (Y - ȳ)²: 28.09 + 0.09 + 2.89 + 13.69 + 0.09 + 28.09 + 0.49 + 2.89 + 0.49 + 7.29 = 84.1
Now we can calculate the correlation coefficient (r) and the regression coefficient (b):
r = Sum of (X - x̄)(Y - ȳ) / sqrt((Sum of (X - x̄)²) * (Sum of (Y - ȳ)²))
= -8.0 / sqrt(22 * 84.1)
= -8.0 / sqrt(1850.2)
≈ -8.0 / 43.01
≈ -0.1860
b = Sum of (X - x̄)(Y - ȳ) / Sum of (X - x̄)²
= -8.0 / 22
≈ -0.3636
The correlation coefficient (r) is approximately -0.186, and the regression coefficient of Y on X (b) is approximately -0.364.
Comment: The correlation coefficient (r) indicates a weak negative linear relationship between the variables X and Y. As the value of X increases, the value of Y tends to decrease slightly, by about 0.36 units for each unit increase in X. Since r² ≈ 0.035, only about 3.5% of the variation in Y is explained by X.
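As a cross-check, the three sums of deviations and the two coefficients can be recomputed directly from the raw data. The sketch below uses only the standard library; the helper name deviation_sums is hypothetical.

```python
import math

X = [9, 8, 10, 9, 10, 12, 9, 11, 13, 9]
Y = [11, 16, 18, 20, 16, 11, 17, 18, 17, 19]

def deviation_sums(xs, ys):
    """Return Σ(dx*dy), Σ(dx^2) and Σ(dy^2) about the means."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy, sxx, syy

sxy, sxx, syy = deviation_sums(X, Y)
r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
b_yx = sxy / sxx                 # regression coefficient of Y on X

print(f"sums: {sxy:.1f}, {sxx:.1f}, {syy:.1f}")
print(f"r = {r:.4f}, b = {b_yx:.4f}")
```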
Q. Calculate mean and mode of the following data:
Class Interval 0-5 5-10 10-15 15-20 20-25 25-30
Frequency 2 4 8 5 4 1
-
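1. Calculate the mean
Step 1: Find the midpoint of each class interval.
Step 2: Multiply each midpoint by its frequency.
Class Interval | Midpoint | Frequency | Midpoint x Frequency
0-5 | 2.5 | 2 | 5
5-10 | 7.5 | 4 | 30
10-15 | 12.5 | 8 | 100
15-20 | 17.5 | 5 | 87.5
20-25 | 22.5 | 4 | 90
25-30 | 27.5 | 1 | 27.5
Step 3: Sum up the values in the "Midpoint x Frequency" column.
Sum of Midpoint x Frequency = 5 + 30 + 100 + 87.5 + 90 + 27.5 = 340
Step 4: Sum up the frequencies.
Total Frequency = 2 + 4 + 8 + 5 + 4 + 1 = 24
Step 5: Calculate the mean by dividing the sum of Midpoint x Frequency by the total frequency.
Mean = Sum of Midpoint x Frequency / Total Frequency = 340 / 24 = 14.17
2. Calculate the mode
For grouped data, the modal class is the class interval with the highest frequency. Here the class interval 10-15 has the highest frequency of 8, so it is the modal class.
The mode inside the modal class is estimated by interpolation:
Mode = L + ((f1 - f0) / (2 * f1 - f0 - f2)) * h
Where:
L is the lower limit of the modal class (10)
f1 is the frequency of the modal class (8)
f0 is the frequency of the preceding class (4)
f2 is the frequency of the following class (5)
h is the class width (5)
Mode = 10 + ((8 - 4) / (2 * 8 - 4 - 5)) * 5 = 10 + (4 / 7) * 5 ≈ 12.86
Therefore, the modal class is 10-15 and the mode of the given data is approximately 12.86.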
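A short sketch of the grouped-data calculation follows. The mean uses class midpoints as in the table; the mode uses the standard textbook interpolation formula within the modal class, L + (f1 - f0) / (2*f1 - f0 - f2) * h, which is an assumption about the intended method. Function names are illustrative.

```python
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
freqs = [2, 4, 8, 5, 4, 1]

def grouped_mean(classes, freqs):
    """Mean of grouped data using class midpoints."""
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    return sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)

def grouped_mode(classes, freqs):
    """Interpolated mode inside the modal (highest-frequency) class."""
    i = freqs.index(max(freqs))
    lo, hi = classes[i]
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                # class before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # class after
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)

mean_value = grouped_mean(classes, freqs)
mode_value = grouped_mode(classes, freqs)
print(f"mean = {mean_value:.2f}, mode = {mode_value:.2f}")
```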
Q. What is standard error of mean.
- Standard Error of Mean:
- Standard Error of Mean (SEM) measures the uncertainty or variability of the sample mean.
- It tells you how much the sample mean is likely to deviate from the true population mean.
- A smaller SEM means that the sample mean is a more precise estimate of the population mean.
- SEM is calculated as SEM = s / √n, where s is the standard deviation of the sample and n is the sample size.
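A minimal sketch of the SEM formula, using Python's statistics module. The sample values are illustrative only and are not tied to this question.

```python
import math
import statistics

def standard_error(sample):
    """SEM = sample standard deviation (n - 1 divisor) / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [10, 15, 25, 30, 50]   # illustrative values
sem = standard_error(sample)
print(f"SEM = {sem:.3f}")
```

Quadrupling the sample size halves the SEM, because n enters through its square root.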
Q. What is degrees of freedom.
- Degrees of Freedom:
- Degrees of Freedom (df) is a concept used in statistical analysis.
- It refers to the number of values in a calculation that are free to vary.
- In simple terms, it represents the number of observations in a sample that are independent and can provide information.
- Degrees of Freedom are often used in hypothesis testing and estimating population parameters.
- The df value determines the shape of the test distribution (for example t, F or chi-square) and therefore the critical values read from the distribution tables.
- For example, if you have a sample of 10 data points and estimate their mean, you have 9 degrees of freedom: once the mean is fixed, the last data point's value is determined by the other 9.
Q. Calculate the mean of the following data.
Sr. No. Bonus No. of Persons
1 500 1
2 600 3
3 700 5
4 800 7
5 900 6
6 1000 2
7 1100 1
-
Sr. No. | Bonus | No. of Persons | Product (Bonus * No. of Persons)
1 | 500 | 1 | 500
2 | 600 | 3 | 1800
3 | 700 | 5 | 3500
4 | 800 | 7 | 5600
5 | 900 | 6 | 5400
6 | 1000 | 2 | 2000
7 | 1100 | 1 | 1100
Sum of Product = 500 + 1800 + 3500 + 5600 + 5400 + 2000 + 1100 = 19900
Total number of persons = 1 + 3 + 5 + 7 + 6 + 2 + 1 = 25
Mean = Sum of Product / Total number of persons
= 19900 / 25
= 796
Therefore, the mean of the given data is 796.
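The weighted mean above is a one-line calculation in code. The sketch below simply mirrors the table, with the number of persons acting as the weight.

```python
bonus = [500, 600, 700, 800, 900, 1000, 1100]
persons = [1, 3, 5, 7, 6, 2, 1]

# Weighted mean: Σ(bonus * persons) / Σ(persons)
total_bonus = sum(b * p for b, p in zip(bonus, persons))
total_persons = sum(persons)
mean_bonus = total_bonus / total_persons

print(total_bonus, total_persons, mean_bonus)  # prints: 19900 25 796.0
```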
Q. Define Variance
- Variance is a statistical measure that quantifies the spread or dispersion of a set of data points around the mean. It provides information about how the individual data points deviate from the average value.
The formula for variance, assuming a sample, is as follows:
s^2 = Σ((x - x̄)^2) / (n - 1)
Where:
s^2 represents the sample variance (its square root, s, is the sample standard deviation)
Σ denotes the sum of
x represents each data point
x̄ represents the sample mean
n represents the sample size
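Q. Determine the standard deviation from the following data: 10, 15, 25, 30 and 50.
Answer:
Mean = (10 + 15 + 25 + 30 + 50) / 5 = 130 / 5 = 26
Data | d = (Data - Mean) | d^2
10 | -16 | 256
15 | -11 | 121
25 | -1 | 1
30 | 4 | 16
50 | 24 | 576
Σ(d^2) = 256 + 121 + 1 + 16 + 576 = 970
Treating the five values as the whole population (divide by n):
Variance = 970 / 5 = 194
Standard Deviation = √(Variance) = √194 ≈ 13.928
If the five values are instead treated as a sample (divide by n - 1):
Variance = 970 / 4 = 242.5
Standard Deviation = √242.5 ≈ 15.572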
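The choice of divisor (n or n - 1) changes the answer, so it helps to see both side by side. The sketch below uses the standard library: statistics.pvariance and statistics.pstdev divide by n, while statistics.stdev divides by n - 1.

```python
import statistics

data = [10, 15, 25, 30, 50]

pop_variance = statistics.pvariance(data)   # divides by n
pop_sd = statistics.pstdev(data)            # sqrt of population variance
sample_sd = statistics.stdev(data)          # divides by n - 1

print(f"population variance = {pop_variance}")
print(f"population SD = {pop_sd:.3f}")
print(f"sample SD = {sample_sd:.3f}")
```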
Chapter 2: Inferential Statistics- I
Q. Write a note on type 1 and type 2 errors.
-
Type 1 and Type 2 errors are concepts used in hypothesis testing.
Type 1 Error (False Positive):
1) A Type 1 Error occurs when we reject a null hypothesis that is actually true.
2) We conclude that there is a relationship or effect when, in reality, no such effect or relationship is present.
3) It is also known as a false positive.
4) The probability of committing a Type 1 error is denoted by α (alpha).
5) By choosing a smaller significance level we decrease the chance of making a type 1 error.
Type 2 Error (False Negative):
1) A Type 2 Error occurs when we fail to reject a null hypothesis that is actually false.
2) We conclude that there is no significant effect or relationship, even though in reality there is a significant effect or relationship.
3) This error is known as a false negative.
4) The probability of committing a Type 2 error is denoted by β (beta); 1 - β is the power of the test.
5) Increasing the sample size can reduce the risk of a Type 2 error.
Q. Explain one tailed and two tailed tests.
- One Tailed Test:
1) It is used to test a directional hypothesis, where the alternative hypothesis states that the effect lies in one particular direction.
2) The critical region is on only one side of the distribution.
3) It can be an upper tailed test or a lower tailed test.
4) For an upper tailed test, the alternative hypothesis states that the parameter is greater than a certain value.
5) For a lower tailed test, the alternative hypothesis states that the parameter is smaller than a certain value.
Two Tailed Test:
1) It is used to test a non-directional hypothesis, where the alternative hypothesis allows an effect in either direction.
2) It allows the parameter to be significantly different from the hypothesized value, either greater or smaller, so the critical region is split between both tails of the distribution.
3) It does not specify any direction of effect.
Q. In a mutation breeding experiment, effect of gamma radiation on weight
of 10 seeds was determined. Mean weight in grams per plant of bean
variety is given. Analyze the data using t-test.
Control: 2.9, 3.1, 3.5, 3.4, 3.0, 4.0, 3.7, 3.0, 4.0, 4.0.
Test: 2.7, 2.8, 3.0, 3.5, 3.7, 3.2, 3.0, 3.0, 2.9, 2.8.
- T Test: We use the t-test when we want to compare the means of two groups and determine whether there is a significant difference between them.
Step 1: State the null hypothesis (H0) and alternative hypothesis (H1):
Null hypothesis (H0): There is no significant difference between the mean weights of the control and test groups.
Alternative hypothesis (H1): There is a significant difference between the mean weights of the control and test groups.
Step 2: Calculate the means of the control and test groups:
Control group mean (X̄1) = (2.9 + 3.1 + 3.5 + 3.4 + 3.0 + 4.0 + 3.7 + 3.0 + 4.0 + 4.0) / 10 = 34.6 / 10 = 3.46
Test group mean (X̄2) = (2.7 + 2.8 + 3.0 + 3.5 + 3.7 + 3.2 + 3.0 + 3.0 + 2.9 + 2.8) / 10 = 30.6 / 10 = 3.06
Step 3 : Calculate the standard deviation (s) for each group:
S = √[(Σ(X - X̄)^2) / (n - 1)]
Where:
Σ represents the sum of
X represents each individual value in the data set
X̄ represents the mean of the data set
n represents the number of data points in the data set
For the control group (X̄1 = 3.46):
Deviation from mean (X1 - X̄1):
-0.56, -0.36, 0.04, -0.06, -0.46, 0.54, 0.24, -0.46, 0.54, 0.54
s1 = √[(Σ(X1 - X̄1)^2) / (n1 - 1)]
Substituting the values:
s1 = √[(0.3136 + 0.1296 + 0.0016 + 0.0036 + 0.2116 + 0.2916 + 0.0576 + 0.2116 + 0.2916 + 0.2916) / (10 - 1)]
= √(1.8040 / 9)
= √0.2004
= 0.4477
For the test group (X̄2 = 3.06):
Deviation from mean (X2 - X̄2):
-0.36, -0.26, -0.06, 0.44, 0.64, 0.14, -0.06, -0.06, -0.16, -0.26
s2 = √[(Σ(X2 - X̄2)^2) / (n2 - 1)]
Substituting the values:
s2 = √[(0.1296 + 0.0676 + 0.0036 + 0.1936 + 0.4096 + 0.0196 + 0.0036 + 0.0036 + 0.0256 + 0.0676) / (10 - 1)]
= √(0.9240 / 9)
= √0.1027
= 0.3204
Step 4: Calculate the pooled standard deviation (sp):
sp = √[((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)]
Substituting the values:
sp = √[((10 - 1) * 0.2004 + (10 - 1) * 0.1027) / (10 + 10 - 2)]
= √[(1.8040 + 0.9240) / 18]
= √(2.7280 / 18)
= √0.1516
= 0.3893
Step 5: Calculate the t-value:
t = (X̄1 - X̄2) / (sp * √[(1/n1) + (1/n2)])
Substituting the values:
t = (3.46 - 3.06) / (0.3893 * √[(1/10) + (1/10)])
= 0.40 / (0.3893 * √(0.1 + 0.1))
= 0.40 / (0.3893 * √0.2)
= 0.40 / (0.3893 * 0.4472)
= 0.40 / 0.1741
= 2.298
Step 6: Determine the degrees of freedom (df):
df = n1 + n2 - 2
= 10 + 10 - 2
= 18
Step 7: Determine the critical t-value
At a significance level of α = 0.05 (two-tailed) with df = 18, the critical t-value is approximately 2.101.
If the calculated t-value > critical value, we reject the null hypothesis.
If the calculated t-value <= critical value, we fail to reject the null hypothesis.
Since the calculated t-value (2.298) is greater than the critical t-value (2.101), we reject the null hypothesis. This means that there is a significant difference between the mean weights of the control and test groups at the 5% level: the mean weight of the gamma-irradiated test group (3.06 g) is significantly lower than that of the control group (3.46 g).
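Hand calculation of a t-test involves many rounding steps, so it is worth recomputing the statistic from the raw data. The following is a minimal sketch of the pooled (equal-variance) two-sample t-test in plain Python; the function name pooled_t is illustrative, and the critical value 2.101 is the table value quoted above for df = 18.

```python
import math

control = [2.9, 3.1, 3.5, 3.4, 3.0, 4.0, 3.7, 3.0, 4.0, 4.0]
treated = [2.7, 2.8, 3.0, 3.5, 3.7, 3.2, 3.0, 3.0, 2.9, 2.8]

def pooled_t(a, b):
    """Two-sample t statistic with pooled standard deviation."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)   # Σ(X1 - X̄1)^2
    ss2 = sum((x - m2) ** 2 for x in b)   # Σ(X2 - X̄2)^2
    df = n1 + n2 - 2
    sp = math.sqrt((ss1 + ss2) / df)      # pooled standard deviation
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df, sp

t_value, df, sp = pooled_t(control, treated)
T_CRITICAL = 2.101  # two-tailed, alpha = 0.05, df = 18 (from the t table)

print(f"sp = {sp:.4f}, t = {t_value:.3f}, df = {df}")
print("reject H0" if abs(t_value) > T_CRITICAL else "fail to reject H0")
```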
Chapter 3: Inferential Statistics- II
Q. From the data given below find out whether the means of three samples
differ significantly or not by ANOVA.
Sample One Sample Two Sample Three
20 19 13
10 13 12
17 17 10
17 12 15
16 09 05
- ANOVA (Analysis of Variance) is used to find whether there are significant differences between the means of three or more groups.
Here we find if there is significant difference between the means of the three samples.
Step 1: Hypothesis
Null hypothesis (H0): The means of the three samples are equal.
Alternative hypothesis (Ha): The means of the three samples are not equal (at least one pair differs significantly).
Step 2: Calculate the group means and the overall mean:
Let's calculate the means for each sample and the overall mean.
Sample One: Mean = (20 + 10 + 17 + 17 + 16) / 5 = 16
Sample Two: Mean = (19 + 13 + 17 + 12 + 9) / 5 = 14
Sample Three: Mean = (13 + 12 + 10 + 15 + 5) / 5 = 11
Overall Mean = (16 + 14 + 11) / 3 = 13.67
Step 3: Calculate the sum of squares between groups (SSB):
SSB measures the variation between the group means.
SSB = Σ(ni * (x̄i - x̄)^2)
Where:
ni represents the number of observations in the i-th group.
x̄i represents the mean of the i-th group.
x̄ represents the overall mean (mean of all groups combined).
SSB = (5 * (16 - 13.67)^2) + (5 * (14 - 13.67)^2) + (5 * (11 - 13.67)^2)
= (5 * 5.444) + (5 * 0.111) + (5 * 7.111)
= 27.222 + 0.556 + 35.556
≈ 63.33
Step 4: Calculate the sum of squares within groups (SSW):
SSW measures the variation within each group.
SSW = Σ((xi - x̄i)^2)
Where:
xi represents an individual observation in the i-th group.
x̄i represents the mean of the i-th group.
SSW = (20 - 16)^2 + (10 - 16)^2 + (17 - 16)^2 + (17 - 16)^2 + (16 - 16)^2
+ (19 - 14)^2 + (13 - 14)^2 + (17 - 14)^2 + (12 - 14)^2 + (9 - 14)^2
+ (13 - 11)^2 + (12 - 11)^2 + (10 - 11)^2 + (15 - 11)^2 + (5 - 11)^2
= 16 + 36 + 1 + 1 + 0 + 25 + 1 + 9 + 4 + 25 + 4 + 1 + 1 + 16 + 36
= 176
Step 5: Calculate the degrees of freedom:
Degrees of freedom (df) are calculated as follows:
df_between = number of groups - 1 = 3 - 1 = 2
df_within = total number of observations - number of groups = 15 - 3 = 12
Step 6: Calculate the mean squares:
MSB = SSB / df_between
= 63.33 / 2
= 31.67
MSW = SSW / df_within
= 176 / 12
= 14.67
Step 7: Calculate the F-statistic:
The F-statistic is the ratio of the mean squares.
F = MSB / MSW = 31.67 / 14.67 = 2.159
Step 8: Determine the critical value and compare it with the calculated F-statistic:
Let's assume a significance level of 0.05. Looking up the critical value for df_between = 2 and df_within = 12 in the F-distribution table, we find a critical value of 3.885.
Step 9: Make a decision:
If the calculated F-statistic > critical value, we reject the null hypothesis.
If the calculated F-statistic <= critical value, we fail to reject the null hypothesis.
Here F-statistic = 2.159 < 3.885
We fail to reject the null hypothesis.
We conclude that there is insufficient evidence to suggest significant differences between the means of the three samples.
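The sums of squares are easy to get wrong by hand, so a direct recomputation is a useful check. Below is a minimal sketch of one-way ANOVA in plain Python; the function name one_way_anova is illustrative, and the critical value 3.885 is the table value quoted above for (2, 12) degrees of freedom.

```python
groups = [
    [20, 10, 17, 17, 16],   # Sample One
    [19, 13, 17, 12, 9],    # Sample Two
    [13, 12, 10, 15, 5],    # Sample Three
]

def one_way_anova(groups):
    """Return SSB, SSW and the F statistic for a one-way layout."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, f_stat

ssb, ssw, f_stat = one_way_anova(groups)
F_CRITICAL = 3.885  # alpha = 0.05, df = (2, 12), from the F table

print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}, F = {f_stat:.3f}")
print("reject H0" if f_stat > F_CRITICAL else "fail to reject H0")
```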
Q. Nephropathy was observed in 100 patients of four classes of diabetes as
per severity of the disease.
Class I II III IV
Number of patients 8 15 14 7
Is this difference due to chance? Test by chi square test.
-
Step 1: State the null hypothesis (H0) and alternative hypothesis (H1):
H0: Nephropathy is equally frequent in the four classes of diabetes, so the observed differences are due to chance.
H1: Nephropathy is not equally frequent in the four classes of diabetes.
Step 2: Set up the observed and expected frequencies table
Total observed cases = 8 + 15 + 14 + 7 = 44
Under H0 the 44 cases are expected to be shared equally among the four classes, so the expected frequency for each class is 44 / 4 = 11.
Class | Observed Frequency | Expected Frequency
I | 8 | 11
II | 15 | 11
III | 14 | 11
IV | 7 | 11
Step 3: Calculate the chi-square statistic (χ^2):
Using the formula:
χ^2 = Σ((Observed frequency - Expected frequency)^2 / Expected frequency)
where Σ represents the sum of,
and the calculation is performed for each class.
Class | Obs Freq | Exp Freq | (O - E)^2 / E
I | 8 | 11 | (8 - 11)^2 / 11 = 0.8182
II | 15 | 11 | (15 - 11)^2 / 11 = 1.4545
III | 14 | 11 | (14 - 11)^2 / 11 = 0.8182
IV | 7 | 11 | (7 - 11)^2 / 11 = 1.4545
Chi square value
Sum of (O - E)^2 / E = 0.8182 + 1.4545 + 0.8182 + 1.4545 ≈ 4.545
Step 4: Determine the degrees of freedom (df):
df = (Number of categories - 1)
In this case, df = 4 - 1 = 3.
Step 5: Find critical value
With a significance level (α) of 0.05 and 3 degrees of freedom (df), the critical chi-square value from the chi-square distribution table is approximately 7.815.
Step 6: Result
Here, calculated value < critical chi-square value, so we fail to reject the null hypothesis.
Comparing the calculated chi-square value (4.545) with the critical chi-square value (7.815), we can conclude that the observed difference in nephropathy among the four classes of diabetes is not statistically significant at the 0.05 significance level and can be attributed to chance.
Q. In an experiment of pea breeding following frequency of the seeds in F2
generation were obtained. With the help of chi square determine whether
the obtained ratio matches Mendel’s dihybrid ratio. [7]
Round yellow : 315
Wrinkled yellow : 101
Round green : 108
Wrinkled green : 32
Answer:
The chi-square test is used to determine whether the observed frequencies in a categorical data set differ significantly from the expected frequencies.
Step 1: State the null hypothesis (H0) and alternative hypothesis (H1):
H0: The observed frequencies match the expected frequencies based on Mendel's dihybrid ratio.
H1: The observed frequencies do not match the expected frequencies based on Mendel's dihybrid ratio.
Step 2: Calculate the expected frequencies:
Based on Mendel's dihybrid ratio, the expected frequencies can be calculated. The expected ratio for each category in Mendel's dihybrid ratio is 9:3:3:1.
Total number of seeds = 315 + 101 + 108 + 32 = 556
Category | Observed Frequency | Expected Frequency
Round yellow | 315 | (9/16) * 556 = 312.75
Wrinkled yellow | 101 | (3/16) * 556 = 104.25
Round green | 108 | (3/16) * 556 = 104.25
Wrinkled green | 32 | (1/16) * 556 = 34.75
Step 3: Calculate the chi-square statistic.
Category | Obs Freq | Exp Freq | (O - E)^2 / E
Round yellow | 315 | 312.75 | (315 - 312.75)^2 / 312.75 = 0.0162
Wrinkled yellow | 101 | 104.25 | (101 - 104.25)^2 / 104.25 = 0.1013
Round green | 108 | 104.25 | (108 - 104.25)^2 / 104.25 = 0.1349
Wrinkled green | 32 | 34.75 | (32 - 34.75)^2 / 34.75 = 0.2176
Sum of (O - E)^2 / E = 0.0162 + 0.1013 + 0.1349 + 0.2176 = 0.4700
The calculated chi-square statistic (χ^2) is 0.4700.
Step 4: Find degrees of freedom
degrees of freedom (df) = 4 - 1 = 3
Step 5: Compare the calculated chi-square value with the critical chi-square value
The critical chi-square value for a significance level of 0.05 and 3 degrees of freedom is approximately 7.815.
Since the calculated chi-square value (0.4700) is smaller than the critical chi-square value (7.815), we fail to reject the null hypothesis.
The obtained ratio of the pea breeding experiment does not significantly differ from Mendel's dihybrid ratio, based on the chi-square test at a significance level of 0.05.
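A goodness-of-fit test against a genetic ratio follows the same pattern every time: scale the ratio to the observed total, then sum (O - E)^2 / E. The sketch below captures that pattern; the function name chi_square_gof is hypothetical, and the critical value 7.815 is the table value quoted above for df = 3.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness of fit of observed counts to an expected ratio."""
    total = sum(observed)
    parts = sum(ratio)
    # Expected counts must add up to the observed total.
    expected = [total * r / parts for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

# Round yellow, wrinkled yellow, round green, wrinkled green
observed = [315, 101, 108, 32]
expected, chi2 = chi_square_gof(observed, [9, 3, 3, 1])
CHI2_CRITICAL = 7.815  # alpha = 0.05, df = 3, from the chi-square table

print("expected:", expected)
print(f"chi-square = {chi2:.4f}")
print("reject H0" if chi2 > CHI2_CRITICAL else "fail to reject H0")
```

The same helper applies to a 3:1 or 1:2:1 ratio by changing the two arguments; the critical value must then be looked up for the new degrees of freedom.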
Q. In F2 generation Mendel obtained 621 tall and 187 dwarf plants. Suggest
by applying chi square test, whether this ratio is in accordance with the
Mendel monohybrid ratio or it deviates from this ratio.
-
Step 1: State the null hypothesis (H0) and alternative hypothesis (H1):
H0: The observed frequencies match the expected frequencies based on Mendel's monohybrid ratio.
H1: The observed frequencies do not match the expected frequencies based on Mendel's monohybrid ratio.
Step 2: Set up the observed and expected frequency tables:
Total number of plants = 621 + 187 = 808
Category | Observed Frequency | Expected Frequency
Tall | 621 | (3/4) * 808 = 606
Dwarf | 187 | (1/4) * 808 = 202
Now, we can proceed to calculate the chi-square statistic.
Category | Obs Freq | Exp Freq | (O - E)^2 / E
Tall | 621 | 606 | (621 - 606)^2 / 606 = 0.3713
Dwarf | 187 | 202 | (187 - 202)^2 / 202 = 1.1139
Sum of (O - E)^2 / E = 0.3713 + 1.1139 = 1.4852
Step 3: Degrees of Freedom
Since there are 2 categories (Tall and Dwarf), the degrees of freedom (df) is 2 - 1 = 1.
Step 4: Find critical value
With a significance level (α) of 0.05 and 1 degree of freedom (df), the critical chi-square value from the chi-square distribution table is approximately 3.841.
Step 5: Result
Comparing the calculated chi-square value (1.4852) with the critical chi-square value (3.841), the calculated value is smaller, so we fail to reject the null hypothesis.
Based on the chi-square test, the observed ratio of tall to dwarf plants in the F2 generation is in accordance with Mendel's monohybrid ratio (3:1) and does not deviate significantly from it at the 0.05 significance level.
Chapter 4: Probability and Probability distribution.
Q. On average, five cars arrive at a toll booth every minute. Assuming a Poisson distribution, what is the probability that exactly 0, 1, 2, 3 and 4 cars arrive in one minute?
- Poisson Distribution: We use the Poisson distribution to find the probability of a given number of events occurring in a fixed interval of time or space when the average rate of occurrence is known.
1. Find the average rate of occurrence: λ = 5 cars/min
2. Use the Poisson formula:
P(x; λ) = (e^(-λ) * λ^x) / x!
P(x; λ) represents the probability of x events occurring given the average rate of occurrence λ.
e is the mathematical constant (approximately 2.71828).
λ is the average rate of occurrence.
x is the number of events you want to calculate the probability for.
3. Calculate the probabilities: Use the Poisson formula for each number of cars (0, 1, 2, 3, 4), with e^(-5) ≈ 0.00674:
P(0; 5) = (e^(-5) * 5^0) / 0! = (0.00674 * 1) / 1 ≈ 0.0067
P(1; 5) = (e^(-5) * 5^1) / 1! = (0.00674 * 5) / 1 ≈ 0.0337
P(2; 5) = (e^(-5) * 5^2) / 2! = (0.00674 * 25) / 2 ≈ 0.0842
P(3; 5) = (e^(-5) * 5^3) / 3! = (0.00674 * 125) / 6 ≈ 0.1404
P(4; 5) = (e^(-5) * 5^4) / 4! = (0.00674 * 625) / 24 ≈ 0.1755
4. Interpret the results: The calculated values are the probabilities of observing exactly 0, 1, 2, 3 and 4 cars arriving at the toll booth in one minute.
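The Poisson formula translates directly into code. This is a minimal sketch using only the math module; the function name poisson_pmf is illustrative.

```python
import math

def poisson_pmf(x, lam):
    """P(x; λ) = e^(-λ) * λ^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Probabilities of exactly 0..4 cars in one minute when λ = 5
probabilities = [poisson_pmf(x, 5) for x in range(5)]
for x, p in enumerate(probabilities):
    print(f"P({x}; 5) = {p:.4f}")
```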
Q. Random testing of ABO blood group in the offspring of only AB couples
in a European population obtained the following distribution of blood
groups.
A-312, AB-575 & B-313
Test whether the data is consistent with the normal segregation of alleles
in the population (i.e. 1:2:1 ratio)
-
Let's set up the null hypothesis (H0) and alternative hypothesis (Ha):
Step 1:
H0: The observed frequencies are consistent with the expected frequencies under normal segregation of alleles.
Ha: The observed frequencies are not consistent with the expected frequencies under normal segregation of alleles.
Step 2: Expected Frequencies
Total offspring = 312 + 575 + 313 = 1200
E(A) = (1/4) * 1200 = 300
E(AB) = (2/4) * 1200 = 600
E(B) = (1/4) * 1200 = 300
Step 3: Chi-square statistic
We can calculate the chi-square statistic using the following formula:
χ^2 = Σ [(O(i) - E(i))^2 / E(i)]
Applying this formula to our data:
χ^2 = [(312 - 300)^2 / 300] + [(575 - 600)^2 / 600] + [(313 - 300)^2 / 300]
Calculating this:
χ^2 = (12^2 / 300) + ((-25)^2 / 600) + (13^2 / 300)
χ^2 = 0.4800 + 1.0417 + 0.5633
χ^2 = 2.085
Step 4: Degrees of freedom and critical value
degrees of freedom (df) is (3 - 1) = 2.
Assuming a significance level of 0.05, the critical chi-square value with 2 degrees of freedom is approximately 5.991.
Step 5: Result
Since 2.085 < 5.991, we fail to reject the null hypothesis (H0). This means that the observed distribution of blood groups is consistent with the expected distribution under normal segregation of alleles (1:2:1 ratio).
Q. A traffic police records an average of three road accidents per week.
The number of accidents is distributed according to a poisson distribution.
Calculate the probability of exactly two accidents in any week.
- The Poisson distribution is often used to model the number of events occurring in a fixed interval of time or space, given the average rate of occurrence.
In this case, we know that the average number of accidents per week is three. Let's denote this average as λ (lambda).
The formula for the Poisson distribution is:
P(x; λ) = (e^(-λ) * λ^x) / x!
Where:
P(x; λ) is the probability of x events occurring given the average rate λ.
e is the mathematical constant approximately equal to 2.71828.
x is the actual number of events (in this case, two).
λ is the average rate of events (in this case, three).
x! denotes the factorial of x.
Now let's calculate the probability of exactly two accidents in any week using the Poisson distribution formula:
P(2; 3) = (e^(-3) * 3^2) / 2!
Calculating this:
P(2; 3) = (2.71828^(-3) * 3^2) / 2!
P(2; 3) = (0.049787 * 9) / 2
P(2; 3) = 0.448083 / 2
P(2; 3) ≈ 0.22404
So, the probability of exactly two accidents in any given week is approximately 0.2240, or 22.40%.
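The formula above gives the probability of an exact count. For "at least k" questions the usual route is the complement rule, P(X >= k) = 1 - [P(0) + ... + P(k - 1)], because the Poisson distribution has no upper limit on x. A minimal sketch, with illustrative function names:

```python
import math

def poisson_pmf(x, lam):
    """P(x; λ) = e^(-λ) * λ^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_at_least(k, lam):
    """P(X >= k) by the complement rule."""
    return 1.0 - sum(poisson_pmf(x, lam) for x in range(k))

p_exactly_two = poisson_pmf(2, 3)        # λ = 3 accidents per week
p_three_or_more = poisson_at_least(3, 0.2)  # λ = 0.2 accidents per day

print(f"P(exactly 2; λ=3) = {p_exactly_two:.5f}")
print(f"P(at least 3; λ=0.2) = {p_three_or_more:.5f}")
```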
Q. In a town, 10 accidents take place in 50 days. Assuming a Poisson distribution, find the probability of at least 3 accidents in a day.
- Poisson Distribution: We use the Poisson distribution to find the probability of a given number of events occurring in a fixed interval of time or space when the average rate of occurrence is known.
1. Find the average rate of occurrence: 10 accidents in 50 days, therefore λ = 10 / 50 = 0.2 accidents per day.
2. Use the Poisson formula:
P(x; λ) = (e^(-λ) * λ^x) / x!
P(x; λ) represents the probability of x events occurring given the average rate of occurrence λ.
e is the mathematical constant (approximately 2.71828).
λ is the average rate of occurrence.
x is the number of events you want to calculate the probability for.
3. Set up the probability: "At least 3" means 3 or more accidents. The Poisson distribution has no upper limit on x, so it is easiest to use the complement:
P(at least 3 accidents) = 1 - [P(0; 0.2) + P(1; 0.2) + P(2; 0.2)]
4. Perform the calculations (e^(-0.2) ≈ 0.81873):
P(0; 0.2) = (0.81873 * 1) / 1 ≈ 0.81873
P(1; 0.2) = (0.81873 * 0.2) / 1 ≈ 0.16375
P(2; 0.2) = (0.81873 * 0.04) / 2 ≈ 0.01637
P(at least 3 accidents) ≈ 1 - (0.81873 + 0.16375 + 0.01637) = 1 - 0.99885 = 0.00115
Check by direct summation: P(3; 0.2) ≈ (0.81873 * 0.008) / 6 ≈ 0.0010916, P(4; 0.2) ≈ (0.81873 * 0.0016) / 24 ≈ 0.0000546, P(5; 0.2) ≈ (0.81873 * 0.00032) / 120 ≈ 0.0000022, and the later terms are negligible, giving the same total of about 0.00115.
5. Interpret the result: With an average of 0.2 accidents per day, the probability of at least 3 accidents in a day is approximately 0.00115, or about 0.1%.
Q. If a chairman is to be selected from five persons with their profile as follows:
Sex | Age
Male | 40
Male | 43
Female | 38
Female | 27
Male | 65
What is the probability that it would be female or a person over 30 years?
-
1. Condition: The chairman should either be a female or a person over 30 years old.
2. Count the candidates: There are five candidates in total.
3. Calculate the probability of selecting a female candidate: There are two female candidates, so the probability of selecting a female candidate is 2 out of 5 or 2/5, which is 0.4.
4. Calculate the probability of selecting a person over 30 years: There are four candidates over 30 years old (three males aged 40, 43 and 65, and one female aged 38). So, the probability of selecting a person over 30 years is 4 out of 5 or 4/5, which is 0.8.
5. Adjust for double counting: One female candidate (aged 38) satisfies both conditions (female and over 30 years old). So, we need to subtract the probability of selecting a candidate who is both female and over 30 years old. There is one such candidate out of the five, so the probability is 1 out of 5 or 1/5, which is 0.2.
6. Add the probabilities of selecting a female candidate and selecting a person over 30 years, and then subtract the probability of satisfying both conditions.
P(female or over 30 years) = P(female) + P(over 30 years) - P(female and over 30 years)
= 0.4 + 0.8 - 0.2
= 1.0
7. Interpret the result: Every one of the five candidates is either female or over 30 years old, so the probability that the selected chairman would be either female or a person over 30 years is 1, or 100%.
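The addition rule P(A or B) = P(A) + P(B) - P(A and B) can be checked by listing the candidates and counting with sets. This is a small sketch; the variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

candidates = [
    ("Male", 40),
    ("Male", 43),
    ("Female", 38),
    ("Female", 27),
    ("Male", 65),
]

# Index sets of the candidates satisfying each condition
female = {i for i, (sex, age) in enumerate(candidates) if sex == "Female"}
over_30 = {i for i, (sex, age) in enumerate(candidates) if age > 30}

n = len(candidates)
p_female = Fraction(len(female), n)
p_over_30 = Fraction(len(over_30), n)
p_both = Fraction(len(female & over_30), n)

# Addition rule, compared with a direct count of the union
p_either = p_female + p_over_30 - p_both
p_union_direct = Fraction(len(female | over_30), n)

print(p_female, p_over_30, p_both, p_either)
```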
Q. From a pack of 52 cards, one card is drawn at random. What is the
probability that it is a king or queen of heart?
- There are 52 cards, so 52 possible outcomes.
- There are 2 favourable outcomes, 1 king of hearts and 1 queen of hearts.
- Calculate probability
Probability= Number of favourable outcomes/ Number of Possible Outcomes
Probability= 2/52
Probability= 1/26
Therefore, the probability of drawing a king or queen of hearts from a deck of 52 cards is 1/26.
Q. What is the probability of getting either ace or spade from a pack of 52 cards?
- There are 52 cards, so 52 possible outcomes.
- There are 4 aces in one deck (spades, hearts, diamonds, clubs). There are 13 spades in one deck (ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, jack, queen, king).
- Don't double count the ace of spades.
- Total favourable outcomes = 4 aces + 12 other spades = 16
- Calculate probability.
Probability= Number of favourable outcomes/ Number of Possible Outcomes
Probability= 16/52
Probability= 4/13
Therefore, probability of drawing either an ace or a spade from a deck of 52 cards is 4/13.
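Single-card questions like the two above can be verified by building the deck and counting favourable cards. The sketch below is one way to do it; the rank and suit labels and the helper name probability are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))   # 52 (rank, suit) pairs

def probability(event):
    """Favourable outcomes / possible outcomes for one random card."""
    favourable = [card for card in deck if event(card)]
    return Fraction(len(favourable), len(deck))

p_king_or_queen_of_hearts = probability(
    lambda card: card[1] == "hearts" and card[0] in ("K", "Q")
)
p_ace_or_spade = probability(
    lambda card: card[0] == "A" or card[1] == "spades"
)

print(p_king_or_queen_of_hearts, p_ace_or_spade)  # prints: 1/26 4/13
```

Because each card appears once in the list, the ace of spades is counted only once, which is the "don't double count" step.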
Q. The probability that evening college students will graduate is 0.6. Find the probability that out of 5 students i) None graduate ii) One graduate iii) At least one graduate
- Since there is a fixed number of trials, each with 2 possible outcomes (success or failure), we use the binomial distribution formula.
- It gives the probability of a specific number of successes (k) out of a given number of trials (n) with a known probability of success (p).
- The formula is P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
1) None Graduate
- n = 5, k = 0, p = 0.6
C(n, k) = n! / (k! * (n - k)!)
P(X = 0) = C(5, 0) * 0.6^0 * (1 - 0.6)^(5 - 0) = 1 * 1 * 0.4^5 = 0.01024
2) One Graduate
- n = 5, k = 1, p = 0.6
P(X = 1) = C(5, 1) * 0.6^1 * (1 - 0.6)^(5 - 1) = 5 * 0.6 * 0.4^4 = 5 * 0.6 * 0.0256 = 0.0768
3) At least one graduate
- n = 5
To find the probability of at least one graduate:
P(at least one graduate) = 1 - P(none graduate)
= 1 - 0.01024
= 0.98976
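The three parts can be recomputed with a small binomial helper. This is a minimal sketch using math.comb from the standard library; the function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)"""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 5, 0.6
p_none = binomial_pmf(0, n, p)
p_one = binomial_pmf(1, n, p)
p_at_least_one = 1 - p_none   # complement of "none graduate"

print(f"none: {p_none:.5f}")
print(f"one: {p_one:.5f}")
print(f"at least one: {p_at_least_one:.5f}")
```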
Q. What is parametric and non parametric test.
- Parametric tests:
- Assume that the data follow a specific distribution, usually the normal (bell-shaped) distribution.
- They make assumptions about population parameters such as the mean and variance.
- Examples include t-tests, ANOVA, and Pearson correlation.
- They are more powerful than non-parametric tests when these assumptions hold.
- If the assumptions are not met, results may be misleading.
Non-parametric tests:
- Do not assume that the data follow a specific distribution.
- They are more flexible and can be used with ordinal or ranked data.
- Examples include the Mann-Whitney U test, the Kruskal-Wallis test, and Spearman rank correlation.
- They work well even when data do not follow a specific distribution.
- They are useful when dealing with small sample sizes or data that do not fit the assumptions of parametric tests.
- In simple terms, parametric tests assume a specific distribution for the data and work well when that assumption is met. Non-parametric tests do not assume any specific distribution and can be used with different types of data or when the assumptions of parametric tests are not met.
Q. When 10 coins are tossed, find the probability of exactly six heads.
- Since there is a fixed number of trials, each with 2 possible outcomes (head or tail), we use the binomial distribution formula.
- It gives the probability of a specific number of successes (k) out of a given number of trials (n) with a known probability of success (p).
- The formula is P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
- n = 10, k = 6, p = 0.5 (probability of getting a head)
- C(10, 6) = 10! / (6! * 4!) = 210
- P(X = 6) = C(10, 6) * (0.5)^6 * (1 - 0.5)^(10 - 6)
P(X = 6) = 210 * (0.5)^6 * (0.5)^4 = 210 * 0.015625 * 0.0625 = 210 / 1024 ≈ 0.2051
Therefore, the probability of exactly six heads when 10 coins are tossed is approximately 0.2051 or 20.51%.
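With only 2^10 = 1024 equally likely outcomes, the binomial answer can also be confirmed by brute force: list every sequence of heads and tails and count those with six heads. A small sketch:

```python
from itertools import product

# Every possible result of tossing 10 fair coins
outcomes = list(product("HT", repeat=10))

# Sequences containing exactly six heads
six_heads = sum(1 for outcome in outcomes if outcome.count("H") == 6)

p_six_heads = six_heads / len(outcomes)
print(six_heads, len(outcomes), p_six_heads)  # prints: 210 1024 0.205078125
```

The count 210 is C(10, 6), so enumeration and the formula agree.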
Q. What is the probability that a queen, king and Joker are drawn in the
same order from pack of 52 cards without replacement?
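- The cards are drawn one after another without replacement, so the number of cards left falls from 52 to 51 to 50, and the probabilities of the three draws are multiplied.
- A standard pack of 52 cards contains 4 queens and 4 kings but no joker. Taken literally, the third draw is impossible and the probability is 0.
- If "Joker" in the question is read as "Jack" (an assumption), there are 4 jacks in the pack:
For the first card drawn, there are 4 queens available out of 52 cards: 4/52
For the second card drawn (after the queen), there are 4 kings available out of 51 cards: 4/51
For the third card drawn (after queen and king), there are 4 jacks available out of 50 cards: 4/50
- Calculate probability
Probability = (4/52) * (4/51) * (4/50)
Probability = 64/132600
Probability = 8/16575 ≈ 0.00048
Therefore, under this reading, the probability of drawing a queen, king, and jack in that order from a deck of 52 cards without replacement is 8/16575, or about 0.048%.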
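Ordered draws without replacement multiply conditional probabilities, with the deck shrinking by one card at each draw. The sketch below shows that rule with exact fractions. It assumes the three ranks are different, so drawing one does not reduce the count of the next; the function name is hypothetical, and treating "Joker" as "Jack" is an assumption about the question.

```python
from fractions import Fraction

def ordered_draw_probability(counts, deck_size):
    """P(drawing the given ranks in order without replacement).

    counts[i] is the number of cards of the i-th required rank;
    ranks are assumed distinct.
    """
    p = Fraction(1)
    for drawn, count in enumerate(counts):
        p *= Fraction(count, deck_size - drawn)  # deck shrinks each draw
    return p

# Queen, king, jack: 4 cards of each rank in a 52-card pack
p_queen_king_jack = ordered_draw_probability([4, 4, 4], 52)

# Queen, king, joker: a standard 52-card pack has 0 jokers
p_queen_king_joker = ordered_draw_probability([4, 4, 0], 52)

print(p_queen_king_jack, float(p_queen_king_jack))
print(p_queen_king_joker)
```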