Descriptive Statistics

Congruent data are displayed in ORANGE; incongruent data are displayed in GREEN.



Independent Variable

The independent variable is the condition of the words given to the participants: whether the meaning of each word matches the color it is shown in (congruent) or not (incongruent). This condition is chosen freely by whoever sets up the test and does not depend on anything the participants do.




Dependent Variable

The dependent variable is the time taken by each participant, because that time depends on the independent variable of the test. All the statistical calculations, and hence all the conclusions, are based on the evaluation of the dependent variable.



Hypothesis and Justifications

There are several types of hypothesis analysis. A test that compares means is appropriate here, since we want to evaluate the mean of both groups, i.e. congruent and incongruent. Basically, we would like to know whether incongruent words affect our reading abilities because of semantic interference. No information is given about the samples other than their congruent and incongruent scores, so we assume that the participants were randomly chosen, which means that they are of different ages, both male and female, and of different education levels. In this scenario, the question is whether giving incongruent words has a statistically significant effect on the time taken. Our NULL hypothesis is that there is no significant difference in mean time between the congruent and incongruent scores, while the ALTERNATIVE hypothesis is that there is a significant difference.

Multiple hypothesis tests could be performed, such as the z-test, t-test or F-test, but not all of them are suitable for us. We don't have the population standard deviation or other population parameters, so we can't go with the z-test. F-tests are suited to comparing more than two groups. We are running two different tests on the same sample, and because both tests are taken by the same participants, this calls for a dependent (paired) t-test. Hence we go with one-tailed and two-tailed t-tests to evaluate how significant the effect of incongruent words is on the time taken. We also give a 95% confidence interval. An r-squared value is given as well, to show how much of the variation in time between the two tests is due to the Stroop effect.
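The dependent t-test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `paired_t_test`, its `t_crit` parameter (a critical value looked up from a t-table) and the five demo times are all made up for the example and are not the study data. The r-squared value uses the common effect-size formula r² = t² / (t² + df).

```python
import math
import statistics


def paired_t_test(congruent, incongruent, t_crit):
    """Dependent (paired) t-test on two equal-length lists of times.

    t_crit is the two-tailed critical t value for the chosen confidence
    level and n - 1 degrees of freedom, taken from a t-table.
    """
    if len(congruent) != len(incongruent):
        raise ValueError("each participant needs one time per condition")
    # Work on the per-participant differences (incongruent - congruent).
    diffs = [i - c for c, i in zip(congruent, incongruent)]
    n = len(diffs)
    df = n - 1
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    se = statistics.stdev(diffs) / math.sqrt(n)
    t_stat = mean_diff / se
    # Confidence interval for the mean difference.
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    # Share of the variation attributed to the condition.
    r_squared = t_stat ** 2 / (t_stat ** 2 + df)
    return {"t": t_stat, "df": df, "mean_diff": mean_diff,
            "ci": ci, "r_squared": r_squared}


# Made-up times in seconds for five hypothetical participants (NOT the study data).
congruent_demo = [12.0, 14.0, 11.0, 15.0, 13.0]
incongruent_demo = [19.0, 21.0, 20.0, 24.0, 18.0]

# 2.776 is the two-tailed 95% critical t value for df = 4.
result = paired_t_test(congruent_demo, incongruent_demo, t_crit=2.776)
print(result)
```

The t statistic is then compared against the critical value: if it exceeds the critical value in absolute terms, the null hypothesis is rejected at that confidence level.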



| Statistic | Congruent | Incongruent |
|---|---|---|
| Mean | 14.051 | 22.0159 |
| Median | 14.356 | 21.017 |
| Mode | 10 to 15 (45.8%) | 20 to 25 (50%) |



Variability & Central Tendency

As we can see above, the mean and the median are close for both the congruent and the incongruent scores, but they are not equal. This tells us, firstly, that the distributions are not perfectly normal; otherwise the mean and the median would coincide in each. Secondly, there are not many outliers, but there are a few. That is why the difference between the mean and the median is very small but not zero.
There are different ways to handle outliers in the samples, but here we report the interquartile range (IQR). Identifying outliers is important because we need to see how spread out each distribution is. Outliers are flagged using the rule below.

`tt"Outliers " < Q_1 - 1.5 xx (IQR)`
`tt" or,"`
`tt" "> Q_3 + 1.5 xx (IQR)`

Congruent:

| `Q_1 = tt"11.71"` | `Q_3 = tt"16.39"` | `IQR = tt"4.68"` |

Incongruent:

| `Q_1 = tt"18.69"` | `Q_3 = tt"24.20"` | `IQR = tt"5.51"` |

`tt"Outliers (congruent): [None]"`

`tt"Outliers (incongruent): [34.288, 35.255]"`
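The 1.5 x IQR rule above can be sketched as follows. The helper names `fences` and `iqr_outliers` and the `demo` list are hypothetical and only serve the example. The quartile convention (`method="inclusive"`) is an assumption too: different conventions give slightly different Q1 and Q3, so this sketch may not reproduce the reported quartiles exactly. The first part simply plugs the reported quartiles into the rule.

```python
import statistics


def fences(q1, q3):
    """Return the (lower, upper) outlier fences for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def iqr_outliers(times):
    """Return (q1, q3, iqr, outliers) for a list of times."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    # method="inclusive" is an assumption; other conventions differ slightly.
    q1, _, q3 = statistics.quantiles(times, n=4, method="inclusive")
    lower, upper = fences(q1, q3)
    outliers = [t for t in times if t < lower or t > upper]
    return q1, q3, q3 - q1, outliers


# Fences computed from the quartiles reported above.
congruent_fences = fences(11.71, 16.39)
incongruent_fences = fences(18.69, 24.20)
print(congruent_fences, incongruent_fences)

# Made-up list of times (NOT the study data) with one obvious outlier.
demo = [18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 35.0]
print(iqr_outliers(demo))
```

With the reported incongruent quartiles, the upper fence works out to 24.20 + 1.5 x 5.51 = 32.465, which both reported outliers exceed.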


It is quite evident that there are only 2 outliers in the incongruent sample and none in the congruent one. This will be even clearer in the regression plot in the visualisation section. Hence we can say that the variability of the congruent data is less than that of the incongruent data, which means that the standard deviation of the congruent data is smaller than that of the incongruent data. Here are their respective standard deviations.


`S_c = tt"3.559"`

`S_i = tt"4.7970"`
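As a rough sketch of how such standard deviations are computed, the snippet below uses the sample formula with n - 1 in the denominator. Whether the values above were computed with n - 1 or n is an assumption here, and the helper name `sample_std` and the demo lists are made up for illustration (they are not the study data).

```python
import math
import statistics


def sample_std(times):
    """Sample standard deviation with n - 1 in the denominator."""
    mean = sum(times) / len(times)
    squared_devs = [(t - mean) ** 2 for t in times]
    return math.sqrt(sum(squared_devs) / (len(times) - 1))


# Made-up times in seconds (NOT the study data).
congruent_demo = [12.0, 14.0, 11.0, 15.0, 13.0]
incongruent_demo = [19.0, 21.0, 20.0, 24.0, 18.0]

s_c = sample_std(congruent_demo)
s_i = sample_std(incongruent_demo)

# The standard library's statistics.stdev uses the same n - 1 formula.
assert math.isclose(s_c, statistics.stdev(congruent_demo))
assert math.isclose(s_i, statistics.stdev(incongruent_demo))
print(s_c, s_i)
```

A larger value means the times are more spread out around their mean, which is the comparison made between the two conditions above.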