Can a t-test be used on large samples with a non-normal distribution?
For example, group A has 100K users and group B has 100K users. I want to test whether the difference in average session duration between these two groups is statistically significant.
1st method) We calculated the average session duration of these users on the day after the AB test (DAY1) as
- 31.2 min for group A
- 30.2 min for group B.
We know that the DAY1 session durations of users in groups A and B are not normally distributed. In that case, is it correct to use a two-sample t-test to compare the DAY1 average session durations of the two groups? (We would take n = 100K per group.) (Some sources say that t-scores computed on large samples give accurate results even when the distribution is non-normal.)
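To make the 1st method concrete, here is a minimal sketch of what I have in mind, using only the Python standard library. The data are simulated: the log-normal shape and its parameters are my assumptions, chosen only so that the values are right-skewed with means near 31.2 and 30.2 minutes. The helper `welch_t_large_n` is a hypothetical name. It computes Welch's (unequal-variance) t statistic and, because n is about 100K per group, takes the p-value from the standard normal rather than the t distribution.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def welch_t_large_n(a, b):
    """Welch's two-sample t statistic with a normal-approximation p-value.

    With around 100K observations per group, the t distribution is
    practically indistinguishable from the standard normal.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (fmean(a) - fmean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))  # two-sided p-value
    return t, p


rng = random.Random(0)
n = 100_000
# Hypothetical right-skewed session durations in minutes (log-normal).
# mean of a log-normal = exp(mu + sigma^2 / 2), so mu = log(mean) - 0.5 for sigma = 1.
group_a = [rng.lognormvariate(math.log(31.2) - 0.5, 1.0) for _ in range(n)]
group_b = [rng.lognormvariate(math.log(30.2) - 0.5, 1.0) for _ in range(n)]

t_stat, p_value = welch_t_large_n(group_a, group_b)
print(f"mean A = {fmean(group_a):.2f}, mean B = {fmean(group_b):.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

Here each user contributes one observation, so the test sees n = 100K per group.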
2nd method) Would it be correct to calculate the t-score over the daily average session durations for the days the AB test is running? E.g., in the scenario below, the average daily session duration of the 100K users in each of groups A and B is calculated. We would treat the number of days as the number of observations, giving n = 30, and run the two-sample t-test on those n = 30 daily averages.
Group | day0 avg duration | day1 avg duration | day2 avg duration | ... | day30 avg duration |
---|---|---|---|---|---|
A | 30.2 | 31.2 | 32.4 | ... | 33.2 |
B | 29.1 | 30.2 | 30.4 | ... | 30.1 |
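As a sketch of the 2nd method, the same statistic can be computed over daily averages. The daily values below are randomly generated placeholders, not the real numbers behind the table, and `welch_t_and_df` is a hypothetical helper name. It returns Welch's t statistic and the Welch-Satterthwaite degrees of freedom; the p-value would then come from a t distribution with that many degrees of freedom, which I have left out here.

```python
import math
import random
from statistics import fmean, variance


def welch_t_and_df(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (fmean(a) - fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


rng = random.Random(1)
days = 30
# Hypothetical daily averages in minutes; each value stands in for the
# mean over about 100K users on that day.
daily_a = [31.5 + rng.gauss(0, 1.0) for _ in range(days)]
daily_b = [30.0 + rng.gauss(0, 1.0) for _ in range(days)]

t_stat, df = welch_t_and_df(daily_a, daily_b)
print(f"n per group = {days}, t = {t_stat:.2f}, df = {df:.1f}")
```

In this version the sample size seen by the test is the number of days (n = 30), not the number of users.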
Do these methods give correct results, or is another method needed in such scenarios? Does it make sense to use a t-test on large samples in an AB test?