Proportion Test: Z-test vs bootstrap/permutation - different results

Question

I'm learning hypothesis testing, and going through the following example:

The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are very satisfied with the service they receive. To test this claim, the local newspaper surveyed 100 customers, using simple random sampling. Among the sampled customers, 73 percent say they are very satisfied. Based on these findings, can we reject the CEO's hypothesis that 80% of the customers are very satisfied? Use a 0.05 level of significance.

I'm getting different results when calculating the p-value using the one-sample z-test compared to a bootstrapping method in Python.

Z-Test Method:

σ = sqrt[(0.8 * 0.2) / 100] = sqrt(0.0016) = 0.04

z = (p - P) / σ = (0.73 - 0.80) / 0.04 = -1.75

This is a two-tailed test, so P(z < -1.75) = 0.04 and P(z > 1.75) = 0.04.

Thus, the P-value = 0.04 + 0.04 = 0.08.
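The arithmetic above can be checked with a minimal sketch that uses only the standard library. The helper `normal_cdf` and the variable names are illustrative, not part of the original question; the normal CDF is computed from `math.erf`.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

p0 = 0.80      # hypothesized proportion (the CEO's claim)
p_hat = 0.73   # observed sample proportion
n = 100        # sample size

# Standard deviation of the sample proportion under the null hypothesis
sigma = math.sqrt(p0 * (1 - p0) / n)

# z-score of the observed proportion
z = (p_hat - p0) / sigma

# Two-tailed p-value: probability mass in both tails beyond |z|
p_value = 2 * normal_cdf(-abs(z))

print(f"sigma = {sigma:.4f}, z = {z:.4f}, p-value = {p_value:.4f}")
```

Rounded to two decimals, this reproduces the 0.08 computed by hand.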

Bootstrapping method (in Python):

The general method is to repeatedly draw random samples of size 100 from a population of 1,000,000 in which 80% are satisfied:

repeat 5000 times:
    take random sample of size 100 from population (1,000,000, 80% of which are satisfied)
    count the number of satisfied customers in sample, and append count to list satisfied_counts
calculate the number of times that a count of 73 or lower (that is, 73 or more extreme) occurs, and divide this by the number of items in satisfied_counts

Since it's a two-tailed test, double the result to get the p-value.

With this method, the p-value is 0.11.

Here is the code:

import numpy as np
import pandas as pd

population = np.array(['satisfied']*800000+['not satisfied']*200000)     # 80% satisfied (1M population)
num_runs = 5000
sample_size = 100
satisfied_counts = []

for i in range(num_runs):
    sample = np.random.choice(population, size=sample_size, replace = False)
    hist = pd.Series(sample).value_counts()
    satisfied_counts.append(hist['satisfied'])

p_val = sum(i <= 73 for i in satisfied_counts) / len(satisfied_counts) * 2
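For reference, the same simulation can be sketched without the Python loop. Drawing 100 customers without replacement from 800,000 satisfied and 200,000 not satisfied gives a hypergeometric count, so NumPy's `Generator.hypergeometric` can produce all 5000 counts in one call. The seed and the names `counts`, `one_sided` and `p_val_sim` are illustrative assumptions, and the exact result will vary with the seed.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for reproducibility only

num_runs = 5000
sample_size = 100

# Number of satisfied customers in each simulated sample, drawn without
# replacement from 800,000 satisfied and 200,000 not satisfied customers
counts = rng.hypergeometric(ngood=800_000, nbad=200_000,
                            nsample=sample_size, size=num_runs)

# Fraction of simulated samples with 73 or fewer satisfied customers
one_sided = np.mean(counts <= 73)

# Doubled for a two-tailed test, as in the loop above
p_val_sim = 2 * one_sided

print(f"simulated two-tailed p-value: {p_val_sim:.3f}")
```

Note that the comparison `counts <= 73` includes the value 73 itself, exactly as the original loop does.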

How come the two results are different? Any help / point in the right direction appreciated!

btilly · Accepted Answer · 2019-03-06T01:14:32.097


The difference is a form of fencepost/roundoff error.

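The normal approximation says that the probability of getting exactly 0.73 is approximately the probability that the corresponding normal distribution falls between 0.725 and 0.735. Your simulation counts every sample with 73 or fewer satisfied customers, so you should use 0.735 rather than 0.73 as the cutoff in the z-test (a continuity correction). That gives z = (0.735 - 0.80) / 0.04 = -1.625 and a two-tailed p-value of about 0.10, which is much closer to the simulated 0.11.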
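A small sketch of the effect of the 0.735 cutoff, using only the standard library. It compares the uncorrected z-test, the z-test with the cutoff moved to 0.735, and a doubled exact lower tail. The exact tail assumes a binomial model, which ignores the finite population of 1,000,000 but should be a very close approximation for a sample of 100. The helper `normal_cdf` and all variable names are illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n = 100          # sample size
p0 = 0.80        # hypothesized proportion
observed = 73    # satisfied customers in the sample

sigma = math.sqrt(p0 * (1 - p0) / n)

# Plain z-test: cutoff at 0.73
z_plain = (observed / n - p0) / sigma
p_plain = 2 * normal_cdf(z_plain)

# Continuity-corrected z-test: cutoff at 0.735, so the count 73 is fully included
z_corrected = ((observed + 0.5) / n - p0) / sigma
p_corrected = 2 * normal_cdf(z_corrected)

# Exact binomial lower tail P(X <= 73), doubled like the simulation
# (assumption: binomial model, i.e. sampling with replacement)
exact_lower = sum(math.comb(n, k) * p0**k * (1 - p0)**(n - k)
                  for k in range(observed + 1))
p_exact_doubled = 2 * exact_lower

print(f"plain z-test:      {p_plain:.4f}")
print(f"corrected z-test:  {p_corrected:.4f}")
print(f"doubled exact tail: {p_exact_doubled:.4f}")
```

The corrected z-test and the doubled exact tail should land close to each other and to the simulated value, while the plain z-test comes out lower because it effectively counts only half of the probability mass at 73.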

edited Mar 06 '19 at 01:14

answered Mar 05 '19 at 23:50


Thanks, this is it! – jafjaf Mar 07 '19 at 20:39
