Convert Z-score (Z-value, standard score) to p-value for normal distribution in Python

Question

How does one convert a Z-score from the Z-distribution (standard normal distribution, Gaussian distribution) to a p-value? I have yet to find the magical function in Scipy's stats module to do this, but one must be there.

I have started one here http://statsandprobability.codeplex.com/ — user123976, Feb 11 '11 at 08:19

score 72 · Accepted Answer · edited Mar 12 '15 at 22:48

I like the survival function (upper tail probability) of the normal distribution a bit better, because the function name is more informative:

import scipy.stats

p_values = scipy.stats.norm.sf(abs(z_scores))  # one-sided

p_values = scipy.stats.norm.sf(abs(z_scores)) * 2  # two-sided

The normal distribution "norm" is one of around 90 distributions in scipy.stats.

norm.sf also calls the corresponding function in scipy.special, as in gotgenes' example.

A small advantage of the survival function, sf: numerical precision should be better for quantiles close to 1 than when using the cdf.
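To illustrate the precision point, here is a minimal standard-library sketch. It assumes `math.erfc` as a stand-in for `norm.sf` (the expression `0.5 * erfc(z / sqrt(2))` is mathematically the same upper tail probability) and `statistics.NormalDist` as a stand-in for `norm.cdf`; the value `z = 10.0` is only an illustrative choice.

```python
import math
from statistics import NormalDist

z = 10.0  # an extreme z-score, chosen only for illustration

# Upper tail via subtraction: the cdf rounds to exactly 1.0 in double
# precision, so the tiny tail probability is lost.
tail_by_subtraction = 1.0 - NormalDist().cdf(z)

# Upper tail computed directly with the complementary error function.
tail_direct = 0.5 * math.erfc(z / math.sqrt(2.0))

print(tail_by_subtraction)  # 0.0
print(tail_direct)          # roughly 7.6e-24
```

The direct tail computation keeps a meaningful value where the subtraction collapses to zero, which is the kind of difference the answer refers to.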

Myles Baker · Answer 2 · 2021-04-06T00:16:36.467

45

I think the cumulative distribution function (cdf) is preferred to the survivor function. The survivor function is defined as 1-cdf, and may communicate improperly the assumptions the language model uses for directional percentiles. Also, the percentage point function (ppf) is the inverse of the cdf, which is very convenient.

>>> import scipy.stats as st
>>> st.norm.ppf(.95)
1.6448536269514722
>>> st.norm.cdf(1.64)
0.94949741652589625

Edit: A user requested an example for "vectors":

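import numpy as np
import scipy.stats as st

vector = np.array([.925, .95, .975, .99])
# ppf maps each cumulative probability to its quantile (a z-score)
p_values = [st.norm.ppf(v) for v in vector]
# cdf maps each quantile back to the cumulative probability
f_values = [st.norm.cdf(p) for p in p_values]

for p, f in zip(p_values, f_values):
    print(f'p: {p}, \tf: {f}')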

Yields:

p: 1.4395314709384563,  f: 0.925
p: 1.6448536269514722,  f: 0.95
p: 1.959963984540054,   f: 0.975
p: 2.3263478740408408,  f: 0.99

edited Apr 06 '21 at 00:16

answered Jan 01 '14 at 18:25

Myles Baker


Could you provide a more complete code answer that shows how to convert a vector of Z-scores to a vector of p-values? – Robin De Schepper Apr 03 '21 at 11:13

@RobinDeSchepper Added – Myles Baker Apr 06 '21 at 00:17

I may be mistaken, but am I not seeing z-scores and percentiles, but **no** p-values in the above solution? I like the solution a lot; it's just I don't see any p-values; they seem to be z-scores. – George Hayward Apr 08 '22 at 02:46

gotgenes · Answer 3 · 2010-08-16T20:54:38.527

12

Aha! I found it: scipy.special.ndtr! This also appears under scipy.stats.stats.zprob (which is just a pointer to ndtr).

Specifically, given a one-dimensional numpy.array instance z_scores, one can obtain the p-values as

p_values = 1 - scipy.special.ndtr(z_scores)

or alternatively

p_values = scipy.special.ndtr(-z_scores)
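The two expressions agree because the standard normal distribution is symmetric about zero. Writing Φ for the standard normal cumulative distribution function (notation assumed here; it is the function that ndtr evaluates), the one-sided upper-tail p-value is:

```latex
\[
  p_{\text{upper}} \;=\; P(Z \ge z) \;=\; 1 - \Phi(z) \;=\; \Phi(-z)
\]
```

The second form avoids subtracting a number close to 1, so it tends to behave better numerically for large positive z.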

edited Aug 16 '10 at 20:54

answered Aug 16 '10 at 19:46

gotgenes


Strange terminology, "Z-distribution" instead of "Normal curve". Z-score I'd probably call standard deviation in this context as well. – Nick T Aug 16 '10 at 19:52
Well, the Z-distribution == "standard normal distribution" == `N(0, 1)`. That said, your point is well taken. I have updated the question to reflect the various terminology for the same concepts. – gotgenes Aug 16 '10 at 20:43

score 8 · Answer 4 · answered Apr 25 '20 at 21:21

Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module.

It can be used to apply the inverse cumulative distribution function (inv_cdf, also known as the quantile function or the percent-point function) and the cumulative distribution function (cdf):

from statistics import NormalDist

NormalDist().inv_cdf(0.95)
# 1.6448536269514715
NormalDist().cdf(1.64)
# 0.9494974165258963
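Since the question asks for p-values rather than cumulative probabilities, here is a hedged sketch of how the same NormalDist object could turn a list of z-scores into one-sided and two-sided p-values. The names `std_normal`, `one_sided`, and `two_sided` and the sample z-scores are illustrative assumptions, not part of the answer above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z_scores = [-2.5, 0.0, 1.64, 1.96]  # illustrative values only

# One-sided (upper tail) p-value P(Z >= z), written as cdf(-z) so that
# no subtraction from 1 is needed.
one_sided = [std_normal.cdf(-z) for z in z_scores]

# Two-sided p-value P(|Z| >= |z|).
two_sided = [2.0 * std_normal.cdf(-abs(z)) for z in z_scores]

for z, p1, p2 in zip(z_scores, one_sided, two_sided):
    print(f"z = {z:5.2f}  one-sided p = {p1:.4f}  two-sided p = {p2:.4f}")
```

Note that this one-sided version is the upper tail for the signed z-score; wrapping z in abs() first, as the accepted answer does, gives the tail in whichever direction the score points.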

score 3 · Answer 5 · edited Oct 03 '17 at 19:30


From formula:

import numpy as np
import scipy.special as scsp
def z2p(z):
    """From z-score return the lower-tail probability P(Z <= z), i.e. the normal CDF at z."""
    return 0.5 * (1 + scsp.erf(z / np.sqrt(2)))
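For reference, the formula this function appears to implement can be written out as follows. Φ for the standard normal CDF and the label p for the two-sided p-value are notation assumed here, not taken from the answer.

```latex
\[
  \Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right],
  \qquad
  1 - \Phi(z) = \tfrac{1}{2}\,\operatorname{erfc}\!\left(\frac{z}{\sqrt{2}}\right),
  \qquad
  p_{\text{two-sided}} = \operatorname{erfc}\!\left(\frac{|z|}{\sqrt{2}}\right)
\]
```

So the value returned is the lower-tail probability; an upper-tail p-value is its complement, and the erfc form computes that complement directly.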


Brad Solomon


answered Dec 29 '14 at 06:06

Arnaldo P. Figueira Figueira



This isn't the best solution; it isn't vectorized like the above answer. – hlin117 Feb 22 '15 at 17:00

You can get a vectorized version simply by replacing `math.erf` and `math.sqrt` by `erf` and `sqrt` from scipy. – NullSpace Sep 22 '15 at 13:56
this is the best solution, if z is not a vector – Erik Aronesty Jan 14 '19 at 11:39

score 1 · Answer 6 · edited Aug 08 '19 at 16:40


p_value = scipy.stats.norm.sf(abs(z_score_max))  # one-sided test
p_value = scipy.stats.norm.sf(abs(z_score_max)) * 2  # two-sided test

The survival function (sf) in scipy.stats yields the upper-tail p-values that match a z-score table in an intro/AP stats book. The probability density function (pdf) should not be used here: it returns the height of the density curve at z, not a tail probability.
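As a quick sanity check on the difference between a density and a tail probability, the sketch below compares the two at z = 1.96. It uses the standard library's `statistics.NormalDist` in place of scipy.stats.norm (an assumption made to keep the example dependency-free), and the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = 1.96  # illustrative z-score

density = std_normal.pdf(z)                # height of the bell curve at z
upper_tail = std_normal.cdf(-z)            # P(Z >= z), a one-sided p-value
two_sided = 2.0 * std_normal.cdf(-abs(z))  # P(|Z| >= |z|)

print(f"pdf(z)      = {density:.4f}")
print(f"one-sided p = {upper_tail:.4f}")
print(f"two-sided p = {two_sided:.4f}")
```

The density at 1.96 is about 0.058, while the one-sided tail area is about 0.025, which is the figure a z-table reports.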


HK boy


answered Aug 08 '19 at 16:03

Vivek Gopalan


Sunil Yadav · Answer 7 · 2020-09-10T18:44:54.933

For SciPy lovers: though this is an old question, it is still relevant, and we can handle not only the normal distribution but others as well, so here is a solution for a few more distributions:

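from scipy.stats import chi2, norm, t

# Rounding precision; not defined in the original snippet, so 6 is an assumed value.
decimal_limit = 6


def get_p_value_normal(z_score: float) -> float:
    """Get the upper-tail p-value for the normal (Gaussian) distribution.

    Args:
        z_score (float): z score

    Returns:
        float: p value
    """
    return round(float(norm.sf(z_score)), decimal_limit)


def get_p_value_t(t_score: float, df: float) -> float:
    """Get the upper-tail p-value for the t distribution.

    Args:
        t_score (float): t statistic
        df (float): degrees of freedom (required by the t distribution)

    Returns:
        float: p value
    """
    return round(float(t.sf(t_score, df)), decimal_limit)


def get_p_value_chi2(chi2_stat: float, df: float) -> float:
    """Get the upper-tail p-value for the chi2 distribution.

    Args:
        chi2_stat (float): chi-squared statistic
        df (float): degrees of freedom

    Returns:
        float: p value
    """
    # sf gives the tail probability; ppf would return a quantile instead
    return round(float(chi2.sf(chi2_stat, df)), decimal_limit)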
