I'm following a tutorial on the usage of Python in bioinformatics. In the tutorial a Mann-Whitney U test was performed via the function below.
numpy.random.seed was used in the first line after packages but nowhere else. I was wondering what is the use for this action as it seemingly doesn't effect the results?
def mannwhitney(descriptor, verbose=False):
from numpy.random import seed
from numpy.random import randn
from scipy.stats import mannwhitneyu
seed(1)
selection =[descriptor, "Bioactivity_Class"]
df = df_2class[selection]
active = df[df.Bioactivity_Class == "active"]
active = active[descriptor]
selection=[descriptor,"Bioactivity_Class"]
df = df_2class[selection]
inactive = df[df.Bioactivity_Class == "inactive"]
inactive = inactive[descriptor]
stat,p = mannwhitneyu(active,inactive)
#creating a result dataframe for easier interpretation
alpha = 0.05
if p> alpha:
interpretation = "Same distribution (fail to reject H0)"
else:
interpretation = "Different distribution (reject H0)"
results = pd.DataFrame ({"Descriptor": descriptor,"Statistics": stat,"p":p,
"alpha":alpha, "Interpretation":interpretation},
index =[0])
return results