I'm following a tutorial on the usage of Python in bioinformatics. In the tutorial a Mann-Whitney U test was performed via the function below.

numpy.random.seed is called on the first line after the imports, but nothing else in the function appears to use it. What is the purpose of this call, given that it seemingly doesn't affect the results?

def mannwhitney(descriptor, verbose=False):

  from numpy.random import seed 
  from numpy.random import randn
  from scipy.stats import mannwhitneyu 

  seed(1)

  selection  =[descriptor, "Bioactivity_Class"]
  df = df_2class[selection]
  active = df[df.Bioactivity_Class == "active"]
  active = active[descriptor]

  selection=[descriptor,"Bioactivity_Class"]
  df = df_2class[selection]
  inactive = df[df.Bioactivity_Class == "inactive"]
  inactive = inactive[descriptor]

  stat,p = mannwhitneyu(active,inactive)

  #creating a result dataframe for easier interpretation 
  
  alpha = 0.05

  if p> alpha:
    interpretation = "Same distribution (fail to reject H0)"

  else: 
    interpretation = "Different distribution (reject H0)"

  results = pd.DataFrame({"Descriptor": descriptor, "Statistics": stat, "p": p,
                          "alpha": alpha, "Interpretation": interpretation},
                         index=[0])
  
  return results
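
For reference, here is a minimal check of the observation that the seed doesn't change the outcome. It is only a sketch: the sample values and the helper name `run_test` are made up for illustration and are not from the tutorial. It assumes `mannwhitneyu` is called with its default settings, as in the function above, and it also shows the one thing the seed does control, namely draws from NumPy's global random generator such as `randn` (which the function imports but never calls).

```python
import numpy as np
from scipy.stats import mannwhitneyu


def run_test(seed_value):
    """Seed NumPy's global generator, then run the test on fixed (made-up) data."""
    np.random.seed(seed_value)
    active = [5.1, 6.3, 7.2, 8.4, 9.0, 6.8]      # hypothetical descriptor values
    inactive = [3.2, 4.1, 5.0, 4.7, 6.1, 3.9]    # hypothetical descriptor values
    return mannwhitneyu(active, inactive)


# Same data, different seeds: the statistic and p-value are compared directly.
stat_a, p_a = run_test(1)
stat_b, p_b = run_test(12345)
same_result = (stat_a == stat_b) and (p_a == p_b)
print("Same result with different seeds:", same_result)

# What the seed does affect: numbers drawn from the global generator.
np.random.seed(1)
a = np.random.randn(3)
np.random.seed(1)
b = np.random.randn(3)
print("randn reproducible after reseeding:", np.array_equal(a, b))
```

With fixed input data the test itself draws no random numbers, so the seed would only matter if the samples were generated randomly (for example with `randn`) before being passed to `mannwhitneyu`.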
        
