
I am new to programming. I am trying to find targets for chemicals using the STITCH API. When I run the code, I get output for some of the inputs in the list, but a few of them fail with the IndexError quoted above (for example, with 10 input IDs I get output for 7 of them, and the other three fail with an IndexError). Please see my code below and help me find a solution. My input is a list of PubChem CIDs, nothing else.

import requests
import pandas as pd
from io import StringIO

out_df = pd.DataFrame(columns=['Chemical_ids', 'Target EnsemblID'])
Path = 'boswellia_stitch_input.txt'
df = pd.read_csv(Path, sep='\t')

for row, line in enumerate(df['Pubchem_CID']):
    base_url = "http://stitch.embl.de/api/tsv/interactorsList"
    Chemical_ids = (f'CID{line}')

    params = {"identifiers": Chemical_ids, "species": "9606", "limit": "400"}

    res = requests.get(base_url, params=params)

    if res.status_code == 200:
        data = [row.split('.')[1] if '.' in row else row for row in res.text.split('\n')[1:-1]]
        result = pd.DataFrame(data[1:], columns=[data[0]])
        
        temp = pd.DataFrame()
        
        temp['Target EnsemblID'] = result.values.flatten()
        temp.reset_index(drop=True, inplace=True)
        temp['Chemical_ids'] = [line]*len(result.index)
        #temp['Chemical_name'] = [df.at[row, 'molecule_name']]*len(result.index) 
        display(temp)
        out_df = out_df.append(temp, ignore_index=True)
#display(out_df)

My error:

IndexError                                Traceback (most recent call last)
<ipython-input-14-1bb58828596d> in <module>
      9     if res.status_code == 200:
     10         data = [row.split('.')[1] if '.' in row else row for row in res.text.split('\n')[1:-1]]
---> 11         result = pd.DataFrame(data[1:], columns=[data[0]])
     12 
     13         temp = pd.DataFrame()

IndexError: list index out of range
Boo
  • It fails when `data` is an empty list. Put a `print()` call above that line to see why. – BoarGules Aug 02 '21 at 10:15
  • Hi, I was mixing and matching code to see if the error would go away. When I entered `result = pd.DataFrame(data[1:])`, without the `columns=[data[0]]` part, it gave a complete list. I don't know how it works or what the logic behind it is. I'm not sure whether it is the right way of doing it. Is it fine? What is the reason, could you explain? – Boo Aug 03 '21 at 07:07
  • Without your data I can't explain anything. All I said was that the only thing in that line, as you presented it, with the data as it was at that point, that could give the error you report, was the reference to `data[0]` caused by `data` being an empty list. Since as you say you have been mixing & matching the code, it's impossible to explain anything. I can only explain code (and data) I can see. – BoarGules Aug 03 '21 at 08:07
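
The empty-list diagnosis in the comments can be illustrated without calling the API. This is only a sketch under assumptions: `parse_targets` is a hypothetical helper, the response layout (one header line followed by rows such as `9606.ENSP...`) is inferred from the question's own parsing code rather than from the STITCH documentation, and the sample bodies and IDs are made up. The helper returns an empty list when a response has no data rows, so the loop can skip that CID instead of indexing `data[0]`.

```python
def parse_targets(text):
    """Return the target IDs found in a TSV response body.

    Assumption (taken from the question's code): the first line is a
    header and each later line looks like '9606.ENSP00000000001'.
    """
    # Drop the header line and ignore blank lines (e.g. a trailing newline).
    lines = [ln.strip() for ln in text.splitlines()[1:] if ln.strip()]
    # Keep the part after the first dot, which drops the species prefix.
    return [ln.split('.', 1)[1] if '.' in ln else ln for ln in lines]


# Made-up response bodies: one with two targets, one with a header only.
sample_ok = "header\n9606.ENSP00000000001\n9606.ENSP00000000002\n"
sample_empty = "header\n"

for cid, body in [("CID1", sample_ok), ("CID2", sample_empty)]:
    targets = parse_targets(body)
    if not targets:
        # Nothing came back for this CID: skip it rather than index into
        # an empty list, which is what raises IndexError.
        print(f"{cid}: no targets returned, skipping")
        continue
    print(cid, targets)
```

This would also explain why `pd.DataFrame(data[1:])` runs: slicing an empty list yields another empty list, whereas indexing it with `data[0]` raises `IndexError`. If the first line of the response really is the header, the original line also uses the first target as the column name, so that target never appears in the rows.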
