I am getting this error in the data ingestion part of my training pipeline. It shows up when I run training_pipeline.py.

Full traceback:

Traceback (most recent call last):
  File "src\pipelines\training_pipeline.py", line 12, in <module>
    train_data_path,test_data_path = obj.initiate_data_ingestion()
TypeError: cannot unpack non-iterable NoneType object

training_pipeline.py:

import os
import sys
import pandas as pd
from src.logger import logging
from src.exception import CustomException
from src.components.data_ingestion import DataIngestion

if __name__=='__main__':
    obj = DataIngestion()
    train_data_path,test_data_path = obj.initiate_data_ingestion()
    print(train_data_path,test_data_path)

data_ingestion.py:

import os
import sys
from src.exception import CustomException
from src.logger import logging
import pandas as pd
from sklearn.model_selection import train_test_split
from dataclasses import dataclass


## initialize the data ingestion configuration

@dataclass
class DataIngestionconfig:
    train_data_path=os.path.join('artifacts','train.csv')
    test_data_path=os.path.join('artifacts','test.csv')
    raw_data_path=os.path.join('artifacts','raw.csv')

## create a data ingestion class
class DataIngestion:
    def __init__(self):
        self.ingestion_config=DataIngestionconfig()

    def initiate_data_ingestion(self):
        logging.info('Data Ingestion method starts')

        try:
            df=pd.read_csv(os.path.join('notebooks/data','gemstone.csv'))
            logging.info('Dataset read as pandas Dataframe')

            os.makedirs(os.path.dirname(self.ingestion_config.raw_data_path),exist_ok=True)

            df.to_csv(self.ingestion_config.raw_data_path,index=False)

            logging.info("Train test split")
            train_set,test_set = train_test_split(df,test_size=0.30,random_state=42)

            train_set.to_csv(self.ingestion_config.train_data_path,index=False,header=True)
            test_set.to_csv(self.ingestion_config.test_data_path,index=False,header=True)

            logging.info('Ingestion of data is completed')
            
            return(
                self.ingestion_config.train_data_path,
                self.ingestion_config.test_data_path
            ) 

        except Exception as e:
            logging.info('Error occurred in Data Ingestion config')

if __name__=="__main__":
    obj=DataIngestion()
    train_data_path,test_data_path=obj.initiate_data_ingestion()

I tried returning the two values as a list, but that didn't work either.

desertnaut

1 Answer

initiate_data_ingestion returns a tuple when no exception is raised, but returns None when one is. The return statement is inside the try block and the except block has no return statement, which in Python is the same as returning None. The caller, meanwhile, always expects a tuple to unpack. That is the source of the error.
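
The behaviour described above can be reproduced in isolation. This is only a minimal sketch: `load_paths` and its `fail` flag are hypothetical names made up for illustration, not part of the question's code.

```python
def load_paths(fail):
    """Mimic the question's pattern: return inside try, nothing in except."""
    try:
        if fail:
            # Stand-in for whatever really goes wrong (e.g. a missing CSV).
            raise FileNotFoundError("gemstone.csv not found")
        return ("artifacts/train.csv", "artifacts/test.csv")
    except Exception:
        # The error is swallowed here; with no return statement in this
        # branch, the function falls off the end and returns None.
        print("Error occurred in data ingestion")


# No exception: a tuple comes back and unpacking works.
train_path, test_path = load_paths(fail=False)

# Exception inside the function: the caller silently receives None.
result = load_paths(fail=True)

try:
    train_path, test_path = result
except TypeError as err:
    # Same TypeError as in the question's traceback.
    unpack_error = str(err)
```

Note that the message printed in the except branch says nothing about the original cause, which is why the traceback in the question only shows the unpacking failure.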

The correct way to handle exceptions is to make sure the function always returns the correct type (in this case, a tuple) regardless of whether an exception occurs.

In general, you should never use a catch-all try...except block, because it leads to problems like this: you catch an exception without knowing what it is, so your program doesn't know how to handle it, and that leads to more errors down the line.

Your solution is to remove the try...except block, see what the real error is, and fix that. Then rewrite the code to catch only the exceptions that you expect and know how to handle.
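
As a rough sketch of what narrower handling could look like, the example below assumes the only failure you expect is a missing input file. `read_raw_rows` is a hypothetical helper, and it uses plain file reading instead of pandas so that it runs on its own. The point is that the function either returns a tuple or raises, so the caller never receives None.

```python
import logging
import os
import tempfile


def read_raw_rows(csv_path):
    """Return (header, rows) read from csv_path, or raise; never None."""
    try:
        with open(csv_path, newline="") as handle:
            lines = handle.read().splitlines()
    except FileNotFoundError:
        # The one failure we expect: log it, then re-raise so the caller
        # sees the real cause instead of an unpacking TypeError later.
        logging.error("Input file missing: %s", csv_path)
        raise
    # Assumes the file has at least a header line.
    return lines[0], lines[1:]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gemstone.csv")
    with open(path, "w") as handle:
        handle.write("carat,price\n0.3,500\n")

    # Success path: the tuple unpacks as the caller expects.
    header, rows = read_raw_rows(path)

    # Failure path: the original exception reaches the caller.
    try:
        read_raw_rows(os.path.join(tmp, "missing.csv"))
    except FileNotFoundError as err:
        missing_error = err
```

Any exception other than FileNotFoundError is not caught at all here, so it surfaces with its full traceback.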

Felix Fourcolor