I have a tuning step in my SageMaker pipeline. In that step I'm using a train.py script inside the tuning step's container, and inside train.py I import a module called 'dill'. It seems that the SageMaker SKLearn container didn't install the requirements as it is supposed to, and running the pipeline fails with an import error: ModuleNotFoundError: No module named 'dill'.
The estimator used by my tuning step:
sk_estimator = SKLearn(
    entry_point="train.py",
    role=role,
    instance_count=1,
    instance_type="ml.c5.xlarge",
    source_dir="custom-model-sklearn/src/",
    hyperparameters={
        "target_col": 'target_col',
        "penalty": 'none',
        "fit_intercept": True,
        "solver": 'lbfgs',
        "verbose": 0,
        "C": 1,
    },
    py_version="py3",
    framework_version="1.0-1",
    script_mode=True,
    sagemaker_session=pipeline_session,
    disable_profiler=True,
    output_path="s3://{}/{}/TrainingStep".format(bucket, model_prefix),
)
base_job_name = f'sklearn-model'
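For context, the estimator is wired into the tuning step roughly like this (just a sketch: the objective metric, the range for "C", the step name and train_input are illustrative placeholders, not my exact code):

from sagemaker.tuner import HyperparameterTuner, ContinuousParameter
from sagemaker.workflow.steps import TuningStep

# Illustrative tuner around the estimator above; the metric name/regex and the
# range for "C" are placeholders.
tuner = HyperparameterTuner(
    estimator=sk_estimator,
    objective_metric_name="validation:accuracy",
    metric_definitions=[{"Name": "validation:accuracy", "Regex": "accuracy=([0-9\\.]+)"}],
    hyperparameter_ranges={"C": ContinuousParameter(0.01, 10.0)},
    max_jobs=4,
    max_parallel_jobs=2,
)

# With a PipelineSession, tuner.fit() returns the step arguments for the pipeline step.
tuning_step = TuningStep(
    name="SklearnTuning",
    step_args=tuner.fit(inputs={"train": train_input}),  # train_input is a placeholder
)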
The train.py script and the requirements.txt file (which contains dill) are both inside the directory custom-model-sklearn/src/.
train.py:
import ...
import ...
.
.
import dill
.
.
requirements.txt:
dill
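As a temporary workaround I could install the requirement at the top of train.py before importing it (just a sketch, assuming pip is available inside the SKLearn training image), but I'd like to understand why requirements.txt isn't being picked up:

# Top of train.py -- fallback install in case the container skipped requirements.txt.
# Sketch only; assumes pip is available inside the SKLearn training image.
import subprocess
import sys

try:
    import dill
except ModuleNotFoundError:
    subprocess.check_call([sys.executable, "-m", "pip", "install", "dill"])
    import dill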
It seems that source_dir is configured correctly, since the error is raised from inside the train.py script, so the script itself was found and executed.
I'm currently moving my code from one AWS account to another. In the previous account I did the same thing, with the same directory hierarchy, and the module was installed inside the tuning container as expected.
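To rule out a difference in the resolved training image between the two accounts, I plan to compare the image URI each account ends up using (a sketch; the region below is illustrative):

from sagemaker import image_uris

# Resolve the SKLearn training image for the same framework version / instance type
# in each account's region and compare the resulting URIs.
uri = image_uris.retrieve(
    framework="sklearn",
    region="us-east-1",
    version="1.0-1",
    py_version="py3",
    instance_type="ml.c5.xlarge",
    image_scope="training",
)
print(uri)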
Any help would be appreciated.