Based on the docs, we can create different steps and chain them together in a SageMaker pipeline. But I am wondering: if I wanted to run just one training step, without a processing step, as in the example below, would I be able to pass an S3 location as an argument instead of the output of a previous step (i.e. step_process)? In other words, how can I pass an S3 location URI instead of step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri in the inputs below?
inputs={
    "train": TrainingInput(
        s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
            "train"
        ].S3Output.S3Uri,
        content_type="text/csv"
    ),
    "validation": TrainingInput(
        s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
            "validation"
        ].S3Output.S3Uri,
        content_type="text/csv"
    )
}
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.steps import TrainingStep
from sagemaker.xgboost.estimator import XGBoost

pipeline_session = PipelineSession()
xgb_estimator = XGBoost(..., sagemaker_session=pipeline_session)
step_args = xgb_estimator.fit(
    inputs={
        "train": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "train"
            ].S3Output.S3Uri,
            content_type="text/csv"
        ),
        "validation": TrainingInput(
            s3_data=step_process.properties.ProcessingOutputConfig.Outputs[
                "validation"
            ].S3Output.S3Uri,
            content_type="text/csv"
        )
    }
)
step_train = TrainingStep(
    name="TrainAbaloneModel",
    step_args=step_args,
)
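For illustration, here is a minimal sketch of the variant I have in mind. It assumes that `TrainingInput` accepts a plain S3 URI string for `s3_data`, which is how it is normally used outside of pipelines. The bucket name, prefix, and the two helper functions (`build_channel_uris`, `make_training_inputs`) are hypothetical and only there to make the idea concrete.

```python
def build_channel_uris(bucket, prefix):
    """Return fixed S3 URIs for the train and validation channels.

    Hypothetical helper: the bucket/prefix layout is an assumption,
    not something taken from the SageMaker docs.
    """
    base = f"s3://{bucket}/{prefix.strip('/')}"
    return {
        "train": f"{base}/train/",
        "validation": f"{base}/validation/",
    }


def make_training_inputs(uris, content_type="text/csv"):
    """Wrap each plain S3 URI string in a TrainingInput.

    Hypothetical helper. The import is done lazily so the URI helper
    above can be used without the SageMaker SDK installed.
    """
    from sagemaker.inputs import TrainingInput

    return {
        channel: TrainingInput(s3_data=uri, content_type=content_type)
        for channel, uri in uris.items()
    }


# Hypothetical bucket and prefix, replace with a real location.
uris = build_channel_uris("my-bucket", "abalone")
print(uris["train"])  # prints "s3://my-bucket/abalone/train/"
```

The intent is that `make_training_inputs(uris)` would be passed as `inputs=` to `xgb_estimator.fit(...)` in place of the `step_process.properties...` references, so that the `TrainingStep` no longer depends on a processing step. If the location should be configurable per pipeline execution, a `ParameterString` from `sagemaker.workflow.parameters` might be usable as `s3_data` instead of a hard-coded string, though I have not verified that here.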