I would like to ask a question about AWS SageMaker. I must confess that I'm quite a newbie to the subject, and therefore I was very happy with the SageMaker Canvas app. It is really easy to use and gives me some nice results.
First of all, my model. I am trying to predict solar power production (import) based on the time (dt), the AWS IoT thing name (thingname), cloud percentage (clouds) and temperature (temp). I have a CSV file filled with data measured by IoT things:
clouds + temp + dt + thingname => import
dt,clouds,temp,import,thingname
2022-08-30 07:45:00+02:00,1.0,0.1577,0.03,***
2022-08-30 08:00:00+02:00,1.0,0.159,0.05,***
2022-08-30 08:15:00+02:00,1.0,0.1603,0.06,***
2022-08-30 08:30:00+02:00,1.0,0.16440000000000002,0.08,***
2022-08-30 08:45:00+02:00,,,0.09,***
2022-08-30 09:00:00+02:00,1.0,0.17,0.12,***
2022-08-30 09:15:00+02:00,1.0,0.1747,0.13,***
2022-08-30 09:30:00+02:00,1.0,0.1766,0.15,***
2022-08-30 09:45:00+02:00,0.75,0.1809,0.18,***
2022-08-30 10:00:00+02:00,1.0,0.1858,0.2,***
2022-08-30 10:15:00+02:00,1.0,0.1888,0.21,***
2022-08-30 10:30:00+02:00,0.75,0.1955,0.24,***
In SageMaker Canvas I upload the CSV and build the model. It is all very easy. In the Predict tab I then upload a CSV that has no import column and contains weather data from an API for some future moments:
dt,thingname,temp,clouds
2022-09-21 10:15:00+02:00,***,0.1235,1.0
2022-09-21 10:30:00+02:00,***,0.1235,1.0
2022-09-21 10:45:00+02:00,***,0.1235,1.0
2022-09-21 11:00:00+02:00,***,0.1235,1.0
2022-09-21 11:15:00+02:00,***,0.12689999999999999,0.86
2022-09-21 11:30:00+02:00,***,0.12689999999999999,0.86
2022-09-21 11:45:00+02:00,***,0.12689999999999999,0.86
2022-09-21 12:00:00+02:00,***,0.12689999999999999,0.86
2022-09-21 12:15:00+02:00,***,0.1351,0.69
2022-09-21 12:30:00+02:00,***,0.1351,0.69
2022-09-21 12:45:00+02:00,***,0.1351,0.69
From this data SageMaker Canvas predicts some quite realistic numbers, from which I assume the model is built well. So I want to move this model to my Greengrass core device to do predictions on site. I found the location of the best model using the sharing link to the Jupyter notebook.
From reading in the AWS docs I seem to have a few options to run the model on an edge device:
- Run the Greengrass SageMaker Edge component, deploy the model as a component and write an inference component
- Run the SageMaker Edge Agent yourself
- Just download the model yourself and do your thing with it on the device
Now it seems that SageMaker used XGBoost to create the model. I found the xgboost-model file and downloaded it to the device.
But here is where the trouble started: SageMaker Canvas never gives any information about how it preprocesses the CSV, so I really have no clue how to make a prediction using the model. I get some results when I load the same CSV file I used for the Canvas prediction, but the values are completely different and not realistic at all:
# pip install xgboost==1.6.2
import xgboost as xgb

filename = 'solar-prediction-data.csv'
# Load the CSV directly through XGBoost's text input URI syntax
dpredict = xgb.DMatrix(f'{filename}?format=csv')

# Load the model file downloaded from SageMaker
model = xgb.Booster()
model.load_model('xgboost-model')

result = model.predict(dpredict)
print('Prediction result::')
print(result)
I read that the column order matters and that the CSV must not contain a header. But even with that, the output does not get close to the SageMaker Canvas result.
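To make the column-order point concrete, here is a minimal sketch of how the prediction CSV could be rewritten without a header and in the training column order. It assumes the model expects the features in the order of the training file minus the target (dt, clouds, temp, thingname); I have not been able to confirm that this is what Canvas actually does, and the name reorder_without_header is made up for illustration.

```python
import csv
import io

# Assumed feature order: the training columns with the target ("import") removed.
# Canvas does not document this, so treat it as a guess.
FEATURE_ORDER = ["dt", "clouds", "temp", "thingname"]


def reorder_without_header(src_text, feature_order=FEATURE_ORDER):
    """Return CSV text with columns in feature_order and no header row."""
    reader = csv.DictReader(io.StringIO(src_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([row[name] for name in feature_order])
    return out.getvalue()


# Two rows taken from the prediction file above
sample = (
    "dt,thingname,temp,clouds\n"
    "2022-09-21 10:15:00+02:00,***,0.1235,1.0\n"
    "2022-09-21 11:15:00+02:00,***,0.12689999999999999,0.86\n"
)
reordered = reorder_without_header(sample)
print(reordered)
```

Note that dt and thingname are still text after this step, so reordering alone probably does not give the model purely numeric features.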
I also tried using pandas:
# pip install xgboost==1.6.2
import xgboost as xgb
import pandas as pd

filename = 'solar-prediction-data.csv'
# Read the CSV without treating the first row as a header
df = pd.read_csv(filename, index_col=None, header=None)
dpredict = xgb.DMatrix(df, enable_categorical=True)

model = xgb.Booster()
model.load_model('xgboost-model')

# pred_interactions=True asks for SHAP interaction values rather than plain predictions
result = model.predict(dpredict, pred_interactions=True)
print('Prediction result::')
print('===============')
print(result)
But this last one always gives me the following error:
ValueError: DataFrame.dtypes for data must be int, float, bool or category. When
categorical type is supplied, DMatrix parameter `enable_categorical` must
be set to `True`. Invalid columns:dt, thingname
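As far as I understand the error, dt and thingname have to become int, float, bool or category before DMatrix accepts them. Below is a minimal sketch of one way to turn them into plain numbers with the standard library. The chosen time features (hour, minute, day of year) and the integer codes for thingname are my own assumptions and not necessarily what Canvas does internally, so a model trained by Canvas may well expect something else; encode_row and thing_codes are hypothetical names.

```python
from datetime import datetime


def encode_row(dt_text, thingname, thing_codes):
    """Turn one (dt, thingname) pair into a list of ints.

    Assumption: hour, minute and day of year are the time features.
    Canvas may encode timestamps differently.
    """
    dt = datetime.fromisoformat(dt_text)  # parses "2022-09-21 10:15:00+02:00"
    # Give each distinct thingname an integer code, in order of first appearance
    code = thing_codes.setdefault(thingname, len(thing_codes))
    return [dt.hour, dt.minute, dt.timetuple().tm_yday, code]


thing_codes = {}
rows = [
    ("2022-09-21 10:15:00+02:00", "***"),
    ("2022-09-21 12:45:00+02:00", "***"),
]
encoded = [encode_row(dt_text, name, thing_codes) for dt_text, name in rows]
print(encoded)
```

Rows encoded like this contain only ints, so a DataFrame built from them together with the temp and clouds floats should pass the dtype check, but I don't know whether the resulting features match what the Canvas model was trained on.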
To be honest, I'm completely stuck and hope someone around here can give me some advice or clue on how I can proceed.
Thanks! Kind regards
Hacor