6

I am currently setting up a pipeline in Amazon SageMaker. For that I set up an XGBoost estimator and trained it on my dataset. The training job runs as expected and the freshly trained model is saved to the specified output bucket. Later I want to re-import the model, which I do by getting model.tar.gz from the output bucket, extracting the model and deserializing the binary via pickle.

import pickle as pkl
import tarfile

# download the model artifact from AWS S3
!aws s3 cp s3://my-bucket/output/sagemaker-xgboost-2021-09-06-12-19-41-306/output/model.tar.gz .

# open the downloaded model artifact and load it into the 'model' variable
model_path = "model.tar.gz"
with tarfile.open(model_path) as tar:
    tar.extractall(path=".")

model = pkl.load(open("xgboost-model", "rb"))

Whenever I try to run this I receive an unpickling stack underflow:

---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-9-b88a7424f790> in <module>
     10     tar.extractall(path=".")
     11 
---> 12 model = pkl.load(open("xgboost-model", "rb"))
     13 

UnpicklingError: unpickling stack underflow

So far I have retrained the model to see whether the error occurs with a different model file, and it does. I also downloaded model.tar.gz and validated it via gunzip. The binary file xgboost-model is extracted correctly; I just can't unpickle it. Every occurrence of the error I found on Stack Overflow points at a damaged file, but this one is generated directly by SageMaker and I do not perform any transformation on it other than extracting it from model.tar.gz. Judging by the documentation and various tutorials, reloading a model like this seems to be quite a common use case. Locally I receive the same error with the downloaded file. I tried stepping directly into pickle to debug it but couldn't make much sense of it. The complete error stack looks like this:

Exception has occurred: UnpicklingError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
unpickling stack underflow
  File "/sagemaker_model.py", line 10, in <module>
    model = pkl.load(open('xgboost-model', 'rb'))
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/Cellar/python@3.9/3.9.1_5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

What could cause this issue, and at which step in the process could I apply changes to fix or work around the problem?

lmoe42
  • Try `with tarfile.open(fname, "r:gz") as tar`. The error `unpickling stack underflow` can happen when a pickle ends unexpectedly, which might indicate that the file is corrupt or is not being extracted correctly. – Mohamed Ali JAMAOUI Sep 08 '21 at 11:01
  • Using `tarfile.open(model_path, "r:gz")` followed by `tar.extractall(path=".")` and `model = pkl.load(open("xgboost-model", "rb"))` yields the same unpickling error, and the extraction itself at least produces the correctly named binary. – lmoe42 Sep 08 '21 at 13:17
  • Could you run the `file` command on the extracted pickle, just to confirm the file type? https://man7.org/linux/man-pages/man1/file.1.html – Mohamed Ali JAMAOUI Sep 08 '21 at 15:37
  • `file` just says 'data'. Thank you for your efforts. – lmoe42 Sep 08 '21 at 15:45
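
Since `file` only reports 'data', a quick look at the first and last bytes can at least tell whether the artifact looks like a pickle at all. The following is a rough sketch using only the standard library; `sniff_model_format` is a hypothetical helper name, and the check rests on the assumption that the pickle was written with protocol 2 or later (such pickles start with byte 0x80 followed by the protocol number and end with the STOP opcode `.`). Anything it cannot classify is reported as "unknown", which may well be XGBoost's own binary format.

```python
def sniff_model_format(path):
    """Guess whether a model file is a pickle, JSON, or something else.

    Heuristic only: protocol 0/1 pickles are not recognised.
    """
    with open(path, "rb") as f:
        head = f.read(2)          # first two bytes
        f.seek(0, 2)              # jump to the end to get the size
        size = f.tell()
        tail = b""
        if size:
            f.seek(-1, 2)         # last byte of the file
            tail = f.read(1)

    # Protocol >= 2 pickles: PROTO opcode (0x80) + protocol number, STOP ('.') at the end
    if len(head) == 2 and head[0] == 0x80 and 2 <= head[1] <= 5 and tail == b".":
        return "pickle"
    # JSON documents written by a serializer normally start with '{'
    if head[:1] == b"{":
        return "json"
    return "unknown"

# Example call (path assumed to be the extracted artifact):
# print(sniff_model_format("xgboost-model"))
```

If this returns "unknown" for a file that `pkl.load` rejects, the file is probably not damaged but simply not a pickle.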

3 Answers

7

Latest XGBoost versions seem to have changed this process. This worked for 1.3.x:

import xgboost as xgb

model = xgb.Booster()
model.load_model('xgboost-model')
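
If you don't know in advance which format a given artifact uses, one option is to try pickle first and fall back to the native loader. This is only a minimal sketch: `load_sagemaker_model` is a made-up helper name, and the list of caught exceptions is an assumption (pickle can raise other exception types on arbitrary input). As always, only unpickle files you trust.

```python
import pickle as pkl


def load_sagemaker_model(path="xgboost-model"):
    """Try to unpickle the artifact; fall back to xgboost's own loader."""
    try:
        with open(path, "rb") as f:
            return pkl.load(f)
    except (pkl.UnpicklingError, EOFError):
        # Not a pickle: assume a model written with Booster.save_model().
        # xgboost is imported lazily so the pickle path works without it.
        import xgboost as xgb

        booster = xgb.Booster()
        booster.load_model(path)
        return booster

# Example call (path assumed to be the extracted artifact):
# model = load_sagemaker_model("xgboost-model")
```

Trying pickle first keeps older artifacts working, while newer ones go through `Booster.load_model`.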
Tulio Casagrande
3

The issue is rooted in the version of the XGBoost framework that is used. From 1.3.0 on, the default output changed from pickle to JSON, and the SageMaker documentation does not seem to have been updated accordingly. So if you want to read the model via

    tar.extractall(path=".")

model = pkl.load(open("xgboost-model", "rb"))

as described in the SageMaker docs, you need to import the XGBoost framework with an earlier version, e.g. 1.2.1.

lmoe42
  • I am having the same problem. Would you be able to explain how you solved this issue step by step? – Bengi Koseoglu Nov 23 '21 at 12:33
  • You set the framework version when you construct the estimator, e.g. estimator = XGBoost(...). There the framework version has to be set to 1.2.1 and not ^1.3.0, because from that version on the model output is no longer serialized with pickle. – lmoe42 Nov 23 '21 at 16:28
1

After reading the answer by @lmoe42, I would also like to contribute to the question. As you can see if you follow the link in the error message (https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html), from version 1.0 of xgboost on models are saved as JSON, whereas before version 1.0 they were saved as pickle. I trained my xgboost model in 2020 with SageMaker, using xgboost version 0.90. However, the xgboost package installed in my notebook was version 1.5.1.

Solution:

  1. Check the version of the installed xgboost:

import xgboost as xgb
print(xgb.__version__)

  2. If the version is higher than 1.0, you need to downgrade it. To downgrade xgboost, you also need to downgrade some other packages:

pip install scipy==1.4.1
pip install shap==0.37.0
pip install xgboost==0.90.0

  3. Load the model as a pickle:

import pickle as pkl
import tarfile
with tarfile.open('model.tar.gz', 'r:gz') as t:
    t.extractall()
model = pkl.load(open("xgboost-model", 'rb'))
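
The version check in step 1 can also be done programmatically, so that a mismatch shows up before the unpickling error does. Below is a small standard-library sketch; `version_tuple` and `same_major_minor` are hypothetical helper names, and comparing only major and minor numbers is my own simplification, not an official compatibility rule.

```python
import re


def version_tuple(version):
    """Turn '1.5.1' or '0.90' into a tuple of ints, ignoring suffixes like 'rc1'."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def same_major_minor(installed, trained):
    """True if both version strings share the same major.minor numbers."""
    return version_tuple(installed)[:2] == version_tuple(trained)[:2]

# Example for the situation described above (trained with 0.90):
# import xgboost as xgb
# if not same_major_minor(xgb.__version__, "0.90"):
#     print("installed xgboost differs from the training version")
```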
Bengi Koseoglu