
I am trying to save a DataFrameMapper object so I can use it on new data for a model.

    import dill
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn_pandas import DataFrameMapper

    mapper = DataFrameMapper([
        (['price', 'Argentina', 'Canada', 'Australia', 'barcat_numeric'], None),
        ('TTL', CountVectorizer(ngram_range=(1, 2))),
        ('BARCAT', CountVectorizer(ngram_range=(1, 2)))
    ])
    with open('company_dill.pkl', 'wb') as f:
        dill.dump(mapper, f)

When I read the mapper back in:

    with open('company_dill.pkl', 'rb') as f:
        mapper_v = dill.load(f)
    print(type(mapper_v))

The loaded object has the expected type, but when I try to use it I get:

---> 20     X = mapper.transform(data_frame)
 21     return X

C:\Users\eliav\Anaconda3\lib\site-packages\sklearn_pandas\dataframe_mapper.py in transform(self, X)
    269         extracted = []
    270         self.transformed_names_ = []
--> 271         for columns, transformers, options in self.built_features:
    272             input_df = options.get('input_df', self.input_df)
    273             # columns could be a string or list of

TypeError: 'NoneType' object is not iterable

When I do not save to pickle it works fine. I have tried both pickle and dill.

eliavs

1 Answer


OK, I found my mistake: I saved the DataFrameMapper object before I fitted it to the data. This is the right way:

    import pickle

    mapper = DataFrameMapper([
        (['price', 'Argentina', 'Canada', 'Australia', 'barcat_numeric'], None),
        ('TTL', CountVectorizer(ngram_range=(1, 2))),
        ('BARCAT', CountVectorizer(ngram_range=(1, 2)))
    ])
    # Fit first so the pickled object carries its fitted state.
    X = mapper.fit_transform(data_frame)
    with open('mapper.pkl', 'wb') as f:
        pickle.dump(mapper, f)
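
To make the cause easier to see, here is a small sketch that uses only the standard library. `ToyMapper` is a made-up stand-in, not the real DataFrameMapper: it only assumes, as the traceback suggests, that `built_features` stays `None` until `fit` is called. Pickling keeps whatever state the object has at that moment, so a copy saved before fitting fails in `transform`, and a copy saved after fitting works.

```python
import pickle


class ToyMapper:
    """Hypothetical stand-in for a mapper whose state is built during fit()."""

    def __init__(self, columns):
        self.columns = columns
        self.built_features = None  # assumed to be filled in only by fit()

    def fit(self, rows):
        # Record the sorted set of values seen in each configured column.
        self.built_features = [
            (col, sorted({row[col] for row in rows})) for col in self.columns
        ]
        return self

    def transform(self, rows):
        out = []
        for row in rows:
            encoded = []
            # Iterating over None here raises the TypeError from the question.
            for col, vocab in self.built_features:
                encoded.append(vocab.index(row[col]))
            out.append(encoded)
        return out


rows = [{"TTL": "red shoe"}, {"TTL": "blue hat"}]

# Pickled before fit(): the restored copy has no fitted state.
unfitted = pickle.loads(pickle.dumps(ToyMapper(["TTL"])))
try:
    unfitted.transform(rows)
    error = None
except TypeError as exc:
    error = str(exc)

# Pickled after fit(): the fitted state travels with the object.
fitted = pickle.loads(pickle.dumps(ToyMapper(["TTL"]).fit(rows)))
encoded = fitted.transform(rows)
```

The same ordering applies to the real mapper: call `fit` or `fit_transform` on the training data first, then dump it, and only call `transform` on the loaded copy.
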
eliavs