
I am getting this warning:

Instructions for updating: To construct input pipelines, use the tf.data module.

I have searched around, but I couldn't figure out the logic behind tf.data.Dataset, so I couldn't manage to convert my pd.DataFrame into a tf.data.Dataset.

I also need help with the predictions at the end of the code: I couldn't figure out the right way to compare the predictions (the highest-probability output) with the labels.

Note: the data has no column names, so I named the feature columns a1 to a784 so that I can assign them to feature_columns.

Thanks in advance.

Here is the code:

import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf 

from sklearn import metrics
from tensorflow.python.data import Dataset

mnist_df = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/mnist_train_small.csv",header=None)

mnist_df.describe()

mnist_df.columns

hand_df = mnist_df[0]

matrix_df = mnist_df.drop([0],axis=1)

matrix_df.head()

hand_df.head()

# build the column names a1 to a784
cols = ['a{}'.format(i) for i in range(1, 785)]

matrix_df.columns = cols

mnist_df = mnist_df.head(10000)

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(matrix_df, hand_df, test_size=0.3, random_state=101)

from sklearn.preprocessing import MinMaxScaler

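scaler = MinMaxScaler()

# scale the split data itself; scaling matrix_df after the split
# would leave X_train and X_test unscaled
X_train = pd.DataFrame(data=scaler.fit_transform(X_train),
                       columns=X_train.columns,
                       index=X_train.index)
X_test = pd.DataFrame(data=scaler.transform(X_test),
                      columns=X_test.columns,
                      index=X_test.index)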

# turn each column name (a1 to a784) into a numeric feature column
cols = [tf.feature_column.numeric_column(name) for name in cols]

matrix_df.head()

input_func = tf.estimator.inputs.pandas_input_fn(x=X_train,y=y_train,
                                                 batch_size=10,num_epochs=1000,
                                                 shuffle=True)

my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.03)

my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

model = tf.estimator.DNNClassifier(feature_columns=cols,
                                   hidden_units=[32,64],
                                      n_classes=10,
                                      optimizer=my_optimizer,
                                      config=tf.estimator.RunConfig(keep_checkpoint_max=1))

model.train(input_fn=input_func,steps=1000)

predict_input_func = tf.estimator.inputs.pandas_input_fn(x=X_test,
                                                         batch_size=50,
                                                         num_epochs=1,
                                                         shuffle=False)

pred_gen = model.predict(predict_input_func)

predictions = list(pred_gen)

predictions[0]
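
On comparing the predictions with the labels: each element of `predictions` is a dictionary, and for the canned `DNNClassifier` it includes a `class_ids` entry (a length-1 array holding the most probable class) alongside `probabilities`. A minimal sketch of the comparison follows; the helper names `predicted_classes` and `accuracy` are hypothetical, and `fake_predictions` is a made-up stand-in for `list(model.predict(...))` so that the snippet runs without a trained model.

```python
def predicted_classes(predictions):
    """Extract the most probable class from each prediction dict."""
    # 'class_ids' is a length-1 sequence holding the index of the
    # highest-probability class for that example.
    return [int(p["class_ids"][0]) for p in predictions]


def accuracy(predictions, labels):
    """Fraction of predictions whose class equals the true label."""
    predicted = predicted_classes(predictions)
    labels = [int(label) for label in labels]
    if len(predicted) != len(labels):
        raise ValueError("predictions and labels differ in length")
    matches = sum(p == t for p, t in zip(predicted, labels))
    return matches / len(labels)


# Made-up stand-in for list(model.predict(predict_input_func)).
fake_predictions = [{"class_ids": [3]}, {"class_ids": [7]}, {"class_ids": [2]}]
fake_labels = [3, 7, 1]

print(predicted_classes(fake_predictions))
print(accuracy(fake_predictions, fake_labels))
```

With the code above this would be called as `accuracy(predictions, y_test)`. Because the prediction input function uses `shuffle=False`, the predictions should come back in the same order as the rows of `X_test`. Since `sklearn.metrics` is already imported, `metrics.accuracy_score(y_test, predicted_classes(predictions))` is an alternative.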
hkacmaz
  • What is the issue you are having? The warning is harmless and should not stop the code from executing. – Sergei Lebedev Dec 28 '18 at 13:37
  • The warning is OK for now, but this way of doing it will not be available in a future version of TensorFlow, and as far as I know tf.data.Dataset is faster. However, I couldn't convert this pd.DataFrame input into a tf.data.Dataset. Also, at the end I couldn't figure out how to compare the predictions with the labels. Thanks. – hkacmaz Dec 28 '18 at 18:58
  • Not sure you would see a significant improvement since a dataset would merely wrap a bunch of in-memory tensors. However, if you do want to try, have a look at this answer: https://stackoverflow.com/a/50647478/262432 – Sergei Lebedev Jan 01 '19 at 21:32
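
For the pd.DataFrame to tf.data.Dataset conversion asked about above, one possible shape is sketched below. It is not taken from the linked answer. It assumes the whole DataFrame fits in memory and that an Estimator `input_fn` may return a `tf.data.Dataset`; `make_input_fn` is a hypothetical helper name, and its parameters simply mirror those passed to `pandas_input_fn` in the question.

```python
import numpy as np
import pandas as pd
import tensorflow as tf


def make_input_fn(features_df, labels=None, batch_size=10,
                  num_epochs=None, shuffle=True):
    """Return an input_fn that feeds a DataFrame through tf.data."""

    def input_fn():
        # One array per column, keyed by column name, so the keys match
        # the names given to tf.feature_column.numeric_column.
        features = {name: features_df[name].values.astype(np.float32)
                    for name in features_df.columns}
        if labels is None:
            # Prediction: features only.
            dataset = tf.data.Dataset.from_tensor_slices(features)
        else:
            # Training/evaluation: (features, label) pairs.
            dataset = tf.data.Dataset.from_tensor_slices(
                (features, np.asarray(labels, dtype=np.int64)))
        if shuffle:
            dataset = dataset.shuffle(buffer_size=len(features_df))
        # num_epochs=None repeats indefinitely; training then ends
        # when the `steps` limit is reached.
        return dataset.repeat(num_epochs).batch(batch_size)

    return input_fn
```

Under those assumptions, training would look like `model.train(input_fn=make_input_fn(X_train, y_train, num_epochs=1000), steps=1000)`, and prediction like `model.predict(make_input_fn(X_test, batch_size=50, num_epochs=1, shuffle=False))`.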
