Training issue in keras

Question

I am trying to train my LSTM model for sentiment analysis, but the program doesn't proceed at all after displaying the following output:

F:\Softwares\Anaconda\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Extracting features & training batches
Training...
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 134, 70)           42481880  
_________________________________________________________________
dropout_1 (Dropout)          (None, 134, 70)           0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               101888    
_________________________________________________________________
dense_1 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0         
_________________________________________________________________
activation_1 (Activation)    (None, 64)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65        
_________________________________________________________________
activation_2 (Activation)    (None, 1)                 0         
=================================================================
Total params: 42,592,089
Trainable params: 42,592,089
Non-trainable params: 0
_________________________________________________________________
None
Train on 360000 samples, validate on 90000 samples
Epoch 1/8
2018-12-08 15:56:04.680836: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

Parts of the code below are commented out, since they were only used to save the preprocessed text data to disk beforehand. Now the code just trains the LSTM model on that saved training and testing data:

import pandas as pd
import Preprocessing as pre
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.utils import shuffle
import pickle
import numpy as np
import sys
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras.layers import LSTM
from keras.preprocessing.sequence import pad_sequences
from keras.models import model_from_json
from keras.preprocessing.text import Tokenizer
import os

# fileDir = os.path.dirname(os.path.realpath('__file__'))
# df = pd.read_csv(os.path.join(fileDir, '../Dataset/tweets.csv'),header=None,encoding = "ISO-8859-1")
# df=shuffle(df)
# length=df.size
#
# train=[]
# test=[]
# Y=[]
# Y2=[]
#
# count=450000
# for a in range(450000):   #loading data
#     b=pre.preprocess_tweet(df[1][a])
#     label=int(df[0][a])
#     train.append(b)
#     Y.append(label)
#     count-=1
#     print("Loading training data...",  count)
#
# with open('training_data(latest).obj', 'wb') as fp:
#     pickle.dump(train, fp)
# with open('training_labels(latest).obj', 'wb') as fp:
#     pickle.dump(Y, fp)
with open ('training_data(latest).obj', 'rb') as fp:
    train = pickle.load(fp)
with open ('training_labels(latest).obj', 'rb') as fp:
    Y = pickle.load(fp)

# count=156884
# for a in range(450000,606884):   #loading testin data
#     b = pre.preprocess_tweet(df[1][a])
#     label=int(df[0][a])
#     test.append(b)
#     Y2.append(label)
#     count-=1
#     print("Loading testing data...",  count)
#
# with open('testing_data(latest).obj', 'wb') as fp:
#     pickle.dump(test, fp)
# with open('testing_labels(latest).obj', 'wb') as fp:
#     pickle.dump(Y2, fp)

with open ('testing_data(latest).obj', 'rb') as fp:
    test = pickle.load(fp)
with open ('testing_labels(latest).obj', 'rb') as fp:
    Y2 = pickle.load(fp)

# vectorizer = CountVectorizer(analyzer = "word",tokenizer = None, preprocessor = None, stop_words = None, max_features = 2000)
# # # fit_transform() does two functions: First, it fits the model
# # # and learns the vocabulary; second, it transforms our training data
# # # into feature vectors. The input to fit_transform should be a list of
# # # strings.
#
# train = vectorizer.fit_transform(train)
# test = vectorizer.transform(test)
tokenizer = Tokenizer(split=' ')
tokenizer.fit_on_texts(train)
train = tokenizer.texts_to_sequences(train)
max_words = 134
train = pad_sequences(train, maxlen=max_words)
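# Reuse the tokenizer fitted on the training texts; refitting it on the test
# set would change the word indices that `train` was already encoded with.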
test = tokenizer.texts_to_sequences(test)
test = pad_sequences(test, maxlen=max_words)

print('Extracting features & training batches')

print("Training...")
embedding_size=32
model = Sequential()
model.add(Embedding(606884, 70, input_length=134))
model.add(Dropout(0.4))
model.add(LSTM(128))
model.add(Dense(64))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
print(model.summary())
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

batch_size = 100
num_epochs = 8

model.fit(train, np.array(Y),  batch_size=batch_size, epochs=num_epochs ,validation_split=0.2,shuffle=True,verbose=2)

# Save the weights
model.save_weights('LSTM_model_weights_updated.h5')

# Save the model architecture
with open('LSTM_model_updated.json', 'w') as f:
    f.write(model.to_json())
# #
# Model reconstruction from JSON file
# with open(os.path.join(fileDir, '../Dataset/LSTM_model.json'), 'r') as f:
#     model = model_from_json(f.read())
#
# # Load weights into the new model
# model.load_weights(os.path.join(fileDir, '../Dataset/LSTM_model_weights.h5'))
# model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

scores = model.evaluate(test, np.array(Y2))
print('Evaluation Test accuracy:', scores[1])


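count=0
errors=0   # renamed from `sum` so the built-in sum() is not shadowed
b=model.predict(test)
for a in b:
    print(count)
    if a<0.5:
        errors = errors + abs(Y2[count] - 0)  # predicted class 0
    else:
        errors = errors + abs(Y2[count] - 1)  # predicted class 1
    count+=1

# len(Y2) replaces the hard-coded test set size of 156884
acc=100-((errors/len(Y2))*100)
print ("Accuracy=",acc,"count",count)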

The log `Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2` simply means that your CPU has the **AVX2** extension for fast vector operations, but this TensorFlow binary can't use it because it wasn't compiled with support for that extension. This does **NOT** prevent your model from training. What's your machine's configuration (e.g. CPU, GPU, RAM)? — Reza Behzadpour, Dec 08 '18 at 12:22
RAM: 6GB, GPU Intel HD(R) Graphics 4400 memory 2176MB, CPU @ 1.70GHz 2.40 GHz 64-bit OS — Mujtaba Faizi, Dec 08 '18 at 12:46
How long did you wait? Using verbose=2 in model.fit means it will only print another message after an epoch has finished, which might take a while. Remove verbose and you should get the default progress bar. — Dr. Snoopy, Dec 08 '18 at 17:16
Removing verbose did it. It is odd though, that worked fine before — Mujtaba Faizi, Dec 09 '18 at 08:38
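
To put a rough number on the wait described in the comments above: the log reports 360000 training samples and the script uses batch_size = 100, so one epoch is 3600 batches, and with verbose=2 nothing is printed until all of them have run. The following is only a sketch of that arithmetic; the function names are made up for illustration, and the 0.5 seconds per batch is a hypothetical placeholder, not a measurement from the asker's machine.

```python
def batches_per_epoch(num_train_samples, batch_size):
    """Number of batches processed in one epoch (ceiling division)."""
    return -(-num_train_samples // batch_size)


def estimated_epoch_seconds(num_train_samples, batch_size, seconds_per_batch):
    """Rough time for one epoch, given an assumed time per batch."""
    return batches_per_epoch(num_train_samples, batch_size) * seconds_per_batch


# Values taken from the question: "Train on 360000 samples", batch_size = 100.
steps = batches_per_epoch(360000, 100)

# ASSUMPTION: 0.5 s per batch is a hypothetical figure for illustration only.
eta = estimated_epoch_seconds(360000, 100, seconds_per_batch=0.5)

print("batches per epoch:", steps)
print("silent time before the first verbose=2 line: about", eta / 60, "minutes")
```

With verbose=1 passed to model.fit, Keras shows a progress bar that updates while those batches run, so it is easy to tell a slow epoch from a hung process.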

Reza Behzadpour · Accepted Answer · 2018-12-08T17:14:45.657

Total params: 42,592,089
Trainable params: 42,592,089
Non-trainable params: 0

Your model has more than 42 million trainable parameters, 42,481,880 of them in the embedding layer alone. That is too much for your machine's configuration (CPU, RAM, etc.) to train in a reasonable time. What are the options?

Use a smaller model
Use a more powerful computer (ideally one with a dedicated GPU)
Consider using an online cloud solution like Crestle or Paperspace
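
To see what "a smaller model" could mean here, the parameter counts in the summary can be reproduced by hand. The sketch below uses the standard formulas for Embedding, LSTM and Dense layers and the layer sizes from the question; the helper names are invented for this example, and the 20000-word vocabulary is a hypothetical value chosen only to show the effect.

```python
def embedding_params(vocab_size, embedding_dim):
    # One embedding vector per vocabulary entry
    return vocab_size * embedding_dim


def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit
    return input_dim * units + units


def total_params(vocab_size, embedding_dim=70, lstm_units=128, dense_units=64):
    # Same layer stack as in the question: Embedding -> LSTM -> Dense -> Dense(1)
    return (embedding_params(vocab_size, embedding_dim)
            + lstm_params(embedding_dim, lstm_units)
            + dense_params(lstm_units, dense_units)
            + dense_params(dense_units, 1))


original = total_params(606884)  # vocabulary size used in the question
smaller = total_params(20000)    # ASSUMPTION: hypothetical 20000-word vocabulary

print("original model:", original)
print("capped vocabulary:", smaller)
```

Nearly all of the parameters sit in the embedding layer, so capping the vocabulary shrinks the model far more than changing the LSTM or Dense sizes would. In Keras this would presumably mean something like `Tokenizer(num_words=20000)` together with `Embedding(20000, 70, input_length=134)`. Also note that 606884 matches the row count in the commented-out loading loop, so it may not be the real vocabulary size; `len(tokenizer.word_index) + 1` would give the actual one.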

edited Dec 08 '18 at 17:14

answered Dec 08 '18 at 15:04
