I'm using Python 3.7 and learning from a published code example. My first version ran without errors, so the necessary libraries (NumPy, pandas, scikit-learn) should be installed properly. When I changed the source file to work on different data, it stopped working. However, the same code works on another computer. I updated the system and all installed applications and tried shortening the input file, but nothing changed. The input files are identical on both computers.
This is the error with warnings:
/usr/lib64/python3.7/site-packages/sklearn/externals/six.py:7: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
/usr/lib64/python3.7/site-packages/sklearn/utils/__init__.py:4: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
from collections import Sequence
/usr/lib64/python3.7/site-packages/sklearn/model_selection/_split.py:18: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
from collections import Iterable
/usr/lib64/python3.7/site-packages/sklearn/model_selection/_search.py:16: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
from collections import Mapping, namedtuple, defaultdict, Sequence
/usr/lib64/python3.7/site-packages/sklearn/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
from numpy.core.umath_tests import inner1d
Traceback (most recent call last):
File "algorithm2.py", line 26, in <module>
classifier.fit(X_train, y_train)
File "/usr/lib64/python3.7/site-packages/sklearn/ensemble/forest.py", line 248, in fit
y = check_array(y, accept_sparse='csc', ensure_2d=False, dtype=None)
File "/usr/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 453, in check_array
_assert_all_finite(array)
File "/usr/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 44, in _assert_all_finite
" or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
This is the code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Read the header line separately; the with-block closes the file handle afterwards
with open("./data3.csv") as data:
    headernames = data.readline()
dataset=pd.read_csv("data3.csv")
dataset.head()
X = dataset.iloc[:, :-2].values
y = dataset.iloc[:, -1].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.30)
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators = 50)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
result = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(result)
result1 = classification_report(y_test, y_pred)
print("Classification Report:")
print(result1)
result2 = accuracy_score(y_test,y_pred)
print("Accuracy:", result2)
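Since the error complains about NaN or infinite values, a check along these lines could be run on both computers to compare what pandas actually loads. This is only a minimal sketch: the helper name `count_non_finite` and the small `demo` frame are hypothetical stand-ins, and in the real script the frame returned by `pd.read_csv("data3.csv")` would be passed in instead.

```python
import numpy as np
import pandas as pd

def count_non_finite(frame):
    """Count NaN and +/-infinity entries per column, keeping only non-zero counts."""
    # isna() flags missing values; isin() flags positive and negative infinity
    bad = frame.isna() | frame.isin([np.inf, -np.inf])
    counts = bad.sum()
    return counts[counts > 0]

# Hypothetical stand-in for the contents of data3.csv
demo = pd.DataFrame({
    "feature_a": [1.0, 2.0, np.nan, 4.0],
    "feature_b": [0.5, np.inf, 1.5, 2.5],
    "label": [0, 1, 1, np.nan],
})

result = count_non_finite(demo)
print(result)

# Library versions are worth comparing between the two computers as well
print("pandas", pd.__version__, "numpy", np.__version__)
```

If the counts differ between the two machines for the same file, the difference would most likely come from how the file is parsed rather than from the classifier itself.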