I'm using Python 3.7 and learning from a piece of published code. On my first attempt the code ran without errors, so the necessary libraries (NumPy, pandas, scikit-learn) should be installed properly. When I changed the source file in order to work on different data, it stopped working. However, the same code and data work on another computer. I updated the system and all installed packages and tried shortening the input file, but nothing changed. The input files are identical on both computers.

This is the error with warnings:

/usr/lib64/python3.7/site-packages/sklearn/externals/six.py:7: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
/usr/lib64/python3.7/site-packages/sklearn/utils/__init__.py:4: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Sequence
/usr/lib64/python3.7/site-packages/sklearn/model_selection/_split.py:18: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Iterable
/usr/lib64/python3.7/site-packages/sklearn/model_selection/_search.py:16: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
  from collections import Mapping, namedtuple, defaultdict, Sequence
/usr/lib64/python3.7/site-packages/sklearn/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
  from numpy.core.umath_tests import inner1d
Traceback (most recent call last):
  File "algorithm2.py", line 26, in <module>
    classifier.fit(X_train, y_train)
  File "/usr/lib64/python3.7/site-packages/sklearn/ensemble/forest.py", line 248, in fit
    y = check_array(y, accept_sparse='csc', ensure_2d=False, dtype=None)
  File "/usr/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 453, in check_array
    _assert_all_finite(array)
  File "/usr/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 44, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

This is the code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

data=open("./data3.csv")

headernames=data.readline()
dataset=pd.read_csv("data3.csv")
dataset.head()

X = dataset.iloc[:, :-2].values
y = dataset.iloc[:, -1].values

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.30)

from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators = 50)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)

from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
result = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(result)
result1 = classification_report(y_test, y_pred)
print("Classification Report:",)
print (result1)
result2 = accuracy_score(y_test,y_pred)
print("Accuracy:", result2)
J.T.

1 Answer

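Are you using the imp module in your own code, for example import imp?
If so, try using import importlib instead. Note that the paths in the warnings point into site-packages/sklearn, so the deprecated imports are most likely inside scikit-learn itself rather than in your script.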

I also see warnings related to collections.
See the answer provided in this SO answer.

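Apart from the warnings, which do not stop the program, the actual error is
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
The traceback shows it being raised while fit was validating y, so take another look at your input file, in particular at the last column, which your code uses as y. It seems to contain missing or non-finite values.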
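One way to look through a long input file is to let pandas report which cells are missing or non-finite. The following is only a minimal sketch: the helper name report_non_finite and the small in-memory sample are made up for illustration and stand in for data3.csv, whose real contents are not shown in the question.

```python
import io

import numpy as np
import pandas as pd


def report_non_finite(df):
    """Return {column name: [row labels]} for cells that would fail
    scikit-learn's finiteness check (NaN or +/-inf), plus missing
    cells in non-numeric columns."""
    problems = {}
    for name in df.columns:
        col = df[name]
        if pd.api.types.is_numeric_dtype(col):
            # NaN and +/-inf are both "not finite"
            bad = ~np.isfinite(col.astype("float64"))
        else:
            bad = col.isna()
        if bad.any():
            problems[name] = bad[bad].index.tolist()
    return problems


# Hypothetical stand-in for data3.csv: one empty feature cell
# and one empty label cell.
sample = io.StringIO("a,b,label\n1.0,2.0,0\n3.0,,1\n5.0,6.0,\n")
dataset = pd.read_csv(sample)

problems = report_non_finite(dataset)
print(problems)
```

With the real file, the call would be report_non_finite(pd.read_csv("data3.csv")). If the last column shows up in the result, those rows are the likely cause of the error in fit. Dropping them with dataset.dropna() before splitting is one option, though note that dropna() removes NaN only, not infinite values.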

Akash Mahapatra
  • I edited the question and added the whole code (this is my first contact with Python). There is no imp in it. About the input file: it is really long, so I'm not sure how to search through it. I cut it down to the first 50 lines and that didn't change anything. Besides, the same file and program work on the second computer. – J.T. Aug 12 '20 at 22:45
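
Since the same file and script behave differently on the two computers, it may help to compare the installed library versions on both. Whether a version difference actually explains the behaviour here is an assumption, as nothing in the thread confirms it. The helper name collect_versions below is made up for illustration.

```python
import importlib
import platform


def collect_versions(module_names=("numpy", "pandas", "sklearn")):
    """Return a dict mapping 'python' and each module name to a version string."""
    versions = {"python": platform.python_version()}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Most scientific packages expose __version__; fall back if not.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = collect_versions()
for name, version in versions.items():
    print(name, version)
```

Running this on both machines and comparing the output line by line shows whether the two environments really match.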