I installed Koalas according to this document: https://koalas.readthedocs.io/en/latest/getting_started/install.html

System info:

numpy   1.24.3  
koalas  1.8.2 
pyspark 3.4.0 
Python  3.8.10  

I get an error when trying to read a CSV file:

    import databricks.koalas as ks
    import time
    import numpy as np
    df_koalas = ks.read_csv('train.csv')

AttributeError: module 'numpy' has no attribute 'bool'.
`np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
Sam777
  • Please read the error message. It has all of the information you need. – AKX Jun 03 '23 at 14:12
  • Does this answer your question? [How to solve AttributeError: module 'numpy' has no attribute 'bool'?](https://stackoverflow.com/questions/74893742/how-to-solve-attributeerror-module-numpy-has-no-attribute-bool) – Talha Tayyab Jun 03 '23 at 14:12

1 Answer

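Koalas hasn't been maintained as a standalone project for a while, as its functionality was incorporated directly into PySpark as of Spark 3.2.0. It is not compatible with recent NumPy versions: NumPy 1.24 removed the `np.bool` alias that Koalas 1.8.2 still uses. You need to migrate to the pandas API on Spark (`pyspark.pandas`), which is already part of your PySpark 3.4.0 install.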
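As a rough sketch of what that migration could look like for the snippet in the question: the only real change is the import, since `pyspark.pandas` provides its own `read_csv`. The helper name `read_train_csv` is made up for illustration, and this assumes PySpark 3.2 or later is installed and `train.csv` is in the working directory. I have not checked every PySpark/NumPy/pandas version combination, so treat it as a starting point rather than a guaranteed fix.

```python
# Sketch: the CSV read from the question, using the pandas API on Spark
# instead of Koalas. Assumes PySpark >= 3.2 is installed.


def read_train_csv(path="train.csv"):
    """Hypothetical helper: read a CSV into a pandas-on-Spark DataFrame."""
    # Imported inside the function so that merely defining the helper
    # does not import PySpark.
    import pyspark.pandas as ps  # replaces `import databricks.koalas as ks`

    return ps.read_csv(path)
```

Calling `read_train_csv()` should give you a DataFrame you can use much like the Koalas one, for example `read_train_csv().head()`.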

user2357112