I computed the sum of a column with dask like this:

from dask import dataframe
all_data = dataframe.read_csv(path) 
total_sum = all_data.account_balance.sum()

The csv file has a column named account_balance.

total_sum is a dd.Scalar object, which seems difficult to convert to an integer. How can I get its integer value? Saving the number to a .txt file would also be fine.

I have also tried total_sum.compute().

Thanks.

1 Answer

.compute() does indeed return a concrete number, as you can see in this example:

In [17]: import pandas as pd

In [18]: import dask.dataframe as dd

In [19]: d = dd.from_pandas(pd.DataFrame({'a': [3,3,3,3]}), npartitions=2)

In [20]: d.a.sum().compute()
Out[20]: 12
mdurant