I am trying to use dask.bag to hold objects of a given class, where each instance captures various properties of a document (title, word count, etc.).
This object has some associated methods that set different attributes of the object.
For example:
import dask.bag as db

class Item:
    def __init__(self, value):
        self.value = 'My value is: "{}"'.format(value)

    def modify(self):
        self.value = 'My value used to be: "{}"'.format(self.value)

def generateItems():
    i = 1
    while i <= 100:
        yield Item(i)
        i += 1

b = db.from_sequence(generateItems())

# looks like:
b.take(1)[0].value  # 'My value is: "1"'
How do I create a new bag that holds the modified version of each instance in the first bag (b)?
Desired output: 'My value used to be: "My value is: "1""'
etc.
I tried:
c = b.map(lambda x: x.modify())
c.take(1)[0].value
# AttributeError: 'NoneType' object has no attribute 'value'
I also tried:
d = b.map(lambda x: x[0].modify())
d.take(1)  # TypeError: 'Item' object does not support indexing
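
My guess at why the first attempt fails: modify() changes the instance in place and implicitly returns None, so the lambda maps every element to None. Below is a minimal sketch of a possible workaround, using Python's built-in map as a stand-in for the bag so it runs without dask. The name modify_and_return is a hypothetical helper, not part of dask.

```python
class Item:
    def __init__(self, value):
        self.value = 'My value is: "{}"'.format(value)

    def modify(self):
        # Mutates the instance in place; there is no return statement,
        # so calling modify() evaluates to None.
        self.value = 'My value used to be: "{}"'.format(self.value)


def modify_and_return(item):
    # Hypothetical helper: apply modify(), then hand the instance back
    # so the mapped collection contains Item objects rather than None.
    item.modify()
    return item


# Built-in map used here as a stand-in for the bag's map.
items = [Item(i) for i in range(1, 4)]
modified = list(map(modify_and_return, items))
first_value = modified[0].value
print(first_value)
```

Presumably the same helper could be passed as b.map(modify_and_return), since Bag.map applies the given function to each element, but that part is an assumption I have not verified here.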