I have a Jupyter Notebook running on AWS SageMaker. One of the cells in the notebook was reading data row by row from a large (~5m rows) datastore.
I ran the cell and then stopped it after confirming that it was reading the data.
The code uses a while loop (sample code from the docs):
import pulsar

client = pulsar.Client('pulsar://localhost:6650')
consumer = client.subscribe('my-topic', 'my-subscription')

while True:
    msg = consumer.receive()
    try:
        print("Received message '{}' id='{}'".format(msg.data(), msg.message_id()))
        # Acknowledge successful processing of the message
        consumer.acknowledge(msg)
    except Exception:
        # Message failed to be processed
        consumer.negative_acknowledge(msg)

client.close()
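Part of the problem, as I understand it, is that this loop prints one line per message, and with ~5m rows every one of those lines gets saved into the .ipynb as cell output. A small helper I'm thinking of using so the loop only emits a progress line every N messages (the name `progress_line` and the 100k interval are my own choices, not from the docs):

```python
def progress_line(count, every=100_000):
    """Return a progress message every `every` received messages, else None.

    Intended to replace the per-message print inside the consumer loop:
    for ~5m rows this yields ~50 output lines instead of ~5 million.
    """
    if count % every == 0:
        return f"processed {count:,} messages"
    return None

# Inside the while loop, after consumer.receive():
#   received += 1
#   line = progress_line(received)
#   if line:
#       print(line)
```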
I am now unable to open the notebook, despite having enough memory (32 GB), so I cannot clear the output from the notebook's memory / disk / kernel. The notebook size is now >350 MB, up from a few kBs before. How do I clear the output / reclaim the disk space, and how can I optimize my code for better performance?
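Since the notebook won't open in Jupyter, I'm considering editing the .ipynb JSON directly to drop the stored outputs. A minimal sketch, assuming the standard nbformat v4 layout (a top-level `"cells"` list where code cells carry `"outputs"` and `"execution_count"`); the filename is hypothetical:

```python
import json

def clear_outputs(path):
    """Strip stored cell outputs from an .ipynb file in place.

    Assumes the nbformat v4 JSON layout: a top-level "cells" list
    where each code cell has "outputs" and "execution_count" fields.
    """
    with open(path) as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []          # drop the captured output
            cell["execution_count"] = None
    with open(path, "w") as f:
        json.dump(nb, f, indent=1)

# clear_outputs("my_notebook.ipynb")  # hypothetical filename
```

Would this be safe, or is there a supported way to do the same thing from the SageMaker terminal?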
free -h
              total        used        free      shared  buff/cache   available
Mem:           7.7G        1.0G        4.6G        688K        2.0G        6.4G
Swap:            0B          0B          0B
Pulsar Python client docs: https://pulsar.apache.org/docs/2.2.1/client-libraries-python/