Like I said in the question, I figured this would be pretty simple, especially with the requests library, so I developed a script to do it. I started with Riak's keys=true (i.e. non-chunked) mode, but that failed on my larger buckets. I switched to chunked mode (keys=stream), but then the output was no longer a single JSON object; it was a series of concatenated objects (i.e. {...}{...}...{...}).
A colleague provided me with a regex to split the individual JSON objects out of the aggregated Riak response, and I parsed and processed each one in turn (there's a small standalone illustration of that splitting step after the script). Not too bad. Here's the code:
#!/usr/bin/python
# script to delete all keys in a Riak bucket
import json
import re
import requests
import sys

def processChunk(chunk):
    global key_count
    obj = json.loads(chunk.group(2))
    if 'keys' in obj:
        for key in obj['keys']:
            r = requests.delete(sys.argv[1] + '/' + key)
            print 'delete key', key, 'response', r.status_code
            key_count += 1
    # re.sub expects the callback to return a string, so return an empty one
    return ''

if len(sys.argv) != 2:
    print 'Usage: {0} <http://riak_host:8098/riak/bucket_name>'.format(sys.argv[0])
    print 'Set riak_host and bucket_name appropriately for your Riak cluster.'
    exit(0)

# fetch the streamed key listing and accumulate the whole body
r = requests.get(sys.argv[1] + '?keys=stream')
content = ''
key_count = 0
for chunk in r.iter_content():
    if chunk:
        content += chunk

# invoke processChunk once per {...} object in the aggregated response
re.sub(r'(?=(^|})({.*?})(?={|$))', processChunk, content)
print 'Deleted', key_count, 'keys'
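You run it with the bucket URL as the only argument; assuming you saved it as delete_riak_keys.py (the name is arbitrary) and your node is on localhost, that would be:

python delete_riak_keys.py http://localhost:8098/riak/my_bucket

In case the regex looks opaque, here is a minimal standalone sketch of the splitting step it performs. The sample payload and key names below are invented purely for illustration; the real keys=stream body has the same shape, just with your bucket's keys in it.

# standalone sketch of the JSON-splitting step, on an invented payload
import json
import re

sample = '{"keys":["key1","key2"]}{"keys":[]}{"keys":["key3"]}'
for match in re.finditer(r'(?=(^|})({.*?})(?={|$))', sample):
    print json.loads(match.group(2))['keys']
# prints [u'key1', u'key2'], then [], then [u'key3']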
While my problem is largely solved at this point, I suspect there are better solutions out there, and I welcome people to post them here. I won't accept my own answer unless no alternatives turn up after a few weeks.