I am encountering an issue when trying to delete all blobs under a specific prefix/directory in my Azure Blob Storage. When running the code below, item.isPrefix() returns null.

BlobContainerClient containerClient = blobServiceClient.getBlobContainerClient(containerName);
PagedIterable<BlobItem> blobItems = containerClient.listBlobsByHierarchy(folderKey);

Iterator<BlobItem> iterator = blobItems.iterator();

while (iterator.hasNext()) {
    BlobItem item = iterator.next();
    if (item.isPrefix() == false) {
        BlockBlobClient blobClient = containerClient.getBlobClient(item.getName()).getBlockBlobClient();
        blobClient.delete();
    }
}

However, when I debug this code, isPrefix actually has a value (false) set. This leads me to believe that the state of the BlobItem is not finalised when running normally, and that halting execution in the debugger gives the BlobItem time to be populated fully. I read in the docs that listBlobsByHierarchy returns a "reactive publisher", but I am not sure how to handle that.

How can I safely delete all blobs under a specific directory?

NOTE: If I do not check for isPrefix and delete everything listBlobsByHierarchy returns I get a 404 error when trying to delete the BlobItem that represents a directory.

UPDATE:

Files hierarchy inside the container:

Container/
    - directory1
        - sub-directory1
            - file1
            - file2
    - directory2
        - sub-directory1
            - file3
edoDev

2 Answers

Simple code to delete blobs with a specific prefix:

package testbowman;
import com.azure.storage.blob.*;
import com.azure.storage.blob.models.*;
import java.io.*;
import java.time.Duration;
/**
 * Hello world!
 *
 */
public class App 
{
    public static void main( String[] args )
    {
        // Read the connection string from the environment instead of hard-coding the account key
        // (the variable name AZURE_STORAGE_CONNECTION_STRING is an assumption; use your own).
        String connectStr = System.getenv("AZURE_STORAGE_CONNECTION_STRING");
        
        BlobServiceClient blobServiceClient = new BlobServiceClientBuilder().connectionString(connectStr).buildClient();

        String containerName = "test";
        ListBlobsOptions options = new ListBlobsOptions()
                                    .setPrefix("test/")
                                    .setDetails(new BlobListDetails()
                                    .setRetrieveDeletedBlobs(true)
                                    );

        BlobContainerClient containerClient = blobServiceClient.getBlobContainerClient(containerName);
        containerClient.listBlobsByHierarchy("/", options, Duration.ofSeconds(30L)).forEach(blob -> {
            BlobClient blobClient = containerClient.getBlobClient(blob.getName());
            blobClient.delete();
        });
        System.out.println( "Hello World!" );
    }
}

The code above works on my side; give it a try.

Cindy Pau
Thanks for the code sample. Unfortunately, I have the same issues reported above with your code. I have updated my question to show my container hierarchy. If I try to delete all content by calling listBlobsByHierarchy on directory1, I get "The specified blob does not exist." if I do not check isPrefix(), while if I do check isPrefix() I get a null pointer exception when trying to delete the individual files directly. Both are valid scenarios in my application. – edoDev Nov 12 '20 at 14:27
I was facing the very same problem. To me, it looks like BlobItem.isPrefix() remains null for actual blobs (non-directories) and is set to Boolean.TRUE for directories. This is the debug output when listing my container using the prefix 70000/10000/file-1.pdf/ (which has 3 blobs and 1 directory inside):

XXX bi.getName(): 70000/10000/file-1.pdf/1.0
XXX bi.isPrefix(): null
XXX bi.getName(): 70000/10000/file-1.pdf/1.1
XXX bi.isPrefix(): null
XXX bi.getName(): 70000/10000/file-1.pdf/1.2
XXX bi.isPrefix(): null
XXX bi.getName(): 70000/10000/file-1.pdf/1.3/
XXX bi.isPrefix(): true

Debugging the code (with plenty of time to "finalize" the object being iterated over) did not change a thing for me; isPrefix() was still null for blobs.
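
That would also explain the null pointer exception in the question: comparing a null Boolean with == false unboxes it first. Here is a minimal, SDK-free sketch of the failure and of a null-safe check. The class name PrefixCheck and the helper isBlob are made up for illustration, and the null value is an assumption based on the debug output above.

```java
public class PrefixCheck {
    // Treats an entry as a real blob unless isPrefix is explicitly TRUE,
    // so both null and FALSE count as "not a directory".
    public static boolean isBlob(Boolean isPrefix) {
        return !Boolean.TRUE.equals(isPrefix);
    }

    public static void main(String[] args) {
        // Assumption: this is what isPrefix() returned for blobs in the output above.
        Boolean fromListing = null;
        try {
            // Boolean == boolean forces unboxing, which throws on null.
            if (fromListing == false) {
                System.out.println("not reached");
            }
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from unboxing null");
        }

        System.out.println(isBlob(null));          // blob
        System.out.println(isBlob(Boolean.FALSE)); // blob
        System.out.println(isBlob(Boolean.TRUE));  // virtual directory
    }
}
```

In the question's loop, the equivalent change would be to replace item.isPrefix() == false with !Boolean.TRUE.equals(item.isPrefix()).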

In the end, I iterate over the items using this (more defensive) code:

PagedIterable<BlobItem> blobItems =
        _blobContainerClient.listBlobsByHierarchy(
            PATH_DELIMITER, opts, null);

// fetch all blobs under given directory; only take non-directories; drop the shared prefix

List<String> versions = blobItems.stream()
        .filter(bi -> bi.isPrefix() == null || !bi.isPrefix())
        .map(bi -> bi.getName().substring(blobsPrefix.length()))
        .collect(Collectors.toList());

return versions.toArray(new String[versions.size()]);
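
For the hierarchy in the question, filtering alone may not be enough: a hierarchical listing of directory1 only yields the sub-directory1 prefix, so the files below it are never reached. Below is a rough sketch of descending into prefixes recursively. It runs against an in-memory stand-in rather than the SDK: Item, listByHierarchy and deleteUnder are hypothetical names, and the stand-in only approximates a listing with "/" as the delimiter. With the real client, the listing step would presumably be containerClient.listBlobsByHierarchy(prefix) and the removal containerClient.getBlobClient(name).delete().

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RecursiveDeleteSketch {
    // Hypothetical stand-in for BlobItem: a name plus a nullable isPrefix flag.
    static final class Item {
        final String name;
        final Boolean isPrefix; // null for blobs, TRUE for virtual directories
        Item(String name, Boolean isPrefix) {
            this.name = name;
            this.isPrefix = isPrefix;
        }
    }

    // Hypothetical stand-in for the container: a sorted set of full blob names.
    final Set<String> blobs = new TreeSet<>();

    // Approximates a hierarchical listing with "/" as delimiter:
    // returns only the direct children of the given prefix.
    List<Item> listByHierarchy(String prefix) {
        Set<String> seenDirs = new LinkedHashSet<>();
        List<Item> out = new ArrayList<>();
        for (String name : blobs) {
            if (!name.startsWith(prefix)) {
                continue;
            }
            int slash = name.indexOf('/', prefix.length());
            if (slash < 0) {
                out.add(new Item(name, null)); // a blob directly under the prefix
            } else {
                String dir = name.substring(0, slash + 1);
                if (seenDirs.add(dir)) {
                    out.add(new Item(dir, Boolean.TRUE)); // a virtual directory, listed once
                }
            }
        }
        return out;
    }

    // Deletes every blob under the prefix, recursing into virtual directories
    // instead of trying to delete them. Returns the number of blobs removed.
    int deleteUnder(String prefix) {
        int deleted = 0;
        for (Item item : listByHierarchy(prefix)) {
            if (Boolean.TRUE.equals(item.isPrefix)) {
                deleted += deleteUnder(item.name);
            } else {
                blobs.remove(item.name);
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        RecursiveDeleteSketch container = new RecursiveDeleteSketch();
        container.blobs.add("directory1/sub-directory1/file1");
        container.blobs.add("directory1/sub-directory1/file2");
        container.blobs.add("directory2/sub-directory1/file3");

        System.out.println("deleted: " + container.deleteUnder("directory1/"));
        System.out.println("remaining: " + container.blobs);
    }
}
```

The prefix entries themselves are never deleted, which should avoid the 404 mentioned in the question, since a virtual directory is only a shared name prefix and not a blob.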