Here is how to do what you want using the latest version of the SDK (v12).
According to the documentation,
The source blob for a copy operation may be a block blob, an append blob,
or a page blob. If the destination blob already exists, it must be of the
same blob type as the source blob.
Right now, you cannot use start_copy_from_url to specify a blob type. However, you can use the synchronous copy APIs to do so in some cases.
For example, to go from a block blob to a page blob, create the destination page blob first and invoke update_range_from_url on the destination once per 4 MB chunk of the source.
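Here is a minimal sketch of that page blob path. page_client (a BlobClient pointing at the destination), source_url and source_size are placeholders, and I'm assuming source_size is a multiple of 512 since page ranges must be 512-byte aligned; in current v12 releases this per-range copy shows up on BlobClient as upload_pages_from_url:

# page_client, source_url and source_size are assumed to exist already
page_client.create_page_blob(size=source_size)    # the destination must exist and be sized up front

chunk = 4 * 1024 * 1024                           # copy at most 4 MiB per call
for offset in range(0, source_size, chunk):
    length = min(chunk, source_size - offset)     # still 512-aligned because of the assumptions above
    page_client.upload_pages_from_url(
        source_url=source_url,
        offset=offset,              # where to write in the destination
        length=length,
        source_offset=offset,       # read from the matching offset in the source
    )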
Similarly, in your case, create an empty block blob first and then use the stage_block_from_url method.
import os
from azure.storage.blob import ContainerClient

conn_str = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
dest_blob_name = "mynewblob"
source_url = "http://www.gutenberg.org/files/59466/59466-0.txt"

container_client = ContainerClient.from_connection_string(conn_str, "testcontainer")
blob_client = container_client.get_blob_client(dest_blob_name)

# create the destination as an empty block blob
blob_client.upload_blob(b'')
# this only stages the block; the blob's content is unchanged so far
blob_client.stage_block_from_url(block_id='1', source_url=source_url)
# now commit the staged block
blob_client.commit_block_list(['1'])
# if you want to verify it's committed now
committed, uncommitted = blob_client.get_block_list('all')
assert len(committed) == 1
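If you also want to confirm that the destination came out as a block blob (the whole point of avoiding start_copy_from_url here), the blob's properties report its type; this just reuses blob_client from above:

props = blob_client.get_blob_properties()
assert props.blob_type == "BlockBlob"   # BlobType is a string enum, so comparing against the string is fine
print(props.size)                       # committed size in bytes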
Let me know if this doesn't work.
EDIT:
You can leverage the source_offset and source_length params to upload blocks in chunks.
For example,
stage_block_from_url(block_id, source_url, source_offset=0, source_length=10)
will stage the first 10 bytes of the source, i.e. bytes 0 through 9, as one block. So you can use a counter to generate the block_id and keep advancing your offset and length until you have staged all of your chunks.
EDIT2:
block_ids = []
for i in range(number_of_chunks):
    # stage chunk i under its own id; all block ids of one blob must have the same length
    block_id = "{:06d}".format(i)
    blob.stage_block_from_url(block_id=block_id, source_url=source_url,
                              source_offset=..., source_length=...)
    block_ids.append(block_id)
    # do not commit inside the loop
# outside the for loop, commit exactly the ids that were staged
blob.commit_block_list(block_ids)
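Filling in the ellipses, here is a complete version of that loop. The 4 MiB chunk size and the use of requests to discover the source length are my choices, not something the SDK requires, and the source server has to report Content-Length and honor Range requests for the offsets to work; blob_client and source_url are the ones from the first snippet:

import requests

# how big is the source? (assumes the server reports Content-Length)
source_size = int(requests.head(source_url, allow_redirects=True).headers["Content-Length"])

chunk = 4 * 1024 * 1024                        # stage at most 4 MiB per call
block_ids = []
for i, offset in enumerate(range(0, source_size, chunk)):
    block_id = "{:06d}".format(i)              # fixed width so every id has the same length
    blob_client.stage_block_from_url(
        block_id=block_id,
        source_url=source_url,
        source_offset=offset,
        source_length=min(chunk, source_size - offset),
    )
    block_ids.append(block_id)

# one commit at the end, with exactly the ids that were staged
blob_client.commit_block_list(block_ids)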