
I am trying to programmatically update my Figshare repository using rapiclient. Following the answer to this question, I managed to authenticate and see my repository by:

library(rapiclient)
library(httr)

# figshare repo id
id = 3761562

fs_api <- get_api("https://docs.figshare.com/swagger.json")
header <- c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT")))
fs_api <- list(operations = get_operations(fs_api, header), 
               schemas = get_schemas(fs_api))
reply <- fs_api$operations$article_files(id)

I also managed to delete a file using:

fs_api$operations$private_article_file_delete(article_id = id, file_id = F) 

Now, I would like to upload a new file to the repository. There seem to be two methods I need:

fs_api$operations$private_article_upload_initiate
fs_api$operations$private_article_upload_complete

But I do not understand the documentation. According to fs_api$operations$private_article_upload_initiate help:

> fs_api$operations$private_article_upload_initiate
private_article_upload_initiate
Initiate Upload
Description:
  Initiate new file upload within the article. Either use link to
  provide only an existing file that will not be uploaded on figshare
  or use the other 3 parameters(md5, name, size)

Parameters:
  link (string)
    Url for an existing file that will not be uploaded on figshare
  md5 (string)
    MD5 sum pre computed on the client side
  name (string)
    File name including the extension; can be omitted only for linked
    files.
  size (integer)
    File size in bytes; can be omitted only for linked files.

What does "file that will not be uploaded on Figshare" mean? How would I use the API to upload a local file ~/foo.txt?

fs_api$operations$private_article_upload_initiate(link='~/foo.txt') 

returns HTTP 400.

Otto Kässi

1 Answer


I feel like I sent you down a bad path with my previous answer, because I am not sure how to fill in path parameters for some of the API endpoints when using rapiclient. For example, the endpoint behind fs_api$operations$private_article_upload_initiate() is https://api.figshare.com/v2/account/articles/{article_id}/files, and I am not sure how to substitute a value for {article_id} before sending the request.

You may have to define your own client for operations you cannot get working any other way.
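As a rough sketch of what "your own client" could start from, the `{article_id}` placeholder can be filled in by hand before the request is built. `fill_path()` below is a hypothetical helper written for illustration; it is not part of rapiclient or httr.

```r
# Hypothetical helper: substitute named values into a "{name}" path template.
fill_path <- function(template, ...) {
  params <- list(...)
  for (nm in names(params)) {
    # format() avoids scientific notation for large numeric ids
    value <- format(params[[nm]], scientific = FALSE, trim = TRUE)
    template <- gsub(paste0("{", nm, "}"), value, template, fixed = TRUE)
  }
  template
}

files_url <- fill_path(
  "https://api.figshare.com/v2/account/articles/{article_id}/files",
  article_id = 99999999
)
```

The resulting `files_url` can then be passed as the `url` argument of an httr verb such as `POST()`.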

Here is an example of uploading a file to an existing private article as per the goal of your question.

library(httr)

# id of previously created figshare article
my_article_id <- 99999999

# make example file to upload
my_file <- tempfile("my_file", fileext = ".txt")
writeLines("Hello World!", my_file)

# Step 1 initiate upload
# https://docs.figshare.com/#private_article_upload_initiate
r <- POST(
  url = sprintf("https://api.figshare.com/v2/account/articles/%s/files", my_article_id),
  add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT")))),
  body = list(
    md5  = tools::md5sum(my_file)[[1]], 
    name = basename(my_file), 
    size = file.size(my_file)
  ),
  encode = "json"
)
initiate_upload_response <- content(r)

# Step 2 single file info (get upload url)
# https://docs.figshare.com/#private_article_file
r <- GET(url = initiate_upload_response$location,
         add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT"))))
)
single_file_response <- content(r)

# Step 3 uploader service (get number of upload parts required)
# https://docs.figshare.com/#endpoints
r <- GET(url = single_file_response$upload_url, 
         add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT"))))
)
upload_service_response <- content(r)

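# Step 4 upload parts (this example only has one part)
# https://docs.figshare.com/#endpoints_1
# The part number is appended to the upload URL. Passing `path = 1` to PUT()
# instead would replace the URL's entire path and fail with "Cannot PUT /1".
r <- PUT(url = sprintf("%s/%s", single_file_response$upload_url, 1),
         add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT")))),
         body = upload_file(my_file)
)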
upload_parts_response <- content(r)

# Step 5 complete upload (after all part uploads are successful)
# https://docs.figshare.com/#private_article_upload_complete
r <- POST(
  url = initiate_upload_response$location,
  add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT"))))
)
complete_upload_response <- content(r)
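
If the uploader service in Step 3 asks for more than one part, Step 4 would need to be repeated once per part. The following is only a sketch under assumptions: it assumes each entry of upload_service_response$parts carries `partNo`, `startOffset` and `endOffset` fields (zero-based, inclusive byte offsets), as in Figshare's upload documentation, and `read_part()` and `upload_parts()` are hypothetical helper names, not httr or Figshare functions.

```r
# Read the bytes of one part from a local file.
# Offsets are assumed to be zero-based and inclusive.
read_part <- function(path, start_offset, end_offset) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = start_offset, origin = "start")
  readBin(con, what = "raw", n = end_offset - start_offset + 1)
}

# PUT each part to <upload_url>/<partNo>; stops on the first HTTP error.
upload_parts <- function(upload_url, path, parts,
                         token = Sys.getenv("RFIGSHARE_PAT")) {
  for (part in parts) {
    r <- httr::PUT(
      url = sprintf("%s/%d", upload_url, as.integer(part$partNo)),
      httr::add_headers(Authorization = sprintf("token %s", token)),
      body = read_part(path, part$startOffset, part$endOffset)
    )
    httr::stop_for_status(r)
  }
  invisible(TRUE)
}
```

It would be called between Step 3 and Step 5, for example as upload_parts(single_file_response$upload_url, my_file, upload_service_response$parts).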
the-mad-statter
  • Wow, thanks so much! I will need to test this a bit before accepting, but thank you for now – Otto Kässi Jun 12 '21 at 04:42
  • You have been super helpful so far. Unfortunately I encountered a new issue. In step 4, I get the error `Cannot PUT /1`. Do you have any ideas on how to start debugging that? – Otto Kässi Jun 18 '21 at 11:04
  • Did the upload service say to upload the file in one part (like my example) or multiple parts? – the-mad-statter Jun 18 '21 at 13:03
  • Your exact example (where I substituted my own article_id for `99999999`) gives this error. I have not had the opportunity to figure out the multipart upload yet. – Otto Kässi Jun 18 '21 at 13:17
  • Of course the example works for me, but I have not been able to think of anything likely to have gone wrong for you. Perhaps asking another question might get someone else to think of something. – the-mad-statter Jun 22 '21 at 01:16
  • I could not wrap my head around the Figshare R integration, so I gave up and ended up writing the whole data pipeline in Python based on the examples here https://docs.figshare.com/old_docs/api/upload_example/ – Otto Kässi Jun 22 '21 at 08:27