I'm pretty new to the Google Genomics APIs. I'm trying to create an annotation, and I have tried both the web version and a Python API call:

service.annotations().create(body={'annotationSetId': '101', 'name': 'TestAnnotation', 'referenceName': 'chrM', 'start': '1', 'end': '1'}, fields='id').execute()

Here is a sample annotation:

{
  "annotationSetId": "101",
  "name": "TestAnnotation",
  "referenceName": "chrM",
  "start": "1",
  "end": "1"
}

I get the following error for both cases:

500 Internal Server Error
{
 "error": {
  "code": 500,
  "message": "Unknown Error.",
  "status": "UNKNOWN"
 }
} 

Any suggestions?

One more observation.

We can add a variant set by submitting only datasetId and name, with no need to specify a referenceSetId, but we cannot create an annotation set without a referenceSetId. Why?

400 HTTP/2.0 400
{
 "error": {
  "code": 400,
  "message": "Invalid value for field \"annotationSet.referenceSetId\": empty or not specified",
  "status": "INVALID_ARGUMENT"
 }
}

BTW, how can I set the WRITE permission for the caller?

Caller must have WRITE permission for the associated annotation set.

Thank you in advance!

AmirCS

6 Answers

To associate an annotation set with a dataset, you need write permission on that dataset. If you created the dataset, your account already has write permission. If it is a public dataset, you might need to ask the person who loaded it to add you with write permission, or you could reload it under your account.

Now assuming you created a dataset, then you can create an AnnotationSet via curl directly - you will need to use your API key from the console (please don't post your API key publicly here). Below is the command and what you would fill in:

curl -v -X POST -H "Content-Type: application/json" -d '{"datasetId":"YourActualDatasetID", "referenceSetId":"YourActualReferencesetID"}' "https://genomics.googleapis.com/v1/annotationsets?fields=asdf&key=YOUR_API_KEY"

Let me know if this worked for you, and if there is anything else that I can help you with.

Thanks,

Paul

Paul Grosu
  • Thanks Paul. Based on your answer, the problem is authentication, and it is not about referenceId. I'm the owner and created the dataset, so I should have all the access permissions, but I followed @Melissa's response and also added myself as admin. I created an API key and submitted my request. I'm getting "Code: 401 -> UNAUTHENTICATED." – AmirCS Jun 09 '16 at 23:31
  • Can you confirm by adding an annotationSet using the web method? { "datasetId": "YOUR_DATASET_ID", "referenceSetId": "1213", "name": "Ann1" } Does that work for you? I get "code": 404, "message": "couldn't associate with referenceSetId \"1213\": Reference set \"1213\" not found". – AmirCS Jun 09 '16 at 23:31
  • The association error is most likely because 1213 is not a valid ReferenceSetID at the backend. For example, EJjur6DxjIa6KQ is a valid ID. You can find them via the API Explorer at https://developers.google.com/apis-explorer/#p/genomics/v1/genomics.referencesets.search?_h=1&resource=%257B%250A%257D& – Paul Grosu Jun 10 '16 at 05:49
  • Thanks Paul! Correct! ReferenceSetID was the problem. I didn't know how to generate ReferenceID. Do you have any documents about ReferenceIDs? – AmirCS Jun 10 '16 at 16:05

To add to Paul's answer:

annotationSetId must be the ID of an existing annotation set. We will work on improving the error message.
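
As a rough illustration of that point, the sketch below builds the annotation body from the response of an earlier annotation set creation instead of a hand-typed ID such as '101'. It assumes the creation response is JSON with a server-assigned "id" field; build_annotation_body and "EXAMPLE_SET_ID" are hypothetical names used only for this example.

```python
def build_annotation_body(annotation_set, name, reference_name, start, end):
    """Build an annotation body tied to an annotation set that already exists.

    `annotation_set` is assumed to be the parsed JSON returned when the
    annotation set was created, carrying the server-assigned "id".
    """
    set_id = annotation_set.get("id")
    if not set_id:
        raise ValueError("annotation set has no server-assigned id")
    return {
        "annotationSetId": set_id,
        "name": name,
        "referenceName": reference_name,
        # The question sends positions as strings, so this sketch does too.
        "start": str(start),
        "end": str(end),
    }


# "EXAMPLE_SET_ID" is a made-up placeholder, not a real annotation set ID.
body = build_annotation_body({"id": "EXAMPLE_SET_ID"}, "TestAnnotation", "chrM", 1, 1)
print(body["annotationSetId"])
```

The resulting dictionary could then be passed as the body of the annotations create call shown in the question.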

We would like to require referenceId for all our APIs. We don't for our Variant API because the Reference API didn't exist when we created the Variant API.

To give a user WRITE permission, add the user as a Project Editor. See https://cloud.google.com/iam/docs/quickstart-roles-members#add_a_project_member_and_grant_them_an_iam_role

Melissa
  • Adding to my comment: when I try to add { "datasetId": "MYDATASET ID", "name": "VSet5", "referenceSetId": "VSet50121" }, I'm getting { "error": { "code": 404, "message": "Requested entity was not found.", "status": "NOT_FOUND" } }, but if I remove referenceSetId, it executes correctly. – AmirCS Jun 09 '16 at 21:51
  • Correction: the ReferenceID mentioned doesn't exist, so that's why I got the error. Thanks to Paul! – AmirCS Jun 10 '16 at 19:18

My previous comment didn't get formatted properly, so I'm writing it as an answer instead. For this specific test I would need to enable billing for my account, so my guide is the raw information in the Genomics REST API via the Discovery service:

https://www.googleapis.com/discovery/v1/apis/genomics/v1/rest

Based on the REST API, the scopes for creating an AnnotationSet are the following:

"https://www.googleapis.com/auth/cloud-platform", "https://www.googleapis.com/auth/genomics"

Since you are getting an authentication error, it would be good to first check in the console (https://console.cloud.google.com) whether the project tied to the API (server) key you used has the Genomics and Cloud APIs enabled.
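
A small sketch of the scope check, in case it helps when debugging a token: it tests whether a space-separated OAuth scope string contains one of the two scopes listed above. It assumes that either scope alone is sufficient; has_required_scope is a hypothetical helper, not part of any Google library.

```python
# The two scopes quoted above from the Discovery document.
REQUIRED_SCOPES = {
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/genomics",
}


def has_required_scope(granted_scopes):
    """Return True if a space-separated scope string includes either scope.

    Assumption: any one of REQUIRED_SCOPES is enough to create an
    annotation set.
    """
    return bool(REQUIRED_SCOPES & set(granted_scopes.split()))


print(has_required_scope("openid https://www.googleapis.com/auth/genomics"))
print(has_required_scope("openid email"))
```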

~p

Paul Grosu

Glad to hear you got everything to work Amir! It was a fun team effort by the three of us, and I'm always happy to help out as I've used and seen the evolution of the API over the past two years :)

Regarding ReferenceIds, I see you already found some of the same links I am posting here. A reference ID is basically the ID that points to a reference, which is a sequence such as a chromosome. A collection of reference IDs belongs to a ReferenceSet, which represents a reference assembly, and references.bases belongs to a single reference ID. I have not seen a way in the REST API to create or load a new reference genome; those are probably populated and made available by Google manually via the backend. Melissa might have more information about that.

Below is a collection of links that may be helpful regarding References. You also discovered some of them, and I am listing them together in case others find them useful in the future:

http://googlegenomics.readthedocs.io/en/latest/use_cases/discover_public_data/reference_genomes.html

https://cloud.google.com/genomics/v1/users-guide#references

https://cloud.google.com/genomics/v1/reference-sets#finding-references

https://cloud.google.com/genomics/reference/rest/v1/referencesets

https://cloud.google.com/genomics/reference/rest/v1/references

https://cloud.google.com/genomics/reference/rest/v1/references.bases

Each of the REST APIs above has its own specific methods for searching and associating data.
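
As a sketch of what a reference set lookup might look like from Python, the code below prepares a search request and pulls the IDs out of a response. Two things are assumptions here: the URL path, inferred from the genomics.referencesets.search method name, and the response shape, a "referenceSets" list whose items carry an "id". The function names are hypothetical, and nothing is sent over the network.

```python
import json
import urllib.request

# Assumed path, inferred from the genomics.referencesets.search method name.
SEARCH_URL = "https://genomics.googleapis.com/v1/referencesets/search"


def build_reference_set_search(token):
    """Prepare (but do not send) a search for all reference sets."""
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps({}).encode("utf-8"),  # empty filter, as in the API Explorer link
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


def reference_set_ids(response):
    """Collect IDs from a parsed response; the "referenceSets" key is assumed."""
    return [item["id"] for item in response.get("referenceSets", [])]


request = build_reference_set_search("TOKEN_PLACEHOLDER")
print(request.get_method(), request.full_url)
# The ID below is the valid example quoted earlier in this thread.
print(reference_set_ids({"referenceSets": [{"id": "EJjur6DxjIa6KQ"}]}))
```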

Hope it helps,

~p

Paul Grosu

To use the REST API for annotation:

gcloud auth login
TOKEN=$(gcloud auth print-access-token)
curl -v -X POST -H "Authorization: Bearer $TOKEN" -d '{"datasetId": "YOUR_DATA_SET" ,  "referenceSetId": "EMWV_ZfLxrDY-wE" }'  --header "Content-Type: application/json" https://genomics.googleapis.com/v1/annotationsets 
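
For anyone who prefers Python, here is a minimal sketch of the same request using only the standard library. The token is assumed to come from gcloud auth print-access-token as above; build_annotation_set_request and "TOKEN_PLACEHOLDER" are names made up for this example, and the actual send is left commented out.

```python
import json
import urllib.request

ANNOTATION_SETS_URL = "https://genomics.googleapis.com/v1/annotationsets"


def build_annotation_set_request(token, dataset_id, reference_set_id):
    """Prepare (but do not send) the POST that the curl command performs."""
    payload = json.dumps(
        {"datasetId": dataset_id, "referenceSetId": reference_set_id}
    ).encode("utf-8")
    return urllib.request.Request(
        ANNOTATION_SETS_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


# "TOKEN_PLACEHOLDER" stands in for the output of gcloud auth print-access-token.
request = build_annotation_set_request(
    "TOKEN_PLACEHOLDER", "YOUR_DATA_SET", "EMWV_ZfLxrDY-wE"
)
print(request.get_method(), request.full_url)

# To actually send it, with a real token and dataset ID:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```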
AmirCS