I am working on a speech-to-text analysis project that needs three things: 1) speaker recognition, 2) speaker diarization, and 3) speech-to-text. Right now I am testing the APIs offered by various companies such as Microsoft, Google, AWS, and IBM. In Microsoft's API I found options for user enrollment and speaker recognition (https://cognitivewuppe.portal.azure-api.net/docs/services/563309b6778daf02acc0a508/operations/5645c3271984551c84ec6797). The other platforms, however, appear to offer speaker diarization but not speaker recognition. If I understand correctly, speaker diarization can "distinguish" between speakers, but how could it recognize who they are unless I enroll them first? The only enrollment option I could find is in Azure.
I want to be sure, though, so I am checking here whether I am looking at the right documentation, or whether there is some other way to achieve this in Google Cloud, Watson, and AWS Transcribe. If there is, could you please point me to it?
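
To make the distinction concrete, here is a rough sketch of how I understand the two steps fitting together. It does not use any vendor's API: the embedding vectors, the `identify_speakers` helper, the names, and the 0.8 threshold are all made up for illustration. I am assuming that diarization only yields anonymous labels ("Speaker 1", "Speaker 2") with some voice representation per speaker, and that recognition means comparing those against profiles enrolled beforehand.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speakers(diarized, enrolled, threshold=0.8):
    """Map anonymous diarization labels to enrolled names.

    diarized: {label: embedding} from a diarization step (hypothetical).
    enrolled: {name: embedding} built during enrollment (hypothetical).
    A label maps to None when no enrolled profile is similar enough.
    """
    result = {}
    for label, embedding in diarized.items():
        best_name, best_score = None, threshold
        for name, profile in enrolled.items():
            score = cosine_similarity(embedding, profile)
            if score >= best_score:
                best_name, best_score = name, score
        result[label] = best_name
    return result


# Toy vectors standing in for real voice embeddings (assumption).
diarized = {
    "Speaker 1": [0.9, 0.1, 0.0],
    "Speaker 2": [0.0, 0.2, 0.95],
    "Speaker 3": [0.5, 0.5, 0.5],
}
enrolled = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 0.0, 1.0],
}

names = identify_speakers(diarized, enrolled)
print(names)
```

With these toy values, "Speaker 1" and "Speaker 2" match an enrolled profile, while "Speaker 3" stays unidentified because nobody similar was enrolled. That is exactly the gap I am asking about: without an enrollment step, I only ever get the anonymous labels.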