No, it is not as easy as a CGI protocol. The main differences are:
- ONVIF is based on SOAP, while many proprietary protocols are based on REST or just parameters encoded in the URL
- the ONVIF device model is more complicated, because it supports a wider set of use cases.
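To make the first difference concrete, here is a minimal sketch that builds both kinds of request using only the Python standard library. The CGI path and its `move`/`speed` parameters are made up for illustration and do not belong to any real vendor. The SOAP side builds an ONVIF `GetServices` request body; authentication and the HTTP transport are left out, so treat it as an outline rather than a working client.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
TDS_NS = "http://www.onvif.org/ver10/device/wsdl"    # ONVIF device management service


def cgi_style_url(host: str) -> str:
    """Hypothetical vendor CGI call: the whole command fits in the query string."""
    query = urlencode({"move": "left", "speed": 5})
    return f"http://{host}/cgi-bin/ptz.cgi?{query}"


def onvif_get_services_envelope() -> bytes:
    """SOAP body for GetServices, which a client would POST to the device service."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{TDS_NS}}}GetServices")
    # Ask only for the service list, not the capabilities of each service.
    ET.SubElement(request, f"{{{TDS_NS}}}IncludeCapability").text = "false"
    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    print(cgi_style_url("192.0.2.10"))
    print(onvif_get_services_envelope().decode("utf-8"))
```

The CGI call is a single URL, while the ONVIF call is a namespaced XML document, and this is only the first of several requests you need before the camera moves.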
Thus, after you either generate the code from the WSDL files or get a library that implements the necessary functions, you have to:
- get the device services
- verify that it has a PTZ service
- verify that it has a Media service, either 1 or 2 (the latter is for Profile T devices)
- get the list of media profiles
- select the media profile that has a PTZNode and that is actually the one you are looking for
- select an adequate coordinate space from the PTZ service capabilities
- send the Move command (ContinuousMove, RelativeMove or AbsoluteMove) with the correct parameters
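The decision logic behind these steps can be sketched in plain Python. This is not a real ONVIF client: it assumes you have already fetched the service list, the media profiles and the supported spaces with your generated code or library, and reduced them to simple sets and dicts. The dict keys (`token`, `video_source`, `ptz_node`), the function names and the choice to prefer Media2 over Media1 are my own assumptions for illustration. Only the namespace URIs come from ONVIF.

```python
PTZ_NS = "http://www.onvif.org/ver20/ptz/wsdl"
MEDIA1_NS = "http://www.onvif.org/ver10/media/wsdl"
MEDIA2_NS = "http://www.onvif.org/ver20/media/wsdl"
VELOCITY_SPACE = "http://www.onvif.org/ver10/tptz/PanTiltSpaces/VelocityGenericSpace"


def pick_media_namespace(services):
    """Return Media2 if the device offers it, else Media1, else None."""
    for namespace in (MEDIA2_NS, MEDIA1_NS):
        if namespace in services:
            return namespace
    return None


def pick_ptz_profile(profiles, wanted_source):
    """First profile that has a PTZ node and shows the video source we want."""
    for profile in profiles:
        if profile.get("ptz_node") and profile.get("video_source") == wanted_source:
            return profile
    return None


def plan_move(services, profiles, spaces_by_node, wanted_source, preferred_spaces):
    """Walk the checklist and return what the Move request needs."""
    if PTZ_NS not in services:
        raise LookupError("device has no PTZ service")
    media_ns = pick_media_namespace(services)
    if media_ns is None:
        raise LookupError("device has no Media service")
    profile = pick_ptz_profile(profiles, wanted_source)
    if profile is None:
        raise LookupError("no PTZ-capable profile for that video source")
    supported = spaces_by_node.get(profile["ptz_node"], [])
    space = next((uri for uri in preferred_spaces if uri in supported), None)
    if space is None:
        raise LookupError("no acceptable coordinate space")
    return {"media_namespace": media_ns,
            "profile_token": profile["token"],
            "space": space}


if __name__ == "__main__":
    # Made-up encoder with one fixed camera and one PTZ camera.
    services = {PTZ_NS, MEDIA1_NS}
    profiles = [
        {"token": "fixed_1", "video_source": "vs1", "ptz_node": None},
        {"token": "ptz_1", "video_source": "vs2", "ptz_node": "node_1"},
    ]
    spaces_by_node = {"node_1": [VELOCITY_SPACE]}
    print(plan_move(services, profiles, spaces_by_node, "vs2", [VELOCITY_SPACE]))
```

The returned profile token and space URI are what you would then put into the Move request. Asking for `vs1` in the example raises an error, because that input has no PTZ node.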
This may seem overly complex, but remember that the ONVIF protocol has to support devices with more than one input, such as multichannel encoders. Such an encoder may have a few fixed cameras connected, plus other cameras whose PTZ is controlled by the encoder. In practice, the steps listed above let you understand what the device you are controlling looks like.