vkGetInstanceProcAddr returns a function pointer that will always work with any device created from the instance passed in.
However, the functions it returns may include dispatch logic (typically to account for extensions that may or may not be enabled for the device) that can slow down the call. This is why vkGetDeviceProcAddr exists: it returns a function pointer that doesn't have that dispatch logic. You are not obliged to use it, but it may help you get some extra speed.
This is especially noticeable when you have activated several layers. With the device-specific function pointer the final dispatch can be removed.

[Images not shown; they come from the Khronos loader and layer interface document.]
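
To make the extra dispatch step concrete, here is a toy model in C. It is only a sketch of the idea, not the real loader: every name in it (ToyDevice, toy_get_instance_proc, toy_get_device_proc and so on) is made up for illustration, and it assumes the dispatch works by looking up a per-device table on every call.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT Vulkan types. */
typedef struct ToyDevice ToyDevice;
typedef int (*PFN_toyDraw)(ToyDevice *device, int vertex_count);

typedef struct {
    PFN_toyDraw draw;
} ToyDispatchTable;

struct ToyDevice {
    const ToyDispatchTable *dispatch; /* consulted on every trampoline call */
    int draws_submitted;
};

/* The "driver" implementation belonging to one particular device. */
static int driver_draw(ToyDevice *device, int vertex_count)
{
    device->draws_submitted += 1;
    return vertex_count;
}

static const ToyDispatchTable driver_table = { driver_draw };

/* Instance-level entry point: valid for any device, but it has to find
   the right implementation through the device's table each time. */
static int trampoline_draw(ToyDevice *device, int vertex_count)
{
    return device->dispatch->draw(device, vertex_count);
}

/* Plays the role of vkGetInstanceProcAddr: hands out the trampoline. */
static PFN_toyDraw toy_get_instance_proc(void)
{
    return trampoline_draw;
}

/* Plays the role of vkGetDeviceProcAddr: hands out the device's own
   implementation, so calls through it skip the table lookup. */
static PFN_toyDraw toy_get_device_proc(const ToyDevice *device)
{
    assert(device != NULL);
    return device->dispatch->draw;
}

static ToyDevice toy_create_device(void)
{
    ToyDevice device = { &driver_table, 0 };
    return device;
}
```

Both pointers end up running driver_draw, but the one from toy_get_instance_proc goes through trampoline_draw first. With layers enabled there are more hops in between, which is why the difference becomes more visible there.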
If you only use one device, then the order of operations for the application would be:
1. Get vkGetInstanceProcAddr from the platform/loader.
2. Load vkCreateInstance from it, along with the extension and layer queries (using null as the instance parameter).
3. Create the instance (you will use it as the first parameter for loading the other functions).
4. Load vkEnumeratePhysicalDevices and related functions to query devices.
5. Create the device with vkCreateDevice, specifying the extensions you want.
6. Load all the other functions you will need with vkGetDeviceProcAddr, passing the device as the first parameter.