Regarding complexity, returning or passing a reference is just like passing a pointer. Its overhead is equivalent to passing an integer the size of a pointer, plus a few instructions. In short, that is as fast as possible in nearly every case. Built-in types (e.g. int, float) no larger than a pointer are the obvious exception, since they are at least as cheap to pass by value.
At worst, passing/returning a reference can add a few instructions or disable some optimizations. Those losses rarely exceed the costs of returning/passing objects by value (e.g. calling a copy constructor plus a destructor is much more expensive, even for a very basic object). Passing/returning by reference is a good default unless every instruction counts and you have measured that difference.
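To make the copy cost concrete, here is a minimal sketch that counts copy-constructor calls for the two ways of passing. The type `Tracked`, the counter `g_copies`, and the helper functions are all hypothetical names made up for this illustration. It only counts copies; it does not measure time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical global counter, incremented on every copy construction.
int g_copies = 0;

// Hypothetical type whose copy constructor records that it ran.
struct Tracked {
    std::string payload;

    explicit Tracked(std::string p) : payload(std::move(p)) {}
    Tracked(const Tracked& other) : payload(other.payload) { ++g_copies; }
};

// Takes its argument by value: an lvalue argument is copied into `t`.
std::size_t size_by_value(Tracked t) { return t.payload.size(); }

// Takes its argument by const reference: no Tracked object is created.
std::size_t size_by_ref(const Tracked& t) { return t.payload.size(); }

// Returns how many copies a single by-value call makes.
int copies_made_by_value() {
    Tracked t("a payload long enough to live on the heap in most implementations");
    g_copies = 0;
    size_by_value(t);
    return g_copies;
}

// Returns how many copies a single by-reference call makes.
int copies_made_by_ref() {
    Tracked t("a payload long enough to live on the heap in most implementations");
    g_copies = 0;
    size_by_ref(t);
    return g_copies;
}
```

Here `copies_made_by_value()` reports one copy and `copies_made_by_ref()` reports none. Each copy of `Tracked` also copies the string and later runs a destructor, which is the cost the reference avoids.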
Therefore, using references has incredibly low overhead.
One can't really quantify how much faster it would be without knowing the complexity of your types and their constructors/destructors. If it is not a built-in type, then holding the object somewhere that outlives the call and returning it by reference will be fastest in most cases (a reference to a function-local automatic variable would dangle, so the object must live elsewhere). It all depends on the complexity of the object and its copy, but only incredibly trivial objects could come close to the speed of the reference.
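As a rough sketch of the returning case, assume the object is held as a class member so that it outlives the call. `Holder` and its accessors are hypothetical names for this example only.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that owns a string and exposes it two ways.
class Holder {
public:
    explicit Holder(std::string name) : name_(std::move(name)) {}

    // Returns a reference to the member: no copy, valid while *this lives.
    const std::string& name_ref() const { return name_; }

    // Returns a fresh std::string copied from the member.
    std::string name_copy() const { return name_; }

private:
    std::string name_;
};

// True if two calls to name_ref() refer to the very same object.
bool ref_aliases_member(const Holder& h) {
    const std::string& a = h.name_ref();
    const std::string& b = h.name_ref();
    return &a == &b;
}

// True if name_copy() produced an object distinct from the member.
bool copy_is_distinct(const Holder& h) {
    const std::string c = h.name_copy();
    return &c != &h.name_ref();
}
```

Both helpers return true for any `Holder`: the reference always names the member itself, while the by-value accessor constructs (and later destroys) a separate string. The reference is only safe while the `Holder` is alive.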