4

It's hard to put into words so I will directly jump into a semi-pseudo-code.

I have a download function (http GET), that is being called many many times inside my main code.

std::string download_data(){
    std::shared_ptr<HttpResponse> response = some_http_client->send_request("some_link");
    return std::string(response->body()); // response->body() is a std::string_view.
}

The http_client that I am using, returns a shared_ptr as a response, this response (I excluded the code for HTTP error handling, assume its 200.), contains a response->body(), which is a std::string_view.

This code works fine, however, I want to make sure that the downloaded data isn't copied everytime that this function is called / returned.

My main questions:

  • Is the current code I use, subject to return value optimization ? (Is there anything needs to be done ?)
  • If not, can I just return return response->body(); ? Is a string_view inside a shared_ptr valid after function returns ?

Things I considered, or used in older versions of my code:

  • Returning std::string (with another http client that returned std::string as body).
  • Returning with std::move.
  • Instead of writing a function, just replace all the places this function is called by the body of the function, to directly use the response->body, avoiding a return (I hated it).

What is the correct way of doing this ?

My toolchain:

Ubuntu 20.04 (GLIBC 2.31), g++ 10.2, C++20.

Max Paython
  • 1,595
  • 2
  • 13
  • 25

1 Answers1

7

Your code will use RVO. It returns a temporary of the same type as the function returns, which is one of the cases where RVO is mandatory.

Of course, it will still require one copy of the data, as part of the string constructor that accepts a string_view as an argument.

You cannot just pass the string_view on its own. It is nothing more than a pair of pointers into someone elses data. Based on your code, that is almost certainly data owned by response, which will expire before you can use the string_view that got returned.

You basically have two options. You can either copy the data, or preserve it. Your current code copies it once (thanks to RVO), so it's as ideal as we can get for that case. However, there is another approach. We can return an "aliased" shared pointer to the string view. Make your function return a std::shared_ptr<std::string_view>, and we'll set things up to make that work.

The aliasing constructor of shared_ptr<T> looks like this:

template <typename Y>
shared_ptr(const shared_ptr<Y>& custodian, T* ward)

It creates a shared_ptr which, when dereferenced, points at the ward. However, it "owns" the custodian, which can be of any other type. The custodian will not be destroyed until after this shared pointer is destroyed.

To use this, we have to create a new class which wraps a shared_ptr<HttpResponse> and a body string_view which references data in that response. I'll name it BodyCustodian to make the naming consistent as possible.

struct BodyCustodian
{
    BodyCustodian(const std::shared_ptr<HttpResponse>& response)
    : response(response)
    , body(response->body()
    { }

    std::shared_ptr<HttpResponse> response;
    std::string_view              body;
};

Now, in your code, you'll want to create one of these BodyCustodian objects which holds onto its own response (so that the characters behind the body never expire) and a body which is the actual string_view you want to return. We construct one of these, then use the aliasing shared_ptr constructor to create a pointer to body (an element of the BodyCustodian that is valid as long as the custodian is alive), which "owns" the custodian.

std::shared_ptr<std::string_view>, download_data(){
    std::shared_ptr<BodyCustodian> custodian = std::make_shared<BodyCustodian>(some_http_client->send_request("some_link"));
    return std::shared_ptr<std::string_view>(custodian, &custodian->body);
}

This shared pointer owns the custodian (which keeps the response alive), so that the body string view is still valid.

This approach does require creating a small object (~6 pointers in size) on the heap. This is typically fast, and does not depend on the length of the body (which was the issue you were worried about when copying into a std::string). I use make_shared here to make sure I create one ~6 pointer sized object, rather than allocating ~4 pointers worth of space for a BodyCustodian and then ~2 pointers worth of space for the shared_ptr control block. make_shared is smart enough to do them together.

Cort Ammon
  • 10,221
  • 31
  • 45
  • I don't store the responses, they are processed and discarded immediately. But can I std::move() the body inside a std::string so return that ? Avoiding any copy ? – Max Paython Mar 19 '21 at 18:15
  • 1
    Doing so will not change anything. The `std::move` will turn it into a `string_view&&`, but the `string` constructor that gets picked only uses a `const string_view&`, so it would immediately revert back to copying. However, what we are talking about here is the copying of 2 pointers *not* the copying of the characters. The copying of the characters occurs inside the `string` constructor, not in the argument handling. – Cort Ammon Mar 19 '21 at 18:21
  • 1
    If you are just processing and discarding the data immediately, your use case is basically what the aliasing constructor of `shared_ptr` was written to support. Take advantage of that. – Cort Ammon Mar 19 '21 at 18:22
  • I got the error no matching constructor, return value of the function is `std::shared_ptr`, response is `std::shared_ptr`, response->body() returns `std::string_view`. – Max Paython Mar 19 '21 at 18:47
  • Ahh, minor oversight on my part, and fixed in an edit. The issue was that the aliasing contructor expects the 2nd argument to be a pointer, and I didn't hand it a pointer at all. I changed it to `&response->body()`. Also, I just had to change the `body()` function. Instead of returning a `string_view`, it needs to return a `const string_view&`, refering to a `string_view` that's actually a member of response. – Cort Ammon Mar 19 '21 at 18:51
  • The aliasing constructor's intent is to create a pointer to an element of the original "owned" object, so those changes make sure this pattern fits the original intent. You pass the `shared_ptr` a pointer to an element of the `response`, and the changes I just described make sure that's a reality. – Cort Ammon Mar 19 '21 at 18:52
  • If you don't use `body()` for other things, you could also consider having it return a `const string_view*`, which would make it more clear that it must be referring to an existing `string_view` object, and you won't accidentally construct a new `string_view` on the stack when you least expected it. It wouldn't change how things work, but it would make it more likely that mistakes turn into syntax errors, and that's always helfpul. – Cort Ammon Mar 19 '21 at 18:54
  • Unfortunately, the library I use just returns a string_view, and I can't directly access the body (which is a char*), the library returns a string view like this `return string_view{getBodyData(), getBodyLength()}`. This means I can't use this method ? – Max Paython Mar 19 '21 at 18:54
  • Blast. What that means is that you will have to, at the very least, create one new object on the heap (which isn't too expensive), but it is something I was trying to avoid. Let me edit that in. – Cort Ammon Mar 19 '21 at 18:55
  • Okay, that edit is in place. It sadly calls for allocating an object on the heap, but its small. I also used the names "custodian" and "ward." I borrowed them from boost. I find they do a very good job of describing the relationship -- the custodian is the object which is responsible for the ward being alive, and we start the whole aliasing pattern with the custodian "owned" by a shared_ptr. – Cort Ammon Mar 19 '21 at 19:07
  • Thanks for all your help, I have one more quick question. Wouldn't `std::string_view a = response->body();` then `return std::shared_ptr(response, &a);` work ? – Max Paython Mar 19 '21 at 19:28
  • 1
    @MaxPaython The `string_view a;` is local to this function call, it gets destroyed when the function returns, thus the pointer to it will become invalid (hence you have to allocate it on the heap) – ph3rin Mar 19 '21 at 19:37
  • Meowmere is correct. Another approach would be to pass in a `shared_ptr&` which download_data can point at the response (keeping it alive), and then return `response->body()` (a `string_view`, not any complicated share ptr). The caller just needs to know that the return value is only valid as long as the pointer they pass in remains alive. It's inellegant, though. Easy for a user to forget about these issues and try to use the string_view after destroying the pointer. – Cort Ammon Mar 19 '21 at 21:00
  • I applied your suggestions and it's working, I have a question, would returning `std::pair(response, response->body)` work ? without using a custodian ? Maybe it's counter-intuitive because it's no different than just returnning the response itself ? – Max Paython Mar 20 '21 at 15:27
  • That would work. You end up with a copy of the `string_view`, and the characters behind it are kept alive by `response`. It has the same issue as in my comment above, where you are obliged to know that that body is being kept alive by a `HttpResponse` and you're obliged to keep the `shared_ptr` around until you are done with the body. However, if you can work with those issues, a `std::pair` based solution would certainly work. – Cort Ammon Mar 20 '21 at 21:40
  • The code to generate the aliased `shared_ptr` is dependent of `BodyCustodian`'s implementation details. Wouldn't it make sense to move the code into static member function if it? `static auto convert(shared_ptr& resonse) { return shared_ptr(response, &response->ward); }` – doak Sep 09 '22 at 13:34