
I'm using an internal library that was designed to mimic a proposed C++ library, and at some point in the past few years its interface changed from using std::string to string_view.

So I dutifully changed my code to conform to the new interface. Unfortunately, what I have to pass in is a std::string parameter and a std::string return value. So my code changed from something like this:

void one_time_setup(const std::string & p1, int p2) {
   api_class api;
   api.setup (p1, special_number_to_string(p2));
}

to

void one_time_setup(const std::string & p1, int p2) {
   api_class api;
   const std::string p2_storage(special_number_to_string(p2));
   api.setup (string_view(&p1[0], p1.size()), string_view(&p2_storage[0], p2_storage.size()));
}

I really don't see what this change bought me as the API client, other than more code (to possibly screw up). The API call is less safe (due to the API no longer owning the storage for its parameters), probably saved my program 0 work (due to move optimizations compilers can do now), and even if it did save work, that would only be a couple of allocations that will not and would never be done after startup or in a big loop somewhere. Not for this API.

However, this approach seems to follow advice I see elsewhere, for example this answer:

As an aside, since C++17 you should avoid passing a const std::string& in favor of a std::string_view:

I find that advice surprising, as it seems to be advocating universally replacing a relatively safe object with a less safe one (basically a glorified pointer and length), primarily for purposes of optimization.

So when should string_view be used, and when should it not?

T.E.D.
  • You should never have to call the `std::string_view` constructor directly. You should just pass the strings to the method taking a `std::string_view` and they will convert automatically. – Mgetz Jan 16 '18 at 20:29
  • @Mgetz - Hmmm. I'm not (yet) using a full-blown C++17 compiler, so perhaps that's most of the issue. Still, the sample code [here](http://en.cppreference.com/w/cpp/string/basic_string_view/basic_string_view) seemed to indicate it's required, at least when declaring one. – T.E.D. Jan 16 '18 at 20:43
  • See my answer: the conversion operator is in the `<string>` header and happens automatically. That code is deceiving and wrong. – Mgetz Jan 16 '18 at 20:46
  • "with a less safe one" how is a slice less safe than a string reference? – CodesInChaos Jan 16 '18 at 20:55
  • @CodesInChaos - Well, my thinking was that worst case with a string reference is generally the string changes out from under you (even if it's const!). Worst case with a pointer is the underlying memory goes away, and your program either bombs, or gets taken over by the North Koreans. I prefer worst case A. :-) – T.E.D. Jan 16 '18 at 21:05
  • That's still a problem if you have data races. Don't use `std::string_view` when you need an owning `std::string`. But within the context of a single call a `std::string_view` is safe, assuming no data races. – Mgetz Jan 16 '18 at 21:06
  • @T.E.D. The caller can just as easily free the string your reference is pointing to as they can free the memory the slice is pointing into. – CodesInChaos Jan 16 '18 at 21:31

3 Answers

  1. Does the functionality taking the value need to take ownership of the string? If so, use std::string (non-const, non-ref). This option also lets the caller explicitly move a value in if they know it won't ever be used again in the calling context.
  2. Does the functionality just read the string? If so, use std::string_view (const, non-ref), because string_view can accept both std::string and char* without making a copy. This should replace all const std::string& parameters.

Ultimately you should never need to call the std::string_view constructor like you are. std::string has a conversion operator that handles the conversion automatically.
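
As a rough sketch of both cases, assume a hypothetical `api_class` whose `setup` only reads its arguments and whose `set_name` keeps a copy. Neither signature comes from the question's library, and `special_number_to_string` here is only a stand-in for the question's helper:

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for the question's helper.
std::string special_number_to_string(int n) { return "#" + std::to_string(n); }

// Hypothetical API; the real library's interface is not shown in the question.
class api_class {
public:
    // Case 2: only reads its arguments, so it takes views by value.
    void setup(std::string_view p1, std::string_view p2) {
        total_length_ = p1.size() + p2.size();
    }

    // Case 1: keeps the string, so it takes a std::string by value and moves it.
    void set_name(std::string name) { name_ = std::move(name); }

    std::size_t total_length() const { return total_length_; }
    const std::string& name() const { return name_; }

private:
    std::size_t total_length_ = 0;
    std::string name_;
};

std::size_t one_time_setup(const std::string& p1, int p2) {
    api_class api;
    // Both arguments convert implicitly. The temporary returned by
    // special_number_to_string lives until the end of this full expression,
    // so the view stays valid for the whole call.
    api.setup(p1, special_number_to_string(p2));
    return api.total_length();
}
```

With this shape the call site looks the same as it did before the interface change. The view would only dangle if `setup` stored it past the call, which is exactly the situation case 1 is meant for.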

Mgetz
  • Just to clarify one point, I'm thinking such a conversion operator would also take care of the worst of the lifetime issues, by making sure your RHS string value stays around for the entire length of the call? – T.E.D. Jan 16 '18 at 21:00
  • @T.E.D. if you're just reading the value then the value will outlast the call. If you're taking ownership then it needs to outlast the call. Hence why I addressed both cases. The conversion operator just deals with making `std::string_view` easier to use. If a developer mis-uses it in an owning situation that's a programming error. `std::string_view` is strictly non-owning. – Mgetz Jan 16 '18 at 21:02
  • Why `const, non-ref` ? The parameter being const is up to the specific use, but in general is reasonable as non-const. And you missed **3. Can accept slices** – v.oddou May 13 '19 at 07:14
  • What's the problem of passing `const std::string_view &` in place of `const std::string &`? – ceztko Dec 16 '19 at 23:01
  • @ceztko it's completely unnecessary and adds an extra indirection when accessing the data. – Mgetz Dec 17 '19 at 13:32
  • Ok, I asked if it was dangerous in some cases: since it's not, I think I will prefer to keep passing by const reference in most cases for better api clarity unless there are performance critical hot paths. – ceztko Dec 17 '19 at 13:47
  • @ceztko you're obviously free to do that, I would generally suggest against it because it obfuscates the lifetime of the `string_view` by creating a temporary. Compilers can easily optimize most of this away, but clear lifetime is at least for me a must since this is a non-owning structure. I want the API to be clear I'm giving this as a borrow only. – Mgetz Dec 17 '19 at 13:50
  • This is a great answer, but I think it needs a third case to recommend what to do when the function doesn't require taking ownership or reading the string but simply passes it on to another function that may do 1 or 2. – Adrian McCarthy Jul 07 '20 at 12:08
  • So your recommendation is to make this interface decision based on an implementation detail of the function (that is, whether the function happens to store a copy of the input string value)? Really? – Don Hatch Jan 18 '21 at 23:40
  • @DonHatch That "implementation-detail" mirrors a major design-point, and is thus quite stable and non-surprising. – Deduplicator Apr 11 '21 at 13:14
  • @Deduplicator It mirrors a major design point in some cases. In many others, it does not; it varies from being a major design point, to a minor one, to an implementation detail (which may vary from implementation to implementation), in which case the question of how to make the interface decision remains unanswered. – Don Hatch May 01 '21 at 00:37
  • @DonHatch - That seems to be a fundamental design problem with string_view. It's effectively a pointer to the string object it's created from, created as a manual developer optimization. So the developer is *forced* to be cognizant of how it's going to be used. – T.E.D. Jan 20 '22 at 22:14

A std::string_view brings some of the benefits of a const char* to C++: unlike std::string, a string_view

  • does not own memory,
  • does not allocate memory,
  • can point into an existing string at some offset, and
  • has one less level of pointer indirection than a std::string&.

This means a string_view can often avoid copies, without having to deal with raw pointers.

In modern code, std::string_view should replace nearly all uses of const std::string& function parameters. This should be a source-compatible change, since std::string declares a conversion operator to std::string_view.

Just because a string view doesn't help in your specific use case where you need to create a string anyway does not mean that it's a bad idea in general. The C++ standard library tends to be optimized for generality rather than for convenience. The “less safe” argument doesn't hold, as it shouldn't be necessary to create the string view yourself.

amon
  • The big drawback of `std::string_view` is the absence of a `c_str()` method, resulting in unnecessary, intermediate `std::string` objects that need to be constructed and allocated. This is especially a problem in low-level APIs. – Matthias Jun 21 '18 at 17:12
  • @Matthias That's a good point, but I don't think it's a huge drawback. A string view allows you to point into an existing string at some offset. That substring cannot be zero-terminated, you need a copy for that. A string view does not prohibit you from making a copy. It allows many string processing tasks that can be performed with iterators. But you are right that APIs that need a C string won't profit from views. A string reference can then be more appropriate. – amon Jun 21 '18 at 17:27
  • @Matthias, doesn't string_view::data() match c_str()? – Aelian Nov 21 '18 at 20:30
  • @Jeevaka a C string has to be zero-terminated, but a string view's data is usually not zero-terminated because it points into an existing string. E.g. if we have a string `abcdef\0` and a string view that points at the `cde` substring, there is no zero character after the `e` – the original string has an `f` there. The [standard](http://eel.is/c++draft/string.view.access) also notes: “data() may return a pointer to a buffer that is not null-terminated. Therefore it is typically a mistake to pass data() to a function that takes just a const charT\* and expects a null-terminated string.” – amon Nov 21 '18 at 21:04
  • @amon, I see, thanks. I didn't think about the sub-string scenario. – Aelian Nov 21 '18 at 21:24
  • @kayleeFrye_onDeck The data already is a char pointer. The problem with C strings is not getting a char pointer, but that a C string must be null-terminated. See my previous comment for an example. – amon Feb 20 '19 at 06:40
  • @amon - I just (finally) got the opportunity to use a full-blown C++17 compiler for writing an API using string_view, and I had to abandon string_view for exactly the reason Matthias mentioned. I actually *love* the idea of moving away from null-terminated strings *in theory*, but the fact that its API doesn't support them at all means string_view is worse than useless if the underlying code interfaces that string with null-terminated string routines or std::string. This includes not only old C char array handling routines and OS APIs, but C++ standard library operations like std::*fstream. – T.E.D. Feb 22 '20 at 02:43

I find that advice surprising, as it seems to be advocating universally replacing a relatively safe object with a less safe one (basically a glorified pointer and length), primarily for purposes of optimization.

I think this is slightly misunderstanding the purpose of this. While it is an "optimization", you should really think of it as unshackling yourself from having to use a std::string.

Users of C++ have created dozens of different string classes. Fixed-length string classes, SSO-optimized classes with the buffer size being a template parameter, string classes that store a hash value used to compare them, etc. Some people even use COW-based strings. If there's one thing C++ programmers love to do, it's write string classes.

And that ignores strings which are created and owned by C libraries. Naked char*s, maybe with a size of some kind.

So if you're writing some library, and you take a const std::string&, the user now has to take whatever string they were using and copy it to a std::string. Maybe dozens of times.

If you want access to std::string's string-specific interface, why should you have to copy the string? That's such a waste.

The principal reasons not to take a string_view as a parameter are:

  1. If your ultimate goal is to pass the string to an interface that takes a NUL-terminated string (fopen, etc). std::string is guaranteed to be NUL terminated; string_view isn't. And it's very easy to substring a view to make it non-NUL-terminated; sub-stringing a std::string will copy the substring out into a NUL-terminated range.

    I wrote a special NUL-terminated string_view style type for exactly this scenario. You can do most operations, but not ones that break its NUL-terminated status (trimming from the end, for example).

  2. Lifetime issues. If you really need to copy that std::string or otherwise have the array of characters outlive the function call, it's best to state this up-front by taking a const std::string &. Or just a std::string as a value parameter. That way, if they already have such a string, you can claim ownership of it immediately, and the caller can move into the string if they don't need to keep a copy of it around.
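
To make the first point concrete, here is a small sketch in which `std::strlen` stands in for any C interface that expects a NUL-terminated string. The helper names `c_length` and `c_length_of` are made up for illustration:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// std::strlen stands in for any C function that expects a NUL-terminated string.
std::size_t c_length(const char* s) { return std::strlen(s); }

// A view carries no terminator guarantee, so copy it into a std::string
// first and hand the C function c_str() instead of data().
std::size_t c_length_of(std::string_view v) {
    const std::string copy(v);
    return c_length(copy.c_str());
}
```

For a view of `cde` inside the string `abcdef`, passing `data()` straight to the C function makes it read on through the `f` up to the owning string's terminator, so it sees four characters instead of three. If the view did not point into a NUL-terminated buffer at all, the read would run past the end of the data. The copy in `c_length_of` is the allocation that a `const std::string&` parameter would have avoided.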

Nicol Bolas
  • Is this true? The only *standard* string class I was aware of in C++ prior to this was std::string. There's some support for using char*'s as "strings" for backward compatibility with C, but I almost never need to use that. Sure, there are lots of user-defined third party classes for almost anything you can imagine, and strings are probably included in that, but I almost never have to use those. – T.E.D. Jan 18 '18 at 15:12
  • @T.E.D.: Just because you "almost never have to use those" doesn't mean that *other people* don't routinely use them. `string_view` is a lingua franca type that can work with anything. – Nicol Bolas Jan 18 '18 at 15:23
  • I can see that point. I was just unaware of C++ itself having "dozens of different string classes". – T.E.D. Jan 18 '18 at 15:27
  • @T.E.D.: That's why I said "C++ as a programming environment", as opposed to "C++ as a language/library." – Nicol Bolas Jan 18 '18 at 15:28
  • What does that even mean then? The superset of all code everyone has written anywhere? So I could equally say "C++ as a programming environment has thousands of container classes"? – T.E.D. Jan 18 '18 at 16:26
  • @T.E.D.: "*So I could equally say "C++ as a programming environment has thousands of container classes"?*" And it does. But I can write algorithms that work with iterators, and any container classes that follow that paradigm will work with them. By contrast, "algorithms" that can take any contiguous array of characters were much harder to write. With `string_view`, it's easy. – Nicol Bolas Jan 18 '18 at 16:57
  • Ironically, it occurs to me I've over the past few years taken to writing "range" classes for my containers that don't own the contained items. So now I don't know if I should argue against this, or that by limiting it to char arrays it doesn't go nearly far enough. Perhaps this whole string_view thing is [Alexandrescu](https://stackoverflow.com/questions/838721/c-iterators-considered-harmful) starting to win? I think I'll just quit while I'm behind... (oh, and +1 for you). – T.E.D. Jan 19 '18 at 19:22
  • @T.E.D.: Character arrays are a very special case. They are exceedingly common, and different containers of contiguous characters differ only in how they manage their memory, not in how you iterate across the data. So having a single lingua franca range type that can cover all such cases without having to employ a template makes sense. Generalization beyond this is the province of the Range TS and templates. – Nicol Bolas Jan 19 '18 at 19:28
  • Another reason to take a `const std::string &` is when that string will just be used as the key to an associative container, like a `std::map`. If you take a `std::string_view` instead, you'll have to make a temporary copy before you can use it as a key. – Adrian McCarthy Jul 08 '20 at 18:26
  • @AdrianMcCarthy: Not since C++14, which allows [heterogeneous lookup for `map::find`](https://en.cppreference.com/w/cpp/container/map/find) (though you do have to ask for it explicitly). `operator[]` and `at` can't do it, because they have to be able to construct a key if it is not in the container. – Nicol Bolas Jul 08 '20 at 20:23
  • @NicolBolas: Cool, but that seems limited to `find`, which I haven't needed very often. `operator[]`, `at`, and `erase` (by key) are all fundamental ways of working with associative containers. – Adrian McCarthy Jul 08 '20 at 20:36
  • @AdrianMcCarthy: They are fundamental ways for *inserting items* into associative containers. Well, not `erase` obviously, but you get my point: they don't merely search for a key in the object. Insertion requires that a key exists. And `erase` can be called with the result of `find`. – Nicol Bolas Jul 08 '20 at 20:38
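
The heterogeneous lookup mentioned in the last few comments can be sketched as follows. The `Registry` alias and the `lookup` function are illustrative names only. The part that matters is the transparent comparator `std::less<>`, which is how a map opts in:

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator; supplying it is what enables
// find() to accept key types other than std::string.
using Registry = std::map<std::string, int, std::less<>>;

// Looks up `key` without building a temporary std::string.
// Returns `fallback` when the key is absent.
int lookup(const Registry& registry, std::string_view key, int fallback) {
    const auto it = registry.find(key);
    return it == registry.end() ? fallback : it->second;
}
```

As the comments note, `operator[]` and `at` still require a `std::string` key, so this only helps on the read-only lookup path.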