
From the standard, std::includes:

Returns: true if [first2, last2) is empty or if every element in the range [first2, last2) is contained in the range [first1, last1). Returns false otherwise.

Note: as this is under [alg.set.operations], the ranges must be sorted

Taking this literally, if we let R1=[first1, last1) and R2=[first2, last2), this is evaluating:

∀a∈R2 a∈R1

However, this is not what is actually being evaluated. For R1={1} and R2={1,1,1}, std::includes(R1, R2) returns false:

#include <algorithm>
#include <iomanip>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> a({1});
    std::vector<int> b({1,1,1});

    // Outputs 'false'
    std::cout << std::boolalpha
        << std::includes(a.begin(), a.end(), b.begin(), b.end()) << '\n';
}


This is surprising. I verified it with both libstdc++ and libc++, but it seems unlikely to me that this would be a bug in the standard library implementation, considering it's part of the algorithms library. If this isn't the algorithm that std::includes is supposed to run, what is?

Justin
  • This looks like a candidate for a defect report. I interpret *every element in the range [first2, last2) is contained in the range [first1, last1)* to mean you would need `{1,1,1}` to match `{1,1,1}` but the way it is worded I can see why you could expect your case to work. – NathanOliver May 24 '18 at 18:49
  • It's more a check for R2 ⊆ R1 – JHBonarius May 24 '18 at 18:52
  • @Mgetz I understand the parameter order. But by the literal wording in the standard, this (pseudocode) `std::includes({1}, {1, 1, 1})` should return true by my interpretation – Justin May 24 '18 at 18:54
  • @Mgetz but every element of `{1,1,1}` is contained in `{1}` since they are all ones. A literal reading of the standard allows for both interpretations IMHO. – NathanOliver May 24 '18 at 18:54
  • @NathanOliver the operative wording is *every element*, meaning that repeated elements must be represented; it's the difference between sets and lists. The standard is using lists, not sets. – Mgetz May 24 '18 at 18:55
  • @Mgetz It doesn't say something like "every element in R2 must have a corresponding element in R1", but "every element in R2 must be contained in R1" – Justin May 24 '18 at 18:57
  • @Justin I'll agree there is a defect here, the standard should be clear on lists vs. sets. But I do think the implementation is compliant since sorting implies lists, which implies that position and repetition matter. TL;DR: Casey is right and we've spent too much time on this – Mgetz May 24 '18 at 18:58
  • @Mgetz The implementation is compliant if you read the whole section. See my answer. – curiousguy Jun 28 '18 at 04:12
  • not sure why "multiset" was added to tags, this question does not involve `std::multiset` – M.M Jun 28 '18 at 04:15
  • @M.M Then read the description of that tag. – curiousguy Jun 28 '18 at 05:18
  • "The semantics of the set operations are generalized to multisets in a standard way by defining set_union() to contain the maximum number of occurrences of every element, set_intersection() to contain the minimum, and so on." Pretty clear IMHO. All algorithms in the section are described in terms of set operations; for behaviour on multisets one needs the preceding paragraph. – n. m. could be an AI Jun 28 '18 at 05:57
  • Why do people insist that it isn't a multiset issue, **when it is 100% about multisets**? – curiousguy Jun 28 '18 at 06:15
  • @JHBonarius "R2 ⊆ R1": inclusion as defined for multisets – curiousguy Jun 29 '18 at 15:53

3 Answers


I posted this in the cpplang slack, and Casey Carter responded:

The description of the algorithm in the standard is defective. The intent is to determine [if] every element in the needle appears in order in the haystack.

[The algorithm it actually performs is:] "Returns true if the intersection of sorted sequences R1 and R2 is equal to R2"

Or, if we ensure we are certain of the meaning of subsequence:

Returns: true if and only if [first2, last2) is a subsequence of [first1, last1)

link to Casey Carter's message

Community
Justin

I believe you're trying to check whether a includes b. In your example, a doesn't include b, but b does include a. If you swap the arguments, it returns true, since a is included in b.

I hope I'm not missing something obvious.

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> a({1});
    std::vector<int> b({1,1,1});

    // Outputs 'true'
    std::cout << std::boolalpha
        << std::includes(b.begin(), b.end(), a.begin(), a.end()) << '\n';
}

From playing around with the algorithm, my understanding is that includes(R2, R1) checks whether R2 contains R1 as a subgroup: it returns true if so and false otherwise. Also, if a range isn't sorted, I get an error: sequence not ordered.

  • I thought so at first too, but after (external) discussion with Justin, it's pretty clear he really means that "every element in R2" (i.e. 1) is in R1. I can't say it's wrong by a literal reading of the standard, but that is not the intent of the algorithm – KABoissonneault May 24 '18 at 18:47
  • Oh, now I get it... my bad. English isn't a strong language of mine, sometimes I miss the point. – Tuğberk Kaan Duman May 24 '18 at 18:47
  • "Every element from R2" - I understand it as "all three 1s", not just "1". – jrok May 24 '18 at 18:51
  • @TuğberkKaanDuman I think OP does not ask what algorithm actually does, but if what it does is what standard says it should do. – Slava May 24 '18 at 19:02
  • Title is saying `What does std::includes actually do?` I've observed and written down my experience. Not every documentation is perfect, they're written by humans too. It'll get better now I guess. @Slava – Tuğberk Kaan Duman May 24 '18 at 19:03
  • @jrok "all three 1s" But then you cannot differentiate between equal values. This is really a comparison of the count of equal elements. – curiousguy Jul 03 '18 at 17:26

As always when reading the standard, you must read the unwritten words.

Focus on the intent not just the letter. (The wording of the standard was often found to be vague, incomplete, self-referential, or contradictory.)

Read the introduction of section "28.7.6 Set operations on sorted structures [alg.set.operations]" :

This subclause defines all the basic set operations on sorted structures. They also work with multisets containing multiple copies of equivalent elements. The semantics of the set operations are generalized to multisets in a standard way by defining set_union() to contain the maximum number of occurrences of every element, set_intersection() to contain the minimum, and so on.

So it's perfectly clear that the words in the description of includes:

Returns: true if [first2, last2) is empty or if every element in the range [first2, last2) is contained in the range [first1, last1). Returns false otherwise.

must be ignored. You need to know a priori what multiset operations are, what "includes" means for two multisets, ignore the description and rebuild in your head what was the obvious intent.

Multiset inclusion:

A is included in B iff A union B = B.

This is true for sets or multisets.

curiousguy
  • @n.m. who upvotes this? 90% of it is saying "Ignore the standard" in a sarcastic and overly bloated manner. Then it answers the question in the last 3 lines. IMO this should be deleted as it would be confusing to anyone who took it literally – M.M Feb 11 '19 at 01:29
  • @M.M What do you believe the compiler writers do? They do what is correct NOT what the std says. Of course you want to delete it, it doesn't fit your worldview (where std cannot be criticized?). – curiousguy Feb 11 '19 at 10:05