Short Version
What does the SHCIDS_ALLFIELDS flag of IShellFolder.CompareIDs mean?
Long Version
In Windows 95, Microsoft introduced the shell. Rather than assuming the computer is made up of files and folders, the shell treats it as an abstract namespace of items.
- rather than paths starting at the root of a drive (e.g. C:\Documents & Settings\Ian)
- paths start at the root of the namespace (Desktop)
And in order to accommodate things that are not files and folders (e.g. network printers, Control Panel, my Android phone):
- you don't use a series of names separated by backslashes (e.g. C: \ Users \ Ian )
- you use pidls, a series of opaque blobs (e.g. Desktop This PC OS (C:) Users Ian)
PIDLs are opaque blobs; each blob only makes sense to the folder that generated it.
In order to extend (or use) the shell namespace, you implement (or call) an IShellFolder interface.
One of the methods of IShellFolder is used to ask a namespace extension to compare two ID Lists (PIDLs):
IShellFolder::CompareIDs method
Determines the relative order of two file objects or folders, given their item identifier lists.
HRESULT CompareIDs( [in] LPARAM lParam, [in] PCUIDLIST_RELATIVE pidl1, [in] PCUIDLIST_RELATIVE pidl2 );
For many years, the LPARAM was documented to nearly always be 0. From shlobj.h c. 1999:
// IShellFolder::CompareIDs(lParam, pidl1, pidl2)
// This function compares two IDLists and returns the result. The shell
// explorer always passes 0 as lParam, which indicates "sort by name".
// It should return 0 (as CODE of the scode), if two id indicates the
// same object; negative value if pidl1 should be placed before pidl2;
// positive value if pidl2 should be placed before pidl1.
And so you compared two ID Lists - whatever it meant to compare them, and we were done.
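The "0 as CODE of the scode" wording in that header comment means the comparison result travels in the low 16 bits of a success HRESULT. Here is a minimal sketch of that encoding, using portable stand-ins for the Windows types so it compiles anywhere. `ResultFromShort` mirrors the helper the SDK sample calls; `ComparisonResult` is a hypothetical name for what callers typically write as `(short)HRESULT_CODE(hr)`.

```cpp
#include <cstdint>

// Portable stand-in for the Windows HRESULT type (an assumption for this sketch).
using HRESULT = std::int32_t;

// Mirrors the ResultFromShort helper used by the SDK sample: severity SUCCESS,
// facility 0, and the signed comparison result stored in the low 16 bits.
constexpr HRESULT ResultFromShort(int i)
{
    return static_cast<HRESULT>(static_cast<std::uint16_t>(i));
}

// Hypothetical helper: recovers the signed comparison result from the
// code portion (low 16 bits) of the HRESULT.
constexpr int ComparisonResult(HRESULT hr)
{
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(hr & 0xFFFF));
}
```

Note that a "negative" result is still a successful HRESULT: the sign lives only in the 16-bit code, so callers have to cast the code back to a short before testing it.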
Windows 2000 added additional sorting option flags
Starting with Version 5 of the shell, the upper 16 bits of the LPARAM can now contain additional flags to control how the IShellFolder should handle sorting.
From ShObjIdl.idl c. the Windows 8.1 SDK:
// IShellFolder::CompareIDs lParam flags
// *these should only be used if the folder supports IShellFolder2*
//
// SHCIDS_ALLFIELDS
//
// only be used in conjunction with SHCIDS_CANONCALONLY or column 0.
// This flag requests that the folder test for *pidl identity*, that is
// "are these pidls logically the same". This implies that cached fields
// in the pidl that would distinguish them should be tested.
// Without this flag, you are comparing the *object* s the pidls refer to.
//
// SHCIDS_CANONICALONLY
//
// This indicates that the sort should be *the most efficient sort possible*, the implication
// being that the result will not be displayed to the UI: the SHCIDS_COLUMNMASK portion
// of the lParam can be ignored. (Before we had SHCIDS_CANONICALONLY
// we assumed column 0 was the "efficient" sort column.)
Note the important points here:
- SHCIDS_CANONICALONLY is meant to be the fastest, most efficient sort we have
- and it doesn't have to be logical from a UI usability perspective; it just has to be consistent
As Raymond Chen pointed out, it's the moral equivalent of a Unicode ordinal comparison.
The header file even notes that we used to just assume column 0 was the "fastest" sort. But now we will use a flag to say "use the fastest sort available":
Before we had SHCIDS_CANONICALONLY we assumed column 0 was the "efficient" sort column.
It also notes that you can ignore the lower 16-bits of LPARAM (i.e. the column), because we don't care - we're using the most efficient one.
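To make the layout concrete, here is a small sketch of how an implementation might split the LPARAM into its flag bits and column index. The constant values are my recollection of the definitions in the SDK headers, reproduced locally so the snippet compiles without them; `CompareRequest` and `DecodeCompareParam` are hypothetical names invented for illustration.

```cpp
#include <cstdint>

// Portable stand-in for the Windows LPARAM type (an assumption for this sketch).
using LPARAM = std::intptr_t;

// Local copies of the SDK constants (values as recalled from the headers).
constexpr std::uint32_t SHCIDS_ALLFIELDS     = 0x80000000u;
constexpr std::uint32_t SHCIDS_CANONICALONLY = 0x10000000u;
constexpr std::uint32_t SHCIDS_BITMASK       = 0xFFFF0000u; // upper 16 bits: flags
constexpr std::uint32_t SHCIDS_COLUMNMASK    = 0x0000FFFFu; // lower 16 bits: column

// Hypothetical decoded view of the lParam.
struct CompareRequest {
    bool canonicalOnly;   // caller wants the cheapest consistent ordering
    bool allFields;       // caller wants cached fields compared as well
    std::uint32_t column; // sort column; ignorable when canonicalOnly is set
};

// Hypothetical helper: splits the lParam into flags and column index.
constexpr CompareRequest DecodeCompareParam(LPARAM lParam)
{
    // Truncate to 32 bits first so a sign-extended flag value decodes the same way.
    const std::uint32_t bits = static_cast<std::uint32_t>(lParam);
    return CompareRequest{
        (bits & SHCIDS_CANONICALONLY) != 0,
        (bits & SHCIDS_ALLFIELDS) != 0,
        bits & SHCIDS_COLUMNMASK
    };
}
```

A legacy caller passing 0 decodes to "no flags, column 0", which matches the old "sort by name" behaviour described in the 1999 header.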
A lot of this is mirrored in the official documentation:
SHCIDS_CANONICALONLY
Version 5.0. When comparing by name, compare the system names but not the display names. When this flag is passed, the two items are compared by whatever criteria the Shell folder determines are most efficient, as long as it implements a consistent sort function. This flag is useful when comparing for equality or when the results of the sort are not displayed to the user. This flag cannot be combined with other flags.
But with SHCIDS_ALLFIELDS we start to run off the rails
The header file notes that AllFields can only be combined with CanonicalOnly:
only be used in conjunction with SHCIDS_CANONCALONLY or column 0.
But the SDK says that CanonicalOnly must appear alone:
This flag cannot be combined with other flags.
So which is it?
We could decide that the header file is wrong, that the SDK is canon, and do what it says.
But what is AllFields saying?
There is some concept that AllFields is trying to ask for, but is obscured behind the documentation.
Compare all the information contained in the ITEMIDLIST structure, not just the display names.
ItemIDList doesn't contain a display name, it contains an ItemIDList. Are they trying to say i should only look at the contents of the pidl blob?
- For instance, if the two items are files, the folder should compare their names, sizes, file times, attributes, and any other information in the structures.
In what situation could two references to the *same* file have different names, sizes, file times, attributes, etc?
The SDK examples do something different
The Windows SDK Explorer Data Provider Shell Extension sample (github) seems to act as though the CanonicalOnly and AllFields flags would appear together:
HRESULT CFolderViewImplFolder::CompareIDs(LPARAM lParam, PCUIDLIST_RELATIVE pidl1, PCUIDLIST_RELATIVE pidl2)
{
    HRESULT hr;
    if (lParam & (SHCIDS_CANONICALONLY | SHCIDS_ALLFIELDS))
    {
        // First do a "canonical" comparison, meaning that we compare with the intent to determine item
        // identity as quickly as possible. The sort order is arbitrary but it must be consistent.
        //...(declarations of psz1/psz2, error checks, and cleanup omitted from this excerpt)
        _GetName(pidl1, &psz1);
        _GetName(pidl2, &psz2);
        hr = ResultFromShort(StrCmp(psz1, psz2));

        // If we've been asked to do an all-fields comparison, test for any other fields that
        // may be different in an item that shares the same identity. For example if the item
        // represents a file, the identity may be just the filename but the other fields contained
        // in the idlist may be file size and file modified date, and those may change over time.
        // In our example let's say that "level" is the data that could be different on the same item.
        if ((ResultFromShort(0) == hr) && (lParam & SHCIDS_ALLFIELDS))
        {
            //...
        }
    }
    else
    {
        //...Compares by the column number in LOWORD of LPARAM
    }
    return hr;
}
So we have completely conflicting documentation, headers, and samples:
SHCIDS_ALLFIELDS
- SDK: can never appear with SHCIDS_CANONICALONLY
- Headers: can appear anytime
- Examples: can only appear with SHCIDS_CANONICALONLY
What is it trying to ask
Windows always assumed that column 0 was the fast column. This may have been because Windows shell API authors assumed that a PIDL's ItemID would always contain the name inside the pidl opaque blob.
This is reinforced by the fact that the shell STRRET structure lets you point to a string inside your pidl.
Bonus Reading: The kooky STRRET structure
And so at some point they added an express flag that says:
- we don't care about localization, locale-specific linguistic sorting rules, or Unicode normalization algorithms
- just sort them because we need to find duplicates and check for equality
And that makes sense for the canonical flag:
- just tell me if two IDLists point to the same object
But then what does the SDK example mean when they talk about the All Fields option:
If we've been asked to do an all-fields comparison, test for any other fields that may be different in an item that shares the same identity. For example:
- if the item represents a file, the identity may be just the filename
- but the other fields contained in the idlist may be file size and file modified date, and those may change over time.
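One way to read that comment is sketched below, under an invented item layout. Real pidls are opaque and folder-specific, so `ItemData` and its fields are purely hypothetical, and this shows only what the sample's comment appears to describe, not what the shell is confirmed to expect.

```cpp
#include <cstdint>

// Hypothetical payload of a single item ID; the field names are invented.
struct ItemData {
    std::uint32_t objectId;        // identity: which object this pidl names
    std::uint64_t cachedSize;      // metadata snapshot taken at enumeration time
    std::uint64_t cachedModified;  // metadata snapshot taken at enumeration time
};

// Three-way compare returning -1, 0, or 1.
template <typename T>
constexpr int ThreeWay(T a, T b)
{
    return (a < b) ? -1 : (b < a) ? 1 : 0;
}

// "Canonical" comparison: identity only, in an arbitrary but consistent order.
constexpr int CompareCanonical(const ItemData& a, const ItemData& b)
{
    return ThreeWay(a.objectId, b.objectId);
}

// "All fields" comparison: identity first, then the cached fields, so two
// snapshots of the same object taken at different times compare as different.
constexpr int CompareAllFields(const ItemData& a, const ItemData& b)
{
    int result = CompareCanonical(a, b);
    if (result == 0) result = ThreeWay(a.cachedSize, b.cachedSize);
    if (result == 0) result = ThreeWay(a.cachedModified, b.cachedModified);
    return result;
}
```

Under this reading, two pidls captured before and after a file grew would be equal canonically but unequal on all fields. Whether that is what the shell actually intends is exactly the open question.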
If two PIDLs represent the same file what is the point in comparing their size, date, etc? I already told you they were the same file, what are you asking me for with the All Fields flag? Why can't i just do a binary compare of the blobs? Why won't the shell? What does CompareIDs do that MemCmp(pidl1, pidl2) doesn't?
- Will SHCIDS_ALLFIELDS only appear with SHCIDS_CANONICALONLY?
- Will SHCIDS_ALLFIELDS never appear with SHCIDS_CANONICALONLY?
- Can SHCIDS_ALLFIELDS appear both with and without SHCIDS_CANONICALONLY?
- What does SHCIDS_ALLFIELDS with SHCIDS_CANONICALONLY mean?
- What does SHCIDS_ALLFIELDS without SHCIDS_CANONICALONLY mean?
What does it want me to do if SHCIDS_ALLFIELDS is passed? Should i hit the underlying data store to query all fields i know of?
Is CompareIDs used to compare IDs, or is it used to compare objects?
I wondered if the purpose of CompareIDs was to absolutely not hit the underlying data store (e.g. hard disk, phone over USB, Mapi), and only compare based on what you have on-hand in the pidl.
That makes sense for two reasons:
- it's faster; many namespaces contain some amount of metadata in their PIDL blobs - no need to go back to disk/Internet
- even though the pidls may refer to the same object, it's possible their metadata is out of date
- SHCIDS_CANONICALONLY lets the caller realize two pidls are the same thing
- but a separate call with SHCIDS_CANONICALONLY | SHCIDS_ALLFIELDS can tell us extra metadata may be out of date (although i have no idea what use that information is to the caller)
And so perhaps SHCIDS_CANONICALONLY means:
- please limit yourself to the pidl - don't touch the disk to perform the compare
- and omitting it means: "Yeah, you can hit the hard drive if you really need to"
Is that the case?
- If SHCIDS_CANONICALONLY means: "Don't look at anything besides what's in the pidl, and tell me if these two things are the same object"
- Then what is gained by SHCIDS_ALLFIELDS?
- When would they be different?
- What is the shell asking me?
Bonus Question
- If SHCIDS_CANONICALONLY means perform the most efficient sort,
- does the lack of SHCIDS_CANONICALONLY mean it's ok to sort based on localization and customization of the name?
- does the lack of SHCIDS_CANONICALONLY mean it's mandatory to sort based on localization and customization of the name?
What does it mean to "sort" two itemID lists?
The SDK example does a switch based on each column, and looks up the values for every column. Does that mean i have to load a video from over a network just to read its audio sample rate?
- Am i comparing PIDLs
- or am i comparing the objects those pidls point to?