
I want to configure clang-format to sort the included headers in C++ files as follows:

  • main header (associated with the current cpp file),
  • local headers included via "",
  • other headers included via <>,
  • headers from specific external libraries (e.g. boost, catch2),
  • system/standard headers.

I'm using clang-format 8.0.0 on macOS. My current configuration (snippet related only to includes) is as follows:

SortIncludes: true
IncludeBlocks: Regroup
IncludeCategories:
  # Headers in <> without extension.
  - Regex:           '<([A-Za-z0-9\/-_])+>'
    Priority:        4
  # Headers in <> from specific external libraries.
  - Regex:           '<((\bboost\b)|(\bcatch2\b))\/([A-Za-z0-9.\/-_])+>'
    Priority:        3
  # Headers in <> with extension.
  - Regex:           '<([A-Za-z0-9.\/-_])+>'
    Priority:        2
  # Headers in "" with extension.
  - Regex:           '"([A-Za-z0-9.\/-_])+"'
    Priority:        1

In this configuration I assume that system/standard headers have no extension, so it will not work for UNIX/POSIX headers. The main header is detected automatically and assigned priority 0. So far everything seems to work as expected, except for the external libraries category: clang-format appears to assign those headers priority 2 instead of 3.

Expected result:

#include "test.h"

#include <allocator/region.hpp>
#include <page.hpp>
#include <page_allocator.hpp>
#include <test_utils.hpp>
#include <utils.hpp>
#include <zone_allocator.hpp>

#include <catch2/catch.hpp>     // <--------

#include <array>
#include <cmath>
#include <cstring>
#include <map>

Actual result:

#include "test.h"

#include <allocator/region.hpp>
#include <catch2/catch.hpp>     // <--------
#include <page.hpp>
#include <page_allocator.hpp>
#include <test_utils.hpp>
#include <utils.hpp>
#include <zone_allocator.hpp>

#include <array>
#include <cmath>
#include <cstring>
#include <map>

How can I configure the priority 3 category to get the expected result?

eclipse

3 Answers


I got it working by adapting an example from the clang-format docs for this option:

SortIncludes: true
IncludeBlocks: Regroup
IncludeCategories:
  # Headers in <> without extension.
  - Regex:           '<([A-Za-z0-9\Q/-_\E])+>'
    Priority:        4
  # Headers in <> from specific external libraries.
  - Regex:           '<(catch2|boost)\/'
    Priority:        3
  # Headers in <> with extension.
  - Regex:           '<([A-Za-z0-9.\Q/-_\E])+>'
    Priority:        2
  # Headers in "" with extension.
  - Regex:           '"([A-Za-z0-9.\Q/-_\E])+"'
    Priority:        1

In particular, I changed the priority 3 regex to resemble the one in the original example:

'^(<|"(gtest|gmock|isl|json)/)'

I also added the \Q and \E modifiers to avoid the problem mentioned by Julio. Now everything works as expected. However, I still don't know why the solution from the question doesn't work.

eclipse
  • Probably word boundaries are not supported in POSIX ERE regexes. Try your original rules with every `\b` removed from the second rule. – Julio Apr 22 '19 at 14:42
  • Yep, you're right. Without \b original approach works. Thanks! – eclipse Apr 22 '19 at 14:51
  • If you use the original approach, bear in mind that you need to escape the `-` or put it at the end of the class: `[A-Za-z0-9\/_-]`. I'll add my comment as an answer, as it seems to address the original problem. – Julio Apr 22 '19 at 17:50

The problem is that clang-format uses POSIX ERE regexes, and those do not support word boundaries (\b).

So <catch2/catch.hpp> will never match the second rule. The same string is then evaluated against the third rule, which matches.

If it had matched the second rule, matching would have stopped there; since it did not, clang-format moves on to the next rule.

Just remove every \b from the regex. It is safe to remove them because the boundaries are already enforced: there is a < on the left and a / on the right, so even if word boundaries were supported, they would add nothing.

  - Regex:           '<(boost|catch2)\/([A-Za-z0-9.\/-_])+>'
    Priority:        3
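
The fall-through described above can be reproduced with a minimal Python sketch. It is only an approximation: Python's re module is not POSIX ERE and does support \b, so the broken case is simulated by leaving the library rule out, which is what a never-matching regex amounts to. The names FIXED_RULES, RULES_WITHOUT_LIBRARY_RULE and priority_for are hypothetical and not part of clang-format.

```python
import re

# Rules in file order as (regex, priority). The first regex that matches wins.
FIXED_RULES = [
    (r'<([A-Za-z0-9\/-_])+>', 4),                   # <> without extension
    (r'<(boost|catch2)\/([A-Za-z0-9.\/-_])+>', 3),  # external libraries, no \b
    (r'<([A-Za-z0-9.\/-_])+>', 2),                  # <> with extension
    (r'"([A-Za-z0-9.\/-_])+"', 1),                  # "" with extension
]

# Simulation of the original problem: a rule whose regex can never match
# behaves exactly as if the rule were absent from the list.
RULES_WITHOUT_LIBRARY_RULE = [rule for rule in FIXED_RULES if rule[1] != 3]


def priority_for(include, rules):
    """Return the priority of the first rule whose regex is found in include."""
    for pattern, priority in rules:
        if re.search(pattern, include):
            return priority
    return None  # no rule matched; this sketch does not model the default


for header in ('<catch2/catch.hpp>', '<page.hpp>', '<array>'):
    print(header,
          priority_for(header, FIXED_RULES),
          priority_for(header, RULES_WITHOUT_LIBRARY_RULE))
```

With the library rule in place, <catch2/catch.hpp> stops at priority 3; without it, the same header falls through to the generic "with extension" rule and is grouped with the project headers.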

NOTE: Bear in mind that a - inside [] should be escaped with a backslash unless it is placed in the last position, because otherwise it denotes a range. When you write [A-Za-z0-9.\/-_] you are asking for A-Za-z0-9. plus the range from / to _, which is probably not what you intended.
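
To see what the accidental range covers, here is a small Python sketch. Python's re is used only as a stand-in (it is not the POSIX ERE engine clang-format uses), but the range arithmetic on ASCII codes is the same. The names accidental_range, range_class, literal_class and accepts are made up for this illustration.

```python
import re

# Characters covered by the accidental range "/-_" (ASCII 0x2F through 0x5F).
accidental_range = [chr(code) for code in range(ord('/'), ord('_') + 1)]

range_class = re.compile(r'[A-Za-z0-9.\/-_]')    # "-" forms the range "/" to "_"
literal_class = re.compile(r'[A-Za-z0-9.\/_-]')  # "-" is last, so it is literal


def accepts(char_class, text):
    """True if every character of text is a member of the character class."""
    return all(char_class.fullmatch(ch) is not None for ch in text)


print(''.join(accidental_range))
print(accepts(range_class, 'yaml-cpp'), accepts(literal_class, 'yaml-cpp'))
```

The range pulls in characters such as <, > and = while leaving out the hyphen itself, so a header like <yaml-cpp/yaml.h> is only fully covered by the class that puts - last.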

Julio
  • Unfortunately neither of these ways work. In the first case (reordering) nothing changes and in the second one catch2 header is still in the same category + system headers are moved in the middle. I think clang-format is expecting something different there. – eclipse Apr 22 '19 at 10:28
  • A pity. It seems that clang-format uses POSIX ERE regexes, which do not support lookahead assertions. Anyway, I think it is still doable; I'll give it some thought... – Julio Apr 22 '19 at 10:35
  • Do you have many libs like boost and catch2 or just those 2? @eclipse – Julio Apr 22 '19 at 10:45
  • Well, in this particular project only those 2, but ideally I would like to have a generic solution for other projects. Why does it matter? – eclipse Apr 22 '19 at 10:46
  • Because even if it is doable, it gets trickier with every lib. Take a look at this example: regex101.com/r/3x2Oxr/3 For every lib the regex grows. Perhaps you could create your own script for generating such regexes given some input libs. @eclipse – Julio Apr 22 '19 at 11:08
  • It looks very complicated... I managed to find the solution (see my own answer), but it still doesn't tell me why the original approach doesn't work. Thanks for your effort anyway. – eclipse Apr 22 '19 at 12:08
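
A script along the lines suggested in the comments could look like the following Python sketch. It generates the simpler alternation form used in eclipse's answer, not the longer regex from the regex101 link. The function names (escape_ere, library_regex, category_entry) are hypothetical, the slash is left unescaped as in the clang-format docs example, and library names are assumed to contain no single quotes.

```python
# Characters with a special meaning in POSIX ERE outside bracket expressions.
ERE_SPECIAL = set('.[]()*+?{}|^$\\')


def escape_ere(name):
    """Backslash-escape ERE metacharacters in a library name."""
    return ''.join('\\' + ch if ch in ERE_SPECIAL else ch for ch in name)


def library_regex(libraries):
    """Build a regex matching '<lib/' for any of the given library names."""
    alternatives = '|'.join(escape_ere(lib) for lib in libraries)
    return '<(' + alternatives + ')/'


def category_entry(libraries, priority):
    """Format one IncludeCategories entry in the style used in this thread."""
    return ("  - Regex:           '{}'\n"
            "    Priority:        {}").format(library_regex(libraries), priority)


print(category_entry(['catch2', 'boost'], 3))
```

Adding a library then means adding one name to the input list, and the regex grows by a single alternative.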

If you follow the convention that you put all the external headers in <> and local headers in "", you can use the following config file. It will sort the headers nicely based on their category:

.clang-format

SortIncludes: CaseSensitive
IncludeBlocks: Regroup
IncludeCategories:
  # Specific external headers in <> to put first
  - Regex: '<(catch2|gtest).*>'
    Priority: 1
  # External headers in <> with extension or /
  - Regex: '<[-\w\/-_]+[\.\/][-\w\/-_]+>'
    Priority: 2
  # Standard headers in <>
  - Regex: '<[-\w\/-_]+>'
    Priority: 3
  # Local headers in ""
  - Regex: '"[-\w\/-_]*"'
    Priority: 4

That will sort the includes like this:

#include <catch2/catch_test_macros.hpp>

#include <Eigen/Core>
#include <fmt/format.h>
#include <yaml-cpp/yaml.h>

#include <algorithm>
#include <cmath>
#include <concepts>
#include <filesystem>

#include "../some_other_local_header"
#include "./some_local_header.hpp"
Amin Ya