First of all, many thanks for this and this answer. They were the starting point of my analysis of the problem.
Those answers present two different approaches to "whitelisting" by paths. The first one rebuilds the whitelisted structure from scratch: starting from an empty object, it creates the needed routes. The implementation parses the string paths and tries to rebuild the tree from the parsed path. This approach requires careful manual handling of every possible kind of path and is therefore error-prone. Some of the mistakes I found are listed in my comment on that answer.
The second approach is based on the Json.NET object tree API (Parent, Ancestors, Descendants, etc.). The algorithm traverses the tree and removes the paths that are not "whitelisted". I find that approach much easier and much less error-prone, and it supports a wide range of cases in one go.
The algorithm I have implemented is similar in many respects to the second answer but is, I think, much easier to implement and understand. I also don't think its performance is any worse.
    using System.Collections.Generic;
    using System.Linq;
    using Newtonsoft.Json.Linq;

    public static class JsonExtensions
    {
        public static TJToken RemoveAllExcept<TJToken>(this TJToken token, IEnumerable<string> paths)
            where TJToken : JContainer
        {
            // Nodes are tracked by reference, because JValue compares by value.
            HashSet<JToken> nodesToRemove = new(ReferenceEqualityComparer.Instance);
            HashSet<JToken> nodesToKeep = new(ReferenceEqualityComparer.Instance);

            foreach (var whitelistedToken in paths.SelectMany(token.SelectTokens))
                TraverseTokenPath(whitelistedToken, nodesToRemove, nodesToKeep);

            // None of the paths returned any token, so nothing is whitelisted.
            if (nodesToKeep.Count == 0)
            {
                token.RemoveAll();
                return token;
            }

            // A sibling collected for one path may lie on another whitelisted path.
            nodesToRemove.ExceptWith(nodesToKeep);

            foreach (var notWhitelistedNode in nodesToRemove)
                notWhitelistedNode.Remove();

            return token;
        }

        // Walks from the matched token up to the root, keeping every node on the
        // way and marking the siblings of each of those nodes for removal.
        private static void TraverseTokenPath(JToken value, ISet<JToken> nodesToRemove, ISet<JToken> nodesToKeep)
        {
            JToken? immediateValue = value;

            do
            {
                nodesToKeep.Add(immediateValue);

                if (immediateValue.Parent is JObject or JArray)
                {
                    foreach (var child in immediateValue.Parent.Children())
                        if (!ReferenceEqualityComparer.Instance.Equals(child, immediateValue))
                            nodesToRemove.Add(child);
                }

                immediateValue = immediateValue.Parent;
            } while (immediateValue != null);
        }
    }
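As a rough illustration of how the extension above might be called, here is a minimal sketch. It assumes the JsonExtensions class from this answer is compiled into the same project; the sample JSON and the property name path1 are made up for the example.

```csharp
using System;
using Newtonsoft.Json.Linq;

// Hypothetical input document, used only for this illustration.
var json = JObject.Parse(@"{
  ""path1"": 1,
  ""path2"": {
    ""path2Inner1"": ""a"",
    ""path2Inner2"": [ ""id"", ""id"" ]
  }
}");

// Keep the "path1" property and only the first element of "path2Inner2".
json.RemoveAllExcept(new[] { "path1", "$..path2Inner2[0]" });

// What remains: "path1": 1 and "path2" containing "path2Inner2": [ "id" ].
Console.WriteLine(json.ToString());
```

The method mutates the token it is called on and returns that same instance, so calls can be chained.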
JToken instances must be compared with a reference equality comparer, because some JToken types, such as JValue, compare by value. With the default comparer, two distinct nodes that hold equal values are treated as the same node, which leads to incorrect results in some cases.
For example, given the source JSON

    {
      "path2": {
        "path2Inner2": [
          "id",
          "id"
        ]
      }
    }

and the path $..path2Inner2[0], a by-value comparison produces the result

    {
      "path2": {
        "path2Inner2": [
          "id",
          "id"
        ]
      }
    }

instead of

    {
      "path2": {
        "path2Inner2": [
          "id"
        ]
      }
    }
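The effect can be reproduced in isolation. The following sketch (assuming .NET 5 or later and the Newtonsoft.Json package) puts the two equal-valued array elements from the example into a HashSet, once with the default comparer and once with the reference comparer.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

var array = JArray.Parse(@"[ ""id"", ""id"" ]");
JToken first = array[0];
JToken second = array[1];

// Default comparer: JValue overrides Equals/GetHashCode, so both
// elements are considered equal and collapse into a single entry.
var byValue = new HashSet<JToken> { first, second };

// Reference comparer: each node is a separate entry.
var byReference = new HashSet<JToken>(ReferenceEqualityComparer.Instance) { first, second };

Console.WriteLine(byValue.Count);     // 1
Console.WriteLine(byReference.Count); // 2
```

With the default comparer, removing the "kept" set from the "to remove" set would therefore also drop the second "id" from the removal list, which is exactly the wrong result shown above.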
Starting with .NET 5.0, the built-in ReferenceEqualityComparer can be used. On earlier versions of .NET you will need to implement it yourself.
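For earlier versions, a comparer along the following lines should be enough. This is only a sketch: the class name JTokenReferenceComparer is hypothetical and is not part of Json.NET or the base class library.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using Newtonsoft.Json.Linq;

// Hypothetical helper: compares JToken instances by reference only.
public sealed class JTokenReferenceComparer : IEqualityComparer<JToken>
{
    public static readonly JTokenReferenceComparer Instance = new JTokenReferenceComparer();

    private JTokenReferenceComparer() { }

    // Two tokens are equal only if they are the very same object.
    public bool Equals(JToken x, JToken y) => ReferenceEquals(x, y);

    // RuntimeHelpers.GetHashCode ignores any GetHashCode override on the token.
    public int GetHashCode(JToken obj) => RuntimeHelpers.GetHashCode(obj);
}
```

It would then be passed to the sets in place of ReferenceEqualityComparer.Instance, for example `new HashSet<JToken>(JTokenReferenceComparer.Instance)`. Note that the target-typed `new(...)` and the `is JObject or JArray` pattern used in the extension are C# 9 features, so on an older compiler those lines would need to be written out in the longer form as well.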