Questions tagged [canonicalization]

Canonicalization is a process for converting data that has more than one possible representation into a "standard", "normal", or canonical form.

As per Wikipedia:

In computer science, canonicalization (abbreviated c14n, where 14 represents the number of letters between the C and the N; also sometimes standardization or normalization) is a process for converting data that has more than one possible representation into a "standard", "normal", or canonical form. This can be done to compare different representations for equivalence, to count the number of distinct data structures, to improve the efficiency of various algorithms by eliminating repeated calculations, or to make it possible to impose a meaningful sorting order.
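
As a minimal illustration of "comparing different representations for equivalence", the sketch below brings two encodings of the word "résumé" to one Unicode normalization form before comparing them. It uses Python's standard `unicodedata.normalize`; the choice of NFC and the helper name `canonical` are assumptions made for this example, not something the tag description prescribes.

```python
import unicodedata

def canonical(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)

precomposed = "r\u00e9sum\u00e9"    # each é is a single precomposed code point
combining = "re\u0301sume\u0301"    # each é is a plain e plus U+0301 combining acute

print(precomposed == combining)                        # False
print(canonical(precomposed) == canonical(combining))  # True
```

The raw strings differ code point by code point, but their canonical forms are identical, so equality on the canonical form answers the equivalence question.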

197 questions
6
votes
2 answers

How to get canonicalized path (realpath) of nonexistent file in PHP?

script.php $filename = realpath(sprintf("%s/%s", getcwd(), $argv[1])); var_dump($filename); Let's try some things [/foo/bar/bof] $ php script.php ../foo.txt string(16) "/foo/bar/foo.txt" [/foo/bar/bof] $ php script.php…
Mulan
  • 129,518
  • 31
  • 228
  • 259
5
votes
1 answer

May a C++ compiler normalize Unicode identifiers?

In C++, we can use a wide variety of Unicode characters in identifiers. For example, you could name a variable résumé. Those accented es can be represented in different ways: either as a precomposed character or as a plain e with a combining accent…
5
votes
1 answer

Understanding Nauty algorithm

I am trying to understand the Nauty algorithm. Following this article: http://www.math.unl.edu/~aradcliffe1/Papers/Canonical.pdf In this algorithm the vertices are distinguished based on their degree and the relative degree of a group corresponding…
Dip
  • 192
  • 15
5
votes
1 answer

Why does my canonicalized path get prefixed with \\?\

I'm working on a personal project that I was trying to solve via canonicalizing a relative path in Rust. However, whenever I do so, the new path gets prefixed with a strange \\?\ sequence. For example, something as simple as: let p =…
MutantOctopus
  • 3,431
  • 4
  • 22
  • 31
5
votes
1 answer

Does (Exclusive) XML Canonicalization ignore whitespaces outside tags (indentation)?

When XML must be canonicalized according to http://www.w3.org/TR/xml-exc-c14n/, should the following pieces of XML become equal? (note, the . character stands for a ' '…
taper
  • 9,236
  • 5
  • 28
  • 29
5
votes
2 answers

Canonicalisation of usernames

What is the best way to get a canonical representation of a username that is idempotent? I want to avoid having the same issue as Spotify: http://labs.spotify.com/2013/06/18/creative-usernames/ I'm looking for a good library to do this in Python. I…
X-Istence
  • 16,324
  • 6
  • 57
  • 74
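
One way to make a username canonicalization idempotent by construction is to repeat a normalization step until the result stops changing. The sketch below is only an assumption about what such a function could look like: the helper names `_step` and `canonical_username`, the NFKC plus case-folding recipe, and the `max_rounds` limit are all hypothetical and are not taken from the question or from any particular library.

```python
import unicodedata

def _step(name: str) -> str:
    # One pass: compatibility-normalize (NFKC), then case-fold.
    return unicodedata.normalize("NFKC", name).casefold()

def canonical_username(name: str, max_rounds: int = 10) -> str:
    """Apply _step until a fixed point is reached (hypothetical helper)."""
    current = name
    for _ in range(max_rounds):
        nxt = _step(current)
        if nxt == current:
            # current is a fixed point of _step, so canonicalizing it
            # again returns the same string.
            return current
        current = nxt
    raise ValueError("username did not reach a stable canonical form")

fullwidth = "\uff22\uff49\uff47"  # fullwidth letters B, i, g
print(canonical_username(fullwidth))
print(canonical_username(canonical_username(fullwidth)) == canonical_username(fullwidth))
```

Because the function only ever returns a string that `_step` leaves unchanged, applying it twice gives the same result as applying it once. Whether NFKC and case folding are the right policy for a given site is a separate decision.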
5
votes
1 answer

Redirect URLs with FQDN (dot after TLD) to equivalent with PQDN

Many websites can be accessed with a FQDN (i.e., appending a dot to the TLD): https://www.ebay.com./ https://www.google.com./ https://www.reddit.com./ https://stackoverflow.com./ https://en.wikipedia.org./wiki/Main_Page Some sites can’t be…
unor
  • 92,415
  • 26
  • 211
  • 360
4
votes
1 answer

Create NodeList of all Document nodes manually

I currently generate a NodeList of all the Document nodes (in document order) manually. The XPath expression to get this NodeList is //. | //@* | //namespace::* My first attempt for walking the DOM manually and collecting the nodes (NodeSet is a…
emboss
  • 38,880
  • 7
  • 101
  • 108
4
votes
1 answer

Is there a Perl6 canonical form?

The Perl6 standard grammar is relatively large. Although this facilitates expression once mastered, it creates a barrier to mastery. For instance, core constructs often have multiple forms supporting different programming paradigms. A basic…
user3673
  • 665
  • 5
  • 21
4
votes
2 answers

Decoding Huffman file from canonical form

I am writing a Huffman file where I am storing the code lengths of the canonical codes in the header of the file. And during decoding, I am able to regenerate the canonical codes and store them into a std::map>. The…
WDRKKS
  • 125
  • 3
  • 11
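
Regenerating canonical Huffman codes from stored code lengths can be sketched as follows. This is a Python illustration of the usual canonical assignment (symbols ordered by length, then by symbol value, with consecutive code values), not the asker's C++ code; the function name `canonical_codes` and the sample lengths are assumptions made for the example.

```python
def canonical_codes(lengths: dict[str, int]) -> dict[str, str]:
    """Map each symbol to its canonical code as a bit string (hypothetical helper)."""
    codes = {}
    code = 0
    prev_len = 0
    # Skip unused symbols (length 0); order by (length, symbol).
    used = [(sym, n) for sym, n in lengths.items() if n > 0]
    for symbol, length in sorted(used, key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len          # widen the code when the length grows
        codes[symbol] = format(code, f"0{length}b")
        code += 1                           # next code of the same length
        prev_len = length
    return codes

print(canonical_codes({"a": 2, "b": 1, "c": 3, "d": 3}))
# {'b': '0', 'a': '10', 'c': '110', 'd': '111'}
```

Since the codes depend only on the lengths and the symbol order, an encoder and a decoder that both run this procedure on the header data arrive at the same table.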
4
votes
2 answers

google safe browsing api url encoding (canonicalization)

In my application I am checking user-entered URLs for malware by sending them to Google. To test getting a "malware found" reaction I used the URL http://malware.testing.google.test/testing/malware To my surprise this URL was not marked as…
4
votes
1 answer

Wordpress auto-generated "canonical" links - how to add a custom URL parameter?

Does anyone know how to modify the Wordpress canonical links to add a custom URL parameter? I have a Wordpress site with a page that queries a separate (non-Wordpress) database. I passed the URL parameter "pubID" to display individual books and it…
4
votes
0 answers

XML Canonicalisation in C#

I am trying to create a canonical form of an XML input by using an XPath expression; however, I am not sure if I am receiving the correct output from the document. I am using this expression at the moment but I am expected to be able to use the…
RLW
  • 132
  • 1
  • 9
4
votes
2 answers

How to convert <node/> to <node></node> with libxml (converting empty elements to start-end tag pairs)

While generating XML content, I get an empty node <node/>, and I want it to be <node></node>. (Since <node></node> is the correct form of c14n, the process called "converting empty elements to start-end tag pairs") How should I convert…
iOS Padawan
  • 418
  • 1
  • 5
  • 16
3
votes
0 answers

Spring boot - converting to and from canonical form of property names maintaining nesting?

I have a bunch of Spring configuration files in various formats from various projects. I am doing some analytics on the combined merged results of the properties files. So I would like to parse out the hierarchy of each file (yaml, json and…
Nicholas DiPiazza
  • 10,029
  • 11
  • 83
  • 152