I'm currently working on a toy language that works like this: one can embed blocks written in this language into a C++ source, and before compilation, these blocks are translated into C++ in an extra preprocessing step, producing a valid C++ source.
I want to make sure that these blocks can always be identified in the source unambiguously and also, whenever such a block is present in the source, it cannot be valid C++. Moreover, I want to achieve these by putting as few constraints to the embedded language as possible (the language itself is still somewhat fluid).
The obvious way would be to introduce a pair of special multi-character parentheses, made of characters that cannot appear together in valid C++ code (or in the embedded language). However, I'm not sure how to ensure that particular a character sequence is good for this purpose (not after GotW #78, anyway (: ).
So what is a good way to escape these blocks?