The (f)lex interface does not separate buffer management from lexical analysis, so there is no mechanism to just "call another lexer" in the middle of a scan. [Note 1]
However, you can easily equip the generated scanner to handle different lexical contexts, using start conditions. Each condition defines a completely independent set of rules, so as long as the scanners can be made to use the same semantic value type (in C, this would involve expanding the semantic union) you can pack any number of different token sets into one generated grammar. [Note 2]
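As a sketch of what that can look like (condition names, patterns, and semantic-value fields here are invented for illustration, not taken from any particular grammar), a flex specification with two exclusive start conditions might read:

```lex
%{
#include <stdlib.h>
#include <string.h>
#include "parser.tab.h"   /* token numbers and yylval from the parser generator */
%}

/* Two exclusive start conditions, one per sublanguage. */
%x LANG_A
%x LANG_B

%%
<LANG_A>[0-9]+    { yylval.num = atoi(yytext); return A_NUMBER; }
<LANG_B>[a-z]+    { yylval.str = strdup(yytext); return B_WORD; }
<*>[ \t\n]+       ;   /* whitespace skipped in every condition */

%%
/* Helpers the parser can call to switch context. BEGIN is only
   meaningful inside the generated scanner file, hence the wrappers. */
void lexer_enter_a(void) { BEGIN(LANG_A); }
void lexer_enter_b(void) { BEGIN(LANG_B); }
```

Both rule sets live in one generated scanner, and the parser (or anything else) picks the active one through the wrapper functions.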
The tricky part is getting the transitions right, and that would be somewhat tricky even if the interface were more cooperative. LALR(1) parsers, as produced by common parser generators, rely on reading a lookahead token in advance: when a parser action executes, the next input token has normally already been scanned. So if a parser action calls into the scanner to change state, that state change will often not take effect until the token after the lookahead token.
But even that cannot be guaranteed, because the parser might execute a reduction without requesting a lookahead token at all, in the case where the reduction would be performed regardless of what token follows. Bison implements this optimisation, but not all parser generators do, and not all of them even document whether they do.
So, if at all possible, make the delimiter strings the same token in both languages. Failing that, it may be feasible to write the grammar so that both tokens have the same effect. Beyond that, a variety of parser- and scanner-specific hacks are available, depending on circumstances.
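For instance (a hedged sketch; OPEN_B, CLOSE_B, b_items and the lexer_enter_* helpers are invented names), a bison grammar might switch scanner context in a mid-rule action, bearing in mind the lookahead caveat described above:

```yacc
embedded
    : OPEN_B { lexer_enter_b(); }       /* may fire only after the lookahead
                                           was already scanned under the OLD
                                           rules -- hence the advice to make
                                           the delimiter a token that both
                                           tokenisations produce */
      b_items
      CLOSE_B { lexer_enter_initial(); }
    ;
```

If OPEN_B is lexed identically in both contexts, it no longer matters under which rule set the lookahead was read.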
Except in rare cases, it will not be possible to trigger the transition from a scanner action, because that would require duplicating a significant part of the parse inside the scanner. But when the delimiter strings are particularly distinctive, it can work.
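When the delimiter really is that distinctive, the pattern looks something like this (condition and token names invented; this is a sketch of the technique, not code from any real scanner):

```lex
 /* If the opening delimiter cannot mean anything else in the outer
    language, the scanner itself can switch conditions, avoiding the
    lookahead timing problem entirely. */
"%{"              { BEGIN(EMBEDDED_C); return BEGIN_C_BLOCK; }
<EMBEDDED_C>"%}"  { BEGIN(INITIAL);    return END_C_BLOCK; }
```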
Examples "in the wild" of all these strategies are available, including in the code of flex itself (which needs to recognize regular expressions in one context and embedded C code in another one).
Notes:
It is hard to criticise a design decision made so long ago, and it is totally understandable. But IMHO, it's really unfortunate that we still have to suffer from it. Control-flow-inverted scanners would be useful for a variety of applications, such as online parsing from an input stream inside an event loop.
With flex, start conditions need not be completely independent. A condition can be declared exclusive with `%x` (only patterns explicitly marked with that condition apply) or inclusive with `%s` (unmarked patterns also apply). Which option you take depends in part on how similar the tokenisations are.
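Concretely (rule and token names invented for illustration):

```lex
%s QUOTED     /* inclusive: unmarked rules below remain active */
%x RAWMODE    /* exclusive: only rules tagged <RAWMODE> apply  */

%%
[0-9]+              { return NUMBER; }   /* matches in INITIAL and QUOTED */
<QUOTED>\\.         { return ESCAPE; }   /* extra rule, only in QUOTED    */
<RAWMODE>.|\n       { return RAW_CHAR; } /* the only rule live in RAWMODE */
```

An inclusive condition suits a context that merely adds a few tokens to the base language; an exclusive one suits a genuinely different tokenisation.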