
I have many placeholders ("data frames") such as [COMPANY] in my HTML text file that I want to exclude when translating the text with DeepL. I use the DeepL Java library with the API, and I am not allowed to change the data frame format.

Any idea how to exclude a data frame such as [TEXT] from translation?

Example text:

Dear client,

Please find enclosed [EVENT] for the order you wish to execute for your account [ACCOUNT_NAME_TEXT].


Kind regards,

[COMPANY_NAME]

HTML file:

<!DOCTYPE html>
<html>
    <head>
    </head>
    <body>
        <p>Dear client,</p>
        <p>Please find enclosed the Events for the order you wish to execute for your account [ACCOUNT_NAME_TEXT].</p>
        <p>&#160;</p>
        <p>Kind regards,</p>
        <p>[COMPANY_NAME]</p>
    </body>
</html>
lsa

1 Answer


For now, I solved it by wrapping each data frame such as [TEXT] in an ignore tag before translating and converting it back to the original format afterwards. See the method below; it may help someone with the same requirement.

    import com.deepl.api.*;
    import java.util.List;

    // Class members; `translator` is a com.deepl.api.Translator field defined elsewhere.
    private static final String BEGIN_IGNORE_TAG = "<loveIgnoreTag>";
    private static final String END_IGNORE_TAG = "</loveIgnoreTag>";

    public String translate(String source, String target, String text)
            throws DeepLException, InterruptedException {
        // https://www.deepl.com/docs-api/xml/ignored-tags/
        // Content inside these tags is left untranslated.
        List<String> ignoreTags = List.of("loveIgnoreTag");

        // Turn [TEXT] into <loveIgnoreTag>TEXT</loveIgnoreTag> before sending.
        text = parseToIgnoreTag(text);

        TextTranslationOptions translationOptions = new TextTranslationOptions()
                .setTagHandling("xml")
                .setFormality(Formality.PreferMore)
                .setPreserveFormatting(true)
                .setIgnoreTags(ignoreTags)
                .setSentenceSplittingMode(SentenceSplittingMode.All);

        TextResult result = translator.translateText(text, source, target, translationOptions);

        // Turn the ignore tags back into the original [TEXT] form.
        return parseToDataFrame(result.getText());
    }

    // Note: this replaces every '[' and ']' in the text, not only those around data frames.
    private String parseToIgnoreTag(String text) {
        return text.replace("[", BEGIN_IGNORE_TAG).replace("]", END_IGNORE_TAG);
    }

    private String parseToDataFrame(String result) {
        return result.replace(BEGIN_IGNORE_TAG, "[").replace(END_IGNORE_TAG, "]");
    }
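Because the plain `replace` calls above convert every square bracket in the text, a stricter variant could wrap only tokens that look like data frames. The following is a minimal sketch, not part of the original solution: the class name `PlaceholderGuard` and the assumption that data frames consist only of upper-case letters, digits and underscores (as in [COMPANY_NAME]) are mine. It uses only `java.util.regex` and does not call DeepL itself.

```java
import java.util.regex.Pattern;

// Hypothetical helper: wraps and unwraps [UPPER_CASE] data frames only.
final class PlaceholderGuard {
    // Assumption: a data frame is '[' + upper-case letters, digits or underscores + ']'.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\[([A-Z][A-Z0-9_]*)\\]");
    // Matches the wrapped form produced by protect().
    private static final Pattern WRAPPED =
            Pattern.compile("<loveIgnoreTag>([A-Z][A-Z0-9_]*)</loveIgnoreTag>");

    private PlaceholderGuard() {}

    // [NAME] -> <loveIgnoreTag>NAME</loveIgnoreTag>; other brackets are left alone.
    static String protect(String text) {
        return PLACEHOLDER.matcher(text).replaceAll("<loveIgnoreTag>$1</loveIgnoreTag>");
    }

    // <loveIgnoreTag>NAME</loveIgnoreTag> -> [NAME]
    static String restore(String text) {
        return WRAPPED.matcher(text).replaceAll("[$1]");
    }
}
```

In the `translate` method, `PlaceholderGuard.protect(text)` would take the place of `parseToIgnoreTag(text)` and `PlaceholderGuard.restore(result.getText())` the place of `parseToDataFrame(...)`. Text such as `[1]` or `[note]` would then pass through to the translator unchanged.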
lsa
  • That's a nice idea :) Our tag handling currently does not support the brackets you're using. You'd need to replace your [] brackets with <> to make it work. – Tim Cadenbach Oct 14 '22 at 10:36