I want to align source and target sentences in a multilingual translation setting.
Conceptually, I want to do something like the following for an example English source sentence and a German target sentence:
0 1 2 3 4 5 6 7
i saw the man walking on the street
ich sah den mann auf der straße gehen
Word-level alignment would be: 0-0 1-1 2-2 3-3 4-7 5-4 6-5 7-6
Or, when the source and target sentences differ in length:
0 1 2 3 4 5 6 7 8 9
it is a different way of saying the same thing
es ist eine andere art , dasselbe zu sagen
Word-level alignment should be something like: 0-0 1-1 2-2 3-3 4-4 5-5 6-[7,8] 7-6 8-6 9-6
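To make the desired output concrete, here is a minimal Python sketch of the data structure I have in mind. It only parses the hand-written alignment string from the second example into a mapping from source index to target indices; it does not compute any alignment. The function name `parse_alignment` and the bracket notation for one-to-many links are my own conventions, not taken from any existing tool.

```python
from collections import defaultdict


def parse_alignment(spec):
    """Parse a string like '0-0 6-[7,8]' into {source_index: [target_indices]}.

    Hypothetical helper: the 'i-[j,k]' notation is just the ad hoc format
    used in this question for one-to-many links.
    """
    links = defaultdict(list)
    for pair in spec.split():
        src, tgt = pair.split("-", 1)
        # '[7,8]' -> ['7', '8']; a plain '6' -> ['6']
        tgt_ids = tgt.strip("[]").split(",")
        links[int(src)].extend(int(t) for t in tgt_ids)
    return dict(links)


source = "it is a different way of saying the same thing".split()
target = "es ist eine andere art , dasselbe zu sagen".split()

# Hand-written gold alignment from the example above.
alignment = parse_alignment("0-0 1-1 2-2 3-3 4-4 5-5 6-[7,8] 7-6 8-6 9-6")

# Print each source word next to the target word(s) it is linked to.
for src_idx, tgt_ids in sorted(alignment.items()):
    print(source[src_idx], "->", [target[t] for t in tgt_ids])
```

So what I am looking for is a method that produces such a mapping automatically for arbitrary sentence pairs, including one-to-many links ("saying" to "zu sagen") and many-to-one links ("the same thing" to "dasselbe").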
What's the best way to achieve this? Thanks for any suggestions!