
I would like to get a normalized list of German verb conjugations, starting from the Wiktionary XML dump.

I think I can manage to parse the XML dump, but I don't understand how Wiktionary expands a Flexion template into a normalized display like, for instance, https://de.wiktionary.org/wiki/Flexion:lesen

which seems to be expanded from:

{{Deutsch Verb unregelmäßig|2=les|3=las|4=läs|5=gelesen|6=lies|7=-s|8=i|vp=ja|zp=nein|gerund=ja}}
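Splitting such a call into its parameters is the part I can manage. Here is a minimal sketch in Python, assuming the call is flat (no nested templates or links containing `|`). The function name `parse_template_call` is my own and hypothetical, and stripping whitespace from every value is a simplification. It only recovers the raw arguments; it does not produce the conjugated forms, which is the step I am missing.

```python
def parse_template_call(wikitext: str) -> tuple[str, dict[str, str]]:
    """Split a flat {{Name|key=value|...}} call into its name and parameters.

    Assumption: no nested templates or [[links|with pipes]] inside the call.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    name, *parts = text[2:-2].split("|")
    params: dict[str, str] = {}
    positional = 1
    for part in parts:
        key, sep, value = part.partition("=")
        if sep:
            # Named (or explicitly numbered) parameter, e.g. "2=les"
            params[key.strip()] = value.strip()
        else:
            # Unnamed parameters are numbered 1, 2, 3, ... in order
            params[str(positional)] = part.strip()
            positional += 1
    return name.strip(), params


call = ("{{Deutsch Verb unregelmäßig|2=les|3=las|4=läs|5=gelesen"
        "|6=lies|7=-s|8=i|vp=ja|zp=nein|gerund=ja}}")
name, params = parse_template_call(call)
print(name)                      # Deutsch Verb unregelmäßig
print(params["3"], params["5"])  # las gelesen
```
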

Pointers to the code that performs this normalization would be hugely appreciated. I found a number of XML parsers for Wiktionary on GitHub, but none seem to cover verb conjugations, and other questions about Wiktionary don't seem to cover this either.

Many thanks in advance

  • I found some info on irregular verbs here: https://de.wiktionary.org/wiki/Vorlage:Deutsch_Verb_unregelm%C3%A4%C3%9Fig But I don't know if there is a complete set of templates with adequate documentation. – Klapaucius Sep 14 '20 at 22:44
  • More: https://github.com/zavsil/SAPA/blob/563fd3fe6645b86db9800b64ea922c96c3b9cb26/lib/CorrectorOrtografico/src/com/inet/jorthodictionaries/BookGenerator_de.java Also, a writeup at https://home.uni-leipzig.de/horst-rothe/verbform.htm It would still be nice to have an authoritative piece of code on GitHub etc. – Klapaucius Sep 15 '20 at 16:13

0 Answers