Are there any best-practices for config-file documentation, especially for python?
Particularly in scientific computing, it is common to use a config file as the input to control a batch processing job (such as a simulation), and expect the user to customise a substantial portion of the config for their scenario. (The config also likely selects among different processing modules, each possessing different suites of config fields.) Thus, the user ought to know: what each setting means or effects; which settings are unused (in which circumstances); what are the default values (and the permissible values or ranges); etc.
I've found incomplete config file docs to be common. The fundamental problem seems to be that if the docs are maintained separately from the code, they grow out of sync. (This seems less of a problem with API docs due to standard practices involving colocated docstrings and autogeneration from function signatures/argspec.) For example if the standard python configparser is used once to parse the config file, then the code for accessing individual attributes (and implicitly determining the config schema) may still be spread out across the entire code base (and perhaps only available at runtime rather than when building docs).
Further thoughts:
- Is it bad practice to replace a config file (yaml or similar) with a user-customised python script (so as to only need API docs)?
- Distribution of a well commented example config file (that is also used in automatic tests): how to maintain if different scenarios duplicate large sections but need some completely different fields?
- Can a single schema be maintained, both for use in code (to help parse, validate, and set defaults) and to generate docs somehow?
- Is there a human readable/writeable way of (des)serialising the state of some (sub)class instance that represents a new batch process (so that config is covered by existing docs)?