You are right about one thing being special about ML: data scientists are generally clever people, so they have no problem in presenting their ideas in code. The problem is that they tend to create fire&forget code, because they lack software development craftsmanship - but ideally this shouldn't be the case.
When writing code it shouldn't make any difference what the code is for1. ML is just another domain like anything else, and should follow clean code principles.
The most important aspect always should be SOLID. Many important aspects directly follow: maintainability, readability, flexibility, testability, reliability etc. What you can add to this mix of features is risk of change. It doesn't matter whether a piece of code is pure ML, or banking business logic, or audiological algorithm for a hearing instrument. All the same - the implementation will be read by other developers, will contain bugs to fix, will be tested (hopefully) and possibly refactored and extended.
Let me try to explain this in more detail while addressing some of your questions:
1,2) You shouldn't think that OOP is the goal in itself. If there is a concept that can be modeled as a class and this will make its usage easy for other developers, it will be readable, easy to extend, easy to test, easy to avoid bugs then of course - make it a class. But unless it's needed, you shouldn't follow the BDUF antipattern. Start with free functions and evolve into a better interface if needed.
4) Such complex inheritance hierarchies are typically created to allow implementation to be extensible (see "O" from SOLID). In this case, library users can inherit BaseEstimator
and it's easy to see what methods can they override and how this will fit into scikit's existing structure.
5) Almost the same principles as for code, but with people who will create/edit these config files in mind. What will be the easiest format for them? How to choose parameter names so it will be obvious what do they mean, even for a beginner, who is just starting to use your product?
All these things should be combined with guessing how likely is it that this piece of code will be changed/extended in the future? If you are sure something should be written in stone, don't worry about all aspects too much (e.g. focus only on readability), and direct your efforts to more critical parts of the implementation.
To sum up: think about people who will interact with what you create in the future. In case of products/config files/user interfaces it should be always "user first". In case of code, try to put yourself in the shoes of a future developer who will want to fix/extend/understand your code.
1 There are of course some special cases, like code that needs to be formally proven correct or extensively documented because of formal regulations and this main goal imposes some particular constructs/practices.