I think the right approach here is to separate the specification of what the tests need to do (the test cases, but also the setup and teardown activities) from the specification of how to run them (such a specification might be called the 'test execution strategy').
An additional reason for doing this is that often it isn't really possible to separate test cases from setup actions.
Rather, you have a bunch of actions you can perform: some test the application, some prepare for such tests, and some do both.
For instance, a web application workflow typically looks like this:
- open a particular URL
- log in as a specific user
- perform some activity by interacting with the site (this can be repeated indefinitely)
- (optionally) log out
All four steps are valid test cases on their own: they may fail in ways that imply the application is misbehaving.
However, they can't be run in isolation: step 1 must be executed prior to step 2, and 2 must be run prior to 3 or 4. The steps are dependent.
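To make this concrete, here is a sketch of those four steps as plain functions using Selenium's Python bindings; the URL, element locators, and page details are hypothetical placeholders:

```python
# Sketch of the four workflow steps with Selenium's Python bindings.
# The URL and locators are hypothetical; `driver` is a selenium
# webdriver instance created elsewhere.
from selenium.webdriver.common.by import By

def open_site(driver):                         # step 1
    driver.get("https://example.com/app")      # hypothetical URL
    assert "App" in driver.title               # fails if the site is down

def log_in(driver, user, password):            # step 2: needs step 1
    driver.find_element(By.ID, "username").send_keys(user)
    driver.find_element(By.ID, "password").send_keys(password)
    driver.find_element(By.ID, "submit").click()

def do_activity(driver):                       # step 3: needs step 2
    driver.find_element(By.LINK_TEXT, "New item").click()

def log_out(driver):                           # step 4: needs step 2
    driver.find_element(By.ID, "logout").click()
```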
Therefore it makes sense to try various execution strategies:
- Run all of this in sequence; on failure, blame the failure on the last step executed. So we get the test sequence 1, 2, 3, 4.
- For each step, create a separate test sequence to test that step: so we get the test sequences 1; 1, 2; 1, 2, 3; 1, 2, 4.
The second strategy could be called the 'autonomous' strategy. Other strategies are possible: e.g. you may want to run both 1, 2, 4 and 1, 2, 3, 4.
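As a sketch of how such strategies can be derived mechanically, suppose the 'must run immediately before' dependencies are encoded in a small map (this encoding is my assumption, not anything standardized):

```python
# deps[n] is the step that must run immediately before step n (None = entry).
deps = {1: None, 2: 1, 3: 2, 4: 2}

def prefix(step):
    """The chain of steps needed to reach `step`, including the step itself."""
    return ([] if deps[step] is None else prefix(deps[step])) + [step]

# First strategy: one long sequence; blame a failure on the last step executed.
sequential = [1, 2, 3, 4]

# 'Autonomous' strategy: one sequence per step under test.
autonomous = [prefix(s) for s in sorted(deps)]
# -> [[1], [1, 2], [1, 2, 3], [1, 2, 4]]
```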
The specification of test cases lists them all, together with the dependencies between them.
It is best to express dependencies as preconditions: specific conditions on the application state that must hold before a test step is executed. Some preconditions hold initially; others are postconditions (they will hold after a particular step executes successfully). A dependency is a precondition that is also a postcondition.
For instance, the state of being logged in is a precondition of steps 3 and 4 and a postcondition of step 2.
So you end up with a graph of test steps and conditions.
Actually, a hypergraph, as each step may be associated with multiple pre- and postconditions.
Furthermore, each step may have variable input and variable observable output; the output is what you'll use to determine the current state.
So the result is a Mealy machine if each step has at most one precondition and at most one postcondition; with multiple pre- and postconditions, it is something more akin to a colored Petri net or an abstract state machine.
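One possible (deliberately naive) encoding of that hypergraph is a table mapping each step to its sets of pre- and postconditions; the step and condition names are only illustrative:

```python
# Each step is a hyperedge from its preconditions to its postconditions.
steps = {
    "open_site":   {"pre": set(),                      "post": {"site_open"}},
    "log_in":      {"pre": {"site_open"},              "post": {"logged_in"}},
    "do_activity": {"pre": {"site_open", "logged_in"}, "post": {"item_created"}},
    "log_out":     {"pre": {"site_open", "logged_in"}, "post": {"logged_out"}},
}

def runnable(step, state):
    """A step can run once all of its preconditions hold in the current state."""
    return steps[step]["pre"] <= state
```

A fuller model would also let steps invalidate conditions (logging out removes `logged_in`), which is where the Petri-net view starts to pay off.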
If the application is well-designed, all relevant states you need to reach are both
- reachable: you can design and execute steps to take you there
- observable: you can design and execute steps such that their output will effectively tell you whether you've successfully reached that state or not
Then, testing basically consists of walking the hypergraph and observing the outputs to verify that all possible steps take the application into the expected state.
A test execution strategy is a systematic way to walk that hypergraph in order to make sure every possible step gets tested.
The objective is to test everything you need to test. But that doesn't necessarily mean the tests must be as autonomous as possible. If you can be confident that failure of a test sequence can always be blamed on the last step executed, the first strategy is also viable, and a lot faster.
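To illustrate, here is a minimal sketch of an execution strategy as a greedy walk over the `steps` table from above; it assumes postconditions simply accumulate, glossing over steps that invalidate conditions:

```python
# Greedy walk: execute every step whose preconditions hold, accumulating
# postconditions until no further step becomes runnable.
def walk(steps, run):
    state, done = set(), set()
    progress = True
    while progress:
        progress = False
        for name, spec in steps.items():
            if name not in done and spec["pre"] <= state:
                run(name)              # execute; blame a failure on `name`
                state |= spec["post"]
                done.add(name)
                progress = True
    return done                        # the steps that proved reachable

walk(steps, run=print)   # prints each step name as it gets "executed"
```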
Things may get more complicated:
- with more complex graphs, the choice between execution strategies also becomes more complex
- sometimes, what you want to test is not a particular step, but a particular path through the graph (but this implies there is a hidden precondition you should really specify)
- you may want to execute the same steps with different input data, or pick steps depending on the output of previous steps (see the sketch after this list)
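For the last point, parametrizing the step specification is a common answer; for example, with pytest's parametrize, reusing the step functions sketched earlier (the `driver` fixture and credentials are hypothetical):

```python
import pytest

@pytest.mark.parametrize("user,password", [
    ("alice", "secret1"),    # hypothetical test data
    ("bob",   "secret2"),
])
def test_login_then_activity(driver, user, password):
    open_site(driver)                 # step 1
    log_in(driver, user, password)    # step 2, with varying input
    do_activity(driver)               # step 3
```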
These complications are all the more reason for separating the specification of test steps from the specification of test execution. I think specifying these things, in one form or another, is far more important than whether or not your test sequences end up being as autonomous as possible.
So I'm a little surprised none of this is discussed in the Selenium documentation.