The link from camlspotter has a nice overview of ML history, and mentions an implementation by Luca Cardelli called "Cardelli's ML". I poked around for that, and found this paper: ML under Unix. Luca Cardelli describes an implementation of "ML", and I'm pretty sure this would predate Standard ML as it's dated 1983. This is the list of features in the abstract:
- interactive
- strongly typed
- polymorphic type system
- abstract data types
- exceptions
- modules
This is a pretty good list, although some of the items are vague. I think it could serve as an informal definition of the features a language needs in order to be "considered an ML"; however, a couple of things are worth noting.
The requirement that the system be "interactive" is a somewhat nit-picky implementation detail, perhaps specific to the implementation described in this paper. MLton, a Standard ML compiler, has no interactive REPL (it is a whole-program optimizing compiler), yet I doubt anyone would seriously suggest that the language MLton implements isn't ML.
Furthermore, "strongly typed" is pretty vague, so it's worth reading the rest of that paragraph for more context:
"Every ML expression has a type, which is determined statically. The type of an expression is usually automatically inferred by the system, without need of type definitions. The ML type system guarantees that any expression that can be typed will not generate type errors at run time. Static typechecking traps at compile-time a large proportion of bugs in programs."
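To make that paragraph concrete, here is a minimal sketch in OCaml (a modern ML, not Cardelli's implementation). The names `compose` and `twice` are my own and purely illustrative. Nothing is annotated, yet every expression gets a static, polymorphic type.

```ocaml
(* No type annotations anywhere: the compiler infers everything. *)
let compose f g x = f (g x)
(* inferred: ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)

(* Applying a function twice; inferred: ('a -> 'a) -> 'a -> 'a *)
let twice f = compose f f

(* The same polymorphic function is used at int and at string. *)
let n = twice (fun x -> x + 1) 0
let s = twice (fun str -> str ^ "!") "hi"

(* This line would be rejected by the typechecker, before the program
   ever runs, because "hi" is not an int:
   let bad = twice (fun x -> x + 1) "hi" *)
```

Here `n` evaluates to `2` and `s` to `"hi!!"`, while the commented-out `bad` never gets as far as run time.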
This list also doesn't mention pattern matching at all, although the paper itself does cover it. I don't know whether the ur-ML used in LCF had pattern matching or, if it didn't, how one would manipulate data types without it. I would argue that in 2013, a language with these features but lacking pattern matching would be a tough sell as an ML.
Note that Haskell mostly conforms to this list, if you squint a bit. But in practice it diverges enough that I think most people consider Haskell inspired by ML, but "not an ML", mostly because Haskell is pure and lazy while ML has historically been impure and strict. Furthermore, the ML module system, in both SML and OCaml, differs quite a bit from Haskell's, and neither ML has type classes.
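For anyone who hasn't seen it, this is roughly what pattern matching over a data type looks like in a present-day ML. It is a sketch in OCaml; the `expr` type and `eval` function are made up for illustration and are not taken from the paper.

```ocaml
(* A small data type for arithmetic expressions. *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr

(* Each case both tests which constructor was used and binds its fields. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

(* 1 + 2 * 3 *)
let result = eval (Add (Num 1, Mul (Num 2, Num 3)))
```

Here `result` is `7`. The compiler also warns if a constructor is left out of the `function`, which is a large part of why the feature feels essential.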
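As a rough illustration of the module-system difference, here is how one might do in OCaml what a Haskell `Show` type class does. This is only a sketch: `SHOW`, `ShowInt`, and `ShowList` are names I made up, not standard library modules. Where Haskell picks the instance implicitly from the type, ML has you build and name the module explicitly with a functor.

```ocaml
(* A signature plays the role of the class declaration. *)
module type SHOW = sig
  type t
  val show : t -> string
end

(* A structure plays the role of an instance. *)
module ShowInt : SHOW with type t = int = struct
  type t = int
  let show = string_of_int
end

(* A functor plays the role of "instance Show a => Show [a]". *)
module ShowList (S : SHOW) : SHOW with type t = S.t list = struct
  type t = S.t list
  let show xs = "[" ^ String.concat "; " (List.map S.show xs) ^ "]"
end

(* The "instance" must be constructed and chosen by hand. *)
module ShowIntList = ShowList (ShowInt)

let shown = ShowIntList.show [1; 2; 3]
```

Here `shown` is the string `"[1; 2; 3]"`.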
This isn't an exhaustive reply to all your questions, but I hope it helps nonetheless.