This is heavily opinion-based, and I wouldn't be surprised if the question got closed for that reason. This answer only touches on a few aspects of what you're asking about. It's a very broad topic, and it would be hard, if not impossible, to cover everything in one post without it running to several pages. However, to give you my perspective on the topic while trying to remain unbiased, the short answer is: it depends.
If you're asking what is used most often, it's likely going to be the HAL (previously StdPeriph) functions you've mentioned. The reason is that they get the job done in most common cases. After all, it always comes down to what the cost of creating a product is going to be. If HAL functions are "good enough" for the purpose, they're going to be used simply because they're faster to develop with. The higher the development cost, the more you'll want to cut it (or move it elsewhere), and using abstractions is one way of doing so.
However, even though I think it's safe to assume that HAL / StdPeriph / any other (including proprietary) abstraction layer is generally used, that's not always the case, for at least two reasons I can think of:
Existing functions may not be suitable for your purpose. Taking HAL as an example, it works pretty well for most common cases, but sometimes your needs may be so specific that you'll have to go and mess "under the hood", often ending up either writing your own variation of the functions or building something new on top of HAL. Personally, I can think of at least a few examples where HAL functions weren't exactly what I needed. That doesn't necessarily mean the library is bad; it's just that sometimes the requirements are very specific.
Messing with registers directly may sometimes be required for performance reasons. HAL and similar libraries are an abstraction layer and, as with any abstraction, they typically take more time to execute than using the registers directly, because of the extra function calls, argument checks and state handling. If you're trying to squeeze the absolute maximum out of a given peripheral, you'll sometimes have to go down to register level.
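To make the overhead argument a bit more concrete, here is a minimal host-side sketch in C. Everything in it is hypothetical: `mock_uart_t`, `mock_hal_transmit` and `mock_reg_transmit` are made-up names operating on an ordinary struct in RAM, not ST's HAL API or a real peripheral. It only illustrates the kind of extra work (argument checks, a busy flag, a timeout counter) that an abstraction layer tends to do around the same register write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped UART. On a real MCU such a struct
   would sit at the peripheral's base address; here it is ordinary RAM. */
typedef struct {
    volatile uint32_t SR;  /* status register */
    volatile uint32_t DR;  /* data register   */
} mock_uart_t;

/* "Transmit register empty" flag; the bit position is an assumption of this mock. */
#define MOCK_SR_TXE (1u << 7)

typedef enum { MOCK_OK, MOCK_ERROR, MOCK_BUSY, MOCK_TIMEOUT } mock_status_t;

/* Driver handle with a simple state flag, as an abstraction layer might keep. */
typedef struct {
    mock_uart_t *regs;
    int          busy;
} mock_uart_handle_t;

/* Abstraction-layer style: argument checks, state tracking and a timeout. */
mock_status_t mock_hal_transmit(mock_uart_handle_t *h, const uint8_t *data,
                                size_t len, uint32_t timeout_polls)
{
    if (h == NULL || h->regs == NULL || data == NULL || len == 0) {
        return MOCK_ERROR;
    }
    if (h->busy) {
        return MOCK_BUSY;
    }
    h->busy = 1;
    for (size_t i = 0; i < len; i++) {
        uint32_t polls = timeout_polls;
        /* Poll the flag, giving up once the poll budget is used up. */
        while ((h->regs->SR & MOCK_SR_TXE) == 0) {
            if (polls-- == 0) {
                h->busy = 0;
                return MOCK_TIMEOUT;
            }
        }
        h->regs->DR = data[i];
    }
    h->busy = 0;
    return MOCK_OK;
}

/* Register-level style: no checks and no state; the caller guarantees that the
   arguments are valid and that the flag will eventually be set. */
static inline void mock_reg_transmit(mock_uart_t *u, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        while ((u->SR & MOCK_SR_TXE) == 0) { /* busy-wait */ }
        u->DR = data[i];
    }
}

/* Self-checking demo: both paths end with the same byte in the data register. */
void demo_transmit(void)
{
    mock_uart_t uart = { .SR = MOCK_SR_TXE, .DR = 0 };
    mock_uart_handle_t h = { .regs = &uart, .busy = 0 };
    const uint8_t msg[] = { 'o', 'k' };

    assert(mock_hal_transmit(&h, msg, sizeof msg, 1000u) == MOCK_OK);
    assert(uart.DR == 'k');

    uart.DR = 0;
    mock_reg_transmit(&uart, msg, sizeof msg);
    assert(uart.DR == 'k');
}
```

Both functions perform the same register writes; the first one simply does more around them. Whether that extra work matters depends on how tight your timing budget is, which is the trade-off described above.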
Now to the more biased portion of my answer. I can see why you ask this question. Coming from the PIC world, where flash and CPU clock cycles were more precious, it does make sense to use registers directly there. In the case of STM32, it's not as critical anymore. Having said that, you'll sometimes stumble upon opinions that "using registers is the only true way", but personally I find such discussions end up being purely academic. I see registers, and any abstraction built on top of them, as tools, and you should use the right tool for the job. Two examples of NOT using the right tools:
You use only registers as "the only right way", either because you believe it yourself or because you've been told so. Your products take twice as long (if not longer) to develop, and your code takes less space in flash (so now you use 46% of 1 MB flash instead of 48%). Code that is performance-critical meets its goals. Code that has relaxed execution time constraints is also super efficient, but that doesn't affect the end customer much, if at all. Your code is also less reusable: you find yourself rewriting the same portions of code over and over every time you release a new product for a new MCU family.
You only use HAL / any other similar abstraction because "you didn't pick such a powerful MCU just to have to go down to register level", or because you've been told you should never ever touch registers. You develop much faster, and you're able to release two products in the time it would take to release one using registers. However, when there are execution time constraints or transmission speeds you have to hit, you find yourself picking MCUs more powerful than should theoretically be needed. Sometimes you find yourself writing wrappers around HAL because it doesn't give you exactly the functionality you need, which feels like making things more complicated than they should be.
So after all, if there is one thing to take away from what I'm trying to say, it's that you should use what is suitable for the job on a case-by-case basis. In the case of STM32, you nowadays have 3 options: HAL (top abstraction level), HAL LL (the low-layer drivers, often simple wrapper functions around register accesses) or using registers directly. Which one you choose should come from what your requirements are.
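As a rough illustration of how these three levels relate, here is a small host-side sketch in C. All names in it (`mock_gpio_t`, `mock_ll_toggle`, `mock_hal_toggle`) are hypothetical and do not reproduce ST's actual HAL or LL signatures; the "port" is a plain struct in RAM rather than a real peripheral. The point is only that each level ends up doing the same register operation with a different amount of wrapping around it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mock of a GPIO port with just an output data register. */
typedef struct {
    volatile uint32_t ODR;  /* one bit per pin, pins 0..15 */
} mock_gpio_t;

#define MOCK_PIN5 (1u << 5)

/* Middle level: a thin inline wrapper that just names the register access. */
static inline void mock_ll_toggle(mock_gpio_t *port, uint32_t pin_mask)
{
    port->ODR ^= pin_mask;
}

/* Top level: validates its arguments and reports a status before delegating.
   Returns 0 on success and -1 on invalid arguments. */
int mock_hal_toggle(mock_gpio_t *port, uint32_t pin_mask)
{
    if (port == 0 || pin_mask == 0 || (pin_mask & ~0xFFFFu) != 0) {
        return -1;
    }
    mock_ll_toggle(port, pin_mask);
    return 0;
}

/* Self-checking demo that toggles the same pin at all three levels. */
void demo_three_levels(void)
{
    mock_gpio_t port = { .ODR = 0 };

    port.ODR ^= MOCK_PIN5;             /* registers directly: bit 5 is now set  */
    mock_ll_toggle(&port, MOCK_PIN5);  /* thin wrapper: bit 5 is cleared again  */
    assert(port.ODR == 0u);

    assert(mock_hal_toggle(&port, MOCK_PIN5) == 0);  /* checked call: bit 5 set */
    assert(port.ODR == MOCK_PIN5);

    /* An out-of-range mask is rejected and leaves the register untouched. */
    assert(mock_hal_toggle(&port, 0x10000u) == -1);
    assert(port.ODR == MOCK_PIN5);
}
```

In a sketch like this, the thin wrapper would normally compile down to the same code as the direct access, while the checked call trades a few extra instructions for safety and a uniform interface.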