I have a PySpark DataFrame with a lot of columns, and I want to select the ones whose names contain a certain string, plus a few other specific columns. For example:
df.columns = ['hello_world','hello_country','hello_everyone','byebye','ciao','index']
I want to select the columns whose names contain 'hello', and also the column named 'index', so the result will be:
['hello_world','hello_country','hello_everyone','index']
I want something like df.select('hello*', 'index'), where 'hello*' would act as a wildcard.
Thanks in advance :)
EDIT:
I found a quick way to solve it, so I answered my own question, Q&A style. If someone sees my solution and can provide a better one, I would appreciate it.
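For reference, here is a minimal sketch of one way the column list above could be built, assuming plain Python filtering over df.columns. It is not necessarily the solution mentioned in the edit; the helper name pick_columns and its parameters are made up for illustration.

```python
def pick_columns(columns, substring, extra=()):
    """Return names containing `substring`, plus any names listed in `extra`.

    The original column order is preserved. `pick_columns` is a hypothetical
    helper, not part of PySpark.
    """
    extra = set(extra)
    return [c for c in columns if substring in c or c in extra]


# Column names taken from the example in the question.
columns = ['hello_world', 'hello_country', 'hello_everyone', 'byebye', 'ciao', 'index']

selected = pick_columns(columns, 'hello', extra=['index'])
print(selected)

# With a real DataFrame (assumed here to be named df), the resulting list
# can be passed straight to select:
# df.select(pick_columns(df.columns, 'hello', extra=['index']))
```

Because the filtering happens on the plain list of names, this matches a substring anywhere in the name; using c.startswith(substring) instead would restrict it to a prefix, which is closer to what 'hello*' suggests.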