I have a large table with tweets containing a username and tweet text per row. The tweet text has often mentionings (@username). I want to extract all usernames and build a new table for a social network analysis, where each row has one mentioning.
Example rows:
|-------------------|--------------------------------------|
| username | tweet |
|-------------------|--------------------------------------|
| userA | great stuff @userC and @userB |
|-------------------|--------------------------------------|
| userB | thanks for mentioning @userE |
|-------------------|--------------------------------------|
Shall result in:
|-------------------|--------------------------------------|
| tweet_by | repied_to |
|-------------------|--------------------------------------|
| userA | userC |
|-------------------|--------------------------------------|
| userA | userB |
|-------------------|--------------------------------------|
| userB | userE |
|-------------------|--------------------------------------|
I found this question, but I could not figure out a solution with split() and regexp_extract: Transform data in Google bigquery - extract text, split it into multiple columns and pivoting the data