I have a large data frame called data_frame with two columns, PRE and STATUS, that look like this:
PRE STATUS
1_752566 GAINED
1_776546 LOST
1_832918 NA
1_842013 LOST
1_846864 GAINED
11_8122943 NA
11_8188699 GAINED
11_8321128 NA
23_95137734 NA
23_95146814 GAINED
What I would like is to create a new column, CHR, containing only the number(s) before the underscore, with each value kept on the same row as the PRE value it came from, like this:
PRE STATUS CHR
1_752566 GAINED 1
1_776546 LOST 1
1_832918 NA 1
1_842013 LOST 1
1_846864 GAINED 1
11_8122943 NA 11
11_8188699 GAINED 11
11_8321128 NA 11
23_95137734 NA 23
23_95146814 GAINED 23
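To make this first step concrete, here is a minimal sketch in plain Python. The language is only my choice for illustration, the list of tuples is a stand-in for data_frame, and the names rows, add_chr and rows_with_chr are made up for this example. It splits each PRE value at the first underscore and keeps the result on the same row:

```python
# Hypothetical stand-in for data_frame: a list of (PRE, STATUS) rows.
# None plays the role of NA here (an assumption for this sketch).
rows = [
    ("1_752566", "GAINED"),
    ("1_776546", "LOST"),
    ("1_832918", None),
    ("1_842013", "LOST"),
    ("1_846864", "GAINED"),
    ("11_8122943", None),
    ("11_8188699", "GAINED"),
    ("11_8321128", None),
    ("23_95137734", None),
    ("23_95146814", "GAINED"),
]


def add_chr(rows):
    """Return new rows with CHR (the integer before the first underscore) appended."""
    out = []
    for pre, status in rows:
        # Split only at the first underscore and keep the leading part.
        chr_value = int(pre.split("_", 1)[0])
        out.append((pre, status, chr_value))
    return out


rows_with_chr = add_chr(rows)

print("PRE STATUS CHR")
for pre, status, chr_value in rows_with_chr:
    print(pre, status if status is not None else "NA", chr_value)
```

Because CHR is computed row by row from PRE, it cannot get out of step with the original column.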
From here I'd like to group the rows by CHR and then find the sum of each group, meaning the number of rows that share each CHR value. If possible, I would like a new data table showing the sum for each group number, like this:
NUM SUM
1 5
11 3
23 2
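As a rough sketch of the grouping I mean, again in plain Python and only for illustration: chr_values below is the CHR column from my example typed out by hand, and count_by_group and summary are made-up names. The "sum" is just a count of rows per value:

```python
from collections import Counter

# CHR values from the example table, one per row (typed out by hand
# for this sketch rather than read from the real data frame).
chr_values = [1, 1, 1, 1, 1, 11, 11, 11, 23, 23]


def count_by_group(values):
    """Return (NUM, SUM) pairs sorted by NUM, where SUM is the row count."""
    counts = Counter(values)
    return sorted(counts.items())


summary = count_by_group(chr_values)

print("NUM SUM")
for num, total in summary:
    print(num, total)
```

Rows where STATUS is NA are still counted here, which matches the totals in the table above.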
I would then plot this to visualize the sums of each number, with NUM on the x-axis and SUM on the y-axis.