I am a bit stuck with a simple task of aggregating two example sets using rapidminer and need an advise on a process and possible operators.
I have two CSVs like:
key A B C D E
first 2 2 2 2 2
second 3 3 3 3 3
and
key D E F G H I J K
first 4 4 4 4 4 99 99 99
third 5 5 5 5 5 77 77 77
What I want to have is an aggregated dataset which is basically a Union of these two, but also I would like to make the 'key' attribute a uniqie one in the resulting dataset. Since the 'first' value appears in both datasets, I need to sum the corresponding values, so the resulting dataset would be the following:
key A B C D E F G H I J K
first 2 2 2 6 6 4 4 4 99 99 99
second 3 3 3 3 3 ? ? ? ? ? ?
third ? ? ? 5 5 5 5 5 77 77 77
Note that 'First D' and 'First E' cells have become 6 (2+4).
Question marks are missing values as produced by the Union operator and will be replaced later.
Thanks in advance.