I have two tables that their shared columns do not exactly match (differences in capital character or existence of some characters like comma,space and ...). How can I merge these two tables based on their shared column (in R, Knime, Excel-power query or sql)?
1 Answers
In your example Result table it's not clear where the row
gene1 | go3 | 14
comes from, because there's no entry for go3
in Table2. I'm assuming that's a mistake and you meant Table2 to include the row
go3 | 14
If that's correct, here's how to do this in KNIME:
The two Table Creator nodes just create the two tables with column names as shown in your example - replace these with your actual data sources. Cell Splitter splits column Goes
using a comma as the delimiter. The Unpivoting node is configured like this:
and the Joiner like this:
All other settings were left as default. Add nodes to reorder and filter the columns in the Joiner output if you need to. Note that you'll see different Goes_Arr[n]
columns depending on how many different values of Goes
there are - the Enforce exclusion
and Enforce inclusion
settings make sure that Unpivoting handles this correctly.
This workflow should cope with whitespace between the commas, but I think you also mention differences in capital letters - if you need to handle these, pass each table through a Case Converter node to make them consistent.
Pivoting and unpivoting are hard to understand (IMHO - especially given the cryptic descriptions of their KNIME nodes) but very powerful. I recommend taking time to play around with these nodes to figure out how they work.

- 5,988
- 1
- 20
- 27