I have an array of 1.5 million pairs of elements (the two elements of each pair are separated by a single space):
$array {
[0] => "element1 element2"
[1] => "element2 element3"
[2] => "element8 element4"
[3] => "element8 element5"
[4] => "element4 element5"
[5] => "element6 element7"
[6] => ...
}
Each pair of elements is unique, and the elements are strings of 15 to 20 characters.
In my pipeline, each entry expresses a relation: [0] means "element1 is related to element2", [1] means "element2 is related to element3", and so on. I would like to cluster together all elements that are related, directly or through other elements, and get an output similar to:
$array_output {
[0] => "element1 element2 element3"
[1] => "element8 element4 element5"
[2] => "element6 element7"
[3] => ...
}
I guess this task is very simple and I'm probably missing an obvious way to do it, but I haven't found a fast way to cluster my elements (i.e. one that takes anywhere from a few minutes to a few hours).
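
To make the expected behaviour concrete: the grouping described above can be read as finding the connected components of a graph whose nodes are the elements and whose edges are the pairs. Below is a minimal Python sketch of that reading using a union-find (disjoint-set) structure. It is not taken from my pipeline; the function name `cluster_pairs` is hypothetical, and it assumes every entry holds exactly two whitespace-separated elements.

```python
def cluster_pairs(pairs):
    """Group elements that are linked, directly or transitively, by the pairs."""
    parent = {}  # element -> parent element; a root is its own parent

    def find(x):
        # Walk up to the root of x's cluster.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path straight at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for pair in pairs:
        a, b = pair.split()  # assumes exactly two elements per entry
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    # Collect members per root; dicts keep insertion order, so elements
    # appear in the order they were first seen in the input.
    clusters = {}
    for element in parent:
        clusters.setdefault(find(element), []).append(element)
    return [" ".join(members) for members in clusters.values()]


pairs = [
    "element1 element2",
    "element2 element3",
    "element8 element4",
    "element8 element5",
    "element4 element5",
    "element6 element7",
]
for line in cluster_pairs(pairs):
    print(line)
```

On the six sample pairs above this prints the three clusters shown in `$array_output`. Each pair is processed once and the lookups stay short thanks to path compression, so the running time should grow close to linearly with the number of pairs, although I have not measured it on the full 1.5 million entries.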