At a certain point in my code I have two differently typed Datasets. I need data from one to filter data from the other. Assuming the code up to this point cannot be changed, is there any way to do what I describe in the comment below without collecting all the data from report2Ds and using it inside the Spark function?
Dataset<Report1> report1Ds ...
Dataset<Report2> report2Ds ...
report1Ds.map((MapFunction<Report1, Report3>) report -> {
    String company = report.getCompany();
    // get data from report2Ds where report2.getEmployeer().equals(company);
}, kryo(Report3.class));
Any suggestions, or even advice on a better design to avoid cases like this, would be much appreciated.