
I want to find all Employee_ID values that appear in DF_2015 but not in DF_2014, comparing the two DataFrames on the "Employee_ID" column. The goal is to analyze all 2015 new hires.

The problem is that the two DataFrames have different lengths, although they have the same number of columns and the same column names. DF_2015 has 1219 rows, while DF_2014 has 1356.

I have tried resetting the index on both DataFrames.

The following is the code that I attempted:

DF_14_15 = np.where(DF_2015['Employee_ID'] != DF_2014['Employee_ID'])

I am getting the following error:

~\Anaconda3\lib\site-packages\pandas\core\series.py in _cmp_method(self, other, op)
   5494
   5495         if isinstance(other, Series) and not self._indexed_same(other):
-> 5496             raise ValueError("Can only compare identically-labeled Series objects")
   5497
   5498         lvalues = self._values

ValueError: Can only compare identically-labeled Series objects

The "Employee_ID" column is of integer type.
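For reference, the error can be reproduced with a minimal sketch like the one below. The toy frames and their values are made up for illustration; they only mimic the situation of two frames with different lengths and freshly reset indexes.

```python
import pandas as pd

# Hypothetical stand-ins for the real data; the lengths differ on purpose.
DF_2014 = pd.DataFrame({"Employee_ID": [101, 102, 103, 104]})
DF_2015 = pd.DataFrame({"Employee_ID": [102, 104, 105]})

try:
    # Element-wise comparison requires both Series to carry the same index labels.
    DF_2015["Employee_ID"] != DF_2014["Employee_ID"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Both indexes start at 0 here, but one runs to 3 and the other to 2, so the labels are not identical.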

MikeA
  • I do believe I asked the [question](https://stackoverflow.com/q/25435229/2336654) that led to this error message being implemented... maybe. Anyhow, you need to ensure your indices are the same for both series. Or both dataframes in this case. The error is stating that your labels are not the same. The labels are the index values. – piRSquared Apr 13 '22 at 00:37
  • The DataFrame lengths are different, but the index is reset - they both start at 0. – MikeA Apr 13 '22 at 00:51
  • Not good enough. As the error says, they need to be identical. Try `DF_2015['Employee_ID'] != DF_2014['Employee_ID'].reindex(DF_2015.index)` – piRSquared Apr 13 '22 at 03:57
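The `reindex` suggestion in the comment above can be sketched as follows, using made-up toy frames in place of the real data. Note that aligning the labels gives a row-by-row comparison, which may not be what the stated goal needs. The `isin` membership test shown afterwards is one possible alternative for finding IDs present in 2015 but absent from 2014; it rests on the assumption that a "new hire" means exactly that, which the thread does not confirm.

```python
import pandas as pd

# Hypothetical toy data standing in for the real frames.
DF_2014 = pd.DataFrame({"Employee_ID": [101, 102, 103, 104]})
DF_2015 = pd.DataFrame({"Employee_ID": [102, 104, 105]})

# Align the 2014 Series to the 2015 index so both carry identical labels.
aligned_2014 = DF_2014["Employee_ID"].reindex(DF_2015.index)

# The comparison no longer raises, but it compares row 0 with row 0, row 1 with row 1, etc.
positional_diff = DF_2015["Employee_ID"] != aligned_2014

# Membership test: keep the 2015 rows whose ID never occurs anywhere in 2014.
is_new = ~DF_2015["Employee_ID"].isin(DF_2014["Employee_ID"])
new_hires_2015 = DF_2015[is_new]

print(positional_diff.tolist())
print(new_hires_2015["Employee_ID"].tolist())
```

In this toy case every positional comparison is `True`, because the rows simply do not line up, while the membership test isolates the single ID that is absent from 2014.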

0 Answers