1

Hello this is my first Question and I tried everything.

I'm new to Dataframe and SQL and I have an excel table with multiple Movies. It has the Columns Movie Name, Year ,...,and Actors. In the Column Actors are multiple Actors listet, which are separated by , or by /. I created a Dataframe Movies which have a UUID, and all the information I should store for a Movie. I also filtered all Actors which are in all my Movies and created a DataFrame called Actors with also a Unique ID and Name and stored them individually.

DF_MOVIE:

Movie ID Actors
x Person1,Person2,Person5
y Person1

DF_ACTORS:

ActorID Actor
a Person1
b Person 5

What I want to create is a new Dataframe which shows the Relationship between them.

For Example:

DF_ACTORS_MOVIE:

MOVIEID ACTORID
x a
x b

The Problem is: there Actors in my Excel table (and Bookdf). And I want to take both IDS.

Does anybody have an Idea? would appreciate, Thanks..:

I tried iterating both the Dataframe books and tried to create a new Entry on the DF_ACTORS_MOVIE Dataframe with both UUIDS. The Problem is, that there are multiple Actors and they are divided by , and by /.

Jamiu S.
  • 5,257
  • 5
  • 12
  • 34
Jychnn
  • 13
  • 2

1 Answers1

0

You can first explode the Actors into a new column:

out = df_actors.merge(df_movie.assign(Actor=df_movie['Actors'].str.split(','))
                                            .explode('Actor').drop(columns='Actors')
                      )

Output:

  ActorID    Actor Movie ID
0       a  Person1        x
1       a  Person1        y
2       b  Person5        x
mozway
  • 194,879
  • 13
  • 39
  • 75
  • It shows me this error: pandas.errors.MergeError: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False – Jychnn Dec 20 '22 at 13:06
  • Which pandas version do you have? This work fine in the lastest release. Ensure `df_actors` has an "Actor" column – mozway Dec 20 '22 at 13:20