
How do I extract the year from every value in the 'release date' column (e.g. 01-Jan-1995) and assign it to a 'Year' column in the same dataframe?

One suggested method is to use str.split().

    movie id    movie title release date    unknown Action  Adventure   Animation   Childrens   Comedy  Crime   ... Film-Noir   Horror  Musical Mystery Romance Sci-Fi  Thriller    War Western movie genre
0   1   Toy Story   01-Jan-1995 0   0   0   1   1   1   0   ... 0   0   0   0   0   0   0   0   0   3
1   2   GoldenEye   01-Jan-1995 0   1   1   0   0   0   0   ... 0   0   0   0   0   0   1   0   0   3
2   3   Four Rooms  01-Jan-1995 0   0   0   0   0   0   0   ... 0   0   0   0   0   0   1   0   0   1
3   4   Get Shorty  01-Jan-1995 0   1   0   0   0   1   0   ... 0   0   0   0   0   0   0   0   0   3
4   5   Copycat 01-Jan-1995 0   0   0   0   0   0   1   ... 0   0   0   0   0   0   1   0   0   3
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1676    1678    Mat' i syn  06-Feb-1998 0   0   0   0   0   0   0   ... 0   0   0   0   0   0   0   0   0   1
1677    1679    B. Monkey   06-Feb-1998 0   0   0   0   0   0   0   ... 0   0   0   0   1   0   1   0   0   2
1678    1680    Sliding Doors   01-Jan-1998 0   0   0   0   0   0   0   ... 0   0   0   0   1   0   0   0   0   2
1679    1681    You So Crazy    01-Jan-1994 0   0   0   0   0   1   0   ... 0   0   0   0   0   0   0   0   0   1
1680    1682    Scream of Stone (Schrei aus Stein)  08-Mar-1996 0   0   0   0   0   0   0   ... 0   0   0   0   0
Mehdi
    Does this answer your question? [python pandas extract year from datetime --- df\['year'\] = df\['date'\].year is not working](https://stackoverflow.com/questions/30405413/python-pandas-extract-year-from-datetime-dfyear-dfdate-year-is-not) – MrNobody33 Aug 18 '20 at 22:45

1 Answer

You're correct: one way to do it is with str.split(...), like this:

df["Year"] = df["release date"].str.split("-").str[-1]

Or, you could use the pandas datetime functionality. First, you cast the data type to datetime, and then you access "year" using the dt accessor:

df["Year"] = pd.to_datetime(df["release date"]).dt.year

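Note that the first method leaves "Year" as a string (add .astype(int) if you need a number), while the second method gives an integer "Year". If any release date is missing, the second method returns floats with NaN for those rows instead.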
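To compare the two approaches side by side, here is a minimal sketch on three rows taken from the question's table. The column names Year_split and Year_dt are hypothetical, chosen only to keep both results visible. The explicit format="%d-%b-%Y" and errors="coerce" arguments are my additions rather than part of the original answer, and %b assumes English month abbreviations.

```python
import pandas as pd

# Three rows copied from the question's table
df = pd.DataFrame({
    "movie title": ["Toy Story", "Sliding Doors", "You So Crazy"],
    "release date": ["01-Jan-1995", "01-Jan-1998", "01-Jan-1994"],
})

# Method 1: split on "-" and keep the last piece, then cast the string to int
df["Year_split"] = df["release date"].str.split("-").str[-1].astype(int)

# Method 2: parse the strings as dates, then read the year via the .dt accessor.
# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(df["release date"], format="%d-%b-%Y", errors="coerce")
df["Year_dt"] = parsed.dt.year

print(df[["release date", "Year_split", "Year_dt"]])
print(df.dtypes)
```

Passing an explicit format should make parsing stricter and more predictable than letting pandas infer it, but it will reject any row that deviates from the day-month-year pattern, so check for NaT afterwards if the data is not uniform.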

Steven Rouk