I wrote some code to turn a list of strings into datetimes:

import pandas as pd

s = pd.Series(["14 Nov 2020", "14/11/2020", "2020/11/14",
               "Hello World", "Nov 14th, 2020"])
s_dates = pd.to_datetime(s, errors='coerce', exact=False)
print(s_dates)

It produced the following output:

0   2020-11-14
1   2020-11-14
2   2020-11-14
3          NaT
4   2020-11-14
dtype: datetime64[ns]

How would I obtain just the year from this?

jdona13

2 Answers

Since your series s_dates has dtype datetime64[ns], you can use Series.dt.year directly:

print(s_dates.dt.year)

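This returns a series containing only the year. Because your series contains NaT, the result has dtype float64, with NaN in place of the missing value; a series with no missing dates gives an integer dtype instead.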

Check the documentation for more useful datetime transformations.
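
If whole-number years are needed despite the missing value, one option is sketched below. The small series here is a hypothetical stand-in for s_dates, built from ISO strings so that it parses the same way regardless of pandas version; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for s_dates: parsed dates with one missing value (NaT).
s_dates = pd.Series(pd.to_datetime(["2020-11-14", None, "2019-03-01"]))

# NaT has no year, so it becomes NaN and the result is float64.
years = s_dates.dt.year

# Option 1: nullable integer dtype, which keeps the row and shows <NA>.
years_nullable = years.astype("Int64")

# Option 2: drop the missing dates first, which gives a plain integer dtype.
years_present = s_dates.dropna().dt.year

print(years_nullable)
print(years_present)
```

The nullable Int64 dtype keeps the result aligned with the original index, which is usually what you want if the years go back into the same DataFrame.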

hc_dev

Assuming the year is always written with 4 digits, you can try str.extract on the original strings (s in the question, not the parsed s_dates):

year = s.str.extract(r'(\d{4})', expand=False)
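
As a rough sketch of how this behaves on the question's data, the snippet below applies the regex to the raw strings and converts the matches to integers. The names year_str and year are illustrative, and the pattern simply takes the first run of four digits, so it assumes no other four-digit number comes before the year.

```python
import pandas as pd

s = pd.Series(["14 Nov 2020", "14/11/2020", "2020/11/14",
               "Hello World", "Nov 14th, 2020"])

# expand=False returns a Series rather than a one-column DataFrame.
year_str = s.str.extract(r"(\d{4})", expand=False)

# Rows without a match ("Hello World") are missing; a nullable integer
# dtype keeps them as <NA> instead of forcing floats.
year = pd.to_numeric(year_str).astype("Int64")

print(year)
```

Unlike the dt.year approach, this never validates that the string is a real date, so it suits cases where only the year digits matter.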
Tim Biegeleisen