# How to Use Excel's YEAR Function in Pandas

Excel's YEAR function extracts the year as a number from a time value. It's especially useful when working with large datasets where you need to analyze data at yearly granularity.

This page explains how to implement Excel's YEAR function in Python using pandas.

## Implementing the Year Extraction function in Pandas

Recreating Excel's YEAR function behavior in Python requires a combination of pandas operations. Here are some common implementations:

### Extracting Year from Datetime

In Excel, you can use the =YEAR function to extract the year from a datetime. Similarly, in pandas, you can use the `.dt` accessor followed by the `year` attribute to extract the year.

For example, in Excel you might use =YEAR(A2). In pandas:

`df['Year'] = df['Datetime_Column'].dt.year`
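As a minimal, self-contained sketch of the same idea, the snippet below builds a small hypothetical DataFrame (the sample dates are made up for illustration) whose `Datetime_Column` already has a datetime64 dtype, then extracts the year:

```python
import pandas as pd

# Hypothetical sample data; the column is already a datetime64 column
df = pd.DataFrame({
    'Datetime_Column': pd.to_datetime([
        '2021-03-15 08:30:00',
        '2022-11-02 17:45:00',
        '2023-10-05 12:45:00',
    ])
})

# Equivalent of filling =YEAR(A2) down the whole column
df['Year'] = df['Datetime_Column'].dt.year
print(df)
```

The result is an integer column, so it can be filtered or compared like any other number, for example `df[df['Year'] >= 2022]`.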

### Use Mito's YEAR function

Mito is an open source library that lets you write Excel formulas in Python. Either write the formula directly in Python or use the YEAR formula in the Mito Spreadsheet and generate the equivalent Python code automatically.

Mito's YEAR function works exactly like it does in Excel. That means you don't need to worry about managing data types, handling errors, or the edge case differences between Excel and Python formulas.

Install Mito to start using Excel formulas in Python.

```
# Import the mitosheet Excel functions
from mitosheet.public.v3 import *
# Use Mito's YEAR function
# Note: no need to convert the Date columns to a datetime
# because Mito's YEAR formula does so automatically
df['year'] = YEAR(df['Date'])
```

### Converting string to datetime and then extracting year

Often, pandas will infer the data type of your column as string, even if the data looks like a date to you, e.g. 1/2/23. In these cases, you need to convert the string to datetime before extracting the year.

To do this in pandas, first use `pd.to_datetime` to convert the column to a datetime column, and then extract the year:

```
# Convert the string to datetime
df['Datetime_Column'] = pd.to_datetime(df['String_Column'])
# Extract the year from the datetime column
df['Year'] = df['Datetime_Column'].dt.year
```
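A string like 1/2/23 is ambiguous (January 2nd or February 1st), so it can help to state the layout explicitly. The sketch below assumes the strings are month/day/two-digit-year and passes that as the `format` argument of `pd.to_datetime`; the sample values are hypothetical:

```python
import pandas as pd

# Hypothetical string dates, assumed to be month/day/two-digit-year
df = pd.DataFrame({'String_Column': ['1/2/23', '12/31/22', '7/4/21']})

# Spell out the layout instead of relying on inference
df['Datetime_Column'] = pd.to_datetime(df['String_Column'], format='%m/%d/%y')

# Extract the year from the parsed datetime column
df['Year'] = df['Datetime_Column'].dt.year
print(df)
```

If your data is day-first instead, swap the directives (`'%d/%m/%y'`) so the month and day are not silently exchanged.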

### Grouping Data by Year

There are situations where you want to aggregate data based on year. In Excel, you might use a pivot table after extracting the year. Similarly, in pandas, after extracting the year, you can use the `groupby` method.

For example, if you have a column called 'Date' and a column called 'Website Traffic', you might want to group the data by year and sum the traffic for each year.

```
df['Year'] = df['Date'].dt.year
grouped_data = df.groupby('Year').agg({'Website Traffic': 'sum'}).reset_index()
```
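To see the grouping end to end, here is a small runnable sketch with made-up traffic numbers (the data is hypothetical; the column names match the example above):

```python
import pandas as pd

# Hypothetical traffic data spanning two years
df = pd.DataFrame({
    'Date': pd.to_datetime(['2022-01-15', '2022-06-30', '2023-02-01', '2023-09-12']),
    'Website Traffic': [100, 150, 200, 250],
})

# Extract the year, then sum traffic within each year
df['Year'] = df['Date'].dt.year
grouped_data = df.groupby('Year').agg({'Website Traffic': 'sum'}).reset_index()
print(grouped_data)
```

Calling `reset_index()` turns the year back into a regular column, which is closer to how a pivot table output looks in Excel.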

## Common mistakes when using YEAR in Python

While implementing the YEAR function equivalent in pandas, a few common pitfalls might occur. Here's how to navigate them.

### Incorrect datatypes

The `.dt` accessor is only available on pandas Series with a datetime-like data type such as datetime64. Using it on a non-datetime column, such as a column of strings, will raise an AttributeError.

For example, if you have a column called 'Date', but it actually has an object data type, you'll need to convert it to datetime before using the `.dt` accessor. You can check the data type of a column using `df.dtypes`.

```
# Ensure the column is of datetime dtype
df['Datetime_Column'] = pd.to_datetime(df['Datetime_Column'])
df['Year'] = df['Datetime_Column'].dt.year
```
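One way to guard against this is to check the dtype and convert only when needed. The sketch below uses `pd.api.types.is_datetime64_any_dtype` for the check; the sample data is hypothetical and the `raised` flag exists only to demonstrate the error:

```python
import pandas as pd

# Hypothetical column that looks like dates but holds strings
df = pd.DataFrame({'Date': ['2023-01-05', '2024-07-19']})

# Using .dt on a string column raises AttributeError
try:
    df['Date'].dt.year
    raised = False
except AttributeError:
    raised = True
print('AttributeError raised:', raised)

# Convert only if the column is not already a datetime column
if not pd.api.types.is_datetime64_any_dtype(df['Date']):
    df['Date'] = pd.to_datetime(df['Date'])

df['Year'] = df['Date'].dt.year
print(df.dtypes)
```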

### Forgetting to Handle Null Values

If your dataset has missing or NaT (Not a Time) values in the datetime column, extracting the year from them produces NaN (Not a Number) values, and the resulting year column becomes a float column instead of an integer one. Make sure to handle or filter them out as necessary.

```
# Drop rows with NaT values before extracting year
df.dropna(subset=['Datetime_Column'], inplace=True)
df['Year'] = df['Datetime_Column'].dt.year
```
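If dropping rows is not an option, an alternative sketch is to keep the missing rows and store the year in pandas' nullable `Int64` dtype. The example assumes unparseable strings should become NaT, which is what `errors='coerce'` does; the sample data is hypothetical:

```python
import pandas as pd

# Hypothetical column with one valid date, one bad string, and one missing value
df = pd.DataFrame({'String_Column': ['2023-01-05', 'not a date', None]})

# errors='coerce' turns anything unparseable into NaT instead of raising
df['Datetime_Column'] = pd.to_datetime(df['String_Column'], errors='coerce')

# With NaT present, .dt.year yields NaN for the missing rows
df['Year'] = df['Datetime_Column'].dt.year

# The nullable integer dtype keeps whole-number years and marks gaps as <NA>
df['Year_Int'] = df['Year'].astype('Int64')
print(df)
```

This keeps the row count unchanged, which matters if the DataFrame will be joined back to other data by position or index.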

## Understanding the Year Extraction Formula in Excel

The YEAR function in Excel returns the year of a time value.

=YEAR(serial_number)

### YEAR Excel Syntax

Parameter | Description | Data Type |
---|---|---|
serial_number | The time value from which you want to extract the year. | A valid Excel time |

### Examples

Formula | Description | Result |
---|---|---|
=YEAR("10/05/2023 12:45 PM") | Extracts the year from the given time. | 2023 |
=YEAR("8-June-1997 2:31 PM") | Extracts the year from the given time. | 1997 |
=YEAR("2/22/2004") | Extracts the year from the given time. | 2004 |
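For comparison, the same three inputs can be parsed in pandas. This is a sketch rather than an exact replica of Excel's parsing: the format strings are assumptions about how each text value is laid out (month-first for the slash-separated dates), and pandas does not guess them the way Excel does.

```python
import pandas as pd

# Each Excel example paired with an assumed strptime-style layout
examples = [
    ('10/05/2023 12:45 PM', '%m/%d/%Y %I:%M %p'),
    ('8-June-1997 2:31 PM', '%d-%B-%Y %I:%M %p'),
    ('2/22/2004', '%m/%d/%Y'),
]

# Parse each string to a Timestamp and read its year attribute
years = [pd.to_datetime(text, format=fmt).year for text, fmt in examples]
print(years)
```

The printed years should match the Result column of the table above.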
