Date-time functions for Python
Python offers a whole bunch of functions and libraries for working with DateTime data. I often use:
- Python’s datetime
- Pandas’s Timestamp
After working with a certain kind of data for a while, we end up using some functions more often than the rest. While the choice of the function depends entirely on one’s experience and comfort level, it pays to know your options. If you are working on time series with raw timestamps or some forms of DateTime objects delivered by various systems and devices, this article should brush-up the popular DateTime functions that will come in handy.
1. Strings from Dates and Date Formats
Here we discuss the python .strftime() function and pandas date_format options in excel and CSV writer of pandas.
Very often, if you save your pandas DataFrame into an excel or a text document, you are unhappy with the yyyy-mm-dd hh:mm:ss default format generated by pandas. This is how I control these outputs.
Let’s work with an example of a data frame with random dates and string codes. Here is the toy dataframe we will use:
start = datetime.datetime(2020,2,1)end = datetime.datetime(2020,2,10)dates = pd.date_range(start = start, end = end,freq=’D’).to_list()codes = [ "".join(choice(ascii_lowercase) for i in range(10)) for j
in range(10)]df = pd.DataFrame(columns=['date','code'])df.date = datesdf.code = codes
Obviously, the dates column is pretty informative but not necessarily pretty looking. We don’t want this magnanimous date on our output report, we’re happy with simply knowing the date in the format, say date/month/year.
If a csv or a text output is your goal, this is pretty easy to change. Just specify the format of output with kwarg date_format in your to_csv function.
When it comes to Excel, you have to be a tad bit careful than that, because excel has a mind of its own when it comes to formatting dates.
Of course, you can just right-click on your format cells and change the format to whatever form you want. But when it comes to slashes, excel is stubborn. It is not one of the format options that excel has to offer (probably because it looks ugly, duh!). And, it is worth not having to interfere with the output in excel each time when python has so much to offer.
python .strftime( ) way:
To achieve the flexibility equivalent to date_format in to_csv, we “trick” excel by converting our datetime object to a string using .strftime( ) and passing it on to excel. (Since we have a Series object, we need a .dt in between to access its values)
df.date = df.date.dt.strftime(“%d/%m/%y”)
use the pandas excel writer and specify the format like in to_csv ( and give your sheet a name as an added bonus.)
writer = pd.ExcelWriter(‘C:/location/writer_datetime.xlsx’,
2. Date from string
Here we discuss the .strptime( ) and to_datetime( ) functions, basically the sister functions to what we did earlier.
From the above steps, You now converted your datetime64 object with a set of Timestamp properties into a string series. (Pandas data type is “object” for a string or mixed series)
For some reason, you want your dates back. Say you want to get what month the date is in or what calender week it falls in. You are better off using the inbuilt DateTime functions than parsing a string that looks like datetime.
For this, we use the alter-ego of .strftime( ), which is .strptime( ) or the pandas to_datetime( ) functions, if that’s what you like better.
pandas’ to_datetime( ) way:
Note the format and corresponding short forms of what they mean. Say your date string looks different, like this column:
Did you note that %B represents expansion of the month?
Let’s convert this date string to a DateTime object to get the calendar week from this.
df['date_month'] = pd.to_datetime(df.date_month, format=’%d-%B-%y’)
Now that we have the datetime object back, let’s get calendar weeks using the pandas datetime function .week
df['calender_week'] = df.date_month.dt.week
Python’s strptime( ) way:
Pandas use of Timestamp.strptime( ) is deprecated and is replaced as seen above with a very effective to_datetime( ) which can be applied to an entire series using the accessor .dt.
However, if you have individual entities of timestamps that need to be formatted outside of a series, .strptime( ) is an effective python function.
import datetimedate_string = “2 March, 2020”dttime_obj = datetime.datetime.strptime(date_string, “%d %B, %Y”)
The isocalendar() method of datetime is a tuple containing calender week at position 1.
Those were some useful DateTime object functions in and outside of Pandas in Python. Hope this helps!
“I laugh at stuff for their pointlessness and smile at others because they don’t know, I live in Germany and help them make Audis”
Humorous by choice, data analyst by vocation, & top writer in books.
for future queries you can connect with her on Sruthi’s Profile: linkedin.com/in/sruthi-korlakunta-7a5b80121