If you are the kind of developer who prefers to work in UTC, you may have seen Python's datetime.utcnow() and datetime.utcfromtimestamp() methods and thought, "Ah, yes, this is what I should do to work in UTC!" But alas, this is not the best way to work with UTC datetimes. In fact, I would say that you will very rarely want to use either of these functions. Consider the following dangerous code:
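A minimal sketch of such a round trip, assuming it mirrors the corrected version shown later in this post (same ts, x and x_ts names). The warning filter and the try/except are additions for illustration only, so that the script runs to completion on any machine and on Python 3.12+, where utcfromtimestamp() is deprecated:

```python
import warnings
from datetime import datetime

ts = 1571595618.0

# utcfromtimestamp() is deprecated as of Python 3.12, so silence the
# DeprecationWarning to keep the demonstration quiet.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    x = datetime.utcfromtimestamp(ts)  # naive datetime holding UTC fields

# timestamp() treats a naive datetime as *local* time, not UTC.
x_ts = x.timestamp()

try:
    assert ts == x_ts, f"{ts} != {x_ts}"
except AssertionError as exc:
    # Reached whenever the local UTC offset at this instant is non-zero.
    print(f"AssertionError: {exc}")
```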
When executed with your system time zone set to UTC, this will succeed just fine, but when executed in any time zone where the offset at that particular timestamp is something other than 0, the assertion fails. For example, when executed with the system time zone set to America/New_York, you'll get AssertionError: 1571595618.0 != 1571610018.0.
This is due to an unfortunate quirk of history and a subtle shift in what it means for a datetime to be naive that took place in the Python 2 to 3 transition. I imagine that these functions would not exist if the datetime library were redesigned today, but at the moment there is a mix of harmful and harmless uses of them out there, and it's not a simple matter to rip them all out.
Rather than make you stick around for a history lesson as to why this problem exists, I'm going to spoil the ending and say that the right thing to do is to pass a UTC object to the tz parameter of now() and fromtimestamp(), respectively, to get a time zone-aware datetime:
from datetime import datetime, timezone
ts = 1571595618.0
x = datetime.fromtimestamp(ts, tz=timezone.utc)
x_ts = x.timestamp()
assert ts == x_ts, f"{ts} != {x_ts}"  # This assertion succeeds
Naive datetimes as local time
When originally conceived, naive datetimes were intended to be abstract, not representing any specific time zone, and it was up to the program to determine what they represent. This is no different from abstract numbers, which can represent mass in kilograms, distance in meters or any other specific quantity according to the programmer's intention. By contrast, aware datetimes represent a specific point in time in a specific time zone. Awareness of the datetime's time zone allows you to do things like arithmetic and comparison between time zones, conversion to other time zones and other operations which require a concrete datetime.
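To make the distinction concrete, here is a small sketch of what awareness buys you. The fixed -04:00 offset and the variable names are assumptions chosen for illustration, not anything from the datetime documentation:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones, used here only for illustration.
utc = timezone.utc
minus_four = timezone(timedelta(hours=-4))

noon_utc = datetime(2015, 5, 1, 12, 0, tzinfo=utc)
eight_am_minus_four = datetime(2015, 5, 1, 8, 0, tzinfo=minus_four)

# Aware datetimes compare by the instant they represent.
same_instant = noon_utc == eight_am_minus_four

# Subtraction across zones yields the real elapsed time (4 hours here).
elapsed = datetime(2015, 5, 1, 12, 0, tzinfo=minus_four) - noon_utc

# Conversion changes the wall-clock fields, not the instant.
converted = noon_utc.astimezone(minus_four)

# A naive datetime carries no zone, so ordering it against an aware one fails.
naive = datetime(2015, 5, 1, 12, 0)
try:
    naive < noon_utc
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

The last step is the abstract/concrete split in action: Python refuses to guess what a naive value means when it is ordered against a concrete one.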
In Python 3, two things have changed that make utcnow unnecessary and, in fact, dangerous. The first is that a concrete time zone class, datetime.timezone, was introduced, along with a constant UTC object, datetime.timezone.utc. With this change, you now have a clear and unambiguous way to mark which of your datetimes are in UTC without bringing in third party code or implementing your own UTC class.
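As a quick illustration of marking a datetime as UTC with nothing but the standard library (the variable names here are just placeholders):

```python
from datetime import datetime, timezone

# Attach the built-in UTC object directly in the constructor.
dt = datetime(2019, 10, 20, 18, 20, 18, tzinfo=timezone.utc)

label = dt.isoformat()   # the offset is part of the string representation
offset = dt.utcoffset()  # a zero timedelta for UTC
stamp = dt.timestamp()   # does not depend on the machine's time zone
```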
The change that made utcnow dangerous is that naive datetimes underwent a subtle shift in meaning: for certain operations that require interpreting a datetime as a fixed point in time, rather than raising an error they now assume that the datetime represents local time in the system's time zone. So in Python 2, operations like astimezone() will raise an exception when called on a naive datetime:
>>> from datetime import datetime
>>> from dateutil import tz
>>> datetime(2015, 5, 1).astimezone(tz.UTC)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: astimezone() cannot be applied to a naive datetime
but in Python 3 it will use your system's local time zone (on my machine it's America/New_York) and convert accordingly:
>>> from datetime import datetime
>>> from dateutil import tz
>>> datetime(2015, 5, 1).astimezone(tz.UTC)
datetime.datetime(2015, 5, 1, 4, 0, tzinfo=tzutc())
This is why the example that I started this post off with fails. The .timestamp() method gives a representation of a fixed point in time, not a point on the calendar; it returns Unix time, which is the number of seconds since 1970-01-01T00:00:00 UTC, and if you call it on a naive datetime, Python will assume that that datetime represents your machine's local time, even if you originally intended it to be UTC.
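Here is a rough sketch of that difference, reading the same calendar fields once as a naive datetime and once as an aware UTC datetime. The use of calendar.timegm() as a cross-check is my own addition, and the variable names are placeholders:

```python
import calendar
from datetime import datetime, timezone

naive = datetime(2019, 10, 20, 18, 20, 18)

# Naive: interpreted as local time, so the result depends on the machine's zone.
local_reading = naive.timestamp()

# Aware: the attached tzinfo pins the instant, whatever the machine's zone is.
utc_reading = naive.replace(tzinfo=timezone.utc).timestamp()

# calendar.timegm() also reads the fields as UTC, which makes a handy cross-check.
cross_check = calendar.timegm(naive.timetuple())

# The gap between the two readings is the local UTC offset (in seconds)
# at that wall-clock time; it is zero only on a machine running in UTC.
gap = utc_reading - local_reading
```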
Conclusions
Even without the change in Python's model of what a naive datetime means, I would still recommend that you not use utcnow() or utcfromtimestamp() simply because it's the wrong abstraction: to do so would be to represent a concrete point in time as an abstract datetime. You know that your datetime represents UTC, and it's easy to mark that clearly in Python, so there's very little reason not to do it. As it says in the warning recently added to the documentation, you should prefer to use now in place of utcnow and fromtimestamp in place of utcfromtimestamp, so replace:
>>> dt_now = datetime.utcnow()
>>> dt_ts = datetime.utcfromtimestamp(1571595618.0)
with
>>> from datetime import timezone
>>> dt_now = datetime.now(tz=timezone.utc)
>>> dt_ts = datetime.fromtimestamp(1571595618.0, tz=timezone.utc)
or the equivalent using positional arguments.
One last thing to note: the reason that we cannot simply change utcnow() into an alias for now(timezone.utc) in the standard library is that doing so would change the semantics of how those datetimes are treated by their consumers (and as such it would not be backwards-compatible). You should keep this in mind when converting over old code that uses utcnow and utcfromtimestamp: you will need to make sure that any code that consumes your datetimes is expecting an aware datetime. In my experience, this is not a high bar to clear, but you probably don't want to just do a search-and-replace on untested code before deploying to production and leaving work for the weekend.
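To give a feel for what can break in a consumer, here is a sketch of mixing a legacy naive value with a new aware one. Both is_aware() and as_utc() are hypothetical helpers written for this example, not standard library functions, and as_utc() is only safe under the assumption that every naive value it sees really was meant as UTC:

```python
from datetime import datetime, timezone


def is_aware(dt: datetime) -> bool:
    """Return True if dt carries a usable UTC offset."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


def as_utc(dt: datetime) -> datetime:
    """Hypothetical migration helper: ASSUMES naive values are already UTC."""
    if is_aware(dt):
        return dt.astimezone(timezone.utc)
    return dt.replace(tzinfo=timezone.utc)


# A value shaped like the output of old utcnow()-style code.
legacy = datetime(2019, 10, 20, 18, 20, 18)
# The same instant produced the recommended way.
modern = datetime.fromtimestamp(1571595618.0, tz=timezone.utc)

# Mixing the two directly is an error, which is what an unprepared consumer hits.
try:
    modern - legacy
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Normalizing at the boundary makes the arithmetic well defined again.
delta = modern - as_utc(legacy)
```

A check like is_aware() at the edges of your code is one way to find the consumers that still expect naive values before they find you.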