How to stop thinking about time zones and start living
Does time play an important role in your system? Are your users or components spread around the globe, or at least across our vast homeland? Then you need time zones. It is not that hard: the hardest part is not getting confused, and that is what we will talk about. First you need to learn to think correctly. Once you do, everything else becomes either self-evident or simple enough.
Let's start with the clock. We are all used to telling the time by looking at the clock on the wall. When working with time zones, this time is called wall clock time. There is nothing wrong with it in principle, except that at the same moment clocks in different places on the globe show different times. If you set your mind to it, you can come up with an algorithm for translating wall clock time in one time zone into wall clock time in another. Usually you just add or subtract the difference in hours between the zones, except (attention!) around the transitions to and from daylight saving time. That is where the calculations become really complicated.
We need something simple and bulletproof, like ... an integer. This is how the concept of an instant in time (also called Unix time, POSIX time, or time since the (Unix) epoch) appeared: the number of seconds (milliseconds in Java) elapsed since January 1, 1970, 00:00:00 GMT. An instant is the same across the globe: if you imagine that someone pressed "pause" and the flow of time stopped, the number corresponding to that instant would be the same everywhere on Earth, regardless of time zone. If someone had pressed pause an hour after Greenwich rang in 1970, the instant everywhere on Earth would read 3,600 seconds (3,600,000 in Java's milliseconds). And right now, for example, it is already 1 280 720 431.859 seconds.
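To make the idea concrete, here is a minimal Python sketch using only the standard library. The two fixed offsets (+07:00 and -05:00) are illustrative assumptions rather than real time zones, and the instant from the text is truncated to whole seconds.

```python
from datetime import datetime, timedelta, timezone

# One hour after the epoch, counted in seconds (Java would report 3_600_000 ms).
one_hour = datetime.fromtimestamp(3600, tz=timezone.utc)

# The instant quoted in the text, truncated to whole seconds.
instant = 1_280_720_431
as_utc = datetime.fromtimestamp(instant, tz=timezone.utc)

# The same instant read from two wall clocks with illustrative fixed offsets.
plus7 = timezone(timedelta(hours=7))
minus5 = timezone(timedelta(hours=-5))
wall_plus7 = datetime.fromtimestamp(instant, tz=plus7)
wall_minus5 = datetime.fromtimestamp(instant, tz=minus5)

print(one_hour.isoformat())
print(as_utc.isoformat())
print(wall_plus7.isoformat(), wall_minus5.isoformat())

# The wall clocks disagree, but the underlying number is identical.
same_instant = wall_plus7.timestamp() == wall_minus5.timestamp() == instant
print(same_instant)
```

The wall clocks differ by twelve hours and even by calendar date, yet both values carry the same number underneath.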
So the instant in time is the universal hard currency of time computations. It depends only on, hmm, time; instants can be compared (to determine which of two events happened earlier), and none of this involves any nonsense about geographic location, time zones, or clock changes, which dramatically increases the reliability of such calculations. This is in fact how time is handled in Java (since version 1.1): java.util.Date is a wrapper around a long holding the instant (negative values correspond to dates before 1970), it is Comparable, and all human-calendar transformations have been moved out into the separate Calendar and DateFormat classes.
Now, about conversion. The number 1 280 720 431.859 says little to an ordinary person (although an inquisitive reader can work out from it when I wrote these lines), so you need to be able to translate an instant into wall clock time and, conversely, parse wall clock time back into an instant. These transformations require knowing the time zone, and the calculations are not at all trivial. Different countries, territories, and places not only have different offsets from Greenwich (GMT); the rules for these offsets have changed several times over the years and keep changing: daylight saving time is introduced or cancelled, time zones are merged (you have probably heard about such an initiative here). Take my hometown of Novosibirsk, which in the early nineties was moved from GMT+7 to GMT+6, and which at the beginning of the century had two time zones at once: the boundary ran along the Ob River, and the two banks kept different time. In short, so that nobody goes crazy, all this historical information is carefully maintained as a database, the Olson tz database, named after its founder Arthur David Olson, although its editor is Paul Eggert. In this database each major settlement has a code (Novosibirsk, for example, is Asia/Novosibirsk) and a list of all its time zone adventures since 1970. The database is used in many (all?) Linux / Unix / BSD systems (I cannot speak for Windows), in the Java Runtime Environment (some JRE updates consisted solely of tz database updates), and so on; see Wikipedia for details. We will not look at the algorithm that converts time using this database; we will assume it is available. And in fact it is available almost everywhere.
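As a hedged illustration of what the tz database gives you, the sketch below asks it for Novosibirsk's offset in two different years and converts the instant from the text into local wall clock time. It assumes Python 3.9+ with the standard zoneinfo module and a tz database available on the machine (from the operating system or the tzdata package); the helper name utc_offset_hours is made up for this example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

nsk = ZoneInfo("Asia/Novosibirsk")

def utc_offset_hours(year: int) -> float:
    """UTC offset of Novosibirsk, in hours, at noon UTC on 15 January of `year`."""
    moment = datetime(year, 1, 15, 12, tzinfo=timezone.utc)
    return moment.astimezone(nsk).utcoffset() / timedelta(hours=1)

# Winter offsets before and after the early-nineties change mentioned above.
for year in (1990, 2000):
    print(year, utc_offset_hours(year))

# The instant from the text as Novosibirsk wall clock time.
instant = datetime.fromtimestamp(1_280_720_431, tz=timezone.utc)
local = instant.astimezone(nsk)
print(local.isoformat())
```

The point is that the offset is a property of the zone at a particular instant, not a constant you can hard-code.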
So, let us formulate the rules for handling time in programs that run across several time zones:
- inside the program, work only with instants in time;
- convert instants to wall clock time only when reading or displaying a date. Remember that a time zone always (always!) takes part in this conversion, so keep track of which one (this is not always visible, because the current zone is used by default);
- another case where wall clock time is needed is calendar arithmetic (computing the start of the next day, and so on). Here too you must make sure the calculation happens in the correct time zone;
- when saving a date/time to the database, do it in the UTC time zone.
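The rules above can be sketched roughly as follows. This is only one possible shape, assuming Python 3.9+ with zoneinfo and an available tz database; the function names parse_input, format_output, and start_of_next_day are hypothetical helpers invented for the illustration.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

def parse_input(text: str, zone: ZoneInfo) -> datetime:
    """Input boundary: wall clock text plus an explicit zone becomes an instant (UTC)."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=zone).astimezone(timezone.utc)

def format_output(instant: datetime, zone: ZoneInfo) -> str:
    """Output boundary: the instant is rendered on the wall clock of `zone`."""
    return instant.astimezone(zone).strftime("%Y-%m-%d %H:%M %z")

def start_of_next_day(instant: datetime, zone: ZoneInfo) -> datetime:
    """Calendar arithmetic is done on the wall clock of `zone`, then converted back."""
    local_date = instant.astimezone(zone).date()
    next_midnight = datetime.combine(local_date + timedelta(days=1), time.min, tzinfo=zone)
    return next_midnight.astimezone(timezone.utc)

nsk = ZoneInfo("Asia/Novosibirsk")
moscow = ZoneInfo("Europe/Moscow")

entered = parse_input("2010-08-02 10:40", nsk)   # typed by a user in Novosibirsk
shown = format_output(entered, moscow)           # displayed to a user in Moscow
next_day_nsk = start_of_next_day(entered, nsk)
next_day_moscow = start_of_next_day(entered, moscow)

print(entered.isoformat())
print(shown)
print(next_day_nsk.isoformat(), next_day_moscow.isoformat())
```

Note that "the start of the next day" is a different instant for each zone, which is exactly why the zone has to be passed explicitly.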
There are different ways to store time in the database unambiguously across clock changes: keep a summer/winter time flag in a separate column, or store instants in a column of the NUMBER type. The least radical and simplest, it seems to me, is to store the date/time in UTC. UTC has no daylight saving time, so conversions between wall clock time and instants are always unambiguous. Besides letting you reliably store every instant in the database, including those that fall on clock changes, this approach also:
- disciplines you (if you forget to specify the time zone somewhere in a conversion, you will immediately see that something is wrong, at least if you do not live in UTC);
- keeps you from getting confused when data arrives in the database from different time zones: in the database, time is always in UTC;
- simplifies the code, since when converting time to and from the database you do not have to think about the time zone: it is always the same.
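A small sketch of the UTC-in-the-database idea, using an in-memory SQLite table. The table layout, the helper names to_db and from_db, the text column format, and the fixed offsets are all assumptions made for this example, not something prescribed above.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

DB_FORMAT = "%Y-%m-%dT%H:%M:%S"  # always interpreted as UTC

def to_db(moment: datetime) -> str:
    """Convert an aware datetime to its UTC text form; refuse naive values."""
    if moment.tzinfo is None:
        raise ValueError("refusing to store a naive datetime")
    return moment.astimezone(timezone.utc).strftime(DB_FORMAT)

def from_db(text: str) -> datetime:
    """Read the UTC text form back as an aware datetime."""
    return datetime.strptime(text, DB_FORMAT).replace(tzinfo=timezone.utc)

plus7 = timezone(timedelta(hours=7))
minus5 = timezone(timedelta(hours=-5))
rows = [
    ("sent from +07:00", datetime(2010, 8, 2, 10, 40, tzinfo=plus7)),  # 03:40 UTC
    ("sent from -05:00", datetime(2010, 8, 1, 23, 0, tzinfo=minus5)),  # 04:00 UTC
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, happened_at TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(n, to_db(m)) for n, m in rows])

# Sorting the stored UTC values sorts the events chronologically.
ordered = conn.execute(
    "SELECT name, happened_at FROM events ORDER BY happened_at"
).fetchall()
print(ordered)
```

By wall clock the second event looks earlier (August 1 versus August 2), but in UTC it is the later one, and the database orders it correctly without knowing anything about time zones.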
A naive datetime is converted into a tz-aware one with the localize method of a time zone object (as provided by the pytz library):
tzaware_datetime = some_timezone.localize(some_naive_datetime, is_dst=True)
(note the second parameter: it is needed precisely because the conversion can be ambiguous), or
another_tzaware_datetime = tzaware_datetime.astimezone(another_tz)
(which moves a tz-aware datetime to another time zone).
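To show why that second parameter matters, here is a sketch of an ambiguous wall clock time. It deliberately swaps pytz for the standard zoneinfo module (Python 3.9+, tz database assumed available), where the fold attribute plays roughly the role that is_dst plays in localize; the zone and date are just a convenient example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

new_york = ZoneInfo("America/New_York")

# On 2021-11-07 clocks in New York went from 02:00 back to 01:00,
# so the wall clock time 01:30 occurred twice that night.
ambiguous = datetime(2021, 11, 7, 1, 30)

first = ambiguous.replace(tzinfo=new_york, fold=0)   # first pass, daylight time
second = ambiguous.replace(tzinfo=new_york, fold=1)  # second pass, standard time

first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
print(first_utc.isoformat(), second_utc.isoformat())

# Moving an aware value to another zone keeps the instant and changes only the wall clock.
in_moscow = first.astimezone(ZoneInfo("Europe/Moscow"))
print(in_moscow.isoformat())
```

The same naive value maps to two instants an hour apart, so the library cannot pick one without being told which you mean.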
Since all of this is implemented by the same datetime.datetime class, and the only difference is whether the tzinfo property is set, you have to be damn careful not to mix up which dates carry a time zone and which do not. Here Python is worse than Java: in Java, like it or not, you have to create a DateFormat and specify a time zone in order to print a date, whereas in Python many operations, printing included, work on naive dates. Clearly, in a complex application you should make sure every date carries a time zone, because if somewhere in the application it turns out to be missing, good luck figuring out what it should have been. With a time zone attached, dates compare correctly, print correctly, and so on.
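A short sketch of how the naive/aware mix-up shows up in practice, and one way to guard against it. The helpers is_aware and require_aware are hypothetical names for this example; only the standard datetime module is used.

```python
from datetime import datetime, timezone

def is_aware(moment: datetime) -> bool:
    """True if the datetime carries a usable UTC offset."""
    return moment.tzinfo is not None and moment.utcoffset() is not None

def require_aware(moment: datetime) -> datetime:
    """Guard for application boundaries: reject naive datetimes early."""
    if not is_aware(moment):
        raise ValueError(f"naive datetime not allowed: {moment!r}")
    return moment

naive = datetime(2010, 8, 2, 10, 40)
aware = datetime(2010, 8, 2, 3, 40, tzinfo=timezone.utc)

# Printing a naive value works without complaint, which hides the problem.
print(naive, "|", aware)

# Ordering a naive value against an aware one fails only at run time.
mixing_error = None
try:
    naive < aware
except TypeError as exc:
    mixing_error = str(exc)
print(mixing_error)
```

Calling require_aware wherever dates enter the application turns a late, confusing failure into an early and obvious one.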
Bonuses
Rules for anyone working with calendar dates. Remember that:
- not every year has 365 days;
- not every day has 24 hours;
- fortunately, every hour has 60 minutes;
- not every minute has 60 seconds: it can have 59 or 61. The 61st is called a leap second; it is added on either June 30 or December 31, and at that moment a UTC clock should show 23:59:60. Leap seconds are added because the Earth's rotation is slowing down. The option of removing a second exists in case the Earth starts to rotate faster, but it has never been needed.
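The first two rules are easy to check for yourself. The sketch below assumes Python 3.9+ with zoneinfo and an available tz database; days_in_year and hours_in_day are hypothetical helpers, and New York is used only as an example of a zone with daylight saving time. Leap seconds are left out, since Python's datetime does not represent them.

```python
import calendar
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

def days_in_year(year: int) -> int:
    return 366 if calendar.isleap(year) else 365

def hours_in_day(day: date, zone: ZoneInfo) -> float:
    """Length of a calendar day in `zone`, measured between its two local midnights."""
    start = datetime.combine(day, time.min, tzinfo=zone)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=zone)
    # Subtract as instants in UTC; subtracting two values that share a tzinfo
    # would compare wall clocks and ignore the offset change in between.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return elapsed / timedelta(hours=1)

new_york = ZoneInfo("America/New_York")
spring = hours_in_day(date(2021, 3, 14), new_york)   # clocks moved forward
autumn = hours_in_day(date(2021, 11, 7), new_york)   # clocks moved back
ordinary = hours_in_day(date(2021, 6, 1), new_york)

print(days_in_year(2010), days_in_year(2012))
print(spring, autumn, ordinary)
```

So adding 24 hours to an instant and adding one calendar day to a wall clock time are different operations, and only the second one needs a time zone.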
Although UTC and GMT are very similar, they are still slightly different. GMT is defined by mean solar time at the Royal Observatory in Greenwich, whereas UTC is measured by atomic clocks (a weighted average of some two hundred atomic clocks in seventy laboratories around the world, synchronized via satellites). The difference between GMT and UTC must not exceed 0.9 seconds, and it is kept within that bound precisely by adding leap seconds.
Storing time in a signed 32-bit int on UNIX systems is expected to lead to the year 2038 problem: when the 31 value bits overflow, subsequent moments in time will map to negative numbers, which will break every comparison. New 64-bit systems and programs already use 64 bits to store time, but will they manage to completely replace the 32-bit ones by 2038?
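The overflow is easy to simulate. Python integers do not overflow on their own, so this sketch uses the struct module to reinterpret a value as a signed 32-bit integer; the helper as_int32 is a made-up name, and the dates are computed from the epoch with timedelta rather than through any platform-specific call.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1

def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]

# The last second a signed 32-bit counter can represent.
last_second = EPOCH + timedelta(seconds=INT32_MAX)

# One second later the counter wraps around to the most negative value.
wrapped = as_int32(INT32_MAX + 1)
after_overflow = EPOCH + timedelta(seconds=wrapped)

print(last_second.isoformat())
print(wrapped, after_overflow.isoformat())
print(wrapped < INT32_MAX)  # the "later" instant now compares as earlier
```

After the wrap, the moment that should follow 2038 lands in 1901, and every ordering comparison on the raw counter gives the wrong answer.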