Following a discussion at Eng-Tips, I have been having a closer look at how Excel handles the entry and display of dates. The Eng-Tips discussion concerned changes to data when csv files are opened in Excel. I will deal with that (together with a way to open and save csv files in Excel without changing anything) in a later posts, but for today I want to look at the basics of how Excel stores and formats dates.
Excel stores dates as a day number, starting from 1st January 1900. If you enter a sequence of numbers: 1, 2, 3, … in different cells, then format the cells as dates, it will display as below:
Note that day 60 displays as 29th Feb 1900, which does not actually exist, since century years not divisible by 400 are not leap years. The extra day was introduced to maintain compatibility with the earlier spreadsheet Lotus 123, who had probably done it by mistake.
There is also an option (under Options-Advanced-When Calculating This Workbook) to set day 1 to 1st January 1904, which is the standard for Apple Mac versions of Excel. Everything that follows uses the Windows default.
In addition to the Apple/Windows differences the date formats must also cope with different conventions for writing dates, especially when the month is shown as a number, and must deal with text that may or may not be a date.
The screenshot below illustrates different behaviour depending on format settings, and the first character of the entered text. Column A was pre-formatted to “Text” (to display the values as entered). The remaining columns were left in the default format (“General”).
In Columns A and C the text was entered as shown in Column A. In Columns D and G the same text was entered, except preceded by a space for Column D and an ‘ for Column G.
Columns B, D, F, and H contain a formula subtracting the value in Cell A3 (123) from the value in the adjacent cell.
The default Windows date display for my location (Australian English) was used, i.e. day-month-year.
- The values with a non-numeric character (all except Row 3) are displayed as text in Column A, as entered, but are stored by Excel as a date value, as can be seen in Column B.
- The values separated by “-” and “/” are treated in exactly the same way in all cases.
- The values in Rows 4 and 5 (and 7 and 8) are ambiguous. In Row 4, 23 is treated as being a year (2023), because it cannot be a month, but in Row 5 the 12 is treated as a day, and the 3 as a month (March).
- Three separated values are treated as day, month, year, in accordance with the Windows language setting.
- In Column C the entered text is automatically re-formatted to a date format, but the format depends on the text entered. Note that 1-23 has been converted to 1-Jan-2023, maintaining the day-month-year format, but 12-3 is converted to 12-Mar-2015, i.e. day-month for the current year.
- In Column E, where the first character entered is a space, 123 is converted to a number, but all the other entries remain as text, and hence return the #VALUE! error if used in a formula.
- In Column G, where the first character entered is an apostrophe, all the entries display as text (without the ‘), but as for Column A, they are stored as a date value.
The screenshot below shows details of the format automatically applied to the dates in Column C entered as day-month-year. This is the standard short date format, and the * indicates that the displayed date will be re-formatted, depending on the language setting.
The note above says that the starred date formats will change in response to regional date and time settings, but the default setting (at least in Windows 8) is that the regional date and time settings are tied to the selected language, so to change the date display the Windows language should be changed, as shown below.
To change to United States English in Windows 8.1 I had first to add US English to the list in the Control Panel – Language dialog, then move it into top position:
After changing to US English the display is immediately updated:
- The date values in Column D all remain unchanged.
- The date display in Column C rows 6 and 9 changes to the month-day-year format.
- The other dates in Column C are unchanged.
- The text displayed in Columns A and G is unchanged in all cases, but all the date values have changed.
- The text displayed in Column E is also unchanged, but in this case there is no corresponding date value.
After changing the language re-entering the same values in Column C will result in a new date value:
Note that the date values in Column D now match those in Columns B and H. The dates in Rows 4 and 7 are all the same, but in Columns A and G it is displayed as month-day (i.e. Jan-23rd), whereas in Column C it is displayed as month-year (i.e. Jan-2015).
Returning Australian English to top position in the Control Panel table reverses the process, with the dates in Column C maintaining their date value, but changing format in rows 6 and 9, and text in Columns A and G remaining unchanged, but all the associated date values change:
In summary, when everyone using a worksheet with dates uses the same language settings, and the same date format, the automatic formatting features of Excel should present no problem, although it is still necessary to take care when entering dates in day-month (or month-day) format that the correct year is entered.
For situations where worksheets may be shared between countries with different formats however great care should be taken, especially when calculations involving date differences are involved. Further recommendations can be found in an excellent article at Excel Semi-Pro Regional Date Formats in Excel.