Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
ValueErrors raised by Timestamp.strftime are currently silently caught by the DatetimeLike array strftime and replaced with str(t). This leads to unexpected behaviour:
> dta = pd.DatetimeIndex(np.array(['1820-01-01', '2020-01-02'], 'datetime64[s]'))
> dta[0].strftime("%y")  # Instance operation raises ValueError
Traceback (most recent call last):
  (...)
  File "timestamps.pyx", line 1518, in pandas._libs.tslibs.timestamps.Timestamp.strftime
ValueError: format %y requires year >= 1900 on Windows
> dta.strftime("%y")  # Array operation catches error and turns into str representation (with default datetime format)
Index(['1820-01-01 00:00:00', '20'], dtype='object')
This very surprising behaviour of the array strftime is due to the following try-except:
try:
    # Note: dispatches to pydatetime
    res = ts.strftime(format)
except ValueError:
    # Use datetime.str, that returns ts.isoformat(sep=' ')
    res = str(ts)
Note that this try-except has been around since 0.16.1 (introduced by commit 3d54482).
This "questionable behaviour" was also reported in #48588.
Note that this does not happen with dates that cannot be converted to datetime at all, because this particular ValueError is currently caught and turned into a NotImplementedError:
> dta = pd.DatetimeIndex(np.array(['-0020-01-01', '2020-01-02'], 'datetime64[s]'))
> dta.strftime("%Y_%m_%d")  # Custom so falls back on Timestamp.strftime
NotImplementedError: strftime not yet supported on Timestamps which are outside the range of Python's standard library. For now, please call the components you need (such as `.year` and `.month`) and construct your string from there.
> dta[0].strftime("%Y_%m_%d")
NotImplementedError: strftime not yet supported on Timestamps which are outside the range of Python's standard library. For now, please call the components you need (such as `.year` and `.month`) and construct your string from there.
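The error message above suggests building the string from components. A minimal sketch of that workaround follows, using only the standard library; the helper name `format_from_components` and its `sep` parameter are made up for illustration and are not part of pandas. It accepts any object exposing `.year`, `.month` and `.day`.

```python
import datetime


def format_from_components(d, sep="_"):
    """Build a year/month/day string without calling strftime.

    Hypothetical helper: `d` is any object with .year, .month and .day
    attributes (a datetime.date here; a Timestamp exposes the same names).
    """
    return "%04d%s%02d%s%02d" % (d.year, sep, d.month, sep, d.day)


# Same layout as the "%Y_%m_%d" format used in the report
print(format_from_components(datetime.date(1820, 1, 1)))
```

Because no platform strftime is involved, the result does not depend on the operating system's supported year range.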
Feature Description
I propose to add an errors parameter to array strftime with the following values:
- 'raise' (default) would not catch any underlying error and would raise it as is
- 'ignore' would catch all errors and silently replace the output with None instead of a string
- 'warn' would have the same behaviour as 'ignore' and would additionally issue a StrftimeErrorWarning warning message: "The following timestamps could not be converted to string: [...]. Set errors='raise' to see the details"
> dta = pd.DatetimeIndex(np.array(['1820-01-01', '2020-01-02'], 'datetime64[s]'))
> dta[0].strftime("%y")
ValueError: format %y requires year >= 1900 on Windows
> dta.strftime("%y")
ValueError: format %y requires year >= 1900 on Windows
> dta.strftime("%y", errors='raise')
ValueError: format %y requires year >= 1900 on Windows
> dta.strftime("%y", errors='ignore')
Index([None, '20'], dtype='object')
> dta.strftime("%y", errors='warn')
StrftimeErrorWarning : Not all timestamps could be converted to string: ['1820-01-01']. Set errors='raise' to see the details
Index([None, '20'], dtype='object')
The specific NotImplementedError described previously can either disappear or stay, but in any case it should be handled the same way as the ValueErrors above (meaning that if the user selects 'ignore', the error must be silently caught).
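The proposed semantics can be sketched outside pandas on a plain list of datetime objects. This is only an illustration under stated assumptions: `strftime_array`, its `formatter` argument and `windows_like` are hypothetical names, `StrftimeErrorWarning` is defined locally because it does not exist yet, and the sketch catches ValueError and NotImplementedError rather than every exception.

```python
import warnings
from datetime import datetime


class StrftimeErrorWarning(UserWarning):
    """Hypothetical warning category named in the proposal."""


def strftime_array(values, fmt, errors="raise", formatter=None):
    """Format each value; `errors` decides what happens when one fails."""
    if errors not in ("raise", "ignore", "warn"):
        raise ValueError("errors must be 'raise', 'ignore' or 'warn'")
    if formatter is None:
        def formatter(ts, f):
            return ts.strftime(f)
    results, failed = [], []
    for ts in values:
        try:
            results.append(formatter(ts, fmt))
        except (ValueError, NotImplementedError):
            if errors == "raise":
                raise  # propagate the underlying error unchanged
            results.append(None)  # 'ignore' and 'warn' both emit None
            failed.append(str(ts))
    if failed and errors == "warn":
        warnings.warn(
            "Not all timestamps could be converted to string: %s. "
            "Set errors='raise' to see the details" % failed,
            StrftimeErrorWarning,
        )
    return results


def windows_like(ts, fmt):
    # Assumption: simulates the platform limitation from the report so the
    # behaviour can be reproduced on any operating system.
    if "%y" in fmt and ts.year < 1900:
        raise ValueError("format %y requires year >= 1900 on Windows")
    return ts.strftime(fmt)
```

Calling `strftime_array(values, "%y", errors="ignore", formatter=windows_like)` on the two dates from the example returns a list with None in the first slot, mirroring the Index([None, '20']) output shown above.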
Alternative Solutions
An alternative solution could be to always raise errors.
Additional Context
No response
Python 3.5 suffers from a vulnerability caused by the behavior of the time_strftime() function. When called, the function loops over the format string provided, using strchr to search for each instance of '%'. After finding a '%', it continues to search two characters ahead, assuming that each instance is the beginning of a well-formed format string token. However, if a string ends with '%', this logic will result in a call to strchr that reads off the end of the format string buffer:

/* check that the format string contains only valid directives */
for(outbuf = strchr(fmt, '%');       <<<< Assuming fmt ends with a '%', this will return a pointer to the end of the string.
    outbuf != NULL;
    outbuf = strchr(outbuf+2, '%'))  <<<< Once outbuf is pointing to the end of the string, outbuf+2 skips past the null terminator, leading to a buffer over-read.
{
    if (outbuf[1]=='#')
        ++outbuf; /* not documented by python, */
    if ((outbuf[1] == 'y') && buf.tm_year < 0)
    {
        PyErr_SetString(PyExc_ValueError,
                        "format %y requires year >= 1900 on Windows");
        Py_DECREF(format);
        return NULL;
    }
}

In some applications, it may be possible to exploit this behavior to disclose the contents of adjacent memory. The buffer over-read can be observed by running the following script:

from time import *
strftime("AA%"*0x10000)

Which, depending on the arrangement of memory, may produce an exception such as this:

0:000> g
(20b8.18d4): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=ffffffff ebx=52c1a6a0 ecx=00000000 edx=08ef3000 esi=08ec2fe8 edi=08ec2ff8
eip=52d254f3 esp=004cf9d4 ebp=004cfa58 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
python35!strchr+0x33:
52d254f3 f30f6f0a movdqu xmm1,xmmword ptr [edx] ds:002b:08ef3000=????????????????????????????????
0:000> db edx-0x10
08ef2ff0 41 25 41 41 25 41 41 25-00 d0 d0 d0 d0 d0 d0 d0 A%AA%AA%........
08ef3000 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
08ef3010 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
08ef3020 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
08ef3030 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
08ef3040 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
08ef3050 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
08ef3060 ?? ?? ?? ?? ?? ?? ?? ??-?? ?? ?? ?? ?? ?? ?? ?? ????????????????
0:000> k5
ChildEBP RetAddr
004cf9d0 52c1a7f6 python35!strchr+0x33 [f:\dd\vctools\crt\vcruntime\src\string\i386\strchr_sse.inc @ 75]
004cfa58 52c832d3 python35!time_strftime+0x156 [c:\build\cpython\modules\timemodule.c @ 615]
004cfa74 52ce442f python35!PyCFunction_Call+0x113 [c:\build\cpython\objects\methodobject.c @ 109]
004cfaa8 52ce18ec python35!call_function+0x2ff [c:\build\cpython\python\ceval.c @ 4651]
004cfb20 52ce339f python35!PyEval_EvalFrameEx+0x232c [c:\build\cpython\python\ceval.c @ 3184]

To fix this issue, it is recommended that time_strftime() be updated to check outbuf[1] for null in the body of the format string directive validation loop. A proposed patch is attached.

Credit: John Leitch (johnleitch@outlook.com), Bryce Darling (darlingbryce@gmail.com)
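The recommended check can be illustrated in Python. This is a sketch of the idea only, not the attached patch: `check_format_directives` is a hypothetical name, and it ignores the '#' handling in the C loop. It walks the '%' directives two characters at a time, as the quoted loop does, but stops with an error when a '%' is the last character instead of stepping past the end.

```python
def check_format_directives(fmt):
    """Scan '%' directives like the quoted C loop, with a bounds check."""
    i = fmt.find("%")
    while i != -1:
        if i + 1 >= len(fmt):
            # A lone trailing '%': this is where the C loop would advance
            # past the terminator, so reject the format instead.
            raise ValueError("format string ends with an incomplete directive")
        # Skip the '%' and the character after it, as outbuf+2 does
        i = fmt.find("%", i + 2)
    return True


print(check_format_directives("%Y-%m-%d"))
```

With this check, a string such as "AA%" repeated many times is rejected at its final '%' rather than being scanned beyond its end.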
#1 Feb. 13, 2011 12:21:22
- robotd
strftime and dates before 1900
django
python 2.7
Colleagues, please help, I cannot find a way to solve this problem:
datetime.datetime.date(date_fact).strftime("%d.%m.%Y")
raises the error:
year=1858 is before 1900; the datetime strftime() methods require year >= 1900
If I do this instead:
time.accept2dyear=0
time.strftime("%d.%m.%Y", datetime.datetime.date(date_fact).timetuple()),
the error is the same, just worded differently:
year >= 1900 required
Thanks in advance!
Edited (Feb. 13, 2011 12:32:55)
John said:
Does anyone know of a datetime string formatter that can handle
strftime format strings over the full range that datetime objects
support?
Here’s what the Python source says
/* Give up if the year is before 1900.
* Python strftime() plays games with the year, and different
* games depending on whether envar PYTHON2K is set. This makes
* years before 1900 a nightmare, even if the platform strftime
* supports them (and not all do).
* We could get a lot farther here by avoiding Python’s strftime
* wrapper and calling the C strftime() directly, but that isn’t
* an option in the Python implementation of this module.
*/
The underlying time.strftime module supports special
behaviour for dates < 1900.
Traceback (most recent call last):
One concern about your request is: what does a date mean
when you go back before 1900? I assume you want the proleptic
Gregorian calendar, that is, to apply it even when and
where it wasn't in use.
One way to fake it is to move the date to a date in the
supported time range which starts on the same day, then
use strftime on that new date.
It’s not enough to find the fake year number in the
resulting string and convert it into the real year
number. After all, the format string might be
"1980 %Y" and if the %Y expands to 1980 in your shifted
time frame then you don’t know which to change.
To figure that out, move the date forward by 28 years
(which is the repeat cycle except for the non-leap
centuries) and do it again. The parts of the two
strings that differ indicate where to put the change.
I tried to write this function but I wasn’t sure
how to handle the non-leap year centuries. It seems
to me that those are the same as 6 years later, so
that Jan. 1900’s calendar looks like 1906’s.
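The two calendar claims above can be checked with the standard library. The snippet below is a small verification sketch rather than part of the original post; `same_calendar` is a made-up helper that treats two years as interchangeable when they start on the same weekday and agree on being leap or not.

```python
import calendar
import datetime


def same_calendar(year_a, year_b):
    """True if both years start on the same weekday and match in leap-ness."""
    same_start = (datetime.date(year_a, 1, 1).weekday()
                  == datetime.date(year_b, 1, 1).weekday())
    same_length = calendar.isleap(year_a) == calendar.isleap(year_b)
    return same_start and same_length


# 28-year repeat within a span that contains no skipped leap year
print(same_calendar(1980, 2008))  # True
# The non-leap century year 1900 lines up with 1906
print(same_calendar(1900, 1906))  # True
# A plain 28-year shift across 1900 does not line up
print(same_calendar(1890, 1918))  # False
```

The last case is why a correction is needed whenever the shift crosses a century year that is not a leap year.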
Here’s what I came up with. Seems to work.
# Format a datetime.date using the proleptic Gregorian calendar
import time, datetime

def _findall(text, substr):
    # Also finds overlaps
    sites = []
    i = 0
    while 1:
        j = text.find(substr, i)
        if j == -1:
            break
        sites.append(j)
        i = j + 1
    return sites
# I hope I did this math right. Every 28 years the
# calendar repeats, except through century leap years
# excepting the 400 year leap years. But only if
# you're using the Gregorian calendar.
def strftime(dt, fmt):
    # WARNING: known bug with "%s", which is the number
    # of seconds since the epoch. This is too harsh
    # of a check. It should allow "%%s".
    fmt = fmt.replace("%s", "s")
    if dt.year > 1900:
        return time.strftime(fmt, dt.timetuple())

    year = dt.year
    # For every non-leap year century, advance by
    # 6 years to get into the 28-year repeat cycle
    delta = 2000 - year
    off = 6*(delta // 100 + delta // 400)
    year = year + off

    # Move to around the year 2000
    year = year + ((2000 - year)//28)*28
    timetuple = dt.timetuple()
    s1 = time.strftime(fmt, (year,) + timetuple[1:])
    sites1 = _findall(s1, str(year))

    s2 = time.strftime(fmt, (year+28,) + timetuple[1:])
    sites2 = _findall(s2, str(year+28))

    # Keep only the positions where both shifted years appear
    sites = []
    for site in sites1:
        if site in sites2:
            sites.append(site)

    s = s1
    syear = "%4d" % (dt.year,)
    for site in sites:
        s = s[:site] + syear + s[site+4:]
    return s
# Make sure that the day names are in order
# from 1/1/1 until August 2000
def test():
    s = strftime(datetime.date(1800, 9, 23),
                 "%Y has the same days as 1980 and 2008")
    if s != "1800 has the same days as 1980 and 2008":
        raise AssertionError(s)

    print("Testing all day names from 0001/01/01 until 2000/08/01")
    days = []
    for i in range(1, 10):
        days.append(datetime.date(2000, 1, i).strftime("%A"))
    # Map each day name to the name of the day that follows it
    nextday = {}
    for i in range(8):
        nextday[days[i]] = days[i+1]

    startdate = datetime.date(1, 1, 1)
    enddate = datetime.date(2000, 8, 1)
    prevday = strftime(startdate, "%A")
    one_day = datetime.timedelta(1)

    testdate = startdate + one_day
    while testdate < enddate:
        if (testdate.day == 1 and testdate.month == 1 and
            (testdate.year % 100 == 0)):
            print(testdate.year)
        day = strftime(testdate, "%A")
        if nextday[prevday] != day:
            raise AssertionError(str(testdate))
        prevday = day
        testdate = testdate + one_day

if __name__ == "__main__":
    test()
% cal 8 1850
    August 1850
 S  M Tu  W Th  F  S
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
Andrew
#12489 closed defect (fixed)
Reported by: | | Owned by: | Ryan J Ollos |
---|---|---|---|
Priority: | normal | Milestone: | 1.0.12 |
Component: | general | Version: | 1.0.11 |
Severity: | normal | Keywords: | datetime |
Cc: | | Branch: | |
Release Notes: | Fix | | |
API Changes: | | | |
Internal Changes: | | | |
Maybe we can raise a TracError for this case, and/or add a warning to the timeline page.
2016-05-26 14:42:07,161 Trac[main] ERROR: Internal Server Error: <RequestWithSession "GET '/timeline?from=1900-01-02&daysback=7&authors='">, referrer 'https://bugs.jqueryui.com/'
Traceback (most recent call last):
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/web/main.py", line 562, in _dispatch_request
    dispatcher.dispatch(req)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/web/main.py", line 249, in dispatch
    resp = chosen_handler.process_request(req)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/timeline/web_ui.py", line 251, in process_request
    format='%Y-%m-%d', tzinfo=req.tz)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/util/datefmt.py", line 318, in format_date
    return _format_datetime(t, format, tzinfo, locale, 'date')
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/util/datefmt.py", line 299, in _format_datetime
    return _format_datetime_without_babel(t, format)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/util/datefmt.py", line 246, in _format_datetime_without_babel
    text = t.strftime(str(format))
ValueError: year=1899 is before 1900; the datetime strftime() methods require year >= 1900
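One way to read the suggestion is to validate the date before formatting and raise an application-level error instead of letting the ValueError surface as an internal server error. The sketch below is an assumption about how that could look, not the change made in Trac: `UserFacingError` stands in for an error type such as TracError, and `format_date_checked` and `min_year` are hypothetical names.

```python
import datetime


class UserFacingError(Exception):
    """Stand-in for an application-level error type such as TracError."""


def format_date_checked(d, fmt="%Y-%m-%d", min_year=1900):
    """Format a date, rejecting years the platform strftime may not handle."""
    if d.year < min_year:
        # Report a readable message instead of an unhandled ValueError
        raise UserFacingError(
            "Dates before %d are not supported (got year %d)"
            % (min_year, d.year)
        )
    return d.strftime(fmt)


print(format_date_checked(datetime.date(2016, 5, 26)))
```

A request handler could catch this error type and render it as a normal warning on the page.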
Attachments
(0)
Change History
(7)
Milestone: | → 1.0.12 |
---|---|
Owner: | set to Ryan J Ollos |
Status: | new → assigned |
Release Notes: | modified (diff) |
---|
Resolution: | → fixed |
---|---|
Status: | assigned → closed |