npyio.loadtxt is bytes-casting text file input, even with str dtype specified. #2715

Panoplos · 2012-11-08T04:05:26Z

Environment

Python: Version 3.3 (python.org release) on OS X Mountain Lion
numpy: Cloned from git master

When calling numpy.loadtxt on a file containing strings, as follows:

import numpy as np
datestxt = np.loadtxt("NYSE_dates.txt", dtype=str)
print(datestxt)

Where NYSE_dates.txt is simply a list of dates (could be anything really):

7/5/1962
7/6/1962
7/9/1962
...
12/29/2020
12/30/2020
12/31/2020

Output is:

["b'7/5/1962'" "b'7/6/1962'" "b'7/9/1962'" ..., "b'12/29/2020'"
 "b'12/30/2020'" "b'12/31/2020'"]

As you can see, all the strings have been cast to bytes and then stringified through conv: you get the same result from str(str('12/31/2020').encode('latin1')), per conv and compat.asbytes.
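The round trip described here can be reproduced without NumPy. The following is a minimal sketch, assuming Python 3 and assuming the loader effectively encodes each field as latin1 and then calls str on the result; the variable names are illustrative only.

```python
# Reproduce the symptom with plain Python 3; no NumPy required.
field = "12/31/2020"

# Per the report, asbytes encodes the text as latin1 bytes.
as_bytes = field.encode("latin1")

# str() on a bytes object returns its repr, b-prefix and quotes included.
stringified = str(as_bytes)
print(stringified)  # b'12/31/2020'

# Decoding, rather than calling str(), recovers the original text.
decoded = as_bytes.decode("latin1")
print(decoded)
```

This is why each array element shows up as "b'...'" instead of the plain date string.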

After looking at the code, it appears that strings are cast to bytes with asbytes(...) pretty much throughout, for example in split_line(...), so this likely means every routine in the module is affected in the same way.

vejnar · 2013-04-18T21:16:44Z

I also have that issue. This is very very annoying; basically you can't use loadtxt in Python3.

Temporary solution: I removed all asbytes() calls in the loadtxt method.

charris · 2013-04-19T16:04:05Z

Yeah, I remember thinking something was fishy in there when I looked through the code.

jonathanrocher · 2014-07-30T16:07:01Z

For the record, I am running into the same issue with datetime64 inputs, leading to a parsing error of the form: Error parsing datetime string "b'2013-01-02'". To work around this, I had to create a converter for that column:

def decoder(input_bytes):
    return input_bytes.decode("ascii")

This would be fine in production code but is highly non-pretty for training material...
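A converter that tolerates both input types may be somewhat more portable. This is only a sketch: to_text is a hypothetical helper name, the file name in the usage comment is made up, and it assumes the converter may receive either bytes (as in the report above) or str, depending on the NumPy version in use.

```python
def to_text(field, encoding="ascii"):
    """Return field as str, decoding it first if it arrives as bytes."""
    if isinstance(field, bytes):
        return field.decode(encoding)
    return field

# Both input types yield the same text.
print(to_text(b"2013-01-02"))
print(to_text("2013-01-02"))

# Hypothetical usage, assuming a file "dates.txt" with dates in column 0:
# np.loadtxt("dates.txt", dtype="datetime64[D]", converters={0: to_text})
```

The isinstance check keeps the helper harmless on versions where the field is already text.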

charris · 2015-06-21T21:47:29Z

Pushing off to 1.11.

danizen · 2015-12-11T02:59:14Z

work-around - run iconv on the file first.

charris · 2016-01-21T16:44:39Z

pushing off to 1.12.

paalge · 2017-03-03T10:59:54Z

I see that this is being pushed forward, but I find that it is a bug that should be addressed, and a fix seems easy to implement.

Queuecumber · 2017-10-18T19:13:52Z

Pretty shocking that this hasn't been fixed for 5 years

mdickinson · 2017-12-13T20:01:26Z

It looks as though this is working as desired in NumPy 1.13.3 (though I'm not sure which PR fixed it). Can this issue be closed?

>>> import io
>>> import numpy as np
>>> f = io.StringIO("7/5/1962\n7/6/1962\n")
>>> np.loadtxt(f, dtype=str)
array(['7/5/1962', '7/6/1962'],
      dtype='<U8')
>>> np.__version__
'1.13.3'

mdickinson · 2017-12-13T20:06:04Z

Looks like this was fixed in #8349, in response to #8033.

mattip · 2018-09-04T06:58:32Z

Closing. Please reopen if needed.

juliantaylor added this to the 1.10 blockers milestone Jul 30, 2014

DavidPowell mentioned this issue Mar 9, 2015

loadtxt fails with complex data under python 3 #5655

Closed

charris modified the milestones: 1.11 blockers, 1.10 blockers Jun 21, 2015

charris modified the milestones: 1.12.0 release, 1.11.0 blockers Jan 21, 2016

charris added 00 - Bug component: numpy.lib labels Jan 21, 2016

rgommers modified the milestone: 1.12.0 release Feb 15, 2017

mdickinson mentioned this issue Dec 13, 2017

Issue when using loadtxt with string data in Python 3 #5530

Closed

mattip closed this as completed Sep 4, 2018
