[jira] [Created] (ARROW-3890) pa.array of string cannot be created from np.array of string


jacques created ARROW-3890:
------------------------------

             Summary: pa.array of string cannot be created from np.array of string
                 Key: ARROW-3890
                 URL: https://issues.apache.org/jira/browse/ARROW-3890
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.11.1
            Reporter: jacques


PyArrow string arrays can no longer be created from NumPy string arrays in pyarrow>=0.8.0 (this includes pyarrow==0.11.1).

Please find below a quick repro:
{code:python}
import numpy as np
import pyarrow as pa
vec = np.array(["toto", "tata"])
pa.array(vec, pa.string())
{code}

Running this, I get the following error:

{code:python}
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-4-e753fb3a8193> in <module>()
----> 1 pa.array(vec, pa.string())

/usr/local/lib/python2.7/dist-packages/pyarrow/lib.so in pyarrow.lib.array()

/usr/local/lib/python2.7/dist-packages/pyarrow/lib.so in pyarrow.lib._ndarray_to_array()

/usr/local/lib/python2.7/dist-packages/pyarrow/lib.so in pyarrow.lib.check_status()

ArrowInvalid: 'utf32' codec can't decode bytes in position 0-3: code point not in range(0x110000)
{code}

However, the same snippet worked fine with pyarrow==0.7.1.
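
One possible reading of the error message, offered as a guess and not a confirmed diagnosis: the traceback paths show Python 2.7, where np.array(["toto", "tata"]) holds byte strings, and the 'utf32' error at position 0-3 looks like what one gets when four ASCII bytes are read as a single UTF-32 code unit. The sketch below reproduces that kind of failure in plain Python 3 without pyarrow. The helper try_decode and the variable names are hypothetical, and nothing here shows what pyarrow does internally.

```python
# Hypothetical illustration only: pure Python 3, no numpy or pyarrow involved.
# It decodes the 4 ASCII bytes of "toto" as UTF-8 and then as UTF-32.
raw = b"toto"  # the kind of bytes a Python 2.7 str such as "toto" contains


def try_decode(data, codec):
    """Return the decoded text, or the decoder's error reason as a string."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


# Read as UTF-8, the four bytes are four one-byte characters.
as_utf8 = try_decode(raw, "utf-8")

# Read as UTF-32 (little-endian), the same four bytes form one code unit,
# 0x6f746f74, which is far above the largest valid code point 0x10FFFF.
as_utf32 = try_decode(raw, "utf-32-le")

print(as_utf8)
print(as_utf32)
```

If this guess is right, the input bytes themselves are valid and the failure comes from the encoding assumed when they are decoded.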

Best,

Jacques





