[jira] [Created] (ARROW-3842) [R] RecordBatchStreamWriter api


Romain François created ARROW-3842:
--------------------------------------

             Summary: [R] RecordBatchStreamWriter api
                 Key: ARROW-3842
                 URL: https://issues.apache.org/jira/browse/ARROW-3842
             Project: Apache Arrow
          Issue Type: New Feature
          Components: R
            Reporter: Romain François


To support the "Writing and Reading Streams" section of the vignette, perhaps we should rely more on the RecordBatchStreamWriter class and less on the `write_record_batch` function.

We should be able to write code resembling the Python API:

{code:r}
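batch <- ...
# in-memory sink that collects the stream bytes
sink <- buffer_output_stream()
writer <- record_batch_stream_writer(sink, batch$schema())
# write the batch to the stream, then finish the stream
writer$write_batch(batch)
writer$close()
# retrieve what was written to the sink
sink$getvalue()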
{code}

Most of the code is there, but we need to add:

- RecordBatchStreamWriter$write_batch(): writes a record batch to the stream. We already have RecordBatchStreamWriter$WriteRecordBatch().
- RecordBatchStreamWriter$close(): I am not sure why it is lowercase close() in Python but uppercase Close() in C++. We already have RecordBatchWriter$Close().
- BufferOutputStream$getvalue(): we already have BufferOutputStream$Finish().

Currently the constructor for a BufferOutputStream is buffer_output_stream(). Perhaps we can align with Python and make it BufferOutputStream(); that would not clash with the `arrow::BufferOutputStream` class because of the namespacing.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)