
Help with Java API and RecordBatch creation

Good morning,

I have to use Apache Arrow with Scala, so I’m calling the Java API from Scala. I’m confused about a few points and hope someone can clarify them for me.

First of all, what is the difference between ArrowRecordBatch (in org.apache.arrow.vector.ipc.message) and RecordBatch (in org.apache.arrow.flatbuf)?
Related to this, if a coder wants to use Arrow just for IPC, should she consider only the classes in the package org.apache.arrow.vector, or should she also learn how to use the other packages, particularly io.netty.buffer, org.apache.arrow.memory, and org.apache.arrow.flatbuf?

I don’t understand how to do in Java everything that the Python documentation pages show.

I’d like to understand how to create what Python calls a RecordBatch and serialize it to a stream, for example to write it to a file.
I think an ArrowRecordBatch can be created with its constructors once you have built a list of ArrowFieldNode (to be honest, I haven’t understood what this class represents) and a list of ArrowBuf (I haven’t understood how to create one; I think I should instantiate an ArrowByteBufAllocator through alloc(), but then I wouldn’t know how to proceed), but I’m not sure.
I hope someone can clear up my doubts.

Thank you,