As an implementation idea, it might be cleaner to add callback hooks, e.g. onRecordBlockWritten(), and implement them in the FileWriter instead of having the base ArrowWriter track the blocks.
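A rough sketch of what such a hook could look like, assuming nothing about the real Arrow classes: SketchBlock, SketchWriter, SketchStreamWriter and SketchFileWriter are hypothetical stand-ins for ArrowBlock, ArrowWriter, ArrowStreamWriter and ArrowFileWriter, and the byte-array "batch" only simulates serialization. The base writer fires the hook after each batch, and only the file writer keeps the blocks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ArrowBlock: where a batch landed and how long it is.
final class SketchBlock {
    final long offset;
    final long length;

    SketchBlock(long offset, long length) {
        this.offset = offset;
        this.length = length;
    }
}

// Hypothetical stand-in for the base ArrowWriter. It no longer stores blocks itself.
abstract class SketchWriter {
    private long position = 0;

    // Simulates writing one serialized batch, then notifies subclasses via the hook.
    public final void writeRecordBatch(byte[] serializedBatch) {
        long start = position;
        position += serializedBatch.length; // a real writer would write to a channel here
        onRecordBlockWritten(new SketchBlock(start, serializedBatch.length));
    }

    // Callback hook: a no-op by default, so writers that do not need blocks keep nothing.
    protected void onRecordBlockWritten(SketchBlock block) {
    }
}

// Stream writer: inherits the no-op hook, so no per-batch bookkeeping accumulates.
final class SketchStreamWriter extends SketchWriter {
}

// File writer: records every block, since a file footer needs them.
final class SketchFileWriter extends SketchWriter {
    private final List<SketchBlock> recordBlocks = new ArrayList<>();

    @Override
    protected void onRecordBlockWritten(SketchBlock block) {
        recordBlocks.add(block);
    }

    List<SketchBlock> getRecordBlocks() {
        return Collections.unmodifiableList(recordBlocks);
    }
}
```

With this shape the stream writer would not need to override writeRecordBatch at all; whether the real ArrowWriter can be restructured this way without breaking existing subclasses is an open question.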
Thanks,
Emilio

On 04/27/2018 03:19 PM, Eric Wohlstadter wrote:
Hi all,

In the context of ArrowStreamWriter:
- It looks like field ArrowWriter.recordBlocks is populated and consumes memory, e.g. in ArrowWriter.writeRecordBatch
- But the List<ArrowBlock> is never used (it is used in ArrowFileWriter but not ArrowStreamWriter)

Would it be safe for me to extend ArrowStreamWriter and override writeRecordBatch with an implementation that does not populate the recordBlocks?

This is for HIVE-19305 (if anyone has time to take a look and provide feedback, that would be much appreciated)

Thanks for your help,
--Eric