Changing large in-memory data to file based operation


Hello Camel Experts,

We currently have a route like the one below (simplified version): it reads records 1000 at a time, formats each batch, and keeps accumulating the result in memory.

Current Simplified Solution:

from("direct:processing")
                .bean(<formatRecordsAndAddToBody>)
                .choice()
                .when(header(<isMoreRecords>)).bean(<readMoreRecords>).to("direct:processing")
                .otherwise().to("sftp:///");

Generally the data is small, so we kept it in memory. We now want to improve this to safeguard against out-of-memory problems. In our landscape we can use 'tmp' storage to hold a file, so we plan to change the route like this:

Possible Modified Solution:
from("direct:processing")
                .bean(<formatRecordsAndAddToTmpFile>)
                .choice()
                .when(header(<isMoreRecords>)).bean(<readMoreRecords>).to("direct:processing")
                .otherwise().bean(<readTmpFile&SetAsBody>).to("sftp:///");
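
To illustrate the idea, here is a minimal sketch of what the <formatRecordsAndAddToTmpFile> step could do with plain Java I/O. The class name TmpFileAppender and its methods are hypothetical and only stand in for our real bean; the sketch assumes each formatted record is a single line of text.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper standing in for <formatRecordsAndAddToTmpFile>.
final class TmpFileAppender {

    // Creates an empty spool file in the JVM's default temp directory.
    static Path createSpoolFile() throws IOException {
        return Files.createTempFile("records-", ".tmp");
    }

    // Appends one batch of already formatted records, one record per line.
    // Only the current batch is held in memory; earlier batches stay on disk.
    static void appendBatch(Path spoolFile, List<String> formattedRecords) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(
                spoolFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (String record : formattedRecords) {
                writer.write(record);
                writer.newLine();
            }
        }
    }
}
```

With this shape, each pass through "direct:processing" would call appendBatch once with its batch of up to 1000 records, and the body handed to the SFTP endpoint at the end would be the spool file itself.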

When we tried this solution locally, it worked: Camel understands a 'java.io.File' body and writes it to SFTP.

Question 1 - Would Camel load the file completely into memory before writing it to SFTP?
Question 2 - If yes, how can we solve this problem? Would some form of streaming work?
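
To be clear about what we mean by streaming: the file is copied to its destination through a small fixed-size buffer, so memory use does not grow with the file size. A minimal sketch in plain Java follows; the class StreamingCopy is hypothetical and says nothing about how Camel's SFTP producer is actually implemented.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a streaming copy; not Camel code.
final class StreamingCopy {

    // Copies the file to the target stream 8 KiB at a time and
    // returns the number of bytes written.
    static long copy(Path source, OutputStream target) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                target.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

At any moment only the 8 KiB buffer is held in memory, regardless of how large the tmp file has grown.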

We already checked http://www.catify.com/2012/07/09/parsing-large-files-with-apache-camel/ but that approach also holds the data in memory.

Limitation: We don't want to write to SFTP in chunks, because some of the data has to be written out of order, which makes that approach too complex.

Regards,
Arpit.