

Large CSV files with headers


I have some CSV files with a header line, so setting useMaps="true" would
be the natural thing to do. Works great.

My CSV files are very big, so using streaming/parallelProcessing would be
the natural thing to do. Also works great.

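Unfortunately, combining useMaps="true" with streaming/parallelProcessing does
not work: it results in lots of empty Lists/Maps. That is understandable,
since the splitter hands each line to the CSV unmarshaller on its own, without
the header line, but it is not nice.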
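
A minimal plain-Java sketch of what presumably goes wrong, and of the general idea of reading the header only once. It does not use Camel at all: the class HeaderOnceSketch, its methods, and the sample data are hypothetical, and the assumption is that a map-producing parser takes the first line of whatever input it receives as the header.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from Camel.
class HeaderOnceSketch {

    // Mimics a parser that takes the first line of its input as the header.
    // Naive splitting: quoted fields are NOT handled.
    static List<Map<String, String>> parseWithOwnHeader(String chunk, String delimiter) {
        String[] lines = chunk.split("\n");
        String[] header = lines[0].split(Pattern.quote(delimiter), -1);
        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            records.add(toMap(header, lines[i].split(Pattern.quote(delimiter), -1)));
        }
        return records;
    }

    // Pairs one data line's values with a header that was read once up front.
    static Map<String, String> toMap(String[] header, String[] values) {
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            record.put(header[i], i < values.length ? values[i] : "");
        }
        return record;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a large file.
        String file = "id;name\n1;Alice\n2;Bob";
        String[] lines = file.split("\n");

        // Per-line chunks: each line is consumed as a header, leaving no records.
        for (String line : lines) {
            System.out.println(parseWithOwnHeader(line, ";"));
        }

        // Header read once, then applied to every following line.
        String[] header = lines[0].split(Pattern.quote(";"), -1);
        for (int i = 1; i < lines.length; i++) {
            System.out.println(toMap(header, lines[i].split(Pattern.quote(";"), -1)));
        }
    }
}
```

In this sketch the per-line variant never yields a record, which matches the empty Lists/Maps described above. Whether and how the header can be handed to the CSV data format inside a streaming split is exactly the open question.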

>> So the question remains: How to efficiently process large CSV files that
have a header line? <<

By the way, this is my route:

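<route id="CSVRoute">
    <from uri="file:/tmp/data/" />
    <!-- Split the file line by line in streaming mode, processing the parts in parallel -->
    <split streaming="true" parallelProcessing="true">
        <tokenize token="\n" />
        <!-- Each split part is a single line; useMaps expects a header line -->
        <unmarshal>
            <csv delimiter=";" useMaps="true" />
        </unmarshal>
        <log message="Got ${body}"/>
        <to uri="mock:nextStageProcessor"/>
    </split>
</route>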

