Re: Usage of readLockMinAge


Here we do not want to skip files based on size, but based on whether the
file has reached the minimum age (as per its last-modified date).
We find readLockMinAge a useful check for whether an incoming file is fully
written; used together with the readLock=changed option, it helps to pick up
only completed files. In each polling iteration we want to skip files that
have not yet met the minAge criterion. The skipped files will be re-evaluated
in the next poll. There should not be any 'sleep', since that decreases
overall throughput. Hence the suggestion for either of the two changes
(mentioned originally) to FileChangedExclusiveReadLockStrategy.
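
To make the requested behaviour concrete, here is a minimal sketch in plain
Java. It is not Camel code and does not use any Camel classes; the class and
method names (MinAgeSkipSketch, hasReachedMinAge, selectEligible) are
hypothetical and only illustrate the "skip instead of sleep" idea, with file
ages passed in as plain millisecond timestamps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of Apache Camel.
class MinAgeSkipSketch {

    /** True when the file was last modified at least minAgeMillis ago. */
    static boolean hasReachedMinAge(long lastModifiedMillis, long nowMillis, long minAgeMillis) {
        return nowMillis - lastModifiedMillis >= minAgeMillis;
    }

    /**
     * One polling pass: returns the names of files that are old enough.
     * Files that are too young are simply left out (no sleeping); they
     * would be looked at again on the next pass.
     */
    static List<String> selectEligible(Map<String, Long> lastModifiedByName,
                                       long nowMillis, long minAgeMillis) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastModifiedByName.entrySet()) {
            if (hasReachedMinAge(entry.getValue(), nowMillis, minAgeMillis)) {
                eligible.add(entry.getKey());
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // Made-up file names and timestamps for the demonstration.
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("done.csv", 1_000L);
        files.put("still-writing.csv", 9_500L);
        System.out.println(selectEligible(files, 10_000L, 2_000L));
    }
}
```

The point of the sketch is that a file which is too young costs only one
comparison in the current pass, rather than holding up the polling thread.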

As an additional note, unrelated to the main issue: we are also using a
custom filter, though for a different purpose. Since the file polling
consumer is single-threaded, we configured multiple pollers for the same
root directory, each with a custom filter that filters out specific
subdirectories. In this way the 3000 subdirectories are divided logically
among the pollers.
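
As a rough illustration of that kind of partitioning, the sketch below assigns
each top-level subdirectory to exactly one poller. It is plain Java with
hypothetical names (SubdirectoryPartitionSketch, pollerFor, accepts), and the
hash-based assignment is an assumption made for the example; it is not
necessarily how our filter decides, and the wiring into a Camel filter is not
shown.

```java
// Hypothetical illustration only: not part of Apache Camel.
class SubdirectoryPartitionSketch {

    /** Maps a subdirectory name to a poller index in [0, pollerCount). */
    static int pollerFor(String subdirectory, int pollerCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(subdirectory.hashCode(), pollerCount);
    }

    /**
     * Decides whether the poller with the given index should handle a path
     * relative to the shared root directory, e.g. "customer42/in.csv".
     * The first path segment is treated as the subdirectory; a path with no
     * '/' is treated as its own segment (an assumption of this sketch).
     */
    static boolean accepts(String relativePath, int pollerIndex, int pollerCount) {
        int slash = relativePath.indexOf('/');
        String top = slash < 0 ? relativePath : relativePath.substring(0, slash);
        return pollerFor(top, pollerCount) == pollerIndex;
    }

    public static void main(String[] args) {
        // Made-up directory names; prints which of 4 pollers owns each one.
        for (String dir : new String[] {"customer1", "customer2", "customer3"}) {
            System.out.println(dir + " -> poller " + pollerFor(dir, 4));
        }
    }
}
```

With this scheme every file under a given subdirectory is accepted by one and
only one poller, so no file is picked up twice and none is left unowned.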



On Thu, Sep 27, 2018 at 6:48 PM Claus Ibsen <claus.ibsen@xxxxxxxxx> wrote:

> Hi
>
> Have you tried with a custom filter where you can check file size and
> skip small files?
> On Thu, Sep 27, 2018 at 2:16 PM Sunil O <sunilomk2002@xxxxxxxxx> wrote:
> >
> > We have an implementation scenario with a File consumer polling a large
> > number of folders (3000+) and 200K files per day, with a 1 minute SLA
> > per file processing.
> >
> > We also need to use readLock to avoid picking files which are being
> > written.
> >
> > In this scenario we went for the readLock=changed option. However, this
> > option results in the thread sleeping for at least the time specified by
> > the readLockCheckInterval option. While looking for workarounds, we
> > found the readLockMinAge option, which allows picking up files that are
> > old enough without going into sleep mode. This has reduced the time
> > for picking up files, thereby reducing the overall processing time.
> >
> > However, whenever a file is encountered with an age below
> > readLockMinAge, the sleep occurs as per the logic in
> > FileChangedExclusiveReadLockStrategy. If this 'sleep' step can be avoided
> > when readLockMinAge is specified, then instead of sleeping the consumer
> > can go on to pick up other files. This modified behavior would be useful
> > in scenarios where overall throughput and processing performance are more
> > important than sequential processing etc.
> >
> >
> > While browsing similar issues - found JIRA issue 9324 which also
> > discusses the issue regarding the Sleep step -
> >
> > So it would be good if one of the below is available
> >
> > a)  there is a separate ExclusiveReadLockStrategy similar to
> > FileChangedExclusiveReadLockStrategy which deals only with
> > readLockMinAge and skips the file if the age is not met instead of
> > sleeping.
> >
> > Or
> >
> > b) an option skip/sleep should be added for
> > FileChangedExclusiveReadLockStrategy when readLockMinAge is used.
> >
> >
> > Please give your suggestions.
>
>
>
> --
> Claus Ibsen
> -----------------
> http://davsclaus.com @davsclaus
> Camel in Action 2: https://www.manning.com/ibsen2
>