Hi Scott and Jozef,
Sorry for the late answer, I missed the email.
Well, MetricsPusher aggregates the metrics just as PipelineResult.metrics() does, but it does so at a configurable interval and exports the values. This means that if you configure the export to run every 5 s, you will get the value of the distribution, aggregated across workers, every 5 s. The value is not reset between exports. For example, at t = 0 + 5 s, if the max received until then is 10, the exported value will be 10. Then, at t = 0 + 10 s, if the distribution was updated with a 5, it will still report 10. Then, at t = 0 + 15 s, if the distribution was updated with an 11, it will export 11.
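To make the timeline above concrete, here is a minimal sketch in plain Python that only simulates the cumulative behaviour. It does not use Beam or MetricsPusher: the class CumulativeDistribution and the function export_snapshots are hypothetical names made up for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class CumulativeDistribution:
    """Hypothetical stand-in for a distribution metric that is never reset."""
    count: int = 0
    total: float = 0
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value):
        # Every update is folded into the running aggregate.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


def export_snapshots(updates_by_tick, ticks):
    """Return (tick, max) pairs as a periodic exporter would report them."""
    dist = CumulativeDistribution()
    snapshots = []
    for tick in ticks:
        for value in updates_by_tick.get(tick, []):
            dist.update(value)
        # The aggregate is read at each export, not cleared.
        snapshots.append((tick, dist.maximum))
    return snapshots


# Updates from the example: 10 before t=5s, 5 before t=10s, 11 before t=15s.
updates = {5: [10], 10: [5], 15: [11]}
snapshots = export_snapshots(updates, [5, 10, 15])
print(snapshots)  # [(5, 10), (10, 10), (15, 11)]
```

The export at t = 10 s still shows 10 because the 5 never exceeds the max seen so far.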
As metrics are global and not bound to windows the way PCollection elements are, you will always get the cumulative value (that is the essence of the distribution metric). So I agree with Scott: for your use case it is better to treat the metric as if it were an element and compute it downstream, so that it can be bound to a window.
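As a rough illustration of what "bound to a window" buys you, the sketch below groups timestamped values into fixed windows and computes min and max per window. It is plain Python, not Beam code: min_max_per_window and the 5 s window size are assumptions for the example, and in a real pipeline this would be a windowed combine over the elements.

```python
from collections import defaultdict


def min_max_per_window(timestamped_values, window_size_s):
    """Group (timestamp_s, value) pairs into fixed windows; return bounds per window."""
    windows = defaultdict(list)
    for timestamp_s, value in timestamped_values:
        # Fixed windows: each timestamp falls into exactly one window.
        window_start = (timestamp_s // window_size_s) * window_size_s
        windows[window_start].append(value)
    return {start: (min(vals), max(vals)) for start, vals in sorted(windows.items())}


# Same values as in the example, with hypothetical event times in seconds.
events = [(1, 10), (7, 5), (12, 11)]
per_window = min_max_per_window(events, 5)
print(per_window)  # {0: (10, 10), 5: (5, 5), 10: (11, 11)}
```

Here the window starting at 5 s reports a max of 5, whereas the cumulative metric would still report 10.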
On Saturday, 02 June 2018 at 08:01 +0300, Jozef Vilcek wrote:
Nothing special about the use case. I just want to monitor the upper and lower bounds of some data flowing through an operator.
The "report interval" is currently 30 seconds and is independent of business logic. It is the one mentioned here:
and its value is set as a trade-off between how granular and how fast I want to see changes in what is going on in the pipeline, and how many resources in the time-series database I dedicate to it.
Thanks for looking into it