git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

increasing parallelism increases the end2end latency in flink sql


Hi,


My application assigned timestamp to kafka event with BoundedOutOfOrdernessTimestampExtractor then converted them to a table. Finally flink SQL over-window aggregation irun against the table


When I double the parallelism of my flink application, the end2end latency is doubled.  What could be the cause? It seems to me that it's because of slower advance of watermark in operator of operators generated by sql.


In this email thread [1], it's said that flink sql remove the internal DataStream timestamp and move it into the record. Does the query ignore the internal DataStream watermarks and re-generate them from the record? Let say there are two operator instances for one task, do they have same watermark?


There is a similar issue that i can find in the email thread [2] .


Best

Yan


[1]: https://lists.apache.org/thread.html/c5182628272f018037ce832290f9b19976fe5c268aa72760635cf3cc@%3Cuser.flink.apache.org%3E

[2]: https://lists.apache.org/thread.html/bf789df06e979f80caf23f6b2c8676aaf07b007ae0d450ae887b6a82@%3Cuser.flink.apache.org%3E



( ! ) Warning: include(msgfooter.php): failed to open stream: No such file or directory in /var/www/git/apache-flink-users/msg09412.html on line 120
Call Stack
#TimeMemoryFunctionLocation
10.0009376888{main}( ).../msg09412.html:0

( ! ) Warning: include(): Failed opening 'msgfooter.php' for inclusion (include_path='.:/var/www/git') in /var/www/git/apache-flink-users/msg09412.html on line 120
Call Stack
#TimeMemoryFunctionLocation
10.0009376888{main}( ).../msg09412.html:0