Jobs running on a yarn per-job cluster fail to restart when a task manager is lost


Hi,

I am running a streaming job without checkpointing enabled. A failure-rate restart strategy has been set with StreamExecutionEnvironment.setRestartStrategy.
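For reference, the strategy is set along these lines. This is a minimal sketch, assuming the Flink 1.4 Java API (RestartStrategies.failureRateRestart with org.apache.flink.api.common.time.Time); the class name, the numeric values, and the placeholder pipeline are illustrative and are not taken from the actual job.

```java
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical class name, used only for this illustration.
public class RestartStrategyExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing is deliberately not enabled, as in the job described here.

        // Placeholder values (assumptions, not the real settings):
        // at most 3 failures per 5-minute interval,
        // with a 10-second delay between restart attempts.
        env.setRestartStrategy(RestartStrategies.failureRateRestart(
                3,
                Time.of(5, TimeUnit.MINUTES),
                Time.of(10, TimeUnit.SECONDS)));

        // Trivial placeholder pipeline so the sketch can be executed.
        env.fromElements(1, 2, 3).print();
        env.execute("restart-strategy-example");
    }
}
```

Running it requires the flink-streaming-java dependency on the classpath.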

When a task manager is lost because of memory problems, the job manager tries to restart the job without launching a new task manager, and the restart fails with NoResourceAvailableException: Not enough slots available to run the job.

The job is running on Flink 1.4.2 and Hadoop 2.7.4.

