Re: Cancelled job not showing its details

Hi Julio,

This might be a bug in the job stats. Could you please create an issue in Jira describing the steps you took and attaching the complete logs?


On 2 Oct 2018, at 21:11, Julio Biason <julio.biason@xxxxxxxxx> wrote:

Oh, another piece of information:

Because the job was failing and restarting, I did a cancel via the CLI tool during one of the restarts.

On Tue, Oct 2, 2018 at 4:03 PM, Julio Biason <julio.biason@xxxxxxxxx> wrote:

I had a job that was failing -- a bug in our code -- so I decided to cancel it and deploy the fix. Because I couldn't create a savepoint due to the job restarting, I decided to kill it anyway and use the web interface to get the last successful checkpoint.

The problem is: the interface is not showing anything for the job. The details page shows nothing, not even the pipeline.

The only thing that seems related in the JobManager logs is this:

2018-10-02 19:03:14,214 [] ERROR  - Implementation error: Unhandled exception.
java.lang.IllegalArgumentException: Negative number of in progress checkpoints
        at org.apache.flink.util.Preconditions.checkArgument(
        at org.apache.flink.runtime.checkpoint.CheckpointStatsCounts.<init>(
        at org.apache.flink.runtime.checkpoint.CheckpointStatsCounts.createSnapshot(
        at org.apache.flink.runtime.checkpoint.CheckpointStatsTracker.createSnapshot(
        at org.apache.flink.runtime.executiongraph.ExecutionGraph.getCheckpointStatsSnapshot(
        at org.apache.flink.runtime.executiongraph.ArchivedExecutionGraph.createFrom(
        at org.apache.flink.runtime.jobmaster.JobMaster.requestJob(
        at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(

Julio Biason, Software Engineer
AZION  |  Deliver. Accelerate. Protect.
Office: +55 51 3083 8101  |  Mobile: +55 51 99907 0554
