Re: problem submitting job, it hangs there


Many Thanks :)

Best regards/祝好,

Chang Liu 刘畅


On 14 Dec 2018, at 11:09, Tzu-Li Chen <wander4096@xxxxxxxxx> wrote:

Hi Chang,

I think there is a JIRA issue[1] aimed at hardening this case.

In fact, Flink creates this directory on startup, and since there were no
other warnings, we can assume that it was created. So it might have been
deleted later by some cleanup process (run by Flink or by the file system).



On Fri, 14 Dec 2018 at 5:28 PM, Chang Liu <fluency.03@xxxxxxxxx> wrote:
My question is: whatever a Flink user does, as long as they perform all their actions through the Flink-provided means (the Flink CLI or the Flink APIs in code), they should not be able to touch this directory, right?

Because this directory is for the JobManager and managed by Flink.

Best regards/祝好,

Chang Liu 刘畅


On 14 Dec 2018, at 10:23, Chang Liu <fluency.03@xxxxxxxxx> wrote:

Hi Chesnay,

What do you mean by "...we can make a small adjustment to the code…"? Do you mean that I, as a Flink application developer, can do this in my code, or does it have to be a code change in Flink itself?

And more importantly, I would like to pinpoint the root cause of this, because I cannot just manually create such a directory in Production.

Many thanks :)

Best regards/祝好,

Chang Liu 刘畅


On 13 Dec 2018, at 14:51, Chesnay Schepler <chesnay@xxxxxxxxxx> wrote:

The directory is automatically created when Flink is started; maybe it was deleted by some cleanup process?

In any case we can make a small adjustment to the code to create all required directories when they don't exist.
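
To illustrate the kind of adjustment meant here, the following is a minimal sketch and not the actual Flink change. The stack trace in this thread shows the upload handler calling Files.createDirectory, which creates only the last path element and fails with NoSuchFileException when the parent (flink-web-upload) is missing. Files.createDirectories creates any missing parents as well. The class name UploadDirSketch and both helper methods are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.UUID;

public class UploadDirSketch {

    // Mirrors the failing call from the stack trace: createDirectory() creates
    // only the last path element, so a missing parent directory makes it throw
    // NoSuchFileException. Returns true if that failure occurred.
    static boolean failsWithoutParent(Path uploadDir) throws IOException {
        try {
            Files.createDirectory(uploadDir.resolve(UUID.randomUUID().toString()));
            return false;
        } catch (NoSuchFileException e) {
            return true;
        }
    }

    // Hypothetical adjustment: createDirectories() creates every missing parent
    // and does not fail for directories that already exist.
    static Path createUploadSubdir(Path uploadDir) throws IOException {
        return Files.createDirectories(uploadDir.resolve(UUID.randomUUID().toString()));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for /tmp/flink-web-<uuid>; the upload directory below is
        // deliberately not created, simulating deletion by a cleanup process.
        Path webTmp = Files.createTempDirectory("flink-web-");
        Path uploadDir = webTmp.resolve("flink-web-upload");

        System.out.println("createDirectory failed: " + failsWithoutParent(uploadDir));

        Path created = createUploadSubdir(uploadDir);
        System.out.println("createDirectories succeeded: " + Files.isDirectory(created));
    }
}
```

With this approach the per-upload directory can be created even if the parent was removed after startup, at the cost of silently recreating a directory whose disappearance may point to a separate problem.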

On 13.12.2018 14:46, Chang Liu wrote:
Dear All,

I found a workaround and job submission now works: I manually created the directory flink-web-upload under the directory /tmp/flink-web-ec768ff6-1db1-4afa-885f-b2828bc31127 .

But I don’t think this is the proper solution. Flink should be able to create such a directory automatically. Any ideas? Many Thanks.

Best regards/祝好,

Chang Liu 刘畅


On 13 Dec 2018, at 12:01, Chang Liu <fluency.03@xxxxxxxxx> wrote:

Dear all,

I am trying to submit a job but it got stuck here:

...
2018-12-13 10:43:11,476 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2018-12-13 10:43:11,476 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2018-12-13 10:43:11,476 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2018-12-13 10:43:11,476 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2018-12-13 10:43:11,480 INFO  org.apache.flink.client.program.rest.RestClusterClient        - Submitting job 7d3078bfc22b225351a3178e3e6be992 (detached: true).


And I got this:

...
2018-12-13 10:43:12,292 WARN  org.apache.flink.runtime.rest.FileUploadHandler               - File upload failed.
java.nio.file.NoSuchFileException: c/flink-web-upload/9cdc0d1e-7610-4d50-a665-a614fc8d75e9
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
        at java.nio.file.Files.createDirectory(Files.java:674)
        at org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:105)
        at org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:67)
        at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
        at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1407)
        at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1177)
        at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1221)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
        at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
        at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
        at org.apache.flink.shaded.netty4.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)


Could you please help me solve this issue? Many Thanks :)

Best regards/祝好,

Chang Liu 刘畅