Re: Failed to fetch BLOB - IO Exception


Hi Manjusha,
If you are running on EMR with one of Amazon's Linux AMIs, for example,
you may have hit a trap that Lasse described in his Flink Forward talk
[1]: these images ship with a default cron job that deletes files under
/tmp that have not been accessed recently. By default, the BLOB server's
storage directory (blob.storage.directory) is under /tmp, and on the
JobManager the files in it are only accessed during deployments, so they
look stale to the cleanup job and get deleted.
A solution is to change the BLOB storage directory to a location outside
/tmp.
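
As a minimal sketch, the change could look like this in flink-conf.yaml.
The path /var/lib/flink/blob-storage is only an assumed example and does
not come from this thread; any directory that the Flink processes can
write to and that is not covered by the /tmp cleanup should do:

```yaml
# flink-conf.yaml
# Assumed example path: pick a directory outside /tmp that the
# JobManager and TaskManager processes are allowed to write to.
blob.storage.directory: /var/lib/flink/blob-storage
```

As far as I know, the setting is only read at startup, so the JobManager
and the TaskManagers would need a restart to pick it up.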


Nico

[1]
https://data-artisans.com/flink-forward-berlin/resources/our-successful-journey-with-flink

On 23/10/2018 10:27, Manjusha Vuyyuru wrote:
> Hello,
> 
> Checkpointing to hdfs.
> state.backend.fs.checkpointdir: hdfs://flink-hdfs:9000/flink-checkpoints
> state.checkpoints.num-retained: 2
> 
> Thanks,
> Manjusha
> 
> 
> On Tue, Oct 23, 2018 at 1:05 PM Dawid Wysakowicz <dwysakowicz@xxxxxxxxxx> wrote:
> 
>     Hi Manjusha,
> 
>     I am not sure what is wrong, but Nico or Till (cc'ed) might be able
>     to help you.
> 
>     Best,
> 
>     Dawid
> 
>     On 23/10/2018 06:58, Manjusha Vuyyuru wrote:
>>     Hello All,
>>
>>     I have a job that fails roughly every 14 days with an
>>     IOException: failed to fetch BLOB.
>>     I submitted the job from the command line using a Java jar. Below
>>     is the exception I'm getting:
>>
>>     java.io.IOException: Failed to fetch BLOB d23d168655dd51efe4764f9b22b85a18/p-446f7e0137fd66af062de7a56c55528171d380db-baf0b6bce698d586a3b0d30c6e487d16 from flink-job-mamager/10.20.1.85:38147 and store it under /tmp/blobStore-e3e34fec-22d9-4b3c-b542-0c1e5cdcf896/incoming/temp-00000022
>>     	at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:191)
>>     	at org.apache.flink.runtime.blob.AbstractBlobCache.getFileInternal(AbstractBlobCache.java:177)
>>     	at org.apache.flink.runtime.blob.PermanentBlobCache.getFile(PermanentBlobCache.java:205)
>>     	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:119)
>>     	at org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:878)
>>     	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589)
>>     	at java.lang.Thread.run(Thread.java:748)
>>     Caused by: java.io.IOException: GET operation failed: Server side error: /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>>     	at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:253)
>>     	at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:166)
>>     	... 6 more
>>     Caused by: java.io.IOException: Server side error: /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>>     	at org.apache.flink.runtime.blob.BlobClient.receiveAndCheckGetResponse(BlobClient.java:306)
>>     	at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:247)
>>     	... 7 more
>>     Caused by: java.nio.file.NoSuchFileException: /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>>     	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>>     	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>>     	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>>     	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
>>     	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
>>     	at java.nio.file.Files.move(Files.java:1395)
>>     	at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:452)
>>     	at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:521)
>>     	at org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
>>     	at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)
>>     All BLOB-related settings are at their defaults; I didn't change anything.
>>     Can someone help me fix this issue?
>>     Thanks,
>>     Manjusha
> 

-- 
Nico Kruber | Software Engineer
data Artisans
