Issue with Flink not being able to properly read the ResourceManager address in an HA setup
Hi, I am trying to create a Flink cluster on YARN by running the following command, but the logs show that it is unable to connect to the ResourceManager:
~/flink-1.5.4/bin/yarn-session.sh -n 5 -tm 2048 -s 4 -d -nm flink_yarn
I found a Stack Overflow post where someone mentioned that this can happen when the Hadoop version packaged with Flink differs from the Hadoop version on the node, so that Flink is not able to properly read the ResourceManager address in an HA setup. However, I confirmed that the versions are the same in my case: I downloaded flink-1.5.4-bin-hadoop26-scala_2.11, and running hadoop version on the node reports Hadoop 2.6.0-cdh5.14.0. Does anyone have any ideas on what else the issue could be?
Additional info: The cluster I am running this on is Kerberized, so I am not sure whether that plays into the issue. I set up flink-conf to use the Kerberos ticket cache and ran kinit before trying to stand up the cluster. I verified that the ticket cache was generated by running klist (logs in the gist).
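In case it helps narrow things down, here is a minimal sketch of a check for whether the yarn-site.xml visible on the client node defines the ResourceManager HA properties at all (yarn.resourcemanager.ha.enabled, yarn.resourcemanager.ha.rm-ids, and the per-RM yarn.resourcemanager.address.* entries). The function name show_rm_ha_props is made up for illustration, and the sketch assumes the usual layout where each <name> line is followed by its <value> line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink or Hadoop): print the YARN
# ResourceManager HA properties found in a given yarn-site.xml.
show_rm_ha_props() {
    local site_file="$1"

    # Fail early if the file is missing or unreadable.
    if [ ! -r "$site_file" ]; then
        echo "cannot read $site_file" >&2
        return 1
    fi

    # Print each matching <name> line plus the line after it, which is
    # assumed to hold the corresponding <value>.
    grep -A1 -E '<name>yarn\.resourcemanager\.(ha\.enabled|ha\.rm-ids|address(\.[^<]+)?)</name>' "$site_file"
}
```

It could be called as show_rm_ha_props "$HADOOP_CONF_DIR/yarn-site.xml", assuming HADOOP_CONF_DIR is the directory the Flink client reads its Hadoop configuration from. If nothing is printed, or HADOOP_CONF_DIR is unset, the client may be falling back to a default ResourceManager address rather than the HA ones.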