Re: [DISCUSS] Flink Kerberos Improvement


Hi Shuyi,

Yes, I think impersonation is a very valid question. It can actually be
considered as two questions, as I stated in the doc.
1. In the doc I stated that impersonation should be implemented in the
user-side code, which should only invoke the cluster client as the actual
user 'joe'.
2. However, since the cluster client currently assumes no impersonation at
all, much of the code assumes that a fully authorized client can be
instantiated with the same authority that the actual Flink cluster has.
When impersonation is enabled, this might not be the case. For example, if
impersonation is in place, the cluster client running on joe's behalf most
likely will not, and should not, have access to the keytab file of 'joe'.
Instead, a delegation token is used. The second part of the doc tries to
address this issue.
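
For context, Hadoop's proxy-user mechanism is what lets a superuser act on
behalf of 'joe'. A minimal core-site.xml sketch, assuming a hypothetical
superuser named 'super' and placeholder host and group values (none of these
names come from the proposal doc):

```xml
<!-- core-site.xml fragment; 'super', the host names, and the group
     are illustrative assumptions, not values from the proposal -->
<property>
  <name>hadoop.proxyuser.super.hosts</name>
  <value>host1,host2</value>
</property>
<property>
  <name>hadoop.proxyuser.super.groups</name>
  <value>group1</value>
</property>
```

With settings along these lines, 'super' connecting from host1 or host2 may
impersonate users belonging to group1. The superuser authenticates with its
own Kerberos credentials, so 'joe' needs no keytab on the client side.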

--
Rong

On Mon, Dec 17, 2018 at 11:41 PM Shuyi Chen <suez1224@xxxxxxxxx> wrote:

> Hi Rong, thanks a lot for the proposal. Currently, Flink assumes the keytab
> is located in a remote DFS. Pre-installing keytabs statically in the YARN
> node local filesystem is a common approach, so I think we should support
> this mode in Flink natively. As an optimization to reduce the KDC access
> frequency, we should also support method 3 (the DT approach) as discussed
> in [1]. One question: why do we need to implement impersonation in Flink?
> I assume the superuser can do the impersonation for 'joe' and 'joe' can
> then invoke the Flink client to deploy the job. Thanks a lot.
>
> Shuyi
>
> [1]
>
> https://docs.google.com/document/d/10V7LiNlUJKeKZ58mkR7oVv1t6BrC6TZi3FGf2Dm6-i8/edit
>
> On Mon, Dec 17, 2018 at 5:49 PM Rong Rong <walterddr@xxxxxxxxx> wrote:
>
> > Hi All,
> >
> > We have been experimenting with integrating Kerberos with Flink in our
> > corp environment and found some limitations in the current Flink-Kerberos
> > security mechanism when running on Apache YARN.
> >
> > Based on the Hadoop Kerberos security guide [1], only a subset of the
> > suggested long-running service security mechanisms is supported in Flink.
> > Furthermore, the current model does not work well with a superuser
> > impersonating actual users [2] for deployment purposes, which is a widely
> > adopted way to launch applications in corp environments.
> >
> > We would like to propose an improvement [3] to introduce the other common
> > methods [1] for securing long-running applications on YARN and to enable
> > impersonation mode. Any comments and suggestions are highly appreciated.
> >
> > Many thanks,
> > Rong
> >
> > [1]
> > https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Securing_Long-lived_YARN_Services
> > [2]
> > https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
> > [3]
> > https://docs.google.com/document/d/1rBLCpyQKg6Ld2P0DEgv4VIOMTwv4sitd7h7P5r202IE/edit?usp=sharing
> >
>
>
> --
> "So you have to trust that the dots will somehow connect in your future."
>