Re: 1.10.0beta1 now available for download

Hi Jakob,

I have the feeling we are on different wavelengths and are not getting any closer :-(.

Remarks inline.

> On 2 May 2018, at 22:56, Jakob Homan <jghoman@xxxxxxxxx> wrote:
> Hey Bolke-
>  Stabilizing the tree has nothing to do with getting a release
> through IPMC.  The IPMC doesn't test the code - it only verifies that
> the required licenses and legal obligations are met, that the release
> artifacts meet the requirements to be processed through ASF's
> publishing infra, etc.  Minor issues like a couple missing headers are
> also now generally let through (since it's an incubator release), to
> be fixed in the next go around.  And honestly, if a podling that
> believes it's ready to graduate is still having trouble making sure
> correct license headers are applied, that's a giant red flag that
> there may be large remaining Apache Way issues to address.  Podlings
> demonstrating their ability to follow ASF process and operate in the
> Apache Way is the crux of the incubation process.

The header XYZ was just an example. I think we are doing reasonably
fine on the process with a couple of hiccups here and there. And some
discussions like this of course ;-)

>   Also, I entirely agree with you that it's much harder than it
> should be to get the requisite votes from the IPMC.  I do quite a lot
> of vote munging for all the podlings with which I'm involved and it's
> always annoying.  The IPMC, like all of ASF, is made up of volunteers
> and is not always as responsive as it should be.

Fully understood and thank you for the effort!

>  The process you describe as not having merit is a large part of the
> ASF.  Specifically, preparing a release candidate is pretty well
> documented across ASF [1,2,3,4].  The goals you describe (stabilizing
> a new feature like the Kubernetes executor, and rallying people to try
> the code and fix bugs) are exactly what the RC process does as well.
> And once the community gets experience in rolling RCs and running
> release votes, it's actually not that much work.  Lots of projects
> have multiple releases (main branches and bug fixes) going nearly
> constantly.

Here, I don’t follow you anymore. I don’t agree that the release process
as described mentions Release Candidates at all. It mentions shepherding
from an initial consensus to a final distribution. It doesn’t mention how to
do this (e.g. by having a release candidate); it just mentions output criteria.
So it is up to the release manager, in consensus with the community, to decide
how to get to a release. It is not stipulated that we need release candidates.

The different projects seem to have different ways of shepherding.

>  I'm not suggesting that we vote on alpha/beta releases.  I'm
> pointing out that the goals of the alpha release as you describe them
> match up very well with the goals of the RC process - which will need
> to be done subsequently anyway.  I'm also saying that announcing
> 'betas' doesn't really jive with how ASF expects artifacts to be
> released or voted on for release (which, if an artifact is up on
>, it most definitely appears to be).  My suggestion
> would be to take this very meritorious effort and make it official.
> Pick a release manager, create a branch, roll an RC, ask for people to
> stabilize it, merge bug fixes but not features to the branch, repeat
> until an RC passes a vote.

I don’t think they match. Firstly, the beta is not a release and we don’t
intend to make it one. Secondly, a Release Candidate has a meaning
attached to it in our community. It says: we think it is ready for production,
but we are not entirely sure yet (we don’t give you a support contract).

Together with Fokko, I am Release Manager for the upcoming 1.10.0 release.
I’m not prepared (as Release Manager) to create an RC. I think we would expose
people to too many risks, and it requires more `shepherding` before we can
put up a vote.

We have branched v1-10-test. When we are ready to go to Release Candidate
we will branch off v1-10-stable from v1-10-test. We called the current state of 
v1-10-test “beta” and we made a convenience tarball. How is this different from 
nightlies and snapshots?

So we got all your boxes ticked. Except ‘roll an RC’. We are not in that
phase yet. Sure the process looks a bit like it (hey we practice some steps
of the Apache Way), but it is not. I don’t think anyone is confused by that.

Nevertheless, I’m pretty surprised (astonished even) that some projects are
indeed voting on Alphas and Betas. They call them releases as well. That’s a lot of
effort for an artefact that will get little exposure [1]. Tomcat votes on Alphas
as well [2]. Still, JMeter publishes snapshots in an Apache repo [3], as do many
others [5], but they are all Java projects.

I still think it is a matter of semantics. The Python community at large
does not consider an “alpha” or “beta” tag to be a release but a phase
(PEP 440; a PEP has the status of an RFC), although the PEP meanders a
little in its wording (“Final Release”) [4]. Apache itself does not seem
to stipulate it, but its projects seem to consider an “alpha” or a “beta”
to be a release, which thus needs to be voted upon. However, those projects
do publish snapshots and nightlies. These types of non-releases are very
Java-esque, as in Maven you cannot include a snapshot dependency when you
do a release. The same goes for “beta”, “alpha” or “rc” with Python’s
package managers, as they do not consider these to be releases.
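
To make the PEP 440 point concrete, here is a minimal sketch of how a
version string maps to a phase, and how an installer that skips
pre-releases by default (as pip does unless --pre is given) would choose
between candidates. It is an illustration only, not pip's actual
implementation: the helper names phase_of and pick_default are hypothetical,
and the pattern is a simplification that handles plain N.N.N versions with
an optional a/b/rc suffix rather than the full PEP 440 grammar.

```python
import re

# Simplified pattern (an assumption for this sketch): a dotted numeric
# release with an optional a/b/rc pre-release suffix. Epochs, post and
# dev releases, local versions and spellings like "beta" are left out.
_VERSION = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)(?:(?P<phase>a|b|rc)(?P<n>\d+))?$"
)

_PHASES = {"a": "alpha", "b": "beta", "rc": "release candidate"}


def phase_of(version):
    """Return 'alpha', 'beta', 'release candidate' or 'final'."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    # No suffix means the 'phase' group is None, i.e. a final release.
    return _PHASES.get(match.group("phase"), "final")


def pick_default(candidates):
    """Pick the newest final release, ignoring all pre-releases."""
    finals = [v for v in candidates if phase_of(v) == "final"]
    return max(
        finals,
        key=lambda v: tuple(int(part) for part in v.split(".")),
        default=None,
    )


if __name__ == "__main__":
    available = ["1.9.0", "1.10.0b1", "1.10.0rc1"]
    for v in available:
        print(v, "->", phase_of(v))
    print("default pick:", pick_default(available))
```

With these three candidates, the sketch treats 1.10.0b1 and 1.10.0rc1 as
phases on the way to 1.10.0 and falls back to 1.9.0 as the default choice.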

As mentioned, I am not going to tag an RC and create the artefacts for it now. I
would feel irresponsible in doing so. We are pretty close but not there yet.
I’m also not going to call a vote for a pre-release beta, because before
the vote has ended we will have another pre-release.

So what to do? I have the feeling that we are stuck between a rock and a hard place.
Obviously, if the community thinks otherwise and I am not seeing it correctly,
I’ll step down as release manager so someone else can pick it up.

- Bolke


> Thanks,
> Jakob
> [1]
> [2]
> [3]
> [4]
> On 2 May 2018 at 10:40, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>> Hi Jakob,
>> This ‘release’ is not effectively a RC. We want to have the kubernetes
>> executor stabilised or at least passing its own tests before we like to move
>> to RC status. People also tend to rally to have some extra bugfixes in or
>> some extra features when we announce “beta” status. Given the fact that
>> going from 1.9 to 1.10 is a big leap, I think it is important to have
>> a period to funnel towards an RC/Release.
>> Gotcha on httpd. However, it still seems like semantics to me. I would equate
>> a Spark nightly somewhat to an Airflow alpha. A snapshot somewhat
>> to a beta. I.e. for Airflow ‘alphas’ and ‘betas’ are not releases, not from a
>> process perspective and not from a technical perspective.
>> Practically, I think we need a way to stabilise the tree so we have
>> a reasonable confidence we can pass a vote for a ‘real’ release, which is a
>> technical vote of confidence and a process vote of confidence. Voting
>> on alphas (equivalent to a nightly) and betas would make this a very
>> cumbersome process. Particularly as a podling: getting 3 votes at the IPMC
>> is a tough process (I’ve been physically going around at a conference to
>> obtain votes last year). If we then get a “no you can’t have an alpha because
>> header XYZ is missing” it kind of defeats the purpose of having alphas
>> from the process side (which you are basically saying). However, it still
>> has a technical merit.
>> What would your suggestion be? I’m really afraid of getting stuck
>> in process and the process, to me currently, does not seem to have the merit
>> we are looking for*. We might have a different understanding
>> what we consider to be a ‘release’ though. So open to suggestions
>> (also from the wider community here :) ).
>> Cheers
>> Bolke
>> * dont misunderstand me here please, for Releases (e.g. 1.10.0 with no extra
>> label) I’m quite okay.
>>> On 1 May 2018, at 23:51, Jakob Homan <jghoman@xxxxxxxxx> wrote:
>>> Hey-
>>>  Correct, we can publish nightlies and SNAPSHOTs, but those are not
>>> releases.  Also, if a community votes to consider a release alpha or
>>> beta, it may do so (From the httpd link, "Based on the community's
>>> confidence in the code, the potential release is tagged as alpha, beta
>>> or general availability (GA) and the candidate and is voted in that
>>> manner."), but this is an indicator of the technical quality of the
>>> actual release, not the point in the release's lifecycle.
>>>  My question is - if this  release is effectively an RC, why not
>>> make it officially so? What's the goal of the beta compared to an RC?
>>> As a mentor, I see an invitation for users to come and test some work
>>> that could potentially be a release.  That's what we ask for during a
>>> release process, along with the release manager activity, publishing
>>> to specified locations, etc.  It would be good to demonstrate we can
>>> do that well.
>>> Thanks,
>>> Jakob
>>> On 1 May 2018 at 14:31, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>>>> Hi Jakob,
>>>> To be honest I’m confused now. In software land (and I assume you know)
>>>> Alpha -> Beta -> RC -> Release is well known and so well established that I would
>>>> be surprised if anyone got confused by that. Even the oldest project from Apache
>>>> have alpha-s and beta-s ( and something
>>>> called GA which is equal to a release I guess.
>>>> If you would expect people to pick up from a git tag and build from there and then report back
>>>> to us, that doesn’t really happen. We are always having a challenge to have enough test surface,
>>>> that would diminish that surface.
>>>> Other projects also “publish” other than voted upon artefacts. E.g. Spark has nightly builds and SNAPSHOTS.
>>>> A snapshot clearly has a different state than a nightly. Apache Flink states that 1.4.2 is their latest stable release.
>>>> So there seems to be a “non-stable” release as well. I did see that their git repositories only mention “RC-X” tags
>>>> or branches.
>>>> Reading through it does not mention anywhere
>>>> that we need to have RCs. It just states that if you want to do a release you need to call a vote and for distribution
>>>> it must be at a certain location. As mentioned this is a “beta” which is not a “release”. We haven’t released it either as
>>>> it wasn’t voted upon and no vote was called. It was just made available for convenience of the community.
>>>> So I am not sure what is expected from us here. How do we go through a dev -> test -> acc -> prod release process
>>>> together with the community? The release process you seem to be referring to is only part of the last state imho. Or
>>>> do we need to call a vote on every state change?
>>>> Cheers
>>>> Bolke
>>>>> On 1 May 2018, at 22:47, Jakob Homan <jghoman@xxxxxxxxx> wrote:
>>>>> Hey Bolke-
>>>>> To be clear, I'm not suggesting anyone is trying to do anything
>>>>> wrong.  Release wasn't mentioned, but a new tar ball with a new
>>>>> version number with a 'beta' tag is published in some way for people
>>>>> to come and test.  How is that different than the expected release/RC
>>>>> process (specify a git point, offer a tar ball, add an RCx tag and
>>>>> invite people to test that)?  Seems like a parallel process with lots
>>>>> of similarities that could confuse both our end users and the IPMC.
>>>>> Thanks,
>>>>> Jakob
>>>>> On 1 May 2018 at 13:08, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:
>>>>>> Hi Jakob,
>>>>>> Understood. But isn’t that, in this case, just wording? I.e. this is a tarball that we think is beyond just developer testing (alpha) and more towards the enthusiasts (beta), but not a version of the tarball that is for the general public to test (RC) and not a Release (release)? I.e. is the issue in calling it a ‘release’, which in this case is just meta for a tarball? In the original email I never mentioned the word release in conjunction with the beta, I think.
>>>>>> Cheers
>>>>>> Bolke
>>>>>>> On 1 May 2018, at 22:01, Jakob Homan <jghoman@xxxxxxxxx> wrote:
>>>>>>> Hey all-
>>>>>>> With my Mentor hat on, I need to point out that ASF doesn't really
>>>>>>> have beta releases.  This work is awesome, but really needs to go
>>>>>>> through the proper steps.  The Release Candidate process is pretty
>>>>>>> well described:
>>>>>>>  This is
>>>>>>> particularly important since, as was mentioned, graduation should be
>>>>>>> imminent and this process will be heavily scrutinized.
>>>>>>> -Jakob
>>>>>>> On 1 May 2018 at 12:41, James Meickle <jmeickle@xxxxxxxxxxxxxx> wrote:
>>>>>>>> Thanks for the pointer! I went through and set this up today, using Google
>>>>>>>> OAuth as the RBAC provider. Overall I'm quite enthusiastic about this move,
>>>>>>>> but I thought that it might be helpful to collect feedback as someone who
>>>>>>>> hasn't been following the overall process and is therefore coming at it
>>>>>>>> with fresh eyes.
>>>>>>>> - The Flask appbuilder security documentation is poor quality (e.g.,
>>>>>>>> there's some broken sentences); if Airflow is to send people there, it
>>>>>>>> might be worth PRing some of the docs to at least look more professional.
>>>>>>>> - There's not much documentation out there on how to properly set up an
>>>>>>>> OAuth app in Google (in my case, using the G+ API). From an adoption POV,
>>>>>>>> it would be good to screenshot the (current) steps in the process, and
>>>>>>>> point out which values should be used in which fields on Google. For
>>>>>>>> example, I had to grep the code base to find the callback URL.
>>>>>>>> - The initial login UI seems over-complex: you have to click the provider
>>>>>>>> icon, and then click either login or register. The standard for this
>>>>>>>> workflow is that you login by clicking the desired provider's icon, and
>>>>>>>> doing so will register you automatically if you aren't already. In my case
>>>>>>>> I only have one provider, so this menu was even more confusing.
>>>>>>>> - It was not clear to me that the "Public" role has absolutely no
>>>>>>>> permissions. When I set this as the default role and registered, I could no
>>>>>>>> longer access the site until I cleared cookies. I thought it was an OAuth
>>>>>>>> error at first, but it turns out the Public role has fewer effective
>>>>>>>> permissions than an anonymous user; this resulted in a redirect loop
>>>>>>>> because I could not even view the homepage. I had to correct this in the
>>>>>>>> database to be able to log in.
>>>>>>>> - The roles list (at roles/list/ ) is intimidatingly large and hard to
>>>>>>>> parse. For instance, I couldn't tell at a glance what "user" allows
>>>>>>>> relative to "viewer". It would be good to have a narrative description of
>>>>>>>> what each of these roles is intended for, and to present the list of
>>>>>>>> permissions in a more clustered or diffable way. Permissions lists tend to
>>>>>>>> only grow, after all.
>>>>>>>> - A "Viewer" currently lacks enough access to see their own profile.
>>>>>>>> - "User Statistics" (userstatschartview/chart/) uses the internal name,
>>>>>>>> rather than firstname/lastname - which in my case is a `google_idnumber`
>>>>>>>> name. Should probably show both names.
>>>>>>>> Unrelatedly to RBAC (I think), on this branch on my sandbox instance, tasks
>>>>>>>> appear to be failing with the only logs present in the UI as:
>>>>>>>> [{'end_of_log': True}, {'end_of_log': True}, {'end_of_log': True},
>>>>>>>> {'end_of_log': True}, {'end_of_log': True}, {'end_of_log': True}]
>>>>>>>> Finally, in case anyone else wanted to test run a similar setup, here is
>>>>>>>> the that I ended up using (note that it has Jinja
>>>>>>>> templating via Ansible):
>>>>>>>> import os
>>>>>>>> from airflow import configuration as conf
>>>>>>>> from import AUTH_OAUTH
>>>>>>>> basedir = os.path.abspath(os.path.dirname(__file__))
>>>>>>>> # The SQLAlchemy connection string.
>>>>>>>> SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')
>>>>>>>> # Flask-WTF flag for CSRF
>>>>>>>> CSRF_ENABLED = True
>>>>>>>> # The name to display, e.g. "Airflow Staging Sandbox"
>>>>>>>> APP_NAME = "Airflow {{ env }} {{ app_config | capitalize }}"
>>>>>>>> # Use OAuth
>>>>>>>> # Will allow user self registration
>>>>>>>> # The default user self registration role
>>>>>>>> AUTH_USER_REGISTRATION_ROLE = "{{ airflow_rbac_registration_role |
>>>>>>>> default('Viewer') }}"
>>>>>>>> # Google OAuth:
>>>>>>>> OAUTH_PROVIDERS = [{
>>>>>>>> # The name of the provider
>>>>>>>> 'name': 'google',
>>>>>>>> # The icon to use
>>>>>>>> 'icon': 'fa-google',
>>>>>>>> # The name of the key that the provider sends
>>>>>>>> 'token_key': 'access_token',
>>>>>>>> # Just in case, whitelist to only emails
>>>>>>>> 'whitelist': [''],
>>>>>>>> # Define the remote app:
>>>>>>>> 'remote_app': {
>>>>>>>> 'base_url': '',
>>>>>>>> 'access_token_url': '',
>>>>>>>> 'authorize_url': '',
>>>>>>>> 'request_token_url': None,
>>>>>>>> 'request_token_params': {
>>>>>>>> # Uses the Google+ API, requesting the 'email' and 'profile' scopes
>>>>>>>> 'scope': 'email profile'
>>>>>>>> },
>>>>>>>> 'consumer_key': '{{ vault_airflow_google_oauth_key }}',
>>>>>>>> 'consumer_secret': '{{ vault_airflow_google_oauth_secret }}'
>>>>>>>> }
>>>>>>>> }]
>>>>>>>> On Mon, Apr 30, 2018 at 12:54 PM, Jørn A Hansen <jornhansen@xxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>> On Mon, 30 Apr 2018 at 15.56, James Meickle <jmeickle@xxxxxxxxxxxxxx>
>>>>>>>>> wrote:
>>>>>>>>>> Installed this off of the branch, and I do get the Kubernetes executor
>>>>>>>>>> (incl. demo DAG) and some bug fixes - but I don't see any RBAC feature
>>>>>>>>>> anywhere I'd think to look. Do I need to set up some config to get that
>>>>>>>>> to
>>>>>>>>>> show up?
>>>>>>>>> See
>>>>>>>>> test/
>>>>>>>>> It had me left wondering as well - so I decided to go hunt for it in the
>>>>>>>>> RBAC PR. And there it was :-)
>>>>>>>>> Cheers,
>>>>>>>>> JornH
>>>>>>>>>> On Mon, Apr 23, 2018 at 2:06 PM, Bolke de Bruin <bdbruin@xxxxxxxxx>
>>>>>>>>> wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>> I am really happy that Fokko and I have created the v1-10-test branch
>>>>>>>>> and
>>>>>>>>>>> subsequently built the first beta of Apache Airflow 1.10!
>>>>>>>>>>> It is available for testing here:
>>>>>>>>>>> Highlights include:
>>>>>>>>>>> * New RBAC web interface in beta
>>>>>>>>>>> * Timezone support
>>>>>>>>>>> * First class kubernetes operator
>>>>>>>>>>> * Experimental kubernetes executor
>>>>>>>>>>> * Documentation improvements
>>>>>>>>>>> * Performance optimizations for large DAGs
>>>>>>>>>>> * many GCP and S3 integration improvements
>>>>>>>>>>> * many new operators
>>>>>>>>>>> * many many many bug fixes
>>>>>>>>>>> We are aiming for a fully compliant Apache release so we should be able
>>>>>>>>>> to
>>>>>>>>>>> kick off the graduation process after this release. I hope you help us
>>>>>>>>>> out
>>>>>>>>>>> getting there!
>>>>>>>>>>> Kind regards,
>>>>>>>>>>> Bolke & Fokko
