Re: Pinning dependencies for Apache Airflow


I am still not convinced that pinning is bad. I re-read the whole mail
thread and the thread from 2016
<https://github.com/apache/incubator-airflow/pull/1809#issuecomment-257502174>
to read all the arguments, but I stand by pinning.

I am - of course - not sure about the graduation argument. I would just
imagine it might be the case. However, I really think that the situation we
are in now is quite volatile. The latest 1.10.0 cannot be clean-installed
via pip without manually tweaking and forcing a lower version of
flask-appbuilder. Even if you use the constraints file it's pretty
cumbersome, because you'd have to somehow know that you need to do exactly
that (not at all obvious from the error you get). Also, it might get worse
at any time as other packages get newer versions released. The thing here
is that the maintainers of flask-appbuilder did nothing wrong: they simply
released a new version with the click dependency version increased
(probably for a good reason), and it's Airflow's cross-dependency graph
which makes it incompatible.
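
To make the failure mode concrete: Ash's message further down the thread
reports that Airflow pins flask-login to 0.2.1 while flask-appbuilder
1.12.0 requires flask-login >= 0.3. Below is a minimal sketch of that kind
of clash, assuming plain dotted numeric versions. The helper names are
hypothetical, and this is not how pip itself resolves dependencies.

```python
def parse_version(text):
    """Turn a dotted numeric version such as "0.2.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pin_satisfies_minimum(pinned, minimum):
    """Return True if an exact pin (==pinned) also satisfies >=minimum."""
    a, b = parse_version(pinned), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "0.3" and "0.3.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Figures taken from Ash's message quoted below.
print(pin_satisfies_minimum("0.2.1", "0.3"))  # False: the pin is too old
```

Once an upstream package raises its minimum above an existing pin, no
single version can satisfy both, and a clean install fails.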

I am afraid that if we don't change it, it's all but guaranteed that every
single release will at some point "deteriorate" and refuse to
clean-install. If we want to solve this problem (maybe we don't and we
accept it as it is?), I think the only way to solve it is to hard-pin all
the requirements, at the very least for releases.
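
Checking that everything is hard-pinned could be automated in CI, as
suggested earlier in the thread. Below is a rough sketch, assuming a
requirements-style list of lines. The function name and the simplified
regular expression are illustrative only and do not cover the full
requirement syntax.

```python
import re

# Simplified pattern: a package name, optional extras, then an exact "==" pin.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^=<>!~\s;,]+"
)


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    bad = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            bad.append(line)
    return bad


sample = [
    "flask-appbuilder==1.11.1",
    "click>=7.0  # too loose",
    "",
    "# comment",
    "-r other.txt",
]
print(unpinned_requirements(sample))  # ['click>=7.0']
```

A CI job could fail the build whenever the returned list is non-empty.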

Of course we might choose pinning only for releases (and CI builds) and
have the compromise that Matt mentioned. I do worry, however (as also
mentioned in the previous thread), that it will be hard to maintain.
Effectively you will have to maintain both in parallel. And the
constraints file is a nice workaround for someone who actually needs a
specific (even newer) version of a specific package in their environment.
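
Maintaining both in parallel could be made less error-prone with the
watermark check George suggested further down the thread: hash the loose
requirements, store the hash in the compiled file, and have CI compare the
two. Below is a minimal sketch, assuming SHA-256 and a `# watermark:`
comment line. Both are illustrative choices, as are the function names.

```python
import hashlib

WATERMARK_PREFIX = "# watermark: "  # hypothetical marker line


def requirements_watermark(requirements):
    """Hash the loose requirements in an order-independent way."""
    cleaned = sorted(r.strip() for r in requirements if r.strip())
    return hashlib.sha256("\n".join(cleaned).encode("utf-8")).hexdigest()


def lock_file_is_current(requirements, lock_text):
    """True if the compiled file carries the watermark of these requirements."""
    expected = WATERMARK_PREFIX + requirements_watermark(requirements)
    return any(line.strip() == expected for line in lock_text.splitlines())


loose = ["flask-appbuilder>=1.11.1", "click>=6.0"]
compiled = (
    "click==6.7\n"
    "flask-appbuilder==1.11.1\n"
    + WATERMARK_PREFIX + requirements_watermark(loose) + "\n"
)
print(lock_file_is_current(loose, compiled))                  # True
print(lock_file_is_current(loose + ["lxml>=4.0"], compiled))  # False
```

If someone edits the loose requirements without recompiling, the check
fails and the drift is caught before merge.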

Maybe we should simply give it a try and do a proof-of-concept/experiment,
as Fokko also mentioned?

We could have a PR with pinning enabled, and maybe ask the people who
voiced concerns about their environments to give it a try with those
pinned versions and see whether that makes it difficult for them to either
upgrade dependencies and fork apache-airflow, or use a pip constraints
file?
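
For anyone willing to try the constraints route, the sketch below writes a
constraints file and builds the pip command Ash described
(`pip install -c constraints.txt apache-airflow`). The helper name and the
temporary directory are assumptions for illustration, and the command is
only printed, not executed.

```python
import sys
import tempfile
from pathlib import Path


def build_constrained_install(package, constraints, directory):
    """Write constraints.txt into directory and return the pip command."""
    path = Path(directory) / "constraints.txt"
    lines = [
        "{}=={}".format(name, version)
        for name, version in sorted(constraints.items())
    ]
    path.write_text("\n".join(lines) + "\n")
    # -c limits which versions pip may pick; it does not install anything itself.
    return [sys.executable, "-m", "pip", "install", "-c", str(path), package]


with tempfile.TemporaryDirectory() as tmp:
    command = build_constrained_install(
        "apache-airflow", {"flask-appbuilder": "1.11.1"}, tmp
    )
    print(" ".join(command))
```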

J.


On Tue, Oct 9, 2018 at 5:56 PM Matt Davis <jiffyclub@xxxxxxxxx> wrote:

> Erik, the Airflow task execution code itself of course must run somewhere
> with Airflow installed, but if the task is making a database query or a web
> request or running something in Docker there's separation between the
> environments and maybe you don't care about Python dependencies at all
> except to get Airflow running. When running Python operators that's not the
> case (as you already deal with).
>
> - Matt
>
> On Tue, Oct 9, 2018 at 2:45 AM EKC (Erik Cederstrand)
> <EKC@xxxxxxxxxxxxx.invalid> wrote:
>
> > This is maybe a stupid question, but is it even possible to run tasks in
> > an environment where Airflow is not installed?
> >
> >
> > Kind regards,
> >
> > Erik
> >
> > ________________________________
> > From: Matt Davis <jiffyclub@xxxxxxxxx>
> > Sent: Monday, October 8, 2018 10:13:34 PM
> > To: dev@xxxxxxxxxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: Pinning dependencies for Apache Airflow
> >
> > It sounds like we can get the best of both worlds with the original
> > proposals to have minimal requirements in setup.py and "guaranteed to
> > work" complete requirements in a separate file. That way we have
> > flexibility for teams that run airflow and tasks in the same environment
> > and guidance on a working set of requirements. (Disclaimer: I work on
> > the same team as George.)
> >
> > Thanks,
> > Matt
> >
> > On Mon, Oct 8, 2018 at 8:16 AM Ash Berlin-Taylor <ash@xxxxxxxxxx> wrote:
> >
> > > Although I think I come down on the side against pinning, my reasons
> > > are different.
> > >
> > > For the two (or more) people who have expressed concern about it would
> > > pip's "Constraint Files" help:
> > >
> > > https://pip.pypa.io/en/stable/user_guide/#constraints-files
> > >
> > > For example, you could add "flask-appbuilder==1.11.1" in to this file,
> > > specify it with `pip install -c constraints.txt apache-airflow` and
> > > then whenever pip attempted to install _any_ version of FAB it would
> > > use the exact version from the constraints file.
> > >
> > > I don't buy the argument about pinning being a requirement for
> > > graduation from Incubation fwiw - it's an unavoidable artefact of the
> > > open-source world we develop in.
> > >
> > > https://libraries.io/ offers a (free?) service that will monitor apps
> > > dependencies for being out of date, might be better than writing our
> > > own solution.
> > >
> > > Pip has for a while now supported a way of saying "this dep is for
> > > py2.7 only":
> > >
> > > > Since version 6.0, pip also supports specifiers containing
> > > > environment markers like so:
> > > >
> > > >    SomeProject ==5.4 ; python_version < '2.7'
> > > >    SomeProject; sys_platform == 'win32'
> > >
> > >
> > > Ash
> > >
> > >
> > > > On 8 Oct 2018, at 07:58, George Leslie-Waksman <waksman@xxxxxxxxx>
> > > > wrote:
> > > >
> > > > As a member of a team that will also have really big problems if
> > > > Airflow pins all requirements (for reasons similar to those already
> > > > stated), I would like to add a very strong -1 to the idea of pinning
> > > > them for all installations.
> > > >
> > > > In a number of situations on our end, to avoid similar problems with
> > > > CI, we use `pip-compile` from pip-tools (also mentioned):
> > > >
> > > > https://pypi.org/project/pip-tools/
> > > >
> > > > I would like to suggest a middle ground of:
> > > >
> > > > - Have the installation continue to use unpinned (`>=`) with minimum
> > > >   necessary requirements set
> > > > - Include a pip-compiled requirements file (`requirements-ci.txt`?)
> > > >   that is used by CI
> > > >   - If we need, there can be one file for each incompatible python
> > > >     version
> > > > - Append a watermark (hash of `setup.py` requirements?) to the
> > > >   compiled requirements file
> > > > - Add a CI check that the watermark and original match to ensure no
> > > >   drift since last compile
> > > >
> > > > I am happy to do much of the work for this, if it can help avoid
> > > > pinning all of the depends at the installation level.
> > > >
> > > > --George Leslie-Waksman
> > > >
> > > > On Sun, Oct 7, 2018 at 1:26 PM Maxime Beauchemin
> > > > <maximebeauchemin@xxxxxxxxx> wrote:
> > > >>
> > > >> pip-tools can definitely help here to ship a reference [locked]
> > > >> `requirements.txt` that can be used in [all or part of] the CI. It's
> > > >> actually kind of important to get CI to fail when a new [backward
> > > >> incompatible] lib comes out and breaks things while allowing version
> > > >> ranges.
> > > >>
> > > >> I think there may be challenges around pip-tools and projects that
> > > >> run in both python2.7 and python3.6. You sometimes need to have 2
> > > >> requirements.txt lock files.
> > > >>
> > > >> Max
> > > >>
> > > >> On Sun, Oct 7, 2018 at 5:06 AM Jarek Potiuk <Jarek.Potiuk@xxxxxxxxxxx>
> > > >> wrote:
> > > >>
> > > >>> It's a nice one :). However I think when/if we go to pinned
> > > >>> dependencies the way poetry/pip-tools do it, this will suddenly be
> > > >>> a lot less useful. It will be very easy to track dependency changes
> > > >>> (they will always be committed as a change in the .lock file or
> > > >>> requirements.txt) and if someone has a problem while upgrading a
> > > >>> dependency (always consciously, never accidentally) it will simply
> > > >>> fail during the CI build and the change won't get merged/won't
> > > >>> break the builds of others in the first place :).
> > > >>>
> > > >>> J.
> > > >>>
> > > >>> On Sun, Oct 7, 2018 at 6:26 AM Deng Xiaodong <xd.deng.r@xxxxxxxxx>
> > > >>> wrote:
> > > >>>
> > > >>>> Hi folks,
> > > >>>>
> > > >>>> On top of this discussion, I was thinking we should have the
> > > >>>> ability to quickly monitor dependency releases as well. Previously,
> > > >>>> it happened a few times that CI kept failing for no apparent reason
> > > >>>> and it eventually turned out to be due to a dependency release. But
> > > >>>> it took us some time, sometimes a few days, to realise the failure
> > > >>>> was because of a dependency release.
> > > >>>>
> > > >>>> To partially address this, I tried to develop a mini tool to help
> > > >>>> us check the latest release of Python packages & the release
> > > >>>> date-time on PyPI. So, by comparing it with our CI failure history,
> > > >>>> we may be able to troubleshoot faster.
> > > >>>>
> > > >>>> Output Sample (ordered by upload time in desc order):
> > > >>>>
> > > >>>> Package Name        Latest Version   Upload Time
> > > >>>> awscli              1.16.28          2018-10-05T23:12:45
> > > >>>> botocore            1.12.18          2018-10-05T23:12:39
> > > >>>> promise             2.2.1            2018-10-04T22:04:18
> > > >>>> Keras               2.2.4            2018-10-03T20:59:39
> > > >>>> bleach              3.0.0            2018-10-03T16:54:27
> > > >>>> Flask-AppBuilder    1.12.0           2018-10-03T09:03:48
> > > >>>> ... ...
> > > >>>>
> > > >>>> It's a minimal tool (not perfect yet but working). I have hosted
> > > >>>> this tool at https://github.com/XD-DENG/pypi-release-query .
> > > >>>>
> > > >>>>
> > > >>>> XD
> > > >>>>
> > > >>>> On Sat, Oct 6, 2018 at 12:25 AM Jarek Potiuk
> > > >>>> <Jarek.Potiuk@xxxxxxxxxxx> wrote:
> > > >>>>
> > > >>>>> Hello Erik,
> > > >>>>>
> > > >>>>> I understand your concern. It's a hard one to solve in general
> > > >>>>> (i.e. dependency-hell). It looks like in this case you treat
> > > >>>>> Airflow as a 'library', where for some other people it might be
> > > >>>>> more like an 'end product'. If you look at the "pinning"
> > > >>>>> philosophy - "pin everything" is good for end products, but not
> > > >>>>> good for libraries. In your case Airflow is treated as a bit of
> > > >>>>> both. And it's a perfectly valid case at that (with custom python
> > > >>>>> DAGs being a central concept for Airflow). However, I think it's
> > > >>>>> not as bad as you think when it comes to exact pinning.
> > > >>>>>
> > > >>>>> I believe - a bit counter-intuitively - that tools like
> > > >>>>> pip-tools/poetry with exact pinning result in having your
> > > >>>>> dependencies upgraded more often, rather than less - especially
> > > >>>>> in complex systems where dependency-hell creeps in. If you look
> > > >>>>> at Airflow's setup.py now - it's a bit scary to make any change
> > > >>>>> to it. There is a chance it will blow up in your face if you
> > > >>>>> change it. You never know why there is 0.3 < ver < 1.0 - and if
> > > >>>>> you change it, whether it will cause a chain reaction of
> > > >>>>> conflicts that will ruin your work day.
> > > >>>>>
> > > >>>>> On the contrary - if you change it to exact pinning in a
> > > >>>>> .lock/requirements.txt file (poetry/pip-tools) and have much
> > > >>>>> simpler (and commented) exclusion/avoidance rules in your
> > > >>>>> .in/.toml file, the whole setup might be much easier to maintain
> > > >>>>> and upgrade. Every time you prepare for a release (or even once
> > > >>>>> in a while for master) one person might consciously attempt to
> > > >>>>> upgrade all dependencies to the latest ones. It should be almost
> > > >>>>> as easy as letting poetry/pip-tools help with figuring out what
> > > >>>>> the latest set of dependencies is that will work without
> > > >>>>> conflicts. It should be rather straightforward (I've done it in
> > > >>>>> the past for fairly complex systems). What those tools enable is
> > > >>>>> a single-shot upgrade of all dependencies. After doing it you
> > > >>>>> can make sure that all tests work fine (and fix any problems
> > > >>>>> that result from it). And then you test it thoroughly before you
> > > >>>>> make the final release. You can do it in a separate PR - with
> > > >>>>> automated testing in Travis, which means that you are not
> > > >>>>> disturbing the work of others (compilation/building + unit tests
> > > >>>>> are guaranteed to work before you merge it) while doing it. It's
> > > >>>>> all conscious rather than accidental. A nice side effect of that
> > > >>>>> is that with every release you can actually "catch up" with the
> > > >>>>> latest stable versions of many libraries in one go. It's better
> > > >>>>> than waiting until someone deliberately upgrades to a newer
> > > >>>>> version (and the rest remain terribly out-dated, as is the case
> > > >>>>> for Airflow now).
> > > >>>>>
> > > >>>>> So a bit counterintuitively I think tools like pip-tools/poetry
> > > >>>>> help you to catch up faster in many cases. That is at least my
> > > >>>>> experience so far.
> > > >>>>>
> > > >>>>> Additionally, Airflow is an open system - if you have very
> > > >>>>> specific needs for requirements, you might actually - in the
> > > >>>>> very same way with pip-tools/poetry - upgrade all your
> > > >>>>> dependencies in your local fork of Airflow before someone else
> > > >>>>> does it in master/release. Those tools kind of democratise
> > > >>>>> dependency management. It should be as easy as
> > > >>>>> `pip-compile --upgrade` or `poetry update` and you will get all
> > > >>>>> the "non-conflicting" latest dependencies in your local fork
> > > >>>>> (and poetry especially seems to do all the heavy lifting of
> > > >>>>> figuring out which versions will work). You should be able to
> > > >>>>> test and publish it locally as your private package for local
> > > >>>>> installations. You can even mark the specific dependency you
> > > >>>>> want to use at a specific version and let pip-tools/poetry
> > > >>>>> figure out exact versions of other requirements. You can even
> > > >>>>> make a PR with such an upgrade eventually to get it faster in
> > > >>>>> master. You can even downgrade in case a newer dependency causes
> > > >>>>> problems for you in a similar way. Guided by the tools, it's
> > > >>>>> much faster than figuring the versions out by yourself.
> > > >>>>>
> > > >>>>> As long as we have a simple way of managing it and document how
> > > >>>>> to upgrade/downgrade dependencies in your own fork, and mention
> > > >>>>> how to locally release Airflow as a package, I think your case
> > > >>>>> could be covered even better than now. What do you think?
> > > >>>>>
> > > >>>>> J.
> > > >>>>>
> > > >>>>> On Fri, Oct 5, 2018 at 2:34 PM EKC (Erik Cederstrand)
> > > >>>>> <EKC@xxxxxxxxxxxxx.invalid> wrote:
> > > >>>>>
> > > >>>>>> For us, exact pinning of versions would be problematic. We have
> > > >>>>>> DAG code that shares direct and indirect dependencies with
> > > >>>>>> Airflow, e.g. lxml, requests, pyhive, future, thrift, tzlocal,
> > > >>>>>> psycopg2 and ldap3. If our DAG code for some reason needs a
> > > >>>>>> newer point release due to a bug that's fixed, then we can't
> > > >>>>>> cleanly build a virtual environment containing the fixed
> > > >>>>>> version. For us, it's already a problem that Airflow has quite
> > > >>>>>> strict (and sometimes old) requirements in setup.py.
> > > >>>>>>
> > > >>>>>> Erik
> > > >>>>>> ________________________________
> > > >>>>>> From: Jarek Potiuk <Jarek.Potiuk@xxxxxxxxxxx>
> > > >>>>>> Sent: Friday, October 5, 2018 2:01:15 PM
> > > >>>>>> To: dev@xxxxxxxxxxxxxxxxxxxxxxxxxxxx
> > > >>>>>> Subject: Re: Pinning dependencies for Apache Airflow
> > > >>>>>>
> > > >>>>>> I think one solution for the release approach is to check, as
> > > >>>>>> part of the automated Travis build, whether all requirements
> > > >>>>>> are pinned with == (even the deep ones) and fail the build in
> > > >>>>>> case they are not, for ALL versions (including dev). And of
> > > >>>>>> course we should document the approach to releases/upgrades
> > > >>>>>> etc. If we do it all the time for development versions (which
> > > >>>>>> seems quite doable), then transitively all the releases will
> > > >>>>>> also have pinned versions and they will never try to upgrade
> > > >>>>>> any of the dependencies. In poetry (similarly in pip-tools with
> > > >>>>>> the .in file) it is done by having a .lock file that specifies
> > > >>>>>> exact versions of each package, so it can be rather easy to
> > > >>>>>> manage (so it's worth trying it out I think :D - seems a bit
> > > >>>>>> more friendly than pip-tools).
> > > >>>>>>
> > > >>>>>> There is a drawback - of course - with manually updating the
> > > >>>>>> module that you want, but I really see that as an advantage
> > > >>>>>> rather than a drawback, especially for users. This way you
> > > >>>>>> maintain the property that it will always install and work the
> > > >>>>>> same way no matter if you installed it today or two months ago.
> > > >>>>>> I think the biggest drawback for maintainers is that you need
> > > >>>>>> some kind of monitoring of security vulnerabilities and cannot
> > > >>>>>> rely on automated security upgrades. With >= requirements those
> > > >>>>>> security updates might happen automatically without anyone
> > > >>>>>> noticing, but to be honest I don't think such upgrades are
> > > >>>>>> guaranteed even in the current setup for all security issues
> > > >>>>>> for all libraries anyway.
> > > >>>>>>
> > > >>>>>> Finding the need to upgrade because of security issues can be
> > > >>>>>> quite automated. Even now I noticed GitHub started to inform
> > > >>>>>> owners about potential security vulnerabilities in libraries
> > > >>>>>> used by their projects. Those notifications can be sent to the
> > > >>>>>> devlist and turned into JIRA issues followed by minor
> > > >>>>>> security-related releases (with only a few library dependencies
> > > >>>>>> upgraded).
> > > >>>>>>
> > > >>>>>> I think it's even easier to automate it if you have pinned
> > > >>>>>> dependencies - because it's generally easy for static analysers
> > > >>>>>> to find applicable vulnerabilities for specific versions of
> > > >>>>>> libraries - when you have >=, you never know which version will
> > > >>>>>> be used until you actually perform the installation.
> > > >>>>>>
> > > >>>>>> There is one big advantage for maintainers in the "pinned"
> > > >>>>>> case. Your users always have the same dependencies - so when an
> > > >>>>>> issue is raised, you can reproduce it more easily. Without
> > > >>>>>> pinning, it's hard to know which version the user has (as the
> > > >>>>>> user could have installed it a month ago or yesterday) and even
> > > >>>>>> if you find out by asking the user, you might not be able to
> > > >>>>>> reproduce the set of requirements easily (simply because there
> > > >>>>>> are already newer versions of the libraries released and they
> > > >>>>>> are used automatically). You can ask the user to run
> > > >>>>>> `pip install --upgrade` but that's dangerous and pretty lame
> > > >>>>>> ("check the latest version - maybe it fixes your problem?") and
> > > >>>>>> sometimes not possible (e.g. someone has a pre-built docker
> > > >>>>>> image with dependencies from a few months ago and cannot
> > > >>>>>> rebuild the image easily).
> > > >>>>>>
> > > >>>>>> J.
> > > >>>>>>
> > > >>>>>> On Fri, Oct 5, 2018 at 12:35 PM Ash Berlin-Taylor <ash@xxxxxxxxxx>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> One thing to point out here.
> > > >>>>>>>
> > > >>>>>>> Right now if you `pip install apache-airflow==1.10.0` in a clean
> > > >>>>>>> environment it will fail.
> > > >>>>>>>
> > > >>>>>>> This is because we pin flask-login to 0.2.1 but
> > > >>>>>>> flask-appbuilder is >= 1.11.1, so that pulls in 1.12.0 which
> > > >>>>>>> requires flask-login >= 0.3.
> > > >>>>>>>
> > > >>>>>>> So I do think there is maybe something to be said about pinning
> > > >>>>>>> for releases. The down side to that is that if there are updates
> > > >>>>>>> to a module that we want then we have to make a point release to
> > > >>>>>>> let people get it.
> > > >>>>>>>
> > > >>>>>>> Both methods have draw-backs
> > > >>>>>>>
> > > >>>>>>> -ash
> > > >>>>>>>
> > > >>>>>>>> On 4 Oct 2018, at 17:13, Arthur Wiedmer <arthur.wiedmer@xxxxxxxxx>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Hi Jarek,
> > > >>>>>>>>
> > > >>>>>>>> I will +1 the discussion Dan is referring to and George's
> > advice.
> > > >>>>>>>>
> > > >>>>>>>> I just want to double check we are talking about pinning in
> > > >>>>>>>> requirements.txt only.
> > > >>>>>>>>
> > > >>>>>>>> This offers the ability to
> > > >>>>>>>> pip install -r requirements.txt
> > > >>>>>>>> pip install --no-deps airflow
> > > >>>>>>>> For a guaranteed install which works.
> > > >>>>>>>>
> > > >>>>>>>> Several different requirement files can be provided for
> > > >>>>>>>> specific use cases, like a stable dev one for instance for
> > > >>>>>>>> people wanting to work on operators and non-core functions.
> > > >>>>>>>>
> > > >>>>>>>> However, I think we should proactively test in CI against
> > > >>>>>>>> unpinned dependencies (though it might be a separate case in
> > > >>>>>>>> the matrix), so that we get advance warning if possible that
> > > >>>>>>>> things will break. CI downtime is not a bad thing here, it
> > > >>>>>>>> actually caught a problem :)
> > > >>>>>>>>
> > > >>>>>>>> We should unpin as much as possible in setup.py to only
> > > >>>>>>>> maintain minimum required compatibility. The process of pinning
> > > >>>>>>>> in setup.py is extremely detrimental when you have a large
> > > >>>>>>>> number of python libraries installed with different pinned
> > > >>>>>>>> versions.
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Arthur
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Oct 4, 2018 at 8:36 AM Dan Davydov
> > > >>>>>>>> <ddavydov@xxxxxxxxxxx.invalid> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Relevant discussion about this:
> > > >>>>>>>>>
> > > >>>>>>>>> https://github.com/apache/incubator-airflow/pull/1809#issuecomment-257502174
> > > >>>>>>>>>
> > > >>>>>>>>> On Thu, Oct 4, 2018 at 11:25 AM Jarek Potiuk
> > > >>>>>>>>> <Jarek.Potiuk@xxxxxxxxxxx> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> TL;DR: A change is coming in the way dependencies/requirements
> > > >>>>>>>>>> are specified for Apache Airflow - they will be fixed rather
> > > >>>>>>>>>> than flexible (== rather than >=).
> > > >>>>>>>>>>
> > > >>>>>>>>>> This is a follow-up to a Slack discussion we had with Ash and
> > > >>>>>>>>>> Kaxil - summarising what we propose to do.
> > > >>>>>>>>>>
> > > >>>>>>>>>> *Problem:*
> > > >>>>>>>>>> During the last few weeks we experienced quite a few downtimes
> > > >>>>>>>>>> of TravisCI builds (for all PRs/branches including master) as
> > > >>>>>>>>>> some of the transitive dependencies were automatically
> > > >>>>>>>>>> upgraded. This is because a number of our dependencies are
> > > >>>>>>>>>> specified with >= rather than ==.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Whenever there is a new release of such a dependency, it might
> > > >>>>>>>>>> cause a chain reaction of upgrades of transitive dependencies
> > > >>>>>>>>>> which might get into conflict.
> > > >>>>>>>>>>
> > > >>>>>>>>>> An example was the Flask-AppBuilder vs flask-login transitive
> > > >>>>>>>>>> dependency with click. They started to conflict once AppBuilder
> > > >>>>>>>>>> released version 1.12.0.
> > > >>>>>>>>>>
> > > >>>>>>>>>> *Diagnosis:*
> > > >>>>>>>>>> Transitive dependencies with "flexible" versions (where >= is
> > > >>>>>>>>>> used instead of ==) are a reason for "dependency hell". We will
> > > >>>>>>>>>> sooner or later hit other cases where non-fixed dependencies
> > > >>>>>>>>>> cause similar problems with other transitive dependencies. We
> > > >>>>>>>>>> need to fix-pin them. Unpinned dependencies cause problems both
> > > >>>>>>>>>> for released versions (because they stop working!) and for
> > > >>>>>>>>>> development (because they break master builds in TravisCI and
> > > >>>>>>>>>> prevent people from installing a development environment from
> > > >>>>>>>>>> scratch).
> > > >>>>>>>>>>
> > > >>>>>>>>>> *Solution:*
> > > >>>>>>>>>>
> > > >>>>>>>>>>  - Following the old-but-good post
> > > >>>>>>>>>>    https://nvie.com/posts/pin-your-packages/ we are going to fix
> > > >>>>>>>>>>    the pinned dependencies to specific versions (so basically
> > > >>>>>>>>>>    all dependencies are "fixed").
> > > >>>>>>>>>>  - We will introduce a mechanism to be able to upgrade
> > > >>>>>>>>>>    dependencies with pip-tools
> > > >>>>>>>>>>    (https://github.com/jazzband/pip-tools). We might also take a
> > > >>>>>>>>>>    look at pipenv: https://pipenv.readthedocs.io/en/latest/
> > > >>>>>>>>>>  - People who would like to upgrade some dependencies for their
> > > >>>>>>>>>>    PRs will still be able to do it - but such upgrades will be
> > > >>>>>>>>>>    in their PR, thus they will go through TravisCI tests and
> > > >>>>>>>>>>    they will also have to be specified with pinned fixed
> > > >>>>>>>>>>    versions (==). This should be part of the review process to
> > > >>>>>>>>>>    make sure new/changed requirements are pinned.
> > > >>>>>>>>>>  - In the release process there will be a point where an upgrade
> > > >>>>>>>>>>    will be attempted for all requirements (using pip-tools) so
> > > >>>>>>>>>>    that we are not stuck with older releases. This will be in a
> > > >>>>>>>>>>    controlled PR environment where there will be time to fix all
> > > >>>>>>>>>>    dependencies without impacting others and likely enough time
> > > >>>>>>>>>>    to "vet" such changes (this can be done for alpha/beta
> > > >>>>>>>>>>    releases for example).
> > > >>>>>>>>>>  - As a side effect the dependency specification will become far
> > > >>>>>>>>>>    simpler and more straightforward.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Happy to hear community comments on the proposal. I am happy to
> > > >>>>>>>>>> take the lead on that, open a JIRA issue and implement it if
> > > >>>>>>>>>> this is something the community is happy with.
> > > >>>>>>>>>>
> > > >>>>>>>>>> J.
> > > >>>>>>>>>>
> > > >>>>>>>>>> --
> > > >>>>>>>>>>
> > > >>>>>>>>>> *Jarek Potiuk, Principal Software Engineer*
> > > >>>>>>>>>> Mobile: +48 660 796 129


-- 

*Jarek Potiuk, Principal Software Engineer*
Mobile: +48 660 796 129