Re: Airflow 1.10 Migration Duration


Good point about mentioning the database specifics, thanks. It's a Postgres
9.6.6 DB running in AWS RDS on a db.r3.large instance (2 vCPUs, 15 GB of
RAM).

Not sure what you mean by online/offline, but we timed the migrations in a
test run against a database with nothing else going on at the time.

- Matt

On Tue, Sep 25, 2018 at 7:54 PM Ruiqin Yang <yrqls21@xxxxxxxxx> wrote:

> Thank you Taylor, the db-cleanup DAG is very nice! I have a question for
> you: should we expect the DB migration to be backward compatible, i.e. would
> a 1.8.x cluster run fine with the upgraded DB?
>
> Thank you!
> Kevin Y
>
> On Tue, Sep 25, 2018 at 6:14 PM Taylor Edmiston <tedmiston@xxxxxxxxx> wrote:
>
> > I haven't done 1.8.x to 1.10.x in one go, but multiple hours seems long for
> > running a handful of Alembic migrations on 10M rows.  It might be worth
> > noting if you're using MySQL or Postgres and how your db is hosted... I
> > wonder if there's a bottleneck at play here.
> >
> > Also, are you running the migrations in online or offline mode?
> >
> > You may see a performance improvement if you collapse all migrations into
> > one then apply that (https://stackoverflow.com/a/34492022/149428).
> >
> > I prefer to keep all of my metadata in place personally, but the db-cleanup
> > DAG in https://github.com/teamclairvoyant/airflow-maintenance-dags has been
> > brought up before.
> >
> > T
> >
> > *Taylor Edmiston*
> > Blog <https://blog.tedmiston.com/> | LinkedIn
> > <https://www.linkedin.com/in/tedmiston/> | Stack Overflow
> > <https://stackoverflow.com/users/149428/taylor-edmiston> | Developer Story
> > <https://stackoverflow.com/story/taylor>
> >
> >
> > On Tue, Sep 25, 2018 at 8:30 PM, Sid Anand <sanand@xxxxxxxxxx> wrote:
> >
> > > I checked with our Ops guy and he mentioned that when he upgraded from
> > > 1.8.x to 1.9.x, it took a few seconds. We had 3M rows in the task_instance
> > > table and run MySQL 5.7.
> > >
> > > -s
> > >
> > > On Tue, Sep 25, 2018 at 4:54 PM Matt Davis <jiffyclub@xxxxxxxxx> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > Here at Clover we're excitedly migrating to Airflow 1.10 (thanks for
> > > > everyone's hard work on that!). We're finding that it's taking about 2
> > > > hours to apply all the migrations to go from Airflow 1.8 to 1.10, largely
> > > > driven by the 10 million rows in our task_instance table. That got us
> > > > wondering what kind of maintenance people do on their Airflow metadata
> > > > databases. Do folks mostly put up with long migrations and generally longer
> > > > queries, or are y'all doing periodic cleanups of your metadata DB to keep
> > > > it fairly light?
> > > >
> > > > Thanks,
> > > > Matt Davis
> > > >
> > >
> >
>