Re: Database referral integrity


I'm in favor of having referential integrity. Enforcing it will add some
load, but it will also make sure that the database stays clean. Also, in
Airflow we use transactions, which will make sure that the integrity checks
are validated not on every statement but at commit. I'm happy to help with
this as well.
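
Whether the checks run at commit rather than per statement generally depends on how the constraints are declared (for example as deferred constraints, where the database supports them). Below is a minimal sketch of that behaviour using SQLite through Python's sqlite3 module. The two tables and the dag_run_id column are simplified, hypothetical stand-ins, not Airflow's actual schema.

```python
import sqlite3

# isolation_level=None: no implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Hypothetical, simplified tables (not the real Airflow schema).
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE task_instance (
        id INTEGER PRIMARY KEY,
        dag_run_id INTEGER NOT NULL
            REFERENCES dag_run (id) DEFERRABLE INITIALLY DEFERRED
    )
    """
)

# Child row first, parent row second: accepted, because the deferred
# foreign key is only checked when the transaction commits.
conn.execute("BEGIN")
conn.execute("INSERT INTO task_instance (id, dag_run_id) VALUES (1, 10)")
conn.execute("INSERT INTO dag_run (id) VALUES (10)")
conn.execute("COMMIT")

# A reference that is still dangling at COMMIT is rejected.
conn.execute("BEGIN")
conn.execute("INSERT INTO task_instance (id, dag_run_id) VALUES (2, 99)")
try:
    conn.execute("COMMIT")
    commit_rejected = False
except sqlite3.IntegrityError:
    commit_rejected = True
    conn.execute("ROLLBACK")  # the failed COMMIT leaves the transaction open

row_count = conn.execute("SELECT COUNT(*) FROM task_instance").fetchone()[0]
```

Without the DEFERRABLE INITIALLY DEFERRED clause, the first INSERT into task_instance would already fail, since the check would then run at the end of that statement.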

Cheers, Fokko

On Tue, 18 Sep 2018 at 11:07, Bolke de Bruin <bdbruin@xxxxxxxxx> wrote:

> Adding these kinds of integrity checks will make database
> access pretty slow. In addition, it isn't there because in the past there was
> no strong connection between, for example, tasks and dagruns; it was more or
> less just coincidental. There are also some bisecting tools that would probably
> have difficulty functioning in a new regime. In other words, it is not an
> easy change and it will have operational challenges.
>
> > On 18 Sep 2018, at 11:03, Ash Berlin-Taylor <ash@xxxxxxxxxx> wrote:
> >
> > Ooh good spot.
> >
> > Yes, I would be in favour of adding these, but as you say we need to
> > think about how we might migrate old data.
> >
> > Doing this at 2.0.0 and providing a cleanup script (or doing it as part
> > of the migration?) is probably the way to go.
> >
> > -ash-
> >
> >> On 17 Sep 2018, at 19:56, Stefan Seelmann <mail@xxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> looking into the DB schema, there is almost no referential integrity
> >> enforced at the database level. Many foreign key constraints between
> >> dag, dag_run, task_instance, xcom, dag_pickle, log, etc. would make sense
> >> IMO.
> >>
> >> Is there a particular reason why that's not implemented?
> >>
> >> Introducing it now will be hard; probably any real-world setup has some
> >> violations. But I'm still in favor of this additional safety net.
> >>
> >> Kind Regards,
> >> Stefan
> >
>
>
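
For the cleanup step mentioned in the thread, rows that would violate a new foreign key can be found with an anti-join before the constraint is added. The following is only a sketch under assumptions: it uses SQLite, and the table layout and the dag_run_id column are hypothetical simplifications, not Airflow's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables with one deliberately orphaned row (id 12).
conn.executescript(
    """
    CREATE TABLE dag_run (id INTEGER PRIMARY KEY);
    CREATE TABLE task_instance (id INTEGER PRIMARY KEY, dag_run_id INTEGER);
    INSERT INTO dag_run (id) VALUES (1), (2);
    INSERT INTO task_instance (id, dag_run_id) VALUES (10, 1), (11, 2), (12, 3);
    """
)

# Anti-join: task_instance rows whose dag_run_id matches no dag_run row.
orphans = conn.execute(
    """
    SELECT ti.id
    FROM task_instance AS ti
    LEFT JOIN dag_run AS dr ON dr.id = ti.dag_run_id
    WHERE dr.id IS NULL
    ORDER BY ti.id
    """
).fetchall()

print(orphans)
```

A migration could report such rows, or delete or repair them, before creating the constraint; which of these is appropriate is a policy decision the thread leaves open.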