Re: CASSANDRA-13241 lower default chunk_length_in_kb


| The risk from such a patch is very low
If I had a nickel for every time I've heard that... ;)

I'm neutral on the default change, -.5 (i.e. don't agree with it but won't
die on that hill) on the data structure change post-freeze. We put this in,
and that's a slippery slope as I'm sure we can find numerous other
seemingly low-risk trivial optimizations and rewrites that cumulatively
would make a "feature-freeze" effectively meaningless as a tool to start
stabilizing the contents of the release.

In isolation many changes look innocuous. In the context of an organically
grown open-source code-base that's this old, I've learned that it pays to
be very, very cautious.

On Tue, Oct 23, 2018 at 3:33 PM Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:

> My objection (-0.5) is based on freeze not in code complexity
>
>
>
> --
> Jeff Jirsa
>
>
> > On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith <benedict@xxxxxxxxxx> wrote:
> >
> > To discuss the concerns about the patch for a more efficient
> > representation:
> >
> > The risk from such a patch is very low.  It’s a very simple in-memory
> > data structure, that we can introduce thorough fuzz tests for.  The
> > reason to exclude it would be for reasons of wanting to begin strictly
> > enforcing the freeze only.  This is a good enough reason in my book,
> > which is why I’m neutral on its addition.  I just wanted to provide some
> > context for everyone else's voting intention.
> >
> >
> >> On 23 Oct 2018, at 16:51, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> I just asked Jeff. He is -0 and -0.5 respectively.
> >>
> >> Ariel
> >>
> >>> On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
> >>> I’m +1 change of default.  I think Jeff was -1 on that though.
> >>>
> >>>
> >>>> On 23 Oct 2018, at 16:46, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> To summarize who we have heard from so far
> >>>>
> >>>> WRT changing just the default:
> >>>>
> >>>> +1:
> >>>> Jon Haddad
> >>>> Ben Bromhead
> >>>> Alain Rodriguez
> >>>> Sankalp Kohli (not explicit)
> >>>>
> >>>> -0:
> >>>> Sylvain Lebresne
> >>>> Jeff Jirsa
> >>>>
> >>>> Not sure:
> >>>> Kurt Greaves
> >>>> Joshua McKenzie
> >>>> Benedict Elliott Smith
> >>>>
> >>>> WRT changing the representation:
> >>>>
> >>>> +1:
> >>>> There are only conditional +1s at this point
> >>>>
> >>>> -0:
> >>>> Sylvain Lebresne
> >>>>
> >>>> -.5:
> >>>> Jeff Jirsa
> >>>>
> >>>> This (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
> >>>> is a rough cut of the change for the representation. It needs better
> >>>> naming, unit tests, javadoc etc. but it does implement the change.
> >>>>
> >>>> Ariel
> >>>>> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
> >>>>> Sorry, to be clear - I'm +1 on changing the configuration default,
> >>>>> but I think changing the compression in-memory representation
> >>>>> warrants further discussion and investigation before making a case
> >>>>> for or against it. An optimization that reduces in-memory cost by
> >>>>> over 50% sounds pretty good and we never were really explicit that
> >>>>> those sorts of optimizations would be excluded after our feature
> >>>>> freeze.  I don't think they should necessarily be excluded at this
> >>>>> time, but it depends on the size and risk of the patch.
> >>>>>
> >>>>>> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad <jon@xxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> I think we should try to do the right thing for the most people
> >>>>>> that we can.  The number of folks impacted by 64KB is huge.  I've
> >>>>>> worked on a lot of clusters created by a lot of different teams,
> >>>>>> going from brand new to pretty damn knowledgeable.  I can't think
> >>>>>> of a single time over the last 2 years that I've seen a cluster use
> >>>>>> non-default settings for compression.  With only a handful of
> >>>>>> exceptions, I've lowered the chunk size considerably (usually to 4
> >>>>>> or 8K) and the impact has always been very noticeable, frequently
> >>>>>> resulting in hardware reduction and cost savings.  Of all the
> >>>>>> poorly chosen defaults we have, this is one of the biggest
> >>>>>> offenders that I see.  There's a good reason ScyllaDB claims
> >>>>>> they're so much faster than Cassandra - we ship a DB that performs
> >>>>>> poorly for 90+% of teams because we ship for a specific use case,
> >>>>>> not a general one (time series on memory constrained boxes being
> >>>>>> the specific use case).
> >>>>>>
> >>>>>> This doesn't impact existing tables, just new ones.  More and more
> >>>>>> teams are using Cassandra as a general purpose database, and we
> >>>>>> should acknowledge that by adjusting our defaults accordingly.
> >>>>>> Yes, we use a little bit more memory on new tables if we just
> >>>>>> change this setting, and what we get out of it is a massive
> >>>>>> performance win.
> >>>>>>
> >>>>>> I'm +1 on the change as well.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli <kohlisankalp@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> (We should definitely harden the definition for freeze in a
> >>>>>>> separate thread)
> >>>>>>>
> >>>>>>> My thinking is that this is the best time to do this change as we
> >>>>>>> have not even cut alpha or beta. All the people involved in the
> >>>>>>> test will definitely be testing it again when we have these
> >>>>>>> releases.
> >>>>>>>
> >>>>>>>> On Oct 19, 2018, at 8:00 AM, Michael Shuler <michael@xxxxxxxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>>> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> >>>>>>>>>
> >>>>>>>>> At the risk of hijacking this thread, when are we going to
> >>>>>>>>> transition from "no new features, change whatever else you want
> >>>>>>>>> including refactoring and changing years-old defaults" to "ok,
> >>>>>>>>> we think we have something that's stable, time to start
> >>>>>>>>> testing"?
> >>>>>>>>
> >>>>>>>> Creating a cassandra-4.0 branch would allow trunk to, for
> >>>>>>>> instance, get a default config value change commit and get more
> >>>>>>>> testing. We might forget again, from what I understand of
> >>>>>>>> Benedict's last comment :)
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Michael
> >>>>>>>>
> >>>>>>>>
> ---------------------------------------------------------------------
> >>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> >>>>>>>> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Jon Haddad
> >>>>>> http://www.rustyrazorblade.com
> >>>>>> twitter: rustyrazorblade
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Jon Haddad
> >>>>> http://www.rustyrazorblade.com
> >>>>> twitter: rustyrazorblade
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> >
>
>
>
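
For readers following the memory side of this discussion: the thread says a lower chunk_length_in_kb costs "a little bit more memory" on new tables. A rough way to see why is sketched below. It assumes one fixed-size offset is kept in memory per compression chunk, and that each offset is 8 bytes; both are illustrative assumptions and are not stated anywhere in this thread. The function name and constant are hypothetical, not Cassandra APIs.

```python
# Back-of-the-envelope estimate of chunk offset memory for compressed data.
# ASSUMPTION (not from the thread): one offset is held in memory per chunk,
# and each offset takes 8 bytes. Adjust BYTES_PER_OFFSET if that is wrong.

BYTES_PER_OFFSET = 8  # assumed size of a single chunk offset entry


def offset_memory_bytes(uncompressed_bytes: int, chunk_length_kb: int) -> int:
    """Return the estimated bytes of offsets needed for the given data size."""
    chunk_bytes = chunk_length_kb * 1024
    # Ceiling division: a partially filled last chunk still needs an offset.
    chunks = -(-uncompressed_bytes // chunk_bytes)
    return chunks * BYTES_PER_OFFSET


if __name__ == "__main__":
    one_tib = 1024 ** 4
    for kb in (64, 16, 8, 4):
        mib = offset_memory_bytes(one_tib, kb) / 1024 ** 2
        print(f"chunk_length_in_kb={kb:>2}: ~{mib:.0f} MiB of offsets per TiB")
```

Under these assumptions the offset memory grows in inverse proportion to the chunk length, which is the cost the more compact representation proposed in the thread is meant to reduce.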