
Re: CASSANDRA-13241 lower default chunk_length_in_kb


If you undertake sufficiently many low-risk things, some will bite you; I think everyone understands that.  It’s still valuable to factor a risk assessment into the equation, I think?

Either way, somebody asked who didn’t have the context to easily answer, so I did my best to offer them that information so they could make an informed decision.  I’m not campaigning for its inclusion, just trying to facilitate a collective decision.






> On 24 Oct 2018, at 16:27, Joshua McKenzie <jmckenzie@xxxxxxxxxx> wrote:
> 
> | The risk from such a patch is very low
> If I had a nickel for every time I've heard that... ;)
> 
> I'm neutral on the default change, -.5 (i.e. don't agree with it but won't
> die on that hill) on the data structure change post-freeze. We put this in,
> and that's a slippery slope as I'm sure we can find numerous other
> seemingly low-risk trivial optimizations and rewrites that cumulatively
> would make a "feature-freeze" effectively meaningless as a tool to start
> stabilizing the contents of the release.
> 
> In isolation many changes look innocuous. In the context of an organically
> grown open-source code-base that's this old, I've learned that it pays to
> be very, very cautious.
> 
> On Tue, Oct 23, 2018 at 3:33 PM Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:
> 
>> My objection (-0.5) is based on the freeze, not on code complexity
>> 
>> 
>> 
>> --
>> Jeff Jirsa
>> 
>> 
>>> On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith <benedict@xxxxxxxxxx> wrote:
>>> 
>>> To discuss the concerns about the patch for a more efficient representation:
>>> 
>>> The risk from such a patch is very low.  It’s a very simple in-memory data
>>> structure that we can introduce thorough fuzz tests for.  The only reason
>>> to exclude it would be wanting to begin strictly enforcing the freeze.
>>> This is a good enough reason in my book, which is why I’m neutral on its
>>> addition.  I just wanted to provide some context for everyone else's
>>> voting intention.
>>> 
>>> 
>>>> On 23 Oct 2018, at 16:51, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I just asked Jeff. He is -0 and -0.5 respectively.
>>>> 
>>>> Ariel
>>>> 
>>>>> On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
>>>>> I’m +1 on the change of default.  I think Jeff was -1 on that though.
>>>>> 
>>>>> 
>>>>>> On 23 Oct 2018, at 16:46, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> To summarize who we have heard from so far:
>>>>>> 
>>>>>> WRT changing just the default:
>>>>>> 
>>>>>> +1:
>>>>>> Jon Haddad
>>>>>> Ben Bromhead
>>>>>> Alain Rodriguez
>>>>>> Sankalp Kohli (not explicit)
>>>>>> 
>>>>>> -0:
>>>>>> Sylvain Lebresne
>>>>>> Jeff Jirsa
>>>>>> 
>>>>>> Not sure:
>>>>>> Kurt Greaves
>>>>>> Joshua McKenzie
>>>>>> Benedict Elliott Smith
>>>>>> 
>>>>>> WRT changing the representation:
>>>>>> 
>>>>>> +1:
>>>>>> There are only conditional +1s at this point
>>>>>> 
>>>>>> -0:
>>>>>> Sylvain Lebresne
>>>>>> 
>>>>>> -.5:
>>>>>> Jeff Jirsa
>>>>>> 
>>>>>> This (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
>>>>>> is a rough cut of the change for the representation. It needs better
>>>>>> naming, unit tests, javadoc etc. but it does implement the change.
>>>>>> 
>>>>>> Ariel
>>>>>>> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
>>>>>>> Sorry, to be clear - I'm +1 on changing the configuration default, but I
>>>>>>> think changing the compression in memory representations warrants further
>>>>>>> discussion and investigation before making a case for or against it yet.
>>>>>>> An optimization that reduces in memory cost by over 50% sounds pretty good
>>>>>>> and we never were really explicit that those sorts of optimizations would be
>>>>>>> excluded after our feature freeze.  I don't think they should necessarily
>>>>>>> be excluded at this time, but it depends on the size and risk of the patch.
>>>>>>> 
>>>>>>>> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad <jon@xxxxxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> I think we should try to do the right thing for the most people that we
>>>>>>>> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
>>>>>>>> of clusters created by a lot of different teams, going from brand new to
>>>>>>>> pretty damn knowledgeable.  I can't think of a single time over the last 2
>>>>>>>> years that I've seen a cluster use non-default settings for compression.
>>>>>>>> With only a handful of exceptions, I've lowered the chunk size considerably
>>>>>>>> (usually to 4 or 8K) and the impact has always been very noticeable,
>>>>>>>> frequently resulting in hardware reduction and cost savings.  Of all the
>>>>>>>> poorly chosen defaults we have, this is one of the biggest offenders that I
>>>>>>>> see.  There's a good reason ScyllaDB claims they're so much faster than
>>>>>>>> Cassandra - we ship a DB that performs poorly for 90+% of teams because we
>>>>>>>> ship for a specific use case, not a general one (time series on memory
>>>>>>>> constrained boxes being the specific use case)
>>>>>>>> 
>>>>>>>> This doesn't impact existing tables, just new ones.  More and more teams
>>>>>>>> are using Cassandra as a general purpose database, and we should acknowledge
>>>>>>>> that by adjusting our defaults accordingly.  Yes, we use a little bit more
>>>>>>>> memory on new tables if we just change this setting, and what we get out of
>>>>>>>> it is a massive performance win.
>>>>>>>> 
>>>>>>>> I'm +1 on the change as well.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli <kohlisankalp@xxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>>> (We should definitely harden the definition for freeze in a separate
>>>>>>>>> thread)
>>>>>>>>> 
>>>>>>>>> My thinking is that this is the best time to do this change as we have
>>>>>>>>> not even cut alpha or beta. All the people involved in the test will
>>>>>>>>> definitely be testing it again when we have these releases.
>>>>>>>>> 
>>>>>>>>>> On Oct 19, 2018, at 8:00 AM, Michael Shuler <michael@xxxxxxxxxxxxxx> wrote:
>>>>>>>>>> 
>>>>>>>>>>> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
>>>>>>>>>>> 
>>>>>>>>>>> At the risk of hijacking this thread, when are we going to transition from
>>>>>>>>>>> "no new features, change whatever else you want including refactoring and
>>>>>>>>>>> changing years-old defaults" to "ok, we think we have something that's
>>>>>>>>>>> stable, time to start testing"?
>>>>>>>>>> 
>>>>>>>>>> Creating a cassandra-4.0 branch would allow trunk to, for instance, get
>>>>>>>>>> a default config value change commit and get more testing. We might
>>>>>>>>>> forget again, from what I understand of Benedict's last comment :)
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Michael
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
>>>>>>>>>> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Jon Haddad
>>>>>>>> http://www.rustyrazorblade.com
>>>>>>>> twitter: rustyrazorblade
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Jon Haddad
>>>>>>> http://www.rustyrazorblade.com
>>>>>>> twitter: rustyrazorblade
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx