Re: CASSANDRA-13241 lower default chunk_length_in_kb


My objection (-0.5) is based on the freeze, not on code complexity.



-- 
Jeff Jirsa


> On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith <benedict@xxxxxxxxxx> wrote:
> 
> To discuss the concerns about the patch for a more efficient representation:
> 
> The risk from such a patch is very low.  It’s a very simple in-memory data structure that we can introduce thorough fuzz tests for.  The only reason to exclude it would be a wish to begin strictly enforcing the freeze.  This is a good enough reason in my book, which is why I’m neutral on its addition.  I just wanted to provide some context for everyone else's voting intentions.
> 
> 
>> On 23 Oct 2018, at 16:51, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
>> 
>> Hi,
>> 
>> I just asked Jeff. He is -0 and -0.5 respectively.
>> 
>> Ariel
>> 
>>> On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
>>> I’m +1 on the change of default.  I think Jeff was -1 on that though.
>>> 
>>> 
>>>> On 23 Oct 2018, at 16:46, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> To summarize who we have heard from so far:
>>>> 
>>>> WRT changing just the default:
>>>> 
>>>> +1:
>>>> Jon Haddad
>>>> Ben Bromhead
>>>> Alain Rodriguez
>>>> Sankalp Kohli (not explicit)
>>>> 
>>>> -0:
>>>> Sylvain Lebresne
>>>> Jeff Jirsa
>>>> 
>>>> Not sure:
>>>> Kurt Greaves
>>>> Joshua McKenzie
>>>> Benedict Elliott Smith
>>>> 
>>>> WRT changing the representation:
>>>> 
>>>> +1:
>>>> There are only conditional +1s at this point
>>>> 
>>>> -0:
>>>> Sylvain Lebresne
>>>> 
>>>> -0.5:
>>>> Jeff Jirsa
>>>> 
>>>> This (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce) is a rough cut of the change for the representation. It needs better naming, unit tests, javadoc etc. but it does implement the change.
>>>> 
>>>> Ariel
>>>>> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
>>>>> Sorry, to be clear - I'm +1 on changing the configuration default, but I
>>>>> think changing the in-memory compression representation warrants further
>>>>> discussion and investigation before making a case for or against it.
>>>>> An optimization that reduces in-memory cost by over 50% sounds pretty good
>>>>> and we never were really explicit that those sorts of optimizations would be
>>>>> excluded after our feature freeze.  I don't think they should necessarily
>>>>> be excluded at this time, but it depends on the size and risk of the patch.
>>>>> 
>>>>>> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad <jon@xxxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> I think we should try to do the right thing for the most people that we
>>>>>> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
>>>>>> of clusters created by a lot of different teams, going from brand new to
>>>>>> pretty damn knowledgeable.  I can't think of a single time over the last 2
>>>>>> years that I've seen a cluster use non-default settings for compression.
>>>>>> With only a handful of exceptions, I've lowered the chunk size considerably
>>>>>> (usually to 4 or 8K) and the impact has always been very noticeable,
>>>>>> frequently resulting in hardware reduction and cost savings.  Of all the
>>>>>> poorly chosen defaults we have, this is one of the biggest offenders that I
>>>>>> see.  There's a good reason ScyllaDB claims they're so much faster than
>>>>>> Cassandra - we ship a DB that performs poorly for 90+% of teams because we
>>>>>> ship for a specific use case, not a general one (time series on memory
>>>>>> constrained boxes being the specific use case).
>>>>>> 
>>>>>> This doesn't impact existing tables, just new ones.  More and more teams
>>>>>> are using Cassandra as a general-purpose database; we should acknowledge
>>>>>> that by adjusting our defaults accordingly.  Yes, we use a little bit more
>>>>>> memory on new tables if we just change this setting, and what we get out of
>>>>>> it is a massive performance win.
>>>>>> 
>>>>>> I'm +1 on the change as well.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli <kohlisankalp@xxxxxxxxx>
>>>>>> wrote:
>>>>>> 
>>>>>>> (We should definitely harden the definition for freeze in a separate
>>>>>>> thread)
>>>>>>> 
>>>>>>> My thinking is that this is the best time to do this change as we have
>>>>>>> not even cut alpha or beta. All the people involved in the test will
>>>>>>> definitely be testing it again when we have these releases.
>>>>>>> 
>>>>>>>> On Oct 19, 2018, at 8:00 AM, Michael Shuler <michael@xxxxxxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>>> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
>>>>>>>>> 
>>>>>>>>> At the risk of hijacking this thread, when are we going to transition from
>>>>>>>>> "no new features, change whatever else you want including refactoring and
>>>>>>>>> changing years-old defaults" to "ok, we think we have something that's
>>>>>>>>> stable, time to start testing"?
>>>>>>>> 
>>>>>>>> Creating a cassandra-4.0 branch would allow trunk to, for instance, get
>>>>>>>> a default config value change commit and get more testing. We might
>>>>>>>> forget again, from what I understand of Benedict's last comment :)
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Michael
>>>>>>>> 
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
>>>>>>>> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Jon Haddad
>>>>>> http://www.rustyrazorblade.com
>>>>>> twitter: rustyrazorblade
>>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Jon Haddad
>>>>> http://www.rustyrazorblade.com
>>>>> twitter: rustyrazorblade
>>>> 