Re: CASSANDRA-13241 lower default chunk_length_in_kb


+1

I would guess that a lot of C* clusters/tables have this option set to the
default value, and that not many of them need to read such big chunks of
data.
I believe this will greatly limit disk overreads for a fair number (a big
majority?) of new users. It seems fair enough to change this default value,
and I also think 4.0 is a nice place to do it.

Thanks for taking care of this, Ariel, and for making sure there is
consensus here as well.

C*heers,
-----------------------
Alain Rodriguez - alain@xxxxxxxxxxxxxxxxx
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Sat, Oct 13, 2018 at 08:52, Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:

> Hi,
>
> This would only impact new tables; existing tables would get their
> chunk_length_in_kb from the existing schema. It's something we record in a
> system table.
>
> I have an implementation of a compact integer sequence that only requires
> 37% of the memory required today. So we could do this with only slightly
> more than doubling the memory used. I'll post that to the JIRA soon.
>
> Ariel
>
> On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> >
> >
> > I think 16k is a better default, but it should only affect new tables.
> > Whoever changes it, please make sure you think about the upgrade path.
> >
> >
> > > On Oct 12, 2018, at 2:31 AM, Ben Bromhead <ben@xxxxxxxxxxxxxxx> wrote:
> > >
> > > This is something that's bugged me for ages. Tbh, the performance gain for
> > > most use cases far outweighs the increase in memory usage, and I would even
> > > be in favor of changing the default now and optimizing the storage cost later
> > > (if it's found to be worth it).
> > >
> > > For some anecdotal evidence:
> > > 4kb is usually what we end up setting it to. 16kb feels more reasonable given
> > > the memory impact, but what would be the point if, practically, most folks
> > > set it to 4kb anyway?
> > >
> > > Note that chunk_length will largely depend on your read sizes, but 4k
> > > is the floor for most physical devices in terms of their block size.
> > >
> > > +1 for making this change in 4.0, given the small size and the large
> > > improvement to new users' experience (as long as we are explicit in the
> > > documentation about memory consumption).
> > >
> > >
> > >> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg <ariel@xxxxxxxxxxx> wrote:
> > >>
> > >> Hi,
> > >>
> > >> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
> > >>
> > >> This ticket has languished for a while. IMO it's too late in 4.0 to
> > >> implement a more memory-efficient representation for compressed chunk
> > >> offsets. However, I don't think we should put out another release with the
> > >> current 64k default, as it's pretty unreasonable.
> > >>
> > >> I propose that we lower the value to 16kb. 4k might never be the correct
> > >> default anyway, as there is a cost to compression, and 16k will still be a
> > >> large improvement.
> > >>
> > >> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
> > >> past there has been some consensus about reducing this value, although maybe
> > >> with more memory efficiency.
> > >>
> > >> The napkin math for what this costs is:
> > >> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
> > >> at 8 bytes each (128MB).
> > >> With 16k chunks, that's 512MB.
> > >> With 4k chunks, it's 2G.
> > >> Per terabyte of data (pre-compression)."
> > >>
> > >>
> > >> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
> > >>
> > >> By way of comparison, memory mapping the files has a similar cost per 4k
> > >> page of 8 bytes. Multiple mappings make this more expensive. With a
> > >> default of 16kb this would be 4x less expensive than memory mapping a file.
> > >> I only mention this to give a sense of the costs we are already paying. I
> > >> am not saying they are directly related.
> > >>
> > >> I'll wait a week for discussion and, if there is consensus, make the change.
> > >>
> > >> Regards,
> > >> Ariel
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> > >> For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx
> > >>
> > >> --
> > > Ben Bromhead
> > > CTO | Instaclustr <https://www.instaclustr.com/>
> > > +1 650 284 9692
> > > Reliability at Scale
> > > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
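
The napkin math quoted in Ariel's original message can be reproduced with a
few lines of Python. This is only a sketch under the thread's stated
assumption of one 8-byte offset per compressed chunk, and it further assumes
binary units (1TB taken as 2^40 bytes, 64k as 65536 bytes). The function and
constant names are made up for illustration and are not Cassandra APIs.

```python
# Back-of-the-envelope estimate of compressed chunk offset memory.
# Assumption from the thread: one 8-byte offset per chunk.
# Assumption added here: binary units (1TB = 2**40 bytes, 1k = 1024 bytes).
# All names below are illustrative, not part of Cassandra.

KIB = 1024
MIB = 1024 * KIB
TIB = 1024 ** 4
OFFSET_BYTES = 8  # per-chunk cost quoted in the thread


def offset_memory_bytes(data_bytes: int, chunk_length_kb: int) -> int:
    """Bytes needed to hold one offset per chunk of uncompressed data."""
    chunk_bytes = chunk_length_kb * KIB
    chunks = -(-data_bytes // chunk_bytes)  # ceiling division
    return chunks * OFFSET_BYTES


if __name__ == "__main__":
    for kb in (64, 16, 4):
        mib = offset_memory_bytes(TIB, kb) // MIB
        print(f"{kb:>2}k chunks: {mib} MiB per TiB of uncompressed data")
```

Under these assumptions the 4k figure equals the 8 bytes per 4k page quoted
for memory mapping, which is where the "4x less expensive" comparison for a
16kb default comes from.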