git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: secondary index table - tombstones surviving compactions


This would matter for the base table, but would be less likely for the secondary index, where the partition key is the value of the base row

Roman: there’s a config option related to only purging repaired tombstones - do you have that enabled ? If so, are you running repairs?

-- 
Jeff Jirsa


> On May 18, 2018, at 6:41 AM, Eric Stevens <mightye@xxxxxxxxx> wrote:
> 
> The answer to Question 3 is "yes."  One of the more subtle points about
> tombstones is that Cassandra won't remove them during compaction if there
> is a bloom filter on any SSTable on that replica indicating that it
> contains the same partition (not primary) key.  Even if it is older than
> gc_grace, and would otherwise be a candidate for cleanup.
> 
> If you're recycling partition keys, your tombstones may never be able to be
> cleaned up, because in this scenario there is a high probability that an
> SSTable not involved in that compaction also contains the same partition
> key, and so compaction cannot have confidence that it's safe to remove the
> tombstone (it would have to fully materialize every record in the
> compaction, which is too expensive).
> 
> In general it is an antipattern in Cassandra to write to a given partition
> indefinitely for this and other reasons.
> 
> On Fri, May 18, 2018 at 2:37 AM Roman Bielik <
> roman.bielik@xxxxxxxxxxxxxxxxxxxx> wrote:
> 
>> Hi,
>> 
>> I have a Cassandra 3.11 table (with compact storage) and using secondary
>> indices with rather unique data stored in the indexed columns. There are
>> many inserts and deletes, so in order to avoid tombstones piling up I'm
>> re-using primary keys from a pool (which works fine).
>> I'm aware that this design pattern is not ideal, but for now I can not
>> change it easily.
>> 
>> The problem is, the size of 2nd index tables keeps growing (filled with
>> tombstones) no matter what.
>> 
>> I tried some aggressive configuration (just for testing) in order to
>> expedite the tombstone removal but with little-to-zero effect:
>> COMPACTION = { 'class':
>> 'LeveledCompactionStrategy', 'unchecked_tombstone_compaction': 'true',
>> 'tombstone_compaction_interval': 600 }
>> gc_grace_seconds = 600
>> 
>> I'm aware that perhaps Materialized views could provide a solution to this,
>> but I'm bind to the Thrift interface, so can not use them.
>> 
>> Questions:
>> 1. Is there something I'm missing? How come compaction does not remove the
>> obsolete indices/tombstones from 2nd index tables? Can I trigger the
>> cleanup manually somehow?
>> I have tried nodetool flush, compact, rebuild_index on both data table and
>> internal Index table, but with no result.
>> 
>> 2. When deleting a record I'm deleting the whole row at once - which would
>> create one tombstone for the whole record if I'm correct. Would it help to
>> delete the indexed columns separately creating extra tombstone for each
>> cell?
>> As I understand the underlying mechanism, the indexed column value must be
>> read in order a proper tombstone for the index is created for it.
>> 
>> 3. Could the fact that I'm reusing the primary key of a deleted record
>> shortly for a new insert interact with the secondary index tombstone
>> removal?
>> 
>> Will be grateful for any advice.
>> 
>> Regards,
>> Roman
>> 
>> --
>> <http://www.openmindnetworks.com>
>> <http://www.openmindnetworks.com/>
>> <https://www.linkedin.com/company/openmind-networks>
>> <https://twitter.com/Openmind_Ntwks>  <http://www.openmindnetworks.com/>
>> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx