Re: TWCS Compaction backed up

Hi Jeff/Jon et al, here is what I'm thinking to do to clean up, please lmk what you think. 

This is precisely my problem I believe:

With this I have a lot of wasted space due to a bad incremental repair.  So I am thinking of abandoning incremental repairs by:
- Setting all repairedAt values to 0 on any/all *Data.db SSTables
- Running subrange repairs, using either or reaper

Will this clean everything up?  
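To make the subrange step concrete, here is a rough sketch of how one token range could be cut into smaller pieces and turned into `nodetool repair -st/-et` commands. The keyspace and table names (my_ks, my_cf) and the token values are placeholders, not values from this cluster. It assumes a non-wrapping range that one node already owns; tools like Reaper handle the range discovery and scheduling that this skips.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous subranges."""
    if not start < end:
        raise ValueError("wrapping ranges are not handled in this sketch")
    span = end - start
    bounds = [start + span * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, table, start, end, parts):
    """Build one `nodetool repair` command line per subrange (nothing is executed)."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace} {table}"
        for st, et in split_range(start, end, parts)
    ]


if __name__ == "__main__":
    # Placeholder tokens; real ones would come from the ring description.
    for cmd in repair_commands("my_ks", "my_cf", 0, 1000, 4):
        print(cmd)
```

The commands are only printed, so they can be reviewed before anything is run against a node.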

On Tue, Aug 7, 2018 at 9:18 PM Brian Spindler <brian.spindler@xxxxxxxxx> wrote:
In fact all of them say Repaired at: 0. 

On Tue, Aug 7, 2018 at 9:13 PM Brian Spindler <brian.spindler@xxxxxxxxx> wrote:
Hi, I spot checked a couple of the files that were ~200MB and they mostly had "Repaired at: 0" so maybe that's not it? 


On Tue, Aug 7, 2018 at 8:16 PM <brian.spindler@xxxxxxxxx> wrote:
Everything is ttl’d 

I suppose I could use sstablemetadata to see the repaired bit. Could I just set that to unrepaired somehow, and would that fix it? 
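A small sketch of the check described here: given the text that sstablemetadata prints for each file, pull out the "Repaired at:" value and sort the files into repaired and unrepaired sets. The function names and the dict-of-outputs input are made up for illustration; collecting the output from the tool is left out. As far as I know the flag itself can be reset with the sstablerepairedset tool while the node is stopped, but please verify that against your Cassandra version.

```python
import re


def repaired_at(metadata_text):
    """Return the integer after 'Repaired at:' in the given text, or None if absent."""
    match = re.search(r"^Repaired at:\s*(\d+)", metadata_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def split_by_repair_state(outputs):
    """outputs: dict mapping SSTable path -> sstablemetadata output text.

    Files with a missing or zero 'Repaired at' value count as unrepaired.
    """
    repaired, unrepaired = [], []
    for path, text in sorted(outputs.items()):
        (repaired if repaired_at(text) else unrepaired).append(path)
    return repaired, unrepaired
```

If the repaired list comes back empty, mixed repaired/unrepaired sets are probably not what is holding compaction back.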


On Aug 7, 2018, at 8:12 PM, Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:

May be worth seeing if any of the sstables got promoted to repaired - if so they’re not eligible for compaction with unrepaired sstables and that could explain some higher counts

Do you actually do deletes or is everything ttl’d?

Jeff Jirsa

On Aug 7, 2018, at 5:09 PM, Brian Spindler <brian.spindler@xxxxxxxxx> wrote:

Hi Jeff, mostly lots of little files, like there will be 4-5 that are 1-1.5gb or so and then many at 5-50MB and many at 40-50MB each.   

Re incremental repair: yes, one of my engineers started an incremental repair on this column family that we had to abort.  In fact, the node that the repair was initiated on ran out of disk space and we ended up replacing that node like a dead node.   

Oddly the new node is experiencing this issue as well.  


On Tue, Aug 7, 2018 at 8:04 PM Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:
You could toggle off the tombstone compaction to see if that helps, but that should be lower priority than normal compactions

Are the lots-of-little-files from memtable flushes or repair/anticompaction?

Do you do normal deletes? Did you try to run Incremental repair?  

Jeff Jirsa

On Aug 7, 2018, at 5:00 PM, Brian Spindler <brian.spindler@xxxxxxxxx> wrote:

Hi Jonathan, both I believe.  

The window size is 1 day, full settings: 
    AND compaction = {'timestamp_resolution': 'MILLISECONDS', 'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.2', 'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'} 
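For reference, a simplified sketch of what a 1-day window means for these settings: each SSTable is assigned to a window by its newest timestamp, and SSTables in the same window are candidates to be compacted together. This is my reading of the strategy rather than its actual code; it assumes millisecond timestamps (as in 'timestamp_resolution') and UTC-aligned day boundaries, and the SSTable names are placeholders.

```python
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60  # compaction_window_size = 1, compaction_window_unit = DAYS


def window_start(max_timestamp_ms):
    """Lower bound, in epoch seconds, of the 1-day window containing the timestamp."""
    seconds = max_timestamp_ms // 1000
    return seconds - seconds % WINDOW_SECONDS


def group_by_window(sstables):
    """sstables: iterable of (name, max_timestamp_ms) -> {window_start: [names]}."""
    windows = defaultdict(list)
    for name, max_ts_ms in sstables:
        windows[window_start(max_ts_ms)].append(name)
    return dict(windows)
```

A window that still holds hundreds of names after its day has passed is one where compaction has not caught up.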

nodetool tpstats 

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0    68582241832         0                 0
ReadStage                         0         0      209566303         0                 0
RequestResponseStage              0         0    44680860850         0                 0
ReadRepairStage                   0         0       24562722         0                 0
CounterMutationStage              0         0              0         0                 0
MiscStage                         0         0              0         0                 0
HintedHandoff                     1         1            203         0                 0
GossipStage                       0         0        8471784         0                 0
CacheCleanupExecutor              0         0            122         0                 0
InternalResponseStage             0         0         552125         0                 0
CommitLogArchiver                 0         0              0         0                 0
CompactionExecutor                8        42        1433715         0                 0
ValidationExecutor                0         0           2521         0                 0
MigrationStage                    0         0         527549         0                 0
AntiEntropyStage                  0         0           7697         0                 0
PendingRangeCalculator            0         0             17         0                 0
Sampler                           0         0              0         0                 0
MemtableFlushWriter               0         0         116966         0                 0
MemtablePostFlush                 0         0         209103         0                 0
MemtableReclaimMemory             0         0         116966         0                 0
Native-Transport-Requests         1         0     1715937778         0            176262

Message type           Dropped
READ                         2
RANGE_SLICE                  0
_TRACE                       0
MUTATION                  4390
COUNTER_MUTATION             0
BINARY                       0
REQUEST_RESPONSE          1882
PAGED_RANGE                  0
READ_REPAIR                  0
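To track the backlog over time instead of eyeballing it, the pool rows of this output can be parsed with a few lines. This is a sketch that assumes the column layout shown above (name followed by five integer columns); the function name is made up.

```python
COLUMNS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_tpstats_pools(text):
    """Parse the thread-pool rows of `nodetool tpstats` output into a dict of dicts."""
    pools = {}
    for line in text.splitlines():
        fields = line.split()
        # Pool rows are a single-word name plus five integers; the header
        # and the dropped-message rows do not match this shape.
        if len(fields) == 6 and all(f.isdigit() for f in fields[1:]):
            pools[fields[0]] = dict(zip(COLUMNS, map(int, fields[1:])))
    return pools
```

For the output above, `parse_tpstats_pools(output)["CompactionExecutor"]["pending"]` would be the number to watch (42 at the time of the paste).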

On Tue, Aug 7, 2018 at 7:57 PM Jonathan Haddad <jon@xxxxxxxxxxxxx> wrote:
What's your window size?

When you say backed up, how are you measuring that?  Are there pending tasks or do you just see more files than you expect?

On Tue, Aug 7, 2018 at 4:38 PM Brian Spindler <brian.spindler@xxxxxxxxx> wrote:
Hey guys, quick question: 
I've got a Cassandra v2.1 cluster, 12 nodes on AWS i3.2xl, commit log on one drive, data on NVMe.  That was working very well; it's a time-series DB and has been accumulating data for about 4 weeks.  

The nodes have increased in load and compaction seems to be falling behind.  I used to get about 1 file per day for this column family, a ~30GB Data.db file per day.  I am now getting hundreds per day at 1MB - 50MB.

How to recover from this? 

I can scale out to give some breathing room but will it go back and compact the old days into nicely packed files for the day?    

I tried raising compaction throughput from 256 to 1000 and it seemed to make things worse for the CPU. The nodes are i3.2xl, configured with 8 compaction threads. 


Lastly, I have mixed TTLs in this CF and need to run a repair (I think) to get rid of old tombstones. However, running repairs in 2.1 on TWCS column families causes a very large spike in sstable counts due to anti-compaction, which causes a lot of disruption. Is there any other way?  

Jon Haddad
twitter: rustyrazorblade

