Re: C* as fluent data storage, 10MB/sec/node?


Probably fine, as long as there’s some concept of time in the partition key to keep partitions from growing unbounded.

Use TWCS with TTLs and something like 5-10 minute buckets. Don’t use RF=1, but you can write at CL ONE. TWCS will largely just drop whole sstables as they expire (especially with 3.11 and the more aggressive expiration logic there).
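
To make the bucketing advice concrete, here is a minimal sketch in Python. It assumes a hypothetical keyspace/table `ks.fluent_data` with hypothetical columns `source_id`, `bucket`, `item_id` and `payload`; the 5-minute bucket and 10-minute compaction window are illustrative picks, and the 1-hour TTL comes from the original question. None of this is a tuned recommendation.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 300   # assumed 5-minute buckets (advice above: 5-10 minutes)
TTL_SECONDS = 3600     # 1 hour retention, as in the original question


def time_bucket(ts: datetime, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Return the epoch-second start of the bucket that contains ts."""
    epoch = int(ts.timestamp())
    return epoch - (epoch % bucket_seconds)


# Hypothetical schema: the bucket is part of the partition key, so no
# partition keeps growing after its time window has passed.
CREATE_TABLE = f"""
CREATE TABLE IF NOT EXISTS ks.fluent_data (
    source_id text,
    bucket    bigint,
    item_id   timeuuid,
    payload   blob,
    PRIMARY KEY ((source_id, bucket), item_id)
) WITH compaction = {{
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'MINUTES',
    'compaction_window_size': '10'
}} AND default_time_to_live = {TTL_SECONDS};
"""

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("bucket for this write:", time_bucket(now))
    print(CREATE_TABLE)
```

Each write would compute `time_bucket(now)` and use it as the `bucket` value. With the DataStax Python driver, writing at CL ONE should be a matter of setting `consistency_level=ConsistencyLevel.ONE` on the statement; that part is left out here so the sketch runs without a cluster.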



-- 
Jeff Jirsa


> On Nov 28, 2018, at 11:24 AM, Adam Smith <adamsmith8745@xxxxxxxxx> wrote:
> 
> Hi All,
> 
> I need to use C* as a kind of fluent data storage. Maybe this is different from the queue antipattern? Lots of data comes in (10MB/sec/node), remains for e.g. 1 hour, and should then be evicted. It is not critical if data occasionally disappears or gets lost.
> 
> Thankful for any advice!
> 
> Is this possible nowadays without suffering too much from compaction? I would not have range tombstones and, depending on the solution, would use only point deletes (PK+CK). There is only one CK, which could also be empty.
> 
> 1) The data is usually 1 MB. Can I just update it with empty data? PK + CK would remain, but I would not care about that. Would this create tombstones, or is it equivalent to a DELETE?
> 
> 2) Like 1), and then later set a TTL, so that only a small amount of data is left to delete? And hopefully a small compaction?
> 
> 3) Simply setting a TTL of 1h and hoping for the best, because my worries are unfounded?
> 
> 4) Any optimization strategies, like setting the RF to 1? Which compaction strategy is advised?
> 
> 5) Are there any recent performance benchmarks for one of the scenarios? 
> 
> What else could I do?
> 
> Thanks a lot!
> Adam
