Re: RangeAwareCompaction for manual token management


It should work fine with num_tokens: 1

Without vnodes it also flushes to per-range sstables (if you have RF=3 you
will "always" get 3 sstables after flush), while with vnodes it groups the
ranges and flushes one sstable per disk, so if you have a single
data_file_directories entry you get only one sstable; compaction will then
write them out to per-range sstables once they accumulate enough data.
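
To illustrate why RF=3 with a single token gives three ranges per node (and so
three sstables per flush), here is a minimal Python sketch. It is not Cassandra
code: the function name local_ranges is hypothetical, and it assumes one token
per node with SimpleStrategy-like placement, where a node stores its own
primary range plus the primary ranges of the RF - 1 nodes before it on the ring.

```python
def local_ranges(ring_tokens, node_token, rf):
    """Return the (start, end] token ranges replicated on one node.

    Assumptions (illustrative only): one token per node, and replicas of a
    range live on the range's owner plus the next rf - 1 nodes clockwise.
    """
    tokens = sorted(ring_tokens)
    n = len(tokens)
    i = tokens.index(node_token)
    ranges = []
    # Walk backwards around the ring: own range first, then predecessors'.
    for k in range(min(rf, n)):
        end = tokens[(i - k) % n]
        start = tokens[(i - k - 1) % n]
        ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    # Toy ring with four nodes; real tokens are 64-bit values.
    for r in local_ranges([-6, -2, 2, 6], node_token=2, rf=3):
        print(r)
```

A range whose start is greater than its end wraps around the ring. How a real
node handles that case at flush time is outside the scope of this sketch.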

/Marcus

On Thu, Jul 19, 2018 at 11:34 PM Carl Mueller
<carl.mueller@xxxxxxxxxxxxxxx.invalid> wrote:

> I don't want to comment on the 10540 ticket since it seems very well
> focused on vnode-aligned sstable partitioning and compaction. I'm pretty
> excited about that ticket. RACS should enable:
>
> - smaller scale LCS, more constrained I/O consumption
> - fewer sstables to hit in read path
> - multithreaded/multiprocessor compactions and even serving of data based
> on individual vnode or pools of vnodes
> - better alignment of tombstones with data they should be
> nullifying/eventually removing
> - repair streaming efficiency
> - backups have more granularity for not uploading sstables that didn't
> change for the range since last backup snapshot
>
> There are ongoing discussions about using Priam for cluster management where
> I am, and as I understand it (superficially) Priam does not use vnodes and
> uses manual tokens, and expands via node multiples. I believe it has certain
> advantages over vnodes including expanding by multiple machines at once,
> backups could possibly do (nodecount / RF) number of nodes for data backups
> rather than the mess of vnodes where you have to do basically all of them.
>
> But we could still do some divisor split of the manual range and apply RACS
> to that. I guess this would be vnode-lite. We could have some number like
> 100 subranges on a node, and expansion might just involve a temporarily
> lower subrange count until the sstables can be reprocessed to the
> typical subrange count.
>
> Is this theoretically correct, or are there glaring things I might have
> missed with respect to RACS-style compaction and manual tokens?
>
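
The "divisor split" of a manually assigned range described above could be
sketched as follows. This is only an illustration, not how RACS or
CASSANDRA-10540 implements it: split_range is a hypothetical helper, and the
token bounds assume Murmur3Partitioner's signed 64-bit token space.

```python
# Assumed token space: signed 64-bit, as used by Murmur3Partitioner.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1
RING_SIZE = 2**64


def wrap(token):
    """Map an arbitrary integer back into [MIN_TOKEN, MAX_TOKEN]."""
    return (token - MIN_TOKEN) % RING_SIZE + MIN_TOKEN


def split_range(start, end, parts):
    """Split the (start, end] range into `parts` contiguous subranges.

    Handles ranges that wrap past MAX_TOKEN. start == end is treated as
    the full ring (an assumption made for this sketch).
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    width = (end - start) % RING_SIZE
    if width == 0:
        width = RING_SIZE
    # Integer boundaries spaced as evenly as the width allows.
    bounds = [wrap(start + width * k // parts) for k in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    # A node's manual range split into 100 subranges ("vnode-lite").
    subranges = split_range(-3 * 2**61, 2**61, 100)
    print(len(subranges), subranges[0], subranges[-1])
```

After adding nodes, each node's range shrinks, so the same call with a smaller
parts value would give the temporarily lower subrange count mentioned above.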