Thank you for explaining, Alain!
Predetermining the nodes to query, then sending a 'data' request to one of them and a 'digest' request to another (for CL=QUORUM, RF=3), indeed explains the more effective use of the filesystem cache when dynamic snitching is disabled.
So, for each token range there will be one or more replicas that are never queried (2 replicas for CL=ONE, 1 replica for CL=QUORUM, with RF=3). But since data is evenly distributed across all nodes in the cluster, it looks like there shouldn't be any issues related to such load redistribution, except for the case you mentioned, when a node is having performance issues but all requests are being sent to it anyway.
From: Alain RODRIGUEZ <arodrime@xxxxxxxxx>
Sent: Wednesday, August 8, 2018 1:27:50 AM
To: user cassandra.apache.org
Subject: Re: dynamic_snitch=false, prioritisation/order or reads from replicas
But in case of CL=QUORUM/LOCAL_QUORUM, if I'm not wrong, read request is sent to all replicas waiting for first 2 to reply.
My understanding is that this sentence is wrong. It is indeed as you described for writes: all the replicas get the information (in all the data centers). It's not the case for reads. For reads, only x nodes are picked and used, where x is the number of replicas the consistency level requires (ONE, QUORUM, ALL, ...).
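To make the read/write difference concrete, here is a minimal sketch in Python. It is not Cassandra's actual code: the function names (`required_replicas`, `plan_read`, `plan_write`) and the node names are made up for illustration, and it deliberately ignores speculative retry and read repair, which can contact additional replicas.

```python
def required_replicas(cl, rf):
    """Number of replicas a read at this consistency level waits for."""
    levels = {"ONE": 1, "TWO": 2, "THREE": 3, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[cl]


def plan_read(replicas, cl):
    """Pick the replicas for a read.

    `replicas` is the replica list for the token range, in the order the
    snitch ranks them (an assumption of this sketch). The first chosen
    replica gets the 'data' request, the others get 'digest' requests.
    """
    n = required_replicas(cl, len(replicas))
    chosen = replicas[:n]
    return {
        "data": chosen[0],
        "digest": chosen[1:],
        "not_contacted": replicas[n:],
    }


def plan_write(replicas):
    """Writes go to every replica, whatever the consistency level is."""
    return list(replicas)


replicas = ["node1", "node2", "node3"]  # RF=3, hypothetical node names
for cl in ("ONE", "QUORUM", "ALL"):
    print(cl, plan_read(replicas, cl))
print("write", plan_write(replicas))
```

With RF=3 this leaves 2 replicas untouched at CL=ONE and 1 at CL=QUORUM, which matches the numbers discussed in this thread.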
Looks like the only change for dynamic_snitch=false is that "data" request is sent to a determined node instead of "currently the fastest one".
Indeed, the problem is that 'currently the fastest one' changes very often in certain cases, so in many cases the cache loses more efficiency than the better routing gains back.
The idea of not using the 'bad' nodes is interesting to have more predictable latencies when a node is slow for some reason. Yet one of the side effects of this (and of the scoring that does not seem to be absolutely reliable) is that the clients are often routed to distinct nodes when under pressure, due to GC pauses for example or any other pressure.
Saving disk reads in read-heavy workloads under pressure is more important than trying to save a few milliseconds picking the 'best' node I guess.
I can imagine that relieving the disks, by reducing the number of disk IOs and the disk throughput, ends up lowering the latency for all the nodes, so the client application latency improves overall. That is my understanding of why it is so often good to disable the dynamic_snitch.
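A toy simulation can illustrate the cache argument. This is only a sketch under strong assumptions, not a model of the real dynamic snitch: a 3-node cluster with RF=3 (so every node holds every key), reads at CL=ONE, uniform key access, and a small LRU cache per node standing in for the filesystem cache. The 'pinned' policy always sends a given key to the same node; the 'shifting' policy picks a random node each time, as a crude stand-in for a 'fastest node' that keeps changing. All names and sizes are made up.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache standing in for one node's filesystem cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def read(self, key):
        """Return True on a cache hit, False on a miss (the key is then cached)."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used key
        return False


def hit_rate(choose_node, n_reads=20000, n_keys=300, n_nodes=3,
             cache_size=100, seed=42):
    """Fraction of reads served from cache under a given routing policy."""
    rng = random.Random(seed)
    caches = [LRUCache(cache_size) for _ in range(n_nodes)]
    hits = 0
    for _ in range(n_reads):
        key = rng.randrange(n_keys)
        node = choose_node(key, rng, n_nodes)
        hits += caches[node].read(key)
    return hits / n_reads


def pinned(key, rng, n_nodes):
    # The same key is always read from the same node.
    return key % n_nodes


def shifting(key, rng, n_nodes):
    # The preferred node keeps changing, so any node may serve any key.
    return rng.randrange(n_nodes)


pinned_rate = hit_rate(pinned)
shifting_rate = hit_rate(shifting)
print(f"pinned:   {pinned_rate:.2f}")
print(f"shifting: {shifting_rate:.2f}")
```

In this setup each node under the pinned policy only ever sees a third of the keys, which fits in its cache, while under the shifting policy every cache is asked for every key and keeps evicting. Real workloads are not uniform and real nodes do get slow, so this only shows the direction of the effect, not its size.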
Did you get improved response for CL=ONE only or for higher CL's as well?
I must admit I don't remember for sure, but many people are using 'LOCAL_QUORUM' and I think I saw this for this consistency level as well. Plus this question might no longer stand as reads in Cassandra work slightly differently than what you thought.
I am not 100% comfortable with this 'dynamic_snitch theory' topic, so I hope someone else can correct me if I am wrong, confirm or add information :). But for sure I have seen this disabled giving some really nice improvement (as many others here as you mentioned). Sometimes it was not helpful, but I have never seen this change being really harmful though.
2018-08-06 22:27 GMT+01:00 Kyrylo Lebediev <Kyrylo_Lebediev@xxxxxxxx.invalid>: