Fwd: Re: How to gracefully decommission a highly loaded node?


After the node had been stuck in LEAVING for a long time without doing any streaming, I killed the Cassandra process, restarted it, and ran nodetool decommission again (the DataStax recipe for a stuck decommission).
Now it reports LEAVING, "unbootstrap $(the node id)".

What's going on? Should I forget about decommission and just remove the node?

There is an issue to make decommission resumable:
https://issues.apache.org/jira/browse/CASSANDRA-12008

but I couldn't figure out how this is supposed to work. I expected that restarting the node with the stuck decommission would resume the decommissioning process, but the node came back as UN after the restart.




============ Forwarded message ============
From : Simon Fontana Oscarsson <simon.fontana.oscarsson@xxxxxxxxxxxx>
To : "user@xxxxxxxxxxxxxxxxxxxx"<user@xxxxxxxxxxxxxxxxxxxx>
Date : Tue, 04 Dec 2018 15:20:15 +0330
Subject : Re: How to gracefully decommission a highly loaded node?
============ Forwarded message ============

Hi,

If the node is already at 100% CPU, I have a hard time seeing it complete a decommission while still serving requests. If you have a lot of free disk space, I would first try nodetool disableautocompaction. If you don't see any progress in nodetool netstats, you can also run disablebinary, disablethrift and disablehandoff to stop serving client requests.
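
As a rough sketch, the sequence above could be scripted like this. The run and quiesce_node helpers and the DRY_RUN variable are names made up for illustration; only the nodetool subcommands come from the advice above. The script prints the commands by default instead of running them, and disablethrift is likely to be rejected on Cassandra versions that no longer ship Thrift (4.0 and later, as far as I know).

```shell
#!/bin/sh
# Sketch only: reduce the load on a node before or during a decommission.
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop background compaction first, then client traffic.
quiesce_node() {
    run nodetool disableautocompaction   # stop automatic compactions
    run nodetool disablebinary           # stop the native (CQL) transport
    run nodetool disablethrift           # stop the Thrift transport (pre-4.0 only)
    run nodetool disablehandoff          # stop hinted handoff
}

quiesce_node
# Check whether any streams are making progress.
run nodetool netstats
```

Running it with DRY_RUN=0 on the affected node would execute the commands for real; it may be safer to run them one at a time and look at nodetool netstats in between.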

--
SIMON FONTANA OSCARSSON
Software Developer

Ericsson
Ölandsgatan 1
37133 Karlskrona, Sweden
simon.fontana.oscarsson@xxxxxxxxxxxx
www.ericsson.com

On Tue, 2018-12-04 at 14:21 +0330, onmstester onmstester wrote:

One node suddenly started using 100% CPU. I suspect a hardware problem and do not have time to trace it, so I decided to just remove the node from the cluster. Although the node state changed to UL, there is no sign of it actually leaving: the node is still compacting, flushing memtables and writing mutations, and CPU has been at 100% for hours.
Is there any way to force a Cassandra node to just decommission and stop doing its normal work?
Because writes use consistency level ONE (W.CL=ONE), I cannot use removenode and shut down the node.

Best Regards




