
RE: Bootstrap OOM issues with Cassandra 3.11.1


In the past, we have seen corruption in the persisted key cache resulting in OOM at startup, so perhaps something similar (corruption) is possible for commit logs as well, resulting in very large on-heap allocations (think of allocating a byte array with a buggy initial size). According to your Cassandra log, this first happens while processing the commit log, which sounds strange.
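
A minimal sketch of the failure mode I mean (my own illustration, not Cassandra's actual commit-log reader): a deserializer that trusts a length prefix read from disk will attempt a huge allocation the moment that prefix is corrupted, long before any checksum could catch it. The class and method names here are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class CorruptLengthDemo {
    // Trusting deserializer: a corrupted prefix (e.g. 0x7FFFFFFF) turns
    // into a ~2GB allocation and an immediate OutOfMemoryError.
    static byte[] readBlob(DataInputStream in) throws IOException {
        int len = in.readInt();          // corrupted on disk -> huge value
        byte[] buf = new byte[len];      // allocation happens here
        in.readFully(buf);
        return buf;
    }

    // Defensive variant: validate the prefix against a sane upper bound
    // before allocating.
    static byte[] readBlobSafely(DataInputStream in, int maxLen) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > maxLen)
            throw new IOException("suspect blob length: " + len);
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }

    public static void main(String[] args) {
        // A payload whose 4-byte length prefix claims 2^31-1 bytes.
        byte[] corrupt = {0x7F, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        try {
            readBlobSafely(new DataInputStream(new ByteArrayInputStream(corrupt)), 1 << 20);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage()); // prints "rejected: suspect blob length: 2147483647"
        }
    }
}
```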

 

Just a very wild guess.

 

Thomas

 

From: Laszlo Szabo <laszlo.viktor.szabo@xxxxxxxxx>
Sent: Dienstag, 07. August 2018 17:39
To: user@xxxxxxxxxxxxxxxxxxxx
Subject: Re: Bootstrap OOM issues with Cassandra 3.11.1

 

The last run I attempted used 135GB of RAM allocated to the JVM (arguments below), and while there are OOM errors, there is no stack trace in either the system or debug log.  On direct-memory runs, there is a stack trace.  The last direct-memory run used a 60GB heap and 60GB off heap (that was the stack trace attached).  The HPROF file is 135GB and I'm trying to generate the heap information from it now, but it's been running for 2 hours.

 

The closest I can get to the stack trace for the 135GB heap run is below.  

 

ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2018-08-07 00:34:13,980 JVMStabilityInspector.java:82 - Exiting due to error while processing commit log during initialization.
java.lang.OutOfMemoryError: Java heap space
ERROR [MessagingService-Incoming-/10.1.1.11] 2018-08-07 00:34:13,980 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/10.1.1.11,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [HintsWriteExecutor:1] 2018-08-07 00:34:13,981 CassandraDaemon.java:228 - Exception in thread Thread[HintsWriteExecutor:1,5,main]
java.lang.OutOfMemoryError: Java heap space
ERROR [MessagingService-Incoming-/10.1.1.13] 2018-08-07 00:34:13,980 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/10.1.1.13,5,main]
java.lang.OutOfMemoryError: Java heap space
INFO  [Service Thread] 2018-08-07 00:34:13,982 StatusLogger.java:101 - system.schema_triggers                    0,0

 

 

JVM Arguments: [-Xloggc:/var/log/cassandra/gc.log, -ea, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, -XX:+UseTLAB, -XX:+ResizeTLAB, -XX:+UseNUMA, -XX:+PerfDisableSharedMem, -Djava.net.preferIPv4Stack=true, -Xms135G, -Xmx135G, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:+CMSClassUnloadingEnabled, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -XX:+PrintPromotionFailure, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Xmn2048M, -XX:+UseCondCardMark, -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler, -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar, -Dcassandra.jmx.local.port=7199, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password, -Djava.library.path=/usr/share/cassandra/lib/sigar-bin, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/var/log/cassandra, -Dcassandra.storagedir=/var/lib/cassandra, -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid, -XX:HeapDumpPath=/var/lib/cassandra/java_1533559747.hprof, -XX:ErrorFile=/var/lib/cassandra/hs_err_1533559747.log]

 

 

On Tue, Aug 7, 2018 at 11:12 AM, Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:

That's a direct memory OOM - it's not the heap, it's the offheap. 

 

You can see that gpsmessages.addressreceivedtime_idx is holding about 2GB of off-heap memory (most of it for the bloom filter), but none of the others look like they're holding a ton off heap (in bloom filters, memtables, etc.).  With what JVM args are you starting Cassandra (how much direct memory are you allocating)? Are all of your OOMs in direct memory?
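
Jeff's distinction can be reproduced in isolation. The standalone sketch below (mine, not from the thread) shows that direct-buffer allocations are capped by -XX:MaxDirectMemorySize (which defaults to the heap size when unset) independently of how empty the heap is, which is exactly the "Direct buffer memory" OOM in the attached trace:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DirectMemoryProbe {
    // Allocates 1 MiB direct buffers until the JVM refuses. Run with e.g.
    //   java -Xmx4g -XX:MaxDirectMemorySize=64m DirectMemoryProbe
    // and it fails after roughly 64 buffers even though the 4GB heap is
    // nearly empty: direct memory is a separate budget from -Xmx.
    public static void main(String[] args) {
        List<ByteBuffer> pinned = new ArrayList<>(); // hold references so nothing is freed
        try {
            while (true)
                pinned.add(ByteBuffer.allocateDirect(1 << 20)); // off-heap allocation
        } catch (OutOfMemoryError e) {
            // Same error class/message family as the Cassandra log:
            // java.lang.OutOfMemoryError: Direct buffer memory
            System.out.println("direct OOM after ~" + pinned.size() + " MiB: " + e.getMessage());
        }
    }
}
```

Run without the flag, direct memory is bounded by the heap size, which is why raising -Xmx alone can mask (or merely postpone) this class of OOM.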

 

 

 

On Tue, Aug 7, 2018 at 6:30 AM, Laszlo Szabo <laszlo.viktor.szabo@xxxxxxxxx> wrote:

Hi,

 

Thanks for the fast response!

 

We are not using any materialized views, but there are several indexes.  I don't have a recent heap dump, and it will be about 24 hours before I can generate an interesting one, but most of the memory was allocated to byte buffers, so not entirely helpful.

 

nodetool cfstats is also below.

 

I also see a lot of flushing happening, but it seems like there are too many small allocations for it to be effective.  Here are the messages I see:

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,459 ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo', ColumnFamily='gpsmessages') to free up room. Used total: 0.54/0.05, live: 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,459 ColumnFamilyStore.java:915 - Enqueuing flush of gpsmessages: 0.000KiB (0%) on-heap, 0.014KiB (0%) off-heap

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,460 ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo', ColumnFamily='user_history') to free up room. Used total: 0.54/0.05, live: 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,461 ColumnFamilyStore.java:915 - Enqueuing flush of user_history: 0.000KiB (0%) on-heap, 0.011KiB (0%) off-heap

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,465 ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo', ColumnFamily='tweets') to free up room. Used total: 0.54/0.05, live: 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,465 ColumnFamilyStore.java:915 - Enqueuing flush of tweets: 0.000KiB (0%) on-heap, 0.188KiB (0%) off-heap

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470 ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo', ColumnFamily='user_history') to free up room. Used total: 0.54/0.05, live: 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470 ColumnFamilyStore.java:915 - Enqueuing flush of user_history: 0.000KiB (0%) on-heap, 0.024KiB (0%) off-heap

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470 ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo', ColumnFamily='tweets') to free up room. Used total: 0.54/0.05, live: 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470 ColumnFamilyStore.java:915 - Enqueuing flush of tweets: 0.000KiB (0%) on-heap, 0.188KiB (0%) off-heap

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,472 ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo', ColumnFamily='gpsmessages') to free up room. Used total: 0.54/0.05, live: 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00

DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,472 ColumnFamilyStore.java:915 - Enqueuing flush of gpsmessages: 0.000KiB (0%) on-heap, 0.013KiB (0%) off-heap

 

 

 

Stack traces from errors are below.

 

java.io.IOException: Broken pipe

        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_181]

        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_181]

        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_181]

        at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_181]

        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_181]

        at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.doFlush(BufferedDataOutputStreamPlus.java:323) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.flush(BufferedDataOutputStreamPlus.java:331) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:409) [apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:380) [apache-cassandra-3.11.1.jar:3.11.1]

        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]

ERROR [MutationStage-226] 2018-08-06 07:16:08,236 JVMStabilityInspector.java:142 - JVM state determined to be unstable.  Exiting forcefully due to:

java.lang.OutOfMemoryError: Direct buffer memory

        at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_181]

        at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_181]

        at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_181]

        at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:139) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:104) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:40) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.db.Memtable.put(Memtable.java:269) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1332) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:618) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:425) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:222) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:68) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]

        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.1.jar:3.11.1]

        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.1.jar:3.11.1]

        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]


 

Total number of tables: 40

----------------

Keyspace : userinfo

Read Count: 143301

Read Latency: 14.945587623254548 ms.

Write Count: 2754603904

Write Latency: 0.020883145284324698 ms.

Pending Flushes: 0

Table (index): gpsmessages.gpsmessages_addresscount_idx

SSTable count: 9

Space used (live): 19043463189

Space used (total): 19043463189

Space used by snapshots (total): 0

Off heap memory used (total): 6259448

SSTable Compression Ratio: 0.3704785164266614

Number of partitions (estimate): 1025

Memtable cell count: 309066

Memtable data size: 13602774

Memtable off heap memory used: 0

Memtable switch count: 0

Local read count: 0

Local read latency: NaN ms

Local write count: 46025778

Local write latency: 0.034 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 2504

Bloom filter off heap memory used: 2432

Index summary off heap memory used: 320

Compression metadata off heap memory used: 6256696

Compacted partition minimum bytes: 43

Compacted partition maximum bytes: 44285675122

Compacted partition mean bytes: 30405277

Average live cells per slice (last five minutes): NaN

Maximum live cells per slice (last five minutes): 0

Average tombstones per slice (last five minutes): NaN

Maximum tombstones per slice (last five minutes): 0

Dropped Mutations: 0

 

Table (index): gpsmessages.gpsmessages_addresses_idx

SSTable count: 18

Space used (live): 409514565570

Space used (total): 409514565570

Space used by snapshots (total): 0

Off heap memory used (total): 153405673

SSTable Compression Ratio: 0.4447731157134059

Number of partitions (estimate): 6013125

Memtable cell count: 1110334

Memtable data size: 67480140

Memtable off heap memory used: 0

Memtable switch count: 0

Local read count: 0

Local read latency: NaN ms

Local write count: 147639252

Local write latency: 0.015 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 34175400

Bloom filter off heap memory used: 34175256

Index summary off heap memory used: 7432177

Compression metadata off heap memory used: 111798240

Compacted partition minimum bytes: 61

Compacted partition maximum bytes: 322381140

Compacted partition mean bytes: 36692

Average live cells per slice (last five minutes): NaN

Maximum live cells per slice (last five minutes): 0

Average tombstones per slice (last five minutes): NaN

Maximum tombstones per slice (last five minutes): 0

Dropped Mutations: 0

 

Table (index): gpsmessages.addressreceivedtime_idx

SSTable count: 10

Space used (live): 52738155477

Space used (total): 52738155477

Space used by snapshots (total): 0

Off heap memory used (total): 1909362628

SSTable Compression Ratio: 0.4106961621795128

Number of partitions (estimate): 1338730016

Memtable cell count: 308990

Memtable data size: 13410867

Memtable off heap memory used: 0

Memtable switch count: 0

Local read count: 0

Local read latency: NaN ms

Local write count: 46012614

Local write latency: 0.012 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 1687550888

Bloom filter off heap memory used: 1687550808

Index summary off heap memory used: 213249180

Compression metadata off heap memory used: 8562640

Compacted partition minimum bytes: 36

Compacted partition maximum bytes: 2759

Compacted partition mean bytes: 54

Average live cells per slice (last five minutes): NaN

Maximum live cells per slice (last five minutes): 0

Average tombstones per slice (last five minutes): NaN

Maximum tombstones per slice (last five minutes): 0

Dropped Mutations: 0

 

Table: gpsmessages

SSTable count: 13

Space used (live): 337974446627

Space used (total): 337974446627

Space used by snapshots (total): 0

Off heap memory used (total): 77833540

SSTable Compression Ratio: 0.5300637241381126

Number of partitions (estimate): 22034

Memtable cell count: 308904

Memtable data size: 72074512

Memtable off heap memory used: 0

Memtable switch count: 110

Local read count: 0

Local read latency: NaN ms

Local write count: 45996652

Local write latency: 0.281 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 67904

Bloom filter off heap memory used: 67800

Index summary off heap memory used: 11756

Compression metadata off heap memory used: 77753984

Compacted partition minimum bytes: 73

Compacted partition maximum bytes: 1155149911

Compacted partition mean bytes: 13158224

Average live cells per slice (last five minutes): NaN

Maximum live cells per slice (last five minutes): 0

Average tombstones per slice (last five minutes): NaN

Maximum tombstones per slice (last five minutes): 0

Dropped Mutations: 13699

 

Table: user_history

SSTable count: 17

Space used (live): 116361158882

Space used (total): 116361158882

Space used by snapshots (total): 0

Off heap memory used (total): 29562319

SSTable Compression Ratio: 0.5683114352331539

Number of partitions (estimate): 1337206

Memtable cell count: 773277

Memtable data size: 40623368

Memtable off heap memory used: 0

Memtable switch count: 57

Local read count: 209

Local read latency: NaN ms

Local write count: 145853733

Local write latency: 0.020 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 3844416

Bloom filter off heap memory used: 3844280

Index summary off heap memory used: 800991

Compression metadata off heap memory used: 24917048

Compacted partition minimum bytes: 61

Compacted partition maximum bytes: 464228842

Compacted partition mean bytes: 72182

Average live cells per slice (last five minutes): NaN

Maximum live cells per slice (last five minutes): 0

Average tombstones per slice (last five minutes): NaN

Maximum tombstones per slice (last five minutes): 0

Dropped Mutations: 66702

 

Table: users

SSTable count: 3

Space used (live): 89945186

Space used (total): 89945186

Space used by snapshots (total): 0

Off heap memory used (total): 2092053

SSTable Compression Ratio: 0.5712127629253333

Number of partitions (estimate): 1365645

Memtable cell count: 3556

Memtable data size: 150903

Memtable off heap memory used: 0

Memtable switch count: 42

Local read count: 143087

Local read latency: 6.094 ms

Local write count: 250971

Local write latency: 0.024 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 1709848

Bloom filter off heap memory used: 1709824

Index summary off heap memory used: 372125

Compression metadata off heap memory used: 10104

Compacted partition minimum bytes: 36

Compacted partition maximum bytes: 310

Compacted partition mean bytes: 66

Average live cells per slice (last five minutes): 1.0

Maximum live cells per slice (last five minutes): 1

Average tombstones per slice (last five minutes): 1.0

Maximum tombstones per slice (last five minutes): 1

Dropped Mutations: 114

 

Table: tweets

SSTable count: 18

Space used (live): 1809145656486

Space used (total): 1809145656486

Space used by snapshots (total): 0

Off heap memory used (total): 435915908

SSTable Compression Ratio: 0.5726200929451171

Number of partitions (estimate): 26217889

Memtable cell count: 710146

Memtable data size: 31793929

Memtable off heap memory used: 0

Memtable switch count: 399

Local read count: 5

Local read latency: NaN ms

Local write count: 2322829524

Local write latency: 0.019 ms

Pending flushes: 0

Percent repaired: 0.0

Bloom filter false positives: 0

Bloom filter false ratio: 0.00000

Bloom filter space used: 35019224

Bloom filter off heap memory used: 35019080

Index summary off heap memory used: 16454076

Compression metadata off heap memory used: 384442752

Compacted partition minimum bytes: 104

Compacted partition maximum bytes: 3379391

Compacted partition mean bytes: 124766

Average live cells per slice (last five minutes): NaN

Maximum live cells per slice (last five minutes): 0

Average tombstones per slice (last five minutes): NaN

Maximum tombstones per slice (last five minutes): 0

Dropped Mutations: 697696

 

----------------

 

 

On Mon, Aug 6, 2018 at 8:57 PM, Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:

 

 

Upgrading to 3.11.3 may fix it (there were some memory-recycling bugs fixed recently), but analyzing the heap will be the best option

 

If you can print out the heap histogram and stack trace, or open a heap dump in YourKit, VisualVM, or MAT and show us what's at the top of the reclaimed objects, we may be able to figure out what's going on

-- 

Jeff Jirsa

 


On Aug 6, 2018, at 5:42 PM, Jeff Jirsa <jjirsa@xxxxxxxxx> wrote:

Are you using materialized views or secondary indices? 

-- 

Jeff Jirsa

 


On Aug 6, 2018, at 3:49 PM, Laszlo Szabo <laszlo.viktor.szabo@xxxxxxxxx> wrote:

Hello All,

 

I'm having JVM unstable / OOM errors when attempting to auto-bootstrap a 9th node to an existing 8-node cluster (256 tokens).  Each machine has 24 cores, 148GB of RAM, and 10TB of disk (2TB used).  Under normal operation the 8 nodes have JVM memory configured with Xms35G and Xmx35G, and handle 2-4 billion inserts per day.  There are never updates, deletes, or sparsely populated rows.

 

For the bootstrap node, I've tried memory values from 35GB to 135GB in 10GB increments. I've tried using both memtable_allocation_types (heap_buffers and offheap_buffers).  I've not tried modifying the memtable_cleanup_threshold but instead have tried memtable_flush_writers from 2 to 8.  I've tried memtable_(off)heap_space_in_mb from 20000 to 60000.  I've tried both CMS and G1 garbage collection with various settings.  

 

Typically, after streaming about ~2TB of data, CPU load will hit a maximum, and the "nodetool info" heap memory will, over the course of an hour, approach the maximum.  At that point, CPU load will drop to a single thread with minimal activity until the system becomes unstable and eventually the OOM error occurs.

 

Excerpt of the system log is below, and what I consistently see is the MemtableFlushWriter and the MemtableReclaimMemory pending queues grow as the memory becomes depleted, but the number of completed seems to stop changing a few minutes after the CPU load spikes.

 

One other data point: there seems to be a huge number of mutations that occur after most of the stream has completed.  concurrent_writes is set at 256, with the queue getting as high as 200K before dropping down.

 

Any suggestions for yaml changes or JVM changes?  jvm.options is currently the default with the memory set to the max; the current YAML file is below.

 

Thanks!

 

 

INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,329 StatusLogger.java:51 - MutationStage                     1         2      191498052         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,331 StatusLogger.java:51 - ViewMutationStage                 0         0              0         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,338 StatusLogger.java:51 - PerDiskMemtableFlushWriter_0         0         0           5865         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,343 StatusLogger.java:51 - ReadStage                         0         0              0         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,347 StatusLogger.java:51 - ValidationExecutor                0         0              0         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,360 StatusLogger.java:51 - RequestResponseStage              0         0              8         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,380 StatusLogger.java:51 - Sampler                           0         0              0         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,382 StatusLogger.java:51 - MemtableFlushWriter               8     74293           4716         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,388 StatusLogger.java:51 - ReadRepairStage                   0         0              0         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,389 StatusLogger.java:51 - CounterMutationStage              0         0              0         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,404 StatusLogger.java:51 - MiscStage                         0         0              0         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,407 StatusLogger.java:51 - CompactionExecutor                8        13            493         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,410 StatusLogger.java:51 - InternalResponseStage             0         0             16         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,413 StatusLogger.java:51 - MemtableReclaimMemory             1      6066            356         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,421 StatusLogger.java:51 - AntiEntropyStage                  0         0              0         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,430 StatusLogger.java:51 - CacheCleanupExecutor              0         0              0         0                 0
INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,431 StatusLogger.java:51 - PendingRangeCalculator            0         0              9         0                 0
INFO  [Service Thread] 2018-08-06 17:49:26,436 StatusLogger.java:61 - CompactionManager                 8        19

 

 

 

Current YAML

num_tokens: 256

hinted_handoff_enabled: true

hinted_handoff_throttle_in_kb: 10240 

max_hints_delivery_threads: 8

hints_flush_period_in_ms: 10000

max_hints_file_size_in_mb: 128

batchlog_replay_throttle_in_kb: 10240

authenticator: AllowAllAuthenticator

authorizer: AllowAllAuthorizer

role_manager: CassandraRoleManager

roles_validity_in_ms: 2000

permissions_validity_in_ms: 2000

credentials_validity_in_ms: 2000

partitioner: org.apache.cassandra.dht.Murmur3Partitioner

data_file_directories:

    - /data/cassandra/data

commitlog_directory: /data/cassandra/commitlog

cdc_enabled: false

disk_failure_policy: stop

commit_failure_policy: stop

prepared_statements_cache_size_mb:

thrift_prepared_statements_cache_size_mb:

key_cache_size_in_mb:

key_cache_save_period: 14400

row_cache_size_in_mb: 0

row_cache_save_period: 0

counter_cache_size_in_mb:

counter_cache_save_period: 7200

saved_caches_directory: /data/cassandra/saved_caches

commitlog_sync: periodic

commitlog_sync_period_in_ms: 10000

commitlog_segment_size_in_mb: 32

seed_provider:

    - class_name: org.apache.cassandra.locator.SimpleSeedProvider

      parameters:

          - seeds: "10.1.1.11,10.1.1.12,10.1.1.13"

concurrent_reads: 128

concurrent_writes: 256

concurrent_counter_writes: 96

concurrent_materialized_view_writes: 32

disk_optimization_strategy: spinning

memtable_heap_space_in_mb: 61440

memtable_offheap_space_in_mb: 61440

memtable_allocation_type: heap_buffers

commitlog_total_space_in_mb: 81920

memtable_flush_writers: 8
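
One interaction worth noting between two of the knobs above: when memtable_cleanup_threshold is left unset (as here), the cassandra.yaml documentation says Cassandra derives it as 1 / (memtable_flush_writers + 1), so raising memtable_flush_writers also lowers the fill ratio at which the SlabPoolCleaner starts forcing flushes. A small sketch of that arithmetic (my own illustration):

```java
public class MemtableCleanupMath {
    // Default derivation documented in cassandra.yaml: with
    // memtable_cleanup_threshold unset, it becomes
    // 1 / (memtable_flush_writers + 1).
    static double cleanupThreshold(int flushWriters) {
        return 1.0 / (flushWriters + 1);
    }

    public static void main(String[] args) {
        // With memtable_flush_writers: 8 as above, flushing begins once
        // memtables occupy about 11% of memtable_heap_space_in_mb.
        System.out.printf("threshold for 8 writers: %.3f%n", cleanupThreshold(8));
    }
}
```

So with 8 flush writers and 61440 MB of memtable heap space, flushing would kick in around 6.8 GB of live memtable data; whether that threshold explains the tiny flushes in the DEBUG log above is a guess, not a conclusion.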

 

 

 

 

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313