Re: CommitLog Recovery replay stop on first timestamp after restore-point-in-time


I don’t have any personal knowledge of the fix, but out of interest I took a
look in Jira, and it sounds to me like the behaviour was fixed here
(in 2.0.10): https://issues.apache.org/jira/browse/CASSANDRA-6905

---


*Ben Slater*
*Chief Product Officer*



On Thu, 20 Dec 2018 at 21:07, Morten Vejen Nielsen <mvejen@xxxxxxxxx> wrote:

> Hi,
>
> (Moved from user mailing list to here)
>
> I have found a statement in the Datastax documentation regarding CommitLog
> recovery that concerns me, namely:
>
> "*Restore stops when the first client-supplied timestamp is greater than
> the restore point timestamp. Because the order in which the database
> receives mutations does not strictly follow the timestamp order, this can
> leave some mutations unrecovered.*"
>
> From:
>
> https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configLogArchive.html
> To me this means that point-in-time restore doesn't really guarantee replay
> of every mutation up to the configured time, since we expect to have
> mutations arriving out of order in our setup.
>
> I conducted a few experiments on this myself by forcing my Cassandra
> instance to do CommitLog replay with changes ahead in time, but I was not
> able to reproduce this behavior.
> I used a fresh instance taken from the official Cassandra docker image to
> run the tests, so no changes were made to any configs other than setting
> restore_point_in_time as specified below.
> I did the experiment as follows:
>
> --edit /etc/cassandra/commitlog_archiving.properties, set
> *restore_point_in_time* to something in the near future (let's say 2
> hours ahead of server time)
>
> ssh into instance
>
> cqlsh
> create keyspace thezoo with replication =
> {'class':'SimpleStrategy','replication_factor':1};
> use thezoo;
> create table animal (id int primary key, name varchar);
> insert into animal (id, name) values (1, 'Bear1');
> insert into animal (id, name) values (2, 'Bear2');
> insert into animal (id, name) values (3, 'Bear3');
> insert into animal (id, name) values (4, 'Bear4');
> insert into animal (id, name) values (5, 'Bear5');
> insert into animal (id, name) values (6, 'Bear6');
> insert into animal (id, name) values (7, 'Bear7');
> insert into animal (id, name) values (8, 'Bear8');
> insert into animal (id, name) values (9, 'Bear9');
> insert into animal (id, name) values (10, 'Bear10');
> select id,name,writetime(name) from animal;
> --Add some time to one of the returned write timestamps and use the result
> as future_timestamp; it must be later than the restore_point_in_time set in
> the commitlog config file
> insert into animal (id, name) values (11, 'DuckFromFuture') using
> timestamp <future_timestamp>;
> insert into animal (id, name) values (12, 'Bird1');
> insert into animal (id, name) values (13, 'Bird2');
> insert into animal (id, name) values (14, 'Bird3');
> insert into animal (id, name) values (15, 'Bird4');
> insert into animal (id, name) values (16, 'Bird5');
> insert into animal (id, name) values (17, 'Bird6');
> insert into animal (id, name) values (18, 'Bird7');
> insert into animal (id, name) values (19, 'Bird8');
> insert into animal (id, name) values (20, 'Bird9');
> insert into animal (id, name) values (21, 'Bird10');
>
> --Now I simply forced the machine off by holding the power button down,
> and restarted it
>
> --During startup, verify in the log that commitlog replay has been done
>
> ssh into instance and enter cqlsh
>
> cqlsh:thezoo> select * from animal;
>
> --Which shows all the bears and birds have been replayed but not the duck!
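
Inline note: the two values the experiment above depends on (the restore_point_in_time string and the future write timestamp) can be computed rather than typed by hand. The sketch below is only an illustration; the function names are hypothetical, and it assumes the `yyyy:MM:dd HH:mm:ss` GMT format described in the comments of the stock commitlog_archiving.properties (please check your version) and microsecond write timestamps, which is the unit writetime() reports by default.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def restore_point_string(now, hours_ahead=2):
    """Format a restore point `hours_ahead` hours after `now`, in GMT.

    Assumes the yyyy:MM:dd HH:mm:ss format from commitlog_archiving.properties.
    """
    target = now.astimezone(timezone.utc) + timedelta(hours=hours_ahead)
    return target.strftime("%Y:%m:%d %H:%M:%S")

def future_write_timestamp(now, hours_ahead=3):
    """Return microseconds since the epoch for `now` plus `hours_ahead` hours."""
    target = now + timedelta(hours=hours_ahead)
    return (target - EPOCH) // timedelta(microseconds=1)

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Value for restore_point_in_time in commitlog_archiving.properties
    print("restore_point_in_time=" + restore_point_string(now))
    # Value to substitute for <future_timestamp> in the CQL insert
    print("future_timestamp:", future_write_timestamp(now))
```

Keeping the write timestamp's offset (3 hours here) larger than the restore point's offset (2 hours) ensures the 'DuckFromFuture' row falls after the restore point.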
>
> I also did some digging in the Cassandra source code, and made the
> following findings:
>
> I think the code that skips mutations ahead of the restore point is in the
> CommitLogReplayer class.
> See lines 194-195 (at the time of writing):
> if (commitLogReplayer.pointInTimeExceeded(mutation))
>        return;
> This code is triggered from CommitLogReader, where readSection seems to be
> responsible for reading the commit logs. It is wrapped in a while loop
> that just reads the file until EOF.
> See:
>  while (statusTracker.shouldContinue() && reader.getFilePointer() < end &&
> !reader.isEOF())
> This method is called file by file from CommitLog.recover to recover all
> commitlog segment files.
> And just a note: statusTracker.shouldContinue will return false if
> statusTracker.requestTermination() is called, but I don't see that
> happening in the pointInTimeExceeded case.
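
Inline note: the difference between the documented behaviour (stop at the first mutation past the restore point) and what the code reading above suggests (skip that mutation and keep reading) can be shown with a toy model. This is not Cassandra code; `Mutation`, `replay_skipping` and `replay_stopping` are hypothetical names, and the timestamps are made-up integers.

```python
from dataclasses import dataclass

@dataclass
class Mutation:
    name: str
    timestamp: int  # client-supplied write timestamp

def replay_skipping(mutations, restore_point):
    """Skip each mutation past the restore point but keep reading the log."""
    return [m.name for m in mutations if m.timestamp <= restore_point]

def replay_stopping(mutations, restore_point):
    """Stop at the first mutation past the restore point."""
    applied = []
    for m in mutations:
        if m.timestamp > restore_point:
            break
        applied.append(m.name)
    return applied

# Commit log order mirrors the experiment: bears, then the duck, then birds.
log = [
    Mutation("Bear1", 100),
    Mutation("DuckFromFuture", 900),
    Mutation("Bird1", 110),
]

if __name__ == "__main__":
    print(replay_skipping(log, restore_point=500))
    print(replay_stopping(log, restore_point=500))
```

Under the skipping model the birds written after the duck are still recovered, which matches the experiment; under the stopping model they would be lost.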
>
> I am a bit concerned that this might be some hidden feature in Cassandra, in
> which case we might have to revise our backup strategies.
> However, as far as I can see, the Datastax documentation on this is simply
> wrong. Unfortunately, the official documentation on this just seems to be a
> work in progress.
> The fact that it doesn't do this is in fact a positive result for me, as I
> would also expect point-in-time restore to guarantee that all mutations up
> until this point in time are in fact recovered.
>
> Can anyone confirm whether it is just the documentation that is wrong, or
> whether I did something wrong in my experiments?
>
> (For reference, I also conducted some experiments with a larger amount of
> data, where the recovery went through multiple commitlog files, but I got
> the same results, namely that it recovered ALL records before
> restore_point_in_time.)
>
> Best regards
> Morten V. Nielsen
>