git.net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Openstack] [cinder] Pruning Old Volume Backups with Ceph Backend


Apologies -- I'm running Pike release of Cinder, Luminous release of
Ceph. Deployed with OpenStack-Ansible and Ceph-Ansible respectively.

On Thu, Aug 23, 2018 at 8:27 PM, David Medberry <openstack at medberry.net> wrote:
> Hi Chris,
>
> Unless I overlooked something, I don't see Cinder or Ceph versions posted.
>
> Feel free to just post the codenames but give us some inkling.
>
> On Thu, Aug 23, 2018 at 3:26 PM, Chris Martin <cmart at cyverse.org> wrote:
>>
>> I back up my volumes daily, using incremental backups to minimize
>> network traffic and storage consumption. I want to periodically remove
>> old backups, and during this pruning operation, avoid entering a state
>> where a volume has no recent backups. Ceph RBD appears to support this
>> workflow, but unfortunately, Cinder does not. I can only delete the
>> *latest* backup of a given volume, and this precludes any reasonable
>> way to prune backups. Here, I'll show you.
>>
>> Let's make three backups of the same volume:
>> ```
>> openstack volume backup create --name backup-1 --force volume-foo
>> openstack volume backup create --name backup-2 --force volume-foo
>> openstack volume backup create --name backup-3 --force volume-foo
>> ```
>>
>> Cinder reports the following via `volume backup show`:
>> - backup-1 is not an incremental backup, but backup-2 and backup-3 are
>> (`is_incremental`).
>> - All but the latest backup have dependent backups
>> (`has_dependent_backups`).
>>
>> We take a backup every day, and after a week we're on backup-7. We
>> want to start deleting older backups so that we don't keep
>> accumulating backups forever! What happens when we try?
>>
>> ```
>> # openstack volume backup delete backup-1
>> Failed to delete backup with name or ID 'backup-1': Invalid backup:
>> Incremental backups exist for this backup. (HTTP 400)
>> ```
>>
>> We can't delete backup-1 because Cinder considers it a "base" backup
>> which `has_dependent_backups`. What about backup-2? Same story. Adding
>> the `--force` flag just gives a slightly different error message. The
>> *only* backup that Cinder will delete is backup-7 -- the very latest
>> one. This means that if we want to remove the oldest backups of a
>> volume, *we must first remove all newer backups of the same volume*,
>> i.e. delete literally all of our backups.
>>
>> Also, we cannot force creation of another *full* (non-incrmental)
>> backup in order to free all of the earlier backups for removal.
>> (Omitting the `--incremental` flag has no effect; you still get an
>> incremental backup.)
>>
>> Can we hope for better? Let's reach behind Cinder to the Ceph backend.
>> Volume backups are represented as a "base" RBD image with a snapshot
>> for each incremental backup:
>>
>> ```
>> # rbd snap ls volume-e742c4e2-e331-4297-a7df-c25e729fdd83.backup.base
>> SNAPID NAME
>>    SIZE TIMESTAMP
>>    577 backup.e3c1bcff-c1a4-450f-a2a5-a5061c8e3733.snap.1535046973.43
>> 10240 MB Thu Aug 23 10:57:48 2018
>>    578 backup.93fbd83b-f34d-45bc-a378-18268c8c0a25.snap.1535047520.44
>> 10240 MB Thu Aug 23 11:05:43 2018
>>    579 backup.b6bed35a-45e7-4df1-bc09-257aa01efe9b.snap.1535047564.46
>> 10240 MB Thu Aug 23 11:06:47 2018
>>    580 backup.10128aba-0e18-40f1-acfb-11d7bb6cb487.snap.1535048513.71
>> 10240 MB Thu Aug 23 11:22:23 2018
>>    581 backup.8cd035b9-63bf-4920-a8ec-c07ba370fb94.snap.1535048538.72
>> 10240 MB Thu Aug 23 11:22:47 2018
>>    582 backup.cb7b6920-a79e-408e-b84f-5269d80235b2.snap.1535048559.82
>> 10240 MB Thu Aug 23 11:23:04 2018
>>    583 backup.a7871768-1863-435f-be9d-b50af47c905a.snap.1535048588.26
>> 10240 MB Thu Aug 23 11:23:31 2018
>>    584 backup.b18522e4-d237-4ee5-8786-78eac3d590de.snap.1535052729.52
>> 10240 MB Thu Aug 23 12:32:43 2018
>> ```
>>
>> It seems that each snapshot stands alone and doesn't depend on others.
>> Ceph lets me delete the older snapshots.
>>
>> ```
>> # rbd snap rm
>> volume-e742c4e2-e331-4297-a7df-c25e729fdd83.backup.base at backup.e3c1bcff-c1a4-450f-a2a5-a5061c8e3733.snap.1535046973.43
>> Removing snap: 100% complete...done.
>> # rbd snap rm
>> volume-e742c4e2-e331-4297-a7df-c25e729fdd83.backup.base at backup.10128aba-0e18-40f1-acfb-11d7bb6cb487.snap.1535048513.71
>> Removing snap: 100% complete...done.
>> ```
>>
>> Now that we nuked backup-1 and backup-4, can we still restore from
>> backup-7 and launch an instance with it?
>>
>> ```
>> openstack volume create --size 10 --bootable volume-foo-restored
>> openstack volume backup restore backup-7 volume-foo-restored
>> openstack server create --volume volume-foo-restored --flavor medium1
>> instance-restored-from-backup-7
>> ```
>>
>> Yes! We can SSH to the instance and it appears intact.
>>
>> Perhaps each snapshot in Ceph stores a complete diff from the base RBD
>> image (rather than each successive snapshot depending on the last). If
>> this is true, then Cinder is unnecessarily protective of older
>> backups. Cinder represents these as "with dependents" and doesn't let
>> us touch them, even though Ceph will let us delete older RBD
>> snapshots, apparently without disrupting newer snapshots of the same
>> volume. If we could remove this limitation, Cinder backups would be
>> significantly more useful for us. We mostly host servers with
>> non-cloud-native workloads (IaaS for research scientists). For these,
>> full-disk backups at the infrastructure level are an important
>> supplement to file-level or application-level backups.
>>
>> It would be great if someone else could confirm or disprove what I'm
>> seeing here. I'd also love to hear from anyone else using Cinder
>> backups this way.
>>
>> Regards,
>>
>> Chris Martin at CyVerse
>>
>> _______________________________________________
>> Mailing list:
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to     : openstack at lists.openstack.org
>> Unsubscribe :
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>
>