Implementing Partial Backups and Restores (via Leveraging Partitions)

Continuing on with my Partitioning post series, this is part 5.

The partitioning includes several major components of work (and can be linked below);

  1. partitioning large existing non-partitioned tables
  2. measuring performance impacts of partitioned aligned indexes
  3. measuring performance impacts of DML triggers for enforcing partitioned unique indexes
  4. rebuilding tables that are already partitioned (ie. apply a new partitioning scheme)
  5. implementing partial backups and restores (via leveraging partitions)
  6. implementing partition aware index optimisation procedures
  7. Calculating table partition sizes in advance

This blog post deals with implementing partition aware partial backup and restore procedures.

I will blog about the other steps later.

Why partial backups?

Partial SQL database backups are exactly as they sound, a method to backup portions of a single database in such a way that still enables them to be restored consistently and in their entirety.  The process to perform partial backups and restores can be complex and therefore the feature is typically used for very large databases that warrant the extra DBA management overhead.

The partial backup/restore process ties in rather nicely with partitioning — more specifically when partitions are stored on their own dedicated filegroups/files.  When partitioning is setup this way the partial backup/restore process can perform actions at the filegroup level and manage the database in logical (and physical!) chunks.

The older partition filegroups that are no longer changing can be set to READ ONLY with only the most recent partition filegroups set to READ WRITE.  This provides the ability to only backup data that is changing leaving the unchanged data in place and thereby speeding up the backup/restore processes.

A couple of key points for Partial Backups;

  • Partial Backups work under any recovery model, but was specifically designed for SIMPLE recovery model.
  • A partial backup of a database with no READ ONLY filegroups is the same as running a FULL database backup
  • A partial backup of a READ ONLY database contains only the PRIMARY filegroup.
  • There is no reliable method to tell when a filegroup changes its property setting for READ ONLY / READ WRITE and thus a partial backup can easily miss a filegroup that has been changed from RW to RO.  Therefore every time a filegroup changes you must backup that filegroup as well as the partial database.  (NOTE: At first glance in SQL 2016 there are no new XEvents that can track filegroup changes! Grrr!)
  • A best practice for any database (and in particular partial databases) is to ensure that no user objects reside in the PRIMARY filegroup, and instead reside in one or more SECONDARY filegroups.  This becomes evident for partial backups/restores as the PRIMARY filegroup controls everything about the database and its best that its kept small.

Why partial (aka piecemeal) restores?

Why would you ever want to restore only part of your database?  Well, what if your database was 100TB in size and took 240hrs to restore in full before coming online?  That’s got to hurt!

But what if you had taken my sage council from previous posts and implemented table partitions which spread the core data over say 100x 1TB filegroups which enabled the database to be online after only the very first 1hr restore sequenceDBA Zero to Hero!

You’re talking crazy Mr. Fox SQL!  —  er, no  —  Enter Partial Restores!

With partial restore you can progress through a piecemeal restore process that slowly recovers the database piece by piece in an order which is relevant to how the application uses the data, ie; most to least recently accessed.  Even better, any filegroup that was marked as READ ONLY does not even need to be restored! (Caveat; assuming its files are still available, undamaged and consistent on the underlying storage subsystem).

A couple of key points for Partial Restores;

  • This is an Enterprise feature only
  • Partial Restore work under any recovery model, and although it was specifically designed for SIMPLE recovery model, its more flexible (and complex!) with the FULL recovery model
  • You can both restore and use only part of a partitioned table (ie. say restore and use just the 1st filegroup out of all filegroups).  However any attempt to access table data from a filegroup that is not online will fail with a Severity 16 Error.
  • You can restore a database and bring it actively online as soon as the PRIMARY filegroup is restored, then slowly restore SECONDARY filegroups in a set order as needed for the application.
  • There are limitations and informational statements around memory optimised tables and filestream that is best read for your specific situation https://msdn.microsoft.com/en-us/library/ms177425.aspx

 

Lets see this puppy in action…!

Setup the Demo Playpen Environment

We’ll setup a demo database that has a PRIMARY filegroup, 3x SECONDARY filegroups and a partitioned table (by date) that has one yearly partition on each of the SECONDARY filegroups.

We’ll then load up some sample data and select this back so we know what it looks like.

-- CREATE DATABASE
USE [master]
GO

CREATE DATABASE [PartialDatabase] ON PRIMARY
(   NAME = N'PartialDatabase'
  , FILENAME = N'C:\SQLData\PartialDatabase_primary.mdf'
  , SIZE = 10240KB , FILEGROWTH = 10240KB )
 
  , FILEGROUP [Secondary01]
(   NAME = N'PartialDatabase_Secondary01'
  , FILENAME = N'C:\SQLData\PartialDatabase_Secondary01.ndf'
  , SIZE = 10240KB , FILEGROWTH = 10240KB )
  , FILEGROUP [Secondary02]
(   NAME = N'PartialDatabase_Secondary02'
  , FILENAME = N'C:\SQLData\PartialDatabase_Secondary02.ndf'
  , SIZE = 10240KB , FILEGROWTH = 10240KB )
  , FILEGROUP [Secondary03]
(   NAME = N'PartialDatabase_Secondary03'
  , FILENAME = N'C:\SQLData\PartialDatabase_Secondary03.ndf'
  , SIZE = 10240KB , FILEGROWTH = 10240KB ) 

  LOG ON
(   NAME = N'PartialDatabase_log'
  , FILENAME = N'C:\SQLData\PartialDatabase_log.ldf'
  , SIZE = 10240KB , FILEGROWTH = 10240KB )
GO

ALTER DATABASE [PartialDatabase] SET RECOVERY FULL
GO

-- CREATE PARTITION FUNCTION & SCHEME
USE [PartialDatabase]
GO
CREATE PARTITION FUNCTION pf_myorders_date (DATETIME) AS RANGE RIGHT FOR VALUES('2014-01-01 00:00:00', '2015-01-01 00:00:00')
CREATE PARTITION SCHEME ps_myorders_date AS PARTITION pf_myorders_date TO ([Secondary01], [Secondary02], [Secondary03])
GO

-- CREATE TABLE
CREATE TABLE dbo.myorders
(
     myorder_id        INT                NOT NULL
     , myorder_date    DATETIME            NOT NULL
     , myorder_details NVARCHAR(4000)    NOT NULL
     , CONSTRAINT pk_myorderid PRIMARY KEY CLUSTERED (myorder_id, myorder_date)
)
ON ps_myorders_date(myorder_date)
GO

-- INSERT SOME DATA
INSERT INTO dbo.myorders SELECT 1, '2013-01-01 00:00:00', 'Secondary01'
INSERT INTO dbo.myorders SELECT 2, '2014-01-01 00:00:00', 'Secondary02'
INSERT INTO dbo.myorders SELECT 3, '2015-01-01 00:00:00', 'Secondary03'
GO

 

What does our “Partial Database” look like?

Here’s a SQL command to run which shows us a picture of what the partial play pen database looks like and the data within it.

-- CHECK WHAT DATA WE CAN SEE
SELECT *, $partition.pf_myorders_date(myorder_date) as order_$partition
FROM   dbo.myorders
GO

Partial Database Starting Data

 

 

 

Partial Backup/Restore Testing Sequences

There are just so many combinations when it comes to partial backups/restores that its difficult to cover then all – so I will step through 2 fairly common sequences;

  • STANDARD Sequence – Recover partial backup progressively in full
  • READ ONLY Sequence – Recover partial backup but leave Read Only filegroups in place

Both of these could be interleaved with differential and log backups however these are not demonstrated here.

Also in the case of an actual PROD restore process, you would normally use the WITH NORECOVERY option (or STANDBY option) at the various restore stages and mindfully recover the database when you are ready.  If you recover the database at any stage by accident then you would have to go back to the start of the sequence and start again, and that will make you go DBA Hero to Zero.

 

Partial backup and restore – STANDARD Sequence

Lets go though the steps to backup and then progressively recover a database in full from a partial backup file

Firstly we need to initiate a partial backup of our primary database.  To do this we use the clause READ_WRITE_FILEGROUPS which will only backup any filegroup that is in a READ WRITE state.   At this point in time all the filegroups are RW so this is the same as a FULL backup.

-- Partial DB Backup
BACKUP DATABASE PartialDatabase READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_FULL.bak'
WITH INIT
GO

As can be seen the output shows all filegroups were accessed for the backup.

Processed 328 pages for database 'PartialDatabase', file 'PartialDatabase' on file 1.
Processed 16 pages for database 'PartialDatabase', file 'PartialDatabase_Secondary01' on file 1.
Processed 16 pages for database 'PartialDatabase', file 'PartialDatabase_Secondary02' on file 1.
Processed 16 pages for database 'PartialDatabase', file 'PartialDatabase_Secondary03' on file 1.
Processed 8 pages for database 'PartialDatabase', file 'PartialDatabase_log' on file 1.
BACKUP DATABASE...FILE= successfully processed 384 pages in 0.046 seconds (65.090 MB/sec).

We can then check the backup file itself using RESTORE FILELISTONLY to see whats actually inside it and what components of the database were RO at the time of the backup.  If you need further detailed information of the first and last LSN’s in the backup files then you need to use the command RESTORE HEADERONLY as well.

-- SEE WHAT IS IN THE FILES
RESTORE FILELISTONLY
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
GO

Partial Database Partial Backup File 1

 

We can now perform a partial restore of the database;

  • We’ll restore it to a new name called PartialDatabase_Recovery
  • The WITH option to use is PARTIAL which tells SQL that the restore is part of a partial database recovery process.
  • The restore must instruct SQL where to place the database files regardless if they are being restored in this step or not.
RESTORE DATABASE [PartialDatabase_Recovery]
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY, PARTIAL,
MOVE 'PartialDatabase' TO 'C:\SQLData\PartialDatabase_Recovery_Primary.mdf',
MOVE 'PartialDatabase_Secondary01' TO 'C:\SQLData\PartialDatabase_Recovery_Secondary01.ndf',
MOVE 'PartialDatabase_Secondary02' TO 'C:\SQLData\PartialDatabase_Recovery_Secondary02.ndf',
MOVE 'PartialDatabase_Secondary03' TO 'C:\SQLData\PartialDatabase_Recovery_Secondary03.ndf',
MOVE 'PartialDatabase_log' TO 'C:\SQLData\PartialDatabase_Recovery_log.ldf'
GO

The big difference for the partial restore of a database in FULL RECOVERY mode is that it restores just the PRIMARY filegroup and the LOG file — and by default that’s it.  Even though the SECONDARY filegroup data is inside the partial backup file (as we saw above) the 3x SECONDARY filegroups are not restored unless you explicitly state them.

Whether or not you explicitly state them is up to you – I prefer not to and instead control which ones I restore later in an order relevant to my application.

Processed 328 pages for database 'PartialDatabase_Recovery', file 'PartialDatabase' on file 1.
Processed 8 pages for database 'PartialDatabase_Recovery', file 'PartialDatabase_log' on file 1.
RESTORE DATABASE ... FILE= successfully processed 336 pages in 0.044 seconds (59.525 MB/sec).

 

At this point in time the database is online, available and accessible — however any attempt to access data in a filegroup that is not online will fail.

SELECT *, $partition.pf_myorders_date(myorder_date) as order_$partition
FROM   [PartialDatabase_Recovery].dbo.myorders
GO
Msg 679, Level 16, State 1, Line 165
One of the partitions of index 'pk_myorderid' for table 'dbo.myorders'(partition ID 72057594040549376) resides on a filegroup ("Secondary01") that cannot be accessed because it is offline, restoring, or defunct. This may limit the query result.

 

As a quick aside – at any point in time in the database restore process if you lose track of where you were up to and want to know what files or filegroups in the database are or are not recovered you can run the following SQL command.  Mind you the PRIMARY filegroup needs to be ONLINE for this query to work!  (Yet another reason to keep PRIMARY empty)

SELECT    file_id, name, type_desc, state_desc, physical_name, read_only_lsn, read_write_lsn, redo_start_lsn, redo_target_lsn
FROM    [PartialDatabase_Recovery].sys.database_files
GO

Partial Database Recovery Position

 

Now each one of the SECONDARY filegroups can be restored in turn in any order we want.  Just for testing purposes we’ll restore SECONDARY 01 and 02 but leave 03 for later to demonstrate what happens to SQL queries unlucky enough to access on OFFLINE filegroup.

RESTORE DATABASE [PartialDatabase_Recovery]
FILEGROUP = 'Secondary01'
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY
GO
RESTORE DATABASE [PartialDatabase_Recovery]
FILEGROUP = 'Secondary02'
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY
GO

 

So now we can access any data in the table dbo.myorders as long as we ONLY access data that resides in SECONDARY 01 or 02.  Any attempt to access SECONDARY 03 will result in an error.

  • The first query below works as it only accesses SECONDARY 01 and specifically via a seek of the clustered partitioning key which does not touch SECONDARY 03.
  • The second query below fails as it wants to access data in SECONDARY 03 which is not online which results in the error – “One of the partitions of index ‘pk_myorderid’ for table ‘dbo.myorders'(partition ID 72057594040680448) resides on a filegroup (“Secondary03”) that cannot be accessed because it is offline…
-- THIS WORKS
SELECT *, $partition.pf_myorders_date(myorder_date) as order_$partition
FROM   [PartialDatabase_Recovery].dbo.myorders
WHERE  myorder_date = '2013-01-01 00:00:00'
AND    myorder_id = 1
GO
-- THIS FAILS
SELECT *, $partition.pf_myorders_date(myorder_date) as order_$partition
FROM   [PartialDatabase_Recovery].dbo.myorders
WHERE  myorder_date = '2015-01-01 00:00:00'
AND    myorder_id = 3
GO

 

However this now raises unusual query data access issues – such as what happens if we don’t use the key and instead we need to scan the index (or table)?  The query example below shows where it both works and fails!

  • It works because you get a resultset and if you understand the data you will know it to be correct because the key you passed simply cannot be on SECONDARY 03
  • It fails as the query plan wants to SCAN the index and it can’t as SECONDARY 03 is offline.
  • As per the error message SQL cannot be certain it retrieved all the data the query asked for and so it must also report the error “This may limit the query result” as shown below
  • In a partial restore situation where you bring your database online before all data is ready this is a very real scenario and thus your application must account for this error.
  • This error can also occur for UPDATE or DELETE statements that need index access.
-- THIS WORKS AND FAILS AT THE SAME TIME
SELECT *, $partition.pf_myorders_date(myorder_date) as order_$partition
FROM   [PartialDatabase_Recovery].dbo.myorders
WHERE  myorder_id = 1
GO

Partial Database Successful Failure Query Result

 

 

Msg 679, Level 16, State 1, Line 9
One of the partitions of index 'pk_myorderid' for table 'dbo.myorders'(partition ID 72057594040680448) resides on a filegroup ("Secondary03") that cannot be accessed because it is offline, restoring, or defunct. This may limit the query result.

 

Now we’re done with testing, so we can proceed and restore the SECONDARY 03 filegroup thus bringing the database right up to the same position across all filegroups.  The database is then fully online with all filegroups and data recovered and accessible.

RESTORE DATABASE [PartialDatabase_Recovery]
FILEGROUP = 'Secondary03'
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY
GO

 

Partial backup and restore – READ ONLY Sequence

Now lets set FILEGROUP 03 as READ ONLY and then put new data into the table in partition 1.  We’ll then show that as long as a filegroup hasn’t changed then you don’t need to restore it!

NOTE – the order here is important and in a production world its likely this would not be done exactly using steps as show below.

First set FILEGROUP 03 as READ ONLY on the original database.  NOTE – this requires the database to be forced into SINGLE_USER mode for the query to run.

ALTER DATABASE [PartialDatabase] MODIFY FILEGROUP [Secondary03] READONLY
GO

Now we put some new data into our partitioned dbo.myorders table specifically into the READ WRITE filegroup (SECONDARY 01).

INSERT INTO dbo.myorders SELECT 1, '2012-01-01 00:00:00', 'Secondary01'
GO

Next we initiate a partial backup of our primary database using the clause READ_WRITE_FILEGROUPS which this time will skip over FILEGROUP 03 as its in a READ ONLY state.

-- Partial DB Backup
BACKUP DATABASE PartialDatabase READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_FULL.bak'
WITH INIT
GO

As can be seen the output shows only SECONDARY 01 and 02 were accessed for backup.

Processed 328 pages for database 'PartialDatabase', file 'PartialDatabase' on file 1.
Processed 16 pages for database 'PartialDatabase', file 'PartialDatabase_Secondary01' on file 1.
Processed 16 pages for database 'PartialDatabase', file 'PartialDatabase_Secondary02' on file 1.
Processed 2 pages for database 'PartialDatabase', file 'PartialDatabase_log' on file 1.
BACKUP DATABASE...FILE= successfully processed 362 pages in 0.036 seconds (78.382 MB/sec).

And then checking the backup file itself shows what components of the database are actually in the backup file itself along with the settings of those components.

-- SEE WHAT IS IN THE FILES
RESTORE FILELISTONLY
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
GO

Partial Database Partial Backup File 2

The interesting bits here are;

  • The backup knows that SECONDARY 03 is READ ONLY
  • The backup knows that SECONDARY 03 is required for the database but that filegroup IS NOT in the backup file
  • The LSN at which time the filegroup was changed from READ WRITE to READ ONLY is tracked and stored along with the partial backup.  This is important during the partial restore to ensure all database files are aligned to maintain database consistency.
  • If you need further detailed information of the first and last LSN’s in the backup files then you need to use the command RESTORE HEADERONLY as well.

 

We now have everything we need to do the partial restore!

Next we need to put the target database into a state where it can be restored over the top.

  • To do that we need to do a tail-log backup and issue a NORECOVERY (or STANDBY)
  • For our test we dont need the actual tail-log backup file per se – however we do need the database to be ready for recovery. (In a PROD recovery scenario we’d highly likely run the tail-log backup and keep it to apply as the last recovery step)
  • Another option is we could just run the RESTORE DATABASE command itself using a WITH REPLACE which would also restore our partial backup file directly over the top of the database. (We might do this in say a restore of the database to a TEST server)
  • As soon as we issue this command the database is OFFLINE
BACKUP LOG [PartialDatabase_Recovery]
TO DISK = N'C:\SQLBackup\PartialDatabase_TAIL.bak'
WITH NORECOVERY, INIT
GO

 

We can now perform a partial restore of the database just as we have done before.

RESTORE DATABASE [PartialDatabase_Recovery]
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY, PARTIAL,
MOVE 'PartialDatabase' TO 'C:\SQLData\PartialDatabase_Recovery_Primary.mdf',
MOVE 'PartialDatabase_Secondary01' TO 'C:\SQLData\PartialDatabase_Recovery_Secondary01.ndf',
MOVE 'PartialDatabase_Secondary02' TO 'C:\SQLData\PartialDatabase_Recovery_Secondary02.ndf',
MOVE 'PartialDatabase_Secondary03' TO 'C:\SQLData\PartialDatabase_Recovery_Secondary03.ndf',
MOVE 'PartialDatabase_log' TO 'C:\SQLData\PartialDatabase_Recovery_log.ldf'
GO

 

As mentioned above the default partial restore of the database will restore just the PRIMARY filegroup and the LOG file and bring the database immediately ONLINE, however we must restore or recover the SECONDARY filegroups manually in the order we see fit to bring those ONLINE as well.

The filegroup backups for SECONDARY 01 and 02 are in the partial backup file as they were in a READ WRITE state at the time we took that partial backup, so each one of the filegroups can simply be restored in turn in any order we want.

As above, after these are restored we are able to access any of the partitioned data as long as we dont touch any data residing within SECONDARY 03.

RESTORE DATABASE [PartialDatabase_Recovery]
FILEGROUP = 'Secondary01'
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY
GO
RESTORE DATABASE [PartialDatabase_Recovery]
FILEGROUP = 'Secondary02'
FROM DISK = N'C:\SQLBackup\PartialDatabase_PARTIAL_Full.bak'
WITH RECOVERY
GO

 

Now the final step is to simply bring any READ ONLY filegroups online.  We cannot restore these as they are not in the partial backup file, but as the filegroup file itself is on the underlying storage subsystem, is unchanged and consistent with the PRIMARY then we can simply bring it online within a few seconds without restoring anything.

Note that if the underlying Read Only filegroup file had changed in any way then the LSN on the file would be beyond the Read Only LSN as recorded in the partial backup file, however strangely this would not fail the file recovery.  So to maintain file consistency with the rest of the database any filegroup that changed READ WRITE to READ ONLY between partial backups would need to be specifically backed up and restored.

RESTORE DATABASE [PartialDatabase_Recovery]
FILEGROUP = 'Secondary03'
WITH RECOVERY
GO
RESTORE DATABASE successfully processed 0 pages in 0.008 seconds (0.000 MB/sec).

 

Now we can access any of the new partitioned data across the table! (…and life is good, if you’re a DBA!)

SELECT *, $partition.pf_myorders_date(myorder_date) as order_$partition
FROM   [PartialDatabase_Recovery].dbo.myorders
GO

Partial Database Final Data

 

 

 

And So in Summary…

Partial backups and piecemeal restores when combined with partitioning and READ ONLY filegroups are an excellent method by which to manage database backup time, file size management and restore time (thus increasing availability).

However the DBA management overhead and complexity can be high as you will need to track more backup files and ensure you have the right files available to do a restore.  If a RO FG backup file goes missing then you will lose all data in that FG, it’d be like the perfect smile with a missing tooth!

In a situation with a very large database where recent data changes rapidly and aged data does not change and there is a requirement to ensure fast and flexible recovery, then the backups that may need to be considered can be, well, quite overwhelming;

  • Full database backup (perhaps only performed once per year as a new recovery point)
  • Partial database backup (perhaps performed once per day)
  • Filegroup database backup (performed say monthly for an FG from RW to RO)
  • Transaction log backup (perhaps performed once per 15 mins)
  • Partial differential database backup (if desired – perhaps performed once every 6 hours)

At the end of your first year you would have generated over 36,878 files which depending on how you want to recover is just harder to think about at 2am when you inevitably get the call to do a restore!

So in short – you need to think about whats relevant for your database, and test it, as your mileage may vary!

 


Disclaimer: all content on Mr. Fox SQL blog is subject to the disclaimer found here

Advertisements

9 thoughts on “Implementing Partial Backups and Restores (via Leveraging Partitions)

  1. Thomas Franz 16 November 2015 / 9:26 PM

    I guess, that I can’t apply any DDL-changes (new / dropped / altered indexes, columns, computed columns …) to a table when it is partitioned and any of the FG’s is read only. Correct?

    Like

    • Mr. Fox SQL (Rolf Tesmer) 16 November 2015 / 11:04 PM

      Hi Thomas – yes you are correct. Once the DB FG is RO you cannot apply any changes to it regardless of where they come from – you will receive a “FileGroup is Read Only” error. So even things like ALTER INDEX REBUILD will fail for that FG, altering a column, etc. If you rebuild indexes you need to use the PARTITION = X option. You can add non-partition aligned indexes however as long as they do not get dropped onto that RO FG.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s