Hot Backups, Redo, and Fractured Blocks
October 28th, 2006 by The Oracle AlchemistI received a question from a reader that I figured I’d address here.
Question: The oracle documentation tells us that when we put a tablespace in backup mode, the first DML in the session logs the entire block in the redo log buffer and not just the changed vectors. i have two questions regarding this:
1) Can we simulate an example to see this happening?
2) What can be the purpose of logging the entire block the first time and not do the same subsequently?
Answer:
No problem! Below, I’ve created a simulation. Pay attention to the “redo size” statistic in each.
First, I have updated a single row of the employees table.
SQL> set autotrace trace stat
SQL> update employees set first_name = 'Stephen' where employee_id = 100;
1 row updated.
Statistics
----------------------------------------------------------
0 recursive calls
1 db block gets
1 consistent gets
0 physical reads
292 redo size
669 bytes sent via SQL*Net to client
598 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> rollback;
Rollback complete.
Notice the redo size was only 292 bytes, not a very large amount. Now, let’s put the USERS tablespace into hot backup mode.
SQL> alter tablespace users begin backup;
Tablespace altered.
SQL> update employees set first_name = 'Stephen' where employee_id = 100;
1 row updated.
Statistics
----------------------------------------------------------
0 recursive calls
2 db block gets
1 consistent gets
0 physical reads
8652 redo size
670 bytes sent via SQL*Net to client
598 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
Wow! Quite a bit of a difference. This time, we can see that at least an entire block was written to redo; 8,652 bytes total. Let’s run it one more time, with the tablespace still in hot backup mode.
SQL> /
1 row updated.
Statistics
----------------------------------------------------------
0 recursive calls
1 db block gets
1 consistent gets
0 physical reads
292 redo size
671 bytes sent via SQL*Net to client
598 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
This time, it only used 292 bytes, the same as the original amount. However, to address your second question, we’re going to attempt changing a different block, by changing a record in the departments table instead of employees.
SQL> update departments set department_name = 'Test Dept.' where department_id = 270;
1 row updated.
Statistics
----------------------------------------------------------
17 recursive calls
1 db block gets
5 consistent gets
1 physical reads
8572 redo size
673 bytes sent via SQL*Net to client
610 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
The result is that another entire block was written to redo. In your question, you stated: “The oracle documentation tells us that when we put a tablespace in backup mode, the first DML in the session logs the entire block in the redo log buffer and not just the changed vectors”
This is close, but not right on the mark. It is not the first DML of the session, but the first DML to a block that is written to redo. You’re absolutely right when you imply that the first DML of the session would make no sense. However, when Oracle writes the first DML for the block, it ensures that the redo logs/archive trail contains at least one full representation of each block that is changed. Subsequent changes will therefore be safe.
This process exists to resolve block fractures. A block fracture occurs when a block is being read by the backup, and being written to at the same time by DBWR. Because the OS (usually) reads blocks at a different rate than Oracle, your OS copy will pull pieces of an Oracle block at a time. What if the OS copy pulls half a block, and while that is happening, the block is changed by DBWR? When the OS copy pulls the second half of the block it will result in mismatched halves, which Oracle would not know how to reconcile.
This is also why the SCN of the datafile header does not change when a tablespace enters hot backup mode. The current SCNs are recorded in redo, but not in the datafile. This is to ensure that Oracle will always recover over the datafile contents with redo entries. When recovery occurs, the fractured datafile block will be replaced with a complete block from redo, making it whole again. After Oracle can be certain it has a complete block, all it needs are the vectors.
I hope this answered your question!


October 29th, 2006 at 3:20 am
Steve,
Thanks for a great answer. The example clearly shows the logging of the entire block in the redo buffer when the tablespcae is put in the backup mode. As for the second part ,your explanation is indeed lucid. I just wanted to clarify one point here. You have written ‘This process exists to resolve block fractures. A block fracture occurs when a block is being read by the backup, and being written to at the same time by DBWR. Because the OS (usually) reads blocks at a different rate than Oracle, your OS copy will pull pieces of an Oracle block at a time. What if the OS copy pulls half a block, and while that is happening, the block is changed by DBWR? When the OS copy pulls the second half of the block it will result in mismatched halves, which Oracle would not know how to reconcile.
My idea is that regardless of whether DBWR is writing to the block or not when it is being copied, the o/s copy will always be fuzzy. Anything copied hot is intrinsically inconsistent and needs ‘recovery’. That is why oracle logs an entire block, so that it has a ‘good base’ on which to apply archive logs. Putting a tablespace in backup mode stops the SCN so that we know from which point recovery will be required.
Am I correct? Thanks Steve once again for your help.
October 29th, 2006 at 11:55 am
Saibal,
No problem, I’m glad you enjoyed the answer.
Yes, it is correct that the SCN is the datafile header is frozen so Oracle knows the point at which redo may be required in order to recover. If a change is made to a block during a hot backup, the copied block could be right, it could be wrong, it could be fractured…who knows? Certainly not Oracle…all Oracle knows upon recovery is that it has redo with a higher SCN than the datafile you just restored, and thus a recovery will occur. Just as I said in my original answer:
“This is also why the SCN of the datafile header does not change when a tablespace enters hot backup mode. The current SCNs are recorded in redo, but not in the datafile. This is to ensure that Oracle will always recover over the datafile contents with redo entries. When recovery occurs, the fractured datafile block will be replaced with a complete block from redo, making it whole again. After Oracle can be certain it has a complete block, all it needs are the vectors.”
May 11th, 2007 at 3:22 am
Hi Steve,
I have cleared my doubts with well understanding of the concepts in which u have explained with examples.
Thanks
May 11th, 2007 at 3:26 am
Hi Steve,
Can u explain the factors fhat causes block corruption? Is there any way to remove the corrupted block other than RMAN?
Need help .
June 25th, 2007 at 3:02 am
Steve,
great explanation on the fractured blocks with examples.
December 31st, 2007 at 6:14 am
Hi steve,
Awesome answer..
December 31st, 2007 at 6:50 am
Hi,
I am having one more doubt regarding backing up of tablespace.
1)During backup of a tablespace,the datafile header gets frozen.For example at that time the SCN is 100.
2)When the backup is in progress,an update stmt is fired on the bkl,that is being backed up.
3)At that time,the entire block along with the new SCN(110) is stored in the redo log file.
My question is
Once the backup is over,when does this changed block is written to the backedup tablespace’s datafile?Which SCN will be updated in the datafile header after backup.
March 13th, 2008 at 2:30 am
Steve,
Its was good explanation about the fractured blocks…I am a junior DBA.
I really got confused of what internally happens when we put a datafile into a hot backup mode.Can you please eloborate it..suppose I try to do a DML transaction when the users tablespace is in backup mode, will the changed data’s will be written to the datafile? or
oracle will write those modified data’s to both redologfile and corresponding datafile??
Need help…
Thanks,
Srinivas
March 13th, 2008 at 1:24 pm
Srinivas,
Yes, read Saibal’s original comment and my reply to him. The SCN is frozen, but the data will still be written to the datafile. The redo data is there to correct the “fuzziness” of the data written to the datafile during hot backup mode.
NOTE, this does not happen when you use RMAN, just when you use ‘alter tablespace begin backup’ or some variant thereof.
March 23rd, 2008 at 5:37 am
Steve,
Thanks you very much for such a detail explanation about fractured block. I have cleared all my doubts regarding to it.
Can you tell me in detail what is happen in database when we fired the
ALTER DATABASE BEGIN BACKUP & END BACKUP Command.
Thanks in Advance
Nimesh
March 23rd, 2008 at 10:27 am
Nimesh,
Sure. When you begin backup, it freezes the header of your datafiles (meaning the SCN number will not increment any more until the backup is ended). It also instructs LGWR to write whole blocks to redo the first time a block is touched during hot backup mode.
When you end backup, it unfreezes the header of the datafiles and allows SCNs to be recorded properly during checkpoint.
Consequently, this is why your database will appear to need recovery if it crashes while in hot backup mode. If you go into hot backup mode and then shutdown abort, when you come up it will not open and will report that files need recovery. However, they simply need to be taken out of hot backup mode.
June 2nd, 2008 at 12:05 pm
Hi,
can I use the hot backup as consistent backup or does I need a recovery in all time?
You wrote ————–
Sure. When you begin backup, it freezes the header of your datafiles (meaning the SCN number will not increment any more until the backup is ended). It also instructs LGWR to write whole blocks to redo the first time a block is touched during hot backup mode.
When you end backup, it unfreezes the header of the datafiles and allows SCNs to be recorded properly during checkpoint.
—————————-
Because when I start the backup the SCN frozen (the db is consistent?). After the end backup the database set the SCN to the current SCN who is written to the logs?
Regards hans
June 2nd, 2008 at 12:21 pm
When you begin the hot backup the SCN is frozen. The datafile data will be consistent, and the data in the redo logs will be consistent, but the backup itself will not be unless the redo/archive is applied.
June 2nd, 2008 at 2:01 pm
I don’t understand why I need to apply my redo/archivelogs, when I want to have the SCN of my begin backup.
Or does Oracle modify my db block during the backup, so my SCN is the same but the data is changed? Thats means for me the database only want to “know” from which SCN it must be recover to the end backup to be consist.
June 2nd, 2008 at 2:14 pm
Hans, you’ve got it. Writes still occur to the datafile, just the SCN is frozen. This occurs so that at recovery time, Oracle will know that it must overwrite all blocks in the backup file with redo entries due to fracturing. The original datafiles remain up-to-date, but the backup files will not be because they are being changed during backup.
What you’re describing is a cold backup. You must have the database closed to be truly frozen at a given SCN.
August 18th, 2008 at 2:09 am
it was a gud explaination. but i ahve a doubt, when at ablespace is in backup mode and a long running query is updating the datafile beloging to that particular tablespace such that if redolog is full and rewritten, how the datafiles is updated when tablespace brought back to online.
August 18th, 2008 at 9:09 am
Arun,
It’s just like I told Hans:
Writes still occur to the datafile, just the SCN is frozen. This occurs so that at recovery time, Oracle will know that it must overwrite all blocks in the backup file with redo entries due to fracturing. The original datafiles remain up-to-date, but the backup files will not be because they are being changed during backup.
The datafile is updated throughout the entire process of being in hot backup mode. That is precisely the reason that extra redo must be generated: the backup datafile will have inconsistencies because the source file was being changed during the backup process.
The only part of the datafile that becomes ‘out of date’ is the SCN which is stored in the header. This can be brought up to date easily by looking at the last checkpoint record. The data does not need to be updated.
October 17th, 2008 at 11:45 am
hii…it was a very gud explantion ..thanx a lotz …can u plz explain the role of controlfile during hot /online backup of database?
October 26th, 2008 at 2:44 am
great explanation, most of the subject is clear, just one thing
if we talk only about hot backups, and not about restore or recover.
in begin backup mode
1. data is updated in datafiles when ever a DML is fired
2. and SCN in updated after finishing backup i.e end backup mode taking the information from checkpoint.
if so
1. why database is required to be in archive log mode for taking hot backup?
2. is the REDO or Archived REDOs used by datafile during the backup phase or after completing the backup.
3. if not, then what is the requirement to keep database in archive log mode.
regards
prthvira
November 12th, 2008 at 6:14 pm
Good explanation.
Hi,
i have some questions related to stanby database.
eg we have two servers Host A and Host B. i have primary database on Host A.Now i dont know how to create a standby database .It is confusing when i go through notes.
Kindly help me i am new to it…..
December 5th, 2008 at 6:27 pm
just awesome !!!
June 10th, 2009 at 4:26 am
Steve, very nice explanation. So, doing a hot backup along with OS copy might generate lot of fractured blocks which can increase the recovery time? We use NetApp snapshot by putting the DB in hot backup mode and snapping the files and cataloging them into RMAN including the archives. As the snaps are done within seconds – we use this compared to taking full RMAN backups.
July 2nd, 2009 at 9:11 am
[...] ? | what is BINARY COMPRTESSION ?| what is UNUSED BLOCK COMPRESSION ? « lutz hartmann as sysdba http://www.oraclealchemist.com/oracle/hot-backups-redo-and-fractured-blocks/ Possibly related posts: (automatically generated)Oracle By Example Series for 10g Release [...]
July 6th, 2009 at 2:41 pm
Hi
i thanks for the explanation this doubt was in my mind from past 1 year now it got cleared.Really it was excellect explanation.
July 6th, 2009 at 2:44 pm
Hi
i have one more question it would be a great favour for me if u can answer to this question below
Q.My query was running fine yesterday but now it was taken more time to complete what is the reason ?
July 10th, 2009 at 3:09 pm
Hi nice to see this neat explantaion about the hotbackup and abt fracured blocks ..I have a question ..can u plz let me know why there is more redo generation during the tablespace/database begin backup mode ?
September 5th, 2009 at 1:38 pm
Thanku for giving for great explanation. I doubt was clear with ur explanation.
I have a doubt why extra redo’s are not genarated while taking backup with RMAN.
September 23rd, 2009 at 6:41 pm
Hi
Its really greate doc regarding fracture blocks during HOT backups.
I have two doubts ,it would be greate if you will clear my doubts:-
1>How to avoide the fracture blocks in HOT backups ?
2>My database crashes during HOT backup,than how to bringup my database ?
October 27th, 2009 at 5:40 pm
Steve…Thanks for the great explanation…..
June 11th, 2010 at 3:05 am
Thank you very much Steve
June 22nd, 2010 at 6:43 pm
Hi,
It was a great explanation, thanks a lot. I have a question, as the datafiles headers are freezed while taking backup when the SCNs are updated… For example if hundreds of transactions have performed then how the SCNs are updated. Plz clarify my doubt. It would be great favour to me and other DBA’s.
June 25th, 2010 at 12:49 am
AHH cant be explained in any better way…. thanks