Hot Backups, Redo, and Fractured Blocks

Share Button

I received a question from a reader that I figured I’d address here.

Question: The oracle documentation tells us that when we put a tablespace in backup mode, the first DML in the session logs the entire block in the redo log buffer and not just the changed vectors. i have two questions regarding this:

1) Can we simulate an example to see this happening?
2) What can be the purpose of logging the entire block the first time and not do the same subsequently?

Answer:

No problem! Below, I’ve created a simulation. Pay attention to the “redo size” statistic in each.

First, I have updated a single row of the employees table.

SQL> set autotrace trace stat
SQL> update employees set first_name = 'Stephen' where employee_id = 100;

1 row updated.


Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
          1  consistent gets
          0  physical reads
        292  redo size
        669  bytes sent via SQL*Net to client
        598  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> rollback;

Rollback complete.

Notice the redo size was only 292 bytes, not a very large amount. Now, let’s put the USERS tablespace into hot backup mode.


SQL> alter tablespace users begin backup;

Tablespace altered.

SQL> update employees set first_name = 'Stephen' where employee_id = 100;

1 row updated.


Statistics
----------------------------------------------------------
          0  recursive calls
          2  db block gets
          1  consistent gets
          0  physical reads
       8652  redo size
        670  bytes sent via SQL*Net to client
        598  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

Wow! Quite a bit of a difference. This time, we can see that at least an entire block was written to redo; 8,652 bytes total. Let’s run it one more time, with the tablespace still in hot backup mode.

SQL> /

1 row updated.


Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
          1  consistent gets
          0  physical reads
        292  redo size
        671  bytes sent via SQL*Net to client
        598  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

This time, it only used 292 bytes, the same as the original amount. However, to address your second question, we’re going to attempt changing a different block, by changing a record in the departments table instead of employees.

SQL> update departments set department_name = 'Test Dept.' where department_id = 270;

1 row updated.


Statistics
----------------------------------------------------------
         17  recursive calls
          1  db block gets
          5  consistent gets
          1  physical reads
       8572  redo size
        673  bytes sent via SQL*Net to client
        610  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

The result is that another entire block was written to redo. In your question, you stated: “The oracle documentation tells us that when we put a tablespace in backup mode, the first DML in the session logs the entire block in the redo log buffer and not just the changed vectors”

This is close, but not right on the mark. It is not the first DML of the session, but the first DML to a block that is written to redo. You’re absolutely right when you imply that the first DML of the session would make no sense. However, when Oracle writes the first DML for the block, it ensures that the redo logs/archive trail contains at least one full representation of each block that is changed. Subsequent changes will therefore be safe.

This process exists to resolve block fractures. A block fracture occurs when a block is being read by the backup, and being written to at the same time by DBWR. Because the OS (usually) reads blocks at a different rate than Oracle, your OS copy will pull pieces of an Oracle block at a time. What if the OS copy pulls half a block, and while that is happening, the block is changed by DBWR? When the OS copy pulls the second half of the block it will result in mismatched halves, which Oracle would not know how to reconcile.

This is also why the SCN of the datafile header does not change when a tablespace enters hot backup mode. The current SCNs are recorded in redo, but not in the datafile. This is to ensure that Oracle will always recover over the datafile contents with redo entries. When recovery occurs, the fractured datafile block will be replaced with a complete block from redo, making it whole again. After Oracle can be certain it has a complete block, all it needs are the vectors.

I hope this answered your question!

Share Button

45 comments

  1. Steve,
    Thanks for a great answer. The example clearly shows the logging of the entire block in the redo buffer when the tablespcae is put in the backup mode. As for the second part ,your explanation is indeed lucid. I just wanted to clarify one point here. You have written ‘This process exists to resolve block fractures. A block fracture occurs when a block is being read by the backup, and being written to at the same time by DBWR. Because the OS (usually) reads blocks at a different rate than Oracle, your OS copy will pull pieces of an Oracle block at a time. What if the OS copy pulls half a block, and while that is happening, the block is changed by DBWR? When the OS copy pulls the second half of the block it will result in mismatched halves, which Oracle would not know how to reconcile.
    My idea is that regardless of whether DBWR is writing to the block or not when it is being copied, the o/s copy will always be fuzzy. Anything copied hot is intrinsically inconsistent and needs ‘recovery’. That is why oracle logs an entire block, so that it has a ‘good base’ on which to apply archive logs. Putting a tablespace in backup mode stops the SCN so that we know from which point recovery will be required.
    Am I correct? Thanks Steve once again for your help.

  2. Saibal,

    No problem, I’m glad you enjoyed the answer.

    Yes, it is correct that the SCN is the datafile header is frozen so Oracle knows the point at which redo may be required in order to recover. If a change is made to a block during a hot backup, the copied block could be right, it could be wrong, it could be fractured…who knows? Certainly not Oracle…all Oracle knows upon recovery is that it has redo with a higher SCN than the datafile you just restored, and thus a recovery will occur. Just as I said in my original answer:

    “This is also why the SCN of the datafile header does not change when a tablespace enters hot backup mode. The current SCNs are recorded in redo, but not in the datafile. This is to ensure that Oracle will always recover over the datafile contents with redo entries. When recovery occurs, the fractured datafile block will be replaced with a complete block from redo, making it whole again. After Oracle can be certain it has a complete block, all it needs are the vectors.”

  3. Hi Steve,
    I have cleared my doubts with well understanding of the concepts in which u have explained with examples.

    Thanks

  4. Hi Steve,
    Can u explain the factors fhat causes block corruption? Is there any way to remove the corrupted block other than RMAN?

    Need help .

  5. Hi,
    I am having one more doubt regarding backing up of tablespace.
    1)During backup of a tablespace,the datafile header gets frozen.For example at that time the SCN is 100.
    2)When the backup is in progress,an update stmt is fired on the bkl,that is being backed up.
    3)At that time,the entire block along with the new SCN(110) is stored in the redo log file.
    My question is
    Once the backup is over,when does this changed block is written to the backedup tablespace’s datafile?Which SCN will be updated in the datafile header after backup.

  6. Steve,

    Its was good explanation about the fractured blocks…I am a junior DBA.
    I really got confused of what internally happens when we put a datafile into a hot backup mode.Can you please eloborate it..suppose I try to do a DML transaction when the users tablespace is in backup mode, will the changed data’s will be written to the datafile? or
    oracle will write those modified data’s to both redologfile and corresponding datafile??
    Need help…

    Thanks,
    Srinivas

  7. Srinivas,

    Yes, read Saibal’s original comment and my reply to him. The SCN is frozen, but the data will still be written to the datafile. The redo data is there to correct the “fuzziness” of the data written to the datafile during hot backup mode.

    NOTE, this does not happen when you use RMAN, just when you use ‘alter tablespace begin backup’ or some variant thereof.

  8. Steve,

    Thanks you very much for such a detail explanation about fractured block. I have cleared all my doubts regarding to it.

    Can you tell me in detail what is happen in database when we fired the

    ALTER DATABASE BEGIN BACKUP & END BACKUP Command.

    Thanks in Advance
    Nimesh

  9. Nimesh,

    Sure. When you begin backup, it freezes the header of your datafiles (meaning the SCN number will not increment any more until the backup is ended). It also instructs LGWR to write whole blocks to redo the first time a block is touched during hot backup mode.

    When you end backup, it unfreezes the header of the datafiles and allows SCNs to be recorded properly during checkpoint.

    Consequently, this is why your database will appear to need recovery if it crashes while in hot backup mode. If you go into hot backup mode and then shutdown abort, when you come up it will not open and will report that files need recovery. However, they simply need to be taken out of hot backup mode.

  10. Hi,

    can I use the hot backup as consistent backup or does I need a recovery in all time?
    You wrote ————–
    Sure. When you begin backup, it freezes the header of your datafiles (meaning the SCN number will not increment any more until the backup is ended). It also instructs LGWR to write whole blocks to redo the first time a block is touched during hot backup mode.
    When you end backup, it unfreezes the header of the datafiles and allows SCNs to be recorded properly during checkpoint.
    —————————-
    Because when I start the backup the SCN frozen (the db is consistent?). After the end backup the database set the SCN to the current SCN who is written to the logs?

    Regards hans

  11. When you begin the hot backup the SCN is frozen. The datafile data will be consistent, and the data in the redo logs will be consistent, but the backup itself will not be unless the redo/archive is applied.

  12. I don’t understand why I need to apply my redo/archivelogs, when I want to have the SCN of my begin backup.
    Or does Oracle modify my db block during the backup, so my SCN is the same but the data is changed? Thats means for me the database only want to “know” from which SCN it must be recover to the end backup to be consist.

  13. Hans, you’ve got it. Writes still occur to the datafile, just the SCN is frozen. This occurs so that at recovery time, Oracle will know that it must overwrite all blocks in the backup file with redo entries due to fracturing. The original datafiles remain up-to-date, but the backup files will not be because they are being changed during backup.

    What you’re describing is a cold backup. You must have the database closed to be truly frozen at a given SCN.

  14. it was a gud explaination. but i ahve a doubt, when at ablespace is in backup mode and a long running query is updating the datafile beloging to that particular tablespace such that if redolog is full and rewritten, how the datafiles is updated when tablespace brought back to online.

  15. Arun,

    It’s just like I told Hans:

    Writes still occur to the datafile, just the SCN is frozen. This occurs so that at recovery time, Oracle will know that it must overwrite all blocks in the backup file with redo entries due to fracturing. The original datafiles remain up-to-date, but the backup files will not be because they are being changed during backup.

    The datafile is updated throughout the entire process of being in hot backup mode. That is precisely the reason that extra redo must be generated: the backup datafile will have inconsistencies because the source file was being changed during the backup process.

    The only part of the datafile that becomes ‘out of date’ is the SCN which is stored in the header. This can be brought up to date easily by looking at the last checkpoint record. The data does not need to be updated.

  16. hii…it was a very gud explantion ..thanx a lotz …can u plz explain the role of controlfile during hot /online backup of database?

  17. great explanation, most of the subject is clear, just one thing
    if we talk only about hot backups, and not about restore or recover.
    in begin backup mode
    1. data is updated in datafiles when ever a DML is fired
    2. and SCN in updated after finishing backup i.e end backup mode taking the information from checkpoint.

    if so
    1. why database is required to be in archive log mode for taking hot backup?
    2. is the REDO or Archived REDOs used by datafile during the backup phase or after completing the backup.
    3. if not, then what is the requirement to keep database in archive log mode.

    regards

    prthvira

  18. Good explanation.

    Hi,

    i have some questions related to stanby database.

    eg we have two servers Host A and Host B. i have primary database on Host A.Now i dont know how to create a standby database .It is confusing when i go through notes.

    Kindly help me i am new to it…..

  19. Steve, very nice explanation. So, doing a hot backup along with OS copy might generate lot of fractured blocks which can increase the recovery time? We use NetApp snapshot by putting the DB in hot backup mode and snapping the files and cataloging them into RMAN including the archives. As the snaps are done within seconds – we use this compared to taking full RMAN backups.

  20. Hi
    i thanks for the explanation this doubt was in my mind from past 1 year now it got cleared.Really it was excellect explanation.

  21. Hi
    i have one more question it would be a great favour for me if u can answer to this question below
    Q.My query was running fine yesterday but now it was taken more time to complete what is the reason ?

  22. Hi nice to see this neat explantaion about the hotbackup and abt fracured blocks ..I have a question ..can u plz let me know why there is more redo generation during the tablespace/database begin backup mode ?

  23. Thanku for giving for great explanation. I doubt was clear with ur explanation.
    I have a doubt why extra redo’s are not genarated while taking backup with RMAN.

  24. Hi

    Its really greate doc regarding fracture blocks during HOT backups.
    I have two doubts ,it would be greate if you will clear my doubts:-

    1>How to avoide the fracture blocks in HOT backups ?
    2>My database crashes during HOT backup,than how to bringup my database ?

  25. A question please.. we r backing up a very large tablespace. since it is a very large tablespace, lots of redo is getting generated and in the process, redo logs got overwritten (they are writtenin cyclic manner) and the generated archived logs are transferred offsite. What will happen when we issue the end backup command. Will it need the archived logs to complete the end backup. Could anyone please explain.. Thanks.

  26. Ji really nice .., now my doubt is completly clear:) thank u fro your great examples 🙂

  27. Hi,

    It was a great explanation, thanks a lot. I have a question, as the datafiles headers are freezed while taking backup when the SCNs are updated… For example if hundreds of transactions have performed then how the SCNs are updated. Plz clarify my doubt. It would be great favour to me and other DBA’s.

  28. I loved the way you used autotrace for a quick and easy example! Thanks for the neat explanation.

  29. I have one doubt.Take the below case

    Suppose our database have 100 datafile and right now rman has completed backup of 64 datafile and in process of backup of 65 number datafile.In mean time I excuted some query and it has changed the block 55. so now my question weather rman will go and bacup this datafile again or it will leave this file as it is and archive file will be having info about this file.

  30. Steve,
    Thanks for a great answer. The example clearly shows the logging of the entire block in the redo buffer when the tablespcae is put in the backup mode.

  31. thanks steve……..
    u have explaned very well and reply very well………..
    keep posting …..
    all the very best ……

  32. Nice Document….!!! – Good Explanation.

    Please suggest the following.
    1) During hotbackup, if I fire update command on that table / tablespace then commit. so, my SCN will be changed. Then I fire select command for same record, what will be the result? Updated data?? Then what about Freezing SCN in header for tablespace which are kept in begin backup mode? What is the role of controlfile?
    2) at end backup, from where the tablespace refer latest SCN?

    Thank you…

  33. Hi,
    wonderful article, very helpful :).

    I have a question, if I may: Do you know what happens when having fuzzy blocks in the database, you start the database using _allow_resetlogs_corruption?

    Thanks

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.