Apparently there is still confusion over which Oracle feature provides high availability and which provides disaster recovery. More than one DBA seems to believe that Data Guard is a high-availability (HA) solution; I don’t consider it so, as we’ll soon discover. Let’s define what high availability is, then see which product and/or feature satisfies that definition.
To qualify as a high-availability solution, a product must provide relatively uninterrupted access to the production system by handling failures transparently: the user community remains unaware of failures that could otherwise affect access.
Data Guard, in all of its glory, does not provide such access in my opinion, although the Oracle documentation says otherwise; Real Application Clusters does, as does an older Oracle product called FailSafe and an even older product that was cumbersome to configure and use, Oracle Parallel Server. Still, there are DBAs in the workforce who firmly believe that Data Guard is a valid high-availability solution, even knowing that a failover involves time during which users have no access to the database. [Apparently my idea of HA and Oracle’s differ.] Given the criteria listed above, Data Guard does not, in my mind, fit the bill. So why do some DBAs consider it high availability? Let’s see what Data Guard does do, and maybe we’ll see why I don’t consider it that way.
Data Guard provides a mechanism whereby Oracle will keep one or more databases synchronized with the primary database. For a physical standby configuration in release 10.2, three protection modes are available:
Maximum Protection
Maximum Availability
Maximum Performance
Maximum Protection mode guarantees that the standby remains in ‘lock step’ with the primary: every transaction is written to both the primary redo logs and the standby redo logs, and no transaction can commit until all local and remote redo has been written successfully. The caveat is that if Oracle cannot write a transaction to the standby redo logs, the primary suspends activity until the error causing the write issue is corrected. This ensures a seamless cutover should disaster strike, but it also inconveniences production users should problems with standby redo log writes occur.

Maximum Availability mode works like Maximum Protection mode until a standby redo log write problem occurs; Oracle then switches to Maximum Performance mode until the write issue is resolved, at which time the standby redo log writes catch up with the primary.

Maximum Performance mode allows a transaction to commit as soon as the redo entries are written to the local redo logs, regardless of whether the standby redo log writes have completed.

[In 11.2 a snapshot standby is available, which allows read/write access to the data and can be converted back to a physical standby at any time. Also available in that release is support for redo apply while a physical standby is open for read access.]

A logical standby configuration is also available; it relies upon log shipping to the standby, where LogMiner is used to extract and apply DDL and DML changes, although a few data types (listed in the online documentation) won’t replicate in such a setup. For the purposes of this discussion only the physical standby configuration will be considered, as it replicates all changes made to the primary, providing a byte-for-byte replica of the primary.
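To make the modes concrete, here is a minimal sketch of how a protection mode is configured from the primary. The service name `stby` and destination number are assumptions for illustration; the SYNC/AFFIRM attributes are what distinguish the higher protection modes from Maximum Performance (which uses ASYNC).

```sql
-- Hypothetical: ship redo synchronously to a standby whose
-- DB_UNIQUE_NAME is 'stby'. SYNC AFFIRM is required for
-- Maximum Protection and Maximum Availability; ASYNC suffices
-- for Maximum Performance.
ALTER SYSTEM SET log_archive_dest_2 =
  'SERVICE=stby SYNC AFFIRM
   VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
   DB_UNIQUE_NAME=stby';

-- Raise the protection mode (moving up to Maximum Protection
-- requires the primary to be restarted and mounted first).
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

-- Verify the configured mode and the level actually in effect.
SELECT protection_mode, protection_level FROM v$database;
```

Note that `protection_level` can differ from `protection_mode` when standby communication is degraded, which is exactly the Maximum Availability fallback behavior described above.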
A physical standby can provide (depending upon the protection mode) an exact ‘point in time’ copy of the primary so that no transactions are ‘pending’ due to archive log transfers. [In Maximum Performance mode, if standby redo logs are not configured, the standby is synchronized only to the last log transferred from the primary, leaving a gap of several minutes’ worth of transactions at the standby site.] This does NOT provide a high-availability configuration, as failover tasks consume time and take the database out of service until the failover is complete. Since high availability is defined as relatively uninterrupted access to the database even during failure of some resources, Data Guard cannot, and should not, be used if high availability (meaning no downtime, as RAC provides it) is desired or required. It is a Disaster Recovery (DR) solution, and DR and HA are not the same in my book. [GoldenGate provides both DR (with Active Data Guard) and real-time replication solutions through the same interface, neither of which is a high-availability offering, even though many ‘experts’, and Oracle Corporation, offer the product as a high-availability configuration.]
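The failover ‘time out of service’ I keep referring to is not hypothetical; a 10.2 physical standby failover involves steps along these lines, run on the standby (the exact sequence depends on the configuration, so treat this as a sketch rather than a runbook):

```sql
-- On the standby: apply whatever redo remains, then assume the
-- primary role. Users cannot do work until all of this completes.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
-- A restart may be required in some configurations before the open.
ALTER DATABASE OPEN;
```

The Data Guard broker collapses this into a single `FAILOVER TO <standby>` command in DGMGRL, but the database is still unavailable for the duration; the broker shortens the window, it does not eliminate it.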
Real Application Clusters, or RAC as it’s known industry-wide, is an HA solution, as it provides the uninterrupted access required. Unless this is a single-node RAC (a configuration available in release 11.2 that provides for later expansion and is primarily designed for development and testing purposes), this option can be configured to transparently fail over to a known good node should one node fail, the key term being ‘transparently’. No user interaction or intervention is necessary, as RAC seamlessly transfers work to a good node and continues without inconveniencing the users. The database is available as long as at least one node remains operational; that, of course, could slow down transactional activity depending on the available memory on that node, but there is no loss of service. Contrast that with Data Guard, where the primary database is no longer functioning and a secondary database, in a physically separate location, must be converted from standby to primary before users can resume work. Add the time to reconfigure the old primary as the new standby and it’s clear this is not high availability.
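The ‘transparently’ part is typically delivered on the client side with Transparent Application Failover (TAF). A minimal sketch of a TAF-enabled tnsnames.ora entry follows; the host name `rac-scan` and service name `racprod` are hypothetical placeholders:

```text
# Hypothetical TAF entry: if the node serving a session dies, the
# session reconnects to a surviving instance and, for in-flight
# SELECTs, resumes the query automatically.
RACPROD =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = racprod)
      (FAILOVER_MODE =
        (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 3))
    )
  )
```

TYPE=SELECT resumes queries after failover; uncommitted transactions are still rolled back, which is why I say ‘relatively’ uninterrupted rather than invisible.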
As another aspect of this discussion, a RAC configuration involves one database and two or more clustered instances accessing that database, while Data Guard involves two or more separate databases, usually found in two or more physical locations. Yes, the tnsnames.ora files can be configured to ‘fail over’ to the first active production site so that users need not reconfigure SQL*Net to access the former standby database should it be needed, but that isn’t the issue with Data Guard; the issue is the failover time required to exchange the roles of primary and standby, which interrupts service until the transition is complete. Improvements in Data Guard may have decreased the downtime considerably, although I would still have a difficult time recommending Data Guard as an HA offering.
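For completeness, the client-side ‘fail over’ mentioned above looks something like this (host and service names are hypothetical):

```text
# Hypothetical tnsnames.ora entry for a Data Guard pair: new
# connections try the primary host first, then the standby, with
# no client reconfiguration needed after a role transition.
PROD =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (LOAD_BALANCE = OFF)
      (ADDRESS = (PROTOCOL = TCP)(HOST = primhost)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = stbyhost)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = prod))
  )
```

This only helps new connections find whichever site is currently primary; existing sessions still fail, and nothing connects anywhere until the role transition itself is complete, which is precisely my point.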
Don’t get the wrong idea about Data Guard: it’s an excellent technology that provides data protection in the event of a catastrophic disaster at the primary data center, and it is often used in conjunction with RAC to provide an environment resilient to both node failure and complete disaster. If you want HA (in my opinion) then RAC is the ‘out of the box’ solution provided by Oracle, as it handles node failures with grace (and possibly style) and keeps the work flowing seamlessly. Know that data protection and high availability are different, but compatible, areas that need to be considered when constructing a robust database configuration, and that the former cannot replace the latter (again, in my opinion).
[The Oracle documentation, at first blush, agrees with my definition but later on in the depths of the HA discussion clearly states, without question, that Oracle considers Data Guard a high-availability solution. Far be it from me to argue with Oracle.]
Data Guard and RAC are both well-tested and reliable options to consider when designing and implementing a fault-tolerant configuration, but of the two only RAC, in my estimation, provides high-availability.
Unless you like explaining to upper management why your ‘HA solution’ required an outage.