FORALL operations IN oracle

Share Button

Bulk binds and bulk loading are nothing new; in fact, the feature came out in Oracle 8i. This was a time when Oracle Parallel Servers roamed the land, pfiles were all the rage, and connecting internal was a fading fad. But like the rest of Oracle, the features have grown over the years and are a great way to pull data into memory via PL/SQL, manipulate it if desired, and perform DML in a single operation with the FORALL command. In this article we will use the SH.CUSTOMERS table to perform a FORALL insert and FORALL merge.

Setting Up the Environment

The environment for this test is an Oracle 11gR2 Enterprise Edition installation with the default schemas (HR, SH, OE, etc) loaded. I have logged in as SH and created a copy of the CUSTOMERS table by using the query: select dbms_metadata.get_ddl(‘TABLE’, ‘CUSTOMERS’) from dual;

SQL> desc customers
 Name					   Null?    Type
 ----------------------------------------- -------- ----------------------------
 CUST_ID				   NOT NULL NUMBER
 CUST_FIRST_NAME			   NOT NULL VARCHAR2(20)
 CUST_LAST_NAME 			   NOT NULL VARCHAR2(40)
 CUST_GENDER				   NOT NULL CHAR(1)
 CUST_YEAR_OF_BIRTH			   NOT NULL NUMBER(4)
 CUST_MARITAL_STATUS				    VARCHAR2(20)
 CUST_STREET_ADDRESS			   NOT NULL VARCHAR2(40)
 CUST_POSTAL_CODE			   NOT NULL VARCHAR2(10)
 CUST_CITY				   NOT NULL VARCHAR2(30)
 CUST_CITY_ID				   NOT NULL NUMBER
 CUST_STATE_PROVINCE			   NOT NULL VARCHAR2(40)
 CUST_STATE_PROVINCE_ID 		   NOT NULL NUMBER
 COUNTRY_ID				   NOT NULL NUMBER
 CUST_MAIN_PHONE_NUMBER 		   NOT NULL VARCHAR2(25)
 CUST_INCOME_LEVEL				    VARCHAR2(30)
 CUST_CREDIT_LIMIT				    NUMBER
 CUST_EMAIL					    VARCHAR2(30)
 CUST_TOTAL				   NOT NULL VARCHAR2(14)
 CUST_TOTAL_ID				   NOT NULL NUMBER
 CUST_SRC_ID					    NUMBER
 CUST_EFF_FROM					    DATE
 CUST_EFF_TO					    DATE
 CUST_VALID					    VARCHAR2(1)

SQL> select count(*) from customers;

  COUNT(*)
----------
     55500

SQL> desc customers2
 Name					   Null?    Type
 ----------------------------------------- -------- ----------------------------
 CUST_ID				   NOT NULL NUMBER
 CUST_FIRST_NAME			   NOT NULL VARCHAR2(20)
 CUST_LAST_NAME 			   NOT NULL VARCHAR2(40)
 CUST_GENDER				   NOT NULL CHAR(1)
 CUST_YEAR_OF_BIRTH			   NOT NULL NUMBER(4)
 CUST_MARITAL_STATUS				    VARCHAR2(20)
 CUST_STREET_ADDRESS			   NOT NULL VARCHAR2(40)
 CUST_POSTAL_CODE			   NOT NULL VARCHAR2(10)
 CUST_CITY				   NOT NULL VARCHAR2(30)
 CUST_CITY_ID				   NOT NULL NUMBER
 CUST_STATE_PROVINCE			   NOT NULL VARCHAR2(40)
 CUST_STATE_PROVINCE_ID 		   NOT NULL NUMBER
 COUNTRY_ID				   NOT NULL NUMBER
 CUST_MAIN_PHONE_NUMBER 		   NOT NULL VARCHAR2(25)
 CUST_INCOME_LEVEL				    VARCHAR2(30)
 CUST_CREDIT_LIMIT				    NUMBER
 CUST_EMAIL					    VARCHAR2(30)
 CUST_TOTAL				   NOT NULL VARCHAR2(14)
 CUST_TOTAL_ID				   NOT NULL NUMBER
 CUST_SRC_ID					    NUMBER
 CUST_EFF_FROM					    DATE
 CUST_EFF_TO					    DATE
 CUST_VALID					    VARCHAR2(1)

SQL> select count(*) from customers2;

  COUNT(*)
----------
	 0

As you can see, the CUSTOMERS table is action packed with 55,500 scintillating (not really) rows of customer data while the CUSTOMERS2 table has nothing at this time. In order to remedy that problem we’re going to write an anonymous PL/SQL program to bulk collect data from CUSTOMERS into a PL/SQL collection then use FORALL to load the data into CUSTOMERS2.

Keep in mind that this is a very pedestrian example of what you can do with bulk binds. While bulk collecting into memory and using FORALL to perform the insert does present noticeable performance benefits, it’s very unimaginative and dull. In my opinion the real power of bulk binds comes in its ability to manipulate the collections in memory using standard PL/SQL code and then perform a straight DML to the DB with the results of the object. Regardless, here is the code used to perform the insert:

SQL> declare
  2    type t_customers is table of customers%rowtype;
  3    lv_customers_tab t_customers;
  4  begin
  5    select * bulk collect into lv_customers_tab from customers;
  6    forall rec in lv_customers_tab.first .. lv_customers_tab.last
  7      insert into customers2 values lv_customers_tab(rec);
  8    commit;
  9  end;
 10  /

PL/SQL procedure successfully completed.

SQL> select count(*) from customers2;

  COUNT(*)
----------
     55500

As you can see, the code performed the following operations:

  1. Create an object type (t_customers) which is a table of CUSTOMER%ROWTYPEs. Imagine a single row in the CUSTOMER table. CUSTOMER%ROWTYPE would be the definition of columns (name, datatype) which make up a single row. By creating an object that is a TABLE OF those definitions, we create a PL/SQL representation of the CUSTOMER table.
  2. Create a variable (lv_customers_tab) of datatype t_customers. This lv_customers_tab variable is now an object that can hold a representation of the CUSTOMER table.
  3. Query the CUSTOMERS table, bulk collecting all data into the lv_customers_tab variable. This loads the data from the physical table in Oracle into an in-memory collection which can be manipulated, written, or discarded.
  4. Perform a FORALL operation using all data from the FIRST to LAST object in the collection, assigning the index for the pseudo-loop to the variable ‘rec’. There are many methods that can be used with PL/SQL collections (I will go through these later), but FIRST and LAST define the first and last index of the collection and are great for looping or pulling those specific records.
  5. Perform an INSERT operation, taking all data from the collection and somewhat literally slamming it into the target table. Note that we do not have to list each column of the table; instead, we insert an entire object of columns at once based on the index from the previous step (which is contained in the variable ‘rec’).

To understand the way collections store data, consider the following table example:

EXAMPLE
-------------------------
EXAMPLE_ID       NUMBER
VALUE            VARCHAR2
MYDATE           DATE

In this table, a piece of data contained in the EXAMPLE_ID column of datatype NUMBER is just a numeric chunk of data. But a combination of the column name (EXAMPLE_ID), datatype (NUMBER) can be considered an object of type EXAMPLE.EXAMPLE_ID%TYPE; this object will contain the attributes associated with that column in that table. The data contained therein (for example, 12345) could be considered an instance of the EXAMPLE.EXAMPLE_ID%TYPE object.

Likewise, an object of type EXAMPLE%ROWTYPE would be a combination of EXAMPLE_ID:NUMBER, VALUE:VARCHAR2, and MYDATE:DATE. Three NAME:TYPE combinations together form a single row in the table and a single set of values in the EXAMPLE%ROWTYPE object.

Finally, if we make a TABLE OF EXAMPLE%ROWTYPE then we have a collection of EXAMPLE_ID:NUMBER, VALUE:VARCHAR2, MYDATE:DATE combinations. This format is no different from an actual table except that it is completely variable based inside a chunk of PL/SQL code. Any data put into this structure will exist in memory and can be manipulated via code.

Performing a FORALL MERGE

Despite Tom Kyte’s argument that a FORALL MERGE is not required as MERGE is always a bulk capable event that already worked on sets, it appears that Oracle has provided the ability regardless.

as I’ve said before — forall doesn’t *make sense* with merge! merge already *works* on sets. merge is only about sets. You don’t *need* forall. -Tom Kyte

I’m willing to bet it was added for one of two reasons: either to spare customers the agony (sarcasm) of using table(cast()) in their MERGE statements, or to satisfy customers who kept demanding it regardless of actual need. Either way it’s there now.

Before doing the example we will insert more rows into the CUSTOMERS table (using a new sequence to provide CUST_IDs) and update every 7th row in CUSTOMERS:

SQL> select max(cust_id) from customers;

MAX(CUST_ID)
------------
      104500

SQL> create sequence customer_seq start with 104501 increment by 1;

Sequence created.

SQL> insert into customers
  2  select customer_seq.nextval, cust_first_name, cust_last_name, cust_gender,
  3  cust_year_of_birth, cust_marital_status, cust_street_address,
  4  cust_postal_code, cust_city, cust_city_id, cust_state_province,
  5  cust_state_province_id, country_id, cust_main_phone_number, cust_income_level,
  6  cust_credit_limit, cust_email, cust_total, cust_total_id, cust_src_id,
  7  cust_eff_from, cust_eff_to, cust_valid
  8  from customers where rownum < 20000;

19999 rows created.

SQL> update customers
  2  set cust_year_of_birth = 1999,
  3  cust_postal_code = '07030'
  4  where mod(cust_id, 7) = 0;

10785 rows updated.

Now let’s do a comparison of the two tables to make sure there is a noticeable difference. We will use COUNT() to provide a difference in row count, and MINUS to check for any differences at all:

SQL> select 
  2  (select count(1) from customers) - 
  3  (select count(1) from customers2) as diff
  4  from dual;

      DIFF
----------
     19999

SQL>  select count(*) from
  2  (select * from customers2
  3   minus
  4   select * from customers);

  COUNT(*)
----------
      7928

From these queries you can see that CUSTOMERS contains 19,999 rows that are not in CUSTOMERS2. Additionally, the CUSTOMERS2 table has 7,928 rows that are different from CUSTOMERS in some way. Using “select * from tab2 minus select * from tab2” compares every column of every row against the other table for comparison (something that came in very handy for testing replication back in 8i).

Now that we know our source tables are ready for the MERGE, let’s get to it:

SQL>  declare
  2    type t_customers is table of customers%rowtype;
  3    lv_customers_tab t_customers;
  4  begin
  5    select * bulk collect into lv_customers_tab from customers;
  6    forall rec in lv_customers_tab.first .. lv_customers_tab.last
  7	 merge into customers2 c2
  8	 using (select cust_id from customers c1
  9		where c1.cust_id = lv_customers_tab(rec).cust_id) match
 10	 on (match.cust_id = c2.cust_id)
 11	 when matched then
 12	   update set cust_first_name = lv_customers_tab(rec).cust_first_name,
 13		      cust_last_name = lv_customers_tab(rec).cust_last_name,
 14		      cust_gender = lv_customers_tab(rec).cust_gender,
 15		      cust_year_of_birth = lv_customers_tab(rec).cust_year_of_birth,
 16		      cust_marital_status = lv_customers_tab(rec).cust_marital_status,
 17		      cust_street_address = lv_customers_tab(rec).cust_street_address,
 18		      cust_postal_code = lv_customers_tab(rec).cust_postal_code,
 19		      cust_city = lv_customers_tab(rec).cust_city,
 20		      cust_city_id = lv_customers_tab(rec).cust_city_id,
 21		      cust_state_province = lv_customers_tab(rec).cust_state_province,
 22		      cust_state_province_id = lv_customers_tab(rec).cust_state_province_id,
 23		      country_id = lv_customers_tab(rec).country_id,
 24		      cust_main_phone_number = lv_customers_tab(rec).cust_main_phone_number,
 25		      cust_income_level = lv_customers_tab(rec).cust_income_level,
 26		      cust_credit_limit = lv_customers_tab(rec).cust_credit_limit,
 27		      cust_email = lv_customers_tab(rec).cust_email,
 28		      cust_total = lv_customers_tab(rec).cust_total,
 29		      cust_total_id = lv_customers_tab(rec).cust_total_id,
 30		      cust_src_id = lv_customers_tab(rec).cust_src_id,
 31		      cust_eff_from = lv_customers_tab(rec).cust_eff_from,
 32		      cust_eff_to = lv_customers_tab(rec).cust_eff_to,
 33		      cust_valid = lv_customers_tab(rec).cust_valid
 34	 when not matched then
 35	   insert values lv_customers_tab(rec);
 36    commit;
 37  end;
 38  /

As you can see, the one unfortunate part of the code is having to write out every column of the UPDATE part of the MERGE statement as UPDATE SET ROW does not work here. However, it does provide an opportunity to see better what is contained in the lv_customers_tab object. In this PL/SQL we performed the same steps as the INSERT, except we fed the lv_customers_tab object into a MERGE statement. The collection is compared to the target table (match) and rows are either inserted or updated into the target table depending on a match of CUST_ID. Let’s check the differences again to make sure it worked:

SQL> select
  2  (select count(1) from customers) -
  3  (select count(1) from customers2) as diff
  4  from dual;

      DIFF
----------
	 0

SQL> select count(*) from
  2  (select * from customers2
  3   minus
  4   select * from customers);

  COUNT(*)
----------
	 0

Success! No differences were found between CUSTOMERS and CUSTOMERS2 following our operation.

As I mentioned before, this is a very pedestrian example. The real power of bulk binds is in how you can manipulate the objects. For example, once you bulk load your data into a PL/SQL collection you can add new rows, delete rows, alter the data, etc. This allows you to pull data in, perform in-memory changes, then push it back to the source table or to another table effortlessly. If you incorporate this into a PACKAGE with the PL/SQL collection declared globally, you can keep entire sets of data in memory and manipulate them for the duration of your session which can be pretty cool.

The methods for doing this manipulation can be found in the Oracle documentation. Methods such as FIRST, LAST, PRIOR, and NEXT are good for traversing the collection while methods such as EXPAND and DELETE are good for manipulation. As you iterate through a collection you can use and modify data with the COLLECTIONNAME(index).COLUMN syntax as you would any other variable. And once you’re done working with the collection you can write it as a contiguous chunk of data or as a sparse collection to your target table.

Here are some more resources for Bulk Binds if you want to experiment further:

Share Button

2 comments

  1. To properly diff a set you need to compare both ways with each as primary.

    (select * from customers2 minus select * from customers)
    union all
    (select * from customers minus select * from customers2)
    ;

  2. Fair enough, thank you for the clarification!

    SQL> (select * from customers2 minus select * from customers)
    2 union all
    3 (select * from customers minus select * from customers2);

    no rows selected

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.