Trick #3 – Store Any Data with AnyData

Share Button

Today’s (well, yesterday’s really) trick deals with the storage of data. Let me first say that the methods I’m about to show you are NOT useful for where clauses or in normal database design. However, they can be very useful when you need a reference table that can store anything in a simple column.

For instance, take a look at Google. Google is an amazing search engine, capable of parsing billions of sites and enabling fast searching of the data it harvests.

In addition, Google is able to parse prices, mileage, dates, and other important elements, then use them as they were meant to be used; for instance, using a price to power Froogle’s bargain hunting tool.

If we were trying to duplicate the same concept into Oracle, we’d somehow need to create a table with an ID and a token, the token being the word that was parsed out of a complete document. However, what if we want that column to be able to store numbers, characters, dates…and we want them to act natively?

Enter ANYDATA. ANYDATA is a “self describing data instance type.” All this truly means is that the type not only holds a value, but it also holds its own datatype within. It also employs many methods that allow us to perform various functions with it.

So, back to our parser! We’ll start off with a very simple table:

create table search_words (
token_id number not null primary key,
token sys.anydata
);

Now we must populate the table, but normal inserts unfortunately wont work. We must use ANYDATA’s built-in Convert methods to perform our inserts. We’ll use the phrase “Make 50 dollars on 12/01/06”

INSERT INTO SEARCH_WORDS
VALUES (1, sys.anydata.convertVarchar2('Make');
INSERT INTO SEARCH_WORDS
VALUES (2, sys.anydata.convertNumber(50);
INSERT INTO SEARCH_WORDS
VALUES (3, sys.anydata.convertVarchar2('dollars');
INSERT INTO SEARCH_WORDS
VALUES (4, sys.anydata.convertVarchar2('on');
INSERT INTO SEARCH_WORDS
VALUES (5, sys.anydata.convertDate(to_date('12/01/2006', 'MM/DD/YYYY');
commit;

And now we have data! Let’s query it and see what it looks like.

SQL> select * from search_words; TOKEN_ID TOKEN()
---------- --------------------
1 ANYDATA()
2 ANYDATA()
3 ANYDATA()
4 ANYDATA()
5 ANYDATA()

Oops! That’s not how we want our data to look. Unfortunately, Oracle is not yet sophisticated enough to understand which data type it is querying. Because of this, we have to make a function.

CREATE OR REPLACE FUNCTION getTokens(pv_token in sys.anydata)
RETURN VARCHAR2 IS

lv_number      NUMBER;
lv_date     DATE;
lv_varchar2 VARCHAR2(4000);

BEGIN

CASE pv_token.getTypeName
WHEN 'SYS.NUMBER' THEN
IF (pv_token.getNumber(lv_number) = dbms_types.success) THEN
lv_varchar2 := lv_number;
END IF;
WHEN 'SYS.DATE' THEN
IF (pv_token.getDate(lv_date) = dbms_types.success) THEN
lv_varchar2 := lv_date;
END IF;
WHEN 'SYS.VARCHAR2' THEN
IF (pv_token.getVarchar2(lv_varchar2) = dbms_types.success) THENa
NULL;
END IF;
ELSE
lv_varchar2 := 'unknown datatype';
END CASE;

RETURN lv_varchar2;

END getTokens;

Notice that we used the ANYDATA methods again, but this time with getVarchar2, getNumber, and getDate. Finally, let’s go ahead and query the table using the function we created.

SQL> column token format a20
SQL> select token_id, getTokens(token) token from search_words;

TOKEN_ID TOKEN
---------- --------------------
1 Make
2 50
3 dollars
4 on
5 01-DEC-06

And we have our data! Since we have to use a function, you can see why this would not be very good in a WHERE clause. However, for storing multiple datatypes in a single table, it’s tops.

You can find more information about ANYDATA in the Database Application Developer’s Guide, and information about its methods in the PL/SQL Packages and Types Reference.

Join me next time to learn about pulling XML from the database…with any query!

Share Button

3 comments

  1. Great examples, I’m just trying to think of how to use it.

    I know you said that you shouldn’t/wouldbn’t use a where clause to filter out the token column, but why can’t you do something like WHERE getTokens(TOKEN) = 50) and even have a function based index to help with the serach times?

    Also can you do something like :

    SELECT 2*getTokens(TOKEN) FROM SEARCH_WORDS WHERE TOKEN_ID=2;

    and get 100 returned?

    and from the same table :

    SELECT SUBSTR(getTokens(TOKEN),1,2) || ‘de’ FROM SEARCH_WORDS WHERE TOKEN_ID=1;

    and get “Made” returned?

    or

    SELECT to_char(getTokens(TOKEN)-1),’dd-MON-YY’)
    FROM SEARCH_WORDS WHERE TOKEN_ID=5;

    and get 31-NOV-06

    NOTE: I assume this is a 10g functionality? I gon’t have a 10g DB to test in yet, so sorry for the questions that i would have tested myself if i did.

  2. Oh i think i just answered my own geestion. I just noticed that the return value for your function is : VARCHAR2, so some may work, but probably not the date one, unless we do something like

    SELECT to_char(to_date(to_char(getTokens(TOKEN)),’dd-MON-YY’) -1, ’dd-MON-YY’), ’dd-MON-YY’)
    FROM SEARCH_WORDS WHERE TOKEN_ID=5;

    Just a thought, why not overload the function somehow based on the return datatype….

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.