Posts mit dem Label Federation werden angezeigt. Alle Posts anzeigen
Posts mit dem Label Federation werden angezeigt. Alle Posts anzeigen

Mittwoch, 2. Dezember 2009

Staging Pain Minimized: DB2 Create Tabe Like

A quite usual task when building a staging area is to more or less "stupidly" replicating federated source data into the target data warehouse.

This is to minimize the influence of ongoing transactions within the operational systems on your following ETL processes. This is also to allow for "batch performance" treatments on the source data that you normally would not apply to your source systems. And finally, having the data locally gives you more expressivity when combining the data even with other data from the same source.

In an ongoing customer´s project, we were faced with exactly that situation and came to build a lot scripts of the form:


drop table stage.replicated_table;

create table stage.replicated_table(replicated_column1 replicated_data_type1 replicated_constraint1, replicated_column2 ...);

declare load_cursor load for select column1, column2,... from source_link_schema.source_table for read only with ur;

load from "load_cursor" method p(1,2,...) insert into stage.replicated_table(replicated_column1, replicated_column2, ...) nonrecoverable;

runstats on table stage.replicated_table ...;


each taking about 20 minutes in order to describe the original table and reformat the output into the create table syntax. Pffff.

For gods sake, some people at IBM do not look only good in their blue suits (still too few, I think I have to blog next my hate away about the ugly and time-consuming create view restriction when selecting non-unique column names) and invented the create table like syntax


drop table stage.replicated_table;

create table stage.replicated_table like sources_link_schema.source_table;

declare load_cursor load for select * from source_link_schema.source_table for read only with ur;

load from "load_cursor" method p(1,2,...) insert into stage.replicated_table nonrecoverable;

runstats on table stage.replicated_table ...;


1 minute. Runs fine. I´m happy.

Mittwoch, 25. November 2009

Date Type Conversion When Federating Oracle into DB/2

Federation (or DB-Link depending on the target DWH platform) is a most convenient way to connect your staging area to your data sources.

There is the golden rule not to stretch the tempting, modern capabilities of that feature to much, hence to first replicate, then transform the data.

There are nevertheless a few issues left that may still hurt you mostly around the builtin data type conversions, especially when formulating the mostly temporally constrained replication conditions.

So, DB2 9.5 represents oracle´s DATE as a TIMESTAMP - hence be sure to use CURRENT_TIMESTAMP instead of CURRENT_DATE!