Semantic Integration Explained

Let us see how definitions and meta types provide for semantic integration. Suppose the previously mentioned RDBMS column contains an alphanumeric part number [e.g., pnum varchar2(6)] while the COBOL file resource data item identifies a numeric part number [e.g., part_no PIC 9(5)]. Note that these fields differ not only in type but also in size.

During the semantic rationalization process, each field is mapped to the definition part_number and assigned the meta type id, an alias for the expressor string type. What is the rationale for changing the types of the pnum and part_no fields? It is a reasonably safe assumption that mathematical operations will not be performed on the part_number field, so storing and manipulating this value as a string type provides greater functionality (the string manipulation and pattern matching functions become usable), less confusing scripting (part numbers are always represented by the same type), and more efficient memory management. It is the responsibility of the expressor parallel data processing engine, not the integration developer, to manage the conversions between the types in the resources and the types within the integration. These conversions are therefore completely hidden from the developer, who only needs to be aware of the meta type assigned to the definition. Every integration project in which a data resource contains a field mapped to the part_number definition will now treat this data as a string type.
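The idea of mapping differently typed source fields to one common string type can be sketched in Python. This is only an illustration of the concept, not how the expressor engine performs its conversions: the helper `to_part_number` and the sample values are hypothetical, and a plain `str()` conversion is assumed to be sufficient for both sources.

```python
def to_part_number(raw_value):
    """Hypothetical helper: return a source value as the common string type."""
    # Assumption: converting with str() and trimming whitespace is enough
    # for both the alphanumeric and the numeric source field.
    return str(raw_value).strip()


# Sample values standing in for the two resources described above.
rdbms_pnum = "AB1234"   # e.g. read from pnum varchar2(6)
cobol_part_no = 12345   # e.g. read from part_no PIC 9(5)

part_numbers = [to_part_number(rdbms_pnum), to_part_number(cobol_part_no)]

# Because every part number is now a string, the same string
# operations apply no matter which resource the value came from.
starts_with_letter = [p[:1].isalpha() for p in part_numbers]
```

Once both values share a type, downstream code no longer needs a separate branch for each resource.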

What about the size differences? Although all part numbers are now represented by a common type, how can code be simplified if part numbers differ in size? The answer is to write a business rule for the definition part_number that adjusts the size of part number fields. For example, the following rule first left-pads the part number with zeros and then extracts the right-most eight characters as the formatted part number value.

string.substring(string.concatenate("00000000", part_number), -8)

Note that the formatted part number contains eight characters, which means that the rule is usable with data resources where part number values contain up to eight characters. Of course, the rule can be modified to format the part number to any width using any padding character. Writing a comparable rule for a numeric field would be much more complicated, especially when using zero as the padding digit. The first step of an integration would apply this rule to the incoming part number, and all subsequent steps would work with an eight-character-wide string field.
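For readers who want to experiment with the pad-then-truncate technique outside expressor, here is a rough Python equivalent of the rule. The function name `format_part_number` and its `width` and `pad_char` parameters are assumptions made for this sketch; they are not part of the expressor scripting language.

```python
def format_part_number(part_number, width=8, pad_char="0"):
    """Left-pad part_number with pad_char, then keep the right-most width characters."""
    if width < 1 or len(pad_char) != 1:
        raise ValueError("width must be >= 1 and pad_char a single character")
    # Prepend a full width of padding so even an empty value is long enough.
    padded = pad_char * width + part_number
    # Take the right-most characters, mirroring the -8 offset in the rule.
    return padded[-width:]


formatted_numeric = format_part_number("12345")
formatted_alpha = format_part_number("AB1234")
formatted_custom = format_part_number("42", width=4, pad_char="X")
```

Note that a value longer than the chosen width keeps only its right-most characters, so the width should be at least as large as the longest part number in any mapped resource.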
