WebSphere®/InfoSphere® DataStage® has the functionality, flexibility, and scalability that are required to meet the most demanding data integration requirements.
WebSphere®/InfoSphere® DataStage has the following capabilities:
- Integrates data from the widest range of enterprise and external data sources
- Incorporates data validation rules
- Processes and transforms large amounts of data using scalable parallel processing
- Handles very complex transformations
- Manages multiple integration processes
- Provides direct connectivity to enterprise applications as sources or targets
- Leverages metadata for analysis and maintenance
- Operates in batch, real time, or as a Web service
Scenarios for data transformation
The following scenarios show how organizations use WebSphere DataStage to address complex data transformation and movement needs.
- Retail: Consolidating financial systems
- A leading retail chain watched sales flatten for the first time in years. Without insight into store-level and unit-level sales data, they could not adjust shipments or merchandising to improve results. With long production lead times and existing large-volume manufacturing contracts, they could not change their product lines quickly, even if they understood the problem. To integrate the company’s forecasting, distribution, replenishment, and inventory management processes, they needed a way to migrate financial reporting data from many systems to a single system of record. The company deployed IBM® Information Server to deliver data integration services between business applications in both messaging and batch file environments. WebSphere DataStage is now the common companywide standard for transforming and moving data. The service-oriented interface allows them to define common integration tasks and reuse them throughout the enterprise. New methodology and reusable components for other global projects will lead to additional future savings in design, testing, deployment, and maintenance.
- Banking: Understanding the customer
- A large retail bank understood that the more it knew about its customers, the better it could market its products, including credit cards, savings accounts, checking accounts, certificates of deposit, and ATM services. Faced with terabytes of customer data from vendor sources, the bank recognized the need to integrate the data into a central repository where decision-makers could retrieve it for market analysis and reporting. Without a solution, the bank risked flawed marketing decisions and lost cross-selling opportunities. The bank used WebSphere DataStage to automatically extract and transform raw vendor data, such as credit card account information, banking transaction details and Web site usage statistics, and load it into its data warehouse. From there, the company can generate reports that let them track the effectiveness of programs and analyze their marketing efforts. WebSphere DataStage helps the bank maintain, manage, and improve its information management with an IT staff of three instead of six or seven, saving hundreds of thousands of dollars in the first year alone, and enabling it to use the same capabilities more rapidly on other data integration projects.
Where WebSphere/InfoSphere DataStage fits in the overall business context
WebSphere DataStage enables an integral part of the information integration process: data transformation, as shown below.
As a component of IBM Information Server, IBM DataStage supports the following initiatives:
- Business intelligence
- IBM Information Server makes it easier to develop a unified view of the business for better decisions. It helps you understand existing data sources, cleanse, correct, and standardize information, and load analytical views that can be reused throughout the enterprise.
- Master data management
- IBM Information Server simplifies the development of authoritative master data by showing where and how information is stored across source systems. It also consolidates disparate data into a single, reliable record, cleanses and standardizes information, removes duplicates, and links records together across systems. This master record can be loaded into operational data stores, data warehouses, or master data applications such as WebSphere Customer Center. The record can also be assembled, completely or partially, on demand.
- Infrastructure rationalization
- IBM Information Server aids in reducing operating costs by showing relationships between systems and by defining migration rules to consolidate instances or move data from obsolete systems. Data cleansing and matching ensure high-quality data in the new system.
- Business transformation
- IBM Information Server can speed development and increase business agility by providing reusable information services that can be plugged into applications, business processes, and portals. These standards-based information services are maintained centrally by information specialists but are widely accessible throughout the enterprise.
- Risk and compliance
- IBM Information Server helps improve visibility and data governance by enabling complete, authoritative views of information with proof of lineage and quality. These views can be made widely available and reusable as shared services, while the rules inherent in them are maintained centrally.
IBM Information Server features a unified set of separately orderable product modules, or suite components, that solve multiple types of business problems. Information validation, access and processing rules can be reused across projects, leading to a higher degree of consistency, stronger control over data, and improved efficiency in IT projects.
- Understand your data
- IBM Information Server can help you automatically discover, define, and model information content and structure and understand and analyze the meaning, relationships, and lineage of information. By automating data profiling and data-quality auditing within systems, organizations can achieve these goals:
- Understand data sources and relationships
- Eliminate the risk of using or proliferating bad data
- Improve productivity through automation
- Leverage existing IT investments
- Cleanse your information
- IBM Information Server supports information quality and consistency by standardizing, validating, matching, and merging data. It can certify and enrich common data elements, use trusted data such as postal records for name and address information, and match records across or within data sources. IBM Information Server allows a single record to survive from the best information across sources for each unique entity, helping you to create a single, comprehensive, and accurate view of information across source systems.
- Transform your data into information
- IBM Information Server transforms and enriches information to ensure that it is in the proper context for new uses. Hundreds of prebuilt transformation functions combine, restructure, and aggregate information. Transformation functionality is broad and flexible, to meet the requirements of varied integration scenarios. For example, IBM Information Server provides inline validation and transformation of complex data types such as U.S. Health Insurance Portability and Accountability Act (HIPAA), and high-speed joins and sorts of heterogeneous data. IBM Information Server also provides high-volume, complex data transformation and movement functionality that can be used for standalone extract-transform-load (ETL) scenarios, or as a real-time data processing engine for applications or processes.
- Deliver your information
- IBM Information Server provides the ability to virtualize, synchronize, or move information to the people, processes, or applications that need it. Information can be delivered by using federation or time-based or event-based processing, moved in large bulk volumes from location to location, or accessed in place when it cannot be consolidated. IBM Information Server provides direct, native access to a wide variety of information sources, both mainframe and distributed. It provides access to databases, files, services and packaged applications, and to content repositories and collaboration systems. Companion products allow high-speed replication, synchronization and distribution across databases, change data capture, and event-based publishing of information.
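To make the standardize, match, and survive steps described under "Cleanse your information" more concrete, the following Python sketch shows the general idea on a few in-memory records. It is not DataStage or IBM Information Server code and does not use any IBM API. The record layout, the rule that records sharing an email address describe the same entity, and the rule that the most recent non-empty value survives are all assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical records from two source systems; every field name is illustrative.
records = [
    {"source": "crm", "name": " Jane  DOE ", "email": "JANE.DOE@EXAMPLE.COM",
     "phone": "", "updated": "2023-01-10"},
    {"source": "billing", "name": "Jane Doe", "email": "jane.doe@example.com",
     "phone": "555-0100", "updated": "2023-03-02"},
    {"source": "crm", "name": "Raj Patel", "email": "raj@example.com",
     "phone": "555-0199", "updated": "2023-02-14"},
]


def standardize(record):
    """Normalize whitespace and letter case so equivalent values compare equal."""
    clean = dict(record)
    clean["name"] = " ".join(record["name"].split()).title()
    clean["email"] = record["email"].strip().lower()
    clean["phone"] = record["phone"].strip()
    return clean


def match_key(record):
    """Assumed matching rule: the same email address means the same entity."""
    return record["email"]


def survive(group):
    """Build one surviving record from the most recent non-empty value per field."""
    # ISO-formatted dates sort correctly as plain strings.
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    survivor = {}
    for field in ("name", "email", "phone"):
        survivor[field] = next((r[field] for r in newest_first if r[field]), "")
    return survivor


def cleanse(raw_records):
    """Standardize every record, group matches, and keep one survivor per group."""
    groups = defaultdict(list)
    for record in map(standardize, raw_records):
        groups[match_key(record)].append(record)
    return [survive(group) for group in groups.values()]


golden = cleanse(records)
for row in golden:
    print(row)
```

In this sketch the two Jane Doe records collapse into a single record that takes its phone number from the billing system, which is the kind of "best information across sources" outcome the text describes. A production matching rule would typically be fuzzier than an exact email comparison.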