Guidelines:
Universal Data Access (UDA) Design
This material is provided courtesy of Applied Information
Sciences, Inc.

References 
http://www.microsoft.com/data
http://msdn.microsoft.com/library/psdk/dasdk/choo0wvn.htm
Summary 
Microsoft® Universal Data Access (UDA) technology provides
standardized access to a variety of relational and non-relational information
sources. The programming interfaces defined by UDA are tool- and
language-independent and support the ability to integrate data from diverse data
sources. The primary deployment artifact for UDA technologies is Microsoft®
Data Access Components (MDAC). MDAC can be deployed in a variety of
architectures, including desktop applications, two-tier client/server
applications and distributed, n-layer/n-tier applications. MDAC includes the
following data access Application Programming Interfaces (APIs):
- Open Database Connectivity (ODBC)
- OLE DB
- ActiveX Data Objects (ADO)
These guidelines will first describe the characteristics and
features of each of the data access APIs and then describe optimization
techniques for each. Next, scenarios will be described to demonstrate when and
where each of the UDA technologies should be applied. Finally, a discussion on
how UDA technologies fit into the application design process is presented.
Each of the UDA technologies exposes a different API and a
different set of features, for different types of environments and applications.
Each of the UDA technologies can be used in conjunction with the others in a
layered approach. This means that the API selection can be based upon the
features of the API and not on whether or not an OLE DB provider or ODBC driver
has been written for a particular data source. The figure below depicts the
relationships between the different UDA technologies.

UDA Technologies
As shown in the figure, OLE DB providers and ODBC drivers are
the UDA components that directly access data sources. The ODBC Manager, which is
an MDAC component, exposes the ODBC API to ODBC clients. To make ODBC data
sources available to OLE DB consumers, Microsoft provides another MDAC component
called the OLE DB provider for ODBC. This component translates between the OLE
DB API and the ODBC API, playing the role of an ODBC manager to an ODBC driver.
At the highest level is ADO. ADO does not directly access any data sources, but
instead uses OLE DB providers to access data. The following sections describe
the characteristics and features of ODBC, OLE DB and ADO.
ODBC is a widely supported application programming interface
(API) for data access. It is based on the Call-Level Interface (CLI)
specifications from X/Open and ISO/IEC for database APIs and uses Structured
Query Language (SQL) as its primary database access language. ODBC includes the
following features and characteristics:
- C language-based API
- Large number of drivers available for a wide variety of data sources
- Well suited for accessing data from relational data sources
- Supports both SQL- and procedure-based data access techniques
- Supported on non-Windows platforms
OLE DB is a Component Object Model (COM)-based data access API
that was developed after ODBC. OLE DB was designed to not only provide access to
relational data, but also to non-relational data in an efficient manner. Because
of its COM-based interface, the API has easily been evolved and enhanced to add
new features such as Online Analytical Processing (OLAP) and data mining
support. OLE DB includes the following features and characteristics:
- COM interface-based API
- Efficient access to both relational and non-relational data
- Extensible design that has accommodated new features such as OLAP and data mining
ADO is a high-level interface that exposes OLE DB provider data to
OLE automation clients. It was designed to simplify earlier data access APIs
such as Data Access Objects (DAO) and Remote Data Objects (RDO) by flattening
their hierarchical programming models. It also supports advanced features such
as disconnected recordsets and is accessible to all automation-compliant
languages such as VBScript, JScript and Visual Basic. ADO includes the following
features and characteristics:
- High-level automation interface over OLE DB providers
- Flattened, non-hierarchical programming model
- Support for advanced features such as disconnected recordsets
- Accessible from all automation-compliant languages
The following sections describe optimization techniques that can
be applied to all of the UDA-based technologies.
Typically, one of the most expensive resources in gaining access
to the information contained in a data source is the connection to the data
source itself. This is particularly true when the consumer of the data is
located on a different machine than the data source itself. When designing
applications, the number of connections to a data source should be kept to a
minimum and the duration that the connection to the data source is kept open
should also be minimized. This can be achieved by using the following guidelines
when building data access applications:
- Perform as much work as possible before the connection is opened
- Minimize the amount of work that is performed while the connection is open
- Whenever possible, reuse connections
This approach, when used in conjunction with the following
guidelines on connection and session pooling, maximizes connection usage.
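The open-late/close-early guidelines above can be sketched as follows. This is an illustrative example in Python using the standard sqlite3 module as a stand-in for any UDA data source; the `load_orders` function and its table are hypothetical, not part of MDAC.

```python
import sqlite3

def load_orders(rows):
    # Work performed *before* the connection is opened: build the batch.
    params = [(order_id, total) for order_id, total in rows]

    con = sqlite3.connect(":memory:")   # connection opened as late as possible
    try:
        # Minimal work while the connection is open.
        con.execute("CREATE TABLE orders (id INTEGER, total REAL)")
        con.executemany("INSERT INTO orders VALUES (?, ?)", params)
        count = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    finally:
        con.close()                     # connection released as early as possible
    return count

print(load_orders([(1, 9.99), (2, 25.00)]))   # → 2
```

The same shape applies regardless of API: batch preparation and result processing happen outside the open window, so the connection (or a pooled one, below) is tied up for the shortest possible time.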
The ODBC Manager and the OLE DB service components provide
connection pooling and session pooling, respectively. Connection/session pooling
is a management service that pools open connections/sessions to data sources.
When an application requests a new connection to a data source, the
connection/session manager examines the pool of currently connected but
unallocated connections. The security and other properties of the connection are
respected: a pooled connection is reused only if all of its properties match
those of the request.
Pooling is intended to improve the performance of connecting and
disconnecting in large n-tier applications by keeping a pool of live connections
open on the middle tier for a short period of time and then reusing them for new
connection requests. This allows middle-tier applications that rely on quick,
individual database actions to share connections rather than having to create
and release connections for each action. Note that the techniques described in
the previous Connection Usage section maximize the effectiveness of connection
and session pooling.
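The property-matching rule for pooled connections can be illustrated with a toy pool. This Python sketch is not the ODBC Manager or the OLE DB service components; the `ConnectionPool` class and its property names are hypothetical, and sqlite3 again stands in for the data source.

```python
import sqlite3

class ConnectionPool:
    """Toy pool: a live connection is reused only when every
    requested property matches the properties it was opened with."""

    def __init__(self):
        self._idle = []   # list of (properties, connection) pairs

    def acquire(self, **properties):
        for i, (props, con) in enumerate(self._idle):
            if props == properties:          # all properties must match
                del self._idle[i]
                return con
        # No matching idle connection: open a new physical one.
        return sqlite3.connect(properties.get("database", ":memory:"))

    def release(self, con, **properties):
        self._idle.append((properties, con)) # keep it live for reuse

pool = ConnectionPool()
c1 = pool.acquire(database=":memory:", user="app")
pool.release(c1, database=":memory:", user="app")
c2 = pool.acquire(database=":memory:", user="app")    # same properties: reused
c3 = pool.acquire(database=":memory:", user="admin")  # different user: new connection
print(c1 is c2, c1 is c3)   # → True False
```

Note how a request under different credentials never receives another user's pooled connection, which is the security behavior described above.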
Efficient record set management defines how the cursor or record
set of data that results from a query is created and managed. Two considerations
should be taken into account when creating record sets:
- Use client-side cursors to maximize performance. There are
only a few scenarios, such as retrieving a very large data set, in which it
makes sense to use server-side cursors; otherwise, client-side cursors make the
most efficient use of network bandwidth and machine resources. Additionally,
when using ADO as the access API, whenever the record set is used in a read-only
scenario, or needs to be kept for a long period, it should be disconnected from
the connection. The connection can then be closed, returning it to the
connection pool for reuse by other threads of the application.
- Consider the type of cursor that is opened. The most
efficient type of cursor is a forward-only, read-only cursor. If it is
necessary to update data, it is typically most efficient to retrieve the data
using a forward-only, read-only cursor and then issue an SQL statement to
perform the necessary updates. Use other types of cursors, such as dynamic,
keyset or static, only when absolutely necessary.
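The read-with-a-forward-only-cursor-then-update-with-SQL pattern from the second bullet can be sketched as follows. This is an illustrative Python example over sqlite3 (whose cursors are naturally forward-only); the table and column names are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE products (id INTEGER, price REAL)")
con.executemany("INSERT INTO products VALUES (?, ?)",
                [(1, 10.0), (2, 20.0)])

# Step 1: a single forward-only, read-only pass over the result set
# to decide which rows need changing.
to_raise = [row[0]
            for row in con.execute("SELECT id, price FROM products")
            if row[1] < 15.0]

# Step 2: apply the changes with explicit UPDATE statements rather
# than through an updatable (dynamic/keyset) cursor.
con.executemany("UPDATE products SET price = price * 1.1 WHERE id = ?",
                [(pid,) for pid in to_raise])

print(round(con.execute("SELECT price FROM products WHERE id = 1")
               .fetchone()[0], 2))   # → 11.0
```

Keeping the read pass forward-only and read-only avoids the bookkeeping a scrollable or updatable cursor would require on the client or server.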
All of the UDA-based APIs support some form of asynchronous
execution. The typical pattern is to initiate a data access request using an
asynchronous API call and to check back for completion status at a later
time. In a well-designed, multi-threaded application, the thread of execution
that initiates the data access request can perform meaningful work while
it, or another thread, polls for the request result.
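The initiate-then-poll pattern can be sketched as below. This is an illustrative Python example, not a UDA API: a worker thread stands in for the data source, and the 0.1-second sleep is a stand-in for query latency. (In ADO, the rough equivalent would be executing with the adAsyncExecute option and checking the State property.)

```python
import threading
import time

result = {"done": False, "rows": None}

def run_query():
    time.sleep(0.1)                      # stands in for query latency
    result["rows"] = [("order", 1), ("order", 2)]
    result["done"] = True

# Initiate the request asynchronously.
worker = threading.Thread(target=run_query)
worker.start()

other_work = 0
while not result["done"]:                # poll for completion status...
    other_work += 1                      # ...while performing meaningful work
    time.sleep(0.01)
worker.join()

print(len(result["rows"]))               # → 2
```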
Regardless of the data source being accessed, if stored
procedures are available, they should typically be used instead of standard SQL
statements. In most databases, stored procedures go through the parsing and
normalization steps at creation time that would otherwise occur at runtime for
standard SQL statements. Another advantage of stored procedures is that client
requests use less network bandwidth than equivalent SQL statements sent to
the server, because a procedure call is inherently less verbose.
Stored procedures can provide even greater performance gains
when they efficiently leverage database-processing power. This typically
involves minimizing the amount of data that is returned to the UDA client by
performing ‘appropriate’ levels of processing in the stored procedure.
Balancing the amount of work that is performed within the stored procedure,
against the amount of work that is performed in a UDA client needs to be done on
a case-by-case basis to achieve maximum performance.
Several other optimization techniques should be taken into
consideration when designing UDA-based applications.
- When an application uses ADO or OLE DB as its UDA API,
use native OLE DB providers (instead of the OLE DB provider for ODBC in
conjunction with an ODBC driver). Native OLE DB providers are typically
faster and have a smaller memory and disk footprint than the OLE DB provider
for ODBC layered over an ODBC driver.
- When using Visual Basic with ADO, declare all variables
(including objects) and strongly type them. This prevents Visual Basic from
creating variables as Variants, which are inefficient.
- Minimize the amount of data queried by restricting the
number of columns in the result set. For example, instead of using the SQL
command “SELECT * FROM”, select only the columns that are required by the
query.
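The column-restriction guideline in the last bullet can be demonstrated concretely. This is an illustrative Python example over sqlite3; the `customers` table and its columns are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (id INTEGER, name TEXT, "
            "address TEXT, notes TEXT)")
con.execute("INSERT INTO customers VALUES (1, 'Ada', '1 Main St', 'n/a')")

# SELECT * returns every column, whether the consumer needs it or not.
wide = con.execute("SELECT * FROM customers").fetchone()

# Naming only the required columns narrows the result set.
narrow = con.execute("SELECT id, name FROM customers").fetchone()

print(len(wide), len(narrow))   # → 4 2
```

With wide tables or large row counts, the difference between the two result-set shapes translates directly into network bandwidth and client memory.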
Each of the UDA technologies (ODBC, OLE DB and ADO) provides
features that make it well suited for certain architectures. The following
sections describe when and where to use each of the technologies.
Two-tier applications combine the presentation, business logic
and data layers of an application on a single machine, while the data services
portion of the application resides on another machine. The data layer of the
application directly accesses the data source using one of the UDA technologies.
Legacy, non-COM applications should utilize ODBC as the data access API because
this is the simplest and most efficient method available to them. Also,
applications that may be ported to non-Windows platforms should utilize ODBC
because ODBC is supported on numerous non-Windows platforms. Applications
written in C++ can use either OLE DB or ADO. If a very low level of control over
the data access portion of the application is required, then OLE DB is
preferable. If the presentation or business logic layers of the application are
written in a language that does not support the COM-interface exposed by OLE DB,
then ADO should be used to simplify the transfer of data to the higher layers.
Finally, applications written in languages that do not support COM
interface-based programming should utilize ADO to access data.
N-Tier, thick Web client applications are typified by their use
of HTML/Dynamic HTML (DHTML)/eXtensible Markup Language (XML), client-side
scripting, ActiveX controls and other browser-specific features. Internet and
intranet sites that can control or dictate the type of browser used to access
their content will construct presentation tiers using this client type. This
style of application can take advantage of ADO’s Remote Data
Service (RDS). RDS provides a mechanism for ActiveX-enabled browsers such as
Microsoft Internet Explorer® to directly access OLE DB accessible data sources
across the Internet and through firewalls. RDS can also be used to instantiate
COM-based business objects and transfer data to and from those COM business
objects using ADO recordsets. RDS supports client-side caching of result sets
and provides automatic binding with data-aware ActiveX controls and DHTML
elements. RDS, scripting and DHTML can be used to minimize the number of
round-trips to the Web server for validation, data retrieval and data updates.
It should be noted that much of this type of functionality is also available in
the XML Document Object Model (XMLDOM) that is part of Microsoft Internet
Explorer versions 5.0 and higher. However, this technology is not addressed in
these guidelines because it is not part of the current UDA framework.
The backend tiers of an n-tier, thick Web client application can
be built using any of the UDA technologies. However, when ADO and RDS are being
used to transfer disconnected recordsets to the thick Web client application,
the best API to use on the backend is ADO. Although ADO recordsets can be
manually constructed from ODBC or OLE DB data sources, it is simpler and more
efficient to use ADO to perform the recordset creation step.
The other types of n-tier applications such as rich native
client and thin Web client applications have common design characteristics. Both
types of applications can use any of the UDA APIs. ODBC would only be used in
cases where existing ODBC drivers could not be used with the OLE DB provider for
ODBC. Refer to the appropriate Microsoft documentation for limitations of the
OLE DB provider for ODBC. The choice of OLE DB vs. ADO is based upon the same
criteria as defined for other architectural models.
Applications written in C++ can use either OLE DB or ADO. If a
very low level of control over the data access portion of the application is
required, then OLE DB is preferable. If the presentation or business logic
layers of the application are written in a language that does not support
COM-interface programming, then ADO should be used to simplify the transfer of
data to the higher layers. Applications written in languages such as Visual
Basic, VBScript and JavaScript, that do not support the COM interfaces exposed
by OLE DB, should utilize ADO to access data.
In the standard three-layer logical architecture model for
Windows DNA-based applications, UDA technologies can be used in the design of
data access components in the business logic or data layers. Alternatively, in
n-layer logical architecture models, the data access components that utilize the
APIs provided by the UDA technologies are often grouped into a separate data
access layer that is between the business logic and data layers. See the Guidelines:
Windows DNA Sample Architecture Patterns and Implementation Mechanisms for
more information on logical architecture models.
In the Analysis and
Design discipline, during the Workflow
Detail: Analyze Behavior, the designer defines Unified Modeling Language
(UML) interaction diagrams for modeling the dynamic behavior of the objects that
participate in a use case. The designer should consider using the «entity»
analysis class stereotype for modeling the objects and classes that will be
implemented in the data access components. The «entity»
analysis class is used to model classes that have responsibilities for storing
and managing information in the system. Refer to Guidelines:
Analysis Class for more information on the «entity»
analysis class stereotype.
For Windows DNA-based applications developed using Microsoft
Visual Basic®, the designer may alternatively use a pre-defined, Windows
DNA-specific class stereotype in Rational Rose to model the data access classes.
This stereotype is the «ADO Class»
stereotype. When the designer creates a class in Rational Rose with the
stereotype «ADO Class», Rational
Rose automatically creates several commonly used properties and methods that
facilitate the use of ADO services. Refer to the Rational Rose help topic “Rose
Visual Basic” for more information on using Visual Basic specific stereotypes.
Copyright
© 1987 - 2001 Rational Software Corporation