| Feature Story |
To all those up to their elbows in creating a data warehouse, Electronic Data Systems Corp. says "Stop right there."
The Plano, Tex., vendor says there is a simpler, cheaper, and more logical way for companies to get a handle on their fragmented stores of information. You don't have to duplicate data from your subsystems into a supersystem if you can access the subsystems, EDS argues, saying that it will allow companies to share data among their various systems, however incompatible they seem. In synthesizing information for reporting purposes, EDS says it will filter redundant information and dynamically rectify conflicting information.
To accept the argument puts not just clients--who can wave goodbye to their investments to date--but EDS itself in an awkward position. One of its forty-odd divisions has just proposed a successor to the data warehouse, while another offers data warehousing.
System 21, the warehouse challenger, comprises a basic presentation tool on the front end and EDS's integration skills on the back end. EDS calls it a "virtual Web warehouse" because it offers the same universal access as a Web warehouse, the latest in warehousing (see glossary). System 21 clients don't need to have an intranet in place, but EDS plans to make intranet access an option.
If one accepts EDS's premise, its package is appealing: outsource the grief and pay under $4 million total, when warehouses can run up to $25 million and take years to construct. (Mentis Corp., a Durham, N.C., firm that conducts annual surveys on warehousing, says the initial outlay for a warehouse typically is at least $3 million.)
Strange though it seems, EDS says its warehouse alternative is winning strong support from those currently creating warehouses. True, warehouses are notoriously complex, and both vendors and consultants spoke of clients who sought their help after abandoning in-house efforts.
"I have had one customer indicate that they would be willing to pay $15 million for System 21," said Larry Walker, director of EDS's mortgage division, adding that the client is a top-ten mortgage lender with a warehouse already under way.
Walker spoke to ABA BJ just before System 21's recent unveiling at the annual Mortgage Bankers Association convention. He explained that EDS targeted the mortgage lending area first because it is "the most data-rich" section of a financial institution. The information provided by data warehouses can be used for various ends, but a common goal is better marketing through knowing customers better.
Two developmental streams
Although arguably the financial sector with the most to gain from warehousing, the mortgage industry has been slow to adopt it.
Likewise, although community banks risk losing their strongest suit, the customer relationship, sources are hard pressed to name one with a warehouse. Bill Gilbert, a research associate at Mentis Corp., said, "I know of no successful community bank implementation."
Everyone says data warehousing is mushrooming in banking, but on closer examination, two patterns are emerging: one for the big banks and another for mortgage and community banks alike. Spending estimates suggest bigger banks have data warehouses under construction while smaller institutions are only now getting in on the act. The Boston-based Tower Group estimates total bank spending on warehousing will rise to $1.75 billion next year, from $1.1 billion last year. However, when one uses $1 billion in asset size as a dividing line, as Mentis did, spending by the larger institutions is seen to be declining while among the smaller institutions it is rising.
As banking's technological followers start building warehouses, its technological leaders are skirting the next hurdle: data mining (see sidebar). Spending on data mining will triple to $300 million next year, from $100 million last year, according to Aaron Zorns, vice-chairman of the San Francisco-based Meta Group.
Down at the company level, the picture is less clear. What, for instance, is the asset-size threshold for an institution to be able to build a data warehouse? Estimates ranged from a quarter of a billion dollars to $10 billion.
Asking what it costs to build a data warehouse is, in fairness, like asking, "How long is a piece of string?" The answer, "somewhere between a half-million dollars and $25 million," reflects factors such as the amount of data and how long it is to be kept.
Walter Sauls, a data warehouse consultant with Systems Techniques Inc., Atlanta, suggests a data warehouse developed for a single department runs $400,000 to $600,000, a warehouse for a small bank (or a large department in a large bank) runs $800,000 to $1.5 million, and an enterprisewide warehouse for a large bank runs $10 million to $15 million.
He adds that it takes a minimum of six months to build a warehouse, though nine months to a year is likely. Big banks often report taking close to two years.
Cost/benefit analysis proves thorny
What rewards do these investments reap? Consultants and vendors talk of recouping the investment within six months to three years, with paybacks ranging from 100% to 700%. Banks rarely talk, either to protect a "competitive advantage" or because they haven't quantified the benefits. The starkest reputed success--$29 million in annual savings by Chase Bank, as cited by the Data Warehousing Institute, a Gaithersburg, Md., educational group--was denied by its subject. Chase said it didn't know how that number, attributed to its credit card warehouse, was derived. However, a spokesperson said the bank expects to save between $12 million and $15 million in previous costs once it creates a combined warehouse for both Chase's and Chemical's credit card businesses.
"Trying to quantify the benefits of a warehouse is like asking someone in a plane are they going faster than they were when they were in an automobile," says Walter Sauls. "They say, 'Oh yeah, much faster,' but they have no concept of how much faster."
Despite the intangible benefits, Sauls predicts community banks will start to go for warehousing, particularly outsourced solutions. Other sources agreed outsourcing is a good option for community banks.
In fact, across the whole banking spectrum, Mentis Corp. found just one-fifth attempting to build a warehouse without any outside help. Today, consultants are increasingly used, Mentis adds.
Even though Provident Bank of Maryland, at $3 billion in assets, exceeds the conventional size definition of a community bank, it did not consider building its own warehouse and, until recently, regarded even outsourcing as too costly.
Limitations in communications bandwidth until relatively recently made it impossible for a $3 billion institution like Provident to outsource data warehousing, John King, chief information officer, explained. Provident is in Baltimore and the firm to which it outsources data processing, M&I Data Services, is near Milwaukee. "Some years ago M&I had a (data-warehousing type) package, and we couldn't use it," King explained. Ironically, at that time only small banks, with less data to shift, could afford an outsourced warehouse, he said.
Provident has been saving in excess of $154,500 a year since it began using one of M&I's warehouse services. M&I's reporting software, called Informatter, merges information from separate transaction systems to produce reports, which previously required employees to compare print-outs from each system. Reduced paper and printing costs alone (labor savings have not yet been estimated) are saving Provident $515 a year on each of its 300 reports.
Provident has not yet recouped its cost of warehousing, King said. However, he expects even greater results from another report Provident plans to use.
"We plan to get heavily into self-service alternatives for the customer," King says, explaining that by knowing who prefers the ATM to the teller, for instance, Provident can market home-banking to the more technically inclined.
The evolution of data warehousing
Rick Roy, vice president with M&I, explained that the provision of transaction-level detail is a key distinction between the data warehouse and its predecessor, the customer information file. These files, which came into use about 10 years ago, were the first recognition of the marketing value of information stored in customers' different accounts, he said. Roy agreed that a warehouse might more aptly be described as a library. "You might say a data warehouse is like a library with a turbo-charged catalog," he said.
One M&I customer has been able to reduce its marketing costs by $300,000 following a $150,000 investment in data warehousing, Roy said. He said he was not at liberty to name the $3.5 billion institution. Nor could Roy say what the minimum cost of outsourcing is; outsourcing involves a third party treating and merging data provided by the client institution.
In the year since M&I, a 30-year-old firm, counted itself in the data warehousing business, warehousing has grown to probably 12% of its revenues. M&I now has 180 clients getting into warehousing, Roy said, adding that they range in asset size from $200 million to $50 billion.
Those who build a warehouse may use a mainframe to run the database, though parallel processors are more common. (Using a number of linked processors speeds up response times.) These may be massively parallel processors (MPP) or symmetric multiprocessors (known as SMP). Mentis Corp.'s research associate notes, "MPP historically has meant a bigger investment." A large warehouse using MPP might cost $15 million versus $1 million to $2 million for its peer using SMP, Gilbert said.
Middleware, a type of software that helps to span highly incompatible systems, also may be used. (See "OLAP and other warehouse jargon" for the types of software required, both to create and utilize the warehouse.)
The environmentalists' motto, "Think globally, act locally," is also recommended to those embarking on a warehouse. Since standardization of information is a key rationale for a warehouse, inadequate co-ordination of participants can be its downfall. Experts also emphasize that what goes into the warehouse (during "data modelling") is a business decision, not a technical decision.
Experts also underscore metadata, which might be thought of as an index to how the data is stored. Again the emphasis reflects the mistakes of early implementors: those supposed to use the warehouse didn't know where to find the information they needed.
Citibank's construction of a warehouse for risk management gives some flavor of the process. When the New York bank set out to combine data from 70 systems, reflecting its operations in 94 countries, it turned to five vendors that sometimes work together. Data modelling was handled by Logic Works, of Princeton, N.J. Then data extraction and conversion was done by Prism Solutions Inc., Sunnyvale, Calif. Indexing was also done by Prism. Another aspect of metadata, monitoring warehouse use, is done by Hewlett Packard Inc., Cupertino, Calif. Database management is done by Sybase Inc. of Emeryville, Calif.
Gail Port, Citibank's senior technology person on the project, and now an independent consultant, said the bank hopes the warehouse will be live by the first quarter of next year. "The goal is to make information on our total exposure available on a nightly basis," Port said. In the event of an earthquake, for example, she said, "we have risk we didn't have an hour ago."
She emphasized the need for a strong mandate from senior management since a standardized approach is key to the success of a warehouse, but different areas within a company may resist changing their ways unless someone can impose standards.
Warehouses are very much works in progress. Take Norwest Corp., for example. Its first warehousing application--servicing risk management--is to come from its mortgage subsidiary, early next year. Earlier this year, Norwest Mortgage acquired Prudential Home Mortgage Corp., in part, it said, because Pru had a warehouse. Norwest's fledgling warehouse was combined with Pru's and now that mortgage warehouse is being supplemented with data from its parent bank...and so it goes.
To add to the confusion, there's a counter-trend toward data marts. Warehouses are all about centralizing information while marts are about decentralizing: spinning information back to communities of end users. If it sounds like going back to square one--fragmented systems--it sometimes is in practice, although in principle marts should conform to a common standard.
When even giants like Citibank and Norwest aren't done with warehousing, it's clearly coming more slowly than many reports suggest. One survey reports that 70% of the top 100 banks have a data warehousing project "underway," while surveys released in 1995 by Mentis and 1996 by Ernst & Young concurred that only about one-tenth of banks had by then completed warehouses. Moreover, Mentis found more than half of its 600 respondents ignorant of the data warehousing concept as of 1994.
Which brings us back to EDS and its new warehouse without a warehouse. Will the concept slow the spread of warehouses? Experts found EDS's promise too good to be true. EDS's Walker admits that he cannot imagine banks committing all their information to an untried concept, but he does see them willing to use System 21 on the marketing side. "It probably is too good to be true," Walker says, "but I haven't met anybody who didn't like it." Mentis' expert, Gilbert, says, "it has a lot of risk in it--not just security of information, but that you would miss a lot of stuff." Nonetheless, he said this type of approach currently is being addressed in Mentis' 1996 survey on data warehousing.
OLAP and Other Warehouse Jargon
Data Mining--A phrase that is becoming a catch-all for extracting useful information from raw data. Strictly speaking, it means using artificial intelligence to search a database for patterns. The idea is that jettisoning the presumptions that shape a human being's data queries leads to new discoveries.
Data Warehouse--A huge database containing a honed set of data copied from transaction systems (DDA, mortgage loans, etc.). The data is standardized and "scrubbed" to free it of error, inconsistency, omission, and redundancy, before transferring it to the warehouse.
Data Warehousing--Duplicating information contained in a company's subsystems and pooling it to create a unified view of the company's operations. The scope of the supersystem determines whether it is a data warehouse (enterprisewide) or a data mart (limited to one department or function).
Metadata--Data about the data. It serves as an index to how information is stored and it monitors warehouse use.
OLAP--Data mining is a type of On-Line Analytical Processing, a technique which allows multi-part questions to be posed of the database. Instead of a report on revenue by branch (think of a spreadsheet grid), OLAP, which is also known as multi-dimensional processing, might report revenue by branch, subdivided by product lines and by region.
Query Tools--These off-the-shelf software offerings democratized the databases. Formerly only those trained in structured query language (SQL), the language of databases, could run database reports. Now nonprogrammers can access the database using plain English. Since not all reporting falls to programmers, reports that previously took weeks get produced in hours. Query tools handle queries too basic for OLAP.
Web Warehouse--A data warehouse accessed on an intranet (though Internet access may be available also). The appeal is easier implementation through use of a standard that cuts across hardware platforms and training barriers.
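
For readers who want to see the OLAP entry's distinction in concrete terms, the following is a minimal Python sketch, not any vendor's product. It contrasts a flat revenue-by-branch report with the same revenue broken out by region, branch, and product line. The branch names, regions, product lines, and dollar figures are invented for illustration, and the rollup function is a hypothetical helper rather than part of any OLAP tool.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (branch, region, product_line, revenue).
# All names and figures are made up for illustration.
ROWS = [
    ("Main St", "East", "Mortgage", 120.0),
    ("Main St", "East", "Checking", 30.0),
    ("Harbor", "East", "Mortgage", 80.0),
    ("Lakeside", "West", "Mortgage", 50.0),
    ("Lakeside", "West", "Checking", 20.0),
]
FIELDS = ("branch", "region", "product_line")


def rollup(rows, dims):
    """Sum revenue grouped by the named dimensions (hypothetical helper)."""
    positions = [FIELDS.index(d) for d in dims]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[3]  # revenue is the fourth column
    return dict(totals)


# The flat "spreadsheet grid" report: one total per branch.
flat = rollup(ROWS, ["branch"])

# The multi-dimensional report: revenue by region, branch, and product line.
cube = rollup(ROWS, ["region", "branch", "product_line"])

for key, total in sorted(cube.items()):
    print(" / ".join(key), total)
```

The same rows answer both questions; only the list of dimensions changes. That is the sense in which OLAP is called multi-dimensional, though commercial tools typically precompute and index such totals rather than summing on demand as this sketch does.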
