I make no secret of the fact that I think peer-to-peer computing, as perceived by the PC community, is a retrograde step and another recipe for unnecessary problems. That criticism, however, applies to the idea of trying to exploit all the redundant computing resources that we have foolishly been persuaded to buy. The application of the same concept to servers, so-called "grid computing", deserves a lot of serious consideration.
The Web has created a growing desire to share and access data on a global scale, and with it an ever-increasing demand for a single pool of data which any application can reach. It is beyond the bounds of possibility to implement this with a single physical system, so it must be a virtual concept. In other words, the applications don’t need to know where the data is actually stored. The principle of browsing, using URLs to move from one Web server to another without logging on and off, is an excellent starting point, but it cannot combine information on multiple servers into a single virtual system. This problem was addressed many years ago by the relational database community, when the concept of "enterprise application integration" was in its infancy. To provide a seamless distributed database is not a trivial task, and even today there are significant limitations, largely created by the need to move data sets transiently across networks and to guarantee consistency in the event of a failure in a single machine.
Given the limitations of the relatively well-defined relational database model, the current demands for data, text and graphics, not to mention multimedia, represent a significantly greater challenge. This appears to have completely passed over the heads of the PC peer-to-peer campaigners, but the more sober server fraternity is not completely free of underestimating the problems either. As we have seen with the development of storage technologies such as NAS and SAN, there is still a naive ignorance of the difference between physical storage, file systems and databases. It is the data that needs to be shared rather than the storage. In other words, the distributed database model is the only practical way to meet the demand for more and more on-line data. That is not to say that NAS and the like are valueless, far from it, because they are lower-level technologies needed to assist the database engines.
While most of the storage developments are dedicated to data processing and data warehouse applications, they are equally applicable to Web-based systems, as long as these are implemented as a virtual subsystem and not a random mess. This, then, is the incentive behind grid computing: to exploit the technologies developed for current business applications in the new world of e-commerce.
It is worth noting why a single physical system is not an option. First, the size of the predicted Web server requirements is beyond what any hardware system development can deliver. We are talking about thousands of mainframes (as opposed to millions of PCs). Then there is the problem of ownership and the not insignificant impact of customer loyalty. A long-standing HP customer will not mind combining in some way with someone else using Sun or IBM equipment, but they won’t be willing to change supplier. Remember that this will be an evolution and not a revolution, so it will be a multi-vendor development. Although Microsoft would wish it to be all .NET based, they too will have to join the fold. Then there is the problem of the network. This exists at two levels: the very high-speed network needed to connect the servers themselves and the one serving the users. The latter can be much slower per user, but there will be a huge number of users. A single server would result in an impossible bottleneck, a problem of distribution already successfully tackled by IBM with their Olympic Games systems.
But the real nightmare we must face up to is management and security, or the lack of it! While the mainframe and Unix systems have been relatively free from hacking, Windows systems are proving a hacker’s joy. Microsoft claims that it will resolve these problems in the now-delayed release of .NET server, but don’t hold your breath. The problem lies at the very heart of the NT software model. And while Microsoft has the soft underbelly so loved by hackers, the rest are by no means perfect.
If grid computing is to succeed, cooperation and standards are essential, and these are areas in which the IT industry has an appalling track record!
Martin Healey is a pioneer in the development of Intel-based computers and client/server architecture. He is a director of a number of IT specialist companies and an Emeritus Professor of the University of Wales.