We present an architectural approach to distributed and parallel programming using the C++ language. Particular attention is paid to how the C++ standard library, algorithms, and container classes behave in distributed and parallel environments. Methods for extending the C++ language through class libraries and function libraries to accomplish distributed and parallel programming tasks are explained. Emphasis is placed on how C++ works with the new POSIX and Single UNIX standards for multithreading. Combining C++ executables with other language executables to achieve multilingual solutions to distributed or parallel programming problems is also discussed. Several methods of organizing software that support parallel and distributed programming are introduced. We demonstrate how to remove the fundamental obstacles to concurrency. The notion of emergent parallelization is explored.
Our focus is not on optimization techniques, hardware specifics, performance comparisons, or on trying to apply parallel programming techniques to complex scientific or mathematical algorithms; rather, it is on how to structure computer programs and software systems to take advantage of opportunities for parallelization. Furthermore, we acquaint the reader with a multiparadigm approach to solving some of the problems that are inherent in distributed and parallel programming. Effective solutions to these problems often require a mix of several software design and engineering approaches. For instance, we deploy object-oriented programming techniques to tackle data race and synchronization problems. We use agent-oriented architectures to deal with multiprocess and multithread management. Blackboards are used to minimize communication issues. In addition to object-oriented, agent-oriented, and AI-oriented programming, we use parameterized programming to implement generalized algorithms that are suitable where concurrency is required. Our experience with the development of software of all sizes and shapes has led us to believe that successful software design and implementation demand versatility.
The suggestions, ideas, and solutions we present in this book reflect that experience.

The Challenges

There are three basic challenges to writing parallel or distributed programs:

1. Identifying the natural parallelism that occurs within the context of a problem domain.
2. Dividing the software appropriately into two or more tasks that can be performed at the same time to accomplish the required parallelism.
3. Coordinating those tasks so that the software correctly and efficiently does what it is supposed to do.

These three challenges are accompanied by the following obstacles to concurrency:

Data race
Deadlock detection
Partial failure
Latency
Deadlock
Communication failures
Termination detection
Lack of global state
Multiple clock problem
Protocol mismatch
Localized errors
Lack of centralized resource allocation

This book explains what these obstacles are, why they occur, and how they can be managed. Finally, several of the mechanisms we use for concurrency rely on TCP/IP as a protocol, specifically the MPI (Message Passing Interface) library, the PVM (Parallel Virtual Machine) library, and the MICO (CORBA) library. This allows our approaches to be used in an Internet/intranet environment, which means that programs cooperating in parallel may be executing at different sites on the Internet or a corporate intranet and communicating through message passing.
Many of the ideas serve as foundations for the infrastructure of Web services. In addition to the MPI and PVM routines, the CORBA objects we use can communicate from different servers across the Internet. These components can be used to provide a variety of Internet/intranet services.

The Approach

We advocate a component approach to the challenges and obstacles found in distributed and parallel programming. Our primary objective is to use framework classes as building blocks for concurrency. The framework classes are supported by object-oriented mutexes, semaphores, pipes, and sockets. The complexity of task synchronization and communication is significantly reduced through the use of interface classes. We deploy agent-driven threads and processes to facilitate thread and process management.
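To give a flavor of what an object-oriented mutex can look like, the following is a minimal sketch that wraps a Pthreads mutex in a class and uses a scoped guard object so that a lock is always released. The names Mutex, ScopedLock, Counter, add_thousand, and run_counter_demo are hypothetical and chosen only for this illustration; they are not the framework classes developed in the book.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: owns a pthread_mutex_t for its whole lifetime.
class Mutex {
public:
    Mutex() { pthread_mutex_init(&handle_, nullptr); }
    ~Mutex() { pthread_mutex_destroy(&handle_); }
    Mutex(const Mutex&) = delete;
    Mutex& operator=(const Mutex&) = delete;
    void lock() { pthread_mutex_lock(&handle_); }
    void unlock() { pthread_mutex_unlock(&handle_); }
private:
    pthread_mutex_t handle_;
};

// Hypothetical guard: locks in the constructor, unlocks in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(Mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    Mutex& mutex_;
};

// Shared data that hides its own synchronization behind its interface.
class Counter {
public:
    void increment() { ScopedLock guard(mutex_); ++value_; }
    long value() { ScopedLock guard(mutex_); return value_; }
private:
    Mutex mutex_;
    long value_ = 0;
};

// Thread entry point with the C signature that pthread_create expects.
extern "C" void* add_thousand(void* arg) {
    Counter* counter = static_cast<Counter*>(arg);
    for (int i = 0; i < 1000; ++i) counter->increment();
    return nullptr;
}

// Starts thread_count threads that all update one Counter,
// waits for them to finish, and returns the final total.
long run_counter_demo(int thread_count) {
    Counter counter;
    std::vector<pthread_t> threads(static_cast<std::size_t>(thread_count));
    for (pthread_t& t : threads) {
        int rc = pthread_create(&t, nullptr, add_thousand, &counter);
        assert(rc == 0);  // sketch only: real code would report the error
        (void)rc;
    }
    for (pthread_t& t : threads) pthread_join(t, nullptr);
    return counter.value();
}
```

Because every access to the count goes through Counter, callers never touch the mutex directly, which is the sense in which object-oriented techniques can help contain data races. Calling run_counter_demo(4) should return 4000 regardless of how the threads interleave.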
Our primary approach to a global state and its related problems involves the use of blackboards. We combine agent-oriented and object-oriented architectures to accomplish multiparadigm solutions. Our multiparadigm approach is made possible by the support C++ has for object-oriented programming, parameterized programming, and structured programming.

Why C++?

There are C++ compilers available for virtually every platform and operating environment. ANSI (American National Standards Institute) and ISO (International Organization for Standardization) have defined standards for the C++ language and its library. There are robust open-source implementations as well as commercial implementations of the language. The language has been widely adopted by researchers, designers, and professional developers around the world. The C++ language has been used to solve problems of all sizes and shapes, from device drivers to large-scale industrial applications.
The language supports a multiparadigm approach to software development, and libraries that add parallel and distributed programming capabilities are readily available.

Libraries for Parallel and Distributed Programming

MPICH (an implementation of MPI), the PVM library, and the Pthreads (POSIX Threads) library are used to implement parallel programming using C++. MICO, a C++ implementation of the CORBA standard, is used to achieve distributed programming. The C++ Standard Library, in combination with CORBA and the Pthreads library, provides the support for the agent-oriented and blackboard programming concepts that are discussed in this book.

The New Single UNIX Specification Standard

The new Single UNIX Specification Standard, Version 3, a joint effort between IEEE and the Open Group, was finalized and released in December 2001. The new Single UNIX Specification encompasses the POSIX standards and promotes portability for application programmers. It was designed to give software developers a single set of APIs to be supported by every UNIX system. It provides a reliable road map of standards for programmers who need to write multitasking and multithreading applications.
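As one illustration of calling a Single UNIX Specification interface directly from C++, the following minimal sketch creates a child process with posix_spawn() and waits for it with waitpid(). It assumes a POSIX system that provides a shell at /bin/sh; the helper name run_shell_command is hypothetical and used only for this example.

```cpp
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

// The calling process's environment, passed through to the child unchanged.
extern "C" char** environ;

// Hypothetical helper: runs `/bin/sh -c <command>` in a child process.
// Returns the child's exit status, or -1 if the child could not be
// spawned or did not terminate normally.
int run_shell_command(const char* command) {
    // posix_spawn takes argv as char* const[], so the string
    // literals are cast; the child never modifies them.
    char* const argv[] = {
        const_cast<char*>("sh"),
        const_cast<char*>("-c"),
        const_cast<char*>(command),
        nullptr
    };

    pid_t child = 0;
    // Null file actions and attributes request the default behavior.
    int rc = posix_spawn(&child, "/bin/sh", nullptr, nullptr, argv, environ);
    if (rc != 0) return -1;

    int status = 0;
    if (waitpid(child, &status, 0) != child) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under these assumptions, run_shell_command("exit 7") should return 7, since the shell exits with the requested status and waitpid() reports it to the parent.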
In this book we rely on the Single UNIX Specification Standard for our discussions of process creation, process management, the Pthreads library, the new posix_spawn() routines, POSIX semaphores, and FIFOs. Appendix B in this book contains excerpts from the standard that can be used as a reference to the material that we present.

Who is This Book For?

This book is written for software designers, software developers, application programmers, researchers, educators, and students who need an introduction to parallel and distributed programming using the C++ language. A modest knowledge of the C++ language and standard C++ class libraries is required. This book is not intended as a tutorial on programming in C++ or object-oriented programming. It is assumed that the reader will have a basic understanding of object-oriented programming techniques such as encapsulation, inheritance, and polymorphism. This book introduces the basics of parallel and distributed programming in the context of C++.

Development Environments