There is strong biological and engineering evidence that the information-processing capability of NNs is determined by their architectures. Much work has been devoted to finding optimal or near-optimal NN architectures using various algorithms, including EAs [55]. However, many real-world problems are too large and too complex for any single ANN to solve in practice. Ample examples from both natural and artificial systems show that an integrated system consisting of several subsystems can reduce the total complexity of the system while still solving a difficult problem satisfactorily. NN ensembles adopt this divide-and-conquer strategy. Instead of using a single large network to solve a complex problem, an NN ensemble combines a set of ANNs that learn to decompose the problem into sub-problems and then solve them efficiently. An ANN ensemble offers several advantages over a monolithic ANN [15]. First, it can perform more complex tasks than any of its components (i.e., the individual ANNs in the ensemble). Second, it can make the overall system easier to understand and modify. Finally, it is more robust than a monolithic ANN and can show graceful performance degradation in situations where only a subset of the ANNs in the ensemble performs correctly. Many studies in statistics and ANNs have shown that ensembles, if designed appropriately, generalize better than any single individual in the ensemble. A theoretical account of why and when ensembles perform better than single individuals was presented in [52].
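As a minimal illustration of the basic ensemble idea (not the specific methods surveyed here), the sketch below trains several small networks independently on bootstrap samples and combines their outputs by averaging; the library choice (scikit-learn), dataset, ensemble size, and hyperparameters are illustrative assumptions.

```python
# Minimal ANN-ensemble sketch: train several small MLPs and average their outputs.
# All settings (dataset, 5 members, hidden size 8) are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=1000, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = []
for seed in range(5):
    # Each member is a small MLP; different seeds (initial weights and
    # bootstrap samples) encourage the members to make different errors.
    idx = np.random.default_rng(seed).choice(len(X_train), len(X_train), replace=True)
    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=seed)
    net.fit(X_train[idx], y_train[idx])
    ensemble.append(net)

# Combine members by averaging their predicted class probabilities.
avg_proba = np.mean([net.predict_proba(X_test) for net in ensemble], axis=0)
ensemble_acc = np.mean(avg_proba.argmax(axis=1) == y_test)
single_acc = ensemble[0].score(X_test, y_test)
print(f"single net accuracy: {single_acc:.3f}")
print(f"ensemble accuracy:   {ensemble_acc:.3f}")
```

The averaging combiner is only the simplest option; the gain over a single member comes from the members making partly uncorrelated errors, which is what more sophisticated ensemble-design methods try to encourage explicitly.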