Intel Compilers on Linux Clusters
Commodity computer clusters are emerging as a cost-effective way to provide supercomputer-level performance to department-level organizations. Such clusters are typically based on widely available CPUs, such as the Intel Pentium 4, and use the robust Linux operating system. Various interconnects are available, ranging from standard 100BASE-T and 1000BASE-T Ethernet to Myrinet or Dolphin networks.
For the computational scientist, the platform offers a somewhat more limited choice of programming tools than traditional supercomputers, although the range of these tools is steadily increasing. Typically, there is no support for shared-memory parallel programming spanning multiple CPU boards (nodes), and the shared-memory model is limited to CPUs residing on a single board. On the other hand, mature distributed-memory tools, such as the Message Passing Interface (MPI) library, are freely available. These often exploit the particular strengths of an interconnect; one example is MPICH-GM, a Myrinet-specific implementation of MPI.
Compiler options are also growing, with the Intel compiler suite being the latest addition to an already wide range. In this report, we study the performance of the Intel compiler tools applied to a large existing computational mechanics simulation code.
The XNS computational code that serves as our benchmark is based on a stabilized space-time finite element formulation of the incompressible Navier-Stokes equations. Its main areas of application involve deformations of the computational domain, which are naturally handled by the Lagrangian-Eulerian space-time approach. This mature code has been used to simulate complex three-dimensional flows of air around a helicopter, of water around a submarine or past hydraulic structures, and of blood in a centrifugal blood pump. The code has been ported to a variety of computer architectures, including PC clusters, CRAY T3E, IBM SP, SGI Origin, CRAY MTA and others.
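For reference, the incompressible Navier-Stokes equations mentioned above can be written in their usual textbook form for a Newtonian fluid. The notation is an assumption made here for illustration and is not taken from the XNS documentation: velocity u, pressure p, density ρ, dynamic viscosity μ, body force f, and a time-dependent domain Ω_t.

```latex
\begin{align}
  \rho \left( \frac{\partial \mathbf{u}}{\partial t}
       + \mathbf{u} \cdot \nabla \mathbf{u} - \mathbf{f} \right)
       - \nabla \cdot \boldsymbol{\sigma}(\mathbf{u}, p) &= \mathbf{0}
       && \text{on } \Omega_t, \\
  \nabla \cdot \mathbf{u} &= 0
       && \text{on } \Omega_t, \\
  \boldsymbol{\sigma}(\mathbf{u}, p) &= -p\,\mathbf{I}
       + 2 \mu\, \boldsymbol{\varepsilon}(\mathbf{u}),
  \qquad
  \boldsymbol{\varepsilon}(\mathbf{u}) = \tfrac{1}{2}
       \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^{T} \right).
\end{align}
```

The first equation expresses momentum balance and the second enforces incompressibility. The subscript on Ω_t indicates that the domain may change in time, which is the situation a space-time formulation is designed to handle.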