MPI's one-sided communication draws on earlier one-sided programming models [54] and on the Cray SHMEM library used on the massively parallel Cray T3D and T3E computers [30].


In one-sided communication, one process may put data directly into the memory of another process without that process making an explicit receive call. For this reason, one-sided communication is also called remote memory access (RMA).


Using RMA involves four steps:

  1. Describe the memory into which data may be put.

  2. Allow access to the memory.

  3. Begin put operations (e.g., with MPI_Put).

  4. Complete all pending RMA operations.

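Before examining each step in detail, it may help to see all four in one place. The following skeleton is a minimal sketch, not an example from this book: the buffer name buf and its size of 16 doubles are illustrative, and each rank simply puts one value into the next rank's buffer.

#include <mpi.h>

void rma_skeleton( void )
{
    double  buf[16];   /* memory that other processes may access */
    MPI_Win win;
    int     rank, size;

    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    /* Step 1: describe the memory into which data may be put */
    MPI_Win_create( buf, 16 * sizeof(double), sizeof(double),
                    MPI_INFO_NULL, MPI_COMM_WORLD, &win );
    /* Step 2: allow access to the memory (open an epoch) */
    MPI_Win_fence( 0, win );
    /* Step 3: begin put operations; here, one double into the
       next rank's buf[0] */
    MPI_Put( &buf[0], 1, MPI_DOUBLE,
             (rank + 1) % size, 0, 1, MPI_DOUBLE, win );
    /* Step 4: complete all pending RMA operations */
    MPI_Win_fence( 0, win );
    MPI_Win_free( &win );
}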
The first step is to describe the region of memory into which data may be placed by an MPI_Put operation (and that may also be accessed by MPI_Get or updated by MPI_Accumulate). This is done with the routine MPI_Win_create:

MPI_Win win;
double ulocal[MAX_NX][NY+2];
MPI_Win_create( ulocal, (NY+2)*(i_end-i_start+3)*sizeof(double),
                sizeof(double), MPI_INFO_NULL, MPI_COMM_WORLD, &win );


The input arguments are, in order, the array ulocal, the size of the array in bytes, the size of a basic unit of the array (sizeof(double) in this case), a "hint" object, and the communicator that specifies which processes may use RMA to access the array. MPI_Win_create is a collective call over the communicator. The output is an MPI window object win. When a window object is no longer needed, it should be freed with MPI_Win_free.


RMA operations take place between two sentinels: one begins a period during which access to a window object is allowed, and one ends that period. These periods are called epochs.[2] The simplest way to delimit an epoch is with the collective routine MPI_Win_fence. The RMA data-movement routines MPI_Put, MPI_Get, and MPI_Accumulate are nonblocking in the same sense as the routines of Section 9.3.1: they complete at the end of the epoch in which they start, for example, at the closing MPI_Win_fence. Because these routines specify both the source and the destination of the data, they have more arguments than do the point-to-point communication routines. The arguments are easily understood by taking them a few at a time.





  1. The first three arguments describe the origin data, that is, the data on the calling process. These are the usual "buffer, count, datatype" arguments.

  2. The next argument is the rank of the target process. It serves the same function as the destination of an MPI_Send; the rank is relative to the communicator used when creating the MPI window object.

  3. The next three arguments describe the destination buffer. The count and datatype arguments have the same meaning as for an MPI_Recv, but the buffer location is specified as an offset from the beginning of the memory given to MPI_Win_create on the target process. This offset is in units of the displacement argument of MPI_Win_create, usually the size of the basic datatype.

  4. The last argument is the MPI window object. (The complete argument list appears in the prototype below.)
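For reference, the C binding of MPI_Put (as defined in MPI-2; later versions of the standard add const to the origin address) shows these four groups directly:

int MPI_Put( void *origin_addr, int origin_count,
             MPI_Datatype origin_datatype,   /* 1: origin data     */
             int target_rank,                /* 2: target process  */
             MPI_Aint target_disp, int target_count,
             MPI_Datatype target_datatype,   /* 3: destination     */
             MPI_Win win );                  /* 4: window object   */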





Note that there are no MPI requests; the MPI_Win_fence completes all preceding RMA operations. MPI_Win_fence provides a collective synchronization model for RMA operations in which all processes participate. This is called active target synchronization.


With these routines, we can create a version of the mesh exchange that uses RMA instead of point-to-point communication. Figure 9.13 shows one possible implementation.










void exchange_nbrs( double u_local[][NY+2], int i_start, int i_end,
                    int left, int right, MPI_Win win )
{
    MPI_Aint left_ghost_disp, right_ghost_disp;
    int      c;

    MPI_Win_fence( 0, win );
    /* Put the left edge into the left neighbor's rightmost
       ghost cells.  See text about right_ghost_disp */
    right_ghost_disp = 1 + (NY+2) * (i_end - i_start + 2);
    MPI_Put( &u_local[1][1], NY, MPI_DOUBLE,
             left, right_ghost_disp, NY, MPI_DOUBLE, win );
    /* Put the right edge into the right neighbor's leftmost
       ghost cells */
    left_ghost_disp = 1;
    c = i_end - i_start + 1;
    MPI_Put( &u_local[c][1], NY, MPI_DOUBLE,
             right, left_ghost_disp, NY, MPI_DOUBLE, win );
    MPI_Win_fence( 0, win );
}








Figure 9.13: Neighbor exchange using MPI remote memory access.
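The displacement right_ghost_disp deserves a word of explanation; the following sketch of the arithmetic is reconstructed from the code in Figure 9.13 and assumes that every process owns the same number of mesh columns, so that the origin's i_start and i_end also describe the layout of the left neighbor's window.

/* The window covers ulocal[i_end-i_start+3][NY+2] with a
   displacement unit of sizeof(double), so displacements count
   doubles.  Column c of the target starts at element c*(NY+2);
   the rightmost (ghost) column is column i_end-i_start+2.
   Adding 1 skips that column's first element so the put fills
   rows 1..NY. */
right_ghost_disp = 1 + (NY+2) * (i_end - i_start + 2);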


Another form of access requires no MPI calls (not even a fence) at the target process. This is called passive target synchronization. The origin process uses MPI_Win_lock to begin an access epoch and MPI_Win_unlock to end it.[3] Passive target synchronization is discussed in detail in [50, Chapter 6]. Also note that as of 2003, not all MPI implementations support passive target RMA operations. Check that your implementation fully implements passive target RMA operations before using them.
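As a minimal sketch (not one of this chapter's figures), the following assumes win was created as above and puts a single value into rank 1's window using passive target synchronization; the target rank, value, and displacement are illustrative.

double value = 42.0;
/* Begin an access epoch on rank 1's window */
MPI_Win_lock( MPI_LOCK_EXCLUSIVE, 1, 0, win );
/* Put one double at displacement 0 in rank 1's window */
MPI_Put( &value, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win );
/* End the access epoch; the put is complete when this returns */
MPI_Win_unlock( 1, win );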


A more complete discussion of remote memory access can be found in [50, Chapters 5 and 6]. Note that MPI implementations are just beginning to provide the RMA routines described in this section. Most current RMA implementations emphasize functionality over performance. As implementations mature, however, the performance of RMA operations will also improve.


[2]MPI has two kinds of epochs for RMA: an access epoch and an exposure epoch. For the example used here, the epochs occur together, and we refer to both of them as just epochs.


[3]The names MPI_Win_lock and MPI_Win_unlock are really misnomers; think of them as begin-RMA and end-RMA.

