Non-blocking send/recv

  - MPI_Isend(buf, count, type, dest, tag, comm, MPI_Request *req);
  - MPI_Irecv(buf, count, type, src, tag, comm, MPI_Request *req);

  * 'I' stands for 'immediate'
  * local
  * can be buffered or unbuffered
  * req is an OUT parameter used to identify the send/recv
  * cannot safely reuse the send buffer (or read the recv buffer) until
    you have verified that the operation has completed, using either
    MPI_Wait or MPI_Test:

MPI_Wait(MPI_Request *req, MPI_Status *status);
  - non-local
  - blocks until request req is satisfied; the request is then set to
    MPI_REQUEST_NULL
  - status is set to contain information on the completed operation,
    except when MPI_STATUS_IGNORE is used

MPI_Test(MPI_Request *req, int *flag, MPI_Status *status);
  - local
  - returns flag=true if the operation identified by req is complete
  - returns immediately either way -- lets you get work done

Q: How would using these affect our message exchange example of last class?
A: The exchange no longer depends on the ordering of the sends/recvs --
   much safer.

Important note: the standard does not say how many MPI_Isend or MPI_Irecv
calls may be posted at any one time -- it only says that for a good
implementation the "number should be very large".

Note also that all three send modes are supported by non-blocking send:
  MPI_Ibsend
  MPI_Issend
  MPI_Irsend

Other useful related functions (no real subtleties in how they are used;
please consult the web listing for full signatures):
  MPI_Waitany    MPI_Testany
  MPI_Waitall    MPI_Testall
  MPI_Waitsome   MPI_Testsome

One other set of very important functions:

MPI_Probe(source, tag, comm, MPI_Status *stat);
  * non-local, blocks until a matching message from source is found
MPI_Iprobe(source, tag, comm, int *flag, MPI_Status *stat);
  * local, returns a flag which tells whether a matching message was found

Probes look for a matching message without receiving it, and either
return a flag (MPI_Iprobe) or block (MPI_Probe) until one is found.
Status gives info about the message (e.g. its size, via MPI_Get_count).

Some examples of when asynchronous communication is a good model:
  - overlap communication with computation (homework)
  - tremendously simplify matching sends/recvs (neuron example, global
    reorder, etc.)

Solution of PDEs, simulation:

Solve del^2 f = g using Jacobi iteration. On a grid with unit spacing
the discretized equation is

  f(i+1, j) + f(i-1, j) + f(i, j+1) + f(i, j-1) - 4*f(i,j) = g

For the case where g = 0 this is called Laplace's Equation:

  f(i+1, j) + f(i-1, j) + f(i, j+1) + f(i, j-1) - 4*f(i,j) = 0

Q: How to solve such a system?
A: A simple (though impractical) technique is to write it as

     f(i,j) = 1/4 * [ f(i+1, j) + f(i-1, j) + f(i, j+1) + f(i, j-1) ]

   start with some guess plus boundary values on f, and iterate the
   above equation until convergence is reached.

We then discussed ghost cells, partitioning, and introduced new functions:

  MPI_Sendrecv(sendbuf, sendcount, sendtype, dest, sendtag,
               recvbuf, recvcount, recvtype, src, recvtag, comm, status)
  MPI_Sendrecv_replace(buf, count, datatype, dest, sendtag,
                       src, recvtag, comm, status)
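
The Jacobi update for Laplace's Equation described above can be sketched in serial C as follows. This is only an illustration, not the code from class: the function names jacobi_sweep and jacobi_solve, the row-major n x n layout, and the stopping rule (largest change per sweep below tol) are all assumptions made for the example. In a parallel version each process would own a block of the grid plus ghost cells and would refresh the ghost cells before every sweep; that exchange is not shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: one Jacobi sweep on an n x n grid stored row-major.
 * Reads from f, writes to fnew. Boundary entries are copied unchanged.
 * Returns the largest absolute change of any interior point. */
double jacobi_sweep(const double *f, double *fnew, int n)
{
    assert(n >= 3);
    double max_change = 0.0;

    /* Carry the fixed boundary values over to the new grid. */
    memcpy(fnew, f, (size_t)n * (size_t)n * sizeof(double));

    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            double old = f[i * n + j];
            /* f(i,j) = 1/4 * [f(i+1,j) + f(i-1,j) + f(i,j+1) + f(i,j-1)] */
            double v = 0.25 * (f[(i + 1) * n + j] + f[(i - 1) * n + j] +
                               f[i * n + (j + 1)] + f[i * n + (j - 1)]);
            double change = (v > old) ? (v - old) : (old - v);
            if (change > max_change)
                max_change = change;
            fnew[i * n + j] = v;
        }
    }
    return max_change;
}

/* Hypothetical driver: sweep until the largest change drops below tol or
 * max_iters sweeps have been done. The result is left in f.
 * Returns the number of sweeps performed, or -1 if allocation fails. */
int jacobi_solve(double *f, int n, double tol, int max_iters)
{
    double *tmp = malloc((size_t)n * (size_t)n * sizeof(double));
    if (tmp == NULL)
        return -1;

    double *cur = f, *next = tmp;
    int iters = 0;
    while (iters < max_iters) {
        double change = jacobi_sweep(cur, next, n);
        iters++;
        /* The freshly written grid becomes the current one. */
        double *t = cur; cur = next; next = t;
        if (change < tol)
            break;
    }

    /* Make sure the latest iterate ends up in the caller's array. */
    if (cur != f)
        memcpy(f, cur, (size_t)n * (size_t)n * sizeof(double));
    free(tmp);
    return iters;
}
```

To use it, fill the boundary entries of f with the boundary values and the interior with an initial guess, then call something like jacobi_solve(f, n, 1e-10, 10000). Two arrays are used because Jacobi computes every new value from the previous iterate only; updating in place would turn it into a different method (Gauss-Seidel).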