
I have trouble understanding the difference between mpi_type_get_extent and mpi_type_get_true_extent. In practice, I was using the former, expecting the results I then obtained with the latter, so I checked the MPI 3.1 Standard, where I found (in section 4.1.8, True Extent of Datatypes):

However, the datatype extent cannot be used as an estimate of the amount of space that needs to be allocated, if the user has modified the extent

which made me think that I should have experienced no difference in the use of the two subroutines as long as I hadn't modified the extent of the datatype.

But I'm evidently missing something.

Given the following MPI derived datatype,

sizes    = [10,10,10]
subsizes = [ 3, 3, 3]
starts   = [ 2, 2, 2]
CALL MPI_TYPE_CREATE_SUBARRAY(ndims, sizes, subsizes, starts, MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, newtype, ierr)

the following code

call mpi_type_size(newtype, k, ierr)
call mpi_type_get_extent(newtype, lb, extent, ierr)
call mpi_type_get_true_extent(newtype, tlb, textent, ierr)
write(*,*) k/DBS, lb/DBS, extent/DBS, tlb/DBS, textent/DBS ! DBS is the size of double precision

produces the output (obviously the same for all processes)

27   0   1000   222   223

So mpi_type_size behaves as I expect, returning PRODUCT(subsizes)*DBS in k; on the other hand, I'd have expected from both mpi_type_get_extent and mpi_type_get_true_extent what only the latter returns (since I have not modified newtype at all), specifically 222 and 223, which are basically starts(1) + starts(2)*sizes(1) + starts(3)*sizes(1)*sizes(2) and 1 + SUM((subsizes - 1)*[1, sizes(1), sizes(1)*sizes(2)]).
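As a sanity check on the arithmetic above, here is a minimal sketch in plain Fortran with no MPI calls. The program name and the helper array `stride` are made up for illustration; the formulas are the ones quoted in the previous paragraph, assuming column-major (MPI_ORDER_FORTRAN) storage and all quantities measured in elements rather than bytes.

```fortran
program subarray_extents
  implicit none
  integer, parameter :: ndims = 3
  integer :: sizes(ndims), subsizes(ndims), starts(ndims)
  integer :: stride(ndims)   ! hypothetical helper: element stride of each dimension
  integer :: d, true_lb, true_extent

  sizes    = [10, 10, 10]
  subsizes = [ 3,  3,  3]
  starts   = [ 2,  2,  2]    ! zero-based, as MPI_TYPE_CREATE_SUBARRAY expects

  ! Column-major strides in elements: 1, sizes(1), sizes(1)*sizes(2)
  stride(1) = 1
  do d = 2, ndims
    stride(d) = stride(d-1) * sizes(d-1)
  end do

  ! Offset of the first selected element from the start of the full array
  true_lb = sum(starts * stride)
  ! Span from the first to the last selected element, both included
  true_extent = 1 + sum((subsizes - 1) * stride)

  print *, 'size        =', product(subsizes)
  print *, 'extent      =', product(sizes)   ! what mpi_type_get_extent reported
  print *, 'true_lb     =', true_lb
  print *, 'true_extent =', true_extent
end program subarray_extents
```

It prints 27, 1000, 222 and 223, matching the output shown above.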

Why does mpi_type_get_extent return 0 and PRODUCT(sizes) in lb and extent, regardless of subsizes and starts?

I haven't posted an MWE since I have no errors at all (neither at compile time nor at runtime); I simply haven't grasped how the two aforementioned routines work. I would basically like someone to help me understand the description of those subroutines in the standard document, and why it is correct to obtain those results that I didn't expect.

EDIT As requested by @GillesGouaillardet, I have added, at the end of this question, a "minimal" working example to be run with at least 4 processes (please run it with exactly 4 processes, so that we have the same output). The last lines can be uncommented (with awareness) to show that the types representing non-contiguous memory locations work properly when used with count > 1, once they've been properly resized by means of mpi_type_create_resized. With those lines commented, the program prints size, lb, extent, true_lb, true_extent for all types created (even the intermediate, uncommitted ones):

 mpi_type_contiguous                    4                    0                    4                    0                    4
 mpi_type_vector                        4                    0                   13                    0                   13
 mpi_type_vector res                    4                    0                    1                    0                   13
 mpi_type_create_subarray               4                    0                   16                    0                   13
 mpi_type_create_subarray res           4                    0                    1                    0                   13

All types represent one row or one column of a 4 by 4 matrix, so their size is predictably always 4. The column type has extent and true_extent both equal to 4 reals as well, since it represents four contiguous reals in memory. The type created with mpi_type_vector has extent and true_extent both equal to 13 reals, as I expected (see the nice sketch); if I want to use it with count > 1, I must resize it, which changes its extent while true_extent stays the same. Now comes the hard part:

What is that 16 as extent of the type created with mpi_type_create_subarray? To be honest I'd have expected that routine to return an already resized type, ready to be used with count > 1 (i.e. a type with size = 4, extent = 1, true_extent = 13), but it seems it does not: surprisingly for me, extent is 16, which is the size of the global array!

The question is: why? Why is the extent of a type created with mpi_type_create_subarray the product of the elements of the array_of_sizes argument?

[image: sketch]

program subarray
use mpi
implicit none
integer :: i, j, k, ierr, myid, npro, rs, mycol, myrowugly, myrow_vec, myrow_sub
integer(kind = mpi_address_kind) :: lb, extent, tlb, textent
real, dimension(:,:), allocatable :: mat
call mpi_init(ierr)
call mpi_comm_rank(mpi_comm_world, myid, ierr)
call mpi_comm_size(mpi_comm_world, npro, ierr)
allocate(mat(npro,npro))
mat = myid*1.0
call mpi_type_size(mpi_real, rs, ierr)

! One column of the matrix: npro contiguous reals
call mpi_type_contiguous(npro, mpi_real, mycol, ierr)
call mpi_type_commit(mycol, ierr)
call mpi_type_size(mycol, k, ierr)
call mpi_type_get_extent(mycol, lb, extent, ierr)
call mpi_type_get_true_extent(mycol, tlb, textent, ierr)
if (myid == 0) print *, 'mpi_type_contiguous         ', k/rs, lb/rs, extent/rs, tlb/rs, textent/rs

! One row of the matrix: npro blocks of 1 real, spaced npro reals apart (column-major storage)
call mpi_type_vector(npro, 1, npro, mpi_real, myrowugly, ierr)
call mpi_type_size(myrowugly, k, ierr)
call mpi_type_get_extent(myrowugly, lb, extent, ierr)
call mpi_type_get_true_extent(myrowugly, tlb, textent, ierr)
if (myid == 0) print *, 'mpi_type_vector             ', k/rs, lb/rs, extent/rs, tlb/rs, textent/rs
! Shrink the extent to one real, so that consecutive items (count > 1) are consecutive rows
call mpi_type_create_resized(myrowugly, int(0, mpi_address_kind)*rs, int(1, mpi_address_kind)*rs, myrow_vec, ierr)
call mpi_type_commit(myrow_vec, ierr)
call mpi_type_size(myrow_vec, k, ierr)
call mpi_type_get_extent(myrow_vec, lb, extent, ierr)
call mpi_type_get_true_extent(myrow_vec, tlb, textent, ierr)
if (myid == 0) print *, 'mpi_type_vector res         ', k/rs, lb/rs, extent/rs, tlb/rs, textent/rs

! The same row, described as a 1 x npro subarray of the npro x npro matrix
call mpi_type_create_subarray(2, [npro, npro], [1, npro], [0, 0], mpi_order_fortran, mpi_real, myrowugly, ierr)
call mpi_type_size(myrowugly, k, ierr)
call mpi_type_get_extent(myrowugly, lb, extent, ierr)
call mpi_type_get_true_extent(myrowugly, tlb, textent, ierr)
if (myid == 0) print *, 'mpi_type_create_subarray    ', k/rs, lb/rs, extent/rs, tlb/rs, textent/rs

call mpi_type_create_resized(myrowugly, int(0, mpi_address_kind)*rs, int(1, mpi_address_kind)*rs, myrow_sub, ierr)
call mpi_type_commit(myrow_sub, ierr)
call mpi_type_size(myrow_sub, k, ierr)
call mpi_type_get_extent(myrow_sub, lb, extent, ierr)
call mpi_type_get_true_extent(myrow_sub, tlb, textent, ierr)
if (myid == 0) print *, 'mpi_type_create_subarray res', k/rs, lb/rs, extent/rs, tlb/rs, textent/rs

!if (myid == 0) call mpi_send(mat(1,1), 2, mycol, 1, 666, mpi_comm_world, ierr)
!if (myid == 0) call mpi_recv(mat(1,3), 2, mycol, 1, 666, mpi_comm_world, mpi_status_ignore, ierr)
!if (myid == 1) call mpi_recv(mat(1,1), 2, mycol, 0, 666, mpi_comm_world, mpi_status_ignore, ierr)
!if (myid == 1) call mpi_send(mat(1,3), 2, mycol, 0, 666, mpi_comm_world, ierr)
!if (myid == 0) call mpi_send(mat(1,1), 2, myrow_vec, 1, 666, mpi_comm_world, ierr)
!if (myid == 0) call mpi_recv(mat(3,1), 2, myrow_vec, 1, 666, mpi_comm_world, mpi_status_ignore, ierr)
!if (myid == 1) call mpi_recv(mat(1,1), 2, myrow_vec, 0, 666, mpi_comm_world, mpi_status_ignore, ierr)
!if (myid == 1) call mpi_send(mat(3,1), 2, myrow_vec, 0, 666, mpi_comm_world, ierr)
!if (myid == 0) call mpi_send(mat(1,1), 2, myrow_sub, 1, 666, mpi_comm_world, ierr)
!if (myid == 0) call mpi_recv(mat(3,1), 2, myrow_sub, 1, 666, mpi_comm_world, mpi_status_ignore, ierr)
!if (myid == 1) call mpi_recv(mat(1,1), 2, myrow_sub, 0, 666, mpi_comm_world, mpi_status_ignore, ierr)
!if (myid == 1) call mpi_send(mat(3,1), 2, myrow_sub, 0, 666, mpi_comm_world, ierr)
!do i = 0, npro - 1
!if (myid == i) then
!print *, ""
!print *, myid
!do j = 1, npro
!print *, mat(j,:)
!end do
!end if
!call mpi_barrier(mpi_comm_world, ierr)
!end do

call mpi_finalize(ierr)
end program subarray
Enlico
  • (answer to the disappeared comment) I do not have errors, I simply don't understand what's wrong with my understanding of those two MPI tools, as applied to a user-defined type, such as one created with `MPI_TYPE_CREATE_SUBARRAY` (with a predefined datatype, such as `MPI_DOUBLE_PRECISION`, there's no difference). – Enlico Apr 30 '18 at 10:29
  • You don't have to answer deleted comments, there is a reason why I deleted it after all... By error I meant different results, but anyway, I deleted the comment. Your code is not directly compilable, but the necessary declarations are well predictable. – Vladimir F Героям слава Apr 30 '18 at 10:52
  • Regarding your edit (I repeat that my first comment is deleted so forget it), even programs that produce some result to be explained usually need a MWE, not just programs with errors but all other programs. Anyway, your code is probably fine and the declarations are predictable, that's why I deleted the comment. – Vladimir F Героям слава Apr 30 '18 at 10:59
  • long story short, `MPI_Type_create_subarray()` might modify the extent under the hood. In Fortran, a datatype that describes a column of a 2D square matrix ends up with `size==extent`, but a datatype that describes a row has the same size, but `extent` is the size of one element. – Gilles Gouaillardet Apr 30 '18 at 13:00
  • @GillesGouaillardet, what do you mean by _size of one element_? As far as I've read and seen, the `extent` of a type representing a row of a `m` by `n` Fortran matrix is `1+m*(n-1)`, and it's what I expected `mpi_type_get_extent` would return. – Enlico Apr 30 '18 at 20:01
  • You are mixing `extent` and `true_extent`. Feel free to `MPI_Send(matrix, 2, row_datatype, ...)` and when you get correct results, double check `extent` and `true_extent`. – Gilles Gouaillardet May 01 '18 at 00:32
  • @GillesGouaillardet, indeed the title of the question is exactly _difference between [...]_. – Enlico May 01 '18 at 05:33
  • what about writing a [MCVE] as I previously suggested ? You might want to write it in `C` and create a datatype for one column. – Gilles Gouaillardet May 01 '18 at 06:07
  • (I'm not a C programmer, it'd take too long to write it.) The link you suggested says, just after the title, _When asking a question about a problem caused by your code_; well, my code has no problem, my understanding of `extent` has a problem, which I'm trying to solve here. Anyway, I've added some compilable code which can be run. – Enlico May 01 '18 at 08:06
  • Yes, the commented lines do that. I don't want to be repetitive, but the code **works**. **Properly**. It works because I appropriately resize the datatypes I create to represent non-contiguous memory locations. I'm just trying to understand what is the meaning of the `extent` that `mpi_type_create_subarray` confers to the type it creates. – Enlico May 01 '18 at 08:26

1 Answer


MPI_Type_create_subarray() creates a derived datatype whose extent is, by definition, the product of all sizes multiplied by the extent of the old type.

The definition is in the MPI 3.1 standard at page 96.

MPI_Type_create_subarray() is generally used for MPI-IO, so this definition of the extent makes sense there.

It might not be what you wish in this very specific case, but think of a 2x2 subarray of a 4x4 array. What extent would you expect?
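To make the 2x2-in-4x4 case concrete: the extent is the stride between consecutive items when count > 1, so an extent of PRODUCT(sizes) elements places item k of the subarray type inside the k-th of several consecutive full arrays. The following is only a sketch in plain Fortran with no MPI calls; the block position `starts = [1, 1]`, the program name and the `offsets` array are assumptions chosen for illustration, and offsets are in elements, counted from the start of the buffer.

```fortran
program subarray_count2
  implicit none
  integer, parameter :: sizes(2) = [4, 4], subsizes(2) = [2, 2]
  integer, parameter :: starts(2) = [1, 1]   ! assumed zero-based position of the block
  integer :: offsets(8)                      ! 2 items x 4 elements each
  integer :: extent, c, i, j, n

  ! Extent of the subarray type, in elements: the whole 4x4 array
  extent = product(sizes)

  n = 0
  do c = 0, 1                        ! two consecutive items, as with count = 2
    do j = 0, subsizes(2) - 1        ! columns of the block
      do i = 0, subsizes(1) - 1      ! rows of the block (fastest index in Fortran)
        n = n + 1
        offsets(n) = c*extent + (starts(1) + i) + (starts(2) + j)*sizes(1)
      end do
    end do
  end do

  print '(a, 4i4)', 'item 1 offsets:', offsets(1:4)
  print '(a, 4i4)', 'item 2 offsets:', offsets(5:8)
end program subarray_count2
```

The first item covers offsets 5, 6, 9, 10; the second covers 21, 22, 25, 26, i.e. the same block shifted by one full 4x4 array. Any smaller extent would make the second item overlap the first array, which is why a resize is needed when a different stride is wanted.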

Gilles Gouaillardet
  • So it is `(ub_marker, size0*ex)`, the last element of the typemap of eq. (4.2), that is "responsible". Have I understood correctly? – Enlico May 01 '18 at 13:22
  • Yes, this is my understanding – Gilles Gouaillardet May 01 '18 at 13:24
  • Oh, finally! I hope this question has given you the opportunity to reinforce your knowledge, as it helped me improve mine. Btw, since the datatype in question is a subarray `S` of an array `A`, I expected its `extent` to be equal either to its `true_extent` (which is a property of `S` as part of `A`, in which it "lives"), or, at least, to its `subsize` (which would be a property of `S` only), but not to `size`, which is a property of `A` only. – Enlico May 01 '18 at 13:40