
I have a build environment that runs code in a Docker container. One of the components is OpenMPI, which I think either causes the problem or merely exposes it.

When I run code that uses MPI, I get this message:

Unexpected end of /proc/mounts line `overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/NHW6L2TB73FPMK4A52XDP6SO2V:/var/lib/docker/overlay2/l/MKAGUDHZZTJF4KNSUM73QGVRUD:/var/lib/docker/overlay2/l/4PFRG6M47TX5TYVHKQQO2KCG7Q:/var/lib/docker/overlay2/l/4UR3OEP3IW5ZTADZ6OKT77ZBEU:/var/lib/docker/overlay2/l/LGBMK7HFUCHRTM2MMITMD6ILMG:/var/lib/docker/overlay2/l/ODJ2DJIGYGWRXEJZ6ECSLG7VDJ:/var/lib/docker/overlay2/l/JYQIR5JVEUVQPHEF452BRDVC23:/var/lib/docker/overlay2/l/AUDTRIBKXDZX62ANXO75LD3DW5:/var/lib/docker/overlay2/l/RFFN2MQPDHS2Z'
Unexpected end of /proc/mounts line `KNEJCAQH6YG5S:/var/lib/docker/overlay2/l/7LZSAIYKPQ56QB6GEIB2KZTDQA:/var/lib/docker/overlay2/l/CP2WSFS5347GXQZMXFTPWU4F3J:/var/lib/docker/overlay2/l/SJHIWRVQO5IENQFYDG6R5VF7EB:/var/lib/docker/overlay2/l/ICNNZZ4KB64VEFSKEQZUF7XI63:/var/lib/docker/overlay2/l/SOHRMEBEIIP4MRKRRUWMFTXMU2:/var/lib/docker/overlay2/l/DL4GM7DYQUV4RQE4Z6H5XWU2AB:/var/lib/docker/overlay2/l/JNEAR5ISUKIBKQKKZ6GEH6T6NP:/var/lib/docker/overlay2/l/LIAK7F7Q4SSOJBKBFY4R66J2C3:/var/lib/docker/overlay2/l/MYL6XNGBKKZO5CR3PG3HIB475X:/var/lib/do'

The message is printed at this line of code:

MPI_Init(&argc,&argv);

To make the problem harder to understand, the warning is printed only when the host machine runs Mac OS X; on a Linux host everything is fine.

Apart from the warning, everything works fine. I do not know OpenMPI and Docker well enough to tell how this can be fixed.

likask

2 Answers


This is likely because your /proc/mounts file contains a line too long for a 512-byte buffer, which the hwloc module of OpenMPI then fails to parse correctly. Docker tends to put very long lines into /proc/mounts. You can see the bug in openmpi-1.10.7/opal/mca/hwloc/hwloc191/hwloc/src/topology-linux.c:1677:

static void
hwloc_find_linux_cpuset_mntpnt(char **cgroup_mntpnt, char **cpuset_mntpnt, int fsroot_fd)
{
#define PROC_MOUNT_LINE_LEN 512
  char line[PROC_MOUNT_LINE_LEN];
  FILE *fd;

  *cgroup_mntpnt = NULL;
  *cpuset_mntpnt = NULL;

  /* ideally we should use setmntent, getmntent, hasmntopt and endmntent,
   * but they do not support fsroot_fd.
   */

  fd = hwloc_fopen("/proc/mounts", "r", fsroot_fd);
  if (!fd)
    return;

Increasing the value of PROC_MOUNT_LINE_LEN and rebuilding OpenMPI makes the warning go away, although that should be considered a temporary workaround.
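
To see why one long mount entry can produce several warnings, here is a minimal sketch of the truncation effect. It assumes the parser reads /proc/mounts with fgets() into the fixed-size buffer; the excerpt above does not show that call, so treat it as an assumption. The function name count_truncated_reads and the macro LINE_BUF_LEN are made up for illustration and are not part of hwloc.

```c
#include <stdio.h>
#include <string.h>

/* Same size as PROC_MOUNT_LINE_LEN in the excerpt above. */
#define LINE_BUF_LEN 512

/*
 * Hypothetical helper, not hwloc code: read fp with fgets() using a
 * buffer of buf_len bytes (2..LINE_BUF_LEN) and count how many reads
 * came back without a trailing newline. Each such read is a fragment
 * of a line that did not fit into the buffer. A final line that lacks
 * a newline in the file itself is counted as well.
 */
size_t count_truncated_reads(FILE *fp, size_t buf_len)
{
    char line[LINE_BUF_LEN];
    size_t truncated = 0;

    if (fp == NULL || buf_len < 2 || buf_len > sizeof(line))
        return 0;

    /* fgets() stores at most buf_len - 1 characters per call. */
    while (fgets(line, (int)buf_len, fp) != NULL) {
        size_t len = strlen(line);
        if (len == 0 || line[len - 1] != '\n')
            truncated++;
    }
    return truncated;
}
```

Calling count_truncated_reads(fopen("/proc/mounts", "r"), LINE_BUF_LEN) inside the container should return a nonzero value if some entry is longer than the buffer, which would be consistent with the two consecutive warnings in the question, each showing a different fragment of the same overlay line.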

flaviut
Lisanna
  • That is a great answer. A follow-up question: can we fix it somehow? Would using MPICH or a newer version of OpenMPI help? – likask Sep 12 '17 at 06:13
  • @likask When I ran into the problem recently, I didn't have time to patch and rebuild OpenMPI, so as a workaround I flattened my Docker container (https://tuhrig.de/flatten-a-docker-container-or-image/). The /proc/mounts line is long because you start a container from an image that has many intermediate images in its history, and flattening my Docker image made that line drop below 512 characters. – Lisanna Sep 13 '17 at 16:06

This issue should be fixed in hwloc since 1.11.3 (released 2 years ago). You can either upgrade to OpenMPI 3.0, which embeds hwloc 1.11.7 (newer than 1.11.3), or recompile OpenMPI to use an external hwloc instead of the old embedded one.

Brice
  • I installed OpenMPI 3.0 from source and still encounter this problem. The source is from https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.gz – Chiara Hsieh Mar 05 '18 at 10:03
  • How can I configure the build (`./configure`) to do this? – Pratik K. Jan 11 '20 at 10:35