Error: undefined reference to `MPIX_Query_cuda_support'

siwakorn_sukharom
Newbie

#1 Post by siwakorn_sukharom » Tue Feb 07, 2023 4:35 am

I'm trying to install VASP 6.3.2 using NVHPC 21.9 on a Cray EX system.

These are the loaded modules:

Code: Select all

Currently Loaded Modules:
  1) craype-x86-rome      4) nvhpc/21.9         7) cray-mpich/8.1.17      10) cray-fftw/3.3.10.1
  2) libfabric/1.15.0.0   5) craype/2.7.16      8) cray-libsci/21.08.1.2  11) craype-accel-nvidia80
  3) craype-network-ofi   6) cray-dsmml/0.2.2   9) PrgEnv-nvhpc/8.3.3

This is my makefile.include, which is based on the template makefile.include.nvhpc_acc:

Code: Select all

CPP_OPTIONS = -DHOST=\"LinuxNV\" \
              -DMPI -DMPI_BLOCK=8000 -Duse_collective \
              -DscaLAPACK \
              -DCACHE_SIZE=4000 \
              -Davoidalloc \
              -Dvasp6 \
              -Duse_bse_te \
              -Dtbdyn \
              -Dqd_emulate \
              -Dfock_dblbuf \
              -D_OPENACC \
              -DUSENCCL -DUSENCCLP2P

CPP         = ftn -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX)  > $*$(SUFFIX)
# N.B.: you might need to change the cuda-version here
#       to one that comes with your NVIDIA-HPC SDK
FC          = ftn -acc -gpu=cc60,cc70,cc80,cuda11.4
FCL         = ftn -acc -gpu=cc60,cc70,cc80,cuda11.4 -c++libs
FREE        = -Mfree
FFLAGS      = -Mbackslash -Mlarge_arrays
OFLAG       = -fast
DEBUG       = -Mfree -O0 -traceback
OBJECTS     = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
LLIBS       = -cudalib=cublas,cusolver,cufft,nccl -cuda

# Redefine the standard list of O1 and O2 objects
SOURCE_O1  := pade_fit.o
SOURCE_O2  := pead.o

# For what used to be vasp.5.lib
CPP_LIB     = $(CPP)
FC_LIB      = ftn
CC_LIB      = cc -w
CFLAGS_LIB  = -O
FFLAGS_LIB  = -O1 -Mfixed
FREE_LIB    = $(FREE)

OBJECTS_LIB = linpack_double.o

# For the parser library
CXX_PARS    = CC --no_warnings

VASP_TARGET_CPU ?= -tp host
FFLAGS     += $(VASP_TARGET_CPU)

# Specify your NV HPC-SDK installation (mandatory)
#... first try to set it automatically
#NVROOT      =$(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')

# If the above fails, then NVROOT needs to be set manually
NVHPC      ?= /opt/nvidia/hpc_sdk
NVVERSION   = 21.9
NVROOT      = $(NVHPC)/Linux_x86_64/$(NVVERSION)

# Software emulation of quadruple precision (mandatory)
QD         ?= $(NVROOT)/compilers/extras/qd
LLIBS      += -L$(QD)/lib -lqdmod -lqd
INCS       += -I$(QD)/include/qd

BLAS =
LAPACK =
SCALAPACK =

FFTW_ROOT  ?= /path/to/your/fftw/installation
LLIBS      += -L$(FFTW_ROOT)/lib -lfftw3
INCS       += -I$(FFTW_ROOT)/include

With this, I get the following error:

Code: Select all

ftn -acc -gpu=cc60,cc70,cc80,cuda11.4 -c++libs -o vasp c2f_interface.o nccl2for.o simd.o base.o profiling.o string.o tutor.o versiand_line.o vhdf5_base.o incar_reader.o reader_base.o openmp.o openacc_struct.o mpi.o mpi_shmem.o mathtools.o hamil_struct.o radial pseudo_struct.o mgrid_struct.o wave_struct.o nl_struct.o mkpoints_struct.o poscar_struct.o afqmc_struct.o fock_glb.o chi_glb.o smate.o xml.o extpot_glb.o constant.o ml_ff_c2f_interface.o ml_ff_prec.o ml_ff_constant.o ml_ff_taglist.o ml_ff_struct.o ml_ff_mpi_hff_mpi_shmem.o vdwforcefield_glb.o jacobi.o main_mpi.o openacc.o scala.o asa.o lattice.o poscar.o ini.o mgrid.o ml_ff_error.o ml_fl_ff_helper.o ml_ff_logfile.o ml_ff_math.o ml_ff_iohandle.o ml_ff_memory.o ml_ff_abinitio.o ml_ff_ff.o ml_ff_mlff.o setex_struct.ovdw_nl.o xclib_grad.o setex.o radial.o pseudo.o gridq.o ebs.o symlib.o mkpoints.o random.o wave.o wave_mpi.o wave_high.o bext.o spymmetry.o lattlib.o nonl.o nonlr.o nonl_high.o dfast.o choleski2.o mix.o hamil.o xcgrad.o xcspin.o potex1.o potex2.o constrmag.o c relativistic.o LDApU.o paw_base.o metagga.o egrad.o pawsym.o pawfock.o pawlhf.o diis.o rhfatm.o hyperfine.o fock_ace.o paw.o mkpo.o charge.o Lebedev-Laikov.o stockholder.o dipol.o solvation.o scpc.o pot.o tet.o dos.o elf.o hamil_rot.o chain.o dyna.o fileio.o phpro.o us.o core_rel.o aedens.o wavpre.o wavpre_noio.o broyden.o dynbr.o reader.o writer.o xml_writer.o brent.o stufak.o opergridr.o fast_aug.o fock_multipole.o fock.o fock_dbl.o fock_frc.o mkpoints_change.o subrot_cluster.o sym_grad.o mymath.o npt_dynamics.o.o subdftd4.o internals.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o nmr.o pead.o k-proj.o subrot.o subrot_scf.o ption.o rpa_force.o ml_reader.o ml_interface.o force.o pwlhf.o gw_model.o optreal.o steep.o rmm-diis.o davidson.o david_inner.o roolcao_bare.o locproj.o electron_common.o electron.o rot.o electron_all.o shm.o pardens.o optics.o constr_cell_relax.o stm.o finite_pol.o hamil_lr.o rmm-diis_lr.o subrot_lr.o 
lr_helper.o hamil_lrf.o elinear_response.o ilinear_response.o linear_optics.o setlocalper.o electron_OEP.o electron_lhf.o twoelectron4o.o gauss_quad.o m_unirnk.o minimax_ini.o minimax_dependence.o minimax_functions1D._functions2D.o minimax_struct.o minimax_varpro.o minimax.o umco.o mlwf.o ratpol.o pade_fit.o screened_2e.o wave_cacher.o crpa.o chwpot.o local_field.o ump2.o ump2kpar.o fcidump.o ump2no.o bse_te.o bse.o time_propagation.o acfdt.o afqmc.o rpax.o chi.o acfdt_GG.GG_base.o greens_orbital.o lt_mp2.o rnd_orb_mp2.o greens_real_space.o chi_GG.o chi_super.o sydmat.o rmm-diis_mlr.o linear_responsennier_interpol.o wave_interpolate.o linear_response.o auger.o dmatrix.o phonon.o wannier_mats.o elphon.o core_con_mat.o embed.o exa_high.o fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o main.o  -Llib -ldmy -Lparser -lparser -cudalib=cublas,cusolver,cufft,nccl -cudnvidia/hpc_sdk/Linux_x86_64/21.9/compilers/extras/qd/lib -lqdmod -lqd -L/opt/cray/pe/fftw/3.3.10.1/x86_rome/lib -lfftw3
/usr/bin/ld: warning: /opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/lib/nvhpc.ld contains output sections; did you forget -T?
/usr/bin/ld: openacc.o: in function `mopenacc_init_acc_cuda_aware_support':
/lustrefs/disk/home/siwakorn/VASP/vasp.6.3.2/build/std/openacc.f90:194: undefined reference to `MPIX_Query_cuda_support'
pgacclnk: child process exit status 1: /usr/bin/ld
make[2]: *** [makefile:132: vasp] Error 2
make[2]: Leaving directory '/lustrefs/disk/home/siwakorn/VASP/vasp.6.3.2/build/std'
cp: cannot stat 'vasp': No such file or directory
make[1]: *** [makefile:129: all] Error 1
make[1]: Leaving directory '/lustrefs/disk/home/siwakorn/VASP/vasp.6.3.2/build/std'
make: *** [makefile:17: std] Error 2

Any suggestions?

fabien_tran1
Global Moderator

#2 Post by fabien_tran1 » Wed Feb 08, 2023 10:25 am

Hi,

I got help from colleagues who are experts on this topic. They provided the following information:

The call to MPIX_Query_cuda_support was put in place to give the user a meaningful error message when trying to use the OpenACC version of the code without a CUDA-aware version of MPI, instead of having the code simply crash. Unfortunately, it seems that Cray MPICH, or at least some versions of it, does not provide this function yet.
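
A quick way to check whether a particular MPI library exports the function is to try resolving the symbol at run time. The following is only a minimal sketch in Python using ctypes; the helper name exports_symbol is made up for illustration, and the library path to pass in depends on the local installation.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if the shared library exports `symbol`, else False.

    `library` is the path to a shared object (site-specific, not taken
    from the post), or None to search the symbols already loaded into
    the current process.
    """
    handle = ctypes.CDLL(library)
    # ctypes resolves symbols lazily on attribute access; a missing symbol
    # raises AttributeError, which hasattr() reports as False.
    return hasattr(handle, symbol)
```

For example, exports_symbol("/path/to/your/mpi/library.so", "MPIX_Query_cuda_support") returning False would be consistent with the linker error above. Loading the library can itself raise OSError if its dependencies cannot be resolved, in which case the check is inconclusive.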

To work around this, comment out the interface to MPIX_Query_cuda_support in openacc.F (lines 185-188):

! INTERFACE
! INTEGER(c_int) FUNCTION MPIX_Query_cuda_support() BIND(C, name="MPIX_Query_cuda_support")
! END FUNCTION
! END INTERFACE

and replace (at line 194)

CUDA_AWARE_SUPPORT = MPIX_Query_cuda_support() == 1

by

CUDA_AWARE_SUPPORT = .TRUE.
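
If the build has to be repeated on several machines, the two edits above could be scripted. The following is a minimal Python sketch, assuming openacc.F contains the four interface lines and the assignment exactly as quoted in this post; the function name apply_workaround is hypothetical and not part of VASP. It matches on the text rather than on line numbers, so other VASP versions should be checked by hand.

```python
def apply_workaround(source):
    """Comment out the MPIX_Query_cuda_support interface and hard-code
    CUDA_AWARE_SUPPORT = .TRUE. in the given Fortran source text."""
    lines = source.splitlines()
    out = []
    i = 0
    while i < len(lines):
        block = lines[i:i + 4]
        # The 4-line INTERFACE block as quoted in the post.
        if (len(block) == 4
                and block[0].strip() == "INTERFACE"
                and "FUNCTION MPIX_Query_cuda_support()" in block[1]
                and block[2].strip().startswith("END FUNCTION")
                and block[3].strip() == "END INTERFACE"):
            out.extend("! " + line for line in block)
            i += 4
            continue
        line = lines[i]
        # The assignment that calls the missing function.
        if ("CUDA_AWARE_SUPPORT" in line
                and "MPIX_Query_cuda_support()" in line
                and not line.lstrip().startswith("!")):
            indent = line[:len(line) - len(line.lstrip())]
            line = indent + "CUDA_AWARE_SUPPORT = .TRUE."
        out.append(line)
        i += 1
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if source.endswith("\n") else "")
```

To use it, read src/openacc.F into a string, pass it through apply_workaround, and write the result back after keeping a backup copy. Because the patched code no longer queries the MPI library, it is up to the user to make sure the MPI in use really is CUDA-aware.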

siwakorn_sukharom
Newbie

#3 Post by siwakorn_sukharom » Wed Feb 08, 2023 4:00 pm

This works. Thank you very much.
