Vasp 6.1.1 fails SiC_HSE tests

rkingsbury
Newbie
Posts: 6
Joined: Tue Nov 19, 2019 5:17 pm

#1 Post by rkingsbury » Fri Sep 04, 2020 5:46 pm

Hello, I have compiled VASP 6.1.0 and VASP 6.1.1 on the NERSC Cori cluster using their official toolchain. The resulting binaries pass all tests in the "fast" test suite except for the four SiC_HSE tests. Has anyone else seen a similar failure with VASP 6?

The tests were run with `OMP_NUM_THREADS=1` on a single node, with either 4 or 8 MPI ranks (`srun -n 8 -c 32` or `srun -n 4 -c 32`).

The test output is

Code:

==================================================================
SUMMARY:
==================================================================
The following tests failed, please check the output file manually:
SiC_HSE06_ALGO=A SiC_HSE06_ALGO=A_RPR SiC_HSE06_ALGO=D SiC_HSE06_ALGO=D_RPR

An example result from one of the failing tests, taken from my earlier comment (Ryan Kingsbury, 2020-09-03 15:16:37), is

Code:

The NERSC-provided vasp binary (6.1.0-knl) fails the SiC_HSE06_* tests (4 tests total) in the built-in VASP testsuite. I encountered this problem because I am compiling a patched version of VASP 6.1 and wish to validate it using the built-in test suite. I understand that the VASP-provided test suite is designed to be run with 1, 2, 4 or 8 MPI ranks and have run the test suite on a single KNL node with the following srun commands:

OMP_NUM_THREADS=1
VASP_TESTSUITE_EXE_STD='srun -N 1 -n 8 -c 32 --cpu_bind=cores vasp_std'
and
VASP_TESTSUITE_EXE_STD='srun -N 1 -n 4 -c 32 --cpu_bind=cores vasp_std'

The exact test output was

==================================================================
SUMMARY:
==================================================================
The following tests failed, please check the output file manually:
SiC_HSE06_ALGO=A SiC_HSE06_ALGO=A_RPR SiC_HSE06_ALGO=D SiC_HSE06_ALGO=D_RPR

and an example output from a single failing test is

exiting run_recipe SiC_HSE06_ALGO=A
ERROR: the frequencies are different, please check
--------------------------------------------------
f= 820.46 cm-1
f= 820.46 cm-1
f= 820.46 cm-1
---------------------------------------------------------------------------
Comparing files: freq and freq.ref
3 number(s) differ.
Max diff.: 0.389999999999986
(at row number: 1 column number: 2 )
Tolerance: 0.250000000000000
---------------------------------------------------------------------------
ERROR: the test yields different results for the energies, please check
-----------------------------------------------------------------------
-17.68675223
-17.68675223
-17.68442408
-17.68442408
-17.68443044
-17.68443044
-17.68443016
-17.68443016
-17.68442175
-17.68442175
---------------------------------------------------------------------------
Comparing files: energy_outcar and energy_outcar.ref
10 number(s) differ.
Max diff.: 0.124483999999999
(at row number: 7 column number: 1 )
Tolerance: 5.000000000000000E-004
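
The two "Comparing files" reports above come down to a maximum absolute difference checked against a tolerance. The following is a minimal sketch of that kind of check, not the testsuite's own code: `max_abs_diff` and `within_tolerance` are hypothetical helper names, and the sample values are made up for illustration. Only the tolerance of 0.25 and the column number 2 are taken from the frequency report above.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers, not part of the VASP testsuite.

# Print the largest absolute difference between column COL of FILE and REF,
# comparing the two files row by row.
max_abs_diff() {
    local file="$1" ref="$2" col="$3"
    awk -v c="$col" '
        BEGIN { max = 0 }
        NR == FNR { refval[FNR] = $c; next }   # first file: store reference values
        {
            d = $c - refval[FNR]
            if (d < 0) d = -d
            if (d > max) max = d
        }
        END { printf "%.6f\n", max }
    ' "$ref" "$file"
}

# Succeed (exit status 0) when DIFF does not exceed TOL.
within_tolerance() {
    awk -v d="$1" -v t="$2" 'BEGIN { exit !(d <= t) }'
}

# Demo with made-up numbers in the same "f= <value> cm-1" layout as above.
workdir=$(mktemp -d)
printf 'f= 100.30 cm-1\nf= 100.10 cm-1\n' > "$workdir/freq"
printf 'f= 100.00 cm-1\nf= 100.00 cm-1\n' > "$workdir/freq.ref"

diff_value=$(max_abs_diff "$workdir/freq" "$workdir/freq.ref" 2)
if within_tolerance "$diff_value" 0.25; then
    echo "OK: max diff ${diff_value} is within tolerance"
else
    echo "ERROR: max diff ${diff_value} exceeds tolerance 0.25"
fi
rm -rf "$workdir"
```

Running the same helpers on the real freq and freq.ref (or energy_outcar and energy_outcar.ref) files would show how far each run is from the reference, which is handy when comparing builds.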

merzuk.kaltak
Administrator
Posts: 282
Joined: Mon Sep 24, 2018 9:39 am

#2 Post by merzuk.kaltak » Mon Sep 07, 2020 12:36 pm

Most probably these tests fail because you are running the testsuite with more than 8 MPI ranks (in total).
We have successfully run the testsuite (on 1, 2, 4, 6, and 8 MPI ranks) using executables built with various toolchains as mentioned here.
Do any tests fail when you run with 1, 2, 4, 6 or 8 MPI ranks?

#3 Post by rkingsbury » Wed Sep 09, 2020 2:57 am

Thank you for your reply. I get these failures using either 4 or 8 MPI ranks (via `srun -N 1 -n 8 -c 32`, for example).

#4 Post by merzuk.kaltak » Wed Sep 09, 2020 7:30 am

Please attach ./testsuite/testsuite.log (or the stdout) together with your makefile.include as a zip file. Also, which compiler suite and libraries do you use?

#5 Post by rkingsbury » Wed Sep 09, 2020 5:53 pm

Please see attached archive. In this instance I ran only the `SiC_HSE06_ALGO=A_RPR` test. I or a colleague will reply later on with the compiler details.
2020-09-09.zip

#6 Post by rkingsbury » Wed Sep 09, 2020 6:17 pm

Regarding compiler details, we are on Xeon Phi (Knights Landing) CPUs (https://docs.nersc.gov/systems/cori/#knl-compute-nodes) and use the ifort compiler, version 19.0.3.199 20190206.

#7 Post by rkingsbury » Wed Sep 23, 2020 7:28 pm

Hello, can you offer any further guidance for troubleshooting this failure? Please note that I submitted our makefile.include to the moderator. Thank you!

#8 Post by merzuk.kaltak » Wed Sep 30, 2020 11:59 am

Could you run this specific SiC_HSE test on 2, 4 and 6 MPI ranks and post the testsuite.log and the OUTCAR (located in ./testsuite/tests/SiC_HSE_RPR/) of these runs?
Also, I would recommend compiling VASP with the same compiler suite for alternative hardware, preferably an ordinary Xeon (not Phi), and running only this HSE test.
To run a specific test, put this line into your batch script:

Code:

export VASP_TESTSUITE_TESTS="SiC_HSE SiC_HSE_RPR"
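
To repeat the run on 2, 4 and 6 MPI ranks, the launcher string can be generated per rank count. The sketch below is a dry run that only prints the settings and launches nothing. The `srun` options are copied from the first post, `launcher_for_ranks` is a hypothetical helper, the name of the copied log file is illustrative, and `make test` is an assumption about how the testsuite is started at your site. The test names must match the directory names under ./testsuite/tests/.

```shell
#!/bin/bash
# Dry-run sketch: prints the settings for each rank count and launches nothing.

# Hypothetical helper: build the launcher string for a given number of MPI ranks.
# The srun options (-N 1, -c 32, --cpu_bind=cores) are taken from the first post.
launcher_for_ranks() {
    local ranks="$1"
    echo "srun -N 1 -n ${ranks} -c 32 --cpu_bind=cores vasp_std"
}

# Test selection as given in the export line above.
tests="SiC_HSE SiC_HSE_RPR"

for ranks in 2 4 6; do
    echo "# --- run with ${ranks} MPI ranks ---"
    echo "export OMP_NUM_THREADS=1"
    echo "export VASP_TESTSUITE_EXE_STD='$(launcher_for_ranks "$ranks")'"
    echo "export VASP_TESTSUITE_TESTS=\"${tests}\""
    echo "make test   # assumption: adapt to how you normally start the testsuite"
    echo "cp testsuite/testsuite.log testsuite.log.${ranks}ranks"
done
```

Keeping a separate copy of testsuite.log for each rank count makes it easy to see whether the failure depends on the number of ranks.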
Also, it could be helpful to reduce the compiler optimization. From your post I have extracted the following makefile.include (which I think is not the complete one):

Code:

CPP_OPTIONS= -DHOST=\"VASP6.1.1-r2SCAN-MP\"\
             -DMPI -DMPI_BLOCK=8000 -Duse_collective \
             -DscaLAPACK \
             -DCACHE_SIZE=4000 \
             -Davoidalloc \
             -Dvasp6 \
             -Duse_bse_te \
             -Dtbdyn \
             -Dfock_dblbuf \
             -D_OPENMP \
             -Duse_shmem \
             -Dshmem_bcast_buffer \
             -Dshmem_rproj \
             -Dmemalign64 \
             -D_OPENMP45 -DSIMD512 \
             -DIntelKNL \
             -DVASP2WANNIER90v2 \
             -Dlibbeef \
             -DPROFILING
CPP        = fpp -f_com=no -free -w0  $*$(FUFFIX) $*$(SUFFIX) $(CPP_OPTIONS)
FC         = ftn -qopenmp
FCL        = ftn -qopenmp #-mkl
FREE       = -free -names lowercase
For instance, I would recompile VASP (after a `make veryclean`) without the `-DSIMD512` option and run the test again.
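
One way to try this without editing the original file by hand is to write a copy of makefile.include with the flag removed. This is only a sketch: `strip_cpp_flag` is a hypothetical helper, and the three sample lines are taken from the fragment quoted above rather than from a complete makefile.include.

```shell
#!/bin/bash
# Sketch only: strip_cpp_flag is a hypothetical helper, not a VASP tool.

# Write a copy of SRC to DST with every occurrence of FLAG removed,
# together with the whitespace in front of it. SRC is left untouched.
strip_cpp_flag() {
    local flag="$1" src="$2" dst="$3"
    sed -e "s/[[:space:]]*${flag}//g" "$src" > "$dst"
}

# Demo on three lines copied from the CPP_OPTIONS fragment above.
workdir=$(mktemp -d)
cat > "$workdir/makefile.include" <<'EOF'
             -Dmemalign64 \
             -D_OPENMP45 -DSIMD512 \
             -DIntelKNL \
EOF

strip_cpp_flag "-DSIMD512" "$workdir/makefile.include" "$workdir/makefile.include.nosimd"
cat "$workdir/makefile.include.nosimd"
rm -rf "$workdir"
```

The simple pattern would also match a longer flag that merely starts with the same text, so it is worth checking the result before copying it over makefile.include and rebuilding after `make veryclean`.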

#9 Post by rkingsbury » Wed Oct 14, 2020 10:11 pm

Thank you for the advice. Recompiling without the `-DSIMD512` option allowed the binary to pass all tests in the 'fast' test suite with 8 MPI ranks.

Can you elaborate on how the absence of this flag is expected to affect performance, or why it would be related to the HSE test failures?
