Problems running VASP: crashes, internal errors, "wrong" results.
#1
by vladimir.ladygin » Tue Feb 25, 2025 1:15 am
Dear VASP Developers,
I hit what VASP itself reports as a bug while trying to run an NCCL setup with one core per GPU on NERSC Perlmutter. This is just an ordinary relaxation calculation.
"""internal error in: mpi.F at line: 903
M_init_nccl: Error in ncclCommInitRank
If you are not a developer, you should not encounter this problem.
Please submit a bug report.
"""
Kind Regards,
Vladimir
-
ferenc_karsai
- Global Moderator

- Posts: 530
- Joined: Mon Nov 04, 2019 12:44 pm
#2
by ferenc_karsai » Tue Feb 25, 2025 9:48 am
Thanks for the report. I will try to reproduce the error on our machines.
#3
by ferenc_karsai » Wed Feb 26, 2025 12:30 pm
I talked to a colleague, and he has seen a similar bug reported by another user on the forum.
Here is the original post:
https://www.vasp.at/forum/viewtopic.php?t=19822
For the moment, the workaround is not to use NCCL: compile VASP without -DUSENCCL in makefile.include.
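A minimal sketch of the workaround above, assuming -DUSENCCL appears as a whitespace-separated token on a non-comment line of makefile.include. The helper name strip_nccl_flag and the sample contents (including the made-up flag -DUSENCCL_EXAMPLE) are hypothetical and only illustrate the idea; they are not taken from the VASP distribution.

```python
import re

# Match -DUSENCCL only as a complete whitespace-delimited token, so that
# longer flags which merely start with the same characters are left alone.
_NCCL_FLAG = re.compile(r"(?<!\S)-DUSENCCL(?!\S)")


def strip_nccl_flag(makefile_text: str) -> str:
    """Return the makefile text with the -DUSENCCL token removed.

    Comment lines (starting with '#') are passed through unchanged.
    """
    out = []
    for line in makefile_text.splitlines(keepends=True):
        if line.lstrip().startswith("#"):
            out.append(line)
        else:
            out.append(_NCCL_FLAG.sub("", line))
    return "".join(out)


# Hypothetical excerpt; a real makefile.include will look different.
sample = (
    "CPP_OPTIONS = -DMPI \\\n"
    "              -DUSENCCL \\\n"
    "              -DUSENCCL_EXAMPLE\n"
)
cleaned = strip_nccl_flag(sample)
print(cleaned)
```

To apply this to a real build, one would read makefile.include, write the cleaned text back (keeping a backup copy), and then recompile VASP. A clean rebuild is probably the safest choice, since preprocessor flags affect many source files.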