Excessive memory use in FF training
Hello,
I am running into memory problems during force field training which, interestingly, do not seem to arise from the MLFF part. I am doing a continuation run with a new phase: previously I trained on bulk water, and now I am adding bulk Zn to the data set. Water needs a large basis set, so the memory requirement of the ML part is sizable at 5719.9 MB, according to the estimate in the ML_LOGFILE. But this number is dwarfed by the requirements of the DFT part. According to the OUTCAR, the wavefunctions take an excessive amount of memory:
Code:
 total amount of memory used by VASP MPI-rank0  1119160. kBytes
=======================================================================
   base      :      30000. kBytes
   nonlr-proj:      14083. kBytes
   fftplans  :       6687. kBytes
   grid      :       9436. kBytes
   one-center:         62. kBytes
   wavefun   :    1058892. kBytes
There are 96 Zn atoms in the cell and 71 k-points in the IBZ. Why do the wavefunctions take so much memory? In a separate DFT relaxation with 64 Zn atoms and the same k-point density in the automatic k-point generation scheme (32 k-points in the IBZ), the wavefunctions took up 110929 kBytes, one order of magnitude less.
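My rough understanding of where that number should come from is the usual plane-wave storage estimate, sketched below. The NBANDS and NPW values are purely illustrative placeholders (I have not taken them from the actual OUTCARs), and the OUTCAR figure is per MPI rank, so the parallel distribution of bands and coefficients matters too:

Code:
# Back-of-envelope wavefunction storage: 16 bytes (complex double) per
# plane-wave coefficient, per band, per irreducible k-point, times 2
# for ISPIN=2. NBANDS and NPW below are illustrative guesses only.
def wavefun_gib(nkpts_ibz, nbands, npw, ispin=1):
    return ispin * nkpts_ibz * nbands * npw * 16 / 1024**3

# e.g. 71 k-points, ~600 bands, ~100000 plane waves per band:
print(f"{wavefun_gib(71, 600, 100_000):.0f} GiB")  # ~63 GiB in total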
For comparison: at first I did the MLFF training with the wrong k-point density, only 4 k-points in the IBZ. The memory requirement was then much lower and the training worked, but the error was of course too large:
Code:
 total amount of memory used by VASP MPI-rank0   120276. kBytes
=======================================================================
   base      :      30000. kBytes
   nonlr-proj:      14808. kBytes
   fftplans  :       7165. kBytes
   grid      :      10091. kBytes
   one-center:         62. kBytes
   wavefun   :      58150. kBytes
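If I compare the two runs of the same 96-atom cell against each other, the wavefunction memory does at least scale linearly with the number of irreducible k-points, as I would expect (a quick sanity check using only the wavefun numbers quoted above):

Code:
# Same 96-Zn cell, only the k-mesh differs, so the wavefunction memory
# should scale linearly with the number of irreducible k-points.
ratio_expected = 71 / 4            # k-point ratio: ~17.8
ratio_observed = 1058892 / 58150   # wavefun kBytes from the two OUTCARs: ~18.2
print(f"expected ~{ratio_expected:.1f}x, observed ~{ratio_observed:.1f}x")

So the k-point scaling itself looks consistent; my question is about the absolute number.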
Is the memory allotment in the high k-point density case "correct"? Can I do something about it other than reducing the number of k-points (and using more nodes)? 1100 GB of memory consumption seems ridiculous for such a modest cell.