Support and discussions for Molcas and OpenMolcas users and developers
Hello!
I am in the process of shifting from working on a cluster to a supercomputer. I have compiled OpenMolcas on both. However, I quickly noticed that Seward is considerably faster on the cluster than it is on the supercomputer. I ran a simple test with a 45-atom molecule in serial (single thread) with cc-pVDZ (445 basis functions). On the cluster, the Seward module finishes in 23 minutes, while on the supercomputer it takes 2 hours 25 minutes. The reading/writing of ORDINT and TEMP01 appears to be the bottleneck.
I am trying to understand why this is happening, to see if there is a way to improve the performance on the supercomputer.
These are the specs of the nodes that I am using on each of the systems:
Cluster (Seward is 6 times faster here):
1 × Intel Xeon Gold 6130 processor (22 MB cache, 2.1 GHz, 16 cores)
Memory: 96 GB
Every compute node in the system contains a local disk. These disks are much more efficient than the home file system and are only accessible from within the node itself.
The scratch file system is located on this local disk.
Supercomputer:
2 × Intel Xeon E5-2690 v3 processors (30 MB cache, 2.6 GHz, 12 cores)
Memory: 64 GB
"/scratch-local/" behaves as if it were local to each node, whereas "/scratch-shared/" denotes the same location on every node. In fact, however, not even the /scratch-local/ directories are truly (physically) local.
Cluster (23 minutes):
configuration info
------------------
C Compiler ID: GNU
C flags: -std=gnu99 -fopenmp
Fortran Compiler ID: GNU
Fortran flags: -cpp -fno-aggressive-loop-optimizations -fdefault-integer-8 -fopenmp
Definitions: _MOLCAS_;_I8_;_LINUX_;_GA_;_MOLCAS_MPP_;SCALAPACK;_MKL_
Parallel: ON (GA=ON)
&GATEWAY
coord=$CurrDir/Geom.xyz
basis=cc-pVDZ
group=C1
&SEWARD
++ I/O STATISTICS
I. General I/O information
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Unit Name Flsize Write/Read MBytes Write/Read
(MBytes) Calls In/Out Time, sec.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 RUNFILE 16.20 . 1316/ 8224 . 25.2/ 89.9 . 0/ 0
2 NQGRID 0.00 . 2/ 0 . 0.0/ 0.0 . 0/ 0
3 ONEINT 16.50 . 44/ 889 . 47.6/ 28.4 . 0/ 0
4 ORDINT 42801.78 . 5079503/ 480480 . 156050.0/ 120120.0 . 592/ 16
5 TEMP01 17965.38 . 4742742/ 143723 . 35930.3/ 17965.4 . 22/ 3
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
* TOTAL 60799.86 . 9823607/ 633316 . 192053.1/ 138203.7 . 615/ 20
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
II. I/O Access Patterns
- - - - - - - - - - - - - - - - - - - -
Unit Name % of random
Write/Read calls
- - - - - - - - - - - - - - - - - - - -
1 RUNFILE 28.6/ 10.6
2 NQGRID 50.0/ 0.0
3 ONEINT 93.2/ 1.7
4 ORDINT 65.0/ 64.4
5 TEMP01 63.0/ 100.0
- - - - - - - - - - - - - - - - - - - -
--
--- Stop Module: seward at Sun Mar 29 18:29:04 2020 /rc=_RC_ALL_IS_WELL_ ---
--- Module seward spent 23 minutes 17 seconds ---
Supercomputer (2 hours 25 minutes):
configuration info
------------------
C Compiler ID: GNU
C flags: -std=gnu99 -fopenmp
Fortran Compiler ID: GNU
Fortran flags: -fno-aggressive-loop-optimizations -cpp -fdefault-integer-8 -fopenmp
Definitions: _MOLCAS_;_I8_;_LINUX_;_GA_;_MOLCAS_MPP_;SCALAPACK;_MKL_
Parallel: ON (GA=ON)
++ --------- Input file ---------
&GATEWAY
coord=$CurrDir/Geom.xyz
basis=cc-pVDZ
group=C1
&SEWARD
-- ----------------------------------
++ I/O STATISTICS
I. General I/O information
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Unit Name Flsize Write/Read MBytes Write/Read
(MBytes) Calls In/Out Time, sec.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 RUNFILE 16.20 . 1428/ 8775 . 26.0/ 93.9 . 0/ 0
2 NQGRID 0.00 . 2/ 0 . 0.0/ 0.0 . 0/ 0
3 ONEINT 23.31 . 62/ 889 . 66.6/ 28.4 . 0/ 0
4 ORDINT 41057.53 . 5285395/ 471585 . 155504.2/ 117896.2 . 723/ 5091
5 TEMP01 18804.12 . 4964239/ 150433 . 37608.1/ 18804.1 . 128/ 1853
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
* TOTAL 59901.17 .10251126/ 631682 . 193204.8/ 136822.7 . 852/ 6944
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
II. I/O Access Patterns
- - - - - - - - - - - - - - - - - - - -
Unit Name % of random
Write/Read calls
- - - - - - - - - - - - - - - - - - - -
1 RUNFILE 28.6/ 10.6
2 NQGRID 50.0/ 0.0
3 ONEINT 95.2/ 1.7
4 ORDINT 57.3/ 65.2
5 TEMP01 54.8/ 100.0
- - - - - - - - - - - - - - - - - - - -
--
--- Stop Module: seward at Mon Mar 30 03:02:25 2020 /rc=_RC_ALL_IS_WELL_ ---
--- Module seward spent 2 hours 25 minutes 22 seconds ---
From looking at this, my thoughts are that either
(A) the disk being truly local on the cluster provides a very substantial I/O performance boost, or
(B) there is something else that I might be able to change in order to boost the performance.
If someone has any tips on how I can navigate this issue, I would appreciate it very much! My experience with this is very limited, so even pointers to resources for learning the basics of I/O performance and benchmarking would be helpful.
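(As a first sanity check, a rough comparison of the raw scratch throughput on the two systems with dd might already be telling; just a sketch, using direct I/O to bypass the page cache, with a placeholder path:)

# write 4 GiB to the scratch file system, then read it back
dd if=/dev/zero of=/scratch-local/$USER/ddtest bs=1M count=4096 oflag=direct
dd if=/scratch-local/$USER/ddtest of=/dev/null bs=1M iflag=direct
rm /scratch-local/$USER/ddtest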
Thank you!
Max
Of course, a truly local scratch directory is what you want for reasonable efficiency. A non-local scratch directory is asking for trouble. As a possible workaround you could use RICD, which will reduce the I/O, but you'll still have an I/O problem with non-local scratch.
Thank you Ignacio. I have done some experimenting with RICD, and it has worked very well for me in XMS-CASPT2 single-point calculations. However, I am running into problems when trying to optimize MECPs. During my first attempt, Alaska triggered the numerical gradients. I then added "DoAnalytical" to Seward, and now Alaska calls MCLR, but on the second Alaska call it freezes (the last message that prints is: "A total of 11606515. entities were prescreened and 11606515. were kept."; this message does not appear without RICD). I have left the calculation running for over 10 hours now and it seems stuck (in my previous optimizations on the cluster, these Alaska calls without RICD took 5 minutes).
This is my input:
&GATEWAY
Coord=Geom.xyz
Basis=cc-pVDZ
Group=NoSym
RICD
Constraints
a = Ediff 1 2
Value
a = 0.000
End of Constraints
>>> EXPORT MOLCAS_MAXITER=300
>> Do While
&SEWARD
DoAnalytical
&RASSCF
Spin=1
Charge=0
CIRoot = 2 2 1
&ALASKA
PNEW
&SLAPAF
>>> EndDo
Do you have any suggestions as to what might be going on?
Thanks again!
Max
It could have the same I/O problems as SEWARD.
Thanks! Too bad things aren't as simple as I had hoped... Still, it is a lot of fun to learn. I got in touch with the sysadmins and they confirmed that the problem is due to the non-locality of the scratch and the heavy load on it. However, they did suggest a few ways to potentially overcome this issue. One of the main things they suggest is that I look into their "Lustre striping" setup:
"The Lustre file system uses 48 OSTs (=Object Storage Targets), each with multiple disks, to store all the data in parallel. The OSTs are connected with InfiniBand to the compute nodes. By default, the Lustre file system stores each file on a single OST, which works quite well for most situations where files are accessed from a single process and files are relatively small (<10GB). However, when very large files need to be read or written in parallel from multiple nodes, Lustre striping is needed to reach a good performance."
I am still not sure whether this would only improve performance when running on multiple nodes, or whether I can also use single-node multi-threading to access the multiple OSTs in parallel and improve the I/O performance. I am also not sure yet whether MOLCAS can use this system at all. I'll do some more reading/testing and hopefully report back with some positive news.
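(From what I gather so far, the striping is set per directory with the lfs tool before any files are created in it; a rough sketch, assuming a standard Lustre setup, with a placeholder stripe count and path:)

# make new files in this scratch directory stripe across 8 OSTs (placeholder count)
lfs setstripe -c 8 /scratch-shared/$USER/molcas_scratch
# show the default layout that new files in the directory will inherit
lfs getstripe -d /scratch-shared/$USER/molcas_scratch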