Molcas Forum

Support and discussions for Molcas and OpenMolcas users and developers

#1 2016-02-24 15:28:41

pgirard
Member
Registered: 2016-02-24
Posts: 2

Precompiled parallel version of MOLCAS doesn't work with 2 nodes

Hello,

Before compiling and installing the source version with the Intel compiler, I decided to install and test the precompiled version "molcas-8.0sp1_OpenMPI-1.6.5_CentOS-6.6" on our cluster (x86_64, CentOS 6.7).

We use the OAR batch system, which relies on a wrapper for ssh when running an OpenMPI job, so I set OMPI_MCA_orte_rsh_agent in the job accordingly:
export OMPI_MCA_orte_rsh_agent=/usr/bin/oarsh

I configured molcas.rte to take into account the OAR node file:

# runtime environment for molcas
OS='Linux-x86_64'
PARALLEL='yes'
DEFMOLCASMEM='2048'
DEFMOLCASDISK='20000'
RUNSCRIPT='$program  $input'
RUNBINARY='mpirun -machinefile $OAR_NODEFILE -np $CPUS $program'
RUNBINARYSER='$program'
PAREXEC=''

I launched a test (taken from the MOLCAS tests) on one node using 4 CPUs (cores). It worked perfectly and produced an output. When I tried with 2 nodes, using 2 cores on each node, the job got stuck.

By accessing the nodes on which the job was running, I saw:
- on the master node, the right number of parnell.exe processes
- on the second node, just the OMPI orted process

I made a test by adding a wrapper in the RUNBINARY just to see what was launched by MOLCAS on the second node:
/opt/MOLCAS/molcas-8.0sp1_OpenMPI-1.6.5_CentOS-6.6/bin/parnell.exe base /nfs_scratch/cgirpi/MOLCAS/227641

As far as I understood, this command exited immediately with exit code 0, without any error message.
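
A wrapper of this kind might look like the following minimal sketch. The function name log_run and the log file location are hypothetical, chosen only for illustration; the idea is to record on each node which command was started and the exit code it returned, then pass that exit code back to mpirun.

```shell
#!/bin/sh
# Sketch of a logging wrapper (log_run is a hypothetical name).
# Usage: log_run LOGFILE COMMAND [ARGS...]
log_run() {
    logfile=$1
    shift
    # Record the host and the exact command line before starting it.
    echo "$(hostname): running: $*" >> "$logfile"
    "$@"
    status=$?
    # Record the exit code so that a silent early exit becomes visible.
    echo "$(hostname): exit code: $status" >> "$logfile"
    return "$status"
}

# In a wrapper script placed in front of $program in RUNBINARY, the last
# line could be (the log path is an assumption):
#   log_run "/tmp/molcas_wrapper.log" "$@"
```

Writing the log to a node-local path such as /tmp keeps the entries of each node apart; a shared directory would also work because each line is prefixed with the host name.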

I have no problem running other MPI programs on several nodes with our OpenMPI-1.6.5, such as ORCA or some simple MPI programs I developed for test purposes.

I cannot figure out why it doesn't work with MOLCAS. Is there any MOLCAS environment variable needed on the second node? I tried exporting $MOLCAS, $PATH, etc. with the -x option of mpirun, but it didn't do the trick.

Any ideas about this problem? Is there a way to increase the verbosity level of MOLCAS?

Thanks a lot in advance,

Cheers

Pierre (computer engineer, not chemist)

#2 2016-02-24 17:11:27

niko
Member
From: Marseille
Registered: 2015-11-08
Posts: 59

Re: Precompiled parallel version of MOLCAS doesn't work with 2 nodes

Hi,
I confirm Molcas v8.0 cannot run in parallel using different nodes on our cluster (CentOS, OAR batch scheduler, compiled with gcc 4.9.2 and OpenMPI 1.8.3). If I'm right, this issue has been solved only in the development version.

#3 2016-02-24 18:03:27

pgirard
Member
Registered: 2016-02-24
Posts: 2

Re: Precompiled parallel version of MOLCAS doesn't work with 2 nodes

Thanks, Niko, for this information. I won't spend more time on this, then, and will wait for the fix. I will also check whether the same problem occurs when compiling with Intel MPI, which is my final target.

Cheers

Pierre

#4 2017-01-06 07:31:12

caofan_success
Member
Registered: 2017-01-06
Posts: 1

Re: Precompiled parallel version of MOLCAS doesn't work with 2 nodes

niko wrote:

Hi,
I confirm Molcas v8.0 cannot run in parallel using different nodes on our cluster (CentOS, OAR batch scheduler, compiled with gcc 4.9.2 and OpenMPI 1.8.3). If I'm right, this issue has been solved only in the development version.

Hello, can Molcas 8.0 run in parallel on a single PC with one 4-core CPU? I configured the molcas.rte file as follows:

# runtime environment for molcas
OS='Linux-x86_64'
PARALLEL='yes'
DEFMOLCASMEM='2048'
DEFMOLCASDISK='20000'
RUNSCRIPT='$program  $input'
RUNBINARY='/home/openmpi/bin/mpirun -np $CPUS $program'
RUNBINARYSER='$program'
PAREXEC=''

and the following error was written to the output file:
parnell.exe: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

Can this be fixed?
Thank you very much!

#5 2017-01-12 14:55:05

niko
Member
From: Marseille
Registered: 2015-11-08
Posts: 59

Re: Precompiled parallel version of MOLCAS doesn't work with 2 nodes

Probably your OpenMPI environment is not correctly sourced at runtime. Check your LD_LIBRARY_PATH env variable.
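
One way to run this check is sketched below. It assumes that the OpenMPI libraries sit in /home/openmpi/lib, which is only a guess based on the mpirun path quoted above, and it uses a hypothetical helper named missing_libs. The helper filters the output of ldd and prints every shared library the dynamic loader cannot resolve.

```shell
#!/bin/sh
# Sketch: print the names of unresolved shared libraries.
# Reads `ldd` output on stdin (missing_libs is a hypothetical helper name).
missing_libs() {
    # ldd reports an unresolved library as "libfoo.so.N => not found".
    awk '/=> not found/ { print $1 }'
}

# Typical use; the library directory is an assumption, adjust it to the
# actual OpenMPI installation:
#   export LD_LIBRARY_PATH=/home/openmpi/lib:$LD_LIBRARY_PATH
#   ldd "$MOLCAS/bin/parnell.exe" | missing_libs
```

If the command prints nothing after LD_LIBRARY_PATH has been set, the loader can find every library parnell.exe needs. If libmpi.so.0 is still listed, the installed OpenMPI may provide a different library version than the one the binary was linked against.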
