This issue is very similar to the one described in "Failed to create symbolic link".
I am at the very end of a docking run of 108M ligands. The partition I am using allows only 10 jobs per user (SLURM) and has a time limit of 3 days. Therefore, since starting the run, I have been issuing
“/vf_continue_all.sh templates/template1.slurm.sh 1”
every 3 days or so. This worked fine until ~103M ligands. Recently, however, when I issue the same command, 8 out of the 10 joblines fail with an error (job output below), while 2 joblines continue without any problem. Is there a way to speed up the docking of these last 4M ligands?
all.ctrl
job_letter=t
batchsystem=SLURM
partition=shared
timelimit=3-00:00:00
steps_per_job=1
cpus_per_step=24
queues_per_step=24
cpus_per_queue=24
central_todo_list_splitting_size=10000
ligands_todo_per_queue=100000
ligands_per_refilling_step=1000
collection_folder=…/input-files/ligand-library
ligand_library_format=pdbqt
minimum_time_remaining=10
dispersion_time_min=3
dispersion_time_max=10
verbosity_commands=standard
verbosity_logfiles=standard
store_queue_log_files=all_compressed_error_uncompressed
keep_ligand_summary_logs=true
error_sensitivity=normal
error_response=fail
tempdir_default=/scratch/users/XXX/tmp
tempdir_fast=/dev/shm
outputfiles_level=collection
prepare_queue_todolists=true
docking_scenario_names=qvina02_rigid_receptor1
docking_scenario_programs=qvina02
docking_scenario_replicas=2
docking_scenario_inputfolders=…/input-files/qvina02_rigid_receptor1
stop_after_next_check_interval=false
ligand_check_interval=10
stop_after_collection=false
stop_after_job=false
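To make the resource settings above easier to reason about, here is a minimal Python sketch (a helper of my own, not part of VirtualFlow) that reads the key=value lines of a control file and estimates how many queues run at once. The function names `parse_ctrl` and `concurrent_queues` are hypothetical, and the formula assumes that each job runs `steps_per_job` steps with `queues_per_step` queues each, which matches the queue names 6-1-1 to 6-1-24 in the job output.

```python
def parse_ctrl(text):
    """Parse key=value lines into a dict; values are kept as strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything without a key=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def concurrent_queues(settings, max_jobs):
    """Assumed formula: jobs x steps per job x queues per step."""
    steps = int(settings["steps_per_job"])
    queues = int(settings["queues_per_step"])
    return max_jobs * steps * queues


# Excerpt of the settings shown above.
sample = """\
steps_per_job=1
cpus_per_step=24
queues_per_step=24
"""

settings = parse_ctrl(sample)
# max_jobs=10 is the per-user job limit of the partition mentioned above.
print(concurrent_queues(settings, max_jobs=10))
```

With the values above this gives 10 x 1 x 24 = 240 queues when all 10 joblines are running, so 8 failing joblines would leave only 48 queues working.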
Job output of one of the joblines that failed
Job Output
===========================================================
- Preparing the to-do lists for jobline 6
Starting the (re)filling of the todolists of the queues.
Before (re)filling the todolists the queue 6-1-1 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-2 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-3 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-4 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-5 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-6 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-7 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-8 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-9 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-10 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-11 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-12 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-13 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-14 had 61752 ligands todo distributed in 60 collections.
Before (re)filling the todolists the queue 6-1-15 had 63852 ligands todo distributed in 61 collections.
Before (re)filling the todolists the queue 6-1-16 had 56231 ligands todo distributed in 58 collections.
Before (re)filling the todolists the queue 6-1-17 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-18 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-19 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-20 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-21 had 100756 ligands todo distributed in 94 collections.
Before (re)filling the todolists the queue 6-1-22 had 100959 ligands todo distributed in 91 collections.
Before (re)filling the todolists the queue 6-1-23 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-24 had 0 ligands todo distributed in 0 collections.
The ligand-collections/todo/todo.all (if existent) did not meet the requirements for continuation (trial 1).
The ligand-collections/todo/todo.all (if existent) did not meet the requirements for continuation (trial 2).
The ligand-collections/todo/todo.all (if existent) did not meet the requirements for continuation (trial 3).
The next todo list will be used (todo.all.0009)
ln: failed to create symbolic link ‘…/…/workflow/ligand-collections/todo/todo.all.locked’: File exists
The next todo list will be used (todo.all.0000)
There is no more ligand collection in the todo.all file. Stopping the refilling procedure.
Starting job step 1 on host compute0107.
Job step 1 is starting queue 6-1-1 on host compute0107.
Job step 1 is starting queue 6-1-2 on host compute0107.
Job step 1 is starting queue 6-1-3 on host compute0107.
Job step 1 is starting queue 6-1-4 on host compute0107.
Job step 1 is starting queue 6-1-5 on host compute0107.
Job step 1 is starting queue 6-1-6 on host compute0107.
Job step 1 is starting queue 6-1-7 on host compute0107.
Job step 1 is starting queue 6-1-8 on host compute0107.
Job step 1 is starting queue 6-1-9 on host compute0107.
Job step 1 is starting queue 6-1-10 on host compute0107.
Job step 1 is starting queue 6-1-11 on host compute0107.
Job step 1 is starting queue 6-1-12 on host compute0107.
Job step 1 is starting queue 6-1-13 on host compute0107.
Job step 1 is starting queue 6-1-14 on host compute0107.
Job step 1 is starting queue 6-1-15 on host compute0107.
Job step 1 is starting queue 6-1-16 on host compute0107.
Job step 1 is starting queue 6-1-17 on host compute0107.
Job step 1 is starting queue 6-1-18 on host compute0107.
Job step 1 is starting queue 6-1-19 on host compute0107.
Job step 1 is starting queue 6-1-20 on host compute0107.
Job step 1 is starting queue 6-1-21 on host compute0107.
Job step 1 is starting queue 6-1-22 on host compute0107.
Job step 1 is starting queue 6-1-23 on host compute0107.
Job step 1 is starting queue 6-1-24 on host compute0107.
Error was trapped
Error in bash script one-step.sh
Error on line 187
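Since the job output above is long, here is a minimal Python sketch (again a helper of my own, not part of VirtualFlow) that summarizes such a log: it collects the per-queue todo counts from the "Before (re)filling" lines and picks out any "failed to create symbolic link" messages. The function name `summarize_job_output` is hypothetical, and the regular expression assumes the exact wording of the lines shown above.

```python
import re

# Matches e.g. "... the queue 6-1-14 had 61752 ligands todo distributed in 60 collections."
QUEUE_LINE = re.compile(
    r"the queue (\S+) had (\d+) ligands todo distributed in (\d+) collections"
)


def summarize_job_output(text):
    """Return per-queue todo counts, empty queues, the total, and lock errors."""
    queues = {}
    lock_errors = []
    for line in text.splitlines():
        # findall also copes with two log messages fused onto one line.
        for queue_id, ligands, collections in QUEUE_LINE.findall(line):
            queues[queue_id] = (int(ligands), int(collections))
        if "failed to create symbolic link" in line:
            lock_errors.append(line.strip())
    return {
        "queues": queues,
        "empty_queues": [q for q, (n, _) in queues.items() if n == 0],
        "ligands_todo": sum(n for n, _ in queues.values()),
        "lock_errors": lock_errors,
    }


# Two lines copied from the job output above, as a small example.
sample = """\
Before (re)filling the todolists the queue 6-1-13 had 0 ligands todo distributed in 0 collections.
Before (re)filling the todolists the queue 6-1-14 had 61752 ligands todo distributed in 60 collections.
"""

summary = summarize_job_output(sample)
print(summary["empty_queues"], summary["ligands_todo"])
```

Applied to the full output of jobline 6, the counts are: 19 of the 24 queues had nothing to do before refilling, and the other 5 queues (14, 15, 16, 21, 22) held 383550 ligands between them.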
Environment variables