How to deal with insufficient tempdir size for more than 10 million compounds

y-komuro · January 6, 2024, 1:39pm

Dear VirtualFlow community,

As far as I understand, VirtualFlow first copies the compounds used for docking (input-files/ligand-library) to the directory specified by tempdir_default and extracts them there.

With more than 10 million compounds, even after expanding /dev/shm to 712GB, we get "No space left on device" and not all ligands can be extracted.

So we changed tempdir_default to /tmp and reserved 1TB of capacity. Only about 550GB of it is used (achieving a write speed of 1GB/s), and CPU utilization on each node is only about 15%.
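For anyone wanting to compare the two locations on a compute node, the capacity and current usage of each candidate directory can be read with a short script. This is only a minimal sketch using the Python standard library; the helper name `report_usage` and the list of paths are illustrative assumptions and are not part of VirtualFlow.

```python
import shutil


def report_usage(path):
    """Return (total, used, free) in GiB for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return tuple(round(value / gib, 1) for value in (usage.total, usage.used, usage.free))


if __name__ == "__main__":
    # Candidate locations for tempdir_default (assumed paths; adjust as needed).
    for path in ("/dev/shm", "/tmp"):
        try:
            total, used, free = report_usage(path)
        except OSError as exc:
            print(f"{path}: not available ({exc})")
            continue
        print(f"{path}: total={total} GiB, used={used} GiB, free={free} GiB")
```

Running this on a node before and during a job shows how close each directory is to its limit.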

Could you please advise on specific settings (e.g. parameters in all.ctrl and the storage size for /dev/shm) that would give high performance when dealing with 10 million+ ligands? Thank you in advance. We believe VirtualFlow is an innovative tool in drug discovery.

Attached are the results of the study to date in AWS ParallelCluster using Slurm.

Sorin · February 19, 2024, 6:20am

Hi y-komuro,

Is the storage medium behind the 1TB an HDD? If so, it would be highly recommended to switch to an SSD.

Kind regards,

Sorin

y-komuro · February 19, 2024, 9:27am

Hi Sorin,

Thank you for your reply!
The storage medium behind the 1TB is an SSD, and I changed the disk settings as follows:
Scheduling section - AWS ParallelCluster (amazon.com)

ComputeSettings:
    LocalStorage:
      RootVolume:
        Size: 10000
        Iops: 16000
        Throughput: 1000