5) Generate simulation input

Command:

mlcg-tk-gen_sim_input process_sim_input --config configuration_files/trpcage_sim.yaml --config configuration_files/trpcage_priors.yaml

A trained MLCG model serves as a force field for running protein simulations. To prepare simulations of a particular system, the command above processes each structure file listed under the pdb_fns option, maps it to the specified CG resolution, generates the neighbor lists corresponding to the given prior_builders, and saves the specified number of copies of the AtomicData objects that store this information.

In contrast to traditional MD force fields, machine-learned force fields are designed to process data efficiently in batches, so running multiple simulations in parallel makes better use of the available hardware and lowers the cost per trajectory. The copies option in the configuration file sets how many replicas are saved per structure. Choose it based on the size of the system so that the batch fits in memory; finding the best-performing value may take some testing.
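The testing mentioned above can be organized as a small sweep over candidate copies values. The following is a minimal sketch, not part of mlcg-tk: the names time_per_copy, pick_copies, and dummy_step are hypothetical, and dummy_step is only a placeholder workload standing in for a short simulation run with the given number of copies.

```python
import time
from typing import Callable, Dict, Iterable


def time_per_copy(
    step_fn: Callable[[int], None],
    copies_grid: Iterable[int],
    n_repeats: int = 3,
) -> Dict[int, float]:
    """Return the best observed wall-clock seconds per copy for each candidate."""
    results: Dict[int, float] = {}
    for copies in copies_grid:
        best = float("inf")
        for _ in range(n_repeats):
            start = time.perf_counter()
            step_fn(copies)  # hypothetical: run a short test simulation
            best = min(best, time.perf_counter() - start)
        results[copies] = best / copies
    return results


def pick_copies(per_copy_cost: Dict[int, float]) -> int:
    """Pick the candidate value with the lowest cost per copy."""
    return min(per_copy_cost, key=per_copy_cost.get)


def dummy_step(copies: int) -> None:
    """Placeholder workload (assumption): replace with a real short run."""
    sum(i * i for i in range(1000 * copies))


if __name__ == "__main__":
    costs = time_per_copy(dummy_step, [1, 4, 16, 64])
    print("seconds per copy:", costs)
    print("selected copies:", pick_copies(costs))
```

In practice the step function would also need to report out-of-memory failures, so that candidates too large for the device are excluded before the cheapest remaining value is selected.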

To generate simulation input for the pretrained transferable model provided with the manuscript "Navigating protein landscapes with a machine-learned coarse-grained model", use the transferable_priors.yaml file in configuration_files so that the input configuration is built with priors consistent with those used in the manuscript. Adapt the trpcage_sim.yaml file to the protein sequence to be simulated.