GROMACS之水中部分載脂蛋白

本教程来自于李继存老师博客:http://jerkwin.coding.me/GMX/GMXtut-1/#%E6%A6%82%E8%BF%B0
特此说明

今天带和大家介绍一下gromacs最简单的教程,这个教程李老师已经组织翻译了一遍,翻译的质量非常高,我仍然炒一遍剩饭主要是想准备把自己所学的温故知新一遍,为了更加快速的进行模拟,我们将两个教程的蛋白进行了替换,即将載脂蛋白替换了溶菌酶,实际上是一样的。

首先看一下GROMACS做MD的一般流程

GROMACS MD一般流程
GROMACS MD一般流程

下载PDB文件

首先我们在PDB数据库下载小肽的PDB文件,注意我采用的为jupyter notebook编写,若在终端运行的话无需加out=!{}
例如:out=!{“ls”},在终端运行只需要为ls即可

1
out=!{"wget http://www.rcsb.org/pdb/files/1ODQ.pdb"}

生成蛋白拓扑(Topology文件)

在用GROMACS的程序对pdb文件进行处理前, 要做许多检查, 以保证pdb文件的完整性. 根据pdb文件的不同, 要进行不同的处理. 需要处理的方面包括但不限于:

-氢原子
-C端氧原子
-结晶水
-缺失原子/残基/侧链
-二硫键
-带电残基

请在进行下一步之前仔细简单是否有错误,如教程蜘蛛肽需要进行末尾蛋白的补全,如删除蛋白中本身含有的结晶水等等工作。
在本例中你需要从20个模型中获取单一模型,可以使用例如pymol进行操作。
下一步需要注意的是我们需要-ignh进行删除pdb中的氢,然后gmx进行自动加氢。我们将其修改为fws.pdb

1
!gmx pdb2gmx -f fws.pdb -ignh -o 1ODQ_processed.gro -water spce
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
····
Select the Force Field:
From '/usr/share/gromacs/top':
1: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
3: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
4: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
5: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)
6: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
8: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)
9: GROMOS96 43a1 force field
10: GROMOS96 43a2 force field (improved alkane dihedrals)
11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
14: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40,, 843-856, DOI: 10.1007/s00249-011-0700-9)
15: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)

力场包含了将要写入到拓扑文件中的信息. 这个选择非常重要! 你必须仔细了解每个力场, 并决定哪个力场最适用于你的体系. 在本教程中, 我们选gromos联合原子力场, 因此在命令提示行中输入14, 然后回车.由于在jupyter中shell交互不是很方便,所以我们采用的是在尾部<<EOF\n14\nEOF的方式进行交互,下面的情况类似,不再申明,即大家在实际操作中在shell终端输入前面的命令,后面选择的选项与后面命令类似

1
out=!{"gmx pdb2gmx -f fws.pdb -ignh -o fws_processed.gro -water spce <<EOF\n14\nEOF"}

定义单位盒子并填充溶剂

教程中是需要模拟一个简单的水溶液分子,定义模拟用的盒子并添加溶剂分为两步完成:
1.使用editconf模块定义盒子的尺寸
2.使用solvate模块想盒子中填充水
首先我们来定义盒子,本教程用的正方形盒子,其实盒子种类很多,一般使用菱形十二面体晶胞,其可以节省许多水分子,从而节省计算量。-c表示分子在盒子的中心部分。

1
out=!{"gmx editconf -f fws_processed.gro -o fws_newbox.gro -c -d 1.0 -bt cubic"}

我们可以在vmd或者pymol中查看分子的情况,在用shell交互中下面块的内容不需要输入,转而使用vmd fws_newbox.gro或者pymol fws_newbox.gro进行查看

1
2
#演示使用,真正进行学习时直接vmd或者pymol查看
from IPythonTweaks import *
1
2
#演示使用
pymolPlotStructure("fws_newbox.gro")
output_11_0.png
output_11_0.png

之后我们进行溶剂分子的增加

1
out=!{"gmx solvate -cp fws_newbox.gro -cs spc216.gro -o fws_solv.gro -p topol.top"}

同样我们也可以进行可视化查看

1
2
#演示使用
pymolPlotStructure("fws_solv.gro")
output_15_0.png
output_15_0.png

增加离子

现在我们已经有了一个带电荷的溶液体系,其实在pdb2gmx程序的输出文件会显示带了多少电荷,如果没有查看到我们可以查看一下topol.top文件中[ atoms ]指令的的最后一行, 它应该含有qtot 0这一信息. 由于生命体系中不存在净电荷, 所以我们必须往我们体系中添加离子, 以保证总电荷为零.虽然电荷为0,我们还是增加一下0个离子,作为演示的作用。
首先我们创建一个ions.mdp的文件,并输入如下内容:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
; ions.mdp - used as input into grompp to generate ions.tpr
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep = 0.01 ; Energy step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
ns_type = grid ; Method to determine neighbor list (simple, grid)
rlist = 1.0 ; Cut-off for making neighbor list (short range forces)
coulombtype = PME ; Treatment of long range electrostatic interactions
rcoulomb = 1.0 ; Short-range electrostatic cut-off
rvdw = 1.0 ; Short-range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions (yes/no)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#演示使用
file=open("ions.mdp","w")
file.write("""
; ions.mdp - used as input into grompp to generate ions.tpr
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep = 0.01 ; Energy step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
ns_type = grid ; Method to determine neighbor list (simple, grid)
rlist = 1.0 ; Cut-off for making neighbor list (short range forces)
coulombtype = PME ; Treatment of long range electrostatic interactions
rcoulomb = 1.0 ; Short-range electrostatic cut-off
rvdw = 1.0 ; Short-range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions (yes/no)
""")
file.close()

使用如下命令来产生.tpr文件

1
2
out=!{"gmx grompp -f ions.mdp -c fws_solv.gro -p topol.top -o ions.tpr"}
out=!{"gmx genion -s ions.tpr -o fws_solv_ions.gro -p topol.top -pname NA -nname CL -nn 0 <<EOF\n13\nEOF"}
1
out
1
2
3
4
5
6
7
8
9
10
11
12
13
···
'GROMACS: gmx genion, VERSION 5.1.2',
'Executable: /usr/bin/gmx',
'Data prefix: /usr',
'Command line:',
' gmx genion -s ions.tpr -o fws_solv_ions.gro -p topol.top -pname NA -nname CL -nn 0',
'',
'Reading file ions.tpr, VERSION 5.1.2 (single precision)',
'Reading file ions.tpr, VERSION 5.1.2 (single precision)',
'No ions to add, will just copy input configuration.',
'',
'gcq#418: "A curious aspect of the theory of evolution is that everybody thinks he understands it." (Jacques Monod)',
'']

能量最小化

现在, 我们已经添加了溶剂分子和离子, 得到了一个电中性的体系. 在开始动力学模拟之前, 我们必须保证体系的结构正常, 原子之间的距离不会过近, 几何构型合理. 对结构进行弛豫可以达到这些要求, 这个过程称为能量最小化(EM, energy minimization).

能量最小化过程与添加离子过程差不多. 我们要再次使用grompp将结构, 拓扑和模拟参数写入一个二进制的输入文件中(.tpr), 但这次我们不需要将.tpr文件传递给genion, 而是使用GROMACS MD引擎的mdrun模块来进行能量最小化.
能量最小化的mdp文件参数与解释如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
; 传递给预处理器的一些定义
define = -DFLEXIBLE ; 使用柔性水模型而非刚性模型, 这样最陡下降法可进一步最小化能量
; 模拟类型, 结束控制, 输出控制参数
integrator = steep ; 指定使用最陡下降法进行能量最小化. 若设为`cg`则使用共轭梯度法
emtol = 500.0 ; 若力的最大值小于此值则认为能量最小化收敛(单位kJ mol^-1^ nm^-1^)
emstep = 0.01 ; 初始步长(nm)
nsteps = 1000 ; 在能量最小化中, 指定最大迭代次数
nstenergy = 1 ; 能量写出频率
energygrps = System ; 要写出的能量组
; 近邻列表, 相互作用计算参数
nstlist = 1 ; 更新近邻列表的频率. 1表示每步都更新
ns_type = grid ; 近邻列表确定方法(simple或grid)
coulombtype = PME ; 计算长程静电的方法. PME为粒子网格Ewald方法, 还可以使用cut-off
rlist = 1.0 ; 短程力近邻列表的截断值
rcoulomb = 1.0 ; 长程库仑力的截断值
vdwtype = cut-off ; 计算范德华作用的方法
rvdw = 1.0 ; 范德华距离截断值
constraints = none ; 设置模型中使用的约束
pbc = xyz ; 3维周期性边界条件

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#演示使用
file=open("minim.mdp","w")
file.write("""
; minim.mdp - used as input into grompp to generate em.tpr
; Parameters describing what to do, when to stop and what to save
integrator = steep ; Algorithm (steep = steepest descent minimization)
emtol = 1000.0 ; Stop minimization when the maximum force < 1000.0 kJ/mol/nm
emstep = 0.01 ; Energy step size
nsteps = 50000 ; Maximum number of (minimization) steps to perform
; Parameters describing how to find the neighbors of each atom and how to calculate the interactions
nstlist = 1 ; Frequency to update the neighbor list and long range forces
ns_type = grid ; Method to determine neighbor list (simple, grid)
rlist = 1.0 ; Cut-off for making neighbor list (short range forces)
coulombtype = PME ; Treatment of long range electrostatic interactions
rcoulomb = 1.0 ; Short-range electrostatic cut-off
rvdw = 1.0 ; Short-range Van der Waals cut-off
pbc = xyz ; Periodic Boundary Conditions (yes/no)
""")
file.close()
1
out=!{"gmx grompp -f minim.mdp -c fws_solv_ions.gro -p topol.top -o em.tpr"}
1
out=!{"gmx mdrun -v -deffnm em"}

我们可以使用grace打开查看,若没有安装可以使用sudo apt-get install grace进行安装,为了演示方便我们直接在内部做图

1
out=!{"gmx energy -f em.edr -o energy.xvg<<EOF\n10\n0\nEOF"}
1
2
3
4
5
#演示使用
fig,ax=plt.subplots()
plotXVG(ax,"energy.xvg")
ppl.legend(ax)
plt.show()
output_27_0.png
output_27_0.png

NVT平衡

EM可保证我们的初始结构在几何构型和溶剂分子取向等方面都合理. 为了开始真正的动力学模拟, 我们必须对蛋白质周围的溶剂和离子进行平衡. 如果我们在这时就尝试进行非限制的动力学模拟, 体系可能会崩溃. 原因在于我们基本上只是优化了溶剂分子自身, 而没有考虑溶质. 我们需要将体系置于设定的模拟温度下, 以确定溶质(蛋白质)的合理取向. 达到正确的温度(基于动能)之后, 我们要对体系施加压力直到它达到合适的密度.

还记得好久以前我们用db2gmx生成的osre.itp文件么? 现在它要派上用场了. posre.itp文件的目的在于对蛋白质中的重原子(非氢原子)施加位置限制(position restraining)力. 这些原子不会移动, 除非增加非常大的能量. 位置限制的用途在于, 我们可以平衡蛋白质周围的溶剂分子, 而不引起蛋白质结构的变化.

平衡往往分两个阶段进行. 第一个阶段在NVT系综(粒子数, 体积和温度都是恒定的)下进行. 这个系综也被称为等温等容系综或正则系综. 这个过程的需要的时间与体系的构成有关, 但在NVT系综中, 体系的温度应达到预期值并基本保持不变. 如果温度仍然没有稳定, 那就需要更多的时间. 通常情况下, 50 ps到100 ps就足够了,为了节约时间,我们仅进行2ps的NVT。
nvt.mdp的设置如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
title = OPLS NVT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 1000 ; 2 * 1000 = 2 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 500 ; save coordinates every 1.0 ps
nstvout = 500 ; save velocities every 1.0 ps
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
#演示使用
file=open('nvt.mdp','w')
file.write("""
title = OPLS NVT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 1000 ; 2 * 1000 = 2 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 500 ; save coordinates every 1.0 ps
nstvout = 500 ; save velocities every 1.0 ps
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is off
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed
""")
file.close()

接下来与能量最小化情况类似

1
out=!{"gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr"}
1
out=!{"gmx mdrun -v -deffnm nvt"}

由于模拟的时间比较短,我一会儿就模拟好了
让我们来分析温度变化情况, 再次使用energy模块:

1
out=!{"gmx energy -f nvt.edr -o nvt.xvg<<EOF\n15\n0\nEOF"}
1
2
3
4
5
#演示使用
fig,ax=plt.subplots()
plotXVG(ax,"nvt.xvg")
ppl.legend(ax)
plt.show()

由于这里nstxout记录值过大所以只有两个点,应该把值设置更小,从而产生如下结果

temp3.png

前一步的NVT平衡稳定了体系的温度. 在采集数据之前, 我们还需要稳定体系的压力(因此还包括密度). 压力平衡是在NPT系综下进行的, 其中粒子数, 压力和温度都保持不变. 这个系综也被称为等温等压系综, 最接近实验条件.
一般建议设置的模拟时长为1ns,但是由于时间关系,这里仅模拟10ps
npt.mdp设置如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
title = OPLS NPT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 5000 ; 2 * 5000 = 10 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 50 ; save coordinates every 0.1 ps
nstvout = 50 ; save velocities every 0.1 ps
nstenergy = 50 ; save energies every 0.1 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = yes ; Restarting after NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#演示使用
file=open('npt.mdp','w')
file.write("""
title = OPLS NPT equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 5000 ; 2 * 5000 = 10 ps
dt = 0.002 ; 2 fs
; Output control
nstxout = 50 ; save coordinates every 0.1 ps
nstvout = 50 ; save velocities every 0.1 ps
nstenergy = 50 ; save energies every 0.1 ps
nstlog = 500 ; update log file every 1.0 ps
; Bond parameters
continuation = yes ; Restarting after NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
""")
file.close()
1
out=!{"gmx grompp -f npt.mdp -c nvt.gro -t nvt.cpt -p topol.top -o npt.tpr"}
1
out=!{"gmx mdrun -v -deffnm npt"}

我们可以通过压力和密度检测来看模拟是否平衡

1
2
3
4
#压力
out=!{"gmx energy -f npt.edr -o pressure.xvg<<EOF\n16\n0\nEOF"}
#密度
out=!{"gmx energy -f npt.edr -o density.xvg<<EOF\n22\n0\nEOF"}
1
2
3
4
5
#演示使用
fig,ax=plt.subplots()
plotXVG(ax,"pressure.xvg")
ppl.legend(ax)
plt.show()
output_43_0.png
output_43_0.png
1
2
3
4
5
#演示使用
fig,ax=plt.subplots()
plotXVG(ax,"density.xvg")
ppl.legend(ax)
plt.show()
output_44_0.png
output_44_0.png

成品MD

随着两个平衡阶段的完成, 体系已经在需要的温度和压强下平衡好了. 我们现在可以放开位置限制并进行成品MD以收集数据了. 这个过程跟前面的类似. 运行grompp时, 我们还要用到检查点文件(在这种情况下,其中包含了压力耦合信息). 我们要进行一个1 ns的MD模拟, 所用的参数文件如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
title = OPLS MD simulation
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 500000 ; 2 * 500000 = 1000 ps (1 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 5000 ; save coordinates every 10.0 ps
nstvout = 5000 ; save velocities every 10.0 ps
nstenergy = 5000 ; save energies every 10.0 ps
nstlog = 5000 ; update log file every 10.0 ps
nstxout-compressed = 5000 ; save compressed coordinates every 10.0 ps
; nstxout-compressed replaces nstxtcout
compressed-x-grps = System ; replaces xtc-grps
; Bond parameters
continuation = yes ; Restarting after NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
#演示使用
file=open('md.mdp','w')
file.write("""
title = OPLS Lysozyme MD simulation
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 500000 ; 2 * 500000 = 1000 ps (1 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 5000 ; save coordinates every 10.0 ps
nstvout = 5000 ; save velocities every 10.0 ps
nstenergy = 5000 ; save energies every 10.0 ps
nstlog = 5000 ; update log file every 10.0 ps
nstxout-compressed = 5000 ; save compressed coordinates every 10.0 ps
; nstxout-compressed replaces nstxtcout
compressed-x-grps = System ; replaces xtc-grps
; Bond parameters
continuation = yes ; Restarting after NPT
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 10 ; 20 fs, largely irrelevant with Verlet scheme
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = no ; Velocity generation is off
""")
file.close()
1
out=!{"gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr"}
1
!gmx mdrun -v -deffnm md_0_1

具体的分析方法与内容我们下次分享