diff --git a/overview.ipynb b/overview.ipynb index c100b98..2c21307 100644 --- a/overview.ipynb +++ b/overview.ipynb @@ -8,16 +8,18 @@ "\n", "## Overview\n", "\n", - "MoSDeF consists of two core Python package, [mBuild](https://github.com/mosdef-hub/mbuild) and [Foyer](https://github.com/mosdef-hub/foyer), that, when combined with tools for workflow and analysis management (such as Signac and Signac-Flow) provide the means to perform complex molecular simulations in a **reproducible** manner. Reproducibility in this case is achieved by making all aspects of the simulation (system initialization, simulation execution, and analysis) scriptable, such that other researchers could execute your same scripts to achieve the same results. MoSDeF is also designed such that systems can be generated in a programmatic manner, facilitating screening of large structural/chemical parameter spaces.\n", + "MoSDeF consists of two core Python packages, [mBuild](https://github.com/mosdef-hub/mbuild) and [Foyer](https://github.com/mosdef-hub/foyer), that, when combined with tools for workflow and analysis management (such as [Signac and Signac-Flow](https://glotzerlab.engin.umich.edu/signac/)), provide the means to perform complex molecular simulations in a **reproducible** manner. Reproducibility in this case is achieved by making all aspects of the simulation (system initialization, simulation execution, and analysis) scriptable, such that other researchers could execute the same scripts to achieve the same results. MoSDeF is also designed such that systems can be generated in a programmatic manner, facilitating screening of large structural/chemical parameter spaces.\n", "\n", - "In this overview, we will be focusing specifically on the tools we've developed to address the issue of system initialization, including the creation of a molecular model and the application of a force field (atom-typing and parameter assignment). 
The two tools contained within MoSDeF to address system initialization are:\n", + "In this overview, we will be focusing specifically on the tools we've developed to address the issue of **_system initialization_**, including the creation of a molecular model and the application of a force field (atom-typing and parameter assignment). The two tools contained within MoSDeF to address system initialization are:\n", "\n", - " - **mBuild**: A hierarchical, component based molecule builder\n", + " - [**mBuild**](https://github.com/mosdef-hub/mbuild): A hierarchical, component-based molecule builder\n", " \n", - " - **Foyer**: A package for atom-typing as well as applying and disseminating forcefields\n", + " - [**Foyer**](https://github.com/mosdef-hub/foyer): A package for atom-typing as well as applying and disseminating forcefields\n", "\n", "This overview is designed to introduce you to these tools in a general manner; however, more in-depth tutorials are also available from within the [mosdef_tutorials repository](https://github.com/mosdef-hub/mosdef_tutorials).\n", "\n", + "---\n", + "\n", "**Pre-requisites**\n", "\n", "We have designed this tutorial for users that have some knowledge of Python and object-oriented programming (OOP). However, we encourage all users to work through the notebook, even those new to the world of Python and OOP, in order to still obtain an idea of the general concept of our tools. The syntax can be picked up later.\n", @@ -38,13 +40,37 @@ "1. Markdown cells, like this cell, which contain explanatory text\n", "2. Code cells, that can be executed by either clicking on the \"run cell\" icon or by hitting SHIFT + ENTER.\n", "\n", - "Cells do not have to be executed in order (however the cells in this tutorial are designed to be executed sequentially), and the order in which cells have been executed is recorded by the bracketed number to the left of the code cell (e.g. [ 1 ]). When a cell is executed you will first see an asterisk (i.e. 
[ * ]) which means that the cell is still running. When the asterisk is replaced by a number this means the execution has completed.\n", + "Cells do not have to be executed in order (however the cells in this tutorial are designed to be executed _sequentially_), and the order in which cells have been executed is recorded by the bracketed number to the left of the _code_ cell (e.g. [ 1 ]). When a cell is executed you will first see an asterisk (i.e. [ * ]) which means that the cell is still running. When the asterisk is replaced by a number this means the execution has completed.\n", "\n", - "[Binder](https://mybinder.readthedocs.io/en/latest/) provides the ability to deploy Jupyter notebooks in the cloud, such that users do not need to set up their own computing environment to execute the notebook cells.\n", + "Markdown cells will _not_ have numbers to the left of their cell. These are text based and not meant to be considered executable code. Executing these cells will render the Markdown cells in HTML. 
More information can be found [here](https://www.markdownguide.org/getting-started)\n", + "\n", + "---\n", "\n", + "[Binder](https://mybinder.readthedocs.io/en/latest/) provides the ability to deploy Jupyter notebooks in the cloud, such that users do not need to set up their own computing environment to execute the notebook cells.\n", + "* We will not be using Binder during this session, but all of our notebooks are hosted on Binder as well.\n", + "* Binder is a free service that is community supported, and can be slow to access with multiple users trying to access the same notebook at once.\n", "---" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "### mBuild Units\n", + "\n", + "Within mBuild, units to describe various aspects of system initialization are kept constant within the package.\n", + "This provides a controlled environment that limits possible Input/Output (IO) errors when reading in/saving your structure of interest to various simulation engines.\n", + "\n", + "**Length**\n", + "* nanometers [nm]\n", + "\n", + "**Angles**\n", + "* Radians for all `Compound` operations\n", + "* Degrees when building `Lattices`" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -71,7 +97,7 @@ "\n", "The base class of mBuild is the `Compound` class, which defines the primary building block used for constructing molecules. **Molecules are constructed hierarchically**; however, each level of the hierarchy inherits from the `Compound` class. This means that `Compounds` may contain other `Compounds`, and that the same methods and attributes are present for molecule components at any level of the hierarchy. 
mBuild `Compounds` feature [a variety of useful methods and attributes](http://mosdef-hub.github.io/mbuild/data_structures.html) to facilitate system construction.\n", "\n", - "\"Drawing\"" + "\"Drawing\"" ] }, { @@ -90,7 +116,7 @@ "\n", "### Loading from a PDB structure file\n", "\n", - "First, we'll load a CH2 moiety into an mBuild `Compound` by reading from a PDB structure file (created using [Avogadro](https://avogadro.cc/)). This will create an mBuild `Compound` containing three atoms (C, H, H), as well as two C-H bonds. The `visualize` method allows us to view our `Compound` directly within the notebook." + "First, we'll load a CH2 moiety into an mBuild `Compound` by reading from a PDB structure file (created using [Avogadro](https://avogadro.cc/)). This will create an mBuild `Compound` containing three atoms (C, H, H), as well as two C-H bonds. The `visualize` method allows us to view our `Compound` directly within the notebook. This visualization is provided by [`nglview`](https://github.com/arose/nglview)." ] }, { @@ -103,6 +129,13 @@ "ch2.visualize()" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note, formats such as PDB include bonding information. One could presumably load other formats without bonding information, and specify these bonds manually. Additionally, one can explicitly define atom locations and bonds; for example, see [mBuild Tutorial 01: Basic Functionality](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_01_Basic_Functionality.ipynb)." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -113,7 +146,9 @@ "\n", "However, if one had to re-write the commands for loading a CH2 molecule and adding `Ports` each time they wanted to create a molecule that included a CH2 unit, the process would be quite cumbersome. Instead, we can create a reusable class that defines our CH2 `Compound`. 
This approach allows one to encapsulate the routines for creating a molecular moiety into an object that can be instantiated and in a manner that can easily be shared with others.\n", "\n", - "Below is a class definition for a CH2 moiety that uses the same command we used above to load coordinates and bonds from a PDB structure file and features a few lines that add `Ports` to the carbon atom." + "Below is a class definition for a CH2 moiety that uses the _same_ command we used above to load coordinates and bonds from a PDB structure file and features a few lines that add `Ports` to the carbon atom.\n", + "\n", + "For additional information, see [mBuild Tutorial 02: Reusing Components](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_02_Reusing_Components.ipynb)." ] }, { @@ -148,6 +183,99 @@ "ch2.visualize(show_ports=True)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Examining the `Compound` data structure\n", + "\n", + "Now that we have created a `Compound` we can examine the contents. For example, simply calling the `Compound` will provide us with a summary of the contents." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# simply call the compound to print a summary of the number of particles and bonds\n", + "ch2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can examine the coordinates in multiple ways as shown below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# view the coordinates of the atoms in the compound\n", + "ch2.xyz" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# use the list function to iterate over the atoms and their positions in the compound\n", + "list(ch2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To view the bonds, we can call the `bonds` function as part of the `Compound` taking advantage of `list`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# list the pairs of atoms that are bonded; each pair appears between parentheses, i.e., (atom1, atom2)\n", + "list(ch2.bonds())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# we can also format the output of bonds to simply list the pairs of bonded atoms by name alone\n", + "for pair in ch2.bonds():\n", + "    print(pair[0].name, '-', pair[1].name)\n", + "\n", + "# equivalent shorthand output using list comprehension\n", + "['{}-{}'.format(pair[0].name, pair[1].name) for pair in ch2.bonds()]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also view a summary of the ports associated with a `Compound`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ch2.all_ports()" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -158,7 +286,15 @@ "\n", "The code below instantiates 
CH2 moieties inside of a \"for\" loop, where the number of iterations is dependent on the desired length of the chain. The length of the chain can be toggled through the `chain_length` argument provided to the class constructor. We also import a hydrogen `Compound` from mBuild's `atoms` library to cap the ends of our chain.\n", "\n", - "**Note:** For this general overview, we do not intend for users (particularly those new to Python and object-oriented programming) to get too bogged down in the syntax. Instead, the emphasis should be that with mBuild we can encapsulate a series of routines (a \"recipe\") into a class, and that these routines can be defined in a manner that gives the class structural/chemical flexibility." + "This is shown pictorially below.\n", + "\"Drawing\"\n", + "\n", + "**Note:** For this general overview, we do not intend for users (particularly those new to Python and object-oriented programming) to get too bogged down in the syntax. Instead, the emphasis should be that with mBuild we can encapsulate a series of routines (a \"recipe\") into a class, and that these routines can be defined in a manner that gives the class structural/chemical flexibility.\n", + "\n", + "For additional examples, see tutorials: \n", + "- [mBuild Tutorial 03: Connecting Components with Ports](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_03_Connecting_Components_with_Ports.ipynb) \n", + "- [mBuild Tutorial 04: Constructing Larger Compounds](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_04_Constructing_Larger_Compounds.ipynb)\n", + "- [mBuild Tutorial 05: Creating Flexible Classes](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_05_Creating_Flexible_Classes.ipynb)" ] }, { @@ -199,9 +335,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Because we've defined our class to take `chain_length` as an argument, we can toggle the chemistry of our system (in this can the number of carbons in a linear 
alkane) by changing the value we provide for this argument upon instantiation.\n", + "Because we've defined our class to take `chain_length` as an argument, we can toggle the chemistry of our system (in this case the number of carbons in a linear alkane) by changing the value we provide for this argument upon instantiation.\n", "\n", - "For example, let's create a butane." + "For example, let's create a butane molecule." ] }, { @@ -214,6 +350,24 @@ "butane.visualize()" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The geometry of this molecule is not entirely realistic because all backbone atoms form 180° angles and all hydrogen atoms lie in a single plane. This can be addressed by placing Particles and Ports in more realistic locations, either manually or by using energy-minimized inputs.\n", + "\n", + "Alternatively, a `Compound` can be constructed and then energy minimized, either through a simulation engine or using the `energy_minimization` function in `mBuild`, which uses the [Open Babel toolkit](http://openbabel.org/dev-api/). See [mBuild Tutorial 07: Energy Minimization](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_07_Energy_Minimization.ipynb) for more information on using and controlling this function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "butane.energy_minimization()" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -274,6 +428,7 @@ "\n", "\n", "mBuild contains routines for the addition and removal of particles. Here, we'll explore this functionality by changing our hexane molecule into _hexanol_.\n", + "Note, we could do this by manually changing the class itself, or simply by removing the terminal hydrogen and adding a hydroxyl in its place.
\n", "\n", "First, we'll define a class for a hydroxyl group featuring a single `Port` on the oxygen to represent the dangling bond.\n", "\n", @@ -298,13 +453,29 @@ "hydroxyl.visualize(show_ports=True)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, we will use the `label` assigned to the up cap to differentiate it from other hydrogen atoms in the system. This label can be determined by examining the class source code itself included in util, or by querying the `Compound` instance itself. Below we can see that we have 3 labels: chain, 'up_cap', and 'down_cap'. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "list(octane.labels)" + ] + }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll remove a hydrogen from one end of our octane. This will create a `Port` in it's place, representing the dangling bond on the carbon.\n", "\n", - "In the class definition, the hydrogen atom was provided with a label, `up_cap`. We can use this label to remove this atom." + "In the class definition, the hydrogen atom we will remove was provided with a label, `up_cap`. We can use this label to refer to this hydrogen and remove it." ] }, { @@ -349,7 +520,9 @@ "\n", "Typically we aren't desiring to run simulations of a single molecule. Fortunately, mBuild offers several routines to help create more complex systems. \n", "\n", - "For example, mBuild provides users with an interface to [PACKMOL](http://m3g.iqm.unicamp.br/packmol/home.shtml) to set up bulk systems through the `fill_box` function. 
Here we'll use `fill_box` to place ten octanol molecules into a 3nm x 3nm x 3nm box. We can provide a seed for PACKMOL's random number generator to ensure the configuration is reproducible.\n", + "\n", + "For additional information, see [mBuild Tutorial 06: Setting Up Bulk Systems](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_06_Setting_Up_Bulk_Systems.ipynb).\n" ] }, { @@ -372,7 +545,9 @@ "\n", "Here, we'll load a surface of $\\beta$-cristobalite silica.\n", "\n", - "**Note:** Harmless warning messages are currently generated by one of the packages mBuild depends on. To reduce clutter, we are filtering those here, so you can safely ignore the warnings filter." + "**Note:** Harmless warning messages are currently generated by one of the packages mBuild depends on. To reduce clutter, we are filtering those here, so you can safely ignore the warnings filter.\n", + "\n", + "For additional information, see [mBuild Tutorial 09: Surface Functionalization](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_09_Surface_Functionalization.ipynb)." ] }, { @@ -492,7 +667,23 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To write to `TOP` format we DO need to apply a force field to our system. Let's say we wanted to use the OPLS all-atom force field. If we have this defined within an XML file, we simply need to provide the path to this file as an argument to `save`.\n", + "To write to `TOP` format we **DO** need to apply a force field to our system. Let's say we wanted to use the OPLS all-atom force field. If we have this defined within an XML file, we simply need to provide the path to this file as an argument to `save`.\n", + "\n", + "Foyer force fields are defined within an XML file that contains both the 'rules' required for atomtyping as well as the force field parameters within a single file. 
\n", + "\n", + "The Foyer XML format is an extension of the [OpenMM forcefield XML format](http://docs.openmm.org/7.0.0/userguide/application.html#creating-force-fields). The only differences reside in the `AtomTypes` section, where several additional attributes are available (which we will examine in a moment) that allow for atomtyping.\n", + "\n", + "The `AtomTypes` section of the Foyer XML is similar to that used for OpenMM forcefield XMLs; however, each `Type` in Foyer XML supports four additional attributes not found in OpenMM:\n", + "* `def` - SMARTS string describing the chemical substructure of this atomtype (Follow [this link](https://github.com/mosdef-hub/foyer/blob/master/docs/smarts.md) for more on SMARTS-based atomtyping using Foyer.)\n", + "* `desc` - Brief description of the atomtype\n", + "* `doi` - DOI reference for parameters associated with this atomtype\n", + "* `overrides` - One or more atomtypes to 'override', providing precedence to this atomtype\n", + "\n", + "[SMARTS](http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html), which is used to define usage of a given atom-type, is a language for describing chemical structures and substructures. This chemical context effectively defines the 'rules' for when an atomtype should apply. For example, for atom_type `opls_961`, the SMARTS string, `def=\"[C;X4](F)(F)(F)(C)\"`, states that this atom-type should be used when:\n", + "- your element is carbon and has 4 neighbors, i.e., `[C;X4]`\n", + "- three neighbors are F, i.e., `(F)(F)(F)`\n", + "- one neighbor is C, i.e., `(C)`\n", + "\n", "\n", "Let's first take a quick look at this file." ] @@ -545,15 +736,32 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This concludes the general MoSDeF overview. For more in-depth tutorials into mBuild and Foyer, refer to the [mosdef_tutorials repository](https://github.com/mosdef-hub/mosdef_tutorials) or use our [Binder link](https://mybinder.org/v2/gh/mosdef-hub/mosdef_tutorials/master)." 
+ "### Working with coarse-grained and united-atom forcefields\n", + "\n", + "Foyer allows non-atomistic types to be defined within SMARTS, allowing coarse-grained and united-atom forcefields to be handled as well. Non-elemental species can easily be defined by prepending the name of the custom \"element\" with an underscore.\n", + "\n", + "For example, the following lines could be used to describe beads representing _CH2 groups in a polymer using the TraPPE forcefield. \n", + "\n", + "` `\n", + " \n", + "Here, the SMARTS definition `[_CH2;X2]([_CH3,_CH2])[_CH3,_CH2]` states that for atom-type `CH2_sp3`\n", + "- our bead is _CH2 with 2 bonded neighbors, i.e., `[_CH2;X2]`\n", + "- those neighbors can be either _CH2 or _CH3, i.e., `([_CH3,_CH2])[_CH3,_CH2]`\n", + "\n", + "\n", + "For more information on non-atomistic forcefields, see [Foyer Tutorial 02: SMARTS for Non-Atomistic Systems](https://github.com/mosdef-hub/foyer_tutorials/blob/master/Foyer_02_SMARTS_for_Non-Atomistic_Systems.ipynb).\n", + "\n", + "For additional information on Foyer itself, see the [Foyer tutorial repository](https://github.com/mosdef-hub/foyer_tutorials) and [GitHub page](https://github.com/mosdef-hub/foyer)." ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "This concludes the general MoSDeF overview. For more in-depth tutorials into mBuild and Foyer, refer to the [mosdef_tutorials repository](https://github.com/mosdef-hub/mosdef_tutorials) or use our [Binder link](https://mybinder.org/v2/gh/mosdef-hub/mosdef_tutorials/master)." 
+ ] } ], "metadata": { diff --git a/utils/figure_connecting.png b/utils/figure_connecting.png new file mode 100644 index 0000000..f67a234 Binary files /dev/null and b/utils/figure_connecting.png differ diff --git a/utils/hierarchical_design_image.png b/utils/hierarchical_design_image.png new file mode 100644 index 0000000..6c736be Binary files /dev/null and b/utils/hierarchical_design_image.png differ