-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Log charge assignment #1053
base: develop
Are you sure you want to change the base?
Log charge assignment #1053
Conversation
Here's what this looks like for a set of fairly simple cases: import logging
from openff.toolkit import ForceField, Molecule, Topology
logging.basicConfig(level=logging.INFO)
ff = ForceField("openff-2.0.0.offxml")
water = Molecule.from_smiles("O")
ligand = Molecule.from_smiles("CC")
ligand.assign_partial_charges("am1bcc")
ff.create_interchange(Topology.from_molecules([water, water]))
ff.create_interchange(ligand.to_topology())
ff.create_interchange(ligand.to_topology(), charge_from_molecules=[ligand])
ForceField("opc.offxml").create_openmm_system(Molecule.from_smiles("O").to_topology()) $ python run.py
INFO:openff.toolkit.typing.engines.smirnoff.parameters:Attempting to up-convert vdW section from 0.3 to 0.4
INFO:openff.toolkit.typing.engines.smirnoff.parameters:Successfully up-converted vdW section from 0.3 to 0.4. `method="cutoff"` is now split into `periodic_method="cutoff"` and `nonperiodic_method="no-cutoff"`.
INFO:openff.toolkit.typing.engines.smirnoff.parameters:Attempting to up-convert Electrostatics section from 0.3 to 0.4
INFO:openff.toolkit.typing.engines.smirnoff.parameters:Successfully up-converted Electrostatics section from 0.3 to 0.4. `method="PME"` is now split into `periodic_potential="Ewald3D-ConductingBoundary"`, `nonperiodic_potential="Coulomb"`, and `exception_potential="Coulomb"`.
/Users/mattthompson/micromamba/envs/new-models/lib/python3.11/site-packages/mdtraj/formats/__init__.py:13: DeprecationWarning: 'xdrlib' is deprecated and slated for removal in Python 3.13
from mdtraj.formats.trr import TRRTrajectoryFile
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 0
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 3
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 1
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 4
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 2
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 5
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 0
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 1
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 2
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 3
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 4
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 5
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 6
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 7
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 0
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 1
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 2
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 3
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 4
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 5
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 6
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 7
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 0
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 1
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 2
INFO:openff.interchange.smirnoff._virtual_sites:Charge section VirtualSites applied to virtual site with orientation atoms (0, 1, 2) |
Sorry only very briefly looking at this, so please ignore me if I'm missing something important.
This seems like it would be incredibly overwhelming to users, indeed a solvated protein-ligand complex could easily be incomprehensible. Is there no way to:
|
@@ -879,6 +897,42 @@ def store_matches( | |||
|
|||
# Have this new key (on a duplicate molecule) point to the same potential | |||
# as the old key (on a unique/reference molecule) | |||
if type(new_key) is LibraryChargeTopologyKey: | |||
logger.info( | |||
"Charge section LibraryCharges applied to (topology) atom index " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Charge section LibraryCharges applied to (topology) atom index " | |
"Charge section LibraryCharges applied to topology atom index " |
I initially read "(topology)" as standing in for some missing data, but I think it's here to disambiguate which atom index is meant? In which case I think it's essential information and doesn't need to be parenthesized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty great!
>>> import logging
...
... from openff.toolkit import ForceField, Molecule, Topology
...
... logging.basicConfig(level=logging.INFO)
>>> mol1 = Molecule.from_smiles("O")
>>> mol2 = Molecule.from_smiles("CCO")
>>> mol3 = Molecule.from_smiles("CC(=O)O")
>>> mol3.assign_partial_charges("am1bcc")
>>> top = Topology.from_molecules([mol1, mol2, mol3])
>>> ForceField("openff-2.2.1.offxml").create_interchange(top, charge_from_molecules=[mol3])
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 0
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 1
INFO:openff.interchange.smirnoff._nonbonded:Charge section LibraryCharges applied to (topology) atom index 2
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 3
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 4
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 5
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 6
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 7
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 8
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 9
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 10
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bcc, applied to (topology) atom index 11
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 12
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 13
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 14
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 15
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 16
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 17
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 18
INFO:openff.interchange.smirnoff._nonbonded:Preset charges applied to atom index 19
Interchange with 7 collections, non-periodic topology with 20 atoms.
I think you've got the amount of information in the log right - the atom is uniquely identified in the fewest number of characters possible. I can imagine condensing this down by molecule would add a lot of complexity, and the point of doing logging is to avoid that complexity. Also, a per-atom approach allows charge increments to be logged on an atom-by-atom basis, which more closely matches how the force fields work. In any case, with the highly formulaic messages users should be able to write logging handlers to get more info out if they want:
class MoleculeFormatter(logging.Formatter):
def __init__(self, topology, *args, **kwargs):
self.topology = topology
self.last_mol = None
return super().__init__(*args, **kwargs)
def format(self, *args, **kwargs):
string = super().format(*args, **kwargs)
_, _, atom_index = string.rpartition(" ")
atom_index = int(atom_index)
atom = self.topology.atom(atom_index)
if self.last_mol != atom.molecule:
self.last_mol = atom.molecule
string += f" from {atom.molecule}"
return string
handler = logging.StreamHandler()
handler.formatter = MoleculeFormatter(top)
logging.basicConfig(level=logging.INFO, handlers=[handler])
My notes are mostly just to enhance the consistency and clarity of the logs - in particular I think we should specify that each of the numbers is a topology atom index, and that its clearer what that means if "topology" is not parenthesized.
One oversight seems to be charge increments from virtual sites - they do not seem to be reflected in the logs:
import logging
import sys
from openff.toolkit import ForceField, Molecule, Topology
logging.basicConfig(level=logging.INFO)
def sage_with_bond_charge(sage):
from openff.toolkit.typing.engines.smirnoff.parameters import (
BondType,
ChargeIncrementModelHandler,
VirtualSiteType,
)
sage["Bonds"].add_parameter(
parameter=BondType(
smirks="[#6:2]-[#17X1:1]",
id="b0",
length="1.6 * angstrom",
k="500 * angstrom**-2 * mole**-1 * kilocalorie",
),
)
sage.get_parameter_handler("VirtualSites")
sage["VirtualSites"].add_parameter(
parameter=VirtualSiteType(
smirks="[#6:2]-[#17X1:1]",
type="BondCharge",
match="all_permutations",
distance="0.8 * angstrom ** 1",
charge_increment1="0.2 * elementary_charge ** 1",
charge_increment2="0.1 * elementary_charge ** 1",
),
)
return sage
sage_with_bond_charge(ForceField("openff-2.2.1.offxml")).create_interchange(Molecule.from_smiles("CCl").to_topology())
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 0
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 1
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 2
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 3
INFO:openff.interchange.smirnoff._nonbonded:Charge section ToolkitAM1BCC, using charge method am1bccelf10, applied to (topology) atom index 4
INFO:openff.interchange.smirnoff._virtual_sites:Charge section VirtualSites applied to virtual site with orientation atoms (1, 0)
And they aren't mentioned in the hierarchy docs.
The only other thing I can think of that might be useful is to actually log the charge. This might be useful with virtual site charge increments so that users can track how the charge changes as increments are applied - but I don't think its essential and it definitely adds noise.
logger.info( | ||
"Charge section ToolkitAM1BCC, using charge method " | ||
f"{new_key.extras['partial_charge_method']}, " | ||
f"applied to (topology) atom index {topology_atom_index}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"applied to (topology) atom index {topology_atom_index}", | |
f"applied to topology atom index {topology_atom_index}", |
Ditto above.
|
||
elif new_key.extras["handler"] == "preset": | ||
logger.info( | ||
f"Preset charges applied to atom index {topology_atom_index}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"Preset charges applied to atom index {topology_atom_index}", | |
f"Preset charges applied to topology atom index {topology_atom_index}", |
I think the fact that its a topology atom index is important to clarify.
Optionally, refactoring all these log messages into a single string that gets format()
ed in each branch to ensure consistency might be worth doing - people will probably end up processing logs programmatically and having a single regex match everything would be great.
I'm also not 100% sure about "preset", I'm not sure it makes the connection to charge_from_molecules
clear. "charge_from_molecule charges applied" might be clearer for a first time user, though it is a little clunky.
logger.info( | ||
"Charge section ChargeIncrementModel, using charge method " | ||
f"{new_key.partial_charge_method}, " | ||
f"applied to (topology) atom index {new_key.this_atom_index}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"applied to (topology) atom index {new_key.this_atom_index}", | |
f"applied to topology atom index {new_key.this_atom_index}", |
See above.
1. Look for preset charges from the `charge_from_molecules` argument | ||
1. Look for chemical environment matches within the `<LibraryCharges>` section | ||
1. Look for chemical environment matches within the `<ChargeIncrementModel>` section | ||
1. Try to run AM1-BCC or some variant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1. Try to run AM1-BCC or some variant | |
1. Try to run AM1-BCC according to the `<ToolkitAM1BCC>` section or some variant |
Just to clarify that this is still following the force field
1. Look for chemical environment matches within the `<ChargeIncrementModel>` section | ||
1. Try to run AM1-BCC or some variant | ||
|
||
If a molecule gets charges from one method, attempts to match charges for later methods are skipped. For more on how SMIRNOFF defines this behavior, see [this issue](https:/openforcefield/standards/issues/68) and linked discussions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a molecule gets charges from one method, attempts to match charges for later methods are skipped. For more on how SMIRNOFF defines this behavior, see [this issue](https:/openforcefield/standards/issues/68) and linked discussions. | |
If a molecule gets charges from one method, attempts to match charges for later methods are skipped. Note that preset charges override the force field and are not checked for consistency; any charges provided to the `charge_from_molecules` argument technically modify the force field. For more on how SMIRNOFF defines this behavior, see [this issue](https:/openforcefield/standards/issues/68) and linked discussions. |
I think clarifying that this is an escape hatch is useful.
|
||
6. Sage with ligand and OPC water | ||
* Water gets library charges | ||
* Water virtual sites get ?? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be clarified?
1. Look for preset charges from the `charge_from_molecules` argument | ||
1. Look for chemical environment matches within the `<LibraryCharges>` section | ||
1. Look for chemical environment matches within the `<ChargeIncrementModel>` section | ||
1. Try to run AM1-BCC or some variant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do virtual site charge increments come in here?
@@ -328,6 +331,7 @@ def _get_charges( | |||
|
|||
total_charge: Quantity = numpy.sum(parameter_value) | |||
# assumes virtual sites can only have charges determined in one step | |||
|
|||
charges[topology_key] = -1.0 * total_charge | |||
|
|||
# Apply increments to "orientation" atoms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to log this? Since it will change the charges coming out. I can imagine being very confused that the charges are completely different to what the logs say they should be because they were modified by a charge increment from a virtual site.
@@ -178,6 +181,12 @@ def store_potentials( # type: ignore[override] | |||
), | |||
}, | |||
) | |||
|
|||
logger.info( | |||
"Charge section VirtualSites applied to virtual site with orientation atoms " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Charge section VirtualSites applied to virtual site with orientation atoms " | |
"Charge section VirtualSites applied to virtual site with orientation atoms at topology indices " |
I'm not sure this is a huge improvement but I feel like clarifying what the atom numbers are is helpful.
Welp of course as soon as I submit the review I realise I misunderstood how ChargeIncrementModel worked. I've gone back and editing things to hide my shame but if you run into something weird asking how charge increments are applied (outside the context of vsites)... just ignore it... |
Description
Resolves #1048
Checklist
ChargeIncrementHandler
Preset charges and virtual sitesis underspecified and cannot be tested (yet)Update docstringsNo particular docstring to update, I don't think?