CReM — chemically reasonable mutations


CReM is an open-source Python framework for generating chemical structures using a fragment-based approach.

The core idea is borrowed from matched molecular pairs: fragments that occur in the same chemical context are considered interchangeable. CReM stores these context–fragment relationships in a SQLite database and uses them to grow, mutate, link, and cyclize molecules. Every generated structure is therefore built only from fragment substitutions that have been observed in real molecules.
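To make the storage idea concrete, here is a minimal sketch that uses Python's standard sqlite3 module. The table name, column names, and rows are invented for illustration and are not CReM's actual schema (see Database schema for that); the point is only that every core recorded under the same context counts as an interchangeable replacement.

```python
import sqlite3

# Hypothetical toy table: one row per observed (context, core) pair.
# This is NOT CReM's real schema; the names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_frags (context TEXT, core TEXT)")
conn.executemany(
    "INSERT INTO toy_frags VALUES (?, ?)",
    [
        ("CC[*:1]", "[*:1]C"),
        ("CC[*:1]", "[*:1]O"),
        ("CC[*:1]", "[*:1]N"),
        ("c1ccccc1[*:1]", "[*:1]F"),
    ],
)


def interchangeable_cores(context):
    """Return every core stored under exactly this context, sorted."""
    rows = conn.execute(
        "SELECT core FROM toy_frags WHERE context = ? ORDER BY core",
        (context,),
    )
    return [core for (core,) in rows]


ethyl_cores = interchangeable_cores("CC[*:1]")
unknown_cores = interchangeable_cores("N[*:1]")  # context never observed
```

A context that was never observed yields no replacements, which is what keeps generation restricted to substitutions seen in real molecules.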

What you can do

  • Build fragment databases from your own molecular collections, or download precompiled ChEMBL databases.
  • Generate molecules in four modes: mutate, grow, link, and make_cycle.
  • Control the chemistry through context radius, fragment-size windows, and per-set frequency thresholds.
  • Restrict where changes happen with replace_ids / protected_ids.
  • Bias fragment selection with custom filter_func and sample_func callbacks, or with property filters stored in the database.
  • Store several fragment sets in one database and switch between them at generation time with set_names and min_freq.
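The controls listed above can be pictured as successive filters over the candidate fragments found for one context. The sketch below is plain Python and does not call CReM: the Candidate class, the select function, its parameter defaults, and the callback signatures are assumptions made for illustration, and the candidate data are invented. The real signatures are documented in the Reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    core: str         # replacement fragment (toy SMILES)
    heavy_atoms: int  # fragment size
    freq: int         # how often it was seen in this context (invented)


CANDIDATES = [
    Candidate("[*:1]C", 1, 120),
    Candidate("[*:1]O", 1, 45),
    Candidate("[*:1]F", 1, 3),
    Candidate("[*:1]c1ccccc1", 6, 80),
]


def select(candidates, min_size=0, max_size=10, min_freq=0,
           filter_func=None, sample_func=None):
    """Apply a size window, a frequency threshold, then optional callbacks."""
    kept = [c for c in candidates
            if min_size <= c.heavy_atoms <= max_size and c.freq >= min_freq]
    if filter_func is not None:   # hypothetical signature: Candidate -> bool
        kept = [c for c in kept if filter_func(c)]
    if sample_func is not None:   # hypothetical signature: list -> list
        kept = sample_func(kept)
    return kept


def no_halogens(candidate):
    """Example filter: reject fragments that contain a halogen symbol."""
    return not any(x in candidate.core for x in ("F", "Cl", "Br", "I"))


def two_most_frequent(candidates):
    """Example sampler: keep the two most frequently observed fragments."""
    return sorted(candidates, key=lambda c: c.freq, reverse=True)[:2]


small_common = select(CANDIDATES, max_size=3, min_freq=10)
picked = select(CANDIDATES, filter_func=no_halogens,
                sample_func=two_most_frequent)
```

In this toy, the size window and frequency threshold are cheap pre-filters, while the callbacks carry any project-specific bias.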

Where to start

If you want to…                                          Read
Understand the vocabulary (context, radius, core, sets)  Concepts
Install CReM                                             Installation
Run your first generation                                Quick start
Build a fragment database                                Build a database (v1)
Look up a function or CLI flag                           Reference

How CReM works in one picture

            context (environment)            core fragment
        ┌───────────────────────────┐      ┌─────────────┐
   …—C—C—[*:1]                       and    [*:1]—CH3      ← stored together in the DB
        └───────────────────────────┘      └─────────────┘

   To modify a molecule, CReM:
     1. fragments it into (context, core) pairs;
     2. looks up the context in the database;
     3. retrieves every core seen in that same context;
     4. reassembles the molecule with each alternative core.

See Concepts for the full model and Database schema for how this is stored.
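The four steps listed above can be mimicked with a toy example in plain Python. This is only a sketch under strong simplifying assumptions: molecules are linear chains written as strings of one-letter atoms, "fragmentation" is string slicing rather than real bond cutting, and the dictionary stands in for the database with invented entries. None of the function names below belong to CReM.

```python
ATTACH = "[*:1]"  # attachment-point label shared by context and core

# Toy stand-in for the database: context -> cores seen in that context.
TOY_DB = {
    "CC[*:1]": {"[*:1]C", "[*:1]O", "[*:1]N"},
    "C[*:1]": {"[*:1]CO", "[*:1]CC"},
}


def toy_fragment(chain):
    """Step 1: cut the chain at every bond into (context, core) pairs."""
    return [(chain[:i] + ATTACH, ATTACH + chain[i:])
            for i in range(1, len(chain))]


def toy_reassemble(context, core):
    """Step 4: join context and core at the attachment point."""
    return context[:-len(ATTACH)] + core[len(ATTACH):]


def toy_mutate(chain):
    """Run all four steps and return the distinct new chains, sorted."""
    products = set()
    for context, core in toy_fragment(chain):       # step 1
        for alt in TOY_DB.get(context, ()):         # steps 2 and 3
            if alt != core:                         # skip the original core
                products.add(toy_reassemble(context, alt))  # step 4
    return sorted(products)


pairs = toy_fragment("CCO")
products = toy_mutate("CCO")
```

Here "CCO" yields two (context, core) pairs, and both contexts happen to be present in the toy dictionary, so each contributes alternative cores to the result.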

License

BSD-3-Clause. See Citation for the papers to cite.