Command line utilities

Installing crem makes the following utilities available on the command line. They are mainly used to create a fragment database.

fragmentation

usage: fragmentation [-h] -i input.smi -o output.txt [-c NUMBER] [-v]

Fragment input compounds by cutting bonds matching bond SMARTS.

optional arguments:
  -h, --help            show this help message and exit
  -i input.smi, --input input.smi
                        input SMILES with optional comma-separated ID.
  -o output.txt, --out output.txt
                        fragmented molecules.
  -c NUMBER, --ncpu NUMBER
                        number of cpus used for computation. Default: 1.
  -v, --verbose         print progress.
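
As an illustration of the input format and options above, here is a minimal Python sketch that writes a small SMILES file and assembles a fragmentation call. The helper names (write_smiles, fragmentation_cmd), the example molecules and the file names are hypothetical; only the -i, -o, -c and -v options come from the help text.

```python
import shlex
import shutil
import subprocess
from pathlib import Path


def write_smiles(path, molecules):
    """Write one 'SMILES,ID' line per (smiles, mol_id) pair."""
    text = "".join(f"{smi},{mol_id}\n" for smi, mol_id in molecules)
    Path(path).write_text(text)


def fragmentation_cmd(input_smi, output_txt, ncpu=1, verbose=False):
    """Build the argument list using only the options listed in the help text."""
    cmd = ["fragmentation", "-i", str(input_smi), "-o", str(output_txt),
           "-c", str(ncpu)]
    if verbose:
        cmd.append("-v")
    return cmd


if __name__ == "__main__":
    # Hypothetical example molecules and file names.
    write_smiles("input.smi", [("CC(=O)Nc1ccc(O)cc1", "mol_1"),
                               ("CCOC(=O)c1ccccc1", "mol_2")])
    cmd = fragmentation_cmd("input.smi", "frags.txt", ncpu=2, verbose=True)
    print(shlex.join(cmd))
    # Run only when the crem entry point is actually on PATH.
    if shutil.which("fragmentation"):
        subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.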

frag_to_env

usage: frag_to_env [-h] -i frags.txt -o output.txt [-k molnames.txt]
                   [-r NUMBER] [-a NUMBER] [-s] [-c NUMBER] [--store_comp_id]
                   [-v]

Create a text file for fragment replacement from fragmented molecules obtained
with the fragmentation utility. The output may contain duplicate lines, which
should be filtered out externally.

optional arguments:
  -h, --help            show this help message and exit
  -i frags.txt, --input frags.txt
                        fragmented molecules.
  -o output.txt, --out output.txt
                        output text file.
  -k molnames.txt, --keep_mols molnames.txt
                        file with mol names to keep. Molecules which are not
                        in the list will be ignored.
  -r NUMBER, --radius NUMBER
                        radius of molecular context (in bonds) which will be
                        taken into account. Default: 1.
  -a NUMBER, --max_heavy_atoms NUMBER
                        maximum number of heavy atoms in cores. If the number
                        of atoms exceeds the limit, the fragment will be
                        discarded. Default: 20.
  -s, --keep_stereo     set this flag if you want to keep stereo in context
                        and core parts.
  -c NUMBER, --ncpu NUMBER
                        number of cpus used for computation. Default: 1.
  --store_comp_id       store compound id in output (only for debug).
  -v, --verbose         print progress.
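
Since duplicate lines in the frag_to_env output have to be filtered externally, one option is to collapse them while counting occurrences. The sketch below approximates sort | uniq -c in plain Python. The function names and the fixed-width count prefix (seven characters, as GNU uniq prints it) are assumptions made for illustration, not part of crem, and blank lines are skipped, which sort | uniq -c would not do.

```python
from collections import Counter


def count_unique_lines(lines):
    """Return 'COUNT LINE' strings, one per distinct non-blank line.

    Lines are sorted by text; the count is right-aligned in 7 characters
    to resemble the output of `uniq -c`.
    """
    counts = Counter(line.rstrip("\n") for line in lines if line.strip())
    return [f"{n:7d} {line}" for line, n in sorted(counts.items())]


def dedup_file(src, dst):
    """Read src, collapse duplicate lines, and write the counted lines to dst."""
    with open(src) as fin:
        rows = count_unique_lines(fin)
    with open(dst, "w") as fout:
        fout.writelines(row + "\n" for row in rows)
```

This keeps every distinct line in memory, so for very large outputs the external sort | uniq -c pipeline is likely the better choice.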

env_to_db

usage: env_to_db [-h] -i env_frags.txt -o output.db -r RADIUS [-c] [-n NCPU]
                 [-v]

Create SQLite DB from a text file containing env_smi, core_smi, core_atom_num
and core_sma.

optional arguments:
  -h, --help            show this help message and exit
  -i env_frags.txt, --input env_frags.txt
                        a comma-separated text file with env_smi, core_smi,
                        core_atom_num and core_sma.
  -o output.db, --out output.db
                        output SQLite DB file.
  -r RADIUS, --radius RADIUS
                        radius of environment. If a table for this radius
                        value already exists in the output DB, it will be
                        dropped.
  -c, --counts          set if the input file contains number of occurrences
                        as a first column (output of sort | uniq -c). This
                        will add a column freq to the output DB.
  -n NCPU, --ncpu NCPU  number of cpus. Default: 1.
  -v, --verbose         print progress.
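
Because env_to_db drops an existing table for the same radius, it can be useful to check what a database already contains before re-running it. The following sketch uses only Python's standard sqlite3 module and makes no assumption about crem's table or column names; the function name table_row_counts and the file name fragments.db are hypothetical.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in the SQLite file."""
    con = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Table names come from sqlite_master itself, so quoting them is enough.
        return {name: con.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
                for name in names}
    finally:
        con.close()


if __name__ == "__main__":
    import os
    if os.path.exists("fragments.db"):  # hypothetical output of env_to_db
        for name, n_rows in table_row_counts("fragments.db").items():
            print(name, n_rows)
```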

guacamol_test