Command line utilities
Once crem is installed, the following utilities can be invoked from the command line. They are mainly used to create a fragment database.
fragmentation
usage: fragmentation [-h] -i input.smi -o output.txt [-c NUMBER] [-v]
Fragment input compounds by cutting bonds matching bond SMARTS.
optional arguments:
-h, --help show this help message and exit
-i input.smi, --input input.smi
input SMILES (with optional comma-separated ID).
-o output.txt, --out output.txt
fragmented molecules.
-c NUMBER, --ncpu NUMBER
number of cpus used for computation. Default: 1.
-v, --verbose print progress.
frag_to_env
usage: frag_to_env [-h] -i frags.txt -o output.txt [-k molnames.txt]
[-r NUMBER] [-a NUMBER] [-s] [-c NUMBER] [--store_comp_id]
[-v]
Create text file for fragment replacement from fragmented molecules obtained
with fragmentation.py. The output may contain duplicated lines which should be
filtered out externally.
optional arguments:
-h, --help show this help message and exit
-i frags.txt, --input frags.txt
fragmented molecules.
-o output.txt, --out output.txt
output text file.
-k molnames.txt, --keep_mols molnames.txt
file with mol names to keep. Molecules which are not
in the list will be ignored.
-r NUMBER, --radius NUMBER
radius of molecular context (in bonds) which will be
taken into account. Default: 1.
-a NUMBER, --max_heavy_atoms NUMBER
maximum number of heavy atoms in cores. If the number
of atoms exceeds the limit, the fragment will be discarded.
Default: 20.
-s, --keep_stereo set this flag if you want to keep stereo in context
and core parts.
-c NUMBER, --ncpu NUMBER
number of cpus used for computation. Default: 1.
--store_comp_id store compound id in output (only for debug).
-v, --verbose print progress.
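The frag_to_env description notes that its output may contain duplicated lines, and the -c, --counts option of env_to_db expects a file whose first column is the number of occurrences. As a rough illustration, the standard-library Python sketch below collapses duplicates and prefixes each unique line with its count, similar to what `sort | uniq -c` produces. The function name `count_duplicate_lines` is hypothetical, and the 7-character count padding only imitates GNU `uniq -c`; whether env_to_db accepts exactly this spacing is an assumption, so the shell pipeline remains the documented route.

```python
from collections import Counter
from typing import Iterable, List


def count_duplicate_lines(lines: Iterable[str]) -> List[str]:
    """Collapse duplicate lines and prefix each with its occurrence count.

    Hypothetical helper, not part of crem. Blank lines are skipped.
    """
    counts = Counter(line.rstrip("\n") for line in lines if line.strip())
    # Order by line content (code-point order; `sort` may order
    # differently depending on the locale).
    # Assumption: a right-aligned 7-character count followed by one space,
    # imitating the layout of GNU `uniq -c`.
    return [f"{n:7d} {line}" for line, n in sorted(counts.items())]
```

Reading the frag_to_env output file line by line, passing the lines to this function, and writing the result to a new file would give a count-prefixed file that could then be tried with `env_to_db -c`.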
env_to_db
usage: env_to_db [-h] -i env_frags.txt -o output.db -r RADIUS [-c] [-n NCPU]
[-v]
Create SQLite DB from a text file containing env_smi, core_smi, core_atom_num
and core_sma.
optional arguments:
-h, --help show this help message and exit
-i env_frags.txt, --input env_frags.txt
a comma-separated text file with env_smi, core_smi,
core_atom_num and core_sma.
-o output.db, --out output.db
output SQLite DB file.
-r RADIUS, --radius RADIUS
radius of environment. If a table for this radius value
already exists in the output DB, it will be dropped.
-c, --counts set if the input file contains number of occurrences
as a first column (output of sort | uniq -c). This
will add a column freq to the output DB.
-n NCPU, --ncpu NCPU number of cpus. Default: 1.
-v, --verbose print progress.
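Taken together, the three utilities form a chain: fragmentation, then frag_to_env, then env_to_db. The Python sketch below assembles the corresponding command lines using only the options listed above. The function name `build_pipeline` and the intermediate file names are hypothetical, and the `sort -u` step is just one possible way to filter duplicated lines externally, not necessarily how crem's authors do it. Note that the CPU count is passed as -c to fragmentation and frag_to_env but as -n to env_to_db, where -c means --counts.

```python
import shlex
from typing import List


def build_pipeline(smiles: str, db: str, radius: int = 1,
                   ncpu: int = 1) -> List[List[str]]:
    """Return argv lists for a fragment-database build (illustrative only)."""
    # Hypothetical intermediate file names; pick whatever suits your setup.
    frags = "frags.txt"
    env = "env.txt"
    env_unique = "env_unique.txt"
    return [
        ["fragmentation", "-i", smiles, "-o", frags, "-c", str(ncpu)],
        ["frag_to_env", "-i", frags, "-o", env,
         "-r", str(radius), "-c", str(ncpu)],
        # frag_to_env output may contain duplicated lines; drop them here.
        ["sort", "-u", "-o", env_unique, env],
        # No --counts flag, because the deduplicated file has no count column.
        ["env_to_db", "-i", env_unique, "-o", db,
         "-r", str(radius), "-n", str(ncpu)],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for cmd in build_pipeline("input.smi", "fragments.db", radius=3, ncpu=4):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

To execute the steps rather than print them, each list could be passed to `subprocess.run(cmd, check=True)` in order, so that a failing step stops the chain.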