SynthonGPT is a compact synthon-conditioned transformer for navigating makeable chemical space. Rather than generating arbitrary SMILES with no synthetic grounding, it conditions generation on synthesis-aware building blocks (synthons) and vendor enumerations, keeping its outputs aligned with practical discovery workflows.
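To make the conditioning idea concrete, here is a minimal sketch of what synthon-conditioned autoregressive decoding can look like. Everything below is illustrative: the `<s>`/`<gen>`/`<eos>` control tokens, the prompt format, and the toy next-token function are assumptions for this sketch, not SynthonGPT's actual vocabulary or API.

```python
# Hypothetical sketch of synthon-conditioned decoding (NOT the real
# SynthonGPT interface): the prompt encodes the chosen building blocks,
# and the model autoregressively completes a product SMILES.

def build_prompt(synthons):
    """Join synthon SMILES fragments into a conditioning prefix.

    "<s>" and "<gen>" are placeholder control tokens for illustration.
    """
    return "<s>" + ".".join(synthons) + "<gen>"

def greedy_decode(next_token_fn, prompt, max_len=64, eos="<eos>"):
    """Generic greedy autoregressive loop: pick the most likely next
    token until EOS or the length budget is hit."""
    tokens = []
    for _ in range(max_len):
        tok = next_token_fn(prompt, tokens)
        if tok == eos:
            break
        tokens.append(tok)
    return "".join(tokens)

# Toy stand-in "model" that emits a fixed completion, so the decoding
# loop can be exercised without any trained weights.
def toy_next_token(prompt, generated):
    completion = list("CCO") + ["<eos>"]
    return completion[len(generated)]

prompt = build_prompt(["CC(=O)O", "OCC"])
print(prompt)                                  # <s>CC(=O)O.OCC<gen>
print(greedy_decode(toy_next_token, prompt))   # CCO
```

A real model would replace `toy_next_token` with a forward pass over the transformer; the decoding loop itself is the standard pattern.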
Highlights
- Count-matched benchmarks show up to 3.1x higher unique scaffold recovery than F-Trees and 1.76x higher than SpaceLight while maintaining lower mean similarity.
- The model has roughly 90M parameters, trains in about 10 hours on a single RTX 4090, and runs sub-second inference on both CPU and GPU.
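The unique scaffold recovery metric in the benchmark bullet can be sketched as follows. This is a hedged illustration, not the report's exact pipeline: a real implementation would derive Bemis-Murcko scaffolds with RDKit, while here the scaffolds are precomputed strings so the example stays dependency-free. The scaffold names and numbers are invented for the demo.

```python
# Sketch of "unique scaffold recovery" for a count-matched comparison:
# given equally sized hit lists from two methods, count how many
# distinct reference scaffolds each list covers. In practice scaffolds
# would be Bemis-Murcko cores (e.g. via RDKit); here they are plain
# strings for a self-contained example.

def unique_scaffold_recovery(hit_scaffolds, reference_scaffolds):
    """Fraction of distinct reference scaffolds present in the hits."""
    ref = set(reference_scaffolds)
    recovered = set(hit_scaffolds) & ref
    return len(recovered) / len(ref)

reference = ["benzene", "pyridine", "indole", "quinoline"]
method_a  = ["benzene", "benzene", "indole", "pyridine"]   # diverse hits
method_b  = ["benzene", "benzene", "benzene", "benzene"]   # redundant hits

print(unique_scaffold_recovery(method_a, reference))  # 0.75
print(unique_scaffold_recovery(method_b, reference))  # 0.25
```

Count-matching (same number of hits per method) matters because a method that returns more molecules would otherwise trivially recover more scaffolds.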
Links
- Tech report: report.pdf
- Demo: synthongpt.mireklzicar.com