Commit 7c871ce by whitead (verified) · 1 Parent(s): 79cf380

Much more detail on tasks

  1. README.md +13 -14
README.md CHANGED
@@ -27,21 +27,20 @@ This model is trained to reason in English and output a molecule.
  It is NOT a general purpose chat model.
  It has been trained specifically for these tasks:
 
- * IUPAC-names
- * formulas to structures
- * modifying solubilities by specific LogS
- * constrained edits (e.g., do not affect group X or do not affect scaffold)
- * pKa
- * smell/scent
- * human cell receptor binding + mode (e.g., agonist)
  * ADME properties (e.g., MDCK efflux ratio, LD50)
- * GHS classifications (as words, not codes, like "carcinogen")
- * some electronic properties
- * 1-step retrosynthesis
- * reaction outcome prediction
- * natural language caption to molecule
- * natural product elucidation (formula + organism to SMILES)
- * blood-brain barrier permeability
 
  For example, you can ask "Propose a molecule with a pKa of 9.2" or "Modify CCCCC(O)=O to increase its pKa by about 1 unit." You cannot ask it "What is the pKa of CCCCC(O)=O?"
  If you ask it questions that lie significantly beyond those tasks, it can fail. You can combine properties, although we haven't significantly benchmarked this.
 
  It is NOT a general purpose chat model.
  It has been trained specifically for these tasks:
 
+ * IUPAC name to SMILES
+ * Molecular formula (Hill notation) to SMILES, optionally with constraints on functional groups
+ * Modifying the solubility of a given molecule (SMILES) by a specific LogS, optionally with constraints on scaffolds/groups/similarity
+ * Matching pKa to molecules, proposing molecules with a pKa, or modifying molecules to adjust pKa
+ * Matching scent/smell to molecules and modifying molecules to adjust scent
+ * Matching human cell receptor binding + mode (e.g., agonist) to a molecule, or modifying a molecule's binding effect. Trained on data [from EveBio](https://data.evebio.org/)
  * ADME properties (e.g., MDCK efflux ratio, LD50)
+ * GHS classifications (as words, not codes, like "carcinogen"). For example, "modify this molecule to remove acute toxicity."
+ * Quantitative LD50 in mg/kg
+ * Proposing 1-step retrosynthesis from likely commercially available reagents
+ * Predicting a reaction outcome
+ * General natural language description of a specific molecule to that molecule (inverse molecule captioning)
+ * Natural product elucidation (formula + organism to SMILES), e.g., "A molecule with formula C6H12O6 was isolated from Homo sapiens, what could it be?"
+ * Matching blood-brain barrier permeability (as a class) to molecules, or modifying a molecule to change it
 
  For example, you can ask "Propose a molecule with a pKa of 9.2" or "Modify CCCCC(O)=O to increase its pKa by about 1 unit." You cannot ask it "What is the pKa of CCCCC(O)=O?"
  If you ask it questions that lie significantly beyond those tasks, it can fail. You can combine properties, although we haven't significantly benchmarked this.
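
The task list asks for molecular formulas in Hill notation. As a rough illustration of that convention (not code from this repository), the sketch below formats element counts the way Hill notation orders them: carbon first, then hydrogen, then the remaining elements alphabetically; when there is no carbon, every element including hydrogen is sorted alphabetically. The function name `hill_formula` and its dict input are assumptions made for this example only.

```python
def hill_formula(counts):
    """Format a mapping of element symbol -> atom count as a Hill-notation string.

    Hypothetical helper for illustration; not part of the model or its tooling.
    """
    # Ignore elements with a zero (or negative) count.
    counts = {el: n for el, n in counts.items() if n > 0}
    if "C" in counts:
        # Carbon present: C first, then H, then everything else alphabetically.
        order = ["C"] + (["H"] if "H" in counts else [])
        order += sorted(el for el in counts if el not in ("C", "H"))
    else:
        # No carbon: all elements, including H, alphabetically.
        order = sorted(counts)
    # A count of 1 is written without a number.
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "") for el in order)


print(hill_formula({"O": 6, "H": 12, "C": 6}))  # C6H12O6
print(hill_formula({"S": 1, "O": 4, "H": 2}))   # H2O4S
```

A formula written this way (for example `C6H12O6`) is presumably what prompts for the formula-to-SMILES and natural product elucidation tasks would contain. The exact prompt wording the model expects is not specified here.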