futurehouse
/

ether0

Text Generation

Model card Files Files and versions Community

whitead commited on Jun 5

Commit

7c871ce

·

verified ·

1 Parent(s): 79cf380

Much more detail on tasks

Files changed (1) hide show

README.md +13 -14

README.md CHANGED Viewed

@@ -27,21 +27,20 @@ This model is trained to reason in English and output a molecule.
 It is NOT a general purpose chat model.
 It has been trained specifically for these tasks:
-* IUPAC-names
-* formulas to structures
-* modifying solubilities by specifc LogS
-* constrained edits (e.g., do not affect group X or do not affect scaffold)
-* pKA
-* smell/scent
-* human cell receptor binding + mode (e.g., agonist)
 * ADME properties (e.g., MDDK efflux ratio, LD50)
-* GHS classifications (as words, not codes, like "carcinogen")
-* some electronic properties
-* 1-step retrosynthesis
-* reaction outcome prediction
-* natural language caption to molecule
-* natural product elucidation (formula + organism to SMILES)
-* blood-brain barrier permeability
 For example, you can ask "Propose a molecule with a pKa of 9.2" or "Modify CCCCC(O)=OH to increase its pKa by about 1 unit." You cannot ask it "What is the pKa of CCCCC(O)=OH?"
 If you ask it questions that lie significantly beyond those tasks, it can fail. You can combine properties, although we haven't significantly benchmarked this.

 It is NOT a general purpose chat model.
 It has been trained specifically for these tasks:
+* IUPAC name to SMILES
+* Molecular formula (Hill notation) to SMILES, optionally with constraints on functional groups
+* modifying solubilities on given molecules (SMILES) by specifc LogS, optionally with constraints about scaffolds/groups/similarity
+* Matching pKa to molecules, proposing molecules with a pKa, or modifying molecules to adjust pKa
+* Matching scent/smell to molecuels and modifying molecules to adjust scent
+* Matching human cell receptor binding + mode (e.g., agonist) to molecule or modifying a molecule's binding effect. Trained [from EveBio](https://data.evebio.org/)
 * ADME properties (e.g., MDDK efflux ratio, LD50)
+* GHS classifications (as words, not codes, like "carcinogen"). For example, "modify this molecule to remove acute toxicity."
+* Quantitative LD50 in mg/kg
+* Proposing 1-step retrosynthesis from likely commercially available reagents
+* Predicting a reaction outcome
+* Gemeral natural language description of a specific molecule to that molecule (inverse molecule captioning)
+* Natural product elucidation (formula + organism to SMILES) - e.g, "A molecule with formula C6H12O6 was isolated from Homo sapiens, what could it be?"
+* Matching blood-brain barrier permeability (as a class) or modifying
 For example, you can ask "Propose a molecule with a pKa of 9.2" or "Modify CCCCC(O)=OH to increase its pKa by about 1 unit." You cannot ask it "What is the pKa of CCCCC(O)=OH?"
 If you ask it questions that lie significantly beyond those tasks, it can fail. You can combine properties, although we haven't significantly benchmarked this.