metadata

library_name: sklearn
tags:
  - sklearn
  - skops
  - tabular-classification
model_format: skops
model_file: classifier.skops
widget:
  - structuredData:
      credibleSetConfidence:
        - 0.5
        - 0.5
        - 0.5
      distanceFootprintMean:
        - 0.9775440692901611
        - 0.915132999420166
        - 0.9051011204719543
      distanceFootprintMeanNeighbourhood:
        - 0.9782288074493408
        - 0.9157739877700806
        - 0.9057350754737854
      distanceSentinelFootprint:
        - 0.9779183268547058
        - 0.9165117740631104
        - 0.9065772294998169
      distanceSentinelFootprintNeighbourhood:
        - 0.9779183268547058
        - 0.9165117740631104
        - 0.9065772294998169
      distanceSentinelTss:
        - 0.9779183268547058
        - 0.9165117740631104
        - 0.9062665700912476
      distanceSentinelTssNeighbourhood:
        - 0.9798746109008789
        - 0.9183452129364014
        - 0.9080795049667358
      distanceTssMean:
        - 0.9775440692901611
        - 0.915132999420166
        - 0.9047871232032776
      distanceTssMeanNeighbourhood:
        - 0.9799838066101074
        - 0.9174169898033142
        - 0.907045304775238
      eQtlColocClppMaximum:
        - 0
        - 0
        - 0
      eQtlColocClppMaximumNeighbourhood:
        - 0
        - 0
        - 0
      eQtlColocH4Maximum:
        - 0
        - 0
        - 0
      eQtlColocH4MaximumNeighbourhood:
        - 0
        - 0
        - 0
      geneCount500kb:
        - 4
        - 4
        - 4
      isProteinCoding:
        - 1
        - 0
        - 0
      pQtlColocClppMaximum:
        - 0
        - 0
        - 0
      pQtlColocClppMaximumNeighbourhood:
        - 0
        - 0
        - 0
      pQtlColocH4Maximum:
        - 0
        - 0
        - 0
      pQtlColocH4MaximumNeighbourhood:
        - 0
        - 0
        - 0
      proteinGeneCount500kb:
        - 1
        - 1
        - 1
      sQtlColocClppMaximum:
        - 0
        - 0
        - 0
      sQtlColocClppMaximumNeighbourhood:
        - 0
        - 0
        - 0
      sQtlColocH4Maximum:
        - 0
        - 0
        - 0
      sQtlColocH4MaximumNeighbourhood:
        - 0
        - 0
        - 0
      studyLocusId:
        - 0b60eba83ae9773e4908e41b11fb8243
        - 0b60eba83ae9773e4908e41b11fb8243
        - 0b60eba83ae9773e4908e41b11fb8243
      vepMaximum:
        - 0
        - 0
        - 0
      vepMaximumNeighbourhood:
        - 0
        - 0
        - 0
      vepMean:
        - 0
        - 0
        - 0
      vepMeanNeighbourhood:
        - 0
        - 0
        - 0

Model description

The locus-to-gene (L2G) model prioritises likely causal genes at each GWAS locus using features derived from genetic and functional genomics data. The main categories of predictive features are:

    - Distance (from credible set variants to gene)
    - Molecular QTL colocalization
    - Chromatin interaction (e.g., promoter-capture Hi-C)
    - Variant pathogenicity (from VEP)

    More information at: https://opentargets.github.io/gentropy/python_api/methods/l2g/_l2g/

Intended uses & limitations

[More Information Needed]

Training Procedure

Gradient Boosting Classifier

Hyperparameters

Hyperparameter	Value
ccp_alpha	0.0
criterion	friedman_mse
init
learning_rate	0.1
loss	log_loss
max_depth	5
max_features
max_leaf_nodes
min_impurity_decrease	0.0
min_samples_leaf	1
min_samples_split	2
min_weight_fraction_leaf	0.0
n_estimators	100
n_iter_no_change
random_state	42
subsample	1.0
tol	0.0001
validation_fraction	0.1
verbose	0
warm_start	False

How to Get Started with the Model

Load the model with the LocusToGeneModel.load_from_hub method, which returns a LocusToGeneModel object. Call that object's predict method on a feature matrix to obtain predictions.

    More information can be found at: https://opentargets.github.io/gentropy/python_api/methods/l2g/model/
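
For readers who only want to inspect the serialized estimator, the sketch below shows one possible route that bypasses gentropy. It is not the LocusToGeneModel.load_from_hub method described above: it downloads the file named in the metadata (classifier.skops) with huggingface_hub and deserializes it with skops. The helper name load_classifier, its parameters, and the repository id placeholder are hypothetical; the object it returns is assumed, not confirmed, to be the bare scikit-learn classifier.

```python
def load_classifier(repo_id, filename="classifier.skops", trusted=()):
    """Download a .skops file from the Hugging Face Hub and deserialize it.

    Hypothetical helper for illustration; not part of gentropy.
    """
    # Imported lazily so that defining the helper does not require the
    # optional dependencies to be installed.
    from huggingface_hub import hf_hub_download
    import skops.io as sio

    # Fetch the file into the local Hugging Face cache and get its path.
    local_path = hf_hub_download(repo_id=repo_id, filename=filename)

    # skops refuses to load types it does not trust by default. List them so
    # the caller can review them before opting in.
    untrusted = sio.get_untrusted_types(file=local_path)
    unreviewed = sorted(set(untrusted) - set(trusted))
    if unreviewed:
        raise ValueError(
            f"Review these types, then pass them via `trusted`: {unreviewed}"
        )

    return sio.load(local_path, trusted=list(trusted))


# Example call (the repository id is a placeholder, not a real value):
# clf = load_classifier("<namespace>/<repository>")
```

Raising on unreviewed types, instead of trusting everything in the file, keeps the explicit review step that the skops format is designed around.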

Citation

https://doi.org/10.1038/s41588-021-00945-5

License

MIT