Update README.md
README.md CHANGED
@@ -2,7 +2,10 @@

## Model Description

-LLMEyeCap is
+LLMEyeCap is an innovative Novel Object Captioning model aimed at enhancing Large Language Models (LLMs) with vision capabilities. The project combines state-of-the-art models and techniques to detect novel objects in images, localize their bounding boxes, and generate captions for them.
+
+One of the core innovations is the replacement of traditional classification layers with a text-generation mechanism. This approach addresses catastrophic forgetting, enabling the model to learn new objects without unlearning previous ones. Furthermore, the model connects the latent space of the visual features to the hidden dimensions of an LLM's decoder, which makes it possible to train on unsupervised video datasets and opens up a wide range of applications.
+

### Features

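The idea in the Model Description above (a text-generation head in place of a classification layer, seeded by visual latents projected into the decoder's hidden dimensions) can be sketched roughly as follows. This is a hypothetical NumPy illustration, not the actual LLMEyeCap code; all dimensions, names, and the toy recurrence are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

VISUAL_DIM = 256   # size of a region feature from the vision backbone (invented)
HIDDEN_DIM = 64    # hidden size of the toy text decoder (invented)
VOCAB = ["<bos>", "<eos>", "a", "red", "car", "dog"]  # toy vocabulary

# Projection from the visual latent space into the decoder's hidden dims.
W_proj = rng.normal(size=(VISUAL_DIM, HIDDEN_DIM)) / np.sqrt(VISUAL_DIM)
# Output layer maps hidden state -> vocabulary logits. It scales with the
# vocabulary, not with a fixed set of object classes.
W_out = rng.normal(size=(HIDDEN_DIM, len(VOCAB))) / np.sqrt(HIDDEN_DIM)
# Token embeddings fed back into the toy recurrence.
W_emb = rng.normal(size=(len(VOCAB), HIDDEN_DIM))

def caption_region(region_feature, max_len=5):
    """Greedily decode a caption seeded by a projected visual feature."""
    h = np.tanh(region_feature @ W_proj)  # visual latent -> hidden dims
    tokens = ["<bos>"]
    for _ in range(max_len):
        next_tok = VOCAB[int(np.argmax(h @ W_out))]
        tokens.append(next_tok)
        if next_tok == "<eos>":
            break
        # Fold the emitted token back into the hidden state (toy recurrence).
        h = np.tanh(h + W_emb[VOCAB.index(next_tok)])
    return tokens

caption = caption_region(rng.normal(size=VISUAL_DIM))
```

Because the head emits tokens rather than class indices, introducing a new object only means the decoder learns to emit new words; no classifier layer has to be resized and retrained, which is what makes incremental training without catastrophic forgetting plausible.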
@@ -75,7 +78,7 @@ Here's how to use this model for object captioning:

This 0.1 version is a standalone model for captioning objects in images. It can be used as-is or trained on new objects without "catastrophic forgetting".
Version 0.2, which will connect the latent space to the hidden dimensions of LLMs, is coming.
-
+Again, this model is still in the development phase, and we're actively seeking contributions and ideas to enhance its capabilities. If you're interested in contributing, whether through code, ideas, or data, we'd love to hear from you.
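The claim above about training on new objects without catastrophic forgetting follows from the text-generation head: adding a novel object extends the vocabulary rather than resizing a classification layer, so previously learned weights are untouched. A toy illustration (hypothetical; the names and sizes are invented, and this is not LLMEyeCap's training code):

```python
import numpy as np

rng = np.random.default_rng(1)
HIDDEN = 8  # toy hidden size (invented for the example)

vocab = ["a", "red", "car"]
emb = rng.normal(size=(len(vocab), HIDDEN))   # existing word embeddings
old_car_row = emb[vocab.index("car")].copy()

# "Learn" a novel object (e.g. "zebra"): append one new embedding row;
# every previously learned row keeps its exact weights.
vocab.append("zebra")
emb = np.vstack([emb, rng.normal(size=(1, HIDDEN))])

# Old knowledge is untouched: the "car" embedding is bit-for-bit the same.
assert np.array_equal(emb[vocab.index("car")], old_car_row)
```

A fixed-size classification head, by contrast, would need its output layer replaced and retrained for every new class, which is exactly where forgetting of old classes tends to creep in.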


## Authors