zhihan1996 committed on
Commit
52b4176
·
verified ·
1 Parent(s): d11cfc3

Create README.md

Files changed (1)
  1. README.md +35 -0
README.md ADDED
@@ -0,0 +1,35 @@
+ ---
+ license: bsd
+ ---
+
+ This is the base model of GenomeOcean-100M. It is trained with causal language modeling (CLM) and uses a BPE tokenizer with a vocabulary of 4096 tokens. It supports a maximum sequence length of 1024 tokens (~5 kbp).
+
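The "1024 tokens (~5kbp)" figure in the README implies roughly 5 bp per token. The sketch below uses that ratio to split a long DNA string into windows that should fit the context length. The names `APPROX_BP_PER_TOKEN`, `approx_max_bp`, and `split_into_windows` are hypothetical helpers, and the ratio is an estimate taken from the README text, not a measured tokenizer statistic.

```python
# Rough sizing helper for inputs to a 1024-token context.
MAX_TOKENS = 1024          # maximum sequence length stated in the README
APPROX_BP_PER_TOKEN = 5    # assumption: ~5 kbp / 1024 tokens, rounded


def approx_max_bp(max_tokens=MAX_TOKENS, bp_per_token=APPROX_BP_PER_TOKEN):
    """Estimate how many base pairs fit in one context window."""
    return max_tokens * bp_per_token


def split_into_windows(sequence, window_bp=None):
    """Split a DNA string into consecutive, non-overlapping windows."""
    if window_bp is None:
        window_bp = approx_max_bp()
    if window_bp <= 0:
        raise ValueError("window_bp must be positive")
    return [sequence[i:i + window_bp] for i in range(0, len(sequence), window_bp)]


windows = split_into_windows("ACGT" * 3000)  # a 12,000 bp toy sequence
print([len(w) for w in windows])
```

Because BPE token counts depend on the sequence content, a window sized this way can still exceed 1024 tokens. Checking the tokenized length and truncating remains necessary.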
+ Please see our official implementation on [GitHub](https://github.com/jgi-genomeocean/genomeocean).
+
+ Quick start:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Left padding suits batched generation with a decoder-only (causal) model.
+ tokenizer = AutoTokenizer.from_pretrained(
+     "pGenomeOcean/GenomeOcean-100M",
+     trust_remote_code=True,
+     padding_side="left",
+ )
+ # flash_attention_2 requires the flash-attn package and a CUDA GPU.
+ model = AutoModelForCausalLM.from_pretrained(
+     "pGenomeOcean/GenomeOcean-100M",
+     trust_remote_code=True,
+     torch_dtype=torch.bfloat16,
+     attn_implementation="flash_attention_2",
+ ).to("cuda")
+ ```
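The quick start hard-codes `attn_implementation="flash_attention_2"`, which fails at load time when the `flash-attn` package is not installed. A minimal sketch of a fallback follows. `pick_attn_implementation` is a hypothetical helper, and whether this model's remote code accepts `"sdpa"` is an assumption that should be checked before relying on it.

```python
import importlib.util


def pick_attn_implementation():
    """Return "flash_attention_2" if flash_attn is importable, else "sdpa"."""
    # find_spec returns None when the top-level package cannot be found.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    # Assumption: the model supports PyTorch scaled-dot-product attention.
    return "sdpa"


print(pick_attn_implementation())
```

The result can be passed as `attn_implementation=pick_attn_implementation()` in the `from_pretrained` call. If `"sdpa"` is rejected, `"eager"` is the other standard value in `transformers`.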
+
+
+ Copyright Notice
+
+ GenomeOcean: a pretrained microbial genome foundational model (genomeoceanLLM) Copyright (c) 2025, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy) and Northwestern University. All rights reserved.
+
+ If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Intellectual Property Office at [email protected].
+
+ NOTICE. This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit others to do so.