---
tags:
- generated_from_triptuner
- transformer
- character-level
- custom-model
license: mit
library_name: torch
pipeline_tag: text-generation
---

# Triptuner Model

This model generates itineraries for locations in Sri Lanka's Central Province. It is a custom transformer-based language model that operates on character-level sequences.

## Usage

Because it uses a custom architecture, the Triptuner model cannot be used directly with Hugging Face's built-in Inference API. The steps below show how to load and run the model manually with PyTorch.

### Load and Use the Model with PyTorch

```python
import torch
import torch.nn as nn  # required: the model class subclasses nn.Module


# Define your custom model class. A reference sketch of the full definition
# is provided at the end of this card.
class BigramLanguageModel(nn.Module):
    # Include the complete definition of your BigramLanguageModel here.
    # Example method definitions:
    def __init__(self, vocab_size):
        super().__init__()
        # Define your model layers here as per the training setup, e.g.:
        # self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        # self.position_embedding_table = nn.Embedding(block_size, n_embd)
        # self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
        # self.ln_f = nn.LayerNorm(n_embd)
        # self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        # Define the forward pass as per your model
        pass

    def generate(self, idx, max_new_tokens):
        # Implement the generate method for text generation
        pass


# Define your character mappings. The vocabulary must be built from the same
# text the model was trained on, in the same sorted order.
chars = sorted(list(set("your_training_text_here")))  # Replace with the actual character set used in training
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for i, ch in enumerate(chars)}
encode = lambda s: [stoi[c] for c in s]
decode = lambda l: ''.join([itos[i] for i in l])

# Load the model weights from Hugging Face
model = BigramLanguageModel(vocab_size=len(chars))
model_url = "https://huggingface.co/yoonusajwardapiit/triptuner/resolve/main/pytorch_model.bin"
model_weights = torch.hub.load_state_dict_from_url(model_url, map_location=torch.device('cpu'), weights_only=True)
model.load_state_dict(model_weights)
model.eval()

# Test the model with a sample prompt
prompt = "Hanthana"  # Replace with any relevant location or prompt
context = torch.tensor([encode(prompt)], dtype=torch.long)

# Generate text using the model
with torch.no_grad():
    generated = model.generate(context, max_new_tokens=250)  # Adjust the number of new tokens as needed

# Decode and print the generated text
generated_text = decode(generated[0].tolist())
print(generated_text)
```

## Training Data

The model was trained on a dataset containing information about various locations in Sri Lanka's Central Province.

## Model Architecture

- Number of Layers: 4
- Embedding Size: 64
- Number of Heads: 4
- Context Length: 32 tokens (characters, since the model is character-level)

## License

MIT License
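
## Reference Implementation (Sketch)

The class in the usage example above is deliberately left as a skeleton. As a starting point, here is a minimal sketch of a character-level transformer matching the hyperparameters listed under Model Architecture (4 layers, 64-dimensional embeddings, 4 heads, 32-character context). It follows the common nanoGPT-style tutorial design that the commented-out layer names suggest; the module names (`token_embedding_table`, `blocks`, `sa`, `ffwd`, and so on) are assumptions, and they must match the keys in the checkpoint's `state_dict` exactly for `load_state_dict` to succeed.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

# Hyperparameters from the "Model Architecture" section of this card
n_embd, n_head, n_layer, block_size = 64, 4, 4, 32


class Head(nn.Module):
    """A single head of causal (masked) self-attention."""
    def __init__(self, head_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k = self.key(x)
        q = self.query(x)
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5           # scaled attention scores
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))  # causal mask
        wei = F.softmax(wei, dim=-1)
        return wei @ self.value(x)


class MultiHeadAttention(nn.Module):
    """Several attention heads in parallel, concatenated and projected."""
    def __init__(self, num_heads, head_size):
        super().__init__()
        self.heads = nn.ModuleList([Head(head_size) for _ in range(num_heads)])
        self.proj = nn.Linear(n_embd, n_embd)

    def forward(self, x):
        return self.proj(torch.cat([h(x) for h in self.heads], dim=-1))


class FeedForward(nn.Module):
    """Position-wise MLP with a 4x hidden expansion."""
    def __init__(self, n_embd):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        return self.net(x)


class Block(nn.Module):
    """Transformer block: attention and MLP, each behind a residual connection."""
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.sa = MultiHeadAttention(n_head, n_embd // n_head)
        self.ffwd = FeedForward(n_embd)
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.sa(self.ln1(x))
        x = x + self.ffwd(self.ln2(x))
        return x


class BigramLanguageModel(nn.Module):
    """Character-level transformer LM (the name is kept from the usage example)."""
    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        self.position_embedding_table = nn.Embedding(block_size, n_embd)
        self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        B, T = idx.shape
        tok_emb = self.token_embedding_table(idx)            # (B, T, n_embd)
        pos_emb = self.position_embedding_table(torch.arange(T, device=idx.device))
        x = self.blocks(tok_emb + pos_emb)
        logits = self.lm_head(self.ln_f(x))                  # (B, T, vocab_size)
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.view(B * T, -1), targets.view(B * T))
        return logits, loss

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            idx_cond = idx[:, -block_size:]              # crop to the context length
            logits, _ = self(idx_cond)
            probs = F.softmax(logits[:, -1, :], dim=-1)  # next-character distribution
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx
```

If `load_state_dict` reports missing or unexpected keys with this sketch, print `model_weights.keys()` to see the exact module names used during training and rename the layers accordingly.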