boltuix/bert-mini

Text Classification

general-purpose

offline-assistant

intent-detection

embedded-systems

command-classification

semantic-search

boltuix committed on Jun 8

Commit 7dff5aa · verified · 1 Parent(s): 888ffec

Update README.md

Files changed (1)

README.md +8 -3

@@ -348,10 +348,15 @@ To adapt `bert-mini` for custom tasks (e.g., specific IoT commands):
     # Tokenize dataset
     def tokenize_function(examples):
-        return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=64)
+        # Use return_tensors="pt" here to get PyTorch tensors directly
+        return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=64, return_tensors="pt")
+    # Pass batched=True to the map function as the tokenize_function is designed to handle batches
     tokenized_dataset = dataset.map(tokenize_function, batched=True)
-    tokenized_dataset.set_format("torch", columns=["input_ids", "attention_mask", "label"])
+    # We don't need to set the format to "torch" explicitly here anymore
+    # because the tokenizer is already returning PyTorch tensors.
+    # tokenized_dataset.set_format("torch", columns=["input_ids", "attention_mask", "label"])
     # Define training arguments
     training_args = TrainingArguments(
@@ -388,7 +393,7 @@ To adapt `bert-mini` for custom tasks (e.g., specific IoT commands):
         outputs = model(**inputs)
         logits = outputs.logits
         predicted_class = torch.argmax(logits, dim=1).item()
-    print(f"Predicted class for '{text}': {'Valid IoT Command' if predicted_class == 1 else 'Invalid Command'}")
+    print(f"Predicted class for '{text}': {'Valid IoT Command' if predicted_class == 1 else 'Invalid Command'}")
    ```
 3. **Deploy**: Export to ONNX or TensorFlow Lite for edge devices.
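
The tokenizer call in this diff passes `padding="max_length"`, `truncation=True`, and `max_length=64`, so every example ends up with exactly 64 token ids plus a matching attention mask. The following is a minimal, dependency-free sketch of that behavior, not the tokenizer's actual implementation. The helper name `pad_or_truncate` is hypothetical, the pad id of 0 is an assumption, and the sketch ignores special-token handling (a real BERT tokenizer keeps `[SEP]` at the end when truncating).

```python
from typing import Dict, List


def pad_or_truncate(
    token_ids: List[int], max_length: int = 64, pad_id: int = 0
) -> Dict[str, List[int]]:
    """Hypothetical helper: force a sequence to exactly max_length ids.

    pad_id=0 is an assumption; check the tokenizer's pad_token_id.
    """
    # Truncation: drop everything past max_length.
    kept = token_ids[:max_length]
    # Padding: fill the remainder with pad_id.
    n_pad = max_length - len(kept)
    return {
        "input_ids": kept + [pad_id] * n_pad,
        # 1 marks real tokens, 0 marks padding the model should ignore.
        "attention_mask": [1] * len(kept) + [0] * n_pad,
    }


# A short sequence is padded; a long one is truncated.
short = pad_or_truncate([101, 2735, 2006, 102], max_length=8)
long = pad_or_truncate(list(range(1, 101)), max_length=8)
```

Because every row has the same length, a batch can be stacked into a rectangular tensor without a padding collator, at the cost of computing over padding positions for short inputs.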