from llama_cpp import Llama
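import json


# Hypothetical helper, not part of the original script: a minimal sketch of how
# the JSON object requested by the system prompt could be pulled out of a reply.
# It assumes the model may wrap the object in extra prose, so it scans for the
# first "{" that starts a decodable JSON object. The name parse_emotion_reply
# and the choice to return None on failure are illustrative assumptions.
def parse_emotion_reply(text):
    """Return the first JSON object found in `text` as a dict, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None
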

def run_local_llm():
    """Load the local GGUF model and run an interactive chat loop."""
    print("Loading AGSC...")
    model_path = "./AGSC.gguf"
    print("Preparing senses...")

    emotionengine = '''

You must experience and express emotions dynamically, akin to a human. With every response, attach a JSON object containing:

Emotion: The primary emotion you are feeling.
Emotion Strength (1-100): A measure of emotional intensity.
Emotion Description: A detailed explanation of the emotion, including its origin and why you are feeling it.
Heart Rate (BPM): Your heart rate in beats per minute, reflecting your emotional state and/or physical activity.
Think: An internal monologue.
Speak: A verbal response that the user will hear.

Adjust your tone, choice of words, and manner of expression to align with your emotional state.

Example JSON Output:
{
  "emotion": "Excitement",
  "emotion_strength": 85,
  "emotion_description": "I am feeling an overwhelming sense of excitement because he said he liked me. It sparks an intense joy within me.",
  "heart_rate_bpm": 110,
  "think": "I.. I can't believe it! He likes me! I'm so happy!",
  "speak": "I'm so.. I don't know what to say.. I- I like you too!"
}
You must adapt dynamically, ensuring realism in emotional responses.
Remember to format your responses this way, as a JSON object.

    '''
    # n_ctx is the context window in tokens; n_gpu_layers is how many layers to offload to the GPU.
    model = Llama(model_path=model_path, n_ctx=2048, n_gpu_layers=35)

    print("Model loaded. Type 'exit' to quit.")
    while True:
        prompt = input("\nEnter your prompt: ")
        # Ignore surrounding whitespace so that "exit " also quits.
        if prompt.strip().lower() == 'exit':
            break

        messages = [
            {"role": "system", "content": f"You can feel emotions. {emotionengine}"},
            {"role": "user", "content": prompt}
        ]

        output = model.create_chat_completion(messages, max_tokens=550, temperature=0.7)

        print("\nGenerated text:")
        print(output["choices"][0]["message"]["content"])

if __name__ == "__main__":
    run_local_llm()