Results are gibberish

#11
by mrostengrab - opened

Prompting Osmosis using this prompt:
"

  1. Create a new project in runcomfy to organize the image generation workflow.
  2. Locate and select the "convert-text-to-image" flow available in runcomfy.
  3. Retrieve the input requirements for the "convert-text-to-image" flow.
  4. Gather necessary inputs (such as the text prompt and any optional settings) for the image generation.
  5. Run a job with the selected flow and provided inputs to generate the image.
  6. Monitor the job status, and once completed, retrieve the generated image asset from the job outputs.
    "

With this system prompt:
f"""You are a helpful assistant that understands and translates text to JSON format according to the following schema. {json.dumps(OrchestratorOsmosisResult.model_json_schema())}

The user prompt will contain a result of an agent in charge of creating plans.
It is not your job to execute those steps or perform any actions.
You are a professional formatting assistant; your job is to put each step of the plan into an array so that execution can be passed to another agent.

For example a prompt may be:

  1. Create a new project on the runcomfy platform to organize the image generation workflow.
  2. Locate the "convert-text-to-image" flow available within runcomfy.
  3. Retrieve the required input parameters for the "convert-text-to-image" flow, such as the text prompt, resolution, and any other available options.
  4. Submit the necessary inputs and execute a job using the selected flow, which will initiate the image generation process.
  5. Monitor the status of the created job, and once it is complete, retrieve and download the generated image asset(s) from the job outputs.

Your job is to put each step of the plan into the plan_steps array like so:
plan_steps[0]: 1. Create a new project on the runcomfy platform to organize the image generation workflow.
plan_steps[1]: 2. Locate the "convert-text-to-image" flow available within runcomfy.
plan_steps[2]: 3. Retrieve the required input parameters for the "convert-text-to-image" flow, such as the text prompt, resolution, and any other available options.
plan_steps[3]: 4. Submit the necessary inputs and execute a job using the selected flow, which will initiate the image generation process.
plan_steps[4]: 5. Monitor the status of the created job, and once it is complete, retrieve and download the generated image asset(s) from the job outputs.
"""
The format schema is:
class OrchestratorOsmosisResult(BaseModel):
    plan_steps: List[str]

With this operation performed:
format=OrchestratorOsmosisResult.model_json_schema()
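
For reference, that format argument is just Pydantic's standard JSON Schema dict; a quick check (Pydantic v2, key order may vary):

from typing import List
from pydantic import BaseModel

class OrchestratorOsmosisResult(BaseModel):
    plan_steps: List[str]

print(OrchestratorOsmosisResult.model_json_schema())
# {'properties': {'plan_steps': {'items': {'type': 'string'},
#                                'title': 'Plan Steps',
#                                'type': 'array'}},
#  'required': ['plan_steps'],
#  'title': 'OrchestratorOsmosisResult',
#  'type': 'object'}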

And the result is:
{"plan_steps": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74",...,"tqjgjwvzgjyjgjkxjwyjxgjxjwzxgxhzznxyxhyywylpymkbybxxxhxhmxoemfwdgqsjl","zjxjwjxttysmynvtxxgqkxqfjsyjgzhuztjwxgkzrzzyvnxqdkwhgjxmmbcrtbbrdjhmxpnhwz"}
And so on: the numbers go up to 1000, and there are 100 or so of those gibberish strings at the end of the array.

osmosis org • edited Jun 10

Hey there! This model is quite brittle to changes in its system prompt, so I would recommend keeping it the same. As for the use case, it's best suited to tasks like answer extraction from convoluted reasoning traces, while this case looks more like whole-text extraction. I've gone ahead and made some modifications, which you can try:

from typing import List

from ollama import chat
from pydantic import BaseModel

class OrchestratorOsmosisResult(BaseModel):
    planning_steps: List[str]

# Keep the system prompt minimal: the formatting instruction plus the schema.
system_prompt = f"You are a helpful assistant that understands and translates text to JSON format according to the following schema. {OrchestratorOsmosisResult.model_json_schema()}"

reasoning_trace = """
Create a new project on the Lavaflow platform to organize the image generation workflow.
Locate the "convert-text-to-image" flow available within Lavaflow.
Retrieve the required input parameters for the "convert-text-to-image" flow, such as the text prompt, resolution, and any other available options.
Submit the necessary inputs and execute a job using the selected flow, which will initiate the image generation process.
Monitor the status of the created job, and once it is complete, retrieve and download the generated image asset(s) from the job outputs.
"""

response = chat(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": reasoning_trace},
    ],
    model="Osmosis/Osmosis-Structure-0.6B",
    # Constrain decoding to the Pydantic model's JSON Schema.
    format=OrchestratorOsmosisResult.model_json_schema(),
)

answer = OrchestratorOsmosisResult.model_validate_json(response.message.content)
print(answer)

This will yield:

planning_steps=['1. Create the Lavaflow project with the following parameters', "2. Locate the 'convert-text-to-image' flow", '3. Retrieve input parameters', '4. Submit and execute job', '5. Monitor status', '6. Download and generate image', '7. Finalize and deliver']

I would also experiment a bit with the user prompt format, as well as the temperature, to see if you get better results.
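
If you want to lower the temperature, ollama's chat() accepts an options dict; a minimal sketch reusing the prompts above (the 0.2 value is an arbitrary starting point, not a tuned recommendation):

response = chat(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": reasoning_trace},
    ],
    model="Osmosis/Osmosis-Structure-0.6B",
    format=OrchestratorOsmosisResult.model_json_schema(),
    # Sampling parameters go in `options`; lower temperature means less random output.
    options={"temperature": 0.2},
)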

AndyGulp changed discussion status to closed
AndyGulp changed discussion status to open

Cool, trying that out now.
https://huggingface.co/osmosis-ai/Osmosis-Structure-0.6B/discussions/2#6839e9296463097bc606c67d
In this discussion post, the user mentions that few-shot prompting is beneficial. Would they be putting the examples in the user prompt for this model?
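
For example (just sketching what I mean; the worked example pair here is invented), would it look like adding a prior user/assistant turn to the messages list?

few_shot_input = "1. Open the dataset.\n2. Clean the rows."
few_shot_output = '{"planning_steps": ["1. Open the dataset.", "2. Clean the rows."]}'

response = chat(
    messages=[
        {"role": "system", "content": system_prompt},
        # One worked example as a prior user/assistant exchange:
        {"role": "user", "content": few_shot_input},
        {"role": "assistant", "content": few_shot_output},
        # The real input goes last:
        {"role": "user", "content": reasoning_trace},
    ],
    model="Osmosis/Osmosis-Structure-0.6B",
    format=OrchestratorOsmosisResult.model_json_schema(),
)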
