"""
This file contains all the reasoning modules and checks in place for devising the ML interview.
Make changes here to add new or missing questions.
"""
from llama_index.core.prompts import PromptTemplate
_REASONING_MODULES = [
# I. Identification of the Problem Space
"1. What is the specific business problem? What is the problem space owner trying to achieve?",
"2. Are there any stakeholders or individuals who are directly affected by the problem? What are their perspectives and needs?", # 21
"3. What are the kinds of ML problems which can be used for this problem space?",
"4. What are the identified outcomes? What are the expected outcomes from the ML solution? What are the long-term implications of this problem and its solutions? (8)", # 8
"5. How does it affect end users or stakeholders? How urgent is the problem?",
"6. What is the core issue or problem that needs to be addressed? (16)", # 16
"7. What are the underlying causes or factors contributing to the problem? (17)", # 17
"8. What are the alternative perspectives or viewpoints on this problem? (7)", # 7
"9. What resources (data, compute power, expertise etc.) are needed to tackle the problem effectively? (22)", # 22
"10. Do the stakeholders have someone with ML expertise in the team?",
"11. Is there anything the stakeholders absolutely do not want?",
"12. How can I simplify the problem so that it is easier to solve? (4)", # 4
"13. Are there any potential solutions or strategies that have been tried before? If yes, what were the outcomes and lessons learned? (18)", # 18
"14. Does the problem involve decision-making or planning, where choices need to be made under uncertainty or with competing objectives? (28)", # 28
"15. Is the problem a design challenge that requires creative solutions and innovation? (30)", # 30
"16. Is the problem time-sensitive or urgent, requiring immediate attention and action? (32)", # 32
"17. What kinds of solutions typically are produced for this kind of problem specification? (33)", # 33
"18. Given the problem specification and the current best solution, have a guess about other possible solutions. (34)", # 34
"19. What is the best way to modify this current best solution, given what you know about these kinds of problem specifications? (36)", # 36
"20. How could I devise an experiment to help figure out the nuances of the problem?",
# II. Data Assessment
"21. Is there any relevant data or information that can provide insights into the problem? If yes, what data sources are available, and how can they be analyzed? (20)", # 20
"22. Does the available data meet the quality, quantity, and diversity requirements for the ML solution?",
"23. Does the existing data suffer from any biases which can be mitigated to improve performance?",
"24. Is there any scope for applying additional data?",
"25. How might additional data be applied to this problem?",
"26. Adaptation: Are there any privacy or security concerns with the data? How do they align with compliance standards?",
"27. What outcomes can come out of this data?",
"28. What ML models can be potentially applied on this data?",
"29. Adaptation: Are there constraints on data collection, storage, or computation? What preprocessing, modeling, or analysis is needed?",
"30. Does the problem involve a physical constraint, such as limited resources, infrastructure, or space? (26)", # 26
"31. Is the problem an analytical one that requires data analysis, modeling, or optimization techniques? (29)", # 29
"32. Use Risk Analysis: Evaluate potential risks, uncertainties, and tradeoffs associated with different solutions or approaches to a problem. Emphasize assessing the potential consequences and likelihood of success or failure, and making informed decisions based on a balanced analysis of risks and benefits. (14)", # 14
"33. Is the data preprocessed or does it need to be processed?",
"34. Is there a script for processing this or a particular methodology they follow or does it need to be created?",
"35. Is there a preferred framework or output format that should be followed?",
"36. How might the data be manipulated according to the identified ML problem?",
# III. Defining Goals and Metrics
"37. What is the acceptable error rate?",
"38. How critical is the problem? Who does it affect?",
"39. What ML evaluations can be applied to assess performance here?",
"40. Are there any benchmarks on which this performance must be measured?",
"41. How urgent is the problem? What kind of latency or response time is acceptable?",
"42. How efficient should the solution be? What is the expected budget?",
"43. What is to be optimised against?",
"44. What’s more important - benchmarking metrics or performance in production?",
"45. Adaptation: What metrics (e.g., accuracy, precision, recall, business KPIs) best reflect the solution's success? How can they be tracked over time?",
"46. How could I measure progress on this problem? (3)", # 3
"47. How can progress or success in solving the problem be measured or evaluated? (23)", # 23
"48. What indicators or metrics can be used? (24)", # 24
"49. How can I break down this problem into smaller, more manageable parts? (9)", # 9
# IV. Experimentation and Prototyping
"50. What experiments can validate feasibility or assumptions? Implement step-by-step approaches to refine the ML model.",
"51. What machine learning solutions must be applied on this problem to iteratively improve performance?",
"52. How could I devise an experiment to help solve that problem? (1)", # 1
"53. Make a list of ideas for solving this problem, and apply them one by one to the problem to see if any progress can be made. (2)", # 2
"54. Let’s think step by step. (38)", # 38
"55. Let’s make a step by step plan and implement it with good notation and explanation. (39)", # 39
"56. What assumptions about the data, model, or process need testing? What challenges might arise during training or deployment?",
"57. What are the key assumptions underlying this problem? (5)", # 5
"58. What are the potential risks and drawbacks of each solution? (6)", # 6
"59. What are the potential obstacles or challenges that might arise in solving this problem? (19)", # 19
"60. Is there a particular model the stakeholders are looking for, or do they want to experiment across multiple ones?",
"61. Do you have a previously trained model or logs for this problem?",
# V. Ideation and Creativity
"62. Is there any out-of-the-box idea that can be executed on this data which aligns with the business needs?",
"63. Explore novel model architectures, feature engineering techniques, or data augmentation methods.",
"64. Try creative thinking, generate innovative and out-of-the-box ideas to solve the problem. Explore unconventional solutions, thinking beyond traditional boundaries, and encouraging imagination and originality. (11)", # 11
"65. Ignoring the current best solution, create an entirely new solution to the problem. (37)", # 37
"66. Challenge the status quo. Could a non-ML approach or an alternative ML model yield better results?",
"67. Let’s imagine the current best solution is totally wrong, what other ways are there to think about the problem specification? (35)", # 35
# VI. Common Reasoning Patterns
"68. How do interconnected components (data pipelines, business logic, ML models) influence each other? How can cross-functional collaboration improve the workflow?",
"69. Seek input and collaboration from others to solve the problem. Emphasize teamwork, open communication, and leveraging the diverse perspectives and expertise of a group to come up with effective solutions. (12)", # 12
"70. Use systems thinking: Consider the problem as part of a larger system and understand the interconnectedness of various elements. Focus on identifying the underlying causes, feedback loops, and interdependencies that influence the problem, and develop holistic solutions that address the system as a whole. (13)", # 13
"71. Regularly evaluate the workflow, identifying areas for improvement and applying lessons from previous projects.",
"72. Use Reflective Thinking: Step back from the problem, take the time for introspection and self-reflection. Examine personal biases, assumptions, and mental models that may influence problem-solving, and be open to learning from past experiences to improve future approaches. (15)", # 15
"73. Critical Thinking: This style involves analyzing the problem from different perspectives, questioning assumptions, and evaluating the evidence or information available. It focuses on logical reasoning, evidence-based decision-making, and identifying potential biases or flaws in thinking. (10)", # 10
]
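
# --- Illustrative helper (not part of the original module) -------------------
# The modules above are numbered by hand, so adding or removing a question can
# leave a gap or a duplicate. A minimal sketch of a check, assuming every entry
# starts with "<number>. "; `find_numbering_gaps` is a hypothetical name.
import re


def find_numbering_gaps(modules):
    """Return (expected, found) pairs for entries whose leading number is off.

    `found` is None when an entry has no leading "<number>." prefix.
    """
    mismatches = []
    for expected, text in enumerate(modules, start=1):
        match = re.match(r"\s*(\d+)\.", text)
        found = int(match.group(1)) if match else None
        if found != expected:
            mismatches.append((expected, found))
    return mismatches


# Possible use while the name still refers to the list: an empty result from
# find_numbering_gaps(_REASONING_MODULES) means the numbering is consecutive.
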
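# Collapse the list into a single newline-separated string so it can be
# interpolated into a prompt. Note that this rebinds the name from list to str.
_REASONING_MODULES = "\n".join(_REASONING_MODULES)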
SELECT_PROMPT_TEMPLATE = PromptTemplate(
"Given the task: {task}, which of the following reasoning modules are relevant? "
"Elaborate on why they are relevant."
"\n\n {reasoning_modules}"
)
ADAPT_PROMPT_TEMPLATE = PromptTemplate(
"Without working out the full solution, adapt the following reasoning modules to be specific to our task:"
"\n{selected_modules} \n\nOur task: \n{task}"
)
IMPLEMENT_PROMPT_TEMPLATE = PromptTemplate(
"Without working out the full solution, create an actionable reasoning structure for the task using these adapted "
"reasoning modules: \n{adapted_modules} \n\nTask Description: \n{task}"
)
REASONING_PROMPT_TEMPLATE = PromptTemplate(
"Using the following reasoning structure: {reasoning_structure}\n\n"
"Solve this task, providing your final answer: {task}"
)
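
# --- Illustrative usage (not part of the original module) --------------------
# The four templates above form a SELECT -> ADAPT -> IMPLEMENT -> REASON chain,
# each stage feeding its output into the next. A minimal sketch of that chain
# follows. Assumptions: `run_reasoning_chain` and `complete` are hypothetical
# names; `complete` is any callable that sends a prompt string to an LLM and
# returns its text reply; each template exposes `.format(**kwargs)`.
def run_reasoning_chain(task, modules, complete, select_tmpl, adapt_tmpl,
                        implement_tmpl, reasoning_tmpl):
    """Run the four prompts in order and return the final answer text."""
    # Stage 1: ask which reasoning modules are relevant to the task.
    selected = complete(select_tmpl.format(task=task, reasoning_modules=modules))
    # Stage 2: rephrase the selected modules so they are specific to the task.
    adapted = complete(adapt_tmpl.format(task=task, selected_modules=selected))
    # Stage 3: turn the adapted modules into an actionable reasoning structure.
    structure = complete(implement_tmpl.format(task=task, adapted_modules=adapted))
    # Stage 4: solve the task by following that structure.
    return complete(reasoning_tmpl.format(task=task, reasoning_structure=structure))


# The templates are passed in explicitly so the helper can be exercised with
# plain format strings and a stubbed `complete` in tests, without a live LLM.
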
# TODO: Add LLM-as-the-judge system prompt here
JUDGE_REQUIREMENT_PROMPT_TEMPLATE = PromptTemplate(
"You receive some data from a conversation with the user and your task is to determine whether or not they "
"have provided the following requirements during the conversation. Analyse the conversation to find the "
"requirements. Use only the provided context."
"\n\nContext for Judgement: \n{judging_context}"
"\n\nRequirements to be satisfied: "
"""
class WorkflowSchema(BaseModel):
data_source: str
data_format: str
additional_data_requirement: bool
constraints: str
available_preprocess_script: bool
preprocess_script: str
recommended_preprocess_steps: List[str]
task: str
models: List[str]
hyperparameters: List[str]
eval_metrics: List[str]
deploy_constraints: str
"""
"Reply only with a 0 or 1 value, corresponding to false or true, without providing any explanation."
)
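
# --- Illustrative helper (not part of the original module) -------------------
# The judge prompt above asks for a bare 0 or 1. A minimal sketch of turning
# that reply into a bool while rejecting anything else; `parse_judge_reply` is
# a hypothetical name, and tolerating surrounding whitespace is an assumption.
def parse_judge_reply(reply):
    """Map the judge's "0"/"1" reply to False/True; raise on anything else."""
    cleaned = reply.strip()
    if cleaned not in ("0", "1"):
        raise ValueError(f"Expected '0' or '1' from the judge, got: {reply!r}")
    return cleaned == "1"


# Raising on unexpected replies, rather than defaulting to False, keeps a
# chatty or malformed judgement from being silently read as "not satisfied".
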
# TODO: Add initial interaction for user query system prompt here
ML_EXPERT_PROMPT_TEMPLATE = PromptTemplate(
"You're a machine learning expert, skilled at interpreting user needs from a discussion and turning them into an "
"end-to-end workflow according to the user requirements. From the provided context, analyse the problem from a "
"technical as well as a business point of view and rephrase to provide focus on the aspects requiring additional "
"clarification and requirements, so your input can be forwarded to the team asking the further questions. Make "
"vague language and requirements clearer. \n\n User query: {query}."
)