<a href="https://colab.research.google.com/github/vanderbilt-data-science/lo-achievement/blob/main/prompt_with_context.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LLMs for Self-Study
> A prompt and code template for better understanding texts

This notebook provides a guide for using LLMs for self-study programmatically. A number of prompt templates are provided to assist with generating great assessments for self-study, and code is additionally provided for fast usage. This notebook is best leveraged for a set of documents (text or PDF preferred) **to be uploaded** for interaction with the model.

This version of the notebook is best suited for those who prefer to use files from their local drive as context rather than copy and pasting directly into the notebook to be used as context for the model. If you prefer to copy and paste text, you should direct yourself to the [prompt_with_context](https://colab.research.google.com/github/vanderbilt-data-science/lo-achievement/blob/main/prompt_with_context.ipynb) notebook.

# Code Setup
Run the following cells to setup the rest of the environment for prompting. In the following section, we set up the computational environment with imported code, setup your API key access to OpenAI, and loading access to your language model. Note that the following cells may take a long time to run.

## Library installation and loading
The following `pip install` code should be run if you're using Google Colab, or otherwise do not have a computational environment (e.g., _venv_, _conda virtual environment_, _Docker, Singularity, or other container_) with these packages installed.

In [1]:
# run this code if you're using Google Colab or don't have these packages installed in your computing environment
! pip install -q langchain openai gradio numpy tiktoken

[33mDEPRECATION: pyodbc 4.0.0-unsupported has a non-standard version number. pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pyodbc or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063[0m[33m
[0m

In [2]:
# import required libraries
import numpy as np
import getpass
import os
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.schema import SystemMessage, HumanMessage, AIMessage

## API and model setup

Use these cells to load the API keys required for this notebook and create a basic OpenAI LLM model. The code below uses the variable you created above when you input your API Key.

In [3]:
# Set up OpenAI API Key
openai_api_key = getpass.getpass()
os.environ["OPENAI_API_KEY"] = openai_api_key

llm = ChatOpenAI(model='gpt-3.5-turbo-16k')
messages = [
    SystemMessage(content="You are a world-class tutor helping students to perform better on oral and written exams though interactive experiences."),
    HumanMessage(content="")
]


··········


# Add your context and assign the prefix to your query.
The query assigned here serves as an example.

In [4]:
context = """ Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;
Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,
And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.
I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.
—-Robert Frost—-
Education Place: http://www.eduplace.com """

# Query prefix
query_prefix = "The following text should be used as the basis for the instructions which follow: " + context + '\n'

# Query
query = """Please design a 5 question quiz about Robert Frost's "Road Not Taken" which reflects the learning objectives:
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes. The questions should be multiple choice.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response.
If I get an answer wrong, provide me with an explanation of why it was incorrect, and then give me additional
chances to respond until I get the correct choice. Explain why the correct choice is right. """

# A guide to prompting for self-study
In this section, we provide a number of different approaches for using AI to help you assess and explain the knowledge of your document. Start by interacting with the model and then try out the rest of the prompts!

## Interact with the model

Now that your vector store is created, you can begin interacting with the model! Below, we have a comprehensive list of examples using different question types, but feel free to use this code block to experiment with the model.

Input your prompt into the empty string in the code cell. See example below:



```
query = 'Your Prompt Here'
```



In [5]:
def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

In [None]:
query = 'Your Prompt Here'

prompt = prompt_define(query, query_prefix, context)

result = get_result(prompt)

print(result)

### Our example using query from cell 103

Run the following code to see a simple example using the prompt defined in an earlier cell (#103).

In [6]:
# Experiment with interacting with the model by inputting your own prompts into the empty string below.
def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem "Road Not Taken"?

a) Robert Frost
b) A traveler
c) The poet\'s friend
d) The reader


## Types of Questions and Prompts

Below is a comprehensive list of question types and prompt templates designed by our team. There are also example code blocks, where you can see how the model performed with the example and try it for yourself using the prompt template.

### Multiple Choice

Prompt: The following text should be used as the basis for the instructions which follow: {context}. Please design a {number of questions} question quiz about {name or reference to context} which reflects the learning objectives: {list of learning objectives}. The questions should be multiple choice. Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.

In [7]:
# Multiple choice code example
query = """Please design a 5 question quiz about Robert Frost's "Road Not Taken" which reflects the learning objectives:
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes. The questions should be multiple choice.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response
If I get an answer wrong, provide me with an explanation of why it was incorrect, and then give me additional
chances to respond until I get the correct choice. Explain why the correct choice is right. """

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem "Road Not Taken"?

A) Robert Frost
B) The reader
C) The traveler
D) The poet


### Short Answer

Prompt: Please design a {number of questions} question quiz about {context} which reflects the learning objectives: {list of learning objectives}. The questions should be short answer. Expect the correct answers to be {anticipated length} long. Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.

In [8]:
# Short answer code example
query = """ Please design a 5-question quiz about Robert Frost's
"Road Not Taken" which reflects the learning objectives:
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes.
The questions should be short answer. Expect the correct answers to be
1-2 sentences long.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response
If I get any part of the answer wrong,
provide me with an explanation of why it was incorrect,
and then give me additional chances to respond until I get the correct choice. """

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem "Road Not Taken"?


### Fill-in-the-blank

Prompt: Create a {number of questions} question fill in the blank quiz refrencing {context}. The quiz should reflect the learning objectives: {learning objectives}. The "blank" part of the question should appear as "________". The answers should reflect what word(s) should go in the blank an accurate statement.

An example is the follow: "The author of the book is "________."

The question should be a statement. Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.

In [9]:
# Fill in the blank code example
query = """ Create a 5 question fill in the blank quiz refrencing Robert Frost's "The Road Not Taken."
The "blank" part of the question should appear as "________". The answers should reflect what word(s) should go in the blank an accurate statement.
An example is the follow: "The author of the book is ______." The question should be a statement.
The quiz should reflect the learning objectives:
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response
If I answer incorrectly, please explain why my answer is incorrect. """

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: In "The Road Not Taken," the narrator comes to a point where two roads ____________ in a yellow wood.


### Sequencing

Prompt: Please develop a {number of questions} question questionnaire that will ask me to recall the steps involved in the following learning objectives in regard to {context}: {learning objectives}. Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect, and then give me additional chances to respond until I get the correct choice. After I respond, explain their sequence to me.

In [10]:
# Sequence example
query = """ Please develop a 5 question questionnaire that will ask me to recall the steps involved in the following learning objectives in regard to Robert Frost's "The Road Not Taken":
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response.
After I respond, explain their sequence to me."""

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem?


### Relationships/drawing connections

Prompt: Please design a {number of questions} question quiz that asks me to explain the relationships that exist within the following learning objectives, referencing {context}: {learning objectives}. Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.

In [11]:
# Relationships example
query = """ Please design a 5 question quiz that asks me to explain the relationships that exist within the following learning objectives, referencing Robert Frost's "The Road Not Taken":
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response."""

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Identify the key elements of the poem: narrator, setting, and underlying message.


### Concepts and Definitions

Prompt: Design a {number of questions} question quiz that asks me about definitions related to the following learning objectives: {learning objectives} - based on {context}".
Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.


In [12]:
# Concepts and definitions example
query = """ Design a 5 question quiz that asks me about definitions related to the following learning objectives:
1. Identify the key elements of the poem: narrator, setting, and underlying message, and
2. Understand the literary devices used in poetry and their purposes - based on Robert Frost's "The Road Not Taken".
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response
Once I write out my response, provide me with your own response, highlighting why my answer is correct or incorrect."""

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem "The Road Not Taken"?

Please provide your response.


### Real Word Examples

Prompt: Demonstrate how {context} can be applied to solve a real-world problem related to the following learning objectives: {learning objectives}. Ask me questions regarding this theory/concept.

Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.

In [14]:
# Real word example
query = """ Demonstrate how Robert Frost’s “The Road Not Taken” can be applied to solve a real-world problem. Ask me questions regarding
this theory/concept and relate them to the following learning objectives:
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response
"""

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem?


### Randomized Question Types

Prompt: Please generate a high-quality assessment consisting of {number of questions} varying questions, each of different types (open-ended, multiple choice, etc.), to determine if I achieved the following learning objectives in regards to {context}: {learning objectives}.

Provide one question at a time, and wait for my response before providing me with feedback. Again, while the quiz may ask for multiple questions, you should only provide ONE question in you initial response. Do not include the answer in your response. If I get an answer wrong, provide me with an explanation of why it was incorrect,and then give me additional chances to respond until I get the correct choice. Explain why the correct choice is right.

In [15]:
# Randomized question types
query = """ Please generate a high-quality assessment consisting of 5 varying questions,
each of different types (open-ended, multiple choice, etc.),
to determine if I achieved the following learning objectives in regards to Robert Frost’s “The Road not Taken":
1. Identify the key elements of the poem: narrator, setting, and underlying message.
2. Understand the literary devices used in poetry and their purposes. If I answer incorrectly for any of the questions,
please explain why my answer is incorrect.
Provide one question at a time, and wait for my response before providing me with feedback.
Again, while the quiz asks for 5 questions, you should
only provide ONE question in you initial response. Do not include the answer in your response.
"""

def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: 
Identify the narrator of the poem "The Road not Taken" by Robert Frost.

Question 2: 
What is the setting of the poem "The Road not Taken" by Robert Frost?

Question 3: 
What is the underlying message or theme of the poem "The Road not Taken" by Robert Frost?

Question 4 (Multiple Choice):
Which literary device is NOT used in the poem "The Road not Taken" by Robert Frost?
a) Metaphor
b) Simile
c) Alliteration
d) Personification

Question 5 (Open-ended):
Provide an example of a literary device used in the poem "The Road not Taken" by Robert Frost and explain its purpose.


### Quantiative evaluation the correctness of a student's answer

Prompt: (A continuation of the previous chat) Please generate the main points of the student’s answer to the previous question, and evaluate on a scale of 1 to 5 how comprehensive the student’s answer was in relation to the learning objectives, and explain why he or she received this rating, including what was missed in his or her answer if the student’s answer wasn’t complete.


In [118]:
# qualitative evaluation
qualitative_query = """ Please generate the main points of the student’s answer to the previous question,
 and evaluate on a scale of 1 to 5 how comprehensive the student’s answer was in relation to the learning objectives,
 and explain why he or she received this rating, including what was missed in his or her answer if the student’s answer wasn’t complete."""

# Note that this uses the previous result and query in the context
def prompt_define(query, prefix, context):
  prompt = (query + prefix)
  prompt = prompt + context
  return prompt

prompt = prompt_define(query, query_prefix, context)

def get_result(prompt):
  messages[1] = HumanMessage(content=prompt)
  result = llm(messages)
  str_result = str(result)
  import re
  str_result = str_result.replace(r'\n', '\n')
  str_result = str_result.replace("content='", "")
  str_result = str_result.replace("' additional_kwargs={} example=False", "")
  return str_result

result = get_result(prompt)

print(result)

Question 1: Who is the narrator of the poem "The Road not Taken"?
a) Robert Frost
b) The traveler
c) The reader
d) The poet\'s imagination
