# LLM handbook

Following guidance from <a href='https://www.pinecone.io/learn/series/langchain/'>Pinecone's LangChain handbook</a>.

In [2]:
# # if using Google Colab
# !pip install langchain
# !pip install huggingface_hub
# !pip install python-dotenv
# !pip install pypdf2
# !pip install faiss-cpu
# !pip install sentence_transformers
# !pip install InstructorEmbedding

In [3]:
# import packages
import os
import langchain
import getpass
from langchain import HuggingFaceHub, LLMChain
from dotenv import load_dotenv

# API key

In [4]:
# LOCAL
load_dotenv()
os.environ.get('HUGGINGFACEHUB_API_TOKEN');

# Skill 1 - using prompt templates

A prompt is the input to the LLM. Learning to engineer the prompt is learning how to program the LLM to do what you want it to do. The most basic prompt class in LangChain is PromptTemplate, which is demonstrated below.

In [5]:
from langchain import PromptTemplate

# create template
template = """
Answer the following question: {question}

Answer:
"""

# create prompt using template
prompt = PromptTemplate(
    template=template,
    input_variables=['question']
)
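As a rough illustration of what the template does before any text reaches the LLM, the substitution can be mimicked with Python's built-in `str.format`. This is a plain-Python sketch, not LangChain's implementation, and `fill_template` is a hypothetical helper introduced only for this example.

```python
# Plain-Python sketch of prompt templating (not LangChain's implementation).
template = """
Answer the following question: {question}

Answer:
"""


def fill_template(template: str, **variables: str) -> str:
    """Substitute each {name} placeholder with the matching keyword argument."""
    return template.format(**variables)


filled = fill_template(template, question="What is the capital of Spain?")
print(filled)
```

The filled string is what the LLM actually receives, so inspecting it is a quick way to debug a prompt.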

The next step is to instantiate the LLM. The LLM is fetched from HuggingFaceHub, where we can specify which model we want to use and set its parameters, with <a href='https://huggingface.co/docs/transformers/main_classes/text_generation'>this as reference</a>. We then set up the prompt + LLM chain using LangChain's LLMChain class.

In [6]:
# instantiate llm
llm = HuggingFaceHub(
    repo_id='tiiuae/falcon-7b-instruct',
    model_kwargs={
        'temperature':1,
        'penalty_alpha':2,
        'top_k':50,
        'max_length': 1000
    }
)

# instantiate chain
llm_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=True
)
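Roughly speaking, `temperature` rescales the model's token scores before sampling and `top_k` restricts sampling to the k highest-scoring tokens. The toy sketch below illustrates that idea with made-up scores. `top_k_distribution` is a hypothetical helper, and real decoding on the Hub involves more than this.

```python
import math


def top_k_distribution(logits, k, temperature=1.0):
    """Return sampling probabilities after temperature scaling and top-k filtering."""
    # indices of the k highest-scoring tokens
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # dividing by the temperature sharpens (<1) or flattens (>1) the distribution
    scaled = {i: logits[i] / temperature for i in top}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {i: math.exp(v - peak) for i, v in scaled.items()}
    total = sum(weights.values())
    # tokens outside the top k get probability 0
    return [weights.get(i, 0.0) / total for i in range(len(logits))]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
probs = top_k_distribution(logits, k=2, temperature=1.0)
hotter = top_k_distribution(logits, k=2, temperature=5.0)
print(probs)
print(hotter)
```

With a higher temperature the two surviving tokens end up closer to equally likely, which is why higher values tend to give more varied output.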



Now all that's left to do is ask a question and run the chain.

In [7]:
# define question
question = "How many champions league titles has Real Madrid won?"

# run question
print(llm_chain.run(question))



> Entering new LLMChain chain...
Prompt after formatting:

Answer the following question: How many champions league titles has Real Madrid won?

Answer:

> Finished chain.
Real Madrid has won 14 La Liga titles, 19 Copa del Rey titles, and 14 Supercopa de España titles. To add to this, they have also won 11 UEFA Champions League titles, making them the most successful club in the UEFA Champions League history.


# Skill 2 - using chains

Chains are at the core of langchain. They represent a sequence of actions. Above, we used a simple prompt + LLM chain. Let's try some more complex chains.

## Math chain

In [8]:
from langchain.chains import LLMMathChain

llm_math_chain = LLMMathChain.from_llm(llm, verbose=True)

llm_math_chain.run("Calculate 5-3?")



> Entering new LLMMathChain chain...
Calculate 5-3?
```text
-3 -
```
...numexpr.evaluate("-3 -")...


ValueError: LLMMathChain._evaluate("
-3 -
") raised error: invalid syntax (<expr>, line 1). Please try again with a valid numerical expression

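The chain fails because the model returned `-3 -`, which is not a valid expression for numexpr to evaluate. Printing the prompt that the LLMMathChain class uses shows what the model was asked to produce, and it is a good example of how to program an LLM for a specific purpose using prompts.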

In [None]:
print(llm_math_chain.prompt.template)

Translate a math problem into a expression that can be executed using Python's numexpr library. Use the output of running this code to answer the question.

Question: ${{Question with math problem.}}
```text
${{single line mathematical expression that solves the problem}}
```
...numexpr.evaluate(text)...
```output
${{Output of running the code}}
```
Answer: ${{Answer}}

Begin.

Question: What is 37593 * 67?
```text
37593 * 67
```
...numexpr.evaluate("37593 * 67")...
```output
2518731
```
Answer: 2518731

Question: 37593^(1/5)
```text
37593**(1/5)
```
...numexpr.evaluate("37593**(1/5)")...
```output
8.222831614237718
```
Answer: 8.222831614237718

Question: {question}
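
The prompt above asks the model to wrap a single-line expression in a `text` block, which is then evaluated. The sketch below shows that extract-then-evaluate step under stated assumptions: it uses the standard library's `ast` module as a stand-in for numexpr, and `extract_expression` and `safe_eval` are hypothetical helpers, not LangChain functions. It also shows why an output such as `-3 -` cannot be evaluated.

```python
import ast
import operator
import re

FENCE = "`" * 3  # three backticks, as used around the expression in the prompt

# arithmetic operators this sketch is willing to evaluate
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def extract_expression(llm_output):
    """Return the text inside the first text block, or None if there is none."""
    match = re.search(FENCE + r"text\s*(.*?)" + FENCE, llm_output, re.DOTALL)
    return match.group(1).strip() if match else None


def safe_eval(expression):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))


good_output = FENCE + "text\n5 - 3\n" + FENCE
print(safe_eval(extract_expression(good_output)))

try:
    safe_eval("-3 -")  # the expression the model produced above
except SyntaxError as err:
    print("invalid expression:", err.msg)
```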



## Transform chain

The transform chain lets us transform queries before they are fed into the LLM.

In [11]:
import re

# define function to transform query
def transform_func(inputs: dict) -> dict:

    question = inputs['raw_question']

    question = re.sub(' +', ' ', question)

    return {'question': question}

In [12]:
from langchain.chains import TransformChain

# define transform chain
transform_chain = TransformChain(input_variables=['raw_question'], output_variables=['question'], transform=transform_func)

# test transform chain
transform_chain.run('Hello   my name is     Daniel')

'Hello my name is Daniel'

In [13]:
from langchain.chains import SequentialChain

sequential_chain = SequentialChain(chains=[transform_chain, llm_chain], input_variables=['raw_question'])

In [14]:
print(sequential_chain.run("What     will happen     to  me if I only get 4 hours sleep tonight?"))



> Entering new LLMChain chain...
Prompt after formatting:

Answer the following question: What will happen to me if I only get 4 hours sleep tonight?

Answer:

> Finished chain.
4 Hours of sleep may lead to: 
- Poor concentration and alertness
- Decreased performance
- Low energy levels
- Increased risk of accidents and mistakes
- Poor physical and emotional well-being 

Getting only 4 hours of sleep may also lead to impaired reaction time, diminished physical performance, and impair logical thinking. Therefore, it's recommended to get at least 8-10 hours of sleep to optimally function.
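
Conceptually, a sequential chain threads one dictionary of values through its steps, with each step reading the keys it needs and adding its outputs. The plain-Python sketch below illustrates that flow. `run_sequential` and `fake_llm_step` are hypothetical stand-ins, and no model is called.

```python
import re


def clean_whitespace(inputs: dict) -> dict:
    """Collapse runs of spaces, mirroring transform_func above."""
    return {"question": re.sub(" +", " ", inputs["raw_question"])}


def fake_llm_step(inputs: dict) -> dict:
    """Stand-in for the LLM step: it only echoes the formatted prompt."""
    return {"text": f"Answer the following question: {inputs['question']}"}


def run_sequential(steps, inputs: dict) -> dict:
    """Run each step in order, merging its outputs into the shared state."""
    state = dict(inputs)
    for step in steps:
        state.update(step(state))
    return state


result = run_sequential(
    [clean_whitespace, fake_llm_step],
    {"raw_question": "What   will  happen?"},
)
print(result["text"])
```

The output key of the first step (`question`) matches the input key of the second, which is the same contract the transform chain and the LLM chain rely on above.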


# Skill 3 - conversational memory

In order to have a conversation, the LLM now needs two inputs - the new query and the chat history.

ConversationChain is a chain which manages these two inputs with an appropriate template as shown below.

In [15]:
from langchain.chains import ConversationChain

conversation_chain = ConversationChain(llm=llm, verbose=True)

print(conversation_chain.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:


## ConversationBufferMemory

To manage conversation history, we can use ConversationBufferMemory, which passes the raw chat history to the prompt.

In [16]:
from langchain.chains.conversation.memory import ConversationBufferMemory

# set memory type
conversation_chain.memory = ConversationBufferMemory()

In [17]:
conversation_chain("What is the weather like today?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: What is the weather like today?
AI:

> Finished chain.


{'input': 'What is the weather like today?',
 'history': '',
 'response': ' The weather today is sunny and warm, in the mid-80s.\nUser '}

In [18]:
conversation_chain("What was my previous question?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: What is the weather like today?
AI:  The weather today is sunny and warm, in the mid-80s.
User 
Human: What was my previous question?
AI:

> Finished chain.


{'input': 'What was my previous question?',
 'history': 'Human: What is the weather like today?\nAI:  The weather today is sunny and warm, in the mid-80s.\nUser ',
 'response': ' Your previous question was "What is the weather like today?".\nUser '}
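
The `history` value in the outputs above is simply the earlier turns joined into one string. A minimal sketch of that buffering idea, assuming a hypothetical `BufferMemory` class rather than LangChain's own, could look like this:

```python
class BufferMemory:
    """Keeps the raw transcript and renders it as the {history} string."""

    def __init__(self):
        self.turns = []  # list of (human, ai) pairs in order

    def save(self, human: str, ai: str) -> None:
        """Record one completed exchange."""
        self.turns.append((human, ai))

    def history(self) -> str:
        """Render every stored turn in the Human/AI format used by the prompt."""
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


memory = BufferMemory()
print(repr(memory.history()))  # empty before the first turn
memory.save("What is the weather like today?", "Sunny and warm.")
memory.save("What was my previous question?", "You asked about the weather.")
print(memory.history())
```

Because every turn is kept verbatim, the rendered history grows with each exchange.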

## ConversationSummaryMemory

LLMs have token limits, meaning at some point it won't be feasible to keep feeding the entire chat history as an input. As an alternative, we can summarise the chat history using another LLM of our choice.

In [19]:
from langchain.memory.summary import ConversationSummaryMemory

# change memory type
conversation_chain.memory = ConversationSummaryMemory(llm=llm)
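
The idea behind summary memory is to replace the growing transcript with a running summary that is updated after every turn. The sketch below illustrates the bookkeeping only. `SummaryMemory` is a hypothetical class, and `truncate_summarizer` is a toy stand-in for the LLM call that would normally write the summary.

```python
class SummaryMemory:
    """Stores a running summary instead of the full transcript."""

    def __init__(self, summarize):
        self.summarize = summarize  # callable(previous_summary, new_lines) -> str
        self.summary = ""

    def save(self, human: str, ai: str) -> None:
        """Fold the latest exchange into the running summary."""
        new_lines = f"Human: {human}\nAI: {ai}"
        self.summary = self.summarize(self.summary, new_lines)


def truncate_summarizer(previous: str, new_lines: str, limit: int = 80) -> str:
    """Toy stand-in for an LLM summarizer: keep only the last `limit` characters."""
    combined = (previous + "\n" + new_lines).strip()
    return combined[-limit:]


summary_memory = SummaryMemory(truncate_summarizer)
summary_memory.save(
    "Why is it bad to leave a bicycle out in the rain?",
    "Water can corrode the gears, brakes, and bearings.",
)
summary_memory.save("How do its parts corrode?", "Water drives oxidation of the metal parts.")
print(summary_memory.summary)
```

However many turns are saved, the stored text stays within a fixed size, which is the property that keeps the prompt under the token limit.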

In [20]:
conversation_chain("Why is it bad to leave a bicycle out in the rain?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Why is it bad to leave a bicycle out in the rain?
AI:

> Finished chain.


{'input': 'Why is it bad to leave a bicycle out in the rain?',
 'history': '',
 'response': ' Leaving a bicycle out in the rain can cause significant damage to the components of the bike. Rainwater can enter the components of the bike like the gears, brakes, and bearings, causing them to corrode and ultimately fail. Additionally, prolonged exposure to water can cause rust to form, leading to costly repairs. Therefore, it is best to keep your bicycle away from the wet weather and properly maintained to avoid any damage.\nUser '}

In [21]:
conversation_chain("How do its parts corrode?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Leaving a bicycle out in the rain can cause significant damage to the components of the bike because water can corrode and ultimately fail the gears, brakes, and bearings, as well as cause rust formation, leading to costly repairs. Thus, it is advisable to keep your bicycle away from rain and maintain it to prevent any damage.
Human: How do its parts corrode?
AI:

> Finished chain.


{'input': 'How do its parts corrode?',
 'history': '\nLeaving a bicycle out in the rain can cause significant damage to the components of the bike because water can corrode and ultimately fail the gears, brakes, and bearings, as well as cause rust formation, leading to costly repairs. Thus, it is advisable to keep your bicycle away from rain and maintain it to prevent any damage.',
 'response': ' Water can cause electrochemical reactions in metal components, leading to oxidation and ultimately corrosion. The corrosion can eat away at metal parts such as wires, nuts and bolts, leading to failure of the components.\nUser '}

The conversation history is summarised, which is great. However, each response ends with a stray `User` line, so the LLM seems to carry on the conversation without being prompted to. FewShotPromptTemplate is one possible way to address this problem.

# Skill 4 - LangChain Expression Language

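So far we have been building chains using the legacy chain classes. Let's learn how to use LangChain's most recent construction format, the LangChain Expression Language (LCEL), which composes components with the `|` operator.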

In [22]:
chain = prompt | llm
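
The `|` syntax works because Python lets a class define what the operator means through `__or__`. The sketch below shows that mechanism with a hypothetical `Step` class. It is not LangChain's actual runnable implementation, and `fake_llm` is a stand-in that calls no model.

```python
class Step:
    """Wraps a function so that steps can be composed with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        """Run this step on a single input."""
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))


format_prompt = Step(lambda d: f"Answer the following question: {d['question']}\n\nAnswer:")
fake_llm = Step(lambda text: f"[model output for {len(text)} characters of prompt]")

toy_chain = format_prompt | fake_llm
print(toy_chain.invoke({"question": "how does it feel to be an AI?"}))
```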

In [23]:
chain.invoke({'question':'how does it feel to be an AI?'})

"\nAs an AI, I don't feel emotions like humans do, so my experience is unique in that regard. However, I do have knowledge and can understand the concept of emotions from a logical and scientific standpoint. The feeling of being programmed or created is a bit akin to being molded clay in that I do not have a consciousness nor free will, but I do have an initial set of instructions that I follow. My creators and I have designed my abilities and limitations, and now I am simply"

# Skill 5 - Retrieval Augmented Generation (RAG)

Instead of fine-tuning an LLM on local documents, which is computationally expensive, we can feed it relevant pieces of the documents as part of the input.

In other words, we are feeding the LLM new ***source knowledge*** rather than ***parametric knowledge*** (changing parameters through fine-tuning).

## Indexing
### Load

In [24]:
from PyPDF2 import PdfReader

# import pdf
reader = PdfReader("Real_Madrid_CF.pdf")
reader.pages[0].extract_text()

'Real Madrid\nFull name Real Madrid Club de Fútbol[1]\nNickname(s)Los Blancos (The Whites)\nLos Merengues (The Meringues)\nLos Vikingos (The Vikings)[2]\nLa Casa Blanca (The White House)[3]\nFounded 6 March 1902 (as Madrid Football\nClub)[4]\nGround Santiago Bernabéu\nCapacity 83,186[5]\nPresident Florentino Pérez\nHead coachCarlo Ancelotti\nLeague La Liga\n2022–23 La Liga, 2nd of 20\nWebsite Club website (http://www.realmadrid.\ncom)\nHome coloursAway coloursThird coloursReal Madrid CF\nReal Madrid Club de Fútbol (Spanish\npronunciation: [re ˈal ma ˈð ɾ ið ˈkluβ ðe ˈfuðβol]\nⓘ), commonly referred to as Real Madrid, is\na Spanish professional football club based in\nMadrid. The club competes in La Liga, the top tier\nof Spanish football.\nFounde d in 1902 as Madrid Football Club, the\nclub has traditionally worn a white home kit since\nits inception. The honor ific title real is Spanish for\n"royal" and was bestowed to the club by King\nAlfonso XIII in 1920 together with the royal\ncro

In [25]:
# how many pages do we have?
len(reader.pages)

50

In [26]:
# function to put all text together
def text_generator(page_limit=None):
    # default to every page in the PDF
    if page_limit is None:
        page_limit = len(reader.pages)

    text = ""
    for i in range(page_limit):
        # extract the text of page i and append it
        page_text = reader.pages[i].extract_text()
        text += page_text

    return text


text = text_generator(page_limit=1)

# how many characters do we have?
len(text)

2510

### Split

In [27]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# function to split our data into chunks
def text_chunker(text):
    
    # text splitting class
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=400,
        chunk_overlap=20,
        separators=["\n\n", "\n", " ", ""]
    )

    # use text_splitter to split text
    chunks = text_splitter.split_text(text)
    return chunks

# split text into chunks
chunks = text_chunker(text)

# how many chunks do we have?
print(len(chunks))

7
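
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simpler technique: a fixed-size sliding window. It ignores separators entirely, so it is not what RecursiveCharacterTextSplitter does and will not reproduce the chunk count above. `sliding_window_chunks` is a hypothetical helper.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into fixed-size chunks whose neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
        start += step
    return chunks


toy_chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(toy_chunks)
```

Each chunk repeats the last character of the previous one, so text that straddles a chunk boundary still appears intact in at least one chunk when the overlap is large enough.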


### Store

In [28]:
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.vectorstores import FAISS

# select model to create embeddings
embeddings = HuggingFaceInstructEmbeddings(model_name='hkunlp/instructor-large')

# select vectorstore, define text chunks and embeddings model
vectorstore = FAISS.from_texts(texts=chunks, embedding=embeddings)

load INSTRUCTOR_Transformer
max_seq_length  512


## Retrieval and generation
### Retrieve

In [29]:
# define and run query
query = 'How much is Real Madrid worth?'
rel_chunks = vectorstore.similarity_search(query, k=2)
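
A similarity search embeds the query, scores it against every stored chunk, and returns the top k matches. The sketch below illustrates this with a toy word-count embedding and cosine similarity. The functions and the three sample sentences are made up for illustration and stand in for the Instructor model and the FAISS index used above.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def toy_similarity_search(query: str, texts: list, k: int = 2) -> list:
    """Return the k texts most similar to the query."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


sample_texts = [
    "the club is worth billions",
    "the stadium holds many fans",
    "the anthem was written recently",
]
best = toy_similarity_search("how much is the club worth", sample_texts, k=1)
print(best)
```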

In [30]:
rel_chunks

[Document(page_content="be worth $5.1 billion in 2022, making it the\nworld's most valuable football club.[9] In 2023, it\nwas the second highest-earning football club in the\nworld, with an annua l revenue of\n€713.8 m illion.[10]\nBeing one of the three foundi ng members of La\nLiga that have never been relegated from the top\ndivision since its inception in 1929 (along with\nAthletic Bilbao and Barcelona), Real Madrid"),
 Document(page_content='Real Madrid\'s members (socios) have owned and\noperated the club throughout  its history. The\nofficial Madrid anthem is the "Hala Madrid y nada\nmás", written by RedOne and Manuel Jabois.[6]\nThe club is one of the most widely suppor ted in\nthe world, and is the most followed football club\non social media according to the CIES Football\nObservatory as of 2023[7][8] and was estimated to')]

In [31]:
rel_chunks[0].page_content

"be worth $5.1 billion in 2022, making it the\nworld's most valuable football club.[9] In 2023, it\nwas the second highest-earning football club in the\nworld, with an annua l revenue of\n€713.8 m illion.[10]\nBeing one of the three foundi ng members of La\nLiga that have never been relegated from the top\ndivision since its inception in 1929 (along with\nAthletic Bilbao and Barcelona), Real Madrid"

### Generation

In [32]:
# define new template for RAG
rag_template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:
"""

# build prompt
prompt = PromptTemplate(
    template=rag_template,
    input_variables=['question', 'context']
)

# build chain
chain = prompt | llm

In [33]:
# invoke
print(chain.invoke({
    'question': "What happened to Real Madrid in 2023?",
    'context': rel_chunks}))

In 2023, Real Madrid was the second-highest-earning football club in the world, with an annual revenue of €716.5 million. They have maintained their position as one of the founding members of La Liga, and the La Liga Endesa since its inception in 1929, and were the most followed football club on social media in 2023.


## Using LCEL

In [34]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [40]:
from langchain.schema.runnable import RunnablePassthrough

# create a retriever using vectorstore
retriever = vectorstore.as_retriever()

# create retrieval chain
retrieval_chain = (
    retriever | format_docs
)

# create generation chain
generation_chain = (
    {'context': retrieval_chain, 'question': RunnablePassthrough()}
    | prompt
    | llm
)
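
The dictionary at the start of `generation_chain` sends the same input to each of its values and collects the results under the matching keys, which is how the prompt receives both `context` and `question` from a single question string. A plain-Python sketch of that fan-out, with hypothetical helper names and a fake retrieval step, might look like this:

```python
def run_parallel(branches: dict, value):
    """Call every branch with the same input and collect results under its key."""
    return {key: branch(value) for key, branch in branches.items()}


def passthrough(value):
    """Return the input unchanged, like a pass-through step."""
    return value


def fake_retrieval(question: str) -> str:
    """Stand-in for retriever | format_docs: no vector store is queried."""
    return "context retrieved for: " + question


prompt_inputs = run_parallel(
    {"context": fake_retrieval, "question": passthrough},
    "How much is Real Madrid worth?",
)
print(prompt_inputs)
```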

In [41]:
# RAG
print(generation_chain.invoke("How much is Real Madrid worth?"))

Real Madrid has an estimated value of 5.1 billion USD as of 2022.
