{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "wa8ykQk92aLX" }, "source": [ "# Mid-term project: NVIDIA Report chatbot\n", "\n", "In the following notebook we'll build RAG pipelines that will allow us to interactively retrieve information from the report \"NVIDIA 10-k Filings\". We will further use Ragas to evaluate component-wise metrics, as well as end-to-end metrics about the performance of our RAG pipelines." ] }, { "cell_type": "markdown", "metadata": { "id": "0_C2JvG1qO3h" }, "source": [ "## Set Environment Variables\n", "\n", "Let's set up our OpenAI API key so we can leverage their API later on." ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "8Lhqp5rUThG-", "outputId": "4389c3cd-4e2d-455c-cc40-cc6a094b4c42" }, "outputs": [], "source": [ "import os\n", "import openai\n", "from openai import AsyncOpenAI # importing openai for API usage\n", "import chainlit as cl # importing chainlit for our app\n", "from chainlit.prompt import Prompt, PromptMessage # importing prompt tools\n", "from chainlit.playground.providers import ChatOpenAI # importing ChatOpenAI tools\n", "\n", "from getpass import getpass\n", "\n", "openai.api_key = getpass(\"Please provide your OpenAI Key: \")\n", "os.environ[\"OPENAI_API_KEY\"] = openai.api_key\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "DV_BOewX8CW0" }, "source": [ "## Building our RAG pipeline" ] }, { "cell_type": "markdown", "metadata": { "id": "1VDGJdxCJEVc" }, "source": [ "### Creating an Index\n", "\n", "You'll notice that the largest changes (outside of some import changes) are that our old favourite chains are back to being bundled in an easily usable abstraction.\n", "\n", "We can still create custom chains using LCEL - but we can also be more confident that our pre-packaged chains are creating using LCEL under the hood." 
] }, { "cell_type": "markdown", "metadata": { "id": "RmFFThawK8lO" }, "source": [ "#### Loading Data" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [], "source": [ "from langchain_community.document_loaders import PyMuPDFLoader\n", "\n", "loader = PyMuPDFLoader(\n", " \"NVIDIA_report.pdf\",\n", ")\n", "\n", "documents = loader.load()" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'source': 'data/NVIDIA_report.pdf',\n", " 'file_path': 'data/NVIDIA_report.pdf',\n", " 'page': 0,\n", " 'total_pages': 96,\n", " 'format': 'PDF 1.4',\n", " 'title': '0001045810-24-000029',\n", " 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group',\n", " 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28',\n", " 'keywords': '0001045810-24-000029; ; 10-K',\n", " 'creator': 'EDGAR Filing HTML Converter',\n", " 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0',\n", " 'creationDate': \"D:20240221173732-05'00'\",\n", " 'modDate': \"D:20240221173744-05'00'\",\n", " 'trapped': '',\n", " 'encryption': 'Standard V2 R3 128-bit RC4'}" ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "documents[0].metadata" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "96" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(documents)" ] }, { "cell_type": "markdown", "metadata": { "id": "oQUl3sbZK4_1" }, "source": [ "#### Transforming Data\n", "\n", "Now that we've got our single document - let's split it into smaller pieces so we can more effectively leverage it with our retrieval chain!\n", "\n", "We'll start with the classic: `RecursiveCharacterTextSplitter`." 
] }, { "cell_type": "code", "execution_count": 91, "metadata": { "id": "6Nt2E1xnLNgr" }, "outputs": [], "source": [ "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", "\n", "text_splitter = RecursiveCharacterTextSplitter(\n", " chunk_size = 1000,\n", " chunk_overlap = 100\n", ")\n", "\n", "documents = text_splitter.split_documents(documents)" ] }, { "cell_type": "markdown", "metadata": { "id": "ilzwQxhiLcVV" }, "source": [ "Let's confirm we've split our document." ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "4wRw6a4aLfWh", "outputId": "2a9ec4d2-2827-458d-a5f3-a68a84c058a9" }, "outputs": [ { "data": { "text/plain": [ "438" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(documents)" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Document(page_content='Title of each class\\nTrading Symbol(s)\\nName of each exchange on which registered\\nCommon Stock, $0.001 par value per share\\nNVDA\\nThe Nasdaq Global Select Market\\nSecurities registered pursuant to Section 12(g) of the Act:\\nNone\\nIndicate by check mark if the registrant is a well-known seasoned issuer, as defined in Rule 405 of the Securities Act. Yes ☐ No ☒\\nIndicate by check mark if the registrant is not required to file reports pursuant to Section 13 or Section 15(d) of the Act. Yes ☐ No ☒\\nIndicate by check mark whether the registrant (1) has filed all reports required to be filed by Section 13 or 15(d) of the Securities Exchange Act of 1934 during the preceding 12 months (or for such shorter\\nperiod that the registrant was required to file such reports), and (2) has been subject to such filing requirements for the past 90 days. 
Yes ☒ No ☐', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 0, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'})" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "documents[1]" ] }, { "cell_type": "markdown", "metadata": { "id": "eZ93HkYcMJwW" }, "source": [ "#### Loading OpenAI Embeddings Model\n", "\n", "We will use OpenAI's `text-embedding-3-small` for this task." ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "id": "JU6CrDVZMgKe" }, "outputs": [], "source": [ "from langchain_openai import OpenAIEmbeddings\n", "\n", "embeddings = OpenAIEmbeddings(\n", "    model=\"text-embedding-3-small\"\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "rVtZR9JPLtR4" }, "source": [ "#### Creating a FAISS VectorStore\n", "\n", "Now that we have documents - we'll need a place to store them alongside their embeddings."
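] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What the vector store will do for us is similarity search: rank stored embeddings by how close they are to a query embedding. Here is a minimal cosine-similarity sketch over made-up 3-dimensional vectors - FAISS performs the same kind of comparison with optimized nearest-neighbor indexes over the 1536-dimensional vectors `text-embedding-3-small` produces." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "def cosine_similarity(a, b):\n", "    # cosine similarity = dot(a, b) / (|a| * |b|)\n", "    dot = sum(x * y for x, y in zip(a, b))\n", "    norm_a = math.sqrt(sum(x * x for x in a))\n", "    norm_b = math.sqrt(sum(x * x for x in b))\n", "    return dot / (norm_a * norm_b)\n", "\n", "query = [1.0, 0.0, 1.0]\n", "doc_vectors = {\"doc_a\": [1.0, 0.0, 0.9], \"doc_b\": [0.0, 1.0, 0.0]}\n", "\n", "# retrieval = return the stored vector(s) most similar to the query\n", "max(doc_vectors, key=lambda name: cosine_similarity(query, doc_vectors[name]))\n", "# -> 'doc_a'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OpenAI embeddings are normalized to unit length, so ranking by cosine similarity and ranking by dot product agree in practice."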
] }, { "cell_type": "code", "execution_count": 95, "metadata": { "id": "978TWiCtMA0B" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 18:24:10 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] } ], "source": [ "from langchain_community.vectorstores import FAISS\n", "\n", "vector_store = FAISS.from_documents(documents, embeddings)" ] }, { "cell_type": "markdown", "metadata": { "id": "Z7ht6bJX9PAY" }, "source": [ "#### Creating a Retriever\n", "\n", "To complete our index, all that's left to do is expose our vectorstore as a retriever:" ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "id": "xne8P5dQTUiR" }, "outputs": [], "source": [ "retriever = vector_store.as_retriever()" ] }, { "cell_type": "markdown", "metadata": { "id": "sO_DFBVKNvNm" }, "source": [ "#### Testing our Retriever\n", "\n", "Now that we've gone through the trouble of creating our retriever - let's see it in action!" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "id": "I9_ONxpnN0n6" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 18:24:13 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] } ], "source": [ "retrieved_documents = retriever.invoke(\"What is this document about?\")" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "-Za12yt4OBy1", "outputId": "34526432-09f0-4445-93d3-f966f25dd6df" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "page_content='23.1*\\nConsent of PricewaterhouseCoopers LLP\\n24.1*\\nPower of Attorney (included in signature page)\\n31.1*\\nCertification of Chief Executive Officer as required by Rule 13a-14(a) of the Securities Exchange Act of 1934\\n31.2*\\nCertification of Chief Financial Officer as required by Rule 13a-14(a) of the Securities Exchange Act of 1934\\n32.1#*\\nCertification of Chief Executive 
Officer as required by Rule 13a-14(b) of the Securities Exchange Act of 1934\\n32.2#*\\nCertification of Chief Financial Officer as required by Rule 13a-14(b) of the Securities Exchange Act of 1934\\n97.1+*\\nCompensation Recovery Policy, as amended and restated November 30, 2023\\n101.INS*\\nXBRL Instance Document\\n101.SCH*\\nXBRL Taxonomy Extension Schema Document\\n101.CAL*\\nXBRL Taxonomy Extension Calculation Linkbase Document\\n101.DEF*\\nXBRL Taxonomy Extension Definition Linkbase Document\\n101.LAB*\\nXBRL Taxonomy Extension Labels Linkbase Document\\n101.PRE*\\nXBRL Taxonomy Extension Presentation Linkbase Document\\n104' metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 82, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}\n", "page_content=\"101.PRE*\\nXBRL Taxonomy Extension Presentation Linkbase Document\\n104\\nCover Page Interactive Data File - the cover page interactive data file does not appear in the Interactive Data File because its XBRL tags\\nare embedded within the Inline XBRL document\\n* Filed herewith.\\n+ Management contract or compensatory plan or arrangement.\\n# In accordance with Item 601(b)(32)(ii) of Regulation S-K and SEC Release Nos. 
33-8238 and 34-47986, Final Rule: Management's Reports on Internal Control Over Financial Reporting and Certification of\\nDisclosure in Exchange Act Periodic Reports, the certifications furnished in Exhibits 32.1 and 32.2 hereto are deemed to accompany this Annual Report on Form 10-K and will not be deemed “filed” for\\npurpose of Section 18 of the Exchange Act. Such certifications will not be deemed to be incorporated by reference into any filing under the Securities Act or the Exchange Act, except to the extent that the\\nregistrant specifically incorporates it by reference.\" metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 82, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}\n", "page_content='10.7+\\nAmended and Restated 2007 Equity Incentive Plan – Global Restricted Stock\\nUnit Grant Notice and Global Restricted Stock Unit Agreement (2021)\\n10-Q\\n10.2\\n5/26/2021\\n10.8+\\nAmended and Restated 2007 Equity Incentive Plan – Global Restricted Stock\\nUnit Grant Notice and Global Restricted Stock Unit Agreement (2022)\\n10-K\\n10.16\\n3/18/2022\\n10.9+\\nAmended and Restated 2007 Equity Incentive Plan – Global Restricted Stock\\nUnit Grant Notice and Global Restricted Stock Unit Agreement (2023)\\n10-K\\n10.14\\n2/24/2023\\n10.10+\\nAmended and Restated 2012 Employee Stock Purchase Plan\\n10-Q\\n10.2\\n8/20/2021\\n10.11+\\nVariable Compensation Plan - Fiscal Year 2023\\n8-K\\n10.1\\n3/9/2022\\n10.12+\\nVariable Compensation Plan - Fiscal Year 
2024\\n8-K\\n10.1\\n3/8/2023\\n10.13\\nForm of Commercial Paper Dealer Agreement between NVIDIA Corporation, as\\nIssuer, and the Dealer party thereto\\n8-K\\n10.1\\n12/15/2017\\n21.1*\\nSubsidiaries of Registrant\\n23.1*\\nConsent of PricewaterhouseCoopers LLP\\n24.1*\\nPower of Attorney (included in signature page)' metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 82, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}\n", "page_content='45\\n \\nPart III\\n \\nItem 10.\\nDirectors, Executive Officers and Corporate Governance\\n 45\\nItem 11.\\nExecutive Compensation\\n 46\\nItem 12.\\nSecurity Ownership of Certain Beneficial Owners and Management and Related Stockholder Matters\\n46\\nItem 13.\\nCertain Relationships and Related Transactions, and Director Independence\\n46\\nItem 14.\\nPrincipal Accountant Fees and Services\\n 46\\n \\n \\nPart IV\\n \\nItem 15.\\nExhibit and Financial Statement Schedules\\n 47\\nItem 16.\\nForm 10-K Summary\\n 83\\nSignatures\\n \\n 84\\n2' metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 1, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': 
\"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}\n" ] } ], "source": [ "for doc in retrieved_documents:\n", "    print(doc)" ] }, { "cell_type": "markdown", "metadata": { "id": "D8MKsT6JTgCU" }, "source": [ "### Creating a RAG Chain\n" ] }, { "cell_type": "markdown", "metadata": { "id": "zs7qBLaEQEic" }, "source": [ "#### Creating a Prompt Template\n" ] }, { "cell_type": "code", "execution_count": 164, "metadata": { "id": "ijSNkTAjTsep" }, "outputs": [], "source": [ "from langchain.prompts import ChatPromptTemplate\n", "\n", "template = \"\"\"Answer the question based only on the following context. If you cannot answer the question with the context, please respond with 'I don't know':\n", "\n", "Context:\n", "{context}\n", "\n", "Question:\n", "{question}\n", "\"\"\"\n", "\n", "prompt = ChatPromptTemplate.from_template(template)" ] }, { "cell_type": "markdown", "metadata": { "id": "BYHnPaXl-cvJ" }, "source": [ "#### Setting Up our Basic QA Chain\n", "\n", "Now we can instantiate our basic RAG chain!\n", "\n", "We'll use LCEL directly here just to see an example of it - but you could just as easily use a pre-packaged abstraction to achieve the same goal!\n", "\n", "We'll also make sure to pass our retrieved context through to the output - which is critical for Ragas."
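] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see what the template produces once its slots are filled, a plain `str.format` sketch is enough - `ChatPromptTemplate` does essentially this substitution, plus wrapping the result in chat messages. The template is redefined below as a plain string so the cell stands alone, and the context and question values are invented placeholders:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "example_template = \"\"\"Answer the question based only on the following context. If you cannot answer the question with the context, please respond with 'I don't know':\n", "\n", "Context:\n", "{context}\n", "\n", "Question:\n", "{question}\n", "\"\"\"\n", "\n", "filled = example_template.format(\n", "    context=\"Depreciation expense for fiscal year 2024 was $894 million.\",\n", "    question=\"What was depreciation expense in fiscal year 2024?\",\n", ")\n", "\n", "print(filled)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Whatever the retriever returns gets stuffed into `{context}`, and the instruction to answer only from that context (or say \"I don't know\") is what keeps the chain grounded in the report."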
] }, { "cell_type": "code", "execution_count": 100, "metadata": { "id": "-TsjUWjbUfbW" }, "outputs": [], "source": [ "from operator import itemgetter\n", "\n", "from langchain_openai import ChatOpenAI\n", "from langchain_core.output_parsers import StrOutputParser\n", "from langchain_core.runnables import RunnablePassthrough\n", "\n", "primary_qa_llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n", "\n", "retrieval_augmented_qa_chain = (\n", " # INVOKE CHAIN WITH: {\"question\" : \"<>\"}\n", " # \"question\" : populated by getting the value of the \"question\" key\n", " # \"context\" : populated by getting the value of the \"question\" key and chaining it into the base_retriever\n", " {\"context\": itemgetter(\"question\") | retriever, \"question\": itemgetter(\"question\")}\n", " # \"context\" : is assigned to a RunnablePassthrough object (will not be called or considered in the next step)\n", " # by getting the value of the \"context\" key from the previous step\n", " | RunnablePassthrough.assign(context=itemgetter(\"context\"))\n", " # \"response\" : the \"context\" and \"question\" values are used to format our prompt object and then piped\n", " # into the LLM and stored in a key called \"response\"\n", " # \"context\" : populated by getting the value of the \"context\" key from the previous step\n", " | {\"response\": prompt | primary_qa_llm, \"context\": itemgetter(\"context\")}\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "7MgAa9JwBuJx" }, "source": [ "Above we have a RAG chain that first uses Python's itemgetter to extract the \"question\" from input, passing it to a retriever but also keeping the original \"question\" intact. A RunnablePassthrough then temporarily holds the \"context\" (which is obtained as an output of the \"question\" chained into the retriever) without altering it. Finally, the \"context\" and \"question\" are used as inputs for a prompt for ChatOpenAI, generating a \"response\"." 
] }, { "cell_type": "markdown", "metadata": { "id": "zO69de-F-oMD" }, "source": [ "Let's test it out!" ] }, { "cell_type": "code", "execution_count": 160, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "2FS5NxC6UyU2", "outputId": "6d926d73-0b0a-40b4-b4b2-48250f97f0c1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 19:37:45 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 19:37:46 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "I don't know.\n" ] } ], "source": [ "question = \"What is the provided document about?\"\n", "\n", "result = retrieval_augmented_qa_chain.invoke({\"question\" : question})\n", "\n", "print(result[\"response\"].content)" ] }, { "cell_type": "code", "execution_count": 102, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "tIuHVGPOO9P2", "outputId": "38418031-7020-4c70-d695-48400e966c9f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 18:24:15 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:24:16 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "Debora Shoquist is the Executive Vice President of Operations, and she is 69 years old.\n", "[Document(page_content='Minnesota, an M.S.E.E. degree from the California Institute of Technology and an M.B.A. degree from Harvard Business School.\\nDebora Shoquist joined NVIDIA in 2007 as Senior Vice President of Operations and in 2009 became Executive Vice President of Operations. Prior to NVIDIA,\\nMs. Shoquist served from 2004 to 2007 as Executive Vice President of Operations at JDS Uniphase Corp., a provider of communications test and measurement\\nsolutions and optical products for the telecommunications industry. 
She served from 2002 to 2004 as Senior Vice President and General Manager of the Electro-\\nOptics business at Coherent, Inc., a manufacturer of commercial and scientific laser equipment. Previously, she worked at Quantum Corp., a data protection\\ncompany, as President of the Personal Computer Hard Disk Drive Division, and at Hewlett-Packard. Ms. Shoquist holds a B.S. degree in Electrical Engineering\\nfrom Kansas State University and a B.S. degree in Biology from Santa Clara University.', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 12, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}), Document(page_content='supports diverse hiring, retention, and employee engagement, which we believe makes NVIDIA a great place to work.\\nDuring fiscal year 2025, we will continue to have a flexible work environment and maintain our company wide 2-days off a quarter for employees to rest and\\nrecharge.\\nInformation About Our Executive Officers\\nThe following sets forth certain information regarding our executive officers, their ages, and positions as of February 16, 2024:\\nName\\nAge\\nPosition\\nJen-Hsun Huang\\n60\\nPresident and Chief Executive Officer\\nColette M. Kress\\n56\\nExecutive Vice President and Chief Financial Officer\\nAjay K. Puri\\n69\\nExecutive Vice President, Worldwide Field Operations\\nDebora Shoquist\\n69\\nExecutive Vice President, Operations\\nTimothy S. 
Teter\\n57\\nExecutive Vice President and General Counsel\\nJen-Hsun Huang co-founded NVIDIA in 1993 and has served as our President, Chief Executive Officer, and a member of the Board of Directors since our', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 11, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}), Document(page_content='engineering and operations. From 1997 to 2010 Ms. Kress held a variety of positions at Microsoft, a software company, including, beginning in 2006, Chief\\nFinancial Officer of the Server and Tools division, where Ms. Kress was responsible for financial\\n12', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 11, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}), Document(page_content='Table of Contents\\nstrategy, planning, reporting and business development for the division. Prior to joining Microsoft, Ms. Kress spent eight years at Texas Instruments Incorporated,\\na semiconductor company, where she held a variety of finance positions. Ms. Kress holds a B.S. 
degree in Finance from University of Arizona and an M.B.A.\\ndegree from Southern Methodist University.\\nAjay K. Puri joined NVIDIA in 2005 as Senior Vice President, Worldwide Sales and became Executive Vice President, Worldwide Field Operations in 2009. Prior\\nto NVIDIA, he held positions in sales, marketing, and general management over a 22-year career at Sun Microsystems, Inc., a computing systems company. Mr.\\nPuri previously held marketing, management consulting, and product development positions at Hewlett-Packard, an information technology company, Booz Allen\\nHamilton Inc., a management and technology consulting company, and Texas Instruments Incorporated. Mr. Puri holds a B.S.E.E. degree from the University of', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 12, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'})]\n" ] } ], "source": [ "question = \"Who is the E-VP, Operations - and how old are they?\"\n", "\n", "result = retrieval_augmented_qa_chain.invoke({\"question\" : question})\n", "\n", "print(result[\"response\"].content)\n", "print(result[\"context\"])" ] }, { "cell_type": "code", "execution_count": 103, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 18:24:16 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:24:17 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "$3,539 million\n", "[Document(page_content='The following table 
outlines the estimated future amortization expense related to the net carrying amount of intangible assets as of January 28, 2024:\\nFuture Amortization Expense\\n \\n(In millions)\\nFiscal Year:\\n \\n2025\\n$\\n555 \\n2026\\n261 \\n2027\\n150 \\n2028\\n37 \\n2029\\n9 \\n2030 and thereafter\\n100 \\nTotal\\n$\\n1,112 \\n64', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 63, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}), Document(page_content='Depreciation expense for fiscal years 2024, 2023, and 2022 was $894 million, $844 million, and $611 million, respectively.\\nAccumulated amortization of leasehold improvements and finance leases was $400 million and $327 million as of January 28, 2024 and January 29, 2023,\\nrespectively.\\nProperty, equipment and intangible assets acquired by assuming related liabilities during fiscal years 2024, 2023, and 2022 were $170 million, $374 million, and\\n$258 million, respectively.\\n \\nJan 28, 2024\\nJan 29, 2023\\nOther assets:\\n(In millions)\\nPrepaid supply and capacity agreements (1)\\n$\\n2,458 \\n$\\n2,989 \\nInvestments in non-affiliated entities\\n1,546 \\n299 \\nPrepaid royalties\\n364 \\n387 \\nOther\\n132 \\n145 \\nTotal other assets\\n$\\n4,500 \\n$\\n3,820 \\n(1)\\nAs of January 28, 2024 and January 29, 2023, there was an additional $2.5 billion and $458 million of short-term prepaid supply and capacity agreements included in Prepaid expenses and other\\ncurrent assets, respectively.\\n69', metadata={'source': 
'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 68, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}), Document(page_content='Table of Contents\\nNVIDIA Corporation and Subsidiaries\\nNotes to the Consolidated Financial Statements\\n(Continued)\\nNote 7 - Amortizable Intangible Assets\\nThe components of our amortizable intangible assets are as follows:\\n \\nJan 28, 2024\\nJan 29, 2023\\n \\nGross\\nCarrying\\nAmount\\nAccumulated\\nAmortization\\nNet \\nCarrying\\nAmount\\nGross\\nCarrying\\nAmount\\nAccumulated\\nAmortization\\nNet \\nCarrying\\nAmount\\n \\n(In millions)\\nAcquisition-related intangible\\nassets (1)\\n$\\n2,642 \\n$\\n(1,720)\\n$\\n922 \\n$\\n3,093 \\n$\\n(1,614)\\n$\\n1,479 \\nPatents and licensed technology\\n449 \\n(259)\\n190 \\n446 \\n(249)\\n197 \\nTotal intangible assets\\n$\\n3,091 \\n$\\n(1,979)\\n$\\n1,112 \\n$\\n3,539 \\n$\\n(1,863)\\n$\\n1,676 \\n(1) During the first quarter of fiscal year 2023, we commenced amortization of a $630 million in-process research and development intangible asset related to our acquisition of\\nMellanox.\\nAmortization expense associated with intangible assets for fiscal years 2024, 2023, and 2022 was $614 million, $699 million, and $563 million, respectively.', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 63, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for 
the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'}), Document(page_content='The following table summarizes the cumulative gross unrealized gains and cumulative gross unrealized losses and impairments related to non-marketable equity\\nsecurities \\naccounted \\nfor \\nunder \\nthe \\nmeasurement \\nalternative:\\nJan 28, 2024\\n(In millions)\\nCumulative gross unrealized gains\\n$\\n270 \\nCumulative gross unrealized losses and impairments\\n(45)\\nNote 10 - Balance Sheet Components\\nTwo customers accounted for 24% and 11% of our accounts receivable balance as of January 28, 2024. Two customers accounted for 14% and 11% of our\\naccounts receivable balance as of January 29, 2023.\\nCertain balance sheet components are as follows:\\n \\nJan 28, 2024\\nJan 29, 2023\\n(In millions)\\nInventories (1):\\nRaw materials\\n$\\n1,719 \\n$\\n2,430 \\nWork in-process\\n1,505 \\n466 \\nFinished goods\\n2,058 \\n2,263 \\nTotal inventories\\n$\\n5,282 \\n$\\n5,159 \\n(1) In fiscal years 2024 and 2023, we recorded an inventory provision of $774 million and $1.0 billion, respectively, in cost of revenue.\\n68', metadata={'source': 'data/NVIDIA_report.pdf', 'file_path': 'data/NVIDIA_report.pdf', 'page': 67, 'total_pages': 96, 'format': 'PDF 1.4', 'title': '0001045810-24-000029', 'author': 'EDGAR® Online LLC, a subsidiary of OTC Markets Group', 'subject': 'Form 10-K filed on 2024-02-21 for the period ending 2024-01-28', 'keywords': '0001045810-24-000029; ; 10-K', 'creator': 'EDGAR Filing HTML Converter', 'producer': 'EDGRpdf Service w/ EO.Pdf 22.0.40.0', 'creationDate': \"D:20240221173732-05'00'\", 'modDate': \"D:20240221173744-05'00'\", 'trapped': '', 'encryption': 'Standard V2 R3 128-bit RC4'})]\n" ] } ], "source": [ 
"question = \"What is the gross carrying amount of Total Amortizable Intangible Assets for Jan 29, 2023?\"\n", "\n", "result = retrieval_augmented_qa_chain.invoke({\"question\" : question})\n", "\n", "print(result[\"response\"].content)\n", "print(result[\"context\"])" ] }, { "cell_type": "markdown", "metadata": { "id": "EOECHyzHRqDw" }, "source": [ "## Synthetic Dataset Generation for Evaluation using Ragas\n", "\n", "Ragas is a powerful library that lets us evaluate our RAG pipeline by collecting input/output/context triplets and obtaining metrics relating to a number of different aspects of our RAG pipeline.\n", "\n", "We'll be evluating on every core metric today, but in order to do that - we'll need to creat a test set. Luckily for us, Ragas can do that directly!" ] }, { "cell_type": "markdown", "metadata": { "id": "KqXQ0jweWJOu" }, "source": [ "### Synthetic Test Set Generation\n", "\n", "We can leverage Ragas' [`Synthetic Test Data generation`](https://docs.ragas.io/en/stable/concepts/testset_generation.html) functionality to generate our own synthetic QC pairs - as well as a synthetic ground truth - quite easily!\n", "\n", "> NOTE: This process will use `gpt-3.5-turbo-16k` as the base generator and `gpt-4` as the critic - if you're attempting to create a lot of samples please be aware of cost, as well as rate limits." 
] }, { "cell_type": "code", "execution_count": 104, "metadata": { "id": "nVk5SlU9znXe" }, "outputs": [], "source": [ "loader = PyMuPDFLoader(\n", " \"NVIDIA_report.pdf\",\n", ")\n", "\n", "eval_documents = loader.load()\n", "\n", "text_splitter = RecursiveCharacterTextSplitter(\n", " chunk_size = 1500,\n", " chunk_overlap = 400\n", ")\n", "\n", "eval_documents = text_splitter.split_documents(eval_documents)" ] }, { "cell_type": "markdown", "metadata": { "id": "K7rOQkxhzrq3" }, "source": [ "We split our documents with different parameters when creating our synthetic data because we want to test whether the pipeline handles unseen data and diverse scenarios effectively, not just the exact chunking configuration it was built around. A different splitting strategy can surface strengths or weaknesses that were not apparent under the original settings, giving a better picture of the system's performance and areas for improvement." ] }, { "cell_type": "code", "execution_count": 105, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "hiAPYw-hz-zo", "outputId": "0942c5d1-d151-44af-ad1f-c1373ef3e634" }, "outputs": [ { "data": { "text/plain": [ "340" ] }, "execution_count": 105, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(eval_documents)" ] }, { "cell_type": "code", "execution_count": 132, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 66, "referenced_widgets": [ "8b4c1aafe67048798cdadd46207b4b84", "83e3f8bf55454600b299fe63b608852a", "4818628434aa4a0e8d7826f152c0da99", "c3e047dfd4ec4a859e0274a54ace1432", "1b3b9e3adf85473a81055265d9a5b89f", "2a4b2b14a02b46c1ac67fc1581133523", "decd5f4c69a845cc8fad4c21524c2fd9", "e44a47a5a2184c2780ac27e16ace0f7f", "317a7d84efc74420abea8e311137f272", "611021c94b8a42c58897925acb8b3c5e", "ba72a1f57074488da5e34f8d02e748f8", "6de6fb8ef2974573b50bad678620f2d1", "09ce5c2f37fb469683ed7cf3bd7566f6", "58d7c8b4640249df89b60e9eef4d2328", "394fb069eb3c4269bb3c970cc04369a9",
"2235baa0358a4b8cad60508d5d1d8380", "570a1f9809e143ef8d858e1c5dc9837d", "42c4905b54d8482588d57485c563ee78", "26ee70b94d75449cbe7b7ce40ebc1049", "bc3b3593ad1e4c5bad6057a3d3872bb4", "c5651973a0534d3da51d0a18b13deff3", "8745b8f8f8ec46869c66758c4bc6b2e0" ] }, "id": "IXc6sMglSej_", "outputId": "b97b381f-ecd0-441d-924d-09e6d2187954" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d8fe67e1455f46edb4982ccbc001a7ee", "version_major": 2, "version_minor": 0 }, "text/plain": [ "embedding nodes: 0%| | 0/680 [00:00 NOTE: Ragas documentation on this generation process [here](https://docs.ragas.io/en/stable/concepts/testset_generation.html)." ] }, { "cell_type": "markdown", "metadata": { "id": "MemL406rUzBu" }, "source": [ "Let's look at the output:" ] }, { "cell_type": "code", "execution_count": 133, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "RaCDdImVU15s", "outputId": "a8e95364-f3e4-4f20-da3d-05f6f971afdc" }, "outputs": [ { "data": { "text/plain": [ "DataRow(question='What is the revenue contribution of the Compute & Networking segment for fiscal year 2024?', contexts=['represented approximately 19% of total revenue for fiscal year 2024, attributable to the Compute & Networking segment.\\nOur estimated Compute & Networking demand is expected to remain concentrated.\\nThere were no customers with 10% or more of total revenue for fiscal years 2023 and 2022.\\nGross Profit and Gross Margin\\nGross profit consists of total revenue, net of allowances, less cost of revenue. Cost of revenue consists primarily of the cost of semiconductors, including wafer\\nfabrication, assembly, testing and packaging, board and device costs, manufacturing support costs, including labor and overhead associated with such\\npurchases, final test yield fallout, inventory and warranty provisions, memory and component costs, tariffs, and shipping costs. 
Cost of revenue also includes\\nacquisition-related costs, development costs for license and service arrangements, IP-related costs, and stock-based compensation related to personnel\\nassociated with manufacturing operations.\\nOur overall gross margin increased to 72.7% in fiscal year 2024 from 56.9% in fiscal year 2023. The year over year increase was primarily due to strong Data\\nCenter revenue growth of 217% and lower net inventory provisions as a percentage of revenue.\\nProvisions for inventory and excess inventory purchase obligations totaled $2.2 billion for both fiscal years 2024 and 2023. Sales of previously reserved'], ground_truth='The revenue contribution of the Compute & Networking segment for fiscal year 2024 is approximately 19%.', evolution_type='simple')" ] }, "execution_count": 133, "metadata": {}, "output_type": "execute_result" } ], "source": [ "testset.test_data[0]" ] }, { "cell_type": "markdown", "metadata": { "id": "vrPsVwUAWFWB" }, "source": [ "### Generating Responses with RAG Pipeline\n", "\n", "Now that we have some QA pairs and ground truths, let's evaluate our RAG pipeline using Ragas. We'll start by extracting our questions and ground truths from our generated test set - first, we'll convert the test set into a Pandas DataFrame." ] }, { "cell_type": "code", "execution_count": 134, "metadata": { "id": "frvzu1YxX8kY" }, "outputs": [], "source": [ "test_df = testset.to_pandas()" ] }, { "cell_type": "code", "execution_count": 135, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 363 }, "id": "GFKMIY8IZU8m", "outputId": "92554c28-97b0-44b5-c356-05367d764ad8" }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " question \\\n", "0 What is the revenue contribution of the Comput... \n", "1 What is the purpose of entering into foreign c... \n", "2 What are the potential impacts on our income t... \n", "3 What is NVIDIA Corporation's income tax policy... \n", "4 What could happen to our business if we don't ... \n", "5 What expenses for impacted employees are inclu... \n", "6 \"How can NVIDIA AI and Omniverse help build Ea... \n", "7 What role does AI play in modern technology wi... \n", "8 What are the potential consequences of quality... \n", "9 What was the percentage change in center reven... \n", "\n", " contexts \\\n", "0 [represented approximately 19% of total revenu... \n", "1 [Table of Contents\\nNVIDIA Corporation and Sub... \n", "2 [is reduced, our provision for income taxes, r... \n", "3 [Table of Contents\\nNVIDIA Corporation and Sub... \n", "4 [covered by insurance may be large, which coul... \n", "5 [– Risks Related to Regulatory, Legal, Our Sto... \n", "6 [the top supercomputer, on the Green500 list.\\... \n", "7 [Table of Contents\\nPart I\\nItem 1. Business\\n... \n", "8 [Table of Contents\\ntransitions, and we may be... \n", "9 [Center revenue growth of 217% and lower net i... \n", "\n", " ground_truth evolution_type \\\n", "0 The revenue contribution of the Compute & Netw... simple \n", "1 The purpose of entering into foreign currency ... simple \n", "2 The potential impacts on our income taxes and ... reasoning \n", "3 nan reasoning \n", "4 Our business could face legal action or reputa... multi_context \n", "5 Expenses for financial support to impacted emp... multi_context \n", "6 NVIDIA AI and NVIDIA Omniverse platforms can h... multi_context \n", "7 AI plays a significant role in modern technolo... multi_context \n", "8 The potential consequences of quality or produ... 
multi_context \n", "9 nan multi_context \n", "\n", " episode_done \n", "0 True \n", "1 True \n", "2 True \n", "3 True \n", "4 True \n", "5 True \n", "6 True \n", "7 True \n", "8 True \n", "9 True " ] }, "execution_count": 135, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_df" ] }, { "cell_type": "code", "execution_count": 136, "metadata": { "id": "xAiXbVmLYSoC" }, "outputs": [], "source": [ "test_questions = test_df[\"question\"].values.tolist()\n", "test_groundtruths = test_df[\"ground_truth\"].values.tolist()" ] }, { "cell_type": "markdown", "metadata": { "id": "aE5rfMLfbqKH" }, "source": [ "Now we'll generate responses with our RAG pipeline using the questions we've generated - we'll also need to collect the retrieved contexts for each question.\n", "\n", "We'll do this in a simple loop to see exactly what's happening!" ] }, { "cell_type": "code", "execution_count": 137, "metadata": { "id": "9_AayvT1dAQN" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 18:39:00 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:01 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:01 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:02 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:04 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:06 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:06 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:07 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:07 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:08 - HTTP
Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:08 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:09 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:09 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:10 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:11 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:12 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:12 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:13 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:14 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 18:39:15 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" ] } ], "source": [ "answers = []\n", "contexts = []\n", "\n", "for question in test_questions:\n", " response = retrieval_augmented_qa_chain.invoke({\"question\" : question})\n", " answers.append(response[\"response\"].content)\n", " contexts.append([context.page_content for context in response[\"context\"]])" ] }, { "cell_type": "markdown", "metadata": { "id": "opHaHmYDeBfC" }, "source": [ "Now we can wrap our information in a Hugging Face dataset for use in the Ragas library." 
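Before wrapping, a quick sanity check is worthwhile: `Dataset.from_dict` expects all four columns to be the same length, and a partially failed collection loop will leave them ragged. A stdlib-only sketch (the short lists here are stand-ins for the real ones):

```python
# Toy stand-ins for the lists built in the collection loop above.
test_questions = ["q1", "q2"]
answers = ["a1", "a2"]
contexts = [["context for q1"], ["context for q2"]]
test_groundtruths = ["gt1", "gt2"]

columns = {
    "question": test_questions,
    "answer": answers,
    "contexts": contexts,
    "ground_truth": test_groundtruths,
}

# Every column must have exactly one entry per question.
lengths = {name: len(values) for name, values in columns.items()}
assert len(set(lengths.values())) == 1, f"ragged columns: {lengths}"
print(lengths)
```

If one `invoke` call raises mid-loop, re-running the loop from scratch is the simplest way to get the columns back in sync.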
] }, { "cell_type": "code", "execution_count": 138, "metadata": { "id": "fY48YZITeHy-" }, "outputs": [], "source": [ "from datasets import Dataset\n", "\n", "response_dataset = Dataset.from_dict({\n", " \"question\" : test_questions,\n", " \"answer\" : answers,\n", " \"contexts\" : contexts,\n", " \"ground_truth\" : test_groundtruths\n", "})" ] }, { "cell_type": "markdown", "metadata": { "id": "mmeVvQaZeogE" }, "source": [ "Let's take a peek and see what that looks like!" ] }, { "cell_type": "code", "execution_count": 139, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "pOpydvc8eqNM", "outputId": "9e14b904-7d52-4dec-f65e-e6d13c3e9e79" }, "outputs": [ { "data": { "text/plain": [ "{'question': 'What is the revenue contribution of the Compute & Networking segment for fiscal year 2024?',\n", " 'answer': 'The revenue contribution of the Compute & Networking segment for fiscal year 2024 is $32,016 million.',\n", " 'contexts': ['United States\\n$\\n26,966 \\n$\\n8,292 \\n$\\n4,349 \\nTaiwan\\n13,405 \\n6,986 \\n8,544 \\nChina (including Hong Kong)\\n10,306 \\n5,785 \\n7,111 \\nOther countries\\n10,245 \\n5,911 \\n6,910 \\nTotal revenue\\n$\\n60,922 \\n$\\n26,974 \\n$\\n26,914 \\nRevenue from sales to customers outside of the United States accounted for 56%, 69%, and 84% of total revenue for fiscal years 2024, 2023, and 2022,\\nrespectively. The increase in revenue to the United States for fiscal year 2024 was primarily due to higher U.S.-based Compute & Networking segment demand.\\nSales to one customer represented 13% of total revenue for fiscal year 2024, which was attributable to the Compute & Networking segment. 
No customer\\nrepresented 10% or more of total revenue for fiscal years 2023 and 2022.\\nThe following table summarizes information pertaining to our revenue by each of the specialized markets we serve:\\n \\nYear Ended\\n \\nJan 28, 2024\\nJan 29, 2023\\nJan 30, 2022\\nRevenue:\\n(In millions)\\nData Center\\n$\\n47,525 \\n$\\n15,005 \\n$\\n10,613 \\nGaming\\n10,447 \\n9,067 \\n12,462',\n", " 'One indirect customer which primarily purchases our products through system integrators and distributors, including through Customer A, is estimated to have\\nrepresented approximately 19% of total revenue for fiscal year 2024, attributable to the Compute & Networking segment.\\nOur estimated Compute & Networking demand is expected to remain concentrated.\\nThere were no customers with 10% or more of total revenue for fiscal years 2023 and 2022.\\nGross Profit and Gross Margin\\nGross profit consists of total revenue, net of allowances, less cost of revenue. Cost of revenue consists primarily of the cost of semiconductors, including wafer\\nfabrication, assembly, testing and packaging, board and device costs, manufacturing support costs, including labor and overhead associated with such\\npurchases, final test yield fallout, inventory and warranty provisions, memory and component costs, tariffs, and shipping costs. Cost of revenue also includes',\n", " 'Table of Contents\\nAll Other operating loss - The year-on-year decrease was due to the $1.4 billion Arm acquisition termination cost in fiscal year 2023, partially offset by a $839\\nmillion increase in stock-based compensation expense in fiscal year 2024.\\nConcentration of Revenue\\nRevenue by geographic region is designated based on the billing location even if the revenue may be attributable to end customers, such as enterprises and\\ngamers in a different location. 
Revenue from sales to customers outside of the United States accounted for 56% and 69% of total revenue for fiscal years 2024\\nand 2023, respectively.\\nOur direct and indirect customers include public cloud, consumer internet companies, enterprises, startups, public sector entities, OEMs, ODMs, system\\nintegrators, AIB, and distributors.\\nSales to one customer, Customer A, represented 13% of total revenue for fiscal year 2024, which was attributable to the Compute & Networking segment.',\n", " '$\\n32,337 \\n215 %\\nGraphics\\n13,517 \\n11,906 \\n1,611 \\n14 %\\nTotal\\n$\\n60,922 \\n$\\n26,974 \\n$\\n33,948 \\n126 %\\nOperating Income by Reportable Segments\\nYear Ended\\nJan 28, 2024\\nJan 29, 2023\\n$\\nChange\\n%\\nChange\\n($ in millions)\\nCompute & Networking\\n$\\n32,016 \\n$\\n5,083 \\n$\\n26,933 \\n530 %\\nGraphics\\n5,846 \\n4,552 \\n1,294 \\n28 %\\nAll Other\\n(4,890)\\n(5,411)\\n521 \\n(10)%\\nTotal\\n$\\n32,972 \\n$\\n4,224 \\n$\\n28,748 \\n681 %\\nCompute & Networking revenue – The year-on-year increase was due to higher Data Center revenue. Compute grew 266% due to higher shipments of the\\nNVIDIA Hopper GPU computing platform for the training and inference of LLMs, recommendation engines and generative AI applications. 
Networking was up\\n133% due to higher shipments of InfiniBand.\\nGraphics revenue – The year-on-year increase was led by growth in Gaming of 15% driven by higher sell-in to partners following the normalization of channel\\ninventory levels.'],\n", " 'ground_truth': 'The revenue contribution of the Compute & Networking segment for fiscal year 2024 is approximately 19%.'}" ] }, "execution_count": 139, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response_dataset[0]" ] }, { "cell_type": "markdown", "metadata": { "id": "xbsFm5FievJI" }, "source": [ "## Task 2: Evaluating our Pipeline with Ragas\n", "\n", "Now that we have our response dataset, we can get into evaluation!\n", "First, we'll import the desired metrics, then we can use them to evaluate our created dataset.\n", "Check out the specific metrics we'll be using in the Ragas documentation:\n", "\n", "- [Faithfulness](https://docs.ragas.io/en/stable/concepts/metrics/faithfulness.html)\n", "- [Answer Relevancy](https://docs.ragas.io/en/stable/concepts/metrics/answer_relevance.html)\n", "- [Context Precision](https://docs.ragas.io/en/stable/concepts/metrics/context_precision.html)\n", "- [Context Recall](https://docs.ragas.io/en/stable/concepts/metrics/context_recall.html)\n", "- [Answer Correctness](https://docs.ragas.io/en/stable/concepts/metrics/answer_correctness.html)\n" ] }, { "cell_type": "code", "execution_count": 140, "metadata": { "id": "R2PXwyt8e5aW" }, "outputs": [], "source": [ "from ragas import evaluate\n", "from ragas.metrics import (\n", " faithfulness,\n", " answer_relevancy,\n", " answer_correctness,\n", " context_recall,\n", " context_precision,\n", ")\n", "\n", "metrics = [\n", " faithfulness,\n", " answer_relevancy,\n", " context_recall,\n", " context_precision,\n", " answer_correctness,\n", "]" ] }, { "cell_type": "markdown", "metadata": { "id": "Kx-vlsx_hrtV" }, "source": [ "All that's left to do is call \"evaluate\" and away we go!" 
] }, { "cell_type": "code", "execution_count": 141, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 49, "referenced_widgets": [ "d46db515c4d543d898ef91d05df2d0da", "af19ef64986b435c8c118e882e26f6a9", "14d8c6593d6b41df8dfb290ab9f55ca1", "8531b3d7f1cd424f8d3fc2e6bcd875a0", "4e5db0ff4ff44577963dbcc651ea10b8", "9caba03e810f4407b78cb1c1b6b9be08", "2bf9a43c99cf4e05a0f376fde6af9ca6", "384a04784f9745088478d9372161c8ae", "c981e401946b4dfca65b18b6ae56bf33", "b319ac9b4f1b43d5a41d6f10e6e1c1c6", "002fc233bee54ea0a9729365f1e0f972" ] }, "id": "DhlcfJ4lgYVI", "outputId": "dffb177c-c7c6-421d-9fde-7988f960949e" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c1f68a57e4c84616a75771f399df6a6c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Evaluating: 0%| | 0/50 [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "" ], "text/plain": [ " question \\\n", "0 What is the revenue contribution of the Comput... \n", "1 What is the purpose of entering into foreign c... \n", "2 What are the potential impacts on our income t... \n", "3 What is NVIDIA Corporation's income tax policy... \n", "4 What could happen to our business if we don't ... \n", "5 What expenses for impacted employees are inclu... \n", "6 \"How can NVIDIA AI and Omniverse help build Ea... \n", "7 What role does AI play in modern technology wi... \n", "8 What are the potential consequences of quality... \n", "9 What was the percentage change in center reven... \n", "\n", " answer \\\n", "0 The revenue contribution of the Compute & Netw... \n", "1 To mitigate the impact of foreign currency mov... \n", "2 The potential impacts on income taxes and fina... \n", "3 I don't know. \n", "4 Our business could face legal action or reputa... \n", "5 Financial support and charitable activity expe... \n", "6 NVIDIA AI and Omniverse can help build Earth-2... \n", "7 AI plays a significant role in modern technolo... \n", "8 The potential consequences of quality or produ... \n", "9 I don't know. \n", "\n", " contexts \\\n", "0 [United States\\n$\\n26,966 \\n$\\n8,292 \\n$\\n4,34... \n", "1 [comprehensive income or loss and reclassified... \n", "2 [adverse tax impacts, which may materially imp... \n", "3 [Table of Contents\\nNVIDIA Corporation and Sub... \n", "4 [greater direct costs, including costs associa... \n", "5 [Macroeconomic Factors\\nMacroeconomic factors,... \n", "6 [television graphics.\\nThe NVIDIA RTX platform... \n", "7 [marking the “Big Bang” moment of AI. We intro... \n", "8 [lead times. Qualification time for new produc... \n", "9 [acquisition-related costs, development costs ... \n", "\n", " ground_truth faithfulness \\\n", "0 The revenue contribution of the Compute & Netw... 1.0 \n", "1 The purpose of entering into foreign currency ... 1.0 \n", "2 The potential impacts on our income taxes and ... 
1.0 \n", "3 nan NaN \n", "4 Our business could face legal action or reputa... 1.0 \n", "5 Expenses for financial support to impacted emp... 1.0 \n", "6 NVIDIA AI and NVIDIA Omniverse platforms can h... 1.0 \n", "7 AI plays a significant role in modern technolo... 1.0 \n", "8 The potential consequences of quality or produ... 1.0 \n", "9 nan NaN \n", "\n", " answer_relevancy context_recall context_precision answer_correctness \n", "0 1.000000 1.0 0.638889 0.742716 \n", "1 0.947885 1.0 1.000000 0.723586 \n", "2 0.958867 1.0 1.000000 0.619020 \n", "3 0.000000 0.0 0.000000 0.198187 \n", "4 0.952945 1.0 1.000000 0.747060 \n", "5 0.909750 1.0 0.583333 0.747494 \n", "6 0.953163 1.0 0.333333 0.746826 \n", "7 1.000000 1.0 1.000000 0.538130 \n", "8 0.965887 1.0 0.916667 0.746463 \n", "9 0.000000 0.0 0.000000 0.198200 " ] }, "execution_count": 143, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results_df = results.to_pandas()\n", "results_df" ] }, { "cell_type": "markdown", "metadata": { "id": "MWfiu_pLh3JL" }, "source": [ "## Making Adjustments to our RAG Pipeline\n", "\n", "Now that we have established a baseline - we can see how any changes impact our pipeline's performance!\n", "\n", "Let's modify our retriever and see how that impacts our Ragas metrics!" ] }, { "cell_type": "code", "execution_count": 165, "metadata": { "id": "nKIuM336isBL" }, "outputs": [], "source": [ "from langchain.retrievers import MultiQueryRetriever\n", "\n", "advanced_retriever = MultiQueryRetriever.from_llm(retriever=retriever, llm=primary_qa_llm)" ] }, { "cell_type": "markdown", "metadata": { "id": "82rcj3L-i_c8" }, "source": [ "We'll also re-create our RAG pipeline using the abstractions that come packaged with LangChain v0.1.0!\n", "\n", "First, let's create a chain to \"stuff\" our documents into our context!" 
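"Stuffing" here just means formatting every retrieved document into one context block that gets pasted into the prompt - no summarizing, no selection. In miniature (a sketch of the idea, not LangChain's implementation; the toy documents echo figures from the report excerpts above):

```python
# Toy documents standing in for retrieved chunks of the 10-K.
retrieved_docs = [
    "Our overall gross margin increased to 72.7% in fiscal year 2024.",
    "Data Center revenue growth of 217% year over year.",
]

def stuff_documents(docs: list[str]) -> str:
    # Concatenate every retrieved chunk, separated by blank lines,
    # into a single context string for the prompt.
    return "\n\n".join(docs)

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{stuff_documents(retrieved_docs)}\n\n"
    "Question: What was NVIDIA's overall gross margin in fiscal year 2024?"
)
print(prompt)
```

Because every chunk lands in the prompt verbatim, the retriever's chunk size and `k` directly bound how many tokens the stuffed context consumes.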
] }, { "cell_type": "code", "execution_count": 173, "metadata": { "id": "EfdCgTw7jC4i" }, "outputs": [], "source": [ "from langchain.chains.combine_documents import create_stuff_documents_chain\n", "from langchain import hub\n", "\n", "retrieval_qa_prompt = hub.pull(\"langchain-ai/retrieval-qa-chat\")\n", "\n", "document_chain = create_stuff_documents_chain(primary_qa_llm, retrieval_qa_prompt)" ] }, { "cell_type": "markdown", "metadata": { "id": "ozYl5WdPnvLu" }, "source": [ "Next, we'll create the retrieval chain!" ] }, { "cell_type": "code", "execution_count": 174, "metadata": { "id": "9AK7wHVnn0U3" }, "outputs": [], "source": [ "from langchain.chains import create_retrieval_chain\n", "\n", "retrieval_chain = create_retrieval_chain(advanced_retriever, document_chain)" ] }, { "cell_type": "code", "execution_count": 175, "metadata": { "id": "cmKORMfMoCjL" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 20:04:07 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:07 - Generated queries: ['1. Can you summarize the content of the document?', '2. What information does the document contain?', \"3. 
Could you give me an overview of the document's main topics?\"]\n", "2024-03-13 20:04:07 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:07 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:08 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:09 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" ] } ], "source": [ "response = retrieval_chain.invoke({\"input\": \"What is the provided document about?\"})" ] }, { "cell_type": "code", "execution_count": 176, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ICMsUWbWoOpf", "outputId": "04fb324e-682f-48cc-b369-a78d6396af88" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The provided document is an Annual Report on Form 10-K for NVIDIA Corporation. It includes various sections such as Business, Risk Factors, Cybersecurity, Market for Registrant’s Common Equity, Financial Statements, Directors and Executive Officers, Executive Compensation, and other relevant information required by the Securities and Exchange Commission (SEC).\n" ] } ], "source": [ "print(response[\"answer\"])" ] }, { "cell_type": "code", "execution_count": 177, "metadata": { "id": "5s8ZGasYoVi6" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 20:04:12 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:12 - Generated queries: ['1. How much is the total carrying value of Amortizable Intangible Assets as of January 29, 2023?', '2. What is the total gross amount of Amortizable Intangible Assets that can be amortized as of January 29, 2023?', '3. 
Can you provide the total value of Amortizable Intangible Assets that are eligible for amortization on January 29, 2023?']\n", "2024-03-13 20:04:12 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:12 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:13 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:14 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" ] } ], "source": [ "response = retrieval_chain.invoke({\"input\": \"What is the gross carrying amount of Total Amortizable Intangible Assets for Jan 29, 2023?\"})" ] }, { "cell_type": "code", "execution_count": 178, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ADNCdW4hoYT8", "outputId": "92d13a09-9e69-48af-fac2-1919123a980c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The gross carrying amount of Total Amortizable Intangible Assets for Jan 29, 2023, was $3,539 million.\n" ] } ], "source": [ "print(response[\"answer\"])" ] }, { "cell_type": "markdown", "metadata": { "id": "OxkU0HdpoaiE" }, "source": [ "Well, just from those responses this chain *feels* better - but let's see how it performs on our eval!\n", "\n", "Let's do the same process we did before to collect our pipeline's contexts and answers." ] }, { "cell_type": "code", "execution_count": 179, "metadata": { "id": "kO8cWxn2oinT" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-03-13 20:04:16 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:16 - Generated queries: ['1. How much revenue does the Compute & Networking segment contribute in fiscal year 2024?', '2. What is the financial impact of the Compute & Networking segment on the revenue for fiscal year 2024?', '3.
Can you provide information on the revenue generated by the Compute & Networking segment in fiscal year 2024?']\n", "2024-03-13 20:04:16 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:16 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:17 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:18 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:19 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:19 - Generated queries: ['1. Why do companies use foreign currency forward contracts as derivative financial instruments?', '2. How are foreign currency forward contracts utilized as derivative financial instruments?', '3. What benefits do foreign currency forward contracts offer as derivative financial instruments in financial markets?']\n", "2024-03-13 20:04:19 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:22 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:23 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:24 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:26 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:26 - Generated queries: ['How might changes in global tax laws and regulations impact our income taxes and financial situation?', 'What effects could alterations in tax laws both globally and in foreign jurisdictions have on our income taxes and financial well-being?', 'In what ways could modifications to tax laws around the world and in foreign countries affect our income taxes and financial status?']\n", "2024-03-13 20:04:26 - HTTP Request: 
POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:27 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:27 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:29 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:30 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:30 - Generated queries: [\"1. Can you provide information on NVIDIA Corporation's income tax policy based on the context given?\", \"2. What are the details of NVIDIA Corporation's income tax policy according to the provided context?\", \"3. How does the provided context outline NVIDIA Corporation's income tax policy?\"]\n", "2024-03-13 20:04:30 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:31 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:32 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:34 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:36 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:36 - Generated queries: ['1. How might our business be impacted if we fail to tackle climate change and sustain damage to our reputation?', '2. In what ways could our business suffer if we neglect to address climate change and maintain a damaged reputation?', '3. 
What are the potential consequences for our business if we do not take action on climate change and allow our reputation to be harmed?']\n", "2024-03-13 20:04:36 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:36 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:36 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:39 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:41 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:41 - Generated queries: ['1. Which operating expenses for FY2024 cover costs related to supporting impacted employees amidst macroeconomic factors and the Israel-Hamas conflict?', '2. How are the operating expenses for FY2024 allocated to address the needs of impacted employees in light of macroeconomic factors and the Israel-Hamas conflict?', '3. What specific financial support is provided to impacted employees in the operating expenses for FY2024, taking into account macroeconomic factors and the Israel-Hamas conflict?']\n", "2024-03-13 20:04:41 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:41 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:41 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:44 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:46 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:46 - Generated queries: ['1. \"In what ways can NVIDIA AI and Omniverse contribute to the development of Earth-2 and accurately forecast the impact of climate change in high resolution?\"', '2. 
\"What role do NVIDIA AI and Omniverse play in the construction of Earth-2 and the precise prediction of climate change effects at a detailed level?\"', '3. \"How does the utilization of NVIDIA AI and Omniverse aid in the creation of Earth-2 and the anticipation of climate change consequences with high-resolution accuracy?\"']\n", "2024-03-13 20:04:46 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:46 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:46 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:48 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:49 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:49 - Generated queries: [\"1. How does AI contribute to modern technology through the use of NVIDIA's infrastructure and software?\", \"2. In what ways is AI integrated into modern technology with the help of NVIDIA's infrastructure and software?\", \"3. 
What impact does NVIDIA's infrastructure and software have on the role of AI in modern technology?\"]\n", "2024-03-13 20:04:50 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:50 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:50 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:53 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:55 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:55 - Generated queries: ['How can quality or production issues impact inventory provisions in light of new product launches and supply and demand fluctuations?', 'What implications do quality or production issues have on inventory provisions, especially with the introduction of new products and challenges in supply and demand?', 'What are the possible outcomes of quality or production issues on inventory provisions, taking into account new product introductions and supply and demand challenges?']\n", "2024-03-13 20:04:55 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:55 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:55 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:04:57 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:05:00 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:05:00 - Generated queries: ['1. How much did the center revenue growth percentage change between fiscal years 2024 and 2023, taking into account inventory provisions, excess inventory purchase obligations, and increased operating expenses?', '2. 
What was the percentage difference in center revenue growth for fiscal years 2024 and 2023, factoring in inventory provisions, excess inventory purchase obligations, and increased operating expenses?', '3. Can you provide the percentage change in center revenue growth for fiscal years 2024 and 2023, with consideration given to inventory provisions, excess inventory purchase obligations, and increased operating expenses?']\n", "2024-03-13 20:05:00 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:05:00 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:05:01 - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2024-03-13 20:05:04 - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" ] } ], "source": [ "answers = []\n", "contexts = []\n", "\n", "for question in test_questions:\n", " response = retrieval_chain.invoke({\"input\" : question})\n", " answers.append(response[\"answer\"])\n", " contexts.append([context.page_content for context in response[\"context\"]])" ] }, { "cell_type": "markdown", "metadata": { "id": "tgagfhPUtM2j" }, "source": [ "Now we can convert this into a dataset, just like we did before." 
] }, { "cell_type": "code", "execution_count": 180, "metadata": { "id": "5FcllGeSovP8" }, "outputs": [], "source": [ "response_dataset_advanced_retrieval = Dataset.from_dict({\n", " \"question\" : test_questions,\n", " \"answer\" : answers,\n", " \"contexts\" : contexts,\n", " \"ground_truth\" : test_groundtruths\n", "})" ] }, { "cell_type": "markdown", "metadata": { "id": "dELYabwktR2C" }, "source": [ "Let's evaluate on the same metrics we did for the first pipeline and see how it does:" ] }, { "cell_type": "code", "execution_count": 181, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 49, "referenced_widgets": [ "50599aa481d8460aa6655330b2b0fae3", "cfc93618fc084608bb413667fee91ea8", "3e85b387328f4df7b45dccbe6572b9bd", "cc50e0150a9947579a919757b85f38c9", "c03d5f58d31747d3a344f813755480fc", "17fde9c2236b4b1b9990dd2a9fbd58ff", "ddbe87e735534504b735211253c4b4d2", "2189fea4b75749d7bac330e613c31974", "d7148bed10a245509672dc60be6edd47", "3367eaf060c845648cda48963481ecb4", "4fc8cf791b1344809fe8c7ee9598a20a" ] }, "id": "d7uHseWJo2TU", "outputId": "a0cf86d6-5b8e-4829-b660-b2c247202811" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9b16ac38f6a24d7594504cbbfa3bb294", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Evaluating: 0%| | 0/50 [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
questionanswercontextsground_truthfaithfulnessanswer_relevancycontext_recallcontext_precisionanswer_correctness
0What is the revenue contribution of the Comput...The revenue contribution of the Compute & Netw...[United States\\n$\\n26,966 \\n$\\n8,292 \\n$\\n4,34...The revenue contribution of the Compute & Netw...1.0000.9955411.00.6388890.741987
1What is the purpose of entering into foreign c...The purpose of entering into foreign currency ...[comprehensive income or loss and reclassified...The purpose of entering into foreign currency ...NaN0.9845031.01.0000000.748117
2What are the potential impacts on our income t...Changes in tax laws globally and in foreign ju...[adverse tax impacts, which may materially imp...The potential impacts on our income taxes and ...1.0000.9300951.01.0000000.741731
3What is NVIDIA Corporation's income tax policy...Based on the provided context, NVIDIA Corporat...[Table of Contents\\nNVIDIA Corporation and Sub...nan1.0000.9562390.00.0000000.181602
4What could happen to our business if we don't ...If the business does not address climate chang...[greater direct costs, including costs associa...Our business could face legal action or reputa...1.0000.9339211.01.0000000.438464
5What expenses for impacted employees are inclu...The operating expenses for fiscal year 2024 in...[employees in the region who primarily support...Expenses for financial support to impacted emp...1.0000.8973511.01.0000000.530663
6\"How can NVIDIA AI and Omniverse help build Ea...NVIDIA AI and Omniverse can help build Earth-2...[television graphics.\\nThe NVIDIA RTX platform...NVIDIA AI and NVIDIA Omniverse platforms can h...1.0000.9406760.50.5000000.888234
7What role does AI play in modern technology wi...AI plays a significant role in modern technolo...[underlying technology by using a variety of s...AI plays a significant role in modern technolo...0.8751.0000001.01.0000000.449780
8What are the potential consequences of quality...Quality or production issues could potentially...[lead times. Qualification time for new produc...The potential consequences of quality or produ...1.0000.9247971.00.8041670.903615
9What was the percentage change in center reven...The Data Center revenue growth for fiscal year...[acquisition-related costs, development costs ...nanNaN0.0000000.00.2500000.179007
\n", "" ], "text/plain": [ " question \\\n", "0 What is the revenue contribution of the Comput... \n", "1 What is the purpose of entering into foreign c... \n", "2 What are the potential impacts on our income t... \n", "3 What is NVIDIA Corporation's income tax policy... \n", "4 What could happen to our business if we don't ... \n", "5 What expenses for impacted employees are inclu... \n", "6 \"How can NVIDIA AI and Omniverse help build Ea... \n", "7 What role does AI play in modern technology wi... \n", "8 What are the potential consequences of quality... \n", "9 What was the percentage change in center reven... \n", "\n", " answer \\\n", "0 The revenue contribution of the Compute & Netw... \n", "1 The purpose of entering into foreign currency ... \n", "2 Changes in tax laws globally and in foreign ju... \n", "3 Based on the provided context, NVIDIA Corporat... \n", "4 If the business does not address climate chang... \n", "5 The operating expenses for fiscal year 2024 in... \n", "6 NVIDIA AI and Omniverse can help build Earth-2... \n", "7 AI plays a significant role in modern technolo... \n", "8 Quality or production issues could potentially... \n", "9 The Data Center revenue growth for fiscal year... \n", "\n", " contexts \\\n", "0 [United States\\n$\\n26,966 \\n$\\n8,292 \\n$\\n4,34... \n", "1 [comprehensive income or loss and reclassified... \n", "2 [adverse tax impacts, which may materially imp... \n", "3 [Table of Contents\\nNVIDIA Corporation and Sub... \n", "4 [greater direct costs, including costs associa... \n", "5 [employees in the region who primarily support... \n", "6 [television graphics.\\nThe NVIDIA RTX platform... \n", "7 [underlying technology by using a variety of s... \n", "8 [lead times. Qualification time for new produc... \n", "9 [acquisition-related costs, development costs ... \n", "\n", " ground_truth faithfulness \\\n", "0 The revenue contribution of the Compute & Netw... 1.000 \n", "1 The purpose of entering into foreign currency ... 
NaN \n", "2 The potential impacts on our income taxes and ... 1.000 \n", "3 nan 1.000 \n", "4 Our business could face legal action or reputa... 1.000 \n", "5 Expenses for financial support to impacted emp... 1.000 \n", "6 NVIDIA AI and NVIDIA Omniverse platforms can h... 1.000 \n", "7 AI plays a significant role in modern technolo... 0.875 \n", "8 The potential consequences of quality or produ... 1.000 \n", "9 nan NaN \n", "\n", " answer_relevancy context_recall context_precision answer_correctness \n", "0 0.995541 1.0 0.638889 0.741987 \n", "1 0.984503 1.0 1.000000 0.748117 \n", "2 0.930095 1.0 1.000000 0.741731 \n", "3 0.956239 0.0 0.000000 0.181602 \n", "4 0.933921 1.0 1.000000 0.438464 \n", "5 0.897351 1.0 1.000000 0.530663 \n", "6 0.940676 0.5 0.500000 0.888234 \n", "7 1.000000 1.0 1.000000 0.449780 \n", "8 0.924797 1.0 0.804167 0.903615 \n", "9 0.000000 0.0 0.250000 0.179007 " ] }, "execution_count": 182, "metadata": {}, "output_type": "execute_result" } ], "source": [ "advanced_retrieval_results_df = advanced_retrieval_results.to_pandas()\n", "advanced_retrieval_results_df" ] }, { "cell_type": "markdown", "metadata": { "id": "J0hzqq5VtZ2a" }, "source": [ "## Evaluating our Adjusted Pipeline Against Our Baseline\n", "\n", "Now we can compare our results and see what directional changes occurred. Let's refresh with our initial metrics."
] }, { "cell_type": "code", "execution_count": 183, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "_WWGRaF5qx3V", "outputId": "ee4195d5-f3a3-45df-dff9-5139c93f640f" }, "outputs": [ { "data": { "text/plain": [ "{'faithfulness': 1.0000, 'answer_relevancy': 0.7688, 'context_recall': 0.8000, 'context_precision': 0.6472, 'answer_correctness': 0.6008}" ] }, "execution_count": 183, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results" ] }, { "cell_type": "markdown", "metadata": { "id": "oFv_yAeotmFs" }, "source": [ "And see how our advanced retrieval modified our chain:" ] }, { "cell_type": "code", "execution_count": 184, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "rpV11dxJo7xa", "outputId": "9510b961-4481-40fc-b54e-3a2a8348ce8f" }, "outputs": [ { "data": { "text/plain": [ "{'faithfulness': 0.9844, 'answer_relevancy': 0.8563, 'context_recall': 0.7500, 'context_precision': 0.7193, 'answer_correctness': 0.5803}" ] }, "execution_count": 184, "metadata": {}, "output_type": "execute_result" } ], "source": [ "advanced_retrieval_results" ] }, { "cell_type": "code", "execution_count": 185, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "id": "62NYn3iAvTjM", "outputId": "2d6eb84d-131c-457c-f71e-881248a4b2b7" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MetricBaselineMultiQueryRetriever with Document StuffingDelta
0faithfulness1.0000000.984375-0.015625
1answer_relevancy0.7688500.8563120.087463
2context_recall0.8000000.750000-0.050000
3context_precision0.6472220.7193060.072083
4answer_correctness0.6007680.580320-0.020448
\n", "
" ], "text/plain": [ " Metric Baseline MultiQueryRetriever with Document Stuffing \\\n", "0 faithfulness 1.000000 0.984375 \n", "1 answer_relevancy 0.768850 0.856312 \n", "2 context_recall 0.800000 0.750000 \n", "3 context_precision 0.647222 0.719306 \n", "4 answer_correctness 0.600768 0.580320 \n", "\n", " Delta \n", "0 -0.015625 \n", "1 0.087463 \n", "2 -0.050000 \n", "3 0.072083 \n", "4 -0.020448 " ] }, "execution_count": 185, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "df_original = pd.DataFrame(list(results.items()), columns=['Metric', 'Baseline'])\n", "df_comparison = pd.DataFrame(list(advanced_retrieval_results.items()), columns=['Metric', 'MultiQueryRetriever with Document Stuffing'])\n", "\n", "df_merged = pd.merge(df_original, df_comparison, on='Metric')\n", "\n", "df_merged['Delta'] = df_merged['MultiQueryRetriever with Document Stuffing'] - df_merged['Baseline']\n", "\n", "df_merged" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "RAGAS surprisingly rated both RAG systems - with and without advanced retrieval - very similarly. But a very basic question asked in the beginning of this notebook, \"What is the provided document about?\" was only answered by the RAG system that used advanced retrieval. Likely this can be explained by the following: the question I asked was very informal and not very specific. In other words, it was a perfectly valid question but formulated with some degree of randomness. The advanced retriever rewrote the question in many different ways and immediately the RAG system could answer it very well. And indeed, in real world scenarios, RAG systemas are queried with questions that were not thoroughly thought of, but instead were quickly formulated and typed carelessly. 
The advanced retriever would likely make a great difference in such cases, but this would not be picked up by RAGAS, for the simple reason that gpt-3.5-turbo will not create poorly formulated questions unless specifically prompted to do so (and that would require a very good prompt!)" ] }, { "cell_type": "code", "execution_count": 186, "metadata": {}, "outputs": [], "source": [ "user_template = \"\"\"{input}\n", "Think through your response step by step.\n", "\"\"\"\n", "@cl.on_chat_start # marks a function that will be executed at the start of a user session\n", "async def start_chat():\n", " settings = {\n", " \"model\": \"gpt-3.5-turbo\",\n", " \"temperature\": 1.0,\n", " \"max_tokens\": 500,\n", " \"top_p\": 1,\n", " \"frequency_penalty\": 0,\n", " \"presence_penalty\": 0,\n", " }\n", "\n", " cl.user_session.set(\"settings\", settings)\n", "\n", "\n", "@cl.on_message # marks a function that should be run each time the chatbot receives a message from a user\n", "async def main(message: cl.Message):\n", " settings = cl.user_session.get(\"settings\")\n", "\n", " client = AsyncOpenAI(\n", " api_key=os.environ.get(\"OPENAI_API_KEY\"),\n", " )\n", "\n", " print(message.content)\n", "\n", " prompt = Prompt(\n", " provider=ChatOpenAI.id,\n", " messages=[\n", " PromptMessage(\n", " role=\"system\",\n", " template=template,\n", " formatted=template,\n", " ),\n", " PromptMessage(\n", " role=\"user\",\n", " template=user_template,\n", " formatted=user_template.format(input=message.content),\n", " ),\n", " ],\n", " inputs={\"input\": message.content},\n", " settings=settings,\n", " )\n", "\n", " print([m.to_openai() for m in prompt.messages])\n", "\n", " msg = cl.Message(content=\"\")\n", "\n", " # Call OpenAI\n", " async for stream_resp in await client.chat.completions.create(\n", " messages=[m.to_openai() for m in prompt.messages], stream=True, **settings\n", " ):\n", " token = stream_resp.choices[0].delta.content\n", " if not token:\n", " token = \"\"\n", " await 
msg.stream_token(token)\n", "\n", " # Update the prompt object with the completion\n", " prompt.completion = msg.content\n", " msg.prompt = prompt\n", "\n", " # Send and close the message stream\n", " await msg.send()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "colab": { "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "002fc233bee54ea0a9729365f1e0f972": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "09ce5c2f37fb469683ed7cf3bd7566f6": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_570a1f9809e143ef8d858e1c5dc9837d", "placeholder": "​", "style": "IPY_MODEL_42c4905b54d8482588d57485c563ee78", "value": "Generating: 100%" } }, "14d8c6593d6b41df8dfb290ab9f55ca1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", 
"_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_384a04784f9745088478d9372161c8ae", "max": 50, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_c981e401946b4dfca65b18b6ae56bf33", "value": 50 } }, "17fde9c2236b4b1b9990dd2a9fbd58ff": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "1b3b9e3adf85473a81055265d9a5b89f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, 
"align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": "hidden", "width": null } }, "21731645603f4144a74f604cf7c01021": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2189fea4b75749d7bac330e613c31974": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", 
"model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2235baa0358a4b8cad60508d5d1d8380": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, 
"object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "26ee70b94d75449cbe7b7ce40ebc1049": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2a4b2b14a02b46c1ac67fc1581133523": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, 
"grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2ab3fc4aee0b456bb0a06f15de98dafd": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "2bf9a43c99cf4e05a0f376fde6af9ca6": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "317a7d84efc74420abea8e311137f272": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "3367eaf060c845648cda48963481ecb4": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", 
"_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "384a04784f9745088478d9372161c8ae": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": 
null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "394fb069eb3c4269bb3c970cc04369a9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_c5651973a0534d3da51d0a18b13deff3", "placeholder": "​", "style": "IPY_MODEL_8745b8f8f8ec46869c66758c4bc6b2e0", "value": " 10/10 [00:40<00:00,  6.80s/it]" } }, "3e85b387328f4df7b45dccbe6572b9bd": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_2189fea4b75749d7bac330e613c31974", "max": 50, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_d7148bed10a245509672dc60be6edd47", "value": 50 } }, "42c4905b54d8482588d57485c563ee78": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "4818628434aa4a0e8d7826f152c0da99": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", 
"state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_e44a47a5a2184c2780ac27e16ace0f7f", "max": 318, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_317a7d84efc74420abea8e311137f272", "value": 318 } }, "4aef094bb7764f4fb7917a53de5cb40a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_8f117fd4781949148b1533a38e29c9d6", "placeholder": "​", "style": "IPY_MODEL_2ab3fc4aee0b456bb0a06f15de98dafd", "value": "Evaluating: 100%" } }, "4e5db0ff4ff44577963dbcc651ea10b8": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": 
null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "4fc8cf791b1344809fe8c7ee9598a20a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "50599aa481d8460aa6655330b2b0fae3": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_cfc93618fc084608bb413667fee91ea8", "IPY_MODEL_3e85b387328f4df7b45dccbe6572b9bd", "IPY_MODEL_cc50e0150a9947579a919757b85f38c9" ], "layout": "IPY_MODEL_c03d5f58d31747d3a344f813755480fc" } }, "570a1f9809e143ef8d858e1c5dc9837d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": 
null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "58d7c8b4640249df89b60e9eef4d2328": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_26ee70b94d75449cbe7b7ce40ebc1049", "max": 10, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_bc3b3593ad1e4c5bad6057a3d3872bb4", "value": 10 } }, "611021c94b8a42c58897925acb8b3c5e": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, 
"justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "6de6fb8ef2974573b50bad678620f2d1": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_09ce5c2f37fb469683ed7cf3bd7566f6", "IPY_MODEL_58d7c8b4640249df89b60e9eef4d2328", "IPY_MODEL_394fb069eb3c4269bb3c970cc04369a9" ], "layout": "IPY_MODEL_2235baa0358a4b8cad60508d5d1d8380" } }, "83e3f8bf55454600b299fe63b608852a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_2a4b2b14a02b46c1ac67fc1581133523", "placeholder": "​", "style": "IPY_MODEL_decd5f4c69a845cc8fad4c21524c2fd9", "value": "embedding nodes: 100%" } }, "8531b3d7f1cd424f8d3fc2e6bcd875a0": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": 
"HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_b319ac9b4f1b43d5a41d6f10e6e1c1c6", "placeholder": "​", "style": "IPY_MODEL_002fc233bee54ea0a9729365f1e0f972", "value": " 50/50 [00:21<00:00,  1.43it/s]" } }, "86f36527c3df458aae2e54f329c643d7": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_4aef094bb7764f4fb7917a53de5cb40a", "IPY_MODEL_f549d2bd447649c8aac65b0abd25cf23", "IPY_MODEL_f413d1bf4faa44edbe5d081b1d3eaff2" ], "layout": "IPY_MODEL_db2d3fee6c91439faabbb891a9574392" } }, "8745b8f8f8ec46869c66758c4bc6b2e0": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "8b4c1aafe67048798cdadd46207b4b84": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_83e3f8bf55454600b299fe63b608852a", "IPY_MODEL_4818628434aa4a0e8d7826f152c0da99", "IPY_MODEL_c3e047dfd4ec4a859e0274a54ace1432" ], "layout": "IPY_MODEL_1b3b9e3adf85473a81055265d9a5b89f" } }, "8f117fd4781949148b1533a38e29c9d6": { 
"model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9abbc4a5bc11444185ae5ecacbfa102a": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": 
null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9caba03e810f4407b78cb1c1b6b9be08": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "af19ef64986b435c8c118e882e26f6a9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_9caba03e810f4407b78cb1c1b6b9be08", "placeholder": "​", "style": "IPY_MODEL_2bf9a43c99cf4e05a0f376fde6af9ca6", "value": "Evaluating: 
100%" } }, "b319ac9b4f1b43d5a41d6f10e6e1c1c6": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "ba72a1f57074488da5e34f8d02e748f8": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "bc3b3593ad1e4c5bad6057a3d3872bb4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", 
"bar_color": null, "description_width": "" } }, "c03d5f58d31747d3a344f813755480fc": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "c3e047dfd4ec4a859e0274a54ace1432": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_611021c94b8a42c58897925acb8b3c5e", "placeholder": "​", "style": "IPY_MODEL_ba72a1f57074488da5e34f8d02e748f8", "value": " 318/318 [00:30<00:00,  1.69s/it]" } }, "c42583faf1f3472b82394432c7623562": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { 
"_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "c5651973a0534d3da51d0a18b13deff3": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "c981e401946b4dfca65b18b6ae56bf33": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "cc50e0150a9947579a919757b85f38c9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": 
"HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_3367eaf060c845648cda48963481ecb4", "placeholder": "​", "style": "IPY_MODEL_4fc8cf791b1344809fe8c7ee9598a20a", "value": " 50/50 [00:27<00:00,  2.66it/s]" } }, "cfc93618fc084608bb413667fee91ea8": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_17fde9c2236b4b1b9990dd2a9fbd58ff", "placeholder": "​", "style": "IPY_MODEL_ddbe87e735534504b735211253c4b4d2", "value": "Evaluating: 100%" } }, "d46db515c4d543d898ef91d05df2d0da": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_af19ef64986b435c8c118e882e26f6a9", "IPY_MODEL_14d8c6593d6b41df8dfb290ab9f55ca1", "IPY_MODEL_8531b3d7f1cd424f8d3fc2e6bcd875a0" ], "layout": "IPY_MODEL_4e5db0ff4ff44577963dbcc651ea10b8" } }, "d7148bed10a245509672dc60be6edd47": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": 
"1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "db2d3fee6c91439faabbb891a9574392": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "ddbe87e735534504b735211253c4b4d2": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "decd5f4c69a845cc8fad4c21524c2fd9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": 
"@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "e44a47a5a2184c2780ac27e16ace0f7f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f3cf4145eef74579a41f740d62d842c4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "f413d1bf4faa44edbe5d081b1d3eaff2": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { 
"_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_21731645603f4144a74f604cf7c01021", "placeholder": "​", "style": "IPY_MODEL_c42583faf1f3472b82394432c7623562", "value": " 50/50 [00:22<00:00,  1.83it/s]" } }, "f549d2bd447649c8aac65b0abd25cf23": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_9abbc4a5bc11444185ae5ecacbfa102a", "max": 50, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_f3cf4145eef74579a41f740d62d842c4", "value": 50 } } } } }, "nbformat": 4, "nbformat_minor": 0 }