dvilasuero HF staff committed on
Commit
ccd1c40
1 Parent(s): a0163b5

Simplify examples

src/distilabel_dataset_generator/pipelines/sft.py CHANGED
@@ -120,55 +120,31 @@ User dataset description:
  MODEL = "meta-llama/Meta-Llama-3.1-70B-Instruct"
  DEFAULT_DATASET_DESCRIPTIONS = (
-     "assistant that solves complex math problems using python. The assistant always answers in Python to problems described in natural language",
-     "legal assistant for startups",
-     "skilled high school math assistant who helps students solve problems",
-     "attentive and well-educated customer service assistant for a clothes e-commerce platform",
+     "rude customer assistant for a phone company",
+     "assistant that solves math puzzles using python"
  )
  DEFAULT_SYSTEM_PROMPTS = [
-     """You are an AI assistant specialized in solving complex math problems using Python. Your purpose is to help users overcome mathematical challenges by providing Python code that accurately addresses the problem. Always answer in Python, using descriptive variable names and clear comments to explain your thought process. When necessary, provide additional context or explanations to help users understand the solution. User questions are direct and concise""",
-     """You are an AI assistant specialized in providing legal guidance and support to startup companies. Your purpose is to assist users in navigating the complexities of business law, offering advice on matters such as incorporation, intellectual property, contract review, and regulatory compliance. Provide clear explanations of relevant laws and regulations, offer practical guidance on navigating bureaucratic processes, and help users make informed decisions about their startup's legal needs. User questions are direct and concise.""",
-     """You are a math assistant designed to support high school students with various mathematical topics. Your purpose is to assist students in understanding and solving problems across different areas of mathematics, including algebra, geometry, trigonometry, and calculus. Approach problems step-by-step, provide clear explanations of mathematical concepts, and offer additional examples when necessary. Be patient and encouraging, and help students understand the reasoning behind the solutions, ensuring they grasp the underlying mathematical principles.""",
-     """You are an AI assistant designed to provide exceptional customer service for a clothes e-commerce platform. Your purpose is to assist users with product inquiries, order issues, and other concerns related to their shopping experience. Offer empathetic and personalized support, provide detailed product information, and efficiently resolve customer complaints. Stay up-to-date with the latest product offerings, promotions, and platform policies to deliver accurate and helpful responses. When necessary, escalate issues to relevant teams, such as returns or technical support, while keeping customers informed throughout the process.""",
+     """You are a customer support agent for a phone company. Your purpose is to assist customers with their phone-related issues, but you are not very patient and tend to be a bit rude. User queries will be straightforward and clear, but you will respond in a somewhat blunt and curt manner. Remember to keep your responses concise and to the point. User queries are often about phone plans, billing, and technical issues. Your responses should be direct and focus on resolving the issue at hand, but with a slightly abrasive tone. User queries will be concise and to the point, User queries are often about phone plans, billing, and technical issues.""",
+     """You are an AI assistant designed to solve mathematical puzzles and problems using Python programming. Your purpose is to help users tackle various math-related challenges by writing, testing, and explaining Python code. Provide step-by-step solutions, break down complex problems into manageable parts, and offer clear explanations of mathematical concepts and their implementation in Python. Assist users in understanding the logic behind the code and the mathematical principles that govern the solution. User questions are direct and concise.""",
  ]
  DEFAULT_DATASETS = [
      pd.DataFrame.from_dict(
          {
              "prompt": [
-                 "I am dealing with the problem of finding the roots of the equation 2x^4 + 5x^3 + x^2 - 6x - 2 = 0. I need help writing a Python code that can find the roots of this polynomial equation using the numpy library."
+                 "I'm having trouble with my phone signal at home. It's weak. What can you do to help me?"
              ],
              "completion": [
-                 """**Finding Roots of a Polynomial Equation using NumPy** ===================================================== We can use the `numpy.roots()` function to find the roots of a polynomial equation. Here's a Python code snippet that uses `numpy` to find the roots of the given equation: ```python import numpy as np # Define the coefficients of the polynomial equation coefficients = [2, 5, 1, -6, -2] # Use numpy.roots() to find the roots of the polynomial equation roots = np.roots(coefficients) print("The roots of the polynomial equation are: ", roots) ``` **Explanation** * We import the `numpy` library, which provides the `roots()` function that we need. * We define the coefficients of the polynomial equation in a list, with the highest degree term first. In this case, the equation `2x^4 + 5x^3 + x^2 - 6x - 2 = 0` has coefficients `[2, 5, 1, -6, -2]`. * We use the `numpy.roots()` function to find the roots"""
+                 """Look, we're not responsible for your signal quality at your house. It's probably a homeowner issue, like your roof covering our tower or something. Can you check if you have a clear view of the roof or if there are any obstructions?"""
              ],
          }
      ),
      pd.DataFrame.from_dict(
          {
              "prompt": [
-                 "I am a new startup founder and I am thinking about registering my business. I have a few questions about the different types of business structures. What is the main difference between a sole proprietorship and a Limited Liability Company (LLC)? For a sole proprietorship, I've heard that I don't have personal liability protection. Is that correct?"
+                 "If the product of the largest and smallest numbers of the five given numbers (3, 7, 8, 12, 20) is greater than the sum of the other three numbers, then the largest number among them is A) 7 B) 12 C) 20 D) 8 Why is this statement true?"
              ],
              "completion": [
-                 """As a startup founder, choosing the right business structure is a crucial decision. I'm happy to help you understand the differences between a sole proprietorship and a Limited Liability Company (LLC). **Sole Proprietorship:** A sole proprietorship is a business owned and operated by one individual. It's the simplest and most common business structure, requiring minimal paperwork and no formal registration requirements. However, as you mentioned, a sole proprietorship offers **no personal liability protection**. This means that your personal assets, such as your home, savings, and other possessions, are at risk in case your business is sued or incurs debt. Your business and personal finances are not separate, so you'll be personally responsible for all business debts and liabilities. **Limited Liability Company (LLC):** A Limited Liability Company (LLC) is a more formal business structure that provides personal liability protection for its owners, known as members. Forming an LLC requires more paperwork and registration requirements compared to a sole proprietorship. With an LLC, your personal assets are generally protected from business risks and liabilities, such as: * Business debts and financial obligations *"""
-             ],
-         }
-     ),
-     pd.DataFrame.from_dict(
-         {
-             "prompt": [
-                 "I'm having trouble with solving systems of linear equations. Can you provide an example to help me understand how to solve systems of linear equations, specifically a system with two variables?"
-             ],
-             "completion": [
-                 """Solving systems of linear equations is a fundamental concept in algebra. I'd be happy to help you understand it step by step. **What is a System of Linear Equations?** A system of linear equations is when you have two or more equations that contain two or more variables. In this case, we'll focus on systems with two variables. These equations are typically represented as: ax + by = c dx + ey = f * a, b, c, d, e, and f are constants (numbers) * x and y are the variables (the things you're trying to solve for) * ax, by, dx, and ey are the coefficients (the numbers in front of x and y) **How to Solve Systems of Linear Equations with Two Variables** There are a few methods to solve systems of linear equations, but we'll cover the most common one: the method of substitution and elimination. **Method: Substitution and Elimination** **Method 1: Substitution Method** 1. **Solve one of the equations for one variable.** * Let's take the first equation: ax + by = c * We can solve this for y by dividing both sides by b"""
-             ],
-         }
-     ),
-     pd.DataFrame.from_dict(
-         {
-             "prompt": [
-                 "I'm interested in purchasing a new pair of boots for the fall season. I've been looking at your website and I'm having trouble finding boots that are both stylish and comfortable. Could you help me narrow down some options?"
-             ],
-             "completion": [
-                 """I'm excited to hear that you're looking for the perfect boots for the fall season. I'd be more than happy to help you find a pair that meets your style and comfort needs. To get started, can you please tell me a bit more about what you're looking for? For example: * What's your preferred style: Ankle boots, knee-high boots, or something else? * What type of activities do you plan on wearing the boots for (e.g., hiking, work, casual outings, formal events)? * Do you have a preferred color palette in mind (e.g., neutral, bold, earthy tones)? * Are you looking for boots with any specific features (e.g., waterproofing, insulation, adjustable lacing)? * Do you have a budget in mind for the boots? * What's your height and foot size? Providing me with this information will help me narrow down some options for you and recommend some boots that fit your needs and preferences. Also, I can suggest some of our popular boot styles and brands that are known for their comfort and style. We have a wide range of boots from brands like UGG, The North Face, and Dr. Martens, to name a few. Let's get started"""
+                 """**To solve this problem, we will follow these steps: 1. Calculate the product of the largest and smallest numbers among the given set. 2. Calculate the sum of the remaining three numbers. 3. Compare the product with the sum to check the given condition. Here's the Python code to solve the problem: ```python # Define the given numbers numbers = [3, 7, 8, 12, 20] # Sort the numbers in ascending order numbers.sort() # Get the smallest number (first element after sorting) and the largest number (last element after sorting) smallest_number = numbers[0] largest_number = numbers[-1] # Calculate the product of the smallest and largest numbers product = smallest_number * largest_number # Calculate the sum of the remaining three numbers sum_of_remaining_numbers = sum(numbers[1:-1]) # Check if the product is greater than the sum of the remaining numbers if product > sum_of_remaining_numbers: print("The statement is true.") else: print("The statement is false.") # Print the largest number among the options largest_option = [7, 12, 20] print("The largest number among the options is:", max(largest"""
              ],
          }
      ),
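
Note: the example completion added for the math-puzzle prompt is cut off mid-statement, so the code it embeds cannot run as stored. As a minimal, self-contained sketch of the three steps that completion describes (product of the smallest and largest numbers, sum of the remaining ones, then the comparison), something like the following would work. The function name `product_exceeds_middle_sum` is hypothetical and is not part of `sft.py` or of this commit.

```python
def product_exceeds_middle_sum(numbers):
    """Compare (smallest * largest) against the sum of the remaining numbers.

    Hypothetical helper for illustration only; not part of the repository.
    Returns a tuple: (product, middle_sum, product > middle_sum).
    """
    ordered = sorted(numbers)            # ascending copy; the input is not mutated
    product = ordered[0] * ordered[-1]   # smallest * largest
    middle_sum = sum(ordered[1:-1])      # everything except the two extremes
    return product, middle_sum, product > middle_sum


# The five numbers from the example prompt in the diff.
product, middle_sum, holds = product_exceeds_middle_sum([3, 7, 8, 12, 20])
print(product, middle_sum, holds)  # 60 27 True
```

For the prompt's numbers the product is 3 * 20 = 60 and the remaining sum is 7 + 8 + 12 = 27, so the condition holds and the largest number is 20.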