## Introduction

Function calling enables LLMs to invoke specific functions, integrating external features, accessing real-world data, and extending beyond text generation. We present Hammer, a fine-tuned model based on Qwen2-7B-Instruct. Unlike previous works emphasizing data refinement (cite xlam, IBM…), our focus is on applying novel training techniques to address recognized issues in existing function-calling models. These issues are listed below:
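
To make the setting concrete, a function-calling interaction typically looks like the following (a hypothetical sketch; the tool schema and the `get_weather` function are illustrative, not Hammer's actual prompt format):

```python
import json

# Hypothetical tool definition in a common JSON-schema style
# (illustrative only; Hammer's actual prompt format may differ).
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name."}
        },
        "required": ["city"],
    },
}]

# Rather than free-form text, the model emits a structured call that
# the application can parse and execute against a real API.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
call = json.loads(model_output)
assert call["name"] == tools[0]["name"]
```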

1. Hallucination

   a) Function name hallucination: rather than selecting from the provided function pool, the model tends to generate a new function based on its own world knowledge.

   b) Parameter name hallucination: when the user fails to provide sufficient information to fulfill the request (i.e., necessary parameters are missing), the model is inclined to fill in those parameters from its own knowledge.
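
A first-line check for both failure modes is to validate each emitted call against the provided pool (a minimal sketch with invented tool names; note that detecting fabricated parameter *values*, as opposed to names, requires more than a schema check):

```python
def is_hallucinated(call: dict, tools: list[dict]) -> bool:
    """Return True if the call names a function or a parameter
    that does not appear in the provided tool pool."""
    pool = {t["name"]: set(t["parameters"]["properties"]) for t in tools}
    if call["name"] not in pool:  # function name hallucination
        return True
    # parameter name hallucination
    return any(p not in pool[call["name"]] for p in call["arguments"])

tools = [{"name": "set_alarm",
          "parameters": {"properties": {"time": {"type": "string"}}}}]

# Model invents a function not in the pool:
assert is_hallucinated({"name": "wake_me_up", "arguments": {}}, tools)
# Model invents a parameter the tool does not define:
assert is_hallucinated({"name": "set_alarm",
                        "arguments": {"snooze": "5m"}}, tools)
```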

2. Overfitting

   a) Function and parameter names: the model pays excessive attention to the function and parameter names while neglecting other information such as the description, inputs, and outputs. This leads to a lack of generalization and reduces the model's ability to handle diverse scenarios. (addressed via masking)

   b) Parameter filling: the model does not extract parameters based on the provided function definition; instead, it fills them in based on knowledge learned during training. For instance, when "San Francisco" is expected, "San Francisco, CA" might be filled in because in the training data every "San Francisco" is followed by "CA". (this example may not be apt; unresolved)

   c) Default value filling: the model fills in parameter default values according to patterns in the training data rather than the provided function definition. For example, when "default = inch" is most common in the training data, the model is likely to fill in "inch" instead of "cm", even though the latter is the default value provided in the function definition. (addressed via default-value masking plus prompting)

   d) Ordering of the provided function list and parameter list: when the provided function or parameter lists have a consistent ordering during training, the model may learn unintended patterns, such as memorizing the ordering. (addressed via shuffling)
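
The masking and shuffling fixes noted in (a) and (d) can be sketched as a data-augmentation pass over each training sample (an illustrative sketch, not Hammer's actual pipeline; any renaming would also have to be mirrored in the target call, which is omitted here):

```python
import random
import string

def augment_tool(tool: dict, rng: random.Random, p: float = 0.5) -> dict:
    """Randomly replace the function name and parameter names with
    uninformative tokens, so the model must rely on descriptions
    rather than memorized names."""
    def rand_name(prefix: str) -> str:
        return prefix + "".join(rng.choices(string.ascii_lowercase, k=6))

    masked = dict(tool)
    if rng.random() < p:
        masked["name"] = rand_name("fn_")
    masked["parameters"] = {"properties": {
        (rand_name("arg_") if rng.random() < p else name): spec
        for name, spec in tool["parameters"]["properties"].items()
    }}
    return masked

def augment_sample(tools: list[dict], rng: random.Random) -> list[dict]:
    """Mask names in each tool, then shuffle the tool list so the
    model cannot learn positional shortcuts."""
    shuffled = [augment_tool(t, rng) for t in tools]
    rng.shuffle(shuffled)
    return shuffled
```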

3. Instructions missing key information

   Occasionally, user instructions may lack details essential for effective function execution. For instance, the command "Set an alarm to wake me up" lacks a time specification. Ideally, in such instances, the model should either request additional information or output only the function name, excluding the unspecified parameter. Existing methods either disregard such situations or output an "irrelevant" signal, indicating the query is unfulfillable with the given tools.
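
For the alarm example, the desired target output could look like this (hypothetical; assumes a `set_alarm` tool whose required `time` parameter the user never specified):

```python
import json

# The model selects the right function but leaves out the parameter
# the user never specified, instead of hallucinating a time.
desired_output = '{"name": "set_alarm", "arguments": {}}'
call = json.loads(desired_output)
assert call["name"] == "set_alarm"
assert "time" not in call["arguments"]  # nothing fabricated
```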

4. Prompt design

   Inconsistency in instruction formatting between training and testing can result in a significant performance gap. For example, during training the default value may be provided in the parameter description, while during testing it is provided as a separate field in JSON format.
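
Concretely, the mismatch described above might look as follows (hypothetical schemas; the `unit` parameter is illustrative):

```python
# Training-time style: the default is buried in the description text.
train_param = {
    "unit": {
        "type": "string",
        "description": "Length unit. Default is 'cm'.",
    }
}

# Test-time style: the default is a separate, machine-readable field.
test_param = {
    "unit": {
        "type": "string",
        "description": "Length unit.",
        "default": "cm",
    }
}

# A model trained only on the first style may ignore the explicit
# "default" key at test time, producing the gap described above.
assert "default" not in train_param["unit"]
assert test_param["unit"]["default"] == "cm"
```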

In this work, we focus on introducing function-calling abilities with particular emphasis on addressing the aforementioned limitations. We summarize our techniques as follows: