Transformers 3.0 can't find files
Hi, sorry if I'm missing something, I'm moving this across from a different thread here at the suggestion of @BoscoTheDog but I'm not expecting anyone to go read that thread. (Thanks for your help so far Bosco, much appreciated!)
I'm trying to get a simple proof of concept up and running that I can build on, using Phi 3.5 and the CDN-hosted transformers.js library, but I keep getting 404 errors and I don't fully understand what I'm doing wrong.
I'm importing the library for the pipeline like this:
import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]";
Then I'm loading the model like this:
const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web');
But I keep getting 404 errors:
[email protected]:217
GET https://huggingface.co/onnx-community/Phi-3.5-mini-instruct-onnx-web/resolve/main/onnx/model_quantized.onnx 404 (Not Found)
[email protected]:217 Uncaught Error: Could not locate file: "https://huggingface.co/onnx-community/Phi-3.5-mini-instruct-onnx-web/resolve/main/onnx/model_quantized.onnx".
at [email protected]:217:5325
at h ([email protected]:217:5348)
at async [email protected]:175:15938
at async [email protected]:175:13612
at async Promise.all (index 0)
at async P ([email protected]:175:13530)
at async Promise.all (index 0)
at async wr.from_pretrained ([email protected]:175:21979)
at async Do.from_pretrained ([email protected]:175:57753)
at async Promise.all (index 1)
That suggests to me that the model isn't supported, but I can see from the listing that it is?
I've tried all of the following models, and the only one that has worked is "tiny-random-PhiForCausalLM". I quickly saw that it is not what I'm looking for (I think it's a lorem ipsum generator), but at least it shows me that my code can work if I can figure out where to request the model from.
- onnx-community/Phi-3.5-mini-instruct-onnx-web
- Xenova/Phi-3-mini-4k-instruct
- microsoft/Phi-3-mini-4k-instruct-onnx-web
- Xenova/tiny-random-PhiForCausalLM
- Xenova/phi-1_5_dev
- BricksDisplay/phi-1_5
- BricksDisplay/phi-1_5-q4
- BricksDisplay/phi-1_5-bnb4
- Xenova/Phi-3-mini-4k-instruct_fp16
- Xenova/tiny-random-LlavaForConditionalGeneration_phi
I feel like I'm spinning my wheels on this and any help to point me in the right direction would make a huge difference. Thanks in advance!
In this case, you need to specify the correct dtype and device (it's a special optimized model for WebGPU). The following should work:
import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]";
const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
dtype: 'q4f16', device: 'webgpu'
});
We are in the process of adding default values to the config which would mean you don't need to do this in future. Hope this helps!
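For anyone wondering why the default call 404s: judging by the file names that appear in this thread (`model_quantized.onnx` in the failing URL, `model_q4f16.onnx_data` in the later error), the requested ONNX file seems to depend on the `dtype` option. Below is a minimal sketch of that idea. `DTYPE_SUFFIX` and `expectedModelUrl` are hypothetical names for illustration, not part of the transformers.js API, and the mapping is an assumption inferred from those two file names only.

```javascript
// Hypothetical lookup table: dtype option -> file-name suffix.
// Only the two entries evidenced in this thread are listed.
const DTYPE_SUFFIX = {
  q8: "_quantized", // assumed to be what is used when no dtype is passed
  q4f16: "_q4f16",
};

// Hypothetical helper: build the URL the browser would request for the model file.
function expectedModelUrl(modelId, dtype = "q8") {
  const suffix = DTYPE_SUFFIX[dtype];
  if (suffix === undefined) {
    throw new Error(`No suffix known for dtype "${dtype}"`);
  }
  return `https://huggingface.co/${modelId}/resolve/main/onnx/model${suffix}.onnx`;
}

const repo = "onnx-community/Phi-3.5-mini-instruct-onnx-web";
console.log(expectedModelUrl(repo));          // same URL as in the 404 above
console.log(expectedModelUrl(repo, "q4f16")); // file requested with dtype: 'q4f16'
```

If the first URL is missing from the repo and the second exists, that would explain why passing `dtype: 'q4f16'` gets past the 404.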
Thanks very much for the response @Xenova ! Appreciate that pointer, it's helped me get further! As an aside - I've been so pleasantly surprised by how willing people are to help here :-)
I've got it loading, but I seem to be hitting this error now:
Error: Can't create a session. ERROR_CODE: 1, ERROR_MESSAGE: Deserialize tensor model.layers.4.attn.o_proj.MatMul.weight_Q4 failed.Failed to load external data file "model_q4f16.onnx_data", error: Module.MountedFiles is not available.
at Ve ([email protected]:100:73169)
at Zu ([email protected]:100:357172)
It appears with these warnings, but I don't think they prevent the model from running:
1. 2024-10-01 09:28:44.905599 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2. 2024-10-01 09:28:44.906799 [W:onnxruntime:, session_state.cc:1170 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
In case it helps, my full code is below. I'm running on a MacBook Pro in Chrome (the exact same setup that I used to try out the Phi demo). I initially tried just running the code locally with VS Code Live Server, but in case that was the culprit I also uploaded it to my server and tried loading the page there, and got the same errors.
const status = document.getElementById("status");
status.textContent = "Loading model...";
import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]";
try {
  console.log("Started model loading")
  const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
    dtype: 'q4f16', device: 'webgpu'
  });
  console.log("Finished model loading")
  status.textContent = "Ready";
  generateText(languageModel);
} catch (err) {
  console.log(err)
  status.textContent = "Error - failed to load";
}
Thanks again for the help!
Made a quick edit to the above - realised I was doing something silly with my error handling (eesh). I have updated with the actual error I'm getting!
Never seen that one before 0_0
Oh whoops - you also need to add use_external_data_format: true
as an option:
const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
dtype: 'q4f16', device: 'webgpu', use_external_data_format: true,
});
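Since this model is built for WebGPU, it may be worth checking for WebGPU before kicking off the download. Here is a minimal sketch, assuming the standard `navigator.gpu` property is a good enough signal of WebGPU support. `buildPhiOptions` is a hypothetical helper name, and the option values are simply the ones from the reply above. It only guards against a missing WebGPU; it is not a fix for the `Module.MountedFiles` error.

```javascript
// Hypothetical helper: returns the pipeline options from this thread,
// or throws early if the browser does not expose WebGPU.
// `nav` defaults to the global navigator (if any) and can be passed
// explicitly, which makes the check easy to test.
function buildPhiOptions(nav = globalThis.navigator) {
  const hasWebGPU = Boolean(nav && nav.gpu);
  if (!hasWebGPU) {
    throw new Error("WebGPU is not available in this browser");
  }
  return {
    dtype: "q4f16",
    device: "webgpu",
    use_external_data_format: true,
  };
}

// Usage in the page (not executed here):
//   const languageModel = await pipeline(
//     'text-generation',
//     'onnx-community/Phi-3.5-mini-instruct-onnx-web',
//     buildPhiOptions(),
//   );
```

Keeping the three options in one place also makes it harder to drop one of them by accident, for example `device: 'webgpu'`.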
Hi there! Sorry for the delay in response - I wanted to double-check before I came back again. It looks like I'm getting the same kind of odd error.
I think I'm going to end up building what I planned in Angular anyway, for unrelated reasons, so I should be able to use the npm version of this library, which may behave differently. I'll give that a crack.
Thanks very much for the time and effort to help! Sorry I can't report more success with this method!
Hi
I'm running into a similar issue too.
Error
Error: Error: Can't create a session. ERROR_CODE: 1, ERROR_MESSAGE: Deserialize tensor model.layers.6.mlp.up_proj.MatMul.weight_Q4 failed.Failed to load external data file "model_q4f16.onnx_data", error: Module.MountedFiles is not available.
at Ve ([email protected]:100:73359)
at ed ([email protected]:100:365982)
Full Code
<!DOCTYPE html>
<html>
<head>
<title>Test Transformers.js</title>
<script type="module">
async function testSummarization() {
  try {
    // Load transformers.js
    const { env, AutoTokenizer, AutoModelForCausalLM, pipeline } = await import('https://cdn.jsdelivr.net/npm/@huggingface/[email protected]');
    console.log('Transformers.js loaded'); // Debugging statement
    env.allowLocalModels = false
    // Load the summarization pipeline
    const summarizationPipeline = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
      dtype: 'q4f16', use_external_data_format: true,
    });
    console.log('Summarization pipeline loaded'); // Debugging statement
    // Run the summarization
    const text = 'The text you want to summarize';
    const result = await summarizationPipeline(text, { max_length: 130, min_length: 30, length_penalty: 2.0, num_beams: 4 });
    console.log('Summarization result:', result); // Debugging statement
    console.log(result[0].summary_text);
  } catch (error) {
    console.error('Error:', error);
  }
}
testSummarization();
</script>
</head>
<body>
<h1>Test Transformers.js</h1>
</body>
</html>
Please help