Hugging Face + Google Visual Blocks
At Google I/O 2024, we're releasing custom Hugging Face nodes in collaboration with the Google Visual Blocks team. Visual Blocks for ML is a browser-based tool for building machine learning pipelines through a visual interface. The custom nodes can run fully in the browser with Transformers.js, or on the server side through the Hugging Face Serverless API for larger models and Text Generation Inference for selected LLMs.
You can learn more about Visual Blocks and how to use it here, and check out the source code of the Hugging Face custom nodes here.
We're looking for feedback on this integration, as well as contributions of new nodes and improvements. Please open an issue in the Visual Blocks repository or submit a PR with your changes.
How to use the custom components
To start playing with our custom components, you need to add a custom node to your Visual Blocks project. First, start a new project at https://visualblocks.withgoogle.com/#/edit/new, then click the "+" button in the bottom left corner to add a new node.
Then input the pre-bundled code from our npm package. You can do this by pasting the following link into the input field and clicking "Submit":
https://cdn.jsdelivr.net/npm/huggingface-visualblocks-nodes@latest
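The `@latest` tag in this link resolves to the newest published release. jsDelivr also accepts an exact version in the same position, which may help keep a project reproducible. The sketch below only shows how such a URL is composed; the helper name is made up for illustration, and any version you pass should be one that actually exists on npm.

```javascript
// Build a jsDelivr npm URL of the form
// https://cdn.jsdelivr.net/npm/<package>@<version>.
// `version` defaults to the "latest" tag used in the link above.
function jsDelivrNpmUrl(packageName, version = "latest") {
  if (!packageName) {
    throw new Error("packageName is required");
  }
  return `https://cdn.jsdelivr.net/npm/${packageName}@${version}`;
}

// Reproduces the link given above.
console.log(jsDelivrNpmUrl("huggingface-visualblocks-nodes"));
```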
Then you will be able to see three Hugging Face Collections: Client, Server and Common.
Client-Side Nodes
Using only client-side nodes, you can combine fun image processing nodes, webcam input, and Transformers.js image segmentation models.
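For readers curious what a client-side segmentation step might look like in plain code, here is a rough sketch using the Transformers.js `pipeline` function. It is not the implementation of the Visual Blocks node. It assumes the library is installed as `@xenova/transformers`, and the model id is just one publicly available segmentation checkpoint. The `maskCoverage` helper is hypothetical and assumes each mask stores one value per pixel.

```javascript
// Runs an image-segmentation pipeline with Transformers.js.
// Assumptions: the package name "@xenova/transformers" and the default
// model id are illustrative choices, not taken from the custom nodes.
async function segmentImage(imageUrl, modelId = "Xenova/detr-resnet-50-panoptic") {
  const { pipeline } = await import("@xenova/transformers");
  const segmenter = await pipeline("image-segmentation", modelId);
  // Resolves to an array of { label, score, mask } objects.
  return segmenter(imageUrl);
}

// Hypothetical helper: fraction of pixels covered by each segment's mask.
// Expects mask.data to hold one value per pixel, non-zero inside the segment.
function maskCoverage(segments) {
  return segments.map(({ label, mask }) => {
    let covered = 0;
    for (const value of mask.data) {
      if (value > 0) covered += 1;
    }
    return { label, coverage: covered / (mask.width * mask.height) };
  });
}

// Example with a hand-made 2x2 mask instead of real model output.
const demoCoverage = maskCoverage([
  {
    label: "person",
    score: 0.99,
    mask: { width: 2, height: 2, data: new Uint8Array([255, 0, 255, 0]) },
  },
]);
console.log(demoCoverage);
```

In a browser, `segmentImage` would download the model on first use, so the first call can take noticeably longer than later ones.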
Image Segmentation
Depth Estimation
Server-Side Nodes
With the Hugging Face Server Nodes, you can access thousands of state-of-the-art models directly from the Hub.
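Outside Visual Blocks, a hosted model can be reached with a plain HTTP request. The sketch below assumes the endpoint pattern `https://api-inference.huggingface.co/models/<model id>` and a JSON body with an `inputs` field, as documented for the Serverless Inference API at the time of writing; check the current documentation before relying on it. The function names are illustrative, and this is not necessarily how the server nodes make their calls.

```javascript
// Builds the fetch() arguments for a Serverless Inference API call.
// Assumption: endpoint pattern and body shape as described in the lead-in.
function buildInferenceRequest(modelId, inputs, token) {
  return {
    url: `https://api-inference.huggingface.co/models/${modelId}`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs }),
    },
  };
}

// Sends the request. Text tasks typically answer with JSON;
// image-producing tasks return binary data and would need response.blob().
async function queryModel(modelId, inputs, token) {
  const { url, options } = buildInferenceRequest(modelId, inputs, token);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Inference request failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the request construction separate from the network call makes it easy to inspect or log exactly what would be sent before spending any API quota.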
Server + Client-Side Example
Here is an example that uses the Hugging Face Hub Login node to get your personal token, then the Mistral-7B LLM to write a prompt, Stable Diffusion XL Text to Image to generate an image from it, and finally Transformers.js Depth Estimation running on the client side.
Another cool example uses Stable Diffusion XL Text-to-Image to generate a background image and Transformers.js to remove the background from the webcam input using either briaai/RMBG-1.4 or MODNet.
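Conceptually, this example is a chain in which each node's output becomes the next node's input. The sketch below mimics that flow with stand-in stages. The stage functions are hypothetical placeholders that return dummy values; real nodes would call an LLM, a text-to-image model, and a depth estimation model.

```javascript
// Runs async stages in order, feeding each stage's result into the next,
// much like an edge carries one node's output to the next node's input.
async function runChain(stages, input) {
  let value = input;
  for (const stage of stages) {
    value = await stage(value);
  }
  return value;
}

// Stand-in stages (hypothetical): they only pass labelled data along.
const writePrompt = async (topic) => `A detailed photo of ${topic}`;
const textToImage = async (prompt) => ({ kind: "image", prompt });
const estimateDepth = async (image) => ({ kind: "depth-map", source: image });

runChain([writePrompt, textToImage, estimateDepth], "a lighthouse at dusk")
  .then((result) => console.log(result));
```

Because every stage is awaited, a slow server-side call simply delays the stages after it, which matches how a client-side node waits for its upstream input.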
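The final step of this kind of example is compositing: the webcam frame is kept where the mask marks a person and replaced by the generated image elsewhere. Here is a minimal sketch of that blend on raw pixel data. It assumes RGBA byte arrays of equal size and a single-channel mask with values from 0 to 255. It is an illustration of the idea, not the code the nodes run.

```javascript
// Blends a foreground over a background using a per-pixel mask.
// foreground and background are RGBA byte arrays of the same size;
// mask holds one value per pixel (0 = background, 255 = foreground).
function compositeWithMask(foreground, background, mask) {
  if (foreground.length !== background.length || mask.length * 4 !== foreground.length) {
    throw new Error("Inputs must describe the same number of pixels");
  }
  const out = new Uint8ClampedArray(foreground.length);
  for (let p = 0; p < mask.length; p += 1) {
    const alpha = mask[p] / 255;
    for (let c = 0; c < 3; c += 1) {
      const i = p * 4 + c;
      out[i] = foreground[i] * alpha + background[i] * (1 - alpha);
    }
    out[p * 4 + 3] = 255; // keep the result fully opaque
  }
  return out;
}

// Two pixels: the first is kept from the foreground, the second is replaced.
const blended = compositeWithMask(
  new Uint8ClampedArray([10, 20, 30, 255, 40, 50, 60, 255]),
  new Uint8ClampedArray([1, 2, 3, 255, 4, 5, 6, 255]),
  new Uint8Array([255, 0]),
);
console.log(Array.from(blended));
```

In a browser, the input arrays could come from `getImageData` on a canvas, and the result could be drawn back with `putImageData`.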
More Examples
Here is a list of examples showcasing the new nodes. Click on an example to load it in the editor.
Client Nodes
Translation Node Example
Token Classification Node Example
Text Classification Node Example
Object Detection Node Example
Image Segmentation Node Example
Image Classification Node Example
Depth Estimation Node Example
Background Removal Node Example
Server Nodes
Chat Template Text Generation Node Example
Chat Completion Node Example
Fill Mask Node Example
Image Classification Node Example
Summarization Node Example
Text Classification Node Example
Text Generation Node Example
Text to Image Node Example
Token Classification Node Example
Extra Examples
Background Removal Text to Image
Chat Completion Text to Image Depth
Image Segmentation Webcam Client
Acknowledgements
Thanks to @Xenova for building Transformers.js and for kickstarting the custom nodes project, @Jason Mayes, and the Visual Blocks team.