Open-source and privacy-by-design alternative to ChatGPT
📜 About the project
What is BlindChat?
🐱 BlindChat is an open-source project to develop the first fully in-browser and private Conversational AI.
Most conversational AI solutions today require users to send their data to AI providers who serve AI models as a Service. This poses privacy issues for users who lose control over their data.
⚠️ Because data is a key asset to improve LLMs, many solutions more or less implicitly fine-tune on users’ data to improve their models.
This creates privacy risks for users as LLMs might learn their data by heart. Carlini et al. [1] showed that LLMs such as GPT-J could learn at least 1% of their training set by heart.
🔐 BlindChat solves this issue as users have guarantees that their data remains private at all times and have full control over it, either by doing local inference or using secure isolated environments called secure enclaves.
Local conversations
Demo
👩💻 You can try out BlindChat here! We enable users to interact with a Flan-T5 model locally through their browser: the model is pulled and used for local inference using transformers.js.
Who is BlindChat for?
BlindChat aims to serve two types of users:
End users: We want to provide privacy-by-design alternatives to change the current status quo. Most users today are forced to give up their data to leverage AI services, and opaque or nonexistent privacy controls are the norm.
Developers: We want to help developers easily serve privacy-by-design Conversational AI, which is why we are focused on making BlindChat easy to customize and deploy.
Roadmap
You can check out our progress in more detail on our official roadmap. We highlight features on which we would love help from contributors in our help wanted section.
Roadmap quick summary:
- Revamping of Hugging Face Chat UI to make it entirely client-side (removal of telemetry, data sharing, server-side history of conversations, server-side inference, etc.)
- Integration of privacy-by-design inference with local model
- Local caching of conversations
- Integration of more advanced local models (e.g. phi-1.5) and more advanced inference (e.g. Web LLM)
- Integration of privacy-by-design inference with remote enclaves using BlindLlama for powerful models such as Llama 2 70b & Falcon 180b ⌛
- Integration with LlamaIndex TS for local Retrieval Augmented Generation (RAG) ⌛
- Internet search ⌛
- Connectors to pull data from different sources ⌛
🔧 Setup
Before going any further, please make sure you have Node.js 18.0 installed on your system.
To run the chat user interface in dev/debug mode for testing purposes, execute the following commands in the root folder of your BlindChat code repo.
npm install
npm run dev
This will install the dependencies of the project and launch the dev environment.
The chat can be deployed in production mode with the following commands:
npm run build
node build
The chat-ui uses server-side rendering, so building the pages before deploying them is mandatory.
⚠️ Note that the command node build will run the server in HTTP mode. If you wish to add TLS, please use a proxy server, such as NGINX.
🧑🎨 Design
Principles
🤗 BlindChat is a fork of the Hugging Face Chat UI project.
We modified the code so that various tasks usually handled by the server are done by the browser instead. This is to ensure privacy: we do not want to send user data to the server or AI provider, which our solution places outside of our trust model.
Philosophy
To make AI transparent and confidential, (almost) all of the logic is moved from the server side to the client-side browser.
This ensures end users’ privacy and gives them control over what happens to their data. For instance, the inference can be done locally using transformers.js, and conversations can be stored in the user's browser. This means the operators of the AI service are blind to the user's data, hence the name BlindChat!
Data is only sent server-side when our remote enclave mode is selected. With this mode, the server is deployed within a hardened and verifiable environment called an enclave, which provides end-to-end protection and prevents external access. Not even the AI provider admins operating the enclave can read users’ data.
Note that while our hardened environments don’t fit all definitions of an “enclave”, we use the term here for convenience to describe an environment that allows a server to process data without exposing its contents to service providers.
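To make the idea of keeping conversations in the browser more concrete, here is a minimal sketch. It assumes conversations are serialized to JSON and written to a key-value store shaped like the Web Storage API. The names KeyValueBackend, MemoryBackend and LocalConversationStore, as well as the key prefix, are hypothetical and are not taken from the BlindChat code base.

```typescript
// Minimal subset of the Web Storage API. In a browser, window.localStorage
// satisfies this interface, so it can be passed in directly.
interface KeyValueBackend {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// In-memory stand-in for environments without browser storage (e.g. tests).
class MemoryBackend implements KeyValueBackend {
  private entries = new Map<string, string>();

  getItem(key: string): string | null {
    return this.entries.get(key) ?? null;
  }

  setItem(key: string, value: string): void {
    this.entries.set(key, value);
  }
}

// Hypothetical store: every conversation lives under its own key on the
// client, and nothing in this class talks to a server.
class LocalConversationStore {
  private backend: KeyValueBackend;
  private prefix: string;

  constructor(backend: KeyValueBackend, prefix = "conversation:") {
    this.backend = backend;
    this.prefix = prefix;
  }

  // Returns the stored messages, or an empty list for an unknown id.
  load(conversationId: string): ChatMessage[] {
    const raw = this.backend.getItem(this.prefix + conversationId);
    return raw === null ? [] : (JSON.parse(raw) as ChatMessage[]);
  }

  // Appends one message and writes the whole conversation back.
  append(conversationId: string, message: ChatMessage): void {
    const messages = this.load(conversationId);
    messages.push(message);
    this.backend.setItem(this.prefix + conversationId, JSON.stringify(messages));
  }
}
```

In a browser, the store could be created with new LocalConversationStore(window.localStorage). Whether localStorage, IndexedDB or another mechanism is the right backend depends on conversation size and is left open here.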
Private inference
We offer two modes to ensure users’ data remains private:
On-device inference
With the on-device mode, the model is downloaded to the user's browser, and inference is performed on-device.
This mode is generally suitable for smaller models as large models may require too much bandwidth and computational resources.
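As a rough illustration of the on-device mode, the sketch below wraps a text-to-text generation pipeline behind a small helper. It assumes the generator behaves like a transformers.js text2text-generation pipeline (a function that takes a prompt and resolves to an array of objects with a generated_text field). The names LocalGenerator and answerLocally, the default token limit, and the model id mentioned in the comments are assumptions made for illustration, not BlindChat's actual code.

```typescript
// Shape of a text2text-generation pipeline, reduced to what this sketch
// needs: it takes a prompt and resolves to a list of generated texts.
type LocalGenerator = (
  prompt: string,
  options?: { max_new_tokens?: number }
) => Promise<Array<{ generated_text: string }>>;

// Hypothetical helper: runs one prompt through a generator that lives in the
// browser. This function makes no network request itself, so the prompt is
// only handed to the local generator.
async function answerLocally(
  generator: LocalGenerator,
  prompt: string,
  maxNewTokens = 128
): Promise<string> {
  const trimmed = prompt.trim();
  if (trimmed.length === 0) {
    throw new Error("Prompt must not be empty");
  }
  const outputs = await generator(trimmed, { max_new_tokens: maxNewTokens });
  // The pipeline resolves to an array; only the first entry is used.
  const first = outputs[0];
  return first ? first.generated_text : "";
}

// In a browser build, the generator would typically be created once and
// reused, along these lines (model id is an assumption):
//   import { pipeline } from "@xenova/transformers";
//   const generator = await pipeline("text2text-generation", "Xenova/LaMini-Flan-T5-783M");
//   const reply = await answerLocally(generator, "What is a secure enclave?");
```

Passing the generator in as a parameter keeps the helper independent of any particular model, which also makes it easy to exercise with a stub.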
Confidential and transparent AI APIs with enclaves
With the Zero-trust AI APIs mode, data is sent to a secure environment called an enclave containing the model for remote inference.
These environments provide end-to-end protection through robust isolation and verification. User data is never accessible in the clear to the AI provider admins.
You can find out more about Confidential and transparent AI APIs with enclaves in the guide we provide with our BlindLlama project, which is the underlying technology for this mode of BlindChat.
Architecture
The project currently has three major components:
- UI: This is the Chat interface that the end user interacts with. It contains the Chat box, and will contain plugins and other widgets for more complex interaction, such as loading documents or enabling voice commands.
- Private LLM: Developers can customize which LLM they choose to answer users’ queries. Current options are either local models or remote enclaves to ensure transparent and private inference.
- Storage: Developers can customize what kind of storage is used to save information such as conversation history and, in the future, embeddings for RAG.
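One way to picture how these components fit together is to put the Private LLM and Storage behind small interfaces that the UI calls. The sketch below is illustrative only: the interface names PrivateLlm and ConversationStorage, the handleUserMessage function, and the stand-in implementations are hypothetical and do not come from the BlindChat code base.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Hypothetical contract for the Private LLM component. An on-device model
// and a remote enclave client would both implement it.
interface PrivateLlm {
  readonly mode: "on-device" | "enclave";
  generate(prompt: string): Promise<string>;
}

// Hypothetical contract for the Storage component.
interface ConversationStorage {
  save(role: Turn["role"], content: string): void;
  all(): Turn[];
}

// The UI layer depends only on the two interfaces, so a developer can swap
// the model or the storage without touching this function.
async function handleUserMessage(
  llm: PrivateLlm,
  storage: ConversationStorage,
  prompt: string
): Promise<string> {
  storage.save("user", prompt);
  const reply = await llm.generate(prompt);
  storage.save("assistant", reply);
  return reply;
}

// Stand-ins for illustration: a fake on-device model that echoes its input,
// and a storage that keeps turns in memory.
const echoLlm: PrivateLlm = {
  mode: "on-device",
  generate: async (prompt) => "You said: " + prompt,
};

function createMemoryStorage(): ConversationStorage {
  const turns: Turn[] = [];
  return {
    save: (role, content) => {
      turns.push({ role, content });
    },
    // Return a copy so callers cannot mutate the stored turns.
    all: () => turns.slice(),
  };
}
```

With this split, switching from local inference to a remote enclave would mean passing a different PrivateLlm implementation, while the chat flow and the stored history stay the same.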
Coming soon:
- Connectors: Connectors will allow users to pull documents from various sources, e.g. PDF upload, and share outputs.
- Integration with Llama Index TS: This will allow users to index documents with local models, store them in local storage and use them for RAG (query the LLMs based on the information contained in their documents).
📊 Comparisons
Mode | Client-side bandwidth requirements | Client-side computing requirements | Model capabilities | Privacy |
---|---|---|---|---|
On-device prediction | High | High | Low | High |
Regular AI APIs | Low | Low | High | Low |
Zero-trust AI APIs | Low | Low | High | High |
On-device prediction and Confidential AI APIs (the Zero-trust AI APIs row in the table) both provide privacy, contrary to most existing Conversational AI solutions, which expose data to privacy risks.
On-device prediction has the advantage of providing the highest level of privacy, as data does not leave the device. However, it requires downloading models that range from several hundred MB to several GB, and it demands substantial memory and computing resources. For many users, this option will not be possible with larger, higher-performing models due to these device requirements.
Confidential AI APIs are deployed remotely, meaning the size of models is not restricted by the specifications of user devices. Users are able to query large models while still having robust privacy guarantees.
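The bandwidth side of this trade-off is simple arithmetic: download time is model size in bits divided by link speed. The sketch below works that out and adds a toy decision rule on top. The function names, the two-minute wait limit, and the rule that a model should use at most half of the device memory are hypothetical thresholds chosen for illustration, not values used by BlindChat.

```typescript
// Seconds needed to download a model of `modelMegabytes` MB (1 MB = 10^6
// bytes) over a link of `bandwidthMbps` megabits per second.
function estimateDownloadSeconds(modelMegabytes: number, bandwidthMbps: number): number {
  if (modelMegabytes < 0 || bandwidthMbps <= 0) {
    throw new RangeError("Model size must be >= 0 and bandwidth must be > 0");
  }
  return (modelMegabytes * 8) / bandwidthMbps;
}

type InferenceMode = "on-device" | "zero-trust-api";

// Hypothetical policy: suggest on-device inference only if the model fits in
// half of the device memory and downloads within `maxWaitSeconds`.
function suggestMode(
  modelMegabytes: number,
  bandwidthMbps: number,
  deviceMemoryMegabytes: number,
  maxWaitSeconds = 120
): InferenceMode {
  const fitsInMemory = modelMegabytes <= deviceMemoryMegabytes / 2;
  const downloadIsTolerable =
    estimateDownloadSeconds(modelMegabytes, bandwidthMbps) <= maxWaitSeconds;
  return fitsInMemory && downloadIsTolerable ? "on-device" : "zero-trust-api";
}
```

For example, on a 50 Mbps connection a 300 MB model takes about 48 seconds to download, while a 4 GB model takes about 640 seconds, which is over ten minutes. Under the thresholds assumed here, the first case would be served on-device and the second through the remote API.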
📇 Get in touch
We would love to hear your feedback or suggestions. Here are the ways you can reach us:
- Found a bug? Open an issue!
- Got a suggestion? Join our Discord community and let us know!
- Set up a one-on-one meeting with a member of our team
Want to hear more about our work on privacy in the field of AI?
Thank you for your support!
References
[1] Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramèr, F., & Zhang, C. (2022). Quantifying Memorization Across Neural Language Models. arXiv:2202.07646.