Paul Oamen
FEDA logo

FEDA

status: experimental

A proper public demo interface for FEDA is currently in development and will be available soon. The screenshots shown are from an extremely basic working prototype built using LSTMs, the Flower framework, Ethereum Sepolia, IPFS, Pinata, and Streamlit for the UI. Source code is available here (please ignore any visible API keys, as this was strictly for prototyping purposes).

FEDA home page

FEDA is a small showcase of a much broader vision for Decentralized Artificial Intelligence.

Animated elephant

You have probably experienced autocomplete before. When you start typing a message to a friend, working on a document, or writing code, suggestions often appear predicting what you might want to type next. That is the core idea behind autocomplete. It has become a common feature in many software systems, and it is still only in its early stages. In the future, we may see it evolve into systems that support direct voice messaging, speech-to-text, and perhaps even mind-to-text, shifting how we interact with machines entirely.

At its core, autocomplete relies on an AI model observing what you are typing and predicting what could come next. It does this by learning from two kinds of context. One is the local context, which comes from your personal typing habits and patterns. The other is a global context, shaped by general training and insights gathered from other users across the platform. While this technology is undeniably useful, it raises valid concerns around privacy.

Imagine an AI model learning from everything you type, even when working on sensitive documents, confidential codebases, or private credentials. That is a serious concern. Even more alarming is the fact that many of these predictions are powered by external API calls to large language models hosted by third-party service providers. In today's excitement around large language models, many users overlook these risks, but they are real. Several major institutions, including central banks, have already begun issuing internal warnings discouraging the use of such models for handling confidential work.

As AI engineers, we must think critically about how to move forward. Many different approaches have been proposed to address these issues. One such approach is embodied in a system called FEDA, which stands for Federated and Decentralized Autocompleter.

The core idea behind FEDA is to apply federated learning to train AI models directly on the user's device, local system, or private infrastructure. None of the user's content is shared externally. All training takes place locally. Once training is complete, only the model weights, the numeric parameters that encode what the model has learned, are shared over a decentralized network.

This network gathers the local weights from multiple systems and combines them using a consensus protocol at regular intervals. The resulting aggregated weights are committed back to the network, and each participating system can retrieve the updated model from there. This creates a push and pull system, where local models contribute what they have learned and pull updates that reflect global training across all users. The result is a system that maintains both personal learning and collective knowledge without compromising user privacy.
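The aggregation step described above is, in its simplest form, federated averaging: each client's weight vector is folded into a weighted mean, which every client then pulls back down. A minimal framework-free sketch (the function name and the sample-count weighting are illustrative, not taken from the FEDA codebase):

```python
from typing import List

def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Combine per-client weight vectors into one global vector,
    weighting each client by how many samples it trained on."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            global_weights[i] += w * (n / total)
    return global_weights

# Two clients push their locally trained weights...
aggregated = federated_average([[1.0, 3.0], [3.0, 1.0]], [100, 100])
# ...and every client pulls the same aggregated result back.
print(aggregated)  # [2.0, 2.0]
```

Because only these weight vectors cross the network, no raw user text ever leaves a client.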

This principle can be applied well beyond autocomplete. It represents a foundational philosophy for building privacy-respecting artificial intelligence systems, with a pinch of decentralization to ensure that trust and control remain in the hands of users.

FEDA began as a playful experiment and later became a project during my first master's program. The early version was quite basic, but development has since taken a more refined direction. A formal paper detailing the ideas behind FEDA is currently in progress and will offer deeper technical context. For now, I will pause here and shift focus to discussing the demo that has been created.

The Original FEDA Demo

FEDA architecture diagram

The original FEDA demo was an extremely simple prototype created to demonstrate the possibility of building a privacy-aware text autocompletion system using federated learning and decentralized storage. It was not meant for production or scale. Instead, it served as a foundational showcase of the key ideas behind FEDA.

Local and Global Modeling Using LSTM

At the heart of the demo were two LSTM models, one referred to as the local model and the other as the global model. LSTM was selected because of its simplicity and low compute demands. While more advanced models like transformers exist, LSTM provided a fast and accessible way to build and test the initial system.

The local model was trained on the user's own data. It learned from the user's writing patterns and produced personalized suggestions based entirely on local behavior. The global model aimed to offer more general suggestions based on patterns across multiple users, but still without requiring any of their actual text data.

Both models existed on the client application. They provided suggestions in real time as the user typed, with the local model adapting to the user's style and the global model offering a shared language context.
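One simple way to combine the two models at suggestion time is to interpolate their next-word distributions, leaning on the local model to capture personal style. The sketch below abstracts each LSTM behind a plain probability table; the function names and the mixing weight are illustrative assumptions, not details from the demo:

```python
def blend_predictions(local_probs: dict, global_probs: dict,
                      local_weight: float = 0.6) -> dict:
    """Interpolate next-word distributions from the local and
    global models; a higher local_weight favors personal style."""
    vocab = set(local_probs) | set(global_probs)
    return {
        word: local_weight * local_probs.get(word, 0.0)
              + (1 - local_weight) * global_probs.get(word, 0.0)
        for word in vocab
    }

def top_suggestion(blended: dict) -> str:
    return max(blended, key=blended.get)

# The local model has learned the user tends to sign off "regards";
# the global model favors the more generic "thanks".
local = {"regards": 0.8, "thanks": 0.2}
global_ = {"regards": 0.1, "thanks": 0.9}
print(top_suggestion(blend_predictions(local, global_)))  # regards
```

In a real client the two tables would come from the local and global LSTMs' softmax outputs over the current typing context.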

Federated Learning with Flower

The demo used the Flower framework to implement the federated learning process. Each client device would begin with a copy of the current global model. As users typed, their local models were trained on the input data. After a brief training cycle, the models shared only their learned weights with the coordinating server.

The server aggregated these updates using a basic federated averaging algorithm and then sent back the updated model parameters to all connected clients. This helped each client benefit from a broader training context while preserving the privacy of local inputs. The Flower framework made it easy to coordinate this exchange.

For the demo, this was kept very lightweight. A small set of simulated clients and a short training loop were used to make the process easy to follow.
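Flower handles this exchange in practice; stripped of the framework, the round structure of the demo can be simulated in a few lines. The client count, round count, and the toy "training" step below are all illustrative stand-ins:

```python
import random

def local_train(weights, steps=5):
    """Stand-in for a client's brief local training cycle:
    nudge each weight by a small random update."""
    return [w + random.uniform(-0.1, 0.1) for w in weights]

def run_rounds(num_clients=3, num_rounds=2, dim=4):
    global_weights = [0.0] * dim          # initial global model
    for _ in range(num_rounds):
        # Each client starts from the current global model,
        # trains locally, and shares only its updated weights.
        updates = [local_train(list(global_weights))
                   for _ in range(num_clients)]
        # The server aggregates with plain (unweighted) averaging.
        global_weights = [sum(ws) / num_clients for ws in zip(*updates)]
    return global_weights

random.seed(0)
print(run_rounds())
```

In the actual demo, `local_train` corresponds to a short LSTM training pass on the user's text, and the averaging step is performed by Flower's coordinating server.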

Decentralized Weight Sharing Through IPFS and Smart Contracts

To extend the architecture further, the demo included a decentralized component. After each federated round, the aggregated model weights were saved to IPFS, a peer-to-peer file storage protocol. IPFS assigned each model file a unique content hash, known as its content identifier (CID).

To allow clients to always find the latest model weights, a mapping was stored on the Sepolia testnet using a simple smart contract. The smart contract associated a fixed identifier with the current IPFS content identifier. Whenever a new set of weights was available, a new IPFS content identifier was generated and recorded on the blockchain.

This setup ensured that any client joining the network could fetch the latest global model without needing access to a centralized server. The model file itself was pinned using Pinata, which kept it persistently available on the IPFS network.
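The publish-and-fetch flow can be illustrated end to end with local stand-ins: a SHA-256 digest playing the role of the IPFS CID, and a one-entry dict playing the role of the Sepolia contract's fixed-key mapping. All names here are illustrative, not taken from the demo's contract:

```python
import hashlib
import json

ipfs_store = {}       # stand-in for IPFS: content-addressed storage
contract_state = {}   # stand-in for the smart contract's mapping

MODEL_KEY = "feda-global-model"   # the fixed on-chain identifier

def publish_weights(weights):
    """Store weights content-addressed, then record the new CID
    under the fixed key, as the demo did after each round."""
    blob = json.dumps(weights).encode()
    cid = hashlib.sha256(blob).hexdigest()   # IPFS derives a real CID
    ipfs_store[cid] = blob
    contract_state[MODEL_KEY] = cid          # update the on-chain pointer
    return cid

def fetch_latest_weights():
    """Any joining client: read the pointer, then fetch by CID."""
    cid = contract_state[MODEL_KEY]
    return json.loads(ipfs_store[cid])

publish_weights([0.1, 0.2])
publish_weights([0.3, 0.4])        # a new round overwrites the pointer
print(fetch_latest_weights())      # [0.3, 0.4]
```

The key property this mimics is indirection: the contract stores only a small, mutable pointer, while the immutable, content-addressed weight files live off-chain on IPFS.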

Looking Ahead

While the original demo was deliberately kept simple, it laid the groundwork for deeper research and more advanced implementations. A more sophisticated version of FEDA is currently underway. This next iteration will feature improved model architectures, a more scalable federated coordination setup, and better decentralization primitives, with a focus on real-world performance, privacy guarantees, and extensibility across domains beyond autocompletion.