Devcon AI Knowledge Twin (WhatsApp RAG Assistant with Real-Time Speaker Uploads)

1. One-Sentence Summary

A WhatsApp-based AI knowledge assistant for Devcon that uses RAG over ChromaDB and lets speakers and admins upload documents directly via WhatsApp for instant attendee access.


2. What Is This Discussion About?

A conversational AI system, accessible to all attendees on WhatsApp, that automatically learns from speaker uploads and Devcon content in real time, enabling accurate, grounded answers about sessions, speakers, slides, and technical topics.


3. Abstract

The Devcon AI Knowledge Twin is an open-source, WhatsApp-accessible assistant that uses Retrieval-Augmented Generation (RAG) over ChromaDB to provide real-time, citation-backed answers to attendee questions.
A key innovation is that speakers and organizers can upload PDFs, slides, text, or links directly via WhatsApp; these are ingested into the vector database immediately and become available to attendees.

The assistant acts as a dynamic, always-current knowledge layer for Devcon.


4. Motivation

Attendees often miss important information because:

  • Slides are uploaded late or scattered across platforms

  • Agendas change

  • New announcements don’t reach everyone

  • Websites/apps require time and attention that people don’t have during events

WhatsApp is universally accessible, and RAG keeps answers accurate and grounded in official, up-to-date content.
Speaker-driven uploads eliminate friction, enabling Devcon to provide real-time information at scale.


5. Specification

5.1 Attendee Features

  • Natural-language Q&A about:

    • Agenda, schedule updates, room changes

    • Speaker backgrounds

    • Slide contents + talk summaries

    • ZK, MEV, Rollups, staking, and other technical subjects

  • Personalized session recommendations

  • Daily digest

  • Citation-backed answers (source, timestamp)


5.2 Speaker/Admin Upload Features

Speakers and organizers can upload via WhatsApp:

  • PDFs

  • PPTX

  • Notes or text

  • Links

  • Last-minute announcements

  • Session updates

  • Scripts, diagrams, images (OCR extractable)

Automated Pipeline:

  1. Verify speaker/admin number

  2. Extract text (PDF → text, PPTX/DOCX → text, OCR if needed)

  3. Clean, chunk, and embed

  4. Insert into ChromaDB with metadata (speaker, track, timestamp, type)

  5. RAG system updates instantly

  6. Users receive updated factual answers

This reduces friction and makes the assistant instantly up-to-date.
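
The pipeline steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's implementation: the allowlist `ALLOWED_SENDERS`, the helpers `is_authorized`, `chunk_text` and `build_records`, the chunk sizes, and the phone numbers are all hypothetical. Text extraction, embedding, and the actual ChromaDB insert are left out; the records are simply shaped as an id, a document chunk, and the metadata fields named in step 4.

```python
from datetime import datetime, timezone

# Hypothetical allowlist of verified speaker/admin numbers (step 1).
ALLOWED_SENDERS = {"+10000000001", "+10000000002"}


def is_authorized(sender: str) -> bool:
    """Step 1: only whitelisted numbers may upload."""
    return sender in ALLOWED_SENDERS


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Step 3: normalize whitespace, then cut overlapping character windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    cleaned = " ".join(text.split())
    step = size - overlap
    chunks = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + size])
        if start + size >= len(cleaned):
            break  # the last window already reached the end of the text
    return chunks


def build_records(sender, doc_id, text, speaker, track, doc_type):
    """Step 4: pair each chunk with the metadata listed in the pipeline."""
    if not is_authorized(sender):
        raise PermissionError(f"{sender} is not an approved uploader")
    timestamp = datetime.now(timezone.utc).isoformat()
    return [
        {
            "id": f"{doc_id}-{i}",
            "document": chunk,
            "metadata": {
                "speaker": speaker,
                "track": track,
                "timestamp": timestamp,
                "type": doc_type,
            },
        }
        for i, chunk in enumerate(chunk_text(text))
    ]
```

In a real deployment the returned records would be embedded and written to the vector store; rejecting unauthorized senders before any parsing keeps unverified content out of the index entirely.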


5.3 WhatsApp Bot Structure

User Bot (public)

  • Primary interface for attendees

  • Answers questions using RAG

  • Sends daily summaries (opt-in)

  • Shares speaker slides or documents on demand

  • Answers only from retrieved content and declines when nothing relevant is found (strict grounding)
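
One way to read "strict grounding" is that the bot answers only when retrieval returns a sufficiently similar chunk, and otherwise says it does not know. The sketch below illustrates that idea in plain Python. The toy bag-of-words `embed`, the `MIN_SCORE` threshold, and the sample chunks are hypothetical stand-ins for a real embedding model and a ChromaDB query, and the final answer-generation step is omitted.

```python
import math
from collections import Counter

MIN_SCORE = 0.3  # hypothetical similarity threshold; would need tuning

# Made-up sample chunks carrying the citation fields (source, timestamp).
CHUNKS = [
    {"text": "The rollups workshop moved to room B at 14:00",
     "source": "agenda-update", "timestamp": "2025-01-01T09:00:00Z"},
    {"text": "MEV talk slides cover proposer builder separation",
     "source": "mev-slides.pdf", "timestamp": "2025-01-01T08:00:00Z"},
]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def answer(question: str) -> dict:
    """Return the best-matching chunk with its citation, or refuse."""
    q = embed(question)
    best = max(CHUNKS, key=lambda c: cosine(q, embed(c["text"])))
    score = cosine(q, embed(best["text"]))
    if score < MIN_SCORE:
        return {"grounded": False,
                "text": "I could not find this in the Devcon content."}
    return {"grounded": True, "text": best["text"],
            "citation": f'{best["source"]} ({best["timestamp"]})'}
```

A threshold like this trades coverage for reliability: set too high, the bot refuses valid questions; set too low, it may answer from loosely related content.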

Admin Bot (private)

  • Accepts content uploads

  • Supports commands such as:

    • upload: ingest a file

    • delete: remove outdated content

    • list: show indexed documents

    • announce: broadcast update

  • Designed for simplicity and accessibility
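
The admin commands listed above could be parsed with a small helper along these lines. This is only a sketch: the proposal names the commands but not their syntax, so the `command argument` message format, the `AdminCommand` type, the `parse_admin_command` function, and the choice of which commands require an argument are all assumptions.

```python
from dataclasses import dataclass

# Commands named in the proposal; the argument conventions are assumed.
KNOWN_COMMANDS = {"upload", "delete", "list", "announce"}
NEEDS_ARGUMENT = {"delete", "announce"}


@dataclass
class AdminCommand:
    name: str
    argument: str = ""


def parse_admin_command(message: str) -> AdminCommand:
    """Split 'name rest-of-message' and validate against the known commands."""
    name, _, argument = message.strip().partition(" ")
    name = name.lower().rstrip(":")  # accept both "delete x" and "delete: x"
    argument = argument.strip()
    if name not in KNOWN_COMMANDS:
        raise ValueError(f"unknown command: {name!r}")
    if name in NEEDS_ARGUMENT and not argument:
        raise ValueError(f"{name} requires an argument")
    return AdminCommand(name, argument)
```

For example, `parse_admin_command("delete talk42-slides.pdf")` yields a `delete` command whose argument is the document name, while an unrecognized word raises `ValueError` so the bot can reply with a help message.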


5.4 Backend & Architecture

[WhatsApp Cloud API]
        |
      Webhooks
        |
  -----------------
  | Message Router |
  -----------------
     /           \
User Query     Admin Upload
   |               |
 RAG Engine   Ingestion Pipeline
   |               |
  ———— ChromaDB Vector Store ————
        (with metadata)
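
The routing step in the diagram can be illustrated with a small function that sends admin uploads to ingestion and everything else to the RAG engine. The flat message shape (`sender`, `kind`, `body`), the `ADMIN_NUMBERS` set, and the route names are hypothetical; a real WhatsApp Cloud API webhook payload is more deeply nested and would need to be unpacked into something like this first.

```python
from typing import TypedDict


class Message(TypedDict):
    sender: str  # phone number of the sender
    kind: str    # assumed labels such as "text", "document", "image"
    body: str    # message text, or a reference to the uploaded file


ADMIN_NUMBERS = {"+10000000001"}  # hypothetical allowlist
UPLOAD_KINDS = {"document", "image"}


def route(message: Message) -> str:
    """Return which backend path should handle the message."""
    is_admin = message["sender"] in ADMIN_NUMBERS
    if is_admin and message["kind"] in UPLOAD_KINDS:
        return "ingestion_pipeline"
    # Everything else, including files from unverified senders, takes the
    # query path, so unverified uploads are never ingested.
    return "rag_engine"
```

Keeping the router this thin means the same two backend paths could sit behind a different messaging client without changes.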

Tech Components

  • Official WhatsApp Cloud API

  • ChromaDB (vector search)

  • Embedding model

  • Backend (FastAPI/Flask)

  • Storage for original uploaded files

  • Automatic parsing + chunking pipeline


6. Rationale

Attendees benefit because:

  • No apps required

  • Works in low connectivity

  • Answers are immediate and accurate

  • Content stays fresh and comprehensive

  • Useful before, during, and after Devcon

Speakers benefit because:

  • They can upload directly via WhatsApp

  • No need for additional platforms

  • Their audience gets updated information instantly

Devcon benefits because:

  • Strong alignment with open-source public goods

  • Improves attendee experience dramatically

  • Supports education and accessibility

  • Reusable for future events


7. Risks & Mitigations

| Risk | Mitigation |
| --- | --- |
| Unverified content uploads | Whitelisted speaker/admin numbers |
| Parsing errors | Multiple extraction pipelines + fallback OCR |
| Hallucination | Strict retrieval-only responses |
| Outdated data | Simple remove/edit commands for admins |
| Vector store corruption | Automatic periodic backups |

8. License

Open-source under MIT or Apache 2.0.


9. Conclusion

The Devcon AI Knowledge Twin transforms Devcon into a real-time, searchable knowledge experience available through WhatsApp, one of the most widely used messaging platforms in the world.
By letting speakers upload documents directly via WhatsApp and ingesting them automatically into ChromaDB, Devcon gets a living knowledge layer that benefits every attendee.


I am really unsure about the use of WhatsApp. It is very centralized tech (by Meta), and I think we should use more open technology such as [matrix], Session, etc.

The centralization concern makes sense, but the interface is merely a delivery layer and not where the intelligence resides. The ingestion pipeline, embeddings, ChromaDB, and RAG system are completely open, portable, and self-hosted. The knowledge twin remains functional even if the client layer is swapped out.

For events of Devcon's size, adoption is more important than ideology. During a conference, people will realistically only use a low-friction, zero-install interface. Matrix or Session can still be introduced later, since they would talk to the same RAG backend.

Instantaneous ChromaDB updates and seamless speaker uploads—rather than the message surface—are the true innovations.

I strongly disagree with this. Especially for Devcon, it is really important to stick to values, no matter the size.

Instantaneous ChromaDB updates and seamless speaker uploads—rather than the message surface—are the true innovations.

Then let’s focus on this rather than the WhatsApp interface. Maybe we can use some of this. Where can I find the source code?

Thanks for the submission - good to see some early contributions!

The idea is not bad (it is in fact very close to what we already shipped for Devcon SEA), but I think this is a hard sell for a few reasons:

  1. We already have most of what this DIP offers planned (and somewhat implemented): we have experience building a RAG pipeline, and already have an integration with speaker and session data.

  2. Getting all our knowledge sources tied into an external system with external devs is complex to manage

  3. Speaker data being updated in WhatsApp is an indirection we do not need; we already have an internal system for managing speaker data. I don’t see a compelling reason to add another; it would just cause inconsistencies

  4. I believe this feature belongs inside the Devcon app, not on WhatsApp
