AI Chatbot: Crafting with Precision & Personality

Published on 15 Jan, 2025

In the realm of artificial intelligence, developing an AI chatbot that not only delivers accurate information but also embodies a specific tone is both challenging and rewarding.

At OpenSense Labs, we embarked on such a project, which we have now renamed Project X. Our objective was to create an AI chatbot capable of understanding and responding in both English and Spanish, all while maintaining a professional and empathetic demeanor.

This article delves into the technical intricacies of our approach, highlighting the challenges we faced and the innovative solutions we implemented.

If you're exploring AI services or looking to expand your business with the help of AI, check out our services before you proceed.

Let's jump into the world of AI!

AI Chatbot: The Key Components

1. RAG Retriever:

For the retrieval component, we opted for the Chroma Vector Store, praised for its simplicity, efficient indexing, and strong retrieval capabilities. All documents related to Project X were first divided into smaller chunks based on their semantic similarity. These chunks were then converted into embeddings using the powerful Gemini Embeddings model. Both the embeddings and their associated metadata were stored in a Chroma database instance, serving as the foundation for our retrieval system.
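
The indexing flow described above can be sketched without any external services. This is a minimal illustration, not our production code: `split_into_chunks`, `toy_embed`, and `InMemoryIndex` are hypothetical stand-ins for semantic chunking, the Gemini Embeddings model, and a Chroma collection, and splitting on blank lines is an assumed simplification.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def split_into_chunks(document: str) -> list[str]:
    """Split on blank lines (a simple stand-in for semantic chunking)."""
    return [part.strip() for part in re.split(r"\n\s*\n", document) if part.strip()]


def toy_embed(text: str) -> Counter:
    """Hypothetical embedding: a sparse bag-of-words vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for a vector store collection."""
    records: list = field(default_factory=list)  # (chunk, metadata, embedding) tuples

    def add_document(self, document: str, source: str) -> None:
        # Store every chunk together with its metadata and embedding.
        for position, chunk in enumerate(split_into_chunks(document)):
            metadata = {"source": source, "position": position}
            self.records.append((chunk, metadata, toy_embed(chunk)))

    def query(self, text: str, k: int = 3) -> list:
        # Rank stored chunks by similarity to the query and keep the top k.
        q = toy_embed(text)
        ranked = sorted(self.records, key=lambda r: cosine(q, r[2]), reverse=True)
        return [(chunk, metadata) for chunk, metadata, _ in ranked[:k]]


document = (
    "Project X focuses on sustainable urban planning.\n\n"
    "Feedback is collected from testers after each chat."
)
index = InMemoryIndex()
index.add_document(document, source="overview.pdf")
top_chunk, top_metadata = index.query("What is sustainable urban planning?", k=1)[0]
```

Keeping the metadata next to each embedding is what lets a retrieved chunk be traced back to its source document.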

2. Response Generator:

User queries were also converted into vectors using the same Gemini Embeddings model. We then retrieved the top-k most relevant vectors through a nearest neighbor search. To provide a more informed context for generating responses, we included a few-shot example set, which was concatenated with the retrieved document chunks. For response generation, we used the Gemini-flash-8b language model, ensuring the tone and style aligned with the provided few-shot examples and the context established by the retrieved chunks.
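
As a rough sketch of the concatenation step, the prompt can be assembled from three parts: the few-shot examples, the retrieved chunks, and the user query. The function name `build_prompt`, the instruction wording, and the section labels are assumptions made for illustration; the actual prompt template is not reproduced here.

```python
def build_prompt(few_shot: list[tuple[str, str]], chunks: list[str], query: str) -> str:
    """Concatenate few-shot Q/A pairs, retrieved chunks, and the user query."""
    examples = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in few_shot)
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer in the same tone as the examples, using only the context.\n\n"
        f"Examples:\n{examples}\n\n"
        f"Context:\n{context}\n\n"
        f"Q: {query}\nA:"
    )


few_shot = [("What is Project X?", "Project X is an innovative initiative. Thank you for your question.")]
chunks = ["Project X focuses on sustainable urban planning."]
prompt = build_prompt(few_shot, chunks, "Who benefits from Project X?")
```

The resulting string would then be passed to the language model; ending on `A:` nudges the model to continue in the same Q/A pattern as the examples.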

3. Semantic Filter:

While the generated responses were generally accurate, they occasionally strayed from the desired tone and sentiment. To address this, we implemented a scoring mechanism based on predefined examples. If a response fell below the acceptance threshold, we would regenerate it by providing additional context to fine-tune the tone and ensure it met the required standards.
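
A score-then-regenerate loop of this kind might look like the sketch below. The scoring function is a deliberately crude assumption (word overlap with the reference examples) standing in for whatever similarity measure is used in practice, and `generate` is a hypothetical callable that wraps the language model.

```python
import re
from typing import Callable


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))


def tone_score(response: str, examples: list[str]) -> float:
    """Best Jaccard word overlap between the response and any reference example."""
    resp = _tokens(response)
    best = 0.0
    for example in examples:
        ref = _tokens(example)
        union = resp | ref
        if union:
            best = max(best, len(resp & ref) / len(union))
    return best


def generate_with_filter(
    generate: Callable[[str], str],
    prompt: str,
    examples: list[str],
    threshold: float = 0.2,
    max_attempts: int = 3,
) -> str:
    """Regenerate with extra tone context until the score clears the threshold."""
    hint = "\n\nMatch the tone of these reference answers:\n" + "\n".join(examples)
    response = generate(prompt)
    attempts = 1
    while tone_score(response, examples) < threshold and attempts < max_attempts:
        response = generate(prompt + hint)
        attempts += 1
    return response


calls = []


def fake_generate(prompt: str) -> str:
    # Hypothetical model: only answers politely once the tone hint is present.
    calls.append(prompt)
    if "Match the tone" in prompt:
        return "Thank you for your question. Happy to help with Project X."
    return "No."


examples = ["Thank you for your question. Project X is an initiative."]
result = generate_with_filter(fake_generate, "Q: What is Project X?\nA:", examples)
```

Capping the number of attempts keeps a persistently off-tone response from triggering unbounded model calls.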

AI Chatbot: Key Challenges and Innovative Solutions

1. Crafting Responses with a Specific Tone

Challenge: Many AI chatbots excel at providing factual information but often fail to convey personality. Our client emphasized the need for responses that were not only accurate but also professional and empathetic.

Solution:

Few-Shot Prompting with Careful Example Curation: We developed examples of desired responses in both English and Spanish, embedding them as few-shot examples within the prompt to establish a baseline tone.

# Few-shot example in prompt
prompt = """
Q: What is Project X?
A: Project X is an innovative initiative focused on sustainable urban planning. Its primary goal is to...
Thank you for your question.
"""

Re-generation and Feedback Loop: An auto-check mechanism was implemented to identify responses that deviated from the intended tone. If such deviations were detected, responses were regenerated using adjusted prompts. Human reviewers provided feedback via the Chainlit UI to further refine tone consistency.

2. Multilingual Query Understanding and Response

Challenge: The AI chatbot needed to handle queries in both English and Spanish, accurately detecting the language and responding accordingly.

Solution:

Enhanced Language Detection Using Lingua: We utilized the lingua library to build a language detector specifically tuned for English and Spanish. Queries containing words from both languages were categorized based on the dominant language.

from lingua import Language, LanguageDetectorBuilder

# Initialize a detector restricted to English and Spanish
detector = LanguageDetectorBuilder.from_languages(Language.ENGLISH, Language.SPANISH).build()

# Detect the dominant language of a mixed query
query = "¿Qué es Project X? How does it help with urban planning?"
language = detector.detect_language_of(query)  # a Language member, or None if undetermined

Separate RAG Pipelines: Each language had its own Chroma DB instance. English and Spanish documents were embedded separately to ensure the most relevant context retrieval.
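
The routing between the two pipelines can be sketched as a lookup keyed by the detected language. Everything named here is illustrative: `toy_detect` is a hypothetical keyword-based stand-in for the Lingua detector, and the two lambdas stand in for the English and Spanish Chroma retrievers.

```python
import re
from typing import Callable

SPANISH_HINTS = {"qué", "es", "el", "la", "los", "cómo", "de", "para", "con"}
ENGLISH_HINTS = {"what", "is", "the", "how", "does", "of", "for", "with"}


def toy_detect(query: str) -> str:
    """Hypothetical detector: pick the language with more hint-word matches."""
    tokens = re.findall(r"\w+", query.lower())
    spanish = sum(t in SPANISH_HINTS for t in tokens)
    english = sum(t in ENGLISH_HINTS for t in tokens)
    return "es" if spanish > english else "en"


def make_router(
    detect: Callable[[str], str],
    retrievers: dict[str, Callable[[str], list[str]]],
) -> Callable[[str], tuple[str, list[str]]]:
    """Build a function that sends each query to the retriever for its language."""
    def route(query: str) -> tuple[str, list[str]]:
        language = detect(query)
        return language, retrievers[language](query)
    return route


retrievers = {
    "en": lambda q: ["English chunk about Project X"],
    "es": lambda q: ["Fragmento en español sobre Project X"],
}
route = make_router(toy_detect, retrievers)
language, chunks = route("¿Qué es Project X?")
```

Returning the detected language alongside the chunks lets the generation step pick the matching few-shot examples as well.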

3. Balancing Diverse Data Sources

Challenge: The dataset included structured Q&A pairs with the correct tone, factual datasets with accurate but tone-agnostic answers, and PDFs with extensive details about Project X. Integrating these sources posed challenges in data volume and contextual relevance.

Solution:

Data Chunking and Embedding: We parsed PDFs into smaller chunks using LangChain’s document loaders, ensuring each chunk retained semantic coherence. These chunks were then embedded and stored in Chroma DB for efficient retrieval.

from langchain_community.document_loaders import PyPDFLoader

# Load the PDF and split it into chunks
loader = PyPDFLoader('project_x_details.pdf')
documents = loader.load_and_split()

# Embed and store in Chroma DB
# (embed_documents and store_in_chroma_db are helper functions not shown here)
embeddings = embed_documents(documents)
store_in_chroma_db(embeddings)

Hybrid Retrieval Strategy: Queries were processed through RAG pipelines that combined factual embeddings and tone-specific Q&A pairs.

def retrieve_context(query):
    # Retrieve context from PDFs
    pdf_context = retrieve_from_chroma_db(query)
    # Retrieve tone-specific Q&A
    qa_context = retrieve_from_qa_db(query)
    # Combine contexts
    combined_context = combine_contexts(pdf_context, qa_context)
    return combined_context
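
The `combine_contexts` helper is not shown above. One plausible shape for it, offered purely as an assumption, is to interleave the tone-specific Q&A chunks with the factual PDF chunks, drop duplicates, and cap the total so the prompt stays bounded. The `max_chunks` parameter is hypothetical.

```python
from itertools import zip_longest


def combine_contexts(pdf_context: list[str], qa_context: list[str], max_chunks: int = 6) -> list[str]:
    """Interleave Q&A and PDF chunks, skipping duplicates, up to max_chunks."""
    combined: list[str] = []
    seen: set[str] = set()
    for pair in zip_longest(qa_context, pdf_context):
        for chunk in pair:
            if chunk is None:
                continue
            # Normalize case and whitespace so near-identical chunks count once.
            key = " ".join(chunk.lower().split())
            if key in seen:
                continue
            seen.add(key)
            combined.append(chunk)
    return combined[:max_chunks]


merged = combine_contexts(
    pdf_context=["A fact.", "B fact."],
    qa_context=["Q: x A: y", "a fact."],
)
```

Interleaving rather than appending one list to the other means that truncating to `max_chunks` still leaves both kinds of context represented.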

4. Integrating Feedback for Continuous Improvement

Challenge: A feedback-driven learning loop was essential for refining tone, factual accuracy, and multilingual performance.

Solution:

Real-Time Feedback Collection via Chainlit: After each interaction, testers could rate responses and provide comments. Feedback was stored in Google BigQuery, creating a repository for model evaluation.
Feedback-Based Fine-Tuning: Low-rated responses were analyzed to identify common issues (e.g., tone deviation, incomplete answers). Frequent issues were addressed by refining prompt templates or updating the few-shot examples.
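
The analysis step can be illustrated with a small aggregation over feedback records. This sketch uses an in-memory list rather than BigQuery, and the record fields, the 1 to 5 rating scale, and the issue tags are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    query: str
    response: str
    rating: int                  # assumed scale: 1 (poor) to 5 (excellent)
    issue: Optional[str] = None  # hypothetical reviewer tag, e.g. "tone_deviation"


def frequent_issues(feedback: list[Feedback], max_rating: int = 2, min_count: int = 2) -> list[tuple[str, int]]:
    """Count issue tags on low-rated responses and keep the recurring ones."""
    counts = Counter(f.issue for f in feedback if f.rating <= max_rating and f.issue)
    return [(issue, n) for issue, n in counts.most_common() if n >= min_count]


records = [
    Feedback("q1", "r1", rating=1, issue="tone_deviation"),
    Feedback("q2", "r2", rating=2, issue="tone_deviation"),
    Feedback("q3", "r3", rating=2, issue="incomplete_answer"),
    Feedback("q4", "r4", rating=5, issue="tone_deviation"),
    Feedback("q5", "r5", rating=4),
]
issues = frequent_issues(records)
```

Issues that clear the `min_count` bar are the candidates for a prompt template change or a new few-shot example.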

5. Addressing Looping Issues in Gemini Due to Large Context

Challenge: While leveraging Gemini’s long-context capabilities, we encountered instances where the model entered repetitive loops, especially when processing large contexts.

Solution:

Context Caching Optimization: To mitigate looping issues, we implemented context caching, which allows the model to process large contexts more efficiently. The caching had to be set up carefully, though: we observed that an improper implementation could itself cause the model to get stuck in loops.
Monitoring and Adjusting Context Length: We carefully monitored the length of the context provided to Gemini, ensuring it was within optimal limits to prevent the model from generating repetitive sequences.
Error Handling and Retry Mechanisms: We implemented robust error handling and retry mechanisms to detect and recover from instances where the model entered a loop, ensuring continuous and accurate responses.
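
One way to detect and recover from a loop is sketched below: flag a response when the same run of words repeats several times, then retry with a shorter context. The n-gram heuristic, the thresholds, and the halving strategy are assumptions for illustration, and `generate` is a hypothetical callable wrapping the model. The sketch also assumes the chunks are ordered from most to least relevant.

```python
from collections import Counter
from typing import Callable


def has_repetition_loop(text: str, ngram: int = 4, max_repeats: int = 3) -> bool:
    """Return True if any run of `ngram` words occurs at least `max_repeats` times."""
    words = text.split()
    counts = Counter(tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1))
    return any(n >= max_repeats for n in counts.values())


def generate_with_retry(
    generate: Callable[[list[str], str], str],
    chunks: list[str],
    query: str,
    max_attempts: int = 3,
) -> str:
    """Retry with a shorter context whenever the response looks like a loop."""
    context = list(chunks)
    for _ in range(max_attempts):
        response = generate(context, query)
        if not has_repetition_loop(response):
            return response
        # Keep only the most relevant half of the context before retrying.
        context = context[: max(1, len(context) // 2)]
    raise RuntimeError("model kept looping after all retry attempts")


LOOPY = "the plan is " * 5


def fake_generate(context: list[str], query: str) -> str:
    # Hypothetical model that loops whenever it receives more than two chunks.
    return LOOPY if len(context) > 2 else "Project X supports urban planning."


answer = generate_with_retry(fake_generate, ["c1", "c2", "c3", "c4"], "What is Project X?")
```

Raising after the final attempt, instead of returning the looping text, leaves it to the caller to decide on a fallback message.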

Figure: Project X chatbot architecture

AI Chatbot: Future Enhancements In AI Agents for Smarter Conversations

Looking ahead, we aim to enhance Project X by integrating AI agents that will further personalize and refine the user experience. Here’s a glimpse of some upcoming improvements:

1. AI-Powered Personalization

Context-Aware Agents: AI agents will track user interactions, preferences, and previous conversations to tailor responses even more effectively. These agents will be able to dynamically adjust their behavior based on user history, providing a truly personalized experience.
Sentiment Analysis and Adaptive Tone: Future agents will use real-time sentiment analysis to adjust the tone of responses, ensuring they align more closely with user emotions and intentions. This adaptive approach will create a more empathetic and engaging conversational flow.

2. Proactive Recommendations

AI Agents as Proactive Assistants: By leveraging user behavior and historical queries, AI agents will proactively suggest helpful information, guides, or resources based on the context of the conversation. This would add a layer of intelligence where the system anticipates user needs.

3. Cross-Platform Consistency

Seamless Multimodal Experiences: With the introduction of multimodal AI agents, we aim to make Project X accessible not only through text-based chat but also via voice, video, and other media. The AI agents will provide consistent and fluid interactions across various platforms.

Key Takeaways

Personalization in AI Chatbots Matters: Crafting a tone-specific AI chatbot requires iterative refinement of prompts and constant feedback integration.
Language Adaptability is Key: Addressing bilingual or multilingual audiences effectively enhances user engagement.
Dynamic Data Handling Enables Scalability: Integrating and retrieving diverse data sources ensures the AI chatbot remains contextually accurate.

This project underscored the importance of combining technical ingenuity with user-centric design. Our solution not only met client expectations but also laid the groundwork for building smarter, more nuanced conversational AI chatbot systems.

Have unique challenges in your AI chatbot projects? Let’s collaborate to solve them!
