Quick Start¶

This guide covers the essential features of RapidAI in 10 minutes.

Table of Contents¶

Basic Chatbot
Streaming Responses
Conversation Memory
Caching
Multiple LLM Providers

Basic Chatbot¶

The simplest possible chatbot:

from rapidai import App, LLM

app = App()
llm = LLM("claude-3-haiku-20240307")

@app.route("/chat", methods=["POST"])
async def chat(message: str):
    return await llm.chat(message)

if __name__ == "__main__":
    app.run()

Non-Streaming

Without @stream, responses are returned all at once.

Streaming Responses¶

Add real-time streaming with one decorator:

from rapidai import App, LLM, stream

app = App()
llm = LLM("claude-3-haiku-20240307")

@app.route("/chat", methods=["POST"])
@stream
async def chat(message: str):
    response = await llm.chat(message, stream=True)
    async for chunk in response:
        yield chunk

if __name__ == "__main__":
    app.run()

Streaming Enabled

Responses now stream via Server-Sent Events (SSE)

Test with curl:

curl -N -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me a story"}'

Conversation Memory¶

Add stateful conversations:

from rapidai import App, LLM, stream

app = App()
llm = LLM("claude-3-haiku-20240307")

@app.route("/chat", methods=["POST"])
@stream
async def chat(user_id: str, message: str):
    # Get user's conversation memory
    memory = app.memory(user_id)
    history = memory.to_dict_list()

    # Chat with context
    response = await llm.chat(message, history=history, stream=True)
    async for chunk in response:
        yield chunk

    # Save to memory
    memory.add("user", message)
    memory.add("assistant", "[full response here]")

if __name__ == "__main__":
    app.run()

Test context awareness:

# First message
curl -X POST http://localhost:8000/chat \
  -d '{"user_id": "alice", "message": "My name is Alice"}'

# Second message - bot remembers!
curl -X POST http://localhost:8000/chat \
  -d '{"user_id": "alice", "message": "What is my name?"}'

Memory Backends

Default is in-memory. Use Redis or PostgreSQL for production.

Caching¶

Save money and time with automatic caching:

from rapidai import App, LLM, cache

app = App()
llm = LLM("claude-3-haiku-20240307")

@app.route("/summarize", methods=["POST"])
@cache(ttl=3600)  # Cache for 1 hour
async def summarize(text: str):
    prompt = f"Summarize: {text}"
    return await llm.chat(prompt)

if __name__ == "__main__":
    app.run()

Smart Caching

Identical requests return cached results instantly!

Multiple LLM Providers¶

Use different models for different tasks:

from rapidai import App, LLM

app = App()

# Different models for different purposes
fast_llm = LLM("claude-3-haiku-20240307")  # Fast & cheap
smart_llm = LLM("claude-3-sonnet-20240229")  # More capable

@app.route("/quick", methods=["POST"])
async def quick_answer(question: str):
    """Fast responses for simple questions."""
    return await fast_llm.chat(question)

@app.route("/deep", methods=["POST"])
async def deep_analysis(question: str):
    """Detailed analysis for complex questions."""
    return await smart_llm.chat(question)

if __name__ == "__main__":
    app.run()

Switch Providers Easily¶

# Anthropic Claude
llm = LLM("claude-3-haiku-20240307")

# OpenAI GPT
llm = LLM("gpt-4o-mini")

# Cohere
llm = LLM("command-r")

# Auto-detection works!

Complete Example¶

Put it all together:

production_app.py

from rapidai import App, LLM, stream, cache

app = App(title="AI Assistant", version="1.0.0")
llm = LLM("claude-3-haiku-20240307")

# Streaming chat with memory
@app.route("/chat", methods=["POST"])
@stream
async def chat(user_id: str, message: str):
    memory = app.memory(user_id)
    history = memory.to_dict_list()

    response = await llm.chat(message, history=history, stream=True)
    async for chunk in response:
        yield chunk

    memory.add("user", message)
    memory.add("assistant", "[response]")

# Cached summarization
@app.route("/summarize", methods=["POST"])
@cache(ttl=3600)
async def summarize(text: str):
    return await llm.chat(f"Summarize: {text}")

# Health check
@app.route("/health", methods=["GET"])
async def health():
    return {"status": "healthy"}

# Clear user memory
@app.route("/clear", methods=["POST"])
async def clear(user_id: str):
    memory = app.memory(user_id)
    memory.clear()
    return {"status": "cleared"}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)

Configuration¶

Create rapidai.yaml for advanced configuration:

rapidai.yaml

app:
  name: my-ai-app
  debug: false

llm:
  default_provider: anthropic
  default_model: claude-3-haiku-20240307
  temperature: 0.7
  max_tokens: 4000

cache:
  enabled: true
  backend: redis
  ttl: 3600
  redis_url: redis://localhost:6379

memory:
  backend: redis
  max_history: 10

Next Steps¶

Ready for More?

You now know the core features of RapidAI!

Continue Learning:

Deploy:

Start Tutorial →