Getting Started
This guide walks you through setting up Omelette from scratch.
Prerequisites
- Conda or Miniconda
- Node.js 22+
- (Optional) CUDA for GPU-accelerated OCR and embeddings
- (Optional) API keys: OpenAI, Anthropic, Aliyun Bailian, or Volcengine for LLM; Semantic Scholar for higher search limits
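The required tools can be checked before continuing. The following is a minimal sketch, not part of Omelette itself: the function names `node_major` and `check_prerequisites` are hypothetical, and it assumes `conda` and `node` are expected on `PATH`.

```python
import re
import shutil
import subprocess
from typing import Dict, Optional

MIN_NODE_MAJOR = 22  # minimum Node.js major version from the prerequisites list


def node_major(version_output: str) -> Optional[int]:
    """Extract the major version from `node --version` output such as 'v22.3.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def check_prerequisites() -> Dict[str, bool]:
    """Report whether conda and a new-enough Node.js are found on PATH."""
    results = {"conda": shutil.which("conda") is not None, "node": False}
    node = shutil.which("node")
    if node is not None:
        output = subprocess.run(
            [node, "--version"], capture_output=True, text=True, check=False
        ).stdout
        major = node_major(output)
        results["node"] = major is not None and major >= MIN_NODE_MAJOR
    return results


if __name__ == "__main__":
    for tool, ok in check_prerequisites().items():
        print(f"{tool}: {'ok' if ok else 'missing or too old'}")
```

The optional items (CUDA, API keys) are deliberately not checked here, since the setup works without them.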
1. Clone the Repository
git clone git@github.com:sylvanding/omelette.git
cd omelette
2. Set Up Conda Environment
conda env create -f environment.yml
conda activate omelette
3. Configure Environment
cp .env.example .env
# Edit .env with your API keys (LLM, Semantic Scholar, etc.)
To test without API keys, set LLM_PROVIDER=mock.
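Switching the provider to `mock` can also be scripted. The sketch below is illustrative only: `set_env_var` is a hypothetical helper, and it assumes `.env` uses plain `KEY=value` lines.

```python
def set_env_var(env_text: str, key: str, value: str) -> str:
    """Return env_text with KEY=value set, replacing an existing line or appending one."""
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Only uncommented "KEY=..." lines are rewritten; comments stay untouched.
        if line.strip().startswith(f"{key}="):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    from pathlib import Path

    env_path = Path(".env")  # created by `cp .env.example .env`
    if env_path.exists():
        updated = set_env_var(env_path.read_text(), "LLM_PROVIDER", "mock")
        env_path.write_text(updated)
```

Editing the file by hand is equivalent; the helper is mainly useful in setup scripts or CI.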
4. Start the Backend
cd backend
pip install -e ".[dev]"
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Note: Alembic migrations run automatically on startup, so the database schema is brought up to date before the server accepts requests.
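To confirm that the backend is accepting connections on port 8000, a plain TCP check is enough. This is a minimal sketch using only the standard library; `is_listening` is a hypothetical helper, and it only verifies that the port is open, not that the application is healthy.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "up" if is_listening() else "not reachable"
    print(f"Backend on port 8000 is {state}")
```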
5. Start the Frontend
In a new terminal:
cd frontend
npm install
npm run dev
6. Open the Application
Visit http://localhost:3000.
LLM Provider Options
Set LLM_PROVIDER in .env to one of:
| Provider | Description |
|---|---|
| openai | OpenAI API (GPT-4, etc.); set OPENAI_API_KEY, OPENAI_MODEL |
| anthropic | Anthropic Claude; set ANTHROPIC_API_KEY, ANTHROPIC_MODEL |
| aliyun | Aliyun Bailian (Coding Plan); set ALIYUN_API_KEY, ALIYUN_BASE_URL, ALIYUN_MODEL |
| volcengine | Volcengine Doubao; set VOLCENGINE_API_KEY, VOLCENGINE_BASE_URL, VOLCENGINE_MODEL |
| ollama | Local Ollama; set OLLAMA_BASE_URL, OLLAMA_MODEL |
| mock | No real LLM calls; returns preset responses for testing |
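The table above can be turned into a quick sanity check for `.env`. The sketch below is an assumption-laden illustration, not Omelette's own validation: `parse_env` and `missing_vars` are hypothetical helpers, and it treats every variable listed for a provider as required, although some may have defaults in the actual application.

```python
from typing import Dict, List

# Variables listed per provider in the table above (assumed to all be required).
REQUIRED_VARS: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY", "OPENAI_MODEL"],
    "anthropic": ["ANTHROPIC_API_KEY", "ANTHROPIC_MODEL"],
    "aliyun": ["ALIYUN_API_KEY", "ALIYUN_BASE_URL", "ALIYUN_MODEL"],
    "volcengine": ["VOLCENGINE_API_KEY", "VOLCENGINE_BASE_URL", "VOLCENGINE_MODEL"],
    "ollama": ["OLLAMA_BASE_URL", "OLLAMA_MODEL"],
    "mock": [],
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_vars(env: Dict[str, str]) -> List[str]:
    """List the variables the selected provider needs that are unset or empty."""
    provider = env.get("LLM_PROVIDER", "")
    if provider not in REQUIRED_VARS:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]
```

For example, `missing_vars(parse_env(open(".env").read()))` returns an empty list when the selected provider is fully configured.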
MCP Usage
Omelette exposes an MCP server for AI IDEs (Claude Code, Codex, Cursor). To connect:
1. Start the backend so the MCP endpoint is available at http://localhost:8000/mcp.
2. Configure your AI IDE to add Omelette as an MCP server:
   - Claude Code / Codex: Add a remote MCP server with URL http://localhost:8000/mcp
   - Cursor: Add Omelette in MCP settings with the same URL
3. Once connected, you can use tools like search_knowledge_base, lookup_paper, and add_paper_by_doi directly from your AI assistant.
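For IDEs that read MCP servers from a JSON file, the entry can be generated rather than typed. The sketch below assumes the `mcpServers` layout with a `url` field, which Cursor uses for remote servers in its `mcp.json`; the server name `omelette` and the helper `render_mcp_config` are hypothetical, so check your IDE's documentation for the exact format it expects.

```python
import json

MCP_URL = "http://localhost:8000/mcp"  # endpoint given in the steps above


def render_mcp_config(name: str = "omelette", url: str = MCP_URL) -> str:
    """Build a JSON snippet in the assumed `mcpServers` layout."""
    config = {"mcpServers": {name: {"url": url}}}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Print the snippet so it can be pasted into the IDE's MCP settings file.
    print(render_mcp_config())
```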
Optional: OCR and Embeddings
For full OCR and embedding support:
conda activate omelette
cd backend
pip install -e ".[ocr,ml]"
- OCR: PaddleOCR (GPU recommended via paddlepaddle-gpu)
- Embeddings: sentence-transformers with BAAI/bge-m3 (the model downloads on first use)
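After installing the extras, it can help to confirm that the optional packages are importable without actually loading them. This is a minimal sketch: it assumes the extras provide the `paddleocr` and `sentence_transformers` import names (the usual names for PaddleOCR and sentence-transformers), and `available` is a hypothetical helper.

```python
from importlib.util import find_spec
from typing import Dict

# Feature label -> top-level import name (assumed; see the note above).
OPTIONAL_MODULES: Dict[str, str] = {
    "OCR": "paddleocr",
    "Embeddings": "sentence_transformers",
}


def available(modules: Dict[str, str]) -> Dict[str, bool]:
    """Map each feature label to whether its module can be found, without importing it."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


if __name__ == "__main__":
    for feature, ok in available(OPTIONAL_MODULES).items():
        print(f"{feature}: {'installed' if ok else 'not installed'}")
```

Using `find_spec` avoids the slow import and any model download that a real import or first use might trigger.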
Running Tests
cd backend
pytest tests/ -v
The test suite includes 178 tests covering API endpoints, services, and pipelines.
