Developer guide for the Health Check Recommendation system. Multi-agent RAG architecture using AgentOS framework, DeepSeek V3, ChromaDB, and BAAI/bge-base-zh embeddings. Web UI via Streamlit.
config/ - Settings and configuration (settings.py)
src/ - Core application logic
src/agents/ - Multi-agent system (coordinator, 4 specialized agents)
hcr.py - Recommendation engine (dual-mode)
tools.py - Agent tool definitions
prompt.py - System prompts
vectorstore.py - Vector store builder
report.py - Report generation
ontology.py - Medical ontology (synonym expansion)
query_decomposer.py- Query decomposition
agentos/ - AgentOS framework (internal, not a third-party package)
agent/ - Base agent with reason-act loop
rag/ - RAG components (store, embedding, bm25, hybrid, rerank, split)
utils/ - DeepSeek API wrapper
memory/ - Agent memory management
prompt/ - Prompt templates
tools/ - Framework tool infrastructure
eval/ - Evaluation framework
evaluator.py - Base evaluator
retrieval_eval.py - Retrieval metrics
recommendation_eval.py - Recommendation metrics
safety_eval.py - Safety evaluation
run_eval.py - Full evaluation runner
ablation_study.py - Ablation study (6 configs)
test_cases.json - Labeled test cases
scripts/ - Utility scripts
generate_synthetic_data.py - Synthetic data generation
validate_synthetic_data.py - Data validation
test_deepseek_api.py - API connectivity test
web/ - Streamlit frontend
pages/ - Recommend, Chatbot, Hospitals, Report
data/ - CSV, PDF, JSON data files
test/ - Manual test scripts (hcr_test.py)
vectordb/ - ChromaDB vector store directories (generated)
# Install dependencies
pip install -r requirements.txt
# Build vector store (run once after data changes)
python src/vectorstore.py
# Run the Streamlit web app
streamlit run web/🩺HCR-HOME.py
# Run the recommendation engine directly (CLI test)
python src/hcr.py
# Test DeepSeek API connectivity
python scripts/test_deepseek_api.py
# Run the test script
python test/hcr_test.py
# Run evaluation
python eval/run_eval.py
# Run ablation study (outputs to eval/ablation_log_*.txt)
python eval/ablation_study.py
# Generate synthetic data
python scripts/generate_synthetic_data.pyThere is no formal test framework (no pytest/unittest). Tests are manual scripts in test/.
python test/hcr_test.py— Runs the Agent with a hardcoded user profile, calls tools, and prints memory/response output.python eval/run_eval.py— Evaluates retrieval, recommendation, and safety metrics.python scripts/test_deepseek_api.py— Verifies DeepSeek API connectivity.- To test a specific module, run it directly:
python src/hcr.py,python src/report.py
No lint (ruff, flake8), format (black), or type-check (mypy) commands are configured.
- Create a conda environment:
conda create -n wjl python=3.10 conda activate wjl
- Create a
.envfile in project root:DEEPSEEK_API_KEY=sk-xxxxx - The
.envfile is gitignored — never commit secrets. - Always run commands inside the
wjlconda environment.
- LLM calls use DeepSeek official API via
agentos/utils/utils.py - API endpoint:
https://api.deepseek.com(model:deepseek-chat) - Uses
openai.OpenAIclient with custombase_url - IP geolocation uses
ip-api.com(free, no key) - Geocoding uses Nominatim via
geopy(free, no key)
- Standard library first, then third-party, then local modules.
- Each file adds the project root to
sys.pathat the top:import sys, os current_dir = os.path.dirname(os.path.abspath(__file__)) project_root = os.path.dirname(current_dir) # adjust depth as needed sys.path.insert(0, project_root)
- Use
from module import Class/functionfor local imports. - Avoid
import *except insrc/tools.pywhere all tool classes are exported.
- Classes: PascalCase (e.g.,
Recommendation,TemporaryMemory,ChromaDB) - Agent classes: PascalCase (e.g.,
SymptomAnalyzer,RiskAssessor,MedicalAgent) - Tool classes: lowercase_with_underscores (e.g.,
search_by_id,recommend_by_age) — callable tools for the agent - Functions/methods: snake_case (e.g.,
call_model,parse_tool_info,add_memory) - Constants/prompts: UPPER_SNAKE_CASE (e.g.,
HCR_PROMPT,OUTPUT_PROMPT) - Variables: snake_case
- Use
str | None(Python 3.10+ union syntax) for optional parameters. - Use
Listfromtypingfor list type hints (seeagentos/agent/agent.py). - Type hints are used inconsistently — prefer adding them to new code.
- Tool classes use docstrings in the class and in the
run()method. - The
run()method docstring follows this format:def run(self, arg1: type): """ function_name:brief description in Chinese Args: arg1 (type): description Returns: str: description """
- Module-level comments are in Chinese.
- Minimal error handling throughout the codebase.
- Functions return result strings or
0for not-found cases (seereport.py). - No try/except patterns established — use judgment for new code.
- Configuration lives in
config/settings.pyas aConfigclass with class-level attributes. - Database connections use SQLite (
sqlite3) with inline SQL. - Vector DB operations go through
agentos.rag.store.ChromaDB. - LLM calls go through
agentos.utils.call_model(messages, api_key). - The Agent uses a reason-act loop:
reason()parses model output for thought/function/args,act()executes the tool. - Multi-agent coordination goes through
src/agents/coordinator.pyOrchestrator.
| Purpose | Library |
|---|---|
| LLM API | openai (DeepSeek via base_url) |
| Vector DB | chromadb |
| Embeddings | sentence-transformers, transformers |
| BM25 | rank-bm25, jieba |
| Reranker | FlagEmbedding |
| Web UI | streamlit |
| Geolocation | geopy, pydeck |
| Data | pandas, openpyxl, pypdf |
| PDF generation | fpdf |
Follow the class pattern in src/tools.py:
class my_new_tool:
def __init__(self):
pass
def run(self, arg1: str):
"""
my_new_tool:brief description
Args:
arg1 (str): description
Returns:
str: description
"""
# implementation
return resultThen export it in src/tools.py:
from tools import my_new_tool # add to imports- Do not commit
.env,__pycache__/, or__init__.pyfiles (they are gitignored). - The
agentos/directory is a local framework — treat it as internal code, not a third-party package. - When adding new tools, follow the class pattern in
src/tools.py: class with__init__andrun()method with docstring. - Streamlit pages live in
web/pages/and are numbered for sidebar ordering (e.g.,1_🥰_Recommend.py). - The system supports dual-mode in
src/hcr.py: setuse_multi_agent=Truefor the multi-agent pipeline,Falsefor legacy single-agent. - Multi-agent conversation history is stored in
coordinator.context.conversation_history.