TarjamaRTv1

A multilingual real-time speech translation system supporting bidirectional translation between 99 languages with context-aware ASR correction.

Key Features

99 Languages: Bidirectional translation across 9,801 source-target language combinations (99 × 99)
Context-Aware Correction: LLM-powered ASR error correction using temporal context to fix boundary artifacts
Voice Activity Detection: Reduces hallucinations by removing silence segments
Optimized Streaming: 2-second chunks with 0.5-second overlap for continuous processing
Modular Architecture: Cascaded pipeline allowing component-wise optimization
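
The streaming scheme above (2-second chunks, 0.5-second overlap) can be illustrated with a minimal sketch. The function name `iter_chunks`, the 16 kHz sample rate, and the handling of the final short chunk are assumptions made for illustration; the project's actual chunking code may differ.

```python
from typing import Iterator, Sequence, Tuple


def iter_chunks(
    samples: Sequence[float],
    sample_rate: int = 16000,   # assumed sample rate, not stated in this README
    chunk_s: float = 2.0,       # chunk length from the feature list
    overlap_s: float = 0.5,     # overlap from the feature list
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, chunk) pairs covering `samples` with overlap."""
    chunk_len = int(chunk_s * sample_rate)
    hop = chunk_len - int(overlap_s * sample_rate)
    if hop <= 0:
        raise ValueError("overlap must be shorter than the chunk length")

    start = 0
    while start < len(samples):
        yield start, samples[start:start + chunk_len]
        # Stop once the current chunk reaches the end of the audio.
        if start + chunk_len >= len(samples):
            break
        start += hop


if __name__ == "__main__":
    audio = [0.0] * (5 * 16000)  # 5 seconds of silence as stand-in audio
    for start, chunk in iter_chunks(audio):
        print(f"chunk at {start / 16000:.1f}s, {len(chunk) / 16000:.1f}s long")
```

With these defaults each chunk starts 1.5 seconds after the previous one, so every boundary region is seen twice. That repetition is what gives a downstream correction step context for fixing words cut at chunk edges.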

System Architecture

The system implements a 5-component pipeline:

Voice Activity Detection (pyannote/voice-activity-detection)
Automatic Speech Recognition (deepdml/faster-whisper-large-v3-turbo-ct2)
Context-Aware ASR Correction (GPT-4o-mini)
Jaccard Validation
Machine Translation (vLLM-served cpatonn/Qwen3-4B-Instruct-2507-AWQ-4bit)
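
This README does not describe how the Jaccard Validation step works. One plausible reading is that it compares the LLM-corrected transcript with the raw ASR output and falls back to the raw text when the two diverge too much. The sketch below illustrates that idea only; the function names, the word-level tokenization, and the 0.5 threshold are all assumptions, not the project's actual implementation.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Word-level Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    set_a = set(a.lower().split())
    set_b = set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def validate_correction(original: str, corrected: str, threshold: float = 0.5) -> str:
    """Keep the correction only if it stays close to the ASR output.

    The threshold value is a hypothetical choice for this example.
    """
    if jaccard_similarity(original, corrected) >= threshold:
        return corrected
    return original


if __name__ == "__main__":
    asr = "i want to by a car"
    print(validate_correction(asr, "i want to buy a car"))
    print(validate_correction(asr, "the weather is lovely today"))
```

A check of this kind guards against the correction model rewriting a sentence wholesale instead of repairing a few misrecognized words.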

Requirements

Before setting up the project, ensure you have:

Python 3.11 (required)
GPU with 6-10 GB VRAM (to run VAD + ASR + MT models)
OpenAI API Key (required for ASR Correction)
~80 GB free storage (for models, dependencies, and Docker images)

Note: This project was developed and tested on Windows. You may encounter issues when running on other operating systems.

Setup

Clone repository

git clone https://github.com/as4193/TarjamaRTv1.git
cd TarjamaRTv1

Install PyTorch with CUDA support

pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu124

Install remaining requirements

pip install -r requirements.txt

Set OpenAI API key

Windows (PowerShell):

$env:OPENAI_API_KEY="your_openai_key_here"

Linux/Mac:

export OPENAI_API_KEY=your_openai_key_here
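
Both commands set the variable only for the current terminal session, so the app must be started from the same session. A small check like the following can confirm the key is visible to Python before launching the app. It is an illustrative sketch: the helper name `require_api_key` is hypothetical and not part of the project.

```python
import os
from typing import Mapping


def require_api_key(
    env: Mapping[str, str] = os.environ,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the API key from the environment or raise a clear error."""
    key = env.get(name, "").strip()
    # Reject a missing key and the placeholder value used in the commands above.
    if not key or key == "your_openai_key_here":
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```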

Log in to Hugging Face (for gated models)

huggingface-cli login

Enter your Hugging Face token when prompted. This is required to access gated models such as the pyannote voice activity detection model.

Run vLLM (OpenAI-compatible server)

# Run these commands from the vllm_service folder
docker pull vllm/vllm-openai:latest
docker-compose up -d

Note: This step may take 10-20 minutes depending on your internet speed, because the model is downloaded from Hugging Face and then loaded onto the GPU.
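
Because the container keeps downloading and loading the model after `docker-compose up -d` returns, it can help to confirm the server is ready before starting the app. The sketch below queries the `/v1/models` endpoint that vLLM's OpenAI-compatible server exposes. The address `http://localhost:8000` is an assumption (vLLM's default port); the port actually published by the compose file in `vllm_service` may differ. The helper name `list_served_models` is hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, List


def list_served_models(
    base_url: str = "http://localhost:8000",  # assumed address, check the compose file
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> List[str]:
    """Return the model IDs served at base_url, or [] if the server is not ready."""
    try:
        with opener(f"{base_url}/v1/models", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply: treat as not ready.
        return []
    return [model["id"] for model in payload.get("data", [])]


if __name__ == "__main__":
    models = list_served_models()
    if models:
        print("vLLM is serving:", ", ".join(models))
    else:
        print("vLLM is not ready yet; try again in a minute")
```

An empty result only means the endpoint did not answer at the assumed address, so check the container logs before concluding that the model failed to load.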

Start Streamlit app

# Run from the repository root, where project_ui.py lives
streamlit run project_ui.py

Open browser

# Navigate to http://localhost:8501
