ASTRA is a post-hoc framework for aligning entity embeddings from independently trained Knowledge Graph Embedding (KGE) models.
It lets embeddings from different knowledge graphs (KGs) be used for downstream tasks, such as entity alignment and link prediction, without requiring joint training or KG merging.
Traditional alignment methods (e.g., MTransE, BootEA, KDCoE) have several limitations:
- Require joint training
- Need merged graphs
- Do not scale well to large KGs
ASTRA follows a different paradigm:
- Align embeddings after training (post-hoc)
- Preserve original semantic information
- Inject graph structure via R-GCN
- Learn non-linear alignment
Embeddings: Trained using the DICE Embedding Framework
Each embedding directory should contain:
model.pt
entity_to_idx.p
relation_to_idx.p
configuration.json
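Before running the pipeline, it can help to confirm that each embedding directory is complete. The following is a minimal sketch, not part of ASTRA itself: the helper name `missing_embedding_files` is hypothetical, and the only information it relies on is the four file names listed above.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = (
    "model.pt",
    "entity_to_idx.p",
    "relation_to_idx.p",
    "configuration.json",
)


def missing_embedding_files(directory):
    """Return the required file names that are absent from `directory`.

    Hypothetical helper for illustration; ASTRA may perform its own checks.
    """
    root = Path(directory)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means the directory contains all four files; otherwise the returned names show what still needs to be exported from the KGE training run.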
Datasets: OpenEA benchmark
rel_triples_1_train.txt
rel_triples_2_train.txt
rel_triples_test_merged.txt (merged test triples from both datasets)
rel_triples_train_merged.txt (merged train triples from both datasets)
train_links
valid_links
test_links
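To inspect these files outside the pipeline, a small reader is usually enough. The sketch below assumes the usual OpenEA layout: tab-separated text with three columns per triple (head, relation, tail) and two columns per alignment link (KG1 entity, KG2 entity). The function names are hypothetical and this is not ASTRA's own loader.

```python
def read_tsv(path, n_columns):
    """Read a tab-separated file into a list of tuples with `n_columns` fields.

    Assumes one record per line; blank lines are skipped.
    """
    rows = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            fields = line.split("\t")
            if len(fields) != n_columns:
                raise ValueError(
                    f"{path}:{line_number}: expected {n_columns} fields, "
                    f"got {len(fields)}"
                )
            rows.append(tuple(fields))
    return rows


def read_triples(path):
    """Read (head, relation, tail) triples; the column order is an assumption."""
    return read_tsv(path, 3)


def read_links(path):
    """Read (KG1 entity, KG2 entity) alignment pairs; the order is an assumption."""
    return read_tsv(path, 2)
```

Raising on a wrong column count surfaces formatting problems early, for example a file that uses spaces instead of tabs.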
Install dependencies using:
pip install -r requirements.txt
- Train KGE embeddings. Train entity and relation embeddings using the DICE Embedding Framework (or any compatible KGE model such as TransE, ComplEx, etc.).
- Prepare datasets. Download or prepare:
  - Knowledge graph triples (train/test)
  - Alignment links (train/validation/test)
Supported sources include:
- OpenEA benchmark datasets
- DBpedia–Wikidata datasets
python3 -m modules.pipeline \
--directory_1 <KG1_embeddings> \
--directory_2 <KG2_embeddings> \
--train_triples_path_1 <KG1_train_triples> \
--train_triples_path_2 <KG2_train_triples> \
--test_triples_path <merged_test_triples> \
--triple_paths <merged_train_triples> \
--train_links <train_links> \
--val_links <validation_links> \
--test_links <test_links> \
--output_dir <output_directory>
The pipeline performs the following steps:
- Loads pretrained embeddings for both KGs
- Loads triples and alignment links
- Builds a merged graph structure
- Computes R-GCN structural embeddings
- Applies adaptive fusion (structure + base embeddings)
- Trains the alignment model
- Evaluates entity alignment (Hits@k, MRR)
- Injects aligned embeddings into a KGE model
- Performs fine-tuning (KvsAll training)
- Evaluates link prediction performance
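The entity-alignment metrics in the steps above (Hits@k and MRR) can be illustrated with a short, dependency-free sketch. It assumes a square similarity matrix in which row `i` holds the scores of source entity `i` against every target candidate and the true match sits on the diagonal. That layout and the function names are assumptions for illustration, not ASTRA's evaluation code.

```python
def ranks_of_true_matches(similarity):
    """Return the 1-based rank of the true match for each source entity.

    similarity[i][j] is the score between source entity i and target
    candidate j; the true match of source i is assumed to be target i.
    Ties are resolved optimistically (only strictly higher scores count).
    """
    ranks = []
    for i, row in enumerate(similarity):
        true_score = row[i]
        ranks.append(1 + sum(1 for score in row if score > true_score))
    return ranks


def hits_at_k(ranks, k):
    """Fraction of test entities whose true match is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all test entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy example with three aligned pairs.
similarity = [
    [0.9, 0.1, 0.2],  # true match ranked 1st
    [0.8, 0.5, 0.1],  # true match ranked 2nd
    [0.3, 0.7, 0.6],  # true match ranked 2nd
]
ranks = ranks_of_true_matches(similarity)  # [1, 2, 2]
```

On this toy matrix, Hits@1 is 1/3, Hits@2 is 1.0, and MRR is (1 + 1/2 + 1/2) / 3 = 2/3.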
Results are saved in:
output_dir/
│
├── alignment_results.json
├── link_prediction_results.json
│
├── aligned_embeddings/ # aligned entity embeddings
├── fine_tuned_model/ # final fine-tuned KGE model
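The two JSON files can be read back with the standard library. This is a minimal sketch with a hypothetical helper name; it makes no assumption about the keys stored inside the files, since their schema is not documented here.

```python
import json
from pathlib import Path

# File names taken from the output tree above.
RESULT_FILES = ("alignment_results.json", "link_prediction_results.json")


def load_results(output_dir):
    """Load the result files from `output_dir`; missing files map to None."""
    results = {}
    for name in RESULT_FILES:
        path = Path(output_dir) / name
        if path.is_file():
            results[name] = json.loads(path.read_text(encoding="utf-8"))
        else:
            results[name] = None
    return results
```

Calling `load_results("<output_directory>")` returns a dictionary keyed by file name, which is convenient for comparing runs side by side.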