X-FACTR


Multilingual Factual Knowledge Retrieval from Pretrained Language Models


X-FACTR is a multilingual benchmark for probing factual knowledge in language models (LMs). It contains prompts in 23 languages, created by native speakers, that probe an LM's factual knowledge by having it fill in the blank in cloze-style prompts such as “Punta Cana is located in _.” We provide both the benchmark (prompts and facts) and the code to evaluate LMs. For more details, check out our paper X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models and the code.
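To make the setup concrete, here is a minimal sketch of cloze-style probing with a masked LM via the HuggingFace transformers API; the model name and prompt are illustrative, and this is not the repository's probing code.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative model choice; X-FACTR probes several multilingual LMs.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased").eval()

# The blank in the prompt becomes the LM's mask token.
prompt = f"Punta Cana is located in {tokenizer.mask_token}."
enc = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
top_ids = logits[0, mask_pos].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top_ids))  # top-5 candidate fillers

The benchmark itself also covers answers that span multiple tokens, which is where the decoding strategies described below come in.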

Install

Clone the GitHub repository and set up the conda environment:

git clone https://github.com/jzbjyb/X-FACTR.git
cd X-FACTR
conda create -n xfactr -y python=3.7 && conda activate xfactr && ./setup.sh

Probing

Default Decoding

Run LMs on the X-FACTR benchmark with the default decoding method (i.e., predicting multiple tokens independently):

python scripts/probe.py --model $LM --lang $LANG --pred_dir $OUTPUT

where $LM is the LM to probe (e.g., mbert_base), $LANG is the language (e.g., nl), and $OUTPUT is the directory in which to store predictions. Then evaluate the predictions with:

python scripts/ana.py --model $LM --lang $LANG --inp $OUTPUT
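For example, to probe multilingual BERT on Dutch and then score its predictions (the predictions/ directory name is illustrative):

python scripts/probe.py --model mbert_base --lang nl --pred_dir predictions/
python scripts/ana.py --model mbert_base --lang nl --inp predictions/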

Confidence-based Decoding

Alternatively, you can run LMs on the X-FACTR benchmark with the confidence-based decoding method, which predicts tokens in order of model confidence and iteratively refines the output for up to --max_iter iterations:

python scripts/probe.py --model $LM --lang $LANG --pred_dir $OUTPUT --init_method confidence --iter_method confidence --max_iter 10
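The sketch below illustrates the idea behind confidence-based decoding (commit the most confident mask prediction first, then re-predict the remaining blanks conditioned on it), assuming a HuggingFace masked LM. It is a simplified illustration under those assumptions, not the repository's implementation; the model name, prompt, and fill_by_confidence helper are hypothetical.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative model; not necessarily the one scripts/probe.py loads.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased").eval()

def fill_by_confidence(prompt, max_iter=10):
    # Encode the prompt; mask tokens mark the blanks to fill.
    ids = tokenizer(prompt, return_tensors="pt")["input_ids"][0]
    mask_id = tokenizer.mask_token_id
    for _ in range(max_iter):
        mask_pos = (ids == mask_id).nonzero(as_tuple=True)[0]
        if len(mask_pos) == 0:
            break  # every blank has been filled
        with torch.no_grad():
            logits = model(ids.unsqueeze(0)).logits[0]
        probs = logits[mask_pos].softmax(dim=-1)
        conf, pred = probs.max(dim=-1)
        # Commit only the single most confident prediction per pass,
        # so later predictions can condition on earlier commitments.
        best = conf.argmax()
        ids[mask_pos[best]] = pred[best]
    return tokenizer.decode(ids, skip_special_tokens=True)

# Two masks approximate a multi-token answer span.
print(fill_by_confidence(
    f"Punta Cana is located in {tokenizer.mask_token} {tokenizer.mask_token}."))

Compared with the default decoding, which fills all blanks in a single pass, this sequential strategy lets the model revise its prediction for one blank in light of the tokens already committed to the others.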

Benchmark

Supported languages

Supported LMs

Dataset