Blog
Factcheck-GPT: End-to-End Fine-Grained Document-Level Fact-Checking and Correction of LLM Output
Nov 2023
We present a holistic end-to-end solution for annotating the factuality of LLM-generated responses, encompassing a multi-stage annotation scheme designed to yield fine-grained labels on the verifiability and factual inconsistencies of LLM outputs. [More]
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
Sept 2023
We collect an open-source dataset for evaluating safeguards in LLMs, enabling the deployment of safer open-source LLMs at low cost. We have since extended the dataset to Chinese and tripled its size. [More]
CMMLU: Measuring massive multitask language understanding in Chinese
Jun 2023
We propose a comprehensive Chinese evaluation suite specifically designed to assess the advanced knowledge and reasoning abilities of LLMs in a Chinese linguistic and cultural context. It has been widely adopted by the Chinese LLM community. [More]
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection
May 2023
We develop automatic systems to identify machine-generated text and detect its potential misuse. We introduce M4, a large-scale multi-generator, multi-domain, and multi-lingual benchmark corpus for machine-generated text detection. [More]
Balancing out Bias: Achieving Fairness Through Balanced Training
2022
We introduce a simple but highly effective objective for countering bias using balanced training. [More]
FairLib: A unified framework for assessing and improving fairness
2022
We present an open-source Python library for assessing and improving model fairness. It provides a systematic framework for quickly accessing benchmark datasets, reproducing existing debiasing baselines, developing new methods, evaluating models with different metrics, and visualising the results. [More]
Everybody needs good neighbours: An unsupervised locality-based method for bias mitigation
2022
We propose a new meta-algorithm for debiasing representation learning models that combines data locality with accuracy of model fit, so that a supervised debiasing method can optimise fairness between neighbourhoods of poorly versus well modelled instances, as identified by our method. [More]