Early Stage Researcher at CNRS

CNRS - Centre National de la Recherche Scientifique (France) CNRS-LORIA

Lorraine Research Laboratory in Computer Science and its Applications (LORIA)
Centre National de la Recherche Scientifique (CNRS)
UMR 7503, Campus Scientifique, BP 239, F-54506
Vandoeuvre-lès-Nancy Cedex, France



Explainable Models for Text Production

PhD research topic


The broad goal of this PhD thesis is to provide explainable models of text generation that make it possible to identify relevant mismatches between input and output. Two text production applications will be considered: Verbalisation of Knowledge Bases (KB) and Text Summarization. In both tasks the text should match the input; in summarization, however, it should express only the key information contained in the input. Different questions therefore arise, both about how to analyse semantic adequacy and about how to explain the behaviour of a generation system.

Expected Results:

(1) A model of text production which (i) provides a clear explanation of cases where text production fails to be semantically adequate and (ii) distinguishes failures caused by biases in the data from failures caused by an inadequate model.

(2) An evaluation of this model on standard benchmarks for KB verbalisation and Text Summarization.

(3) An explanation model based on (i) breaking up the end-to-end decoding process into several explainable substeps and generating the output text from both the input and these intermediate predictions and (ii) evaluation metrics that measure how accurate these intermediate predictions are and how well they correlate with success.

Main challenge:

The challenge here will be to identify relevant substeps and evaluation criteria for adapting the method to text generation. We will decompose Natural Language Generation (NLG) into some or all of the traditional NLG modules, thereby allowing for a more fine-grained evaluation of how neural NLG systems handle the various choices that need to be made to produce a well-formed text.