Open Early Stage Researcher/PhD Position at the Centre National de la Recherche Scientifique (CNRS), Nancy, France, for the NL4XAI project
Reference number: NL4XAI-ESR6
PhD research topic: Explainable Models for Text Production
Objectives: The broad goal of this PhD thesis is to provide explainable models of text generation that make it possible to identify relevant mismatches between input and output. Two text production applications will be considered: Verbalisation of Knowledge Bases (KB) and Text Summarization. In both tasks the text should match the input, but in summarization it should express only the key information contained in the input. Different questions therefore arise, both on how to analyse semantic adequacy and on how to explain the behaviour of a generation system.
- A model of text production which (i) provides a clear explanation of cases where text production fails to be semantically adequate and (ii) permits distinguishing cases of failure due to biases in the data from failure cases due to an inadequate model.
- An evaluation of this model on standard benchmarks for KB verbalisation and Text Summarization.
- An explanation model based on (i) breaking up the end-to-end decoder process into several explainable substeps and generating the output text based on both the input and these intermediate predictions and (ii) evaluation metrics that measure how accurate these intermediate predictions are and how well they correlate with success.
The challenge here will be to identify relevant substeps and evaluation criteria for adapting the method to text generation. We will decompose Natural Language Generation (NLG) into some or all of the traditional NLG modules, thereby allowing for a more fine-grained evaluation of how neural NLG systems handle the various choices that need to be made to produce a well-formed text.
Host institution: Centre National de la Recherche Scientifique – CNRS (France)
PhD Enrolment: Université de Lorraine – UL, Nancy (France)
The PhD Candidate will be located at LORIA (Lorraine Research Laboratory in Computer Science and its Applications) on the Université de Lorraine campus in Nancy, France. LORIA is a research lab common to CNRS, the University of Lorraine and INRIA which gathers around 500 people and 27 teams and is structured into 5 main departments targeting both fundamental and applied research in computer science. The Natural Language and Knowledge Processing department includes 6 teams, among which is Synalp, the hiring team for this PhD topic, which specialises in Statistical and Symbolic Natural Language Processing with a strong focus on neural approaches to Natural Language Generation. SYNALP is well anchored in the national and international research community, and Claire Gardent has been regularly involved in the activities of the top international association of the field, the ACL (Association for Computational Linguistics), as member, vice-president and president of the EACL (European Chapter of the ACL) board and, more recently, as the chair of SIGGEN, the ACL Special Interest Group in Natural Language Generation. She has supervised 16 PhD students, 5 post-docs and 8 engineers and has been the principal investigator for 10 projects (4 national, 6 European). She regularly serves at all main NLP conferences as chair, area chair or reviewer, and is or has been a member of the editorial board for 5 of the main NLP journals (JoS, TACL, CL, TAL, JoLLI).
Secondments: The ESR will enjoy three secondments of 3 months each at the premises of two of the project's members, as detailed in the following table.
- Main Supervisor: Dr. Claire Gardent, LORIA – Centre National de la Recherche Scientifique (CNRS), firstname.lastname@example.org
- PhD Co-Supervisor: Dr. Albert Gatt, Institute of Linguistics and Language Technology – Università ta’ Malta (UOM)
Inter-sectoral Secondment Supervisors:
- Dr. Lina María Rojas Barahona, Learning and Natural Dialogue Teams – Orange
- Dr. Johannes Heinecke, Learning and Natural Dialogue Teams – Orange
- Mobility: At the time of recruitment, the researcher must not have resided or carried out his/her main activity (work, studies, etc.) in France for more than 12 months in the 3 years prior to the recruitment date. Time spent as part of a procedure for obtaining refugee status under the Geneva Convention is not taken into account.
- Career: When starting their contract (expected in April 2020), selected researchers should be within the first four years of their research careers and must not have been awarded a doctoral degree prior to the application.
- The candidate must be working exclusively for the action.
We are looking for candidates with a strong background in computer science, natural language processing (NLP) and/or deep learning. The candidate should have strong programming skills, be able to think creatively and be interested in NLP.
- Degree: Master's degree in Computer Science or Computational Linguistics, or an equivalent degree providing access to a PhD program.
- Experience in Deep Learning, Natural Language Processing
- Programming skills: Python, Deep Learning libraries (PyTorch)
- Language: Excellent command of English, together with good academic writing and presentation skills.
- Background in Natural Language Generation
- Ability to work independently and as part of a team.
- Strong motivation to pursue a PhD degree.
- Strong interest in interdisciplinary scientific work
Estimated starting date: 1st April 2020
Contract: Full-time contract
Duration: 36 months, including 3 secondments of 3 months each, at other consortium members’ premises (see the Secondments section above)
Salary: Gross salary of 2851 €/month. This amount will be increased by the corresponding mobility allowance and by the family allowance, depending on the family status of the recruited researcher.
- Detailed CV in Europass format (template available in the following link) in English, highlighting the merits that are established as evaluation criteria;
- Scans of BSc and/or MSc transcripts, with certified translation in English (if the degree qualification is not in English or in the language of the hosting country);
- A motivation letter in English, highlighting the consistency between the candidate profile and the chosen ESR position/s for which she/he is applying and describing why she/he wishes to be an NL4XAI ESR to carry out a PhD;
- Contact details or recommendation letters of two referees in English or in certified translation;
- Scanned copy of valid identification document;
- Proof of excellent command of English (e.g., IELTS, TOEFL, Cambridge or equivalent). This is not required in case of native English speakers (i.e., English is your mother tongue).
In addition, you can add any other documents which you find relevant for the application, such as a Master's thesis, publications or project reports.
- Academic background (up to 40 points)
- Knowledge and specific achievements (up to 35 points)
- Personal interview, only for candidates achieving a minimum of 55 points (up to 25 points)
Deadline: February 14, 2020, at 23h59 CET (UTC+01:00)
Enquiries about research content must be sent to the main PhD supervisor via email (see contact details in Supervisors section).