
Post-doctoral position: Semantic data integration and decision making for cybersecurity

Context: H2020 STARLIGHT project (2021-2025)

Law enforcement agencies (LEAs) create and have access to more data than ever before. Heralded as a golden age for LEAs’ ability to investigate, solve and predict crime, the big-data revolution has in many ways proved a false dawn, with LEAs not yet able to capitalise on much of this data effectively. Today, a new panacea exists: (machine learning in) Artificial Intelligence (AI). AI is envisioned as a silver bullet for many of society’s current challenges: streamlining and enhancing productivity and efficiency, spotting patterns and making decisions with unrivalled speed and accuracy. To maximise the benefit of AI, LEAs must, for all the data they possess, also embed a critical, human-centric and inclusive approach alongside a coherent data strategy to underpin the implementation of any AI technologies for the safety and security of our society.

Position Description

STARLIGHT will perform an extensive analysis of the current gaps in LEAs’ adoption of AI and of how to reinforce their capacity with AI-based tools, supported by innovation workshops to identify and consolidate LEAs’ needs and by a technology watch. STARLIGHT will then activate research beyond the state of the art to bring EU LEAs into the era of Artificial Intelligence to counter newly emerging security threats (including adversarial AI and misuse of AI for criminal purposes). The post-doc will be involved in work packages 7 and 8, on the following tasks:

- Task 7.1, Multidimensional information fusion and correlation for operational knowledge generation, deals with strategies and techniques to fuse and correlate multi-modal data and extracted information so that they can be reused for investigative and intelligence purposes. Starting from a graph-based knowledge representation, AI-based fusion models and entity resolution techniques will be used to detect similarities between people, accounts, objects and events, in order to match and map information between different cases or multiple sources. Adaptive, semantics-oriented graph-based patterns will be applied to recognize hidden or potential relationships among sparse pieces of information. Such patterns will be automatically and periodically re-trained on observed knowledge, so that they stay aligned with the evolving environment and can recognize emerging suspicious correlations.

- Task 7.2, Operational Knowledge and Intelligence Exploration and Search, focuses on the development of novel methods to search and explore large multi-modal collections. It uses metric and representation learning and zero-, single- or few-shot learning approaches to search for content that is not identified by pre-defined supervised classifiers, detectors or trackers, and it also covers local similarity and near-duplicate estimation. This task will further develop methods to provide content recommendations, such as similar videos taken from different perspectives, audio similarity and text matching, as well as novel approaches to align multi-modal information in order to facilitate modality-agnostic search interfaces.

- Task 8.7, Data protection and privacy by design measures, will provide LEAs with solutions to ensure data privacy at the system level. The first solution will consist of a Decision Support System (DSS) or recommender system that proposes high-level authorization policies by learning users’ privacy preferences, with the aim of protecting end users’ personal data. The system will propose authorization policies by means of specific learning algorithms. The outcome will be a multi-criteria model that takes into account the end users’ preferences for protecting privacy. The second solution will merge the power of collaborative/federated learning (with data not shared, by design) and the guarantees provided by differential privacy (at negligible computational cost) in a single framework for the design of AI models, offering the possibility of using and combining several techniques depending on the constraints of the use cases in the context of LEA operations and cybersecurity.

Work environment

Location: Institut de Recherche en Informatique de Toulouse (IRIT) - UPS, 118 Route de Narbonne, F-31062 Toulouse Cedex, France, and UT1C, 2 rue du Doyen Gabriel Marty, 31042 Toulouse Cedex 9

Duration: 24 months – starting on February 1st, 2022

Host department: Artificial Intelligence Department

Host teams:



The candidate will work with three permanent academic researchers and a PhD student, and will collaborate with the partners in the project, mainly CEA, ENG, CRI, MIL, NBI, WEBIQ, EUROPOL, TNO and ZITIS.

Salary: between 2200 and 3500 euros per month net (“free of taxes”), depending on past experience

How to apply?

Applicants should upload their application files before September 1st, 2021 on the following web page: . Application files should contain at least a full curriculum vitae including a complete list of publications, a cover letter describing the applicant’s research interests, achievements to date and vision for the future, and either letters of support or the names of two people who have worked with the applicant.

DSS; MCDA; ontologies; recommender system; semantic indexing; semantic search

Applicants are required to have a PhD in Computer Science and a strong background in semantic web technologies, ontology engineering, linked data management and querying, Decision Support Systems, Recommender Systems and Multicriteria Decision Analysis. Fluency in written and spoken English is also required. Programming experience, a good publication record and fluency in French will be a plus.

Contact information

Candidates are welcome to contact N. Aussenac-Gilles (+33 5 61 55 82 93) and P. Zaraté.