Context and Main Purpose:

As a member of the CsAI Lab (Computer Science and Artificial Intelligence), ENGIE’s research and development team, you will benefit from ongoing training and help us build Semantic Web solutions.

The goal of this internship is to facilitate the semantic integration (access, sharing and alignment) of heterogeneous structured data, not only by using ontologies created as needed and/or existing domain ontologies, but also by mapping the data to Linked Data resources such as DBpedia, Data.gouv, Wikidata and GeoNames.

You will help set up a tool that learns how to link and transform structured data (CSV, JSON, XML, HTML, etc.) based on ontology concepts and relationships. These ontologies are created for different application domains such as gas, electricity, buildings, water and IoT. A further aim is to facilitate the semantic enrichment of the model with other knowledge bases.
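As a rough illustration of what "transforming structured data based on ontology concepts" can mean, the minimal sketch below turns CSV rows into RDF triples in N-Triples syntax. The namespace, the class name `GasMeter`, the property names and the hand-written column mapping are all hypothetical; in the tool described above the mapping would be learned, not hard-coded.

```python
import csv
import io

# Hypothetical namespaces, for illustration only.
ONTOLOGY = "http://example.org/energy#"
BASE = "http://example.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Column name -> ontology property (assumed mapping, written by hand here).
COLUMN_TO_PROPERTY = {
    "name": ONTOLOGY + "name",
    "city": ONTOLOGY + "locatedIn",
}


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def csv_to_ntriples(text, id_column, class_name):
    """Turn each CSV row into N-Triples lines using the mapping above.

    Assumes the values in id_column are already safe to embed in an IRI.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row[id_column]}>"
        # One triple typing the row as an instance of the ontology class.
        triples.append(f"{subject} <{RDF_TYPE}> <{ONTOLOGY}{class_name}> .")
        # One triple per mapped, non-empty column.
        for column, prop in COLUMN_TO_PROPERTY.items():
            if row.get(column):
                literal = escape_literal(row[column])
                triples.append(f'{subject} <{prop}> "{literal}" .')
    return triples


sample = "id,name,city\nm1,Meter A,Saint-Denis\n"
for line in csv_to_ntriples(sample, "id", "GasMeter"):
    print(line)
```

The literals produced this way could later be aligned with Linked Data resources such as DBpedia or GeoNames; that step is not shown here.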

Key Responsibilities:

  • Preparing a state-of-the-art review of existing work on semantic learning systems for structured data [1,2]
  • Developing algorithms for a new semantic system that robustly transforms and semantically links the heterogeneous structured data of ENGIE’s use cases
  • Building reusable code and libraries for future use
  • Editing technical documents as needed
  • Deploying an application that meets the scientific and business challenges
  • Writing scientific papers


Profile:

  • Currently enrolled in a Master 2 program in Computer Science/Engineering
  • Knowledge of Java and experience with scripting languages such as Python and shell scripting; knowledge of human-machine interfaces (HMI) would be a plus
  • Experience in Semantic Web technologies: RDF, OWL, SPARQL, etc.
  • Knowledge of machine learning and semantic alignment
  • Openness to developing your knowledge of the energy domain
  • Knowledge of open data would be a plus
  • Communication skills in English and French


Location: CRIGEN, ENGIE’s R&D center in France, 361 Avenue du Président Wilson, 93210 Saint-Denis

Duration: 6 months full time

Start date: as soon as possible

Salary: depends on the degree being prepared

Contact: and (CV + cover letter)

[1] Mohsen Taheriyan, Craig A. Knoblock, Pedro A. Szekely, José Luis Ambite:
Learning the semantics of structured data sources.
J. Web Sem. 37-38: 152-169 (2016)

[2] Mohsen Taheriyan, Craig A. Knoblock, Pedro A. Szekely, José Luis Ambite:
Leveraging Linked Data to Discover Semantic Relations Within Data Sources.
International Semantic Web Conference (2016)


93200 Saint Denis
Required languages: English; French
Education level: Bac +5