WorldCat Identities

Pernelle, Nathalie

Overview
Works: 9 works in 18 publications in 2 languages and 295 library holdings
Genres: Conference papers and proceedings 
Roles: Editor, Other, Opponent, Author, Thesis advisor
Classifications: Q387.2, 003.54
Most widely held works by Nathalie Pernelle
Graph-based representation and reasoning : 23rd International Conference on Conceptual Structures, ICCS 2018 Edinburgh, UK, June 20-22, 2018, proceedings by International Conference on Conceptual Structures( )

8 editions published in 2018 in English and held by 280 WorldCat member libraries worldwide

This book constitutes the proceedings of the 23rd International Conference on Conceptual Structures, ICCS 2018, held in Edinburgh, UK, in June 2018. The 10 full papers, 2 short papers and 2 posters presented were carefully reviewed and selected from 21 submissions. They are organized in the following topical sections: graph- and concept-based inference; computer-human interaction and human cognition; and graph visualization
Ingenierie des connaissances( Book )

1 edition published in 2018 in French and held by 5 WorldCat member libraries worldwide

Traitement automatique des polysémies relationnelles : utilisation et contrôle de règles d'extension de sens by Nathalie Pernelle( Book )

2 editions published in 1998 in French and held by 3 WorldCat member libraries worldwide

This thesis is situated within the field of natural language processing. To process a text, one must be able to assign a meaning to the words it contains. However, word meaning varies with context, and these variations make it impossible to build a lexicon containing, for each word, an exhaustive list of possible senses. If we assume that each word can be assigned one or more base senses, contextual variations of meaning can be distinguished according to the way they alter those base senses. We focused on the treatment of metonymic polysemy (in the broad sense), which we call relational polysemy. Relational polysemies obey certain regularities that one can try to take into account. Moreover, they give rise to cases of co-presence, where different parts of the co-text may refer to different senses of the same word. Our study of the various systems proposed to handle this type of polysemy dynamically revealed the difficulties that accompany such approaches; in particular, one quickly runs into the problem of control (if the proposed senses are not limited, a combinatorial explosion results). The study of a set of sentences with a very simple syntactic structure led us to design a system that uses sense-extension rules to recover certain privileged senses while also initiating the computation of more marginal uses, and in which co-presence can be taken into account
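
The control problem mentioned above can be sketched in a few lines of code. The following Python fragment is a hypothetical illustration, not the thesis system: the toy lexicon, the two extension rules, and the max_steps bound are all invented, and the bound stands in for the much richer control strategy the thesis develops.

```python
# Hypothetical sketch of controlled sense-extension rules for relational
# (metonymic) polysemy; lexicon, rules, and names are invented.

BASE_SENSES = {
    "journal": ["publication"],   # French: newspaper
    "verre": ["container"],       # French: glass
}

# A sense-extension rule derives a metonymic sense from an existing one.
EXTENSION_RULES = {
    "publication": ["organization"],  # "the newspaper hired a reporter"
    "container": ["content"],         # "drink a glass"
}

def extend_senses(word, max_steps=1):
    """Return base senses plus derived senses, bounding the depth of
    rule application to avoid a combinatorial explosion."""
    senses = set(BASE_SENSES.get(word, []))
    frontier = set(senses)
    for _ in range(max_steps):
        frontier = {d for s in frontier for d in EXTENSION_RULES.get(s, [])}
        senses |= frontier
    return senses

print(extend_senses("verre"))  # {'container', 'content'}
```

Keeping both the base sense and the derived sense in the result is one crude way to model co-presence, where different parts of the co-text select different senses of the same word.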
Intégration sémantique de données guidée par une ontologie by Fatiha Saïs( Book )

2 editions published in 2007 in French and held by 2 WorldCat member libraries worldwide

In this thesis, we address the problem of semantic data integration. The goal is to combine autonomous and heterogeneous data sources; to achieve this, all data must be represented according to a single schema and a unified semantics. The thesis is organized into two relatively independent parts. The first presents an automatic and flexible method for reconciling data with an ontology when the data are represented in tables. To represent the result of this reconciliation we defined the SML format, whose originality is that it represents all the matches found as well as imperfectly identified information. The second part presents two methods for reconciling references described with respect to the same schema, i.e., for deciding whether different descriptions refer to the same real-world entity. The first method, L2R, is logical: the semantics of the data and of the schema are translated into a set of (non-)reconciliation rules from which certain (non-)reconciliation decisions can be inferred. The second, N2R, is numerical: the semantics of the schema are translated into an informed similarity measure used to compute the similarity of pairs of references, a computation expressed as a system of nonlinear equations solved by an iterative method. Both methods obtain satisfactory results on real data, which demonstrates the feasibility of fully automatic approaches, guided only by an ontology, for these two reconciliation problems
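
The iterative flavor of N2R can be illustrated with a toy fixed-point computation. Everything below is a hypothetical Python sketch: the data, the 0.8 propagation weight, and the max-based update are invented stand-ins for the informed similarity measure and the nonlinear equation system described in the abstract.

```python
# Toy N2R-style computation: the similarity of a pair of references
# depends on the similarity of the pair it is related to, so scores
# are iterated until a fixed point is reached.

# Each reference maps to (literal value, related reference or None).
A = {"p1": ("Dupont", "lab1"), "lab1": ("LRI", None)}
B = {"p2": ("Dupont", "lab2"), "lab2": ("L.R.I.", None)}

PAIRS = [("p1", "p2"), ("lab1", "lab2")]
RELATED = {("p1", "p2"): ("lab1", "lab2")}  # person pair depends on lab pair

def literal_sim(x, y):
    x, y = x.replace(".", "").lower(), y.replace(".", "").lower()
    return 1.0 if x == y else 0.0

sim = {p: 0.0 for p in PAIRS}
for _ in range(10):  # iterate the (here trivial) equation system
    new = {}
    for (r1, r2) in PAIRS:
        s = literal_sim(A[r1][0], B[r2][0])
        dep = RELATED.get((r1, r2))
        if dep:  # propagate similarity from the related pair
            s = max(s, 0.8 * sim[dep])
        new[(r1, r2)] = s
    if all(abs(new[p] - sim[p]) < 1e-6 for p in PAIRS):
        break  # fixed point reached
    sim = new

print(sim)  # both pairs converge to similarity 1.0
```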
Approches hybrides pour la recherche sémantique de l'information : intégration des bases de connaissances et des ressources semi-structurées by Yassine Mrabet( )

1 edition published in 2012 in French and held by 1 WorldCat member library worldwide

Semantic information retrieval has developed rapidly with new Semantic Web technologies, which allow software to exchange and use data written according to domain ontologies that describe explicit semantics. This "semantic" information access requires knowledge bases describing both domain ontologies and their instances. Most often, these knowledge bases are constructed automatically by annotating document corpora. However, while these knowledge bases keep growing, they still contain far less information than the HTML documents available on the surface Web. Semantic information retrieval therefore reaches limits relative to "classic" information retrieval, which exploits these documents at a much larger scale. In practice, these limits consist in the lack of concept and relation instances in knowledge bases constructed from the same Web documents. In this thesis, we study two research directions for answering semantic queries in such cases. The first consists in reformulating semantic user queries in order to reach relevant document parts instead of the required (and missing) facts. The second is the automatic enrichment of knowledge bases with relation instances. We propose two novel solutions for each of these research directions by exploiting semi-structured documents annotated with concept instances. A key point of these solutions is that they do not require lexico-syntactic or structural regularities in the documents. We position these approaches with respect to the state of the art and evaluate them on several real corpora extracted from the Web. The results obtained on bibliographic citations, calls for papers, and geographic corpora show that these solutions retrieve new answers and relation instances from heterogeneous documents and rank them effectively according to their precision
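
The first research direction (query reformulation) can be caricatured in a few lines. The sketch below is an invented illustration, not the thesis system: the KB and FRAGMENTS structures and the fallback logic are hypothetical, and real reformulation operates on semantic queries rather than on a lookup function.

```python
# Hypothetical fallback from fact retrieval to document-part retrieval
# when the knowledge base lacks the required relation instance.

KB = set()  # incomplete knowledge base of (subject, relation, object) facts

# Document fragments annotated with the concept instances they mention.
FRAGMENTS = [
    ("frag1", {"EGC2012"}, "Submission deadline: 15 October 2011"),
    ("frag2", {"KDD2012"}, "KDD 2012 will be held in Beijing"),
]

def answer(instance, relation):
    # First try the knowledge base for an explicit fact.
    facts = [o for (s, r, o) in KB if s == instance and r == relation]
    if facts:
        return facts
    # Otherwise reformulate: return document parts that mention the
    # instance, where the missing fact is likely to be expressed.
    return [text for (_id, insts, text) in FRAGMENTS if instance in insts]

print(answer("EGC2012", "hasDeadline"))
```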
Knowledge Discovery Considering Domain Literature and Ontologies : Application to Rare Diseases by Mohsen Hassan( )

1 edition published in 2017 in English and held by 1 WorldCat member library worldwide

Although individually uncommon, Rare Diseases (RDs) are numerous and generally severe, which makes their study important from a health-care point of view. Few databases provide information about RDs; Orphanet and Orphadata are two of them. Despite their laudable effort, they are incomplete and usually not up to date compared with the literature. Indeed, there are millions of scientific publications about these diseases, and their number grows continuously. This makes manual extraction of the information painful and time-consuming, and thus motivates semi-automatic approaches for extracting information from texts and representing it in a format suitable for further applications. This thesis aims at extracting information from texts and using the result of the extraction to enrich existing ontologies of the domain. We studied three research directions: (1) extracting relationships from text, i.e., Disease-Phenotype (D-P) relationships; (2) identifying new complex entities, i.e., phenotypes of an RD; and (3) enriching an existing ontology on the basis of the previously extracted relationships, i.e., enriching an RD ontology. First, we mined a collection of abstracts of scientific articles, represented as a collection of graphs, to discover relevant pieces of biomedical knowledge. We focused on completing RD descriptions by extracting D-P relationships, which could help automate the update of RD databases such as Orphanet. Accordingly, we developed an automatic approach named SPARE* for extracting D-P relationships from PubMed abstracts in which phenotypes and RDs are annotated by a Named Entity Recognizer. SPARE* is a hybrid approach combining a pattern-based method, SPARE, with a machine learning method (an SVM); it benefits both from the relatively good precision of SPARE and from the good recall of the SVM. Second, SPARE* has been used to identify phenotype candidates from texts. We selected high-quality syntactic patterns specific to the extraction of D-P relationships, then relaxed the phenotype constraint in these patterns to enable extracting phenotype candidates that are not referenced in databases or ontologies. These candidates are verified and validated by comparison with phenotype classes in a well-known phenotypic ontology (e.g., HPO). The comparison relies on a compositional semantic model and a set of manually defined mapping rules for mapping an extracted phenotype candidate to a phenotype term in the ontology. This shows the ability of SPARE* to identify existing and potentially new RD phenotypes. We applied SPARE* to PubMed abstracts to extract RD phenotypes that we either map to the content of the Orphanet encyclopedia and Orphadata, or suggest as novel to experts for completing these two resources. Finally, we applied pattern structures to classify RDs and enrich an existing ontology. We first used SPARE* to compute the phenotype description of RDs available in Orphadata, and we propose comparing and grouping RDs with respect to their phenotypic descriptions using pattern structures, which make it possible to take into account both domain knowledge, consisting of an RD ontology and a phenotype ontology, and D-P relationships from various origins. The lattice generated from these pattern structures suggests a new classification of RDs, which in turn suggests new RD classes absent from the original RD ontology. As their number is large, we propose several selection methods to extract a reduced set of interesting RD classes that we suggest to experts for further analysis
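
The hybrid architecture of SPARE* can be pictured as the union of a high-precision pattern extractor and a higher-recall statistical classifier. The Python sketch below is purely illustrative: the single regular-expression pattern and the stub standing in for the SVM are invented, whereas the real system uses learned syntactic patterns over NER-annotated PubMed abstracts.

```python
import re

# One high-precision extraction pattern (illustrative, not from SPARE).
PATTERN = re.compile(
    r"(?P<d>[A-Z][\w\- ]+?) is characterized by (?P<p>[\w\- ]+)")

def pattern_extract(sentence):
    """High precision, low recall: fire only on the exact pattern."""
    m = PATTERN.search(sentence)
    return {(m.group("d").strip(), m.group("p").strip())} if m else set()

def classifier_extract(sentence):
    """Stand-in for the SVM: higher recall, e.g. flagging co-occurring
    annotated disease/phenotype mentions (hypothetical logic)."""
    if "Marfan syndrome" in sentence and "aortic dilatation" in sentence:
        return {("Marfan syndrome", "aortic dilatation")}
    return set()

def extract_dp(sentence):
    # Hybrid combination: union precise hits with classifier hits.
    return pattern_extract(sentence) | classifier_extract(sentence)

print(extract_dp("Marfan syndrome is characterized by aortic dilatation"))
```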
Intégration de données liées respectueuse de la confidentialité by Rémy Delanaux( )

1 edition published in 2019 in English and held by 1 WorldCat member library worldwide

Individual privacy is a major and largely unexplored concern when publishing new datasets in the context of Linked Open Data (LOD). The LOD cloud forms a network of interconnected, publicly accessible datasets in the form of graph databases modeled using the RDF format and queried using the SPARQL language. This heavily standardized context is nowadays extensively used by academics, public institutions, and some private organizations to make their data available, yet some industrial and private actors may be discouraged by potential privacy issues. To this end, we introduce and develop a declarative framework for privacy-preserving Linked Data publishing in which privacy and utility constraints are specified as policies, that is, sets of SPARQL queries. Our approach is data-independent and inspects only the privacy and utility policies in order to determine the sequence of anonymization operations applicable to any graph instance for satisfying the policies. We prove the soundness of our algorithms and gauge their performance through experimental analysis. Another aspect to take into account is that a new dataset published to the LOD cloud is exposed to privacy breaches due to possible linkage with objects already existing in other LOD datasets. In the second part of this thesis, we therefore focus on the problem of building safe anonymizations of an RDF graph, guaranteeing that linking the anonymized graph with any external RDF graph will not cause privacy breaches. Given a set of privacy queries as input, we study the data-independent safety problem and the sequence of anonymization operations necessary to enforce it. We provide sufficient conditions under which an anonymization instance is safe given a set of privacy queries. Additionally, we show that our algorithms are robust in the presence of sameAs links, whether explicit or inferred from additional knowledge. To conclude, we evaluate the impact of this safety-preserving solution on given input graphs through experiments, focusing on the performance and the utility loss of the anonymization framework on both real-world and artificial data. We first discuss and select utility measures for comparing the original graph to its anonymized counterpart, then define a method to generate new privacy policies from a reference one by inserting incremental modifications. We study the behavior of the framework on four carefully selected RDF graphs. We show that our anonymization technique is effective, with reasonable runtime on quite large graphs (several million triples), and gradual: the more specific the privacy policy, the smaller its impact. Finally, using structural graph-based metrics, we show that our algorithms are not very destructive even when privacy policies cover a large part of the graph. By providing a simple and efficient way to ensure privacy and utility in plausible usages of RDF graphs, this approach suggests many extensions and, in the long run, further work on privacy-preserving data publishing in the context of Linked Open Data
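
A toy version of the declarative idea may help fix intuitions. In the hypothetical sketch below, privacy and utility policies are triple patterns standing in for SPARQL queries, and a triple matched by a privacy pattern is deleted, or generalized to a blank node when a utility pattern also needs it. The data, the patterns, and the conflict rule are invented; the thesis computes sequences of anonymization operations from the policies alone and proves soundness, which this fragment does not attempt.

```python
# Hypothetical policy-driven anonymization over a toy RDF-like graph.

GRAPH = {
    ("user1", "livesIn", "Lyon"),
    ("user1", "hasAge", "34"),
    ("user1", "memberOf", "clubA"),
}

PRIVACY = [("?s", "livesIn", "?o"), ("?s", "hasAge", "?o")]    # hide these
UTILITY = [("?s", "memberOf", "?o"), ("?s", "livesIn", "?o")]  # keep these

def matches(triple, pattern):
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))

def anonymize(graph):
    out = set()
    for t in graph:
        private = any(matches(t, p) for p in PRIVACY)
        useful = any(matches(t, u) for u in UTILITY)
        if not private:
            out.add(t)                    # untouched by the policy
        elif useful:
            out.add((t[0], t[1], "_:b"))  # generalize: blank-node object
        # private and not useful: delete the triple entirely
    return out

for t in sorted(anonymize(GRAPH)):
    print(t)
```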
Un système interactif et itératif extraction de connaissances exploitant l'analyse formelle de concepts by My Thao Tang( )

1 edition published in 2016 in English and held by 1 WorldCat member library worldwide

In this thesis, we present a methodology for interactively and iteratively extracting knowledge from texts: the KESAM system, a tool for Knowledge Extraction and Semantic Annotation Management. KESAM is based on Formal Concept Analysis (FCA) for extracting knowledge from textual resources and supports expert interaction. In KESAM, knowledge extraction and semantic annotation are unified into a single process that benefits both tasks. Semantic annotations are used to formalize the source of knowledge in texts and to keep the traceability between the knowledge model and its source; the knowledge model is, in return, used to improve the semantic annotations. The KESAM process has been designed to permanently preserve the link between the resources (texts and semantic annotations) and the knowledge model. The core of the process is Formal Concept Analysis, which builds the knowledge model, i.e., the concept lattice, and ensures the link between the knowledge model and the annotations. In order to get the resulting lattice as close as possible to domain experts' requirements, we introduce an iterative process that enables expert interaction on the lattice. Experts are invited to evaluate and refine the lattice; they can make changes until the model agrees with their own knowledge or with the application's needs. Thanks to the link between the knowledge model and the semantic annotations, the two can co-evolve so as to improve their quality with respect to domain experts' requirements. Moreover, by using FCA to build concepts from definitions of sets of objects and sets of attributes, KESAM is able to take into account both atomic concepts and defined concepts, i.e., concepts defined by a set of attributes. In order to bridge the possible gap between the representation model based on a concept lattice and the representation model of a domain expert, we then introduce a formal method for integrating expert knowledge into concept lattices in such a way that the lattice structure is maintained. The expert knowledge is encoded as a set of attribute dependencies that is aligned with the set of implications provided by the concept lattice, leading to modifications in the original lattice. The method also allows experts to keep a trace of the changes between the original lattice and the final constrained version, and to assess how concepts in practice relate to concepts automatically issued from the data. The method uses extensional projections to build the constrained lattices without changing the original data and to provide the trace of changes. From an original lattice, two different projections produce two different constrained lattices; thus, the gap between the representation model based on a concept lattice and the representation model of a domain expert is filled by projections
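
Since the core of KESAM is Formal Concept Analysis, a minimal, self-contained example of FCA itself may be useful. The sketch below enumerates the formal concepts (extent/intent pairs) of an invented object-attribute context by closing every attribute subset; KESAM's expert interaction, annotations, and projections are deliberately out of scope.

```python
from itertools import combinations

# Toy formal context: objects (documents) and their attributes (terms).
CONTEXT = {
    "doc1": {"protein", "enzyme"},
    "doc2": {"protein"},
    "doc3": {"enzyme"},
}
ATTRS = set().union(*CONTEXT.values())

def extent(intent_set):
    """Objects possessing every attribute in intent_set."""
    return {o for o, attrs in CONTEXT.items() if intent_set <= attrs}

def intent(extent_set):
    """Attributes shared by every object in extent_set."""
    if not extent_set:
        return set(ATTRS)
    return set.intersection(*(CONTEXT[o] for o in extent_set))

# Close every attribute subset to enumerate the formal concepts.
concepts = set()
for r in range(len(ATTRS) + 1):
    for subset in combinations(sorted(ATTRS), r):
        e = extent(set(subset))
        concepts.add((frozenset(e), frozenset(intent(e))))

for e, i in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(e), "<->", sorted(i))
```

Ordered by extent inclusion, these concepts form the lattice that a system like KESAM would then expose to experts for refinement.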
Automatic key discovery for Data Linking by Danai Symeonidou( )

1 edition published in 2014 in English and held by 1 WorldCat member library worldwide

In recent years, the Web of Data has grown dramatically, now comprising a very large number of RDF triples. One of the most important goals of RDF applications is to integrate data described in different RDF datasets and to create semantic links between them. These links express semantic correspondences between ontology entities or between data; among the various types of semantic links that can be established, identity links express the fact that different resources refer to the same real-world object. The number of declared identity links often remains small compared with the volume of available data. Several data-linking approaches infer identity links using keys, where a key is a set of properties that uniquely identifies every resource described by the data. However, in most datasets published on the Web, keys are not available, and declaring them can be difficult even for an expert. The goal of this thesis is to study the problem of automatic key discovery in RDF data sources and to propose new, efficient approaches to solve it. Data published on the Web are generally voluminous and incomplete, and may contain erroneous information or duplicates, so we focused on developing key-discovery approaches able to handle datasets with numerous, incomplete, or erroneous pieces of information. Our goal is to discover as many keys as possible, including those valid only in subsets of the data. We first introduce KD2R, an approach for the automatic discovery of composite keys in RDF datasets for which the Unique Name Assumption holds; these data may conform to different ontologies. To cope with incompleteness, KD2R provides two heuristics that make different assumptions about possibly missing information. This approach, however, is hard to apply to large data sources, so we developed a second approach, SAKey, which exploits various filtering and pruning techniques. In addition, SAKey lets the user discover keys in datasets containing erroneous data or duplicates: more precisely, SAKey discovers keys, called "almost keys", for which a limited number of exceptions is tolerated
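
The notions of key and "almost key" lend themselves to a compact illustration. The brute-force sketch below is hypothetical and deliberately naive: a property set is reported as a key if no two instances agree on all its properties, and as an almost key if at most a tolerated number of instances collide. KD2R and SAKey replace this enumeration with heuristics for missing values, filtering, and pruning.

```python
from itertools import combinations

# Toy RDF-like instances: subject -> {property: value}.
DATA = {
    "f1": {"name": "Amelie", "release": "2001", "director": "Jeunet"},
    "f2": {"name": "Amelie", "release": "2001", "director": "Besson"},
    "f3": {"name": "Leon",   "release": "1994", "director": "Besson"},
}
PROPS = ["name", "release", "director"]

def exceptions(props):
    """Count instances colliding with another instance on all `props`."""
    seen, bad = {}, set()
    for subject, desc in DATA.items():
        value = tuple(desc.get(p) for p in props)
        if value in seen:
            bad |= {subject, seen[value]}
        seen[value] = subject
    return len(bad)

for r in range(1, len(PROPS) + 1):
    for ps in combinations(PROPS, r):
        n = exceptions(ps)
        if n == 0:
            print("key:", ps)
        elif n <= 2:  # tolerated exceptions, as for "almost keys"
            print("almost key (%d exceptions):" % n, ps)
```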
 
Audience Level
Audience level: 0.61 (from 0.60 for Graph-base ... to 0.99 for Ingenierie ...)

Languages
English (12)

French (6)