WorldCat Identities

Saïs, Fatiha (1979- ...).

Works: 7 works in 11 publications in 2 languages and 24 library holdings
Roles: Editor, Opponent, Author, Other
Most widely held works by Fatiha Saïs
Des sources ouvertes au web de données( Book )

2 editions published in 2014 in French and English and held by 14 WorldCat member libraries worldwide

Le web de données : publication, liage et capitalisation( Book )

2 editions published in 2016 in French and held by 3 WorldCat member libraries worldwide

Vers une capitalisation des connaissances orientée utilisateur : extraction et structuration automatiques de l'information issue de sources ouvertes by Laurie Serrano( Book )

2 editions published in 2014 in French and held by 2 WorldCat member libraries worldwide

Given the dizzying growth of freely available information (particularly on the Web), efficiently identifying the items of interest is a long and complex task. Open-source intelligence analysts are particularly affected by this phenomenon. Indeed, they manually collect a large share of the relevant information in order to build knowledge sheets summarizing what has been learned about an entity. In this context, this thesis aims to ease and reduce the work of intelligence and monitoring practitioners. Our research is organized around three axes: information modeling, information extraction, and knowledge capitalization. We surveyed the state of the art on these issues in order to design a global knowledge capitalization system. Our first contribution is an ontology dedicated to representing knowledge specific to intelligence, for which we defined and modeled the notion of event in this domain. In addition, we designed and evaluated an event extraction system based on two current approaches to information extraction: a first, symbolic method and a second method based on frequent sequential pattern mining. Finally, we proposed a semantic aggregation process for events in order to improve the quality of the resulting event sheets and to ensure the transition from text to knowledge. This process relies on a multidimensional similarity between events, expressed on a qualitative scale defined according to users' needs.
Intégration sémantique de données guidée par une ontologie by Fatiha Saïs( Book )

2 editions published in 2007 in French and held by 2 WorldCat member libraries worldwide

This thesis deals with semantic data integration guided by an ontology. Data integration aims at combining autonomous and heterogeneous data sources. To this end, all the data should be represented according to the same schema and a unified semantics. This thesis is divided into two parts. In the first one, we present an automatic and flexible method for data reconciliation with an ontology. We consider the case where data are represented in tables. The reconciliation result is represented in the SML format, which we have defined. Its originality stems from the fact that it can represent not only all the established mappings but also information that is imperfectly identified. In the second part, we present two methods of reference reconciliation. This problem consists in deciding whether different data descriptions refer to the same real-world entity. We have considered this problem when data are described according to the same schema. The first method, called L2R, is logical: it translates the schema and the data semantics into a set of logical rules which allow inferring correct decisions of both reconciliation and non-reconciliation. The second method, called N2R, is numerical. It translates the schema semantics into an informed similarity measure used by a numerical computation of the similarity of the reference pairs. This computation is expressed as a nonlinear equation system solved by an iterative method. Our experiments on real datasets demonstrated the robustness and the feasibility of our approaches. The solutions that we bring to the two reconciliation problems are completely automatic and guided only by an ontology.
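As an illustration of the general idea of solving a nonlinear similarity system iteratively, here is a minimal Python sketch. It is not the N2R equation system, which the abstract does not give: the update rule, the weight `w_rel`, the function name `solve_similarities` and the example pairs are all assumptions made for this example. Each reference pair gets a score that mixes its own attribute similarity with the best score among the pairs it depends on (the `max` makes the system nonlinear), and the scores are recomputed until they stop changing.

```python
def solve_similarities(attr_sim, depends_on, w_rel=0.5, eps=1e-6, max_iter=100):
    """Fixed-point iteration over pair scores (illustrative update rule only).

    attr_sim:   pair -> similarity of the pair's own attribute values, in [0, 1]
    depends_on: pair -> list of related pairs whose score influences this pair
    """
    score = dict.fromkeys(attr_sim, 0.0)  # start every pair at 0
    for _ in range(max_iter):
        new_score = {}
        for pair, own in attr_sim.items():
            related = depends_on.get(pair, [])
            if related:
                # Nonlinear term: only the best related pair contributes.
                best = max(score[r] for r in related)
                new_score[pair] = (1 - w_rel) * own + w_rel * best
            else:
                new_score[pair] = own
        delta = max(abs(new_score[p] - score[p]) for p in new_score)
        score = new_score
        if delta < eps:  # scores have stabilized
            break
    return score


# Hypothetical data: two museum descriptions located in two city descriptions.
attr_sim = {("museum1", "museum2"): 0.6, ("city1", "city2"): 0.9}
depends_on = {("museum1", "museum2"): [("city1", "city2")]}
scores = solve_similarities(attr_sim, depends_on)
```

In this toy setting the museum pair ends up with a higher score than its attribute similarity alone would give, because the cities it is linked to are themselves very similar.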
Automatic key discovery for Data Linking by Danai Symeonidou( )

1 edition published in 2014 in English and held by 1 WorldCat member library worldwide

In recent years, the Web of Data has grown significantly and now contains a huge number of RDF triples. Integrating data described in different RDF datasets and creating semantic links among them has become one of the most important goals of RDF applications. These links express semantic correspondences between ontology entities or data. Among the different kinds of semantic links that can be established, identity links express that different resources refer to the same real-world entity. By comparing the number of resources published on the Web with the number of identity links, one can observe that the goal of building a Web of Data is still not accomplished. Several data linking approaches infer identity links using keys. Nevertheless, in most datasets published on the Web, the keys are not available, and it can be difficult, even for an expert, to declare them. The aim of this thesis is to study the problem of automatic key discovery in RDF data and to propose new efficient approaches to tackle this problem. Data published on the Web are usually created automatically and thus may contain erroneous information or duplicates, or may be incomplete. Therefore, we focus on developing key discovery approaches that can handle datasets that are large, incomplete or erroneous. Our objective is to discover as many keys as possible, even ones that are valid only in subparts of the data. We first introduce KD2R, an approach that allows the automatic discovery of composite keys in RDF datasets that may conform to different schemas. KD2R is able to treat datasets that may be incomplete and for which the Unique Name Assumption is fulfilled. To deal with the incompleteness of data, KD2R proposes two heuristics that offer different interpretations for the absence of data. KD2R uses pruning techniques to reduce the search space. However, this approach is overwhelmed by the huge amount of data found on the Web.
Thus, we present our second approach, SAKey, which is able to scale to very large datasets by using effective filtering and pruning techniques. Moreover, SAKey is capable of discovering keys in datasets where erroneous data or duplicates may exist. More precisely, the notion of almost keys is proposed to describe sets of properties that are not keys due to a few exceptions.
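The notion of a key that tolerates a few exceptions can be illustrated with a small brute-force Python sketch. This is not the KD2R or SAKey algorithm and uses none of their filtering or pruning techniques. It assumes single-valued properties stored in plain dictionaries, treats a missing value as never matching (only one possible interpretation of absent data), and counts as an exception any record that shares all its values on the property set with another record. The names `exceptions`, `almost_keys` and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def exceptions(records, props):
    """Ids of records that share all their values on `props` with another record."""
    groups = defaultdict(list)
    for rid, rec in records.items():
        if any(rec.get(p) is None for p in props):
            continue  # assumption: a missing value never counts as a match
        groups[tuple(rec[p] for p in props)].append(rid)
    return {rid for ids in groups.values() if len(ids) > 1 for rid in ids}


def almost_keys(records, properties, max_exceptions, max_size=2):
    """Minimal property sets with at most `max_exceptions` exceptions."""
    found = []
    for size in range(1, max_size + 1):
        for props in combinations(properties, size):
            # Keep only minimal sets: skip supersets of an already accepted set.
            if any(set(k) <= set(props) for k in found):
                continue
            if len(exceptions(records, props)) <= max_exceptions:
                found.append(props)
    return found


# Hypothetical records: no single property identifies a person uniquely.
records = {
    "p1": {"name": "Ann", "city": "Paris", "phone": "01"},
    "p2": {"name": "Ann", "city": "Lyon", "phone": "02"},
    "p3": {"name": "Bob", "city": "Paris", "phone": "03"},
    "p4": {"name": "Eve", "city": "Nice", "phone": "03"},
}
properties = ["name", "city", "phone"]
strict = almost_keys(records, properties, max_exceptions=0)
tolerant = almost_keys(records, properties, max_exceptions=2)
```

With no exceptions allowed, only pairs of properties qualify on this data; allowing two exceptions lets each single property qualify, which shows how the tolerance changes what is discovered.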
Informations personnelles sensibles aux contextes : modélisation, interrogation et composition by Rania Khéfifi( )

1 edition published in 2014 in French and held by 1 WorldCat member library worldwide

This thesis was conducted within the PIMI project, financed by the National Research Agency. It concerns the modeling, the querying and the composition of personal information. We considered that the use of and access to personal information is context dependent (e.g., social, geographical). More particularly, it aims to support the user when carrying out online administrative or personal procedures. In this setting, the problems tackled are the representation of heterogeneous information, the context-aware querying of personal information spaces, automatic form filling, and the automatic realization of procedures defined at a high level of abstraction by composition of services available online. To solve these problems, we have developed several contributions. The first one concerns the management of the personal information space. We have defined a model allowing the description of personal information using several domain ontologies. Our model can be instantiated on the user's personal information with several usability values depending on the context and with a usability degree. We have also proposed two contextual querying algorithms, SQE and FQE, which make it possible to query the recorded information. The second contribution concerns the use of this information by several online services. It presents two use cases. In the case of automatic form filling, we have proposed an algorithm that generates a semantic query from an annotated form representation. This query is evaluated by using both querying algorithms, SQE and FQE. Then, in the case of the realization of a user objective (an abstract procedure) by service composition, we have extended the Graphplan algorithm to take into account the contextualization of the data and the access policy rules specified by the user. The latter allow users to increase control over their information and to limit its leakage.
Découverte de définitions dans le web des données by Justine Reynaud( )

1 edition published in 2019 in French and held by 1 WorldCat member library worldwide

In this thesis, we are interested in the web of data and knowledge units that can be possibly discovered inside. The web of data can be considered as a very large graph consisting of connected RDF triple databases. An RDF triple, denoted as (subject, predicate, object), represents a relation (i.e. the predicate) existing between two resources (i.e. the subject and the object). Resources can belong to one or more classes, where a class aggregates resources sharing common characteristics. Thus, these RDF triple databases can be seen as interconnected knowledge bases. Most of the time, these knowledge bases are collaboratively built thanks to human users. This is particularly the case of DBpedia, a central knowledge base within the web of data, which encodes Wikipedia content in RDF format. DBpedia is built from two types of Wikipedia data: on the one hand, (semi-)structured data such as infoboxes, and, on the other hand, categories, which are thematic clusters of manually generated pages. However, the semantics of categories in DBpedia, that is, the reason a human agent has bundled resources, is rarely made explicit. In fact, considering a class, a software agent has access to the resources that are regrouped together, i.e. the class extension, but it generally does not have access to the "reasons" underlying such a cluster, i.e. it does not have the class intension. Considering a category as a class of resources, we aim at discovering an intensional description of the category. More precisely, given a class extension, we are searching for the related intension. The pair (extension, intension) which is produced provides the final definition and the implementation of classification-based reasoning for software agents. This can be expressed in terms of necessary and sufficient conditions: if x belongs to the class C, then x has the property P (necessary condition), and if x has the property P, then it belongs to the class C (sufficient condition).
Two complementary data mining methods allow us to materialize the discovery of definitions: the search for association rules and the search for redescriptions. In this thesis, we first present a state of the art about association rules and redescriptions. Next, we propose an adaptation of each data mining method for the task of definition discovery. Then we detail a set of experiments applied to DBpedia, and we qualitatively and quantitatively compare the two approaches. Finally, we discuss how discovered definitions can be added to DBpedia to improve its quality in terms of consistency and completeness.
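The necessary and sufficient conditions described above can be checked on a toy dataset with two rule confidences, as in the following Python sketch. It is a simplified illustration under stated assumptions, not the method of the thesis: it tests single properties only, the resource and property names are invented, and `candidate_definitions` is a hypothetical helper. The rule "C implies P" has confidence 1 when every member of the class has P (necessary condition), and "P implies C" has confidence 1 when every resource with P is in the class (sufficient condition).

```python
def confidence(antecedent, consequent):
    """Share of the antecedent set that also lies in the consequent set."""
    if not antecedent:
        return 0.0
    return len(antecedent & consequent) / len(antecedent)


def candidate_definitions(extension, resource_props, min_conf=1.0):
    """Properties that are (almost) necessary and sufficient for the class."""
    # Index: property -> set of resources that have it.
    holders = {}
    for resource, props in resource_props.items():
        for prop in props:
            holders.setdefault(prop, set()).add(resource)

    result = []
    for prop, having in sorted(holders.items()):
        necessary = confidence(extension, having)   # C implies P
        sufficient = confidence(having, extension)  # P implies C
        if necessary >= min_conf and sufficient >= min_conf:
            result.append(prop)
    return result


# Hypothetical data: resources "a" and "b" form the extension of a category.
resource_props = {
    "a": {"hasDirector", "hasRuntime"},
    "b": {"hasDirector", "hasRuntime"},
    "c": {"hasRuntime"},
    "d": {"hasAuthor"},
}
extension = {"a", "b"}
exact = candidate_definitions(extension, resource_props)
relaxed = candidate_definitions(extension, resource_props, min_conf=0.6)
```

Lowering `min_conf` accepts properties that hold for most but not all resources, which is one simple way to cope with incomplete data at the cost of weaker definitions.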
Audience Level
Audience level: 0.88 (from 0.85 for Le web de ... to 0.99 for Informatio ...)

Associated Subjects