FrSemCor: Annotating a French corpus with supersenses - Université de Lille
Conference paper, Year: 2020

FrSemCor: Annotating a French corpus with supersenses

Abstract

French, like many languages, lacks semantically annotated corpus data. Our aim is to provide the linguistic and NLP research communities with a gold-standard sense-annotated corpus of French, using WordNet Unique Beginners as semantic tags, thus allowing for interoperability. In this paper, we report on the first phase of the project, which focused on the annotation of common nouns. The resulting dataset consists of more than 12,000 French noun tokens, which were annotated double-blind and adjudicated according to a carefully redefined set of supersenses. The resource is released online under a Creative Commons Licence.

Domains

Linguistics
Main file
Fr_SemCor_LREC2020.pdf (170.93 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02511929, version 1 (19-03-2020)

Identifiers

  • HAL Id: hal-02511929, version 1

Cite

L Barque, Pauline Haas, R Huyghe, Delphine Tribout, M Candito, et al. FrSemCor: Annotating a French corpus with supersenses. LREC-2020, May 2020, Marseille, France. ⟨hal-02511929⟩
