NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature

dc.bibliographicCitation.firstPage: 16 [eng]
dc.bibliographicCitation.lastPage: 27 [eng]
dc.contributor.author: D'Souza, Jennifer
dc.contributor.author: Auer, Sören
dc.date.accessioned: 2021-04-13T09:45:55Z
dc.date.available: 2021-04-13T09:45:55Z
dc.date.issued: 2020
dc.description.abstract: We describe an annotation initiative to capture the scholarly contributions in natural language processing (NLP) articles, particularly articles that discuss machine learning (ML) approaches for various information extraction tasks. We develop the annotation task based on a pilot annotation exercise on 50 NLP-ML scholarly articles presenting contributions to five information extraction tasks: 1. machine translation, 2. named entity recognition, 3. question answering, 4. relation classification, and 5. text classification. In this article, we describe the outcomes of this pilot annotation phase. Through the exercise we have obtained an annotation methodology and found ten core information units that reflect the contribution of the NLP-ML scholarly investigations. The resulting annotation scheme we developed based on these information units is called NLPContributions. The overarching goal of our endeavor is four-fold: 1) to find a systematic set of patterns of subject-predicate-object statements for the semantic structuring of scholarly contributions that are more or less generically applicable for NLP-ML research articles; 2) to apply the discovered patterns in the creation of a larger annotated dataset for training machine readers [18] of research contributions; 3) to ingest the dataset into the Open Research Knowledge Graph (ORKG) infrastructure as a showcase for creating user-friendly state-of-the-art overviews; 4) to integrate the machine readers into the ORKG to assist users in the manual curation of their respective article contributions. We envision that the NLPContributions methodology engenders a wider discussion on the topic toward its further refinement and development. Our pilot annotated dataset of 50 NLP-ML scholarly articles according to the NLPContributions scheme is openly available to the research community at https://doi.org/10.25835/0019761. [eng]
dc.description.version: publishedVersion [eng]
dc.identifier.uri: https://oa.tib.eu/renate/handle/123456789/6146
dc.identifier.uri: https://doi.org/10.34657/5194
dc.language.iso: eng [eng]
dc.publisher: Aachen : RWTH [eng]
dc.relation.essn: 1613-0073
dc.relation.ispartof: Proceedings of the 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents co-located with the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL 2020) [eng]
dc.relation.ispartofseries: CEUR Workshop Proceedings 2658 (2020) [eng]
dc.relation.uri: https://ceur-ws.org/Vol-2658/paper2.pdf
dc.rights.license: CC BY 4.0 Unported [eng]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [eng]
dc.subject: dataset [eng]
dc.subject: annotation guidelines [eng]
dc.subject: semantic publishing [eng]
dc.subject: digital libraries [eng]
dc.subject: scholarly knowledge graphs [eng]
dc.subject: open science graphs [eng]
dc.subject.classification: Konferenzschrift [ger]
dc.subject.ddc: 004 [eng]
dc.title: NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature [eng]
dc.type: bookPart [eng]
dc.type: Text [eng]
dcterms.bibliographicCitation.journalTitle: CEUR Workshop Proceedings [eng]
tib.accessRights: openAccess [eng]
tib.relation.conference: EEKE 2020 - Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents, 1 August 2020, virtual event [eng]
wgl.contributor: TIB [eng]
wgl.subject: Informatik [eng]
wgl.type: Buchkapitel / Sammelwerksbeitrag [eng]
wgl.type: Konferenzbeitrag [eng]
Files
Original bundle
Name: D'Souza2020.pdf
Size: 2.43 MB
Format: Adobe Portable Document Format