Domain-Independent Extraction of Scientific Concepts from Research Articles

dc.bibliographicCitation.bookTitle: Advances in Information Retrieval (eng)
dc.bibliographicCitation.journalTitle: Lecture Notes in Computer Science (eng)
dc.contributor.author: Brack, Arthur
dc.contributor.author: D'Souza, Jennifer
dc.contributor.author: Hoppe, Anett
dc.contributor.author: Auer, Sören
dc.contributor.author: Ewerth, Ralph
dc.contributor.editor: Jose, Joemon M.
dc.contributor.editor: Yilmaz, Emine
dc.contributor.editor: Magalhães, João
dc.contributor.editor: Castells, Pablo
dc.contributor.editor: Ferro, Nicola
dc.contributor.editor: Silva, Mário J.
dc.contributor.editor: Martins, Flávio
dc.date.accessioned: 2021-06-04T08:40:40Z
dc.date.available: 2021-06-04T08:40:40Z
dc.date.issued: 2020
dc.description.abstract: We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task and (b) examine the transferability of concepts between domains. Second, we present a state-of-the-art deep learning baseline. Further, we propose an active learning strategy for an optimal selection of instances from among the various domains in our data. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, and (3) active learning enables us to nearly halve the amount of required training data. (eng)
dc.description.version: submittedVersion (eng)
dc.identifier.uri: https://oa.tib.eu/renate/handle/123456789/6179
dc.identifier.uri: https://doi.org/10.34657/5226
dc.language.iso: eng (eng)
dc.publisher: Cham : Springer (eng)
dc.relation.doi: https://doi.org/10.1007/978-3-030-45439-5_17
dc.relation.essn: 1611-3349
dc.relation.isbn: 978-3-030-45438-8
dc.relation.isbn: 978-3-030-45439-5
dc.relation.issn: 0302-9743
dc.rights.license: German copyright law applies. The document may be used free of charge for personal use, but may not be made available on the Internet or passed on to third parties. (eng)
dc.subject.ddc: 020 (eng)
dc.subject.gnd: Konferenzschrift (ger)
dc.subject.other: Sequence labelling (eng)
dc.subject.other: Information extraction (eng)
dc.subject.other: Scientific articles (eng)
dc.subject.other: Active learning (eng)
dc.subject.other: Scholarly communication (eng)
dc.subject.other: Research knowledge graph (eng)
dc.title: Domain-Independent Extraction of Scientific Concepts from Research Articles (eng)
dc.type: BookPart (eng)
dc.type: Text (eng)
dcterms.event: European Conference on Information Retrieval, 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020
tib.accessRights: openAccess (eng)
wgl.contributor: TIB (eng)
wgl.subject: Informatik (eng)
wgl.type: Buchkapitel / Sammelwerksbeitrag (eng)
wgl.type: Konferenzbeitrag (eng)
Files
Original bundle
Name: Brack2020, Preprint.pdf
Size: 825.86 KB
Format: Adobe Portable Document Format