NEJLT Volume 10 (#4375)
anthology-assist authored Jan 11, 2025
1 parent 61a2621 commit f52a20b
Showing 1 changed file with 91 additions and 0 deletions.
91 changes: 91 additions & 0 deletions data/xml/2024.nejlt.xml
@@ -0,0 +1,91 @@
<?xml version='1.0' encoding='UTF-8'?>
<collection id="2024.nejlt">
<volume id="1" ingest-date="2025-01-08" type="proceedings">
<meta>
<booktitle>Northern European Journal of Language Technology, Volume 10</booktitle>
<editor><first>Marcel</first><last>Bollmann</last></editor>
<publisher>Linköping University Electronic Press</publisher>
<address>Linköping, Sweden</address>
<month>March</month>
<year>2024</year>
<url hash="4082ecf8">2024.nejlt-1</url>
<venue>nejlt</venue>
</meta>
<frontmatter>
<url hash="d6dbcb7f">2024.nejlt-1.0</url>
<bibkey>nejlt-2024-1</bibkey>
</frontmatter>
<paper id="1">
<title>Efficient Structured Prediction with Transformer Encoders</title>
<author><first>Ali</first><last>Basirat</last><affiliation>University of Copenhagen</affiliation></author>
<pages>1-13</pages>
<abstract>Finetuning is a useful method for adapting Transformer-based text encoders to new tasks but can be computationally expensive for structured prediction tasks that require tuning at the token level. Furthermore, finetuning is inherently inefficient in updating all base model parameters, which prevents parameter sharing across tasks. To address these issues, we propose a method for efficient task adaptation of frozen Transformer encoders based on the local contribution of their intermediate layers to token representations. Our adapter uses a novel attention mechanism to aggregate intermediate layers and tailor the resulting representations to a target task. Experiments on several structured prediction tasks demonstrate that our method outperforms previous approaches, retaining over 99% of the finetuning performance at a fraction of the training cost. Our proposed method offers an efficient solution for adapting frozen Transformer encoders to new tasks, improving performance and enabling parameter sharing across different tasks.</abstract>
<url hash="9b919104">2024.nejlt-1.1</url>
<bibkey>basirat-2024-efficient</bibkey>
</paper>
<paper id="2">
<title><fixed-case>DANSK</fixed-case>: Domain Generalization of <fixed-case>D</fixed-case>anish Named Entity Recognition</title>
<author><first>Kenneth</first><last>Enevoldsen</last><affiliation>Aarhus University</affiliation></author>
<author><first>Emil Trenckner</first><last>Jessen</last><affiliation>Aarhus University</affiliation></author>
<author><first>Rebekah</first><last>Baglini</last><affiliation>Aarhus University</affiliation></author>
<pages>14-29</pages>
<abstract>Named entity recognition is an important application within Danish NLP, essential within both industry and research. However, Danish NER is inhibited by a lack of coverage across domains and entity types. As a consequence, no current models are capable of fine-grained named entity recognition, nor have they been evaluated for potential generalizability issues across datasets and domains. To alleviate these limitations, this paper introduces: 1) DANSK: a named entity dataset providing high-granularity tagging as well as within-domain evaluation of models across a diverse set of domains; 2) three generalizable models with fine-grained annotation available in DaCy 2.6.0; and 3) an evaluation of current state-of-the-art models’ ability to generalize across domains. The evaluation of existing and new models revealed notable performance discrepancies across domains, which should be addressed within the field. Shortcomings of the annotation quality of the dataset and its impact on model training and evaluation are also discussed. Despite these limitations, we advocate for the use of the new dataset DANSK alongside further work on generalizability within Danish NER.</abstract>
<url hash="9ea5374c">2024.nejlt-1.2</url>
<bibkey>enevoldsen-etal-2024-dansk</bibkey>
</paper>
<paper id="3">
<title>Understanding Counterspeech for Online Harm Mitigation</title>
<author><first>Yi-Ling</first><last>Chung</last><affiliation>The Alan Turing Institute</affiliation></author>
<author><first>Gavin</first><last>Abercrombie</last></author>
<author><first>Florence</first><last>Enock</last></author>
<author><first>Jonathan</first><last>Bright</last></author>
<author><first>Verena</first><last>Rieser</last></author>
<pages>30-49</pages>
<abstract>Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support to targets of abuse. It provides a promising alternative to more contentious measures, such as content moderation and deplatforming, by contributing a greater amount of positive online speech rather than attempting to mitigate harmful content through removal. Advances in the development of large language models mean that the process of producing counterspeech could be made more efficient by automating its generation, which would enable large-scale online campaigns. However, we currently lack a systematic understanding of several important factors relating to the efficacy of counterspeech for hate mitigation, such as which types of counterspeech are most effective, what are the optimal conditions for implementation, and which specific effects of hate it can best ameliorate. This paper aims to fill this gap by systematically reviewing counterspeech research in the social sciences and comparing methodologies and findings with natural language processing (NLP) and computer science efforts in automatic counterspeech generation. By taking this multi-disciplinary view, we identify promising future directions in both fields.</abstract>
<url hash="4dc31916">2024.nejlt-1.3</url>
<bibkey>chung-etal-2024-understanding</bibkey>
</paper>
<paper id="4">
<title>Documenting Geographically and Contextually Diverse Language Data Sources</title>
<author><first>Angelina</first><last>McMillan-Major</last><affiliation>University of Washington</affiliation></author>
<author><first>Francesco</first><last>De Toni</last></author>
<author><first>Zaid</first><last>Alyafeai</last></author>
<author><first>Stella</first><last>Biderman</last></author>
<author><first>Kimbo</first><last>Chen</last></author>
<author><first>Gérard</first><last>Dupont</last></author>
<author><first>Hady</first><last>Elsahar</last></author>
<author><first>Chris</first><last>Emezue</last></author>
<author><first>Alham Fikri</first><last>Aji</last></author>
<author><first>Suzana</first><last>Ilić</last></author>
<author><first>Nurulaqilla</first><last>Khamis</last></author>
<author><first>Colin</first><last>Leong</last></author>
<author><first>Maraim</first><last>Masoud</last></author>
<author><first>Aitor</first><last>Soroa</last></author>
<author><first>Pedro</first><last>Ortiz Suarez</last></author>
<author><first>Daniel</first><last>van Strien</last></author>
<author><first>Zeerak</first><last>Talat</last></author>
<author><first>Yacine</first><last>Jernite</last></author>
<pages>50-77</pages>
<abstract>Contemporary large-scale data collection efforts have prioritized the amount of data collected to improve large language models (LLM). This quantitative approach has resulted in concerns for the rights of data subjects represented in data collections. This concern is exacerbated by a lack of documentation and analysis tools, making it difficult to interrogate these collections. Mindful of these pitfalls, we present a methodology for documentation-first, human-centered data collection. We apply this approach in an effort to train a multilingual LLM. We identify a geographically diverse set of target language groups (Arabic varieties, Basque, Chinese varieties, Catalan, English, French, Indic languages, Indonesian, Niger-Congo languages, Portuguese, Spanish, and Vietnamese, as well as programming languages) for which to collect metadata on potential data sources. We structure this effort by developing an online catalogue in English as a tool for gathering metadata through public hackathons. We present our tool and analyses of the resulting resource metadata, including distributions over languages, regions, and resource types, and discuss our lessons learned.</abstract>
<url hash="cc57a878">2024.nejlt-1.4</url>
<bibkey>mcmillan-major-etal-2024-documenting</bibkey>
</paper>
<paper id="5">
<title>On Using Self-Report Studies to Analyze Language Models</title>
<author><first>Matúš</first><last>Pikuliak</last><affiliation>Kempelen Institute of Intelligent Technologies</affiliation></author>
<pages>78-85</pages>
<abstract>We are at a curious point in time where our ability to build language models (LMs) has outpaced our ability to analyze them. We do not really know how to reliably determine their capabilities, biases, dangers, knowledge, and so on. The benchmarks we have are often overly specific, do not generalize well, and are susceptible to data leakage. Recently, I have noticed a trend of using self-report studies, such as various polls and questionnaires originally designed for humans, to analyze the properties of LMs. I think that this approach can easily lead to false results, which can be quite dangerous considering the current discussions on AI safety, governance, and regulation. To illustrate my point, I will delve deeper into several papers that employ self-report methodologies and I will try to highlight some of their weaknesses.</abstract>
<url hash="03ec9a16">2024.nejlt-1.5</url>
<bibkey>pikuliak-2024-using</bibkey>
</paper>
<paper id="6">
<title>Generation and Evaluation of Multiple-choice Reading Comprehension Questions for <fixed-case>S</fixed-case>wedish</title>
<author><first>Dmytro</first><last>Kalpakchi</last><affiliation>KTH Royal Institute of Technology</affiliation></author>
<author><first>Johan</first><last>Boye</last><affiliation>KTH Royal Institute of Technology</affiliation></author>
<pages>86-105</pages>
<abstract>Multiple-choice questions (MCQs) provide a widely used means of assessing reading comprehension. The automatic generation of such MCQs is a challenging language-technological problem that also has interesting educational applications. This article presents several methods for automatically producing reading comprehension MCQs from Swedish text. Unlike previous approaches, we construct models to generate the whole MCQ in one go, rather than using a pipeline architecture. Furthermore, we propose a two-stage method for evaluating the quality of the generated MCQs, first evaluating on carefully designed single-sentence texts, and then on texts from the SFI national exams. An extensive evaluation of the MCQ-generating capabilities of 12 different models, using this two-stage scheme, reveals that GPT-based models surpass smaller models that have been fine-tuned using small-scale datasets on this specific problem.</abstract>
<url hash="d9829f86">2024.nejlt-1.6</url>
<bibkey>kalpakchi-boye-2024-generation</bibkey>
</paper>
</volume>
</collection>
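
The committed file follows the Anthology's collection/volume/paper schema shown in the diff above: a volume carries its metadata in a `<meta>` element, and each `<paper>` holds a title, authors, pages, abstract, URL hash, and bibkey. For readers who want to consume the file directly, here is a minimal sketch using Python's standard `xml.etree.ElementTree`; the file path matches the one in this commit, while the variable names and printed fields are illustrative assumptions rather than part of any Anthology tooling.

```python
# Minimal sketch: read volume metadata and per-paper fields from the committed XML.
import xml.etree.ElementTree as ET

tree = ET.parse("data/xml/2024.nejlt.xml")  # path as it appears in this commit
collection = tree.getroot()

for volume in collection.findall("volume"):
    meta = volume.find("meta")
    print(meta.findtext("booktitle"), meta.findtext("year"))
    for paper in volume.findall("paper"):
        # Titles may contain <fixed-case> children, so join all nested text.
        title = "".join(paper.find("title").itertext())
        authors = [
            f'{a.findtext("first")} {a.findtext("last")}'
            for a in paper.findall("author")
        ]
        print(paper.get("id"), title,
              "|", ", ".join(authors),
              "| pp.", paper.findtext("pages"),
              "|", paper.findtext("bibkey"))
```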
