M.Sc. Computational Linguistics · University of Stuttgart
Natural language processing and computational linguistics research.
I study how language works and how machines model it — with a focus on multilingual
language models, AI-generated text detection, and multimodal learning. I bring a
background in Python development and language teaching to research-oriented problems in NLP.
Email · Google Scholar · GitHub · LinkedIn · CV
- arXiv preprint · 2026 · cs.CL
A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models
Yassir El Attar, Esra Dönmez, Maximilian Maurer, Agnieszka Falenska
A large-scale study of 284 interpretable linguistic features across 27 LLMs and ten text
domains. Lexical-richness measures stay robust across model families and domains, while
many other indicators prove strongly context-dependent.
arXiv · PDF
- BUCC 2026 @ LREC 2026 · pp. 108–118
Leveraging Comparable Toxicity Lexicons in Prompt Instructions for Multilingual Text Detoxification
Yassir El Attar, Esra Dönmez, Nina Ohlendorf, Agnieszka Falenska
Uses comparable, language-specific toxicity lexicons inside prompt instructions to guide
multilingual detoxification. Both zero-shot prompting and fine-tuning improve, including
in cross-lingual transfer to low-resource languages.
PDF
- ArabicNLP 2025 · Shared Tasks · ACL · pp. 608–614
YassirEA at MAHED 2025: Fusion-Based Multimodal Models for Arabic Hate Meme Detection
Yassir El Attar
A fusion-based multimodal system combining visual and textual features to detect hateful
Arabic memes, submitted to the MAHED 2025 shared task at the Third Arabic NLP Conference.
ACL Anthology · PDF
Speech · Foundation Models
Hearing Abilities of Foundation Models
Course project · Current Topics in Speech Technology · Jan 2025
- Problem: How well can spoken language models actually “hear”, i.e., perceive speech, audio
events, and music?
- Approach: Systematic assessment of audio-language foundation models (SALMONN, Pengi, CLAP,
ParaCLAP) and self-supervised speech/speaker models across benchmark hearing tasks (SUPERB,
Dynamic-SUPERB).
- Results: Mapped strengths on discriminative tasks against gaps in generative tasks and audio
hallucination; summarized in a conference-style poster.
Poster (PDF)
Scientific ML · Foundation Models
In-Context Learning for Differential Equations
Seminar report · Scientific Foundation Models II · 2025
- Problem: Can foundation-model techniques cut the heavy simulation cost of learning PDE
operators while improving out-of-distribution generalization?
- Approach: Reviews two directions: the In-Context Operator Network (ICON) for few-shot
operator learning, and unsupervised pre-training of neural operators (FNO, transformers) on
unlabeled PDE data via masked-autoencoding and super-resolution proxy tasks.
- Results: Combining pre-training with in-context demos reduces simulated-data needs by up to
~1000× and improves OOD generalization on PDEs such as Navier–Stokes and Helmholtz.
Report (PDF)
Computational Linguistics · LLMs
Bridging LLMs and Linguistics
Seminar report · Foundational Questions Regarding LLMs · WS 2024–2025
- Problem: Can linguistic theory (Generative Grammar, Construction Grammar) and LLMs be
integrated rather than treated as rival, separate disciplines?
- Approach: A literature review of the Piantadosi–Chesi debate on LLMs as theories of
language, and of work integrating Construction Grammar with neural models (HyCxG, CxGBERT,
probing studies).
- Results: Argues for a unified framework: LLMs offer computable, falsifiable evidence on
grammar learnability, while linguistic theory guides interpretation, to the benefit of both
fields.
Report (PDF)
Multimodal · VQA
Multimodal Dermatology VQA
Research project · Foundation Models · 2024–2025
- Problem: Can a Visual Question Answering system reliably answer clinical questions about
dermatological images?
- Approach: Fuse visual embeddings from medical images with text representations from
clinical descriptions, comparing fusion architectures: UNITER, ViLT, and cross-attention over
BERT / DistilBERT.
- Results: Analyzed how fusion choices affect answer accuracy and interpretability for
AI-assisted dermatology.
Paper (PDF)
Education
- 2024 – present
M.Sc. Computational Linguistics
University of Stuttgart, Germany
- 2014 – 2017
B.A. English Studies, Linguistics major
FLSH Tetouan, Morocco
- 2017 – 2018
DTS, Computer Development (Specialized Technician)
ISMONTIC, Tangier
Certificates: NLP (DFKI & TU Berlin / AI Campus), Python (Univ. of Michigan / Coursera),
Linear Algebra (Khan Academy), TEFL, TESOL.
Experience
- 2024 – present
Teaching Assistant
Institute for NLP (IMS), University of Stuttgart: Parsing; Programming for Computational Linguistics
- 2025 – present
Student Research Assistant
Diversity-Aware NLP / IRIS, University of Stuttgart
- 2025
Student Research Assistant
Inst. for Energy Efficiency in Production & Fraunhofer: energy-systems cost optimization (OEMOF, Pyomo)
- 2022 – 2024
Developer & Team Manager
Software Version 7.0, Morocco: Python / PowerShell data-optimization tools
- 2023
Developer
IGNITREE Web Development Agency, Morocco: web / CMS applications
- 2018 – 2022
ESL Teacher
American Language Center & Ministry of Education, Morocco
Languages
- Arabic: Native
- English: Advanced (C1–C2)
- French: Upper-intermediate
- German: B1 (learning)
- Spanish: Beginner
Skills
Languages & tools
- Python
- PyTorch
- Hugging Face Transformers
- scikit-learn
- spaCy
- NLTK
- TensorFlow
- NumPy / pandas
- SQL
- Git
- Docker
- LaTeX
- Bash / PowerShell
NLP & ML methods
- LLM fine-tuning (LoRA / PEFT)
- Prompt engineering
- In-context learning
- Transformers & attention
- Text classification
- Machine translation
- Multimodal fusion
- Multilingual & cross-lingual transfer
- Embeddings & representation learning
- Model evaluation & benchmarking
- Linguistic feature engineering
- Data annotation
Research interests
- Multilingual NLP
- AI-generated text detection & analysis
- AI safety
- Social impact of NLP
- Linguistics in LLMs
- Interpretability of AI models
- Multimodal learning
- Structured languages in LLMs