Evaluation of Speech and Speech Synthesis

CALL FOR PAPERS

Journal: Computer Speech & Language

Publisher: Elsevier

Submission Deadline: 31 July 2026

Introduction

Synthetic speech has advanced to a point where its quality and diversity challenge the boundaries of existing evaluation methods. Frameworks such as MOS (Mean Opinion Score) and MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor) were designed to measure transmission quality rather than to assess speech as such; they were never intended to capture the communicative or functional properties of speech when transmission is no longer the limiting factor.

In contemporary systems, performance ought instead to be defined by how well the speech fulfils its intended task, role, or utility. This Special Issue asks how evaluation can be made more responsive to this new landscape — one in which human and synthetic speech can, and should, be assessed by comparable principles tied to task and situation.

Scope & Significance

Much of today's evaluation practice still relies on comparing synthetic speech to static recordings of human voices. Such tests can be useful for measuring surface similarity, but they ignore the dynamic and situational aspects that determine whether speech actually fulfils its purpose.

Human speakers continuously adapt timing, prosody, and style to the communicative setting and to the role or persona they embody. A synthetic voice should be expected to do the same: to use a speaking style suited to the situation or task (audiobook narration, dialogue interaction, public announcement, or personalised replacement voice) and aligned with the intended persona, be that a robot, a disembodied assistant, a child, or an adult.

This Special Issue particularly seeks evaluations that capture situational and functional adequacy rather than limiting comparison to perceived "human-likeness."

List of Topic Areas

The editors invite contributions that reinvent, extend, or refine evaluation practice — including but not limited to studies that:

Propose concrete alternatives to established evaluation paradigms — demonstrating that more informative and diagnostically useful practices are both possible and practicable
Investigate the generalisability of established evaluation schemes across different applications or tasks
Compare various evaluation schemes within a single application domain
Align measurement with real-world use — broadening evaluation perspectives through situated examples from accessibility, education, healthcare, and entertainment
Provide guidance for future research — consolidating lessons into good practices and identifying conceptual and methodological challenges
Transfer or adapt evaluation practices from neighbouring fields such as speech therapy, HCI, or psychology
Evaluate speech synthesis in dialogue and conversational interaction contexts
Assess the situational and functional adequacy of synthetic voices
Address persona and role alignment in speech synthesis evaluation
Evaluate personalised and accessibility-focused speech synthesis
Assess prosody, timing, and style adaptation in synthetic speech
Develop cross-application and cross-task evaluation frameworks for speech synthesis
Examine subjective and objective evaluation methods for modern TTS systems
Evaluate the emotional, expressive, and stylistic dimensions of synthetic speech

Guest Editors

Dr. Sébastien Le Maguer (Executive Guest Editor)
University of Helsinki, Finland
Email: sebastien.lemaguer@helsinki.fi

Prof. Jens Edlund
KTH Royal Institute of Technology, Stockholm, Sweden
Email: edlund@speech.kth.se

Dr. Christina Tånnander
MTM (Swedish Agency for Accessible Media) & KTH Royal Institute of Technology, Sweden
Email: christina.tannander@mtm.se

Prof. Petra Wagner
Bielefeld University, Germany
Email: petra.wagner@uni-bielefeld.de

Key Dates

Submission Opens: 1 December 2025

Manuscript Submission Deadline: 31 July 2026

Editorial Acceptance Deadline: 31 March 2027

Submission Guidelines

All manuscripts should be submitted electronically via Editorial Manager:

https://www.editorialmanager.com/ycsla/default.aspx

When submitting, select Article Type:

"VSI: Speech Eval & Synthesis"

Authors should prepare manuscripts according to the Guide for Authors of Computer Speech & Language available on the journal website. All papers will undergo standard peer review.

For further information, contact the Guest Editors directly.

All submissions must be original and must not be under review elsewhere at the time of submission.

About the Journal

Computer Speech & Language, published by Elsevier, is a leading international peer-reviewed journal with a CiteScore of 12.0 and an Impact Factor of 3.4. It supports open access publishing and is dedicated to advancing research at the intersection of computer science, linguistics, and speech technology. It provides a global platform for interdisciplinary scholarship on speech recognition, speech synthesis, spoken language processing, and human-computer spoken interaction.
