Initial commit: InkFlow, local EPUB-to-audiobook conversion (MLX/Kokoro)

2026-06-21 00:10:11 +02:00
commit d3bb91394b
71 changed files with 8138 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,23 @@
 # Python
 .venv/
 __pycache__/
 *.pyc
 *.egg-info/
 .pytest_cache/
 # InkFlow: generated artifacts and outputs
 data/
 output/
 # Node
 node_modules/
 # Audio samples (large, not versioned)
 samples/
 # Models / HF caches (in case they are downloaded locally)
 .cache/
 models/
 # OS
 .DS_Store
--- a/.idea/.gitignore
+++ b/.idea/.gitignore
@@ -0,0 +1,10 @@
 # Default ignored files
 /shelf/
 /workspace.xml
 # Editor-based HTTP Client requests
 /httpRequests/
 # Ignored default folder with query files
 /queries/
 # Datasource local storage ignored files
 /dataSources/
 /dataSources.local.xml
--- a/README.md
+++ b/README.md
@@ -0,0 +1,105 @@
 # InkFlow
 Turns an **EPUB** into an **audiobook**, 100% locally on a Mac (Apple Silicon / MLX),
 using open-source models. Output: **one folder per book, one MP3 per chapter**
 (ID3 tags + cover), laid out like a conventional audiobook.
 - **Text analysis**: Gemma via `mlx-lm` (narration/dialogue segmentation,
  speaker attribution, cast extraction, pronunciations).
 - **Speech synthesis**: pluggable backend:
  - **Kokoro**: fast, preset voices; suited to previews and single-narrator renders.
  - **Qwen3-TTS**: higher quality plus cloning from reference audio; suited to final renders with per-character casting.
 - **Language**: optimized for French first, multilingual afterwards.
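The text-analysis step depends on the model answering with JSON even when it adds chatter around it. The sketch below illustrates tolerant extraction with `json.JSONDecoder.raw_decode`, in the spirit of `_extract_json` in `analysis/gemma.py`; the function name `first_json` and the sample reply are illustrative assumptions, not part of InkFlow.

```python
import json
from typing import Any


def first_json(text: str) -> Any:
    """Return the first JSON object/array found in a free-form reply.

    Hypothetical helper: tries every '[' or '{' as a possible start and
    lets raw_decode stop at the end of the first complete JSON value.
    """
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "[{":
            try:
                obj, _end = decoder.raw_decode(text, i)
                return obj
            except json.JSONDecodeError:
                continue  # not valid JSON from here, try the next opener
    raise ValueError("no JSON found in the reply")


# Sample reply with text before and after the JSON payload (made up).
reply = 'Voici le casting : [{"name": "Holden", "gender": "male"}] Bonne lecture !'
cast = first_json(reply)
print(cast[0]["name"])
```

Scanning for an opener and decoding from there avoids a greedy regex, which would glue two separate JSON blocks together.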
 ## Prerequisites
 - macOS Apple Silicon (arm64), Python ≥ 3.11
 - `ffmpeg` and `espeak-ng`:
  ```bash
  brew install ffmpeg espeak-ng
  ```
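Kokoro's French voice needs the `espeak-ng` shared library to be discoverable at run time. The sketch below shows one way such a lookup could work. It is not InkFlow's actual lookup code: the function name and the candidate paths (the usual Homebrew prefixes) are assumptions, and `PHONEMIZER_ESPEAK_LIBRARY` is honored as an explicit override.

```python
import os
from ctypes.util import find_library
from pathlib import Path
from typing import Iterable, Optional

# Assumed Homebrew locations: Apple Silicon prefix first, then Intel prefix.
CANDIDATE_PATHS = (
    Path("/opt/homebrew/lib/libespeak-ng.dylib"),
    Path("/usr/local/lib/libespeak-ng.dylib"),
)


def find_espeak_library(candidates: Iterable[Path] = CANDIDATE_PATHS) -> Optional[str]:
    """Return a path (or library name) for libespeak-ng, or None if not found."""
    explicit = os.environ.get("PHONEMIZER_ESPEAK_LIBRARY")
    if explicit:
        return explicit  # the user override always wins
    for path in candidates:
        if path.exists():
            return str(path)
    # Last resort: ask the platform's library search.
    return find_library("espeak-ng")


if __name__ == "__main__":
    lib = find_espeak_library()
    if lib:
        os.environ.setdefault("PHONEMIZER_ESPEAK_LIBRARY", lib)
    print(lib or "libespeak-ng not found; try: brew install espeak-ng")
```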
 ## Installation
 ```bash
 python3.13 -m venv .venv
 source .venv/bin/activate
 pip install -e backend            # installs inkflow + dependencies
 python backend/scripts/setup_models.py   # checks the environment + downloads the MLX models
 ```
 > Kokoro in French requires `espeak-ng`; InkFlow locates
 > `libespeak-ng.dylib` automatically (otherwise, export `PHONEMIZER_ESPEAK_LIBRARY`).
 ## Usage (CLI)
 ```bash
 # 1. Parse the EPUB -> data/<slug>/book.json + chapters/chNN.json
 inkflow parse "samples/Colère de Tiamat, La - James S.A. Corey.epub"
 # 2. Analyze (Gemma) -> analysis/chNN.json + cast.json
 inkflow analyze la-colere-de-tiamat --chapter 5      # a single chapter
 inkflow analyze la-colere-de-tiamat                  # all chapters
 # 3. Synthesize a chapter -> output/<book>/NN-....mp3
 inkflow render la-colere-de-tiamat 5 --backend kokoro          # fast
 inkflow render la-colere-de-tiamat 5 --backend qwen3 --no-mono # quality + multi-voice (M3)
 # Info
 inkflow info la-colere-de-tiamat
 ```
 (Without the `-e` install, run from `backend/` via `python -m inkflow.cli …`.)
 ## Web interface
 ```bash
 # 1. Build the frontend (once)
 cd frontend && npm install && npm run build && cd ..
 # 2. Start the app (API + UI served on the same port)
 inkflow serve            # http://127.0.0.1:8000
 ```
 The UI supports drag-and-drop EPUB import, real-time progress tracking of each
 step (WebSocket), cast editing (character → voice, with preview), editing of the
 pronunciation dictionary, engine selection (Kokoro/Qwen3), and chapter
 rendering with an audio player and download.
 For frontend development with hot reload:
 ```bash
 inkflow serve            # backend on :8000
 cd frontend && npm run dev   # UI on :5173 (proxies API/WS to :8000)
 ```
 ## Architecture
 ```
 backend/inkflow/
  epub/parser.py        EPUB -> book.json + per-chapter text
  analysis/gemma.py     mlx-lm wrapper (Gemma)
  analysis/segmenter.py narration/dialogue + speakers + casting
  analysis/pronunciation.py
  tts/base.py           TTSBackend interface + VoiceSpec
  tts/kokoro.py tts/qwen3.py tts/factory.py
  audio/postprocess.py  concat + normalization + MP3 (ffmpeg) + cover
  pipeline/render.py    (segments + voices) -> MP3
  store/artifacts.py    JSON persistence (resumable)
 data/<slug>/            intermediate artifacts (json, wav, cover)
 output/<book>/          final MP3s (one per chapter)
 voicebank/              reference clips for cloning (M3)
 ```
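To make the pluggable-backend idea concrete, here is a minimal sketch of how a `TTSBackend` interface, a `VoiceSpec` and a factory could fit together. Only the names `TTSBackend` and `VoiceSpec` come from the tree above. The fields, the `synthesize` signature, `SilentBackend` and `get_backend` are assumptions made for illustration, and the stand-in backend returns placeholder bytes instead of audio.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Protocol


@dataclass(frozen=True)
class VoiceSpec:
    """Hypothetical voice description (the real fields are not shown here)."""
    voice_id: str
    reference_audio: Optional[str] = None  # assumed: clip path used for cloning


class TTSBackend(Protocol):
    """Assumed shape of the interface: one method turning text into audio bytes."""
    name: str

    def synthesize(self, text: str, voice: VoiceSpec) -> bytes: ...


class SilentBackend:
    """Stand-in backend: emits placeholder bytes so the sketch runs anywhere."""

    def __init__(self, name: str) -> None:
        self.name = name

    def synthesize(self, text: str, voice: VoiceSpec) -> bytes:
        return f"{self.name}:{voice.voice_id}:{text}".encode("utf-8")


# Hypothetical registry keyed by the CLI's --backend values.
_REGISTRY: Dict[str, Callable[[], TTSBackend]] = {
    "kokoro": lambda: SilentBackend("kokoro"),
    "qwen3": lambda: SilentBackend("qwen3"),
}


def get_backend(name: str) -> TTSBackend:
    """Instantiate a backend by name, failing loudly on unknown names."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown TTS backend: {name!r}") from None


backend = get_backend("kokoro")
audio = backend.synthesize("Bonjour", VoiceSpec(voice_id="narrator"))
```

Keeping the render pipeline dependent only on the interface is what would allow drafts on one engine and final renders on another without touching the rest of the code.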
 ## Status
 - [x] **M1**: EPUB parsing, Gemma analysis (segments + casting), CLI.
 - [x] **M2**: end-to-end TTS (Kokoro/Qwen3), single narrator → tagged MP3 + cover.
 - [x] **M3**: multi-voice: voice bank + automatic character → voice casting (Qwen3 cloning).
 - [x] **M4**: web interface (FastAPI + WebSocket + React): progress tracking, casting/pronunciation editors, previews.
 - [x] **M5**: resumable state (reconciliation with the artifacts), batch runs via UI/CLI.
 ### Notes on the engines
 - **Kokoro**: ~30 s per chapter, voices distinguished by timbre (fast rendering, drafts).
 - **Qwen3-TTS**: clones voice-bank voices per character, higher quality,
  markedly slower; reserved for the final render. Every render is **resumable** chapter
  by chapter (re-running does not regenerate MP3s that already exist).
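The resumable behaviour described above can be pictured as "skip any chapter whose MP3 already exists". The sketch below is an illustration under stated assumptions, not the code in `pipeline/render.py`: the function name, the `NN.mp3` naming and the `render_one` callback are hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def render_missing_chapters(
    chapters: Iterable[int],
    out_dir: Path,
    render_one: Callable[[int, Path], None],
) -> List[int]:
    """Render only the chapters that have no finished MP3 yet.

    `render_one(chapter, path)` is a hypothetical callback that writes the
    audio to `path`. Returns the chapters actually rendered in this run.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    rendered: List[int] = []
    for n in chapters:
        target = out_dir / f"{n:02d}.mp3"  # assumed naming scheme
        if target.exists() and target.stat().st_size > 0:
            continue  # already produced by an earlier run
        # Write to a temporary name, then rename: an interrupted run never
        # leaves a half-written file that looks finished.
        tmp = target.with_name(target.name + ".part")
        render_one(n, tmp)
        tmp.replace(target)
        rendered.append(n)
    return rendered
```

Calling it twice with the same chapter list renders nothing the second time, which matters most with the slower engine.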
--- a/backend/inkflow/__init__.py
+++ b/backend/inkflow/__init__.py
--- a/backend/inkflow/analysis/__init__.py
+++ b/backend/inkflow/analysis/__init__.py
--- a/backend/inkflow/analysis/gemma.py
+++ b/backend/inkflow/analysis/gemma.py
@@ -0,0 +1,123 @@
 """mlx-lm wrapper around Gemma for text analysis.
 Loads the model lazily (once per process) and exposes generation helpers,
 including a tolerant `generate_json` that extracts the first valid JSON
 object/array from the model output.
 """
 from __future__ import annotations
 import json
 import re
 from functools import lru_cache
 from typing import Any, Optional
 from ..settings import get_settings
 # Bounds of a JSON block inside a potentially chatty response.
 _JSON_SPAN_RE = re.compile(r"(\{.*\}|\[.*\])", re.DOTALL)
 _FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)
 @lru_cache(maxsize=2)
 def _load(model_id: str):
    # Lazy import: avoids loading mlx until an analysis actually runs.
    from mlx_lm import load
    return load(model_id)
 class Gemma:
    """Small facade around mlx-lm for driving Gemma."""
    def __init__(self, model_id: Optional[str] = None):
        self.model_id = model_id or get_settings().gemma_model
        self._model = None
        self._tokenizer = None
    def _ensure_loaded(self) -> None:
        if self._model is None:
            self._model, self._tokenizer = _load(self.model_id)
    def generate(
        self,
        prompt: str,
        *,
        system: Optional[str] = None,
        max_tokens: Optional[int] = None,
        temperature: Optional[float] = None,
    ) -> str:
        """Generate a text response from a prompt (chat template).
        `max_tokens`/`temperature` left unset -> values from the current settings.
        """
        self._ensure_loaded()
        settings = get_settings()
        if max_tokens is None:
            max_tokens = settings.gemma_max_tokens
        if temperature is None:
            temperature = settings.gemma_temperature
        from mlx_lm import generate
        from mlx_lm.sample_utils import make_sampler
        messages = []
        if system:
            messages.append({"role": "system", "content": system})
        messages.append({"role": "user", "content": prompt})
        formatted = self._tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, tokenize=False
        )
        sampler = make_sampler(temp=temperature)
        return generate(
            self._model,
            self._tokenizer,
            prompt=formatted,
            max_tokens=max_tokens,
            sampler=sampler,
            verbose=False,
        )
    def generate_json(
        self,
        prompt: str,
        *,
        system: Optional[str] = None,
        max_tokens: Optional[int] = None,
        temperature: Optional[float] = None,
        retries: int = 1,
    ) -> Any:
        """Generate, then parse, a JSON value. Retries when parsing fails.
        Retries run at temperature 0.0.
        `max_tokens`/`temperature` left unset -> values from the current settings.
        """
        last_err: Optional[Exception] = None
        for attempt in range(retries + 1):
            raw = self.generate(
                prompt, system=system, max_tokens=max_tokens,
                temperature=temperature if attempt == 0 else 0.0,
            )
            try:
                return _extract_json(raw)
            except Exception as exc:  # noqa: BLE001
                last_err = exc
        raise ValueError(
            f"Invalid JSON response after {retries + 1} attempts: {last_err}"
        ) from last_err
 def _extract_json(text: str) -> Any:
    """Extract the first JSON object/array from a free-form model response.
    Tolerates stray text before/after (including a second block) thanks to
    raw_decode, which stops at the first complete JSON value.
    """
    text = text.strip()
    fence = _FENCE_RE.search(text)
    if fence:
        text = fence.group(1).strip()
    decoder = json.JSONDecoder()
    # Try each JSON structure opener in turn and decode from there.
    for i, ch in enumerate(text):
        if ch in "[{":
            try:
                obj, _ = decoder.raw_decode(text[i:])
                return obj
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON found in the response")
--- a/backend/inkflow/analysis/pronunciation.py
+++ b/backend/inkflow/analysis/pronunciation.py
@@ -0,0 +1,59 @@
 """Pronunciation dictionary: applying entries + proposing candidates.
 Applying is a plain surface rewrite of the text (guided spelling) before
 synthesis. Candidates (proper nouns, SF terms) can be proposed by Gemma and
 then validated by the user in the UI.
 """
 from __future__ import annotations
 import re
 from typing import Iterable
 from ..models import Pronunciation, PronunciationEntry
 from ..settings import get_settings
 from .gemma import Gemma
 def apply_pronunciation(text: str, pron: Pronunciation) -> str:
    """Replace each enabled term with its phonetic spelling (whole word)."""
    for entry in pron.entries:
        if not entry.enabled or not entry.term:
            continue
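        pattern = re.compile(rf"\b{re.escape(entry.term)}\b")
        # Function replacement: inserts the text literally, so a backslash or
        # "\1" in the replacement is not interpreted as a regex escape.
        text = pattern.sub(lambda _m, repl=entry.replacement: repl, text)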
    return text
 # The system prompt is editable in the settings (settings.prompt_pronunciation).
 def propose_pronunciations(text: str, gemma: Gemma, *, max_chars: int = 16000) -> list[PronunciationEntry]:
    """Propose pronunciation candidates for the user to validate."""
    sample = text[:max_chars]
    prompt = (
        "Repere dans cet extrait les mots a risque de mauvaise prononciation par "
        "une voix de synthese francaise. Pour chacun, propose une graphie "
        "phonetique francaise (replacement) qui guide la prononciation.\n\n"
        f"EXTRAIT:\n{sample}\n\n"
        'Reponds par un tableau JSON: '
        '[{"term":"Tiamat","replacement":"Tia-matt","note":"nom propre"}]'
    )
    result = gemma.generate_json(prompt, system=get_settings().prompt_pronunciation)
    entries: list[PronunciationEntry] = []
    for item in result:
        if isinstance(item, dict) and item.get("term") and item.get("replacement"):
            entries.append(PronunciationEntry(
                term=str(item["term"]).strip(),
                replacement=str(item["replacement"]).strip(),
                note=item.get("note"),
            ))
    return entries
 def merge_pronunciations(
    existing: Pronunciation, new: Iterable[PronunciationEntry]
 ) -> Pronunciation:
    by_term = {e.term.lower(): e for e in existing.entries}
    for e in new:
        by_term.setdefault(e.term.lower(), e)
    return Pronunciation(entries=list(by_term.values()))
--- a/backend/inkflow/analysis/segmenter.py
+++ b/backend/inkflow/analysis/segmenter.py
@@ -0,0 +1,622 @@
 """Narration/dialogue segmentation + speaker attribution + casting.
 Hybrid approach:
 1. Deterministic pre-segmentation at paragraph level (French punctuation
   rules: a paragraph that starts with an em dash "—" is a line of dialogue).
 2. Gemma assigns a speaker to each line of dialogue, in batched calls per
   chapter (numbered list + context), and extracts the cast (characters + attributes).
 Fine-grained splitting of incises ("..., dit-il") is left to a later pass;
 in v1 the whole line is spoken in the character's voice.
 """
 from __future__ import annotations
 import re
 from typing import Optional
 from ..models import (
    Cast,
    Chapter,
    ChapterAnalysis,
    ChapterText,
    Character,
    Incise,
    Segment,
    SegmentType,
 )
 from ..settings import get_settings
 from .gemma import Gemma
 # A dialogue paragraph starts with an em dash (U+2014) or a long dash.
 _DIALOGUE_LEAD_RE = re.compile(r"^\s*[—―]\s*")
 # --- Incise detection (French verb-subject inversion) -------------------------
 # An incise (reporting clause) is a narration fragment embedded in a line of
 # dialogue ("..., dit-il."). tu/nous/vous are excluded (imperatives and
 # questions such as "Donne-le-moi", "Crois-tu ?") to limit false positives.
 # See `detect_incises` below for the two passes (verb-pronoun inversion +
 # nominal "lanca Drummer", cast-aware).
 _INCISE_PRON = r"(?:il|elle|on|ils|elles|je)"
 # Speech verb, optionally reflexive ("s'ecria", "s'exclama").
 _INCISE_VERB = r"(?:[A-Za-zÀ-ÿ]+['’])?[A-Za-zÀ-ÿ]{2,}"
 def segment_chapter_text(ct: ChapterText) -> list[Segment]:
    """Split a chapter into narration/dialogue segments (rules only)."""
    segments: list[Segment] = []
    for para in ct.paragraphs:
        if _DIALOGUE_LEAD_RE.match(para):
            text = _DIALOGUE_LEAD_RE.sub("", para).strip()
            segments.append(Segment(
                type=SegmentType.DIALOGUE, text=text, speaker="?"))
        else:
            segments.append(Segment(
                type=SegmentType.NARRATION, text=para, speaker="narrateur"))
    return segments
 # --- Speaker attribution (Gemma) --------------------------------------------
 # The system prompt is editable in the settings (settings.prompt_speakers).
 _UNKNOWN = {"", "?", "inconnu", "narrateur"}
 _CTX_CHARS = 160          # truncation of the narrative context before/after
 _CHUNK_MAX_DIALOGUES = 30  # lines of dialogue per call (model reliability)
 def attribute_speakers(
    segments: list[Segment],
    gemma: Gemma,
    *,
    characters: Optional[list[Character]] = None,
    pov: Optional[str] = None,
 ) -> dict[int, str]:
    """Set `speaker` on every line of dialogue (mutates in place).
    Gives the model the enriched canonical character list (name, gender,
    description) and, for each line, the narrative context BEFORE and AFTER it
    (the attributing incise often comes after: "— Bonjour. dit Marie.").
    Returns a map {segment_index: confidence} ("high"/"medium"/"low"), kept
    in memory (not persisted) to drive the retroactive second pass.
    A line whose returned name is outside the supplied list is kept but
    marked "low" so that it gets re-examined.
    """
    dialogues = [(i, s) for i, s in enumerate(segments)
                 if s.type is SegmentType.DIALOGUE]
    if not dialogues:
        return {}
    # Lines already resolved (seeded by an incise): shown as fixed context,
    # never re-submitted to the model. If everything is resolved, nothing to do.
    locked = {i for i, s in dialogues if _is_resolved(s.speaker)}
    if len(locked) == len(dialogues):
        return {i: "high" for i, _ in dialogues}
    hint = _speakers_hint(characters, pov)
    valid = {c.name.strip().lower() for c in (characters or [])}
    confidence: dict[int, str] = {}
    for chunk in _chunk_dialogues(dialogues, segments, hint):
        prompt = (
            "Voici les repliques de dialogue d'un extrait, numerotees, avec la "
            "narration qui precede et qui suit chaque replique. Les repliques "
            "deja attribuees affichent (locuteur: X) : ne les modifie pas, "
            "sers-t'en comme contexte (alternance des tours). Pour les AUTRES, "
            "indique le personnage qui parle (recopie son nom depuis la liste "
            "fournie ; 'inconnu' si vraiment indeterminable) et ta confiance "
            "(high/medium/low)."
            f"{hint}\n\n" + "\n".join(line for _, line in chunk) +
            '\n\nReponds par un tableau JSON: '
            '[{"i": 0, "speaker": "Holden", "confidence": "high"}, ...]'
        )
        result = gemma.generate_json(prompt, system=get_settings().prompt_speakers)
        by_i: dict[int, dict] = {item["i"]: item for item in result
                                 if isinstance(item, dict) and "i" in item}
        for j, (seg_idx, _line) in enumerate(chunk):
            if seg_idx in locked:          # seed kept as-is
                confidence[seg_idx] = "high"
                continue
            seg = segments[seg_idx]
            item = by_i.get(j) or {}
            speaker = (str(item.get("speaker") or "inconnu").strip()
                       or "inconnu")
            conf = str(item.get("confidence") or "low").strip().lower()
            if conf not in {"high", "medium", "low"}:
                conf = "low"
            # Name outside the known list -> keep the name but flag it for review.
            if (valid and speaker.lower() not in _UNKNOWN
                    and speaker.lower() not in valid):
                conf = "low"
            seg.speaker = speaker
            confidence[seg_idx] = conf
    return confidence
 def _speakers_hint(characters: Optional[list[Character]], pov: Optional[str]) -> str:
    hint = ""
    if characters:
        lines = []
        for c in characters:
            attrs = c.gender or ""
            desc = f" — {c.description}" if c.description else ""
            lines.append(f"- {c.name}" + (f" ({attrs})" if attrs else "") + desc)
        hint += "\nPersonnages du chapitre:\n" + "\n".join(lines)
    if pov:
        hint += f"\nLe point de vue de ce chapitre est: {pov}."
    return hint
 def _is_resolved(speaker: str) -> bool:
    """True if the line already has a reliable speaker (incise seed, etc.)."""
    return (speaker or "").strip().lower() not in _UNKNOWN
 def _dialogue_line(n: int, segments: list[Segment], idx: int) -> str:
    seg = segments[idx]
    # Line already resolved (e.g. seeded by an incise) -> shown as fixed context.
    if _is_resolved(seg.speaker):
        return f"[{n}] (locuteur: {seg.speaker}) REPLIQUE: {seg.text!r}"
    before = _adjacent_narration(segments, idx, -1)
    after = _adjacent_narration(segments, idx, +1)
    parts = [f"[{n}]"]
    if before:
        parts.append(f"(avant: {before!r})")
    parts.append(f"REPLIQUE: {seg.text!r}")
    if after:
        parts.append(f"(apres: {after!r})")
    return " ".join(parts)
 def _adjacent_narration(segments: list[Segment], idx: int, direction: int) -> str:
    """Text of the immediately adjacent narration (attributing incise)."""
    j = idx + direction
    if 0 <= j < len(segments) and segments[j].type is SegmentType.NARRATION:
        return segments[j].text[:_CTX_CHARS]
    return ""
 def _chunk_dialogues(
    dialogues: list[tuple[int, Segment]],
    segments: list[Segment],
    hint: str,
 ) -> list[list[tuple[int, str]]]:
    """Split the lines of dialogue into batches that fit under `_MAX_PROMPT_CHARS`.
    Each batch is a list of (segment_index, rendered_line); lines are numbered
    locally (0..k) for the prompt, while the segment index maps results back.
    Each batch is also capped at `_CHUNK_MAX_DIALOGUES` lines. This avoids
    blunt truncation on long chapters.
    """
    budget = _MAX_PROMPT_CHARS - len(hint) - 400  # headroom for the instructions
    chunks: list[list[tuple[int, str]]] = []
    current: list[tuple[int, str]] = []
    size = 0
    for idx, _seg in dialogues:
        line = _dialogue_line(len(current), segments, idx)
        if current and (size + len(line) > budget
                        or len(current) >= _CHUNK_MAX_DIALOGUES):
            chunks.append(current)
            current = []
            size = 0
            line = _dialogue_line(0, segments, idx)
        current.append((idx, line))
        size += len(line) + 1
    if current:
        chunks.append(current)
    return chunks
 # --- Retroactive pass: re-resolving undetermined lines ----------------------
 # The system prompt is editable (settings.prompt_speakers_refine).
 def _refine_unknown_speakers(
    segments: list[Segment],
    gemma: Gemma,
    *,
    characters: Optional[list[Character]] = None,
    confidence: dict[int, str],
 ) -> None:
    """Second pass: re-resolve lines left undetermined or low-confidence.
    Each doubtful line is shown with its neighbouring lines of dialogue that
    are ALREADY attributed (turn-taking) and its narrative context, so as to use
    information from the lines that *follow*. Mutates in place; no Gemma call
    is made when nothing is doubtful.
    """
    dialogues = [(i, s) for i, s in enumerate(segments)
                 if s.type is SegmentType.DIALOGUE]
    if not dialogues:
        return
    pos = {seg_idx: n for n, (seg_idx, _s) in enumerate(dialogues)}
    # Reuse _is_resolved so an empty or missing speaker is handled the same
    # way as in the first pass.
    doubtful = [seg_idx for seg_idx, seg in dialogues
                if not _is_resolved(seg.speaker)
                or confidence.get(seg_idx) == "low"]
    if not doubtful:
        return
    hint = _speakers_hint(characters, pov=None)
    lines = []
    for j, seg_idx in enumerate(doubtful):
        n = pos[seg_idx]
        ctx = []
        if n > 0:
            prev_idx = dialogues[n - 1][0]
            ctx.append(f"replique precedente (dite par "
                       f"{segments[prev_idx].speaker}): "
                       f"{segments[prev_idx].text[:_CTX_CHARS]!r}")
        before = _adjacent_narration(segments, seg_idx, -1)
        if before:
            ctx.append(f"narration avant: {before!r}")
        after = _adjacent_narration(segments, seg_idx, +1)
        if after:
            ctx.append(f"narration apres: {after!r}")
        if n < len(dialogues) - 1:
            next_idx = dialogues[n + 1][0]
            ctx.append(f"replique suivante (dite par "
                       f"{segments[next_idx].speaker}): "
                       f"{segments[next_idx].text[:_CTX_CHARS]!r}")
        ctx_str = (" [" + " ; ".join(ctx) + "]") if ctx else ""
        lines.append(f"[{j}]{ctx_str} REPLIQUE: {segments[seg_idx].text!r}")
    prompt = (
        "Repliques au locuteur indetermine. Pour chacune, en t'appuyant sur les "
        "repliques voisines DEJA attribuees (alternance des tours) et le "
        "contexte, indique qui parle (recopie le nom depuis la liste ; "
        "'inconnu' si toujours indeterminable)."
        f"{hint}\n\n" + "\n".join(lines) +
        '\n\nReponds par un tableau JSON: [{"i": 0, "speaker": "Holden"}, ...]'
    )
    result = gemma.generate_json(_truncate(prompt),
                                 system=get_settings().prompt_speakers_refine)
    by_i = {item["i"]: item.get("speaker") for item in result
            if isinstance(item, dict) and "i" in item}
    for j, seg_idx in enumerate(doubtful):
        new = (str(by_i.get(j) or "").strip())
        if new and new.lower() not in _UNKNOWN:
            segments[seg_idx].speaker = new
 # --- Cast extraction (Gemma) ------------------------------------------------
 # The system prompt is editable in the settings (settings.prompt_characters).
 def extract_characters(text: str, gemma: Gemma) -> list[Character]:
    """Extract the characters and their attributes (gender, age) from a text."""
    prompt = (
        "A partir de l'extrait suivant, liste les personnages qui parlent ou "
        "sont nommes. Pour chacun, donne: name (nom court canonique), gender "
        "(male/female/unknown), age (child/young/adult/old/unknown), et une "
        "courte description. Ignore les figurants sans nom.\n\n"
        f"EXTRAIT:\n{_truncate(text)}\n\n"
        'Reponds par un tableau JSON: '
        '[{"name":"Holden","gender":"male","age":"adult","description":"..."}]'
    )
    result = gemma.generate_json(prompt, system=get_settings().prompt_characters)
    characters: list[Character] = []
    for item in result:
        if not isinstance(item, dict) or not item.get("name"):
            continue
        characters.append(Character(
            name=str(item["name"]).strip(),
            gender=_norm(item.get("gender")),
            age=_norm(item.get("age")),
            description=(item.get("description") or None),
        ))
    return characters
 def merge_characters(existing: list[Character], new: list[Character]) -> list[Character]:
    """Merge two character lists by name (case-insensitive)."""
    by_key = {c.name.lower(): c for c in existing}
    for c in new:
        key = c.name.lower()
        if key in by_key:
            cur = by_key[key]
            cur.gender = cur.gender or c.gender
            cur.age = cur.age or c.age
            cur.description = cur.description or c.description
        else:
            by_key[key] = c
    return list(by_key.values())
 def _norm(value) -> Optional[str]:
    if not value:
        return None
    v = str(value).strip().lower()
    return v if v and v != "unknown" else None
 # --- Helpers -----------------------------------------------------------------
 # Context guard (in characters) to stay within a reasonable window.
 _MAX_PROMPT_CHARS = 24000
 def _truncate(text: str) -> str:
    return text if len(text) <= _MAX_PROMPT_CHARS else text[:_MAX_PROMPT_CHARS]
 # --- Incise detection (deterministic, cast-aware) ----------------------------
 # Incises are annotated as bounds (offsets) on the persisted line of dialogue
 # (non-destructive); at render time they are spoken by the narrator's voice.
 # Two complementary passes:
 #   1. verb-pronoun inversion ("dit-il", "coupa-t-elle");
 #   2. nominal: speech verb + known subject (a cast name OR a role noun,
 #      e.g. "compatit Holden", "lanca Drummer", "informa le soldat").
 # The nominal pass relies on the character list, which keeps false positives
 # low and lets us extract the explicit speaker (seeding the attribution).
 # Optional object pronoun before the verb ("lui demanda un garde").
 _CLITIC = r"(?:lui|leur|nous|vous|me|te|se|y|en|[mts]['’])"
 # Conjugated forms of speech verbs (3rd person; simple past / present /
 # imperfect). Curated list: better to miss an incise than to invent one.
 _SPEECH_VERBS = {
    "dit", "disait", "redit", "répondit", "repondit", "répond", "repond",
    "répondait", "repondait", "demanda", "demandait", "demande", "interrogea",
    "questionna", "ecria", "écria", "exclama", "enquit", "lança", "lanca",
    "lançait", "lance", "murmura", "chuchota", "souffla", "soupira", "ajouta",
    "ajoute", "reprit", "poursuivit", "poursuit", "continua", "enchaîna",
    "enchaina", "fit", "faisait", "remarqua", "observa", "nota", "déclara",
    "declara", "affirma", "assura", "rétorqua", "retorqua", "répliqua",
    "repliqua", "riposta", "objecta", "protesta", "insista", "renchérit",
    "rencherit", "acquiesça", "acquiesca", "admit", "avoua", "convint",
    "concéda", "conceda", "rectifia", "corrigea", "précisa", "precisa",
    "expliqua", "raconta", "annonça", "annonca", "proclama", "ordonna",
    "commanda", "supplia", "implora", "gémit", "gemit", "grogna", "ronchonna",
    "maugréa", "maugrea", "marmonna", "glissa", "lâcha", "lacha", "coupa",
    "interrompit", "conclut", "compléta", "completa", "suggéra", "suggera",
    "proposa", "promit", "jura", "menaça", "menaca", "ironisa", "plaisanta",
    "railla", "cria", "hurla", "tonna", "gronda", "rugit", "susurra",
    "compatit", "salua", "appela", "héla", "hela", "interpella", "balbutia",
    "bredouilla", "bafouilla", "gloussa", "ricana", "siffla", "tempêta",
    "tempeta", "rétorque", "lâche", "informa", "renseigna", "indiqua",
    "rappela", "avertit", "prévint", "prevint", "intima", "rétorquait",
    "lançait", "questionnait", "reconnut", "constata", "répéta", "repeta",
 }
 # Role nouns that can be the subject of an incise ("informa le soldat").
 _ROLE_NOUNS = {
    "garde", "soldat", "sentinelle", "gardien", "prêtre", "pretre", "homme",
    "femme", "fille", "garçon", "garcon", "vieille", "vieillard", "capitaine",
    "lieutenant", "sergent", "général", "general", "amiral", "officier", "voix",
    "inconnu", "inconnue", "étranger", "etranger", "enfant", "serviteur",
    "servante", "messager", "domestique", "médecin", "medecin",
 }
 # Stop words ignored when indexing the tokens of a character name.
 _NAME_STOP = {
    "le", "la", "les", "un", "une", "de", "du", "des", "monsieur", "madame",
    "mademoiselle", "m", "mme", "mlle", "mr", "dr", "docteur", "saint", "sainte",
 }
 # Punctuation that ends the spoken part: if the incise follows it, the rest of
 # the line is all narration (the speech is over). After a plain comma, by
 # contrast, the dialogue resumes after the incise.
 _SENTENCE_FINAL = {"", ".", "!", "?", "…"}
 def _incise_end(text: str, close_end: int, lead: str) -> int:
    """Effective end of the incise: the end of the line if the speech was
    already closed on the left (`.`/`!`/`?`/`…` or start of line), otherwise the closing mark."""
    return len(text) if lead in _SENTENCE_FINAL else close_end
 # Pass 1: verb-(t-)pronoun inversion, anchored on punctuation to the left
 # (comma, period, ?, !, …) or on the start of the line.
 _INVERSION_RE = re.compile(
    r"(?P<lead>[,.!?…]|^)\s*"
    r"(?P<inc>" + _INCISE_VERB + r"-(?:t-)?" + _INCISE_PRON +
    r"(?:\s+[^.!?…»\",;]*?)?)"          # optional complements ("dit-il en souriant")
    r"(?P<close>[.!?…,])",              # closing mark: strong punctuation OR comma
    re.IGNORECASE,
 )
 def _inversion_spans(text: str) -> list[tuple[int, int]]:
    return [(m.start("inc"), _incise_end(text, m.end("close"), m.group("lead")))
            for m in _INVERSION_RE.finditer(text)]
 def _name_token_index(names) -> dict[str, str]:
    """Token -> canonical name index (distinctive tokens only).
    A token shared by several characters is ambiguous and dropped.
    """
    idx: dict[str, str] = {}
    ambiguous: set[str] = set()
    for name in names or ():
        for tok in re.split(r"[^\wÀ-ÿ]+", name):
            t = tok.lower()
            if len(t) < 2 or t in _NAME_STOP:
                continue
            if t in idx and idx[t] != name:
                ambiguous.add(t)
            else:
                idx[t] = name
    for t in ambiguous:
        idx.pop(t, None)
    return idx
 # Proper noun: leading capital (case-sensitive pattern).
 _PROPER = r"[A-ZÀ-Ÿ][\wÀ-ÿ’'\-]+"
 _REJECT = object()  # the matched word is not a valid subject -> not an incise
 def _classify_subject(subj: str, idx: dict[str, str]):
    """Speaker carried by the subject of a nominal incise.
    - known character -> canonical name;
    - unknown proper noun (capitalized) -> surface form (seed anyway: the
      text names them, regardless of how reliable the extraction was);
    - generic role noun ("le soldat") -> None (a real incise, no seed);
    - any other word -> _REJECT (not an incise).
    """
    low = subj.lower()
    if low in idx:
        return idx[low]
    if low in _ROLE_NOUNS:
        return None
    if subj[:1].isupper() and len(low) >= 2 and low not in _NAME_STOP:
        return subj.strip("’'")
    return _REJECT
 def _nominal_matches(text: str, names) -> list[tuple[int, int, Optional[str]]]:
    """Pass 2: (start, end, speaker) for each nominal incise.
    A nominal incise = speech verb + subject (cast name, proper noun, or
    role noun). A proper-noun subject is seeded even when absent from the cast.
    """
    idx = _name_token_index(names)
    literals = sorted(set(idx) | _ROLE_NOUNS, key=len, reverse=True)
    lit_alt = "|".join(re.escape(s) for s in literals)
    # Subject: known name/role (case-insensitive) OR proper noun (capitalized,
    # case-sensitive so as not to swallow a determiner "un"/"le"). No global IGNORECASE.
    subj_alt = (f"(?i:{lit_alt})|{_PROPER}") if lit_alt else _PROPER
    verbs = "|".join(re.escape(v) for v in sorted(_SPEECH_VERBS, key=len, reverse=True))
    pat = re.compile(
        r"(?P<lead>[,.!?…]|^)\s*"
        r"(?P<inc>(?:(?i:" + _CLITIC + r")\s+)?"
        r"(?i:" + verbs + r")\b"
        r"[^.!?…»\",;]{0,40}?\b"
        r"(?P<subj>" + subj_alt + r")\b"
        r"[^.!?…»\",;]*?)"
        r"(?P<close>[.!?…,])",
    )
    out: list[tuple[int, int, Optional[str]]] = []
    for m in pat.finditer(text):
        spk = _classify_subject(m.group("subj"), idx)
        if spk is _REJECT:
            continue
        out.append((m.start("inc"),
                    _incise_end(text, m.end("close"), m.group("lead")), spk))
    return out
 def _merge_spans(spans: list[tuple[int, int]]) -> list[Incise]:
    """Sort and merge (without overlap) a list of bounds -> Incise."""
    out: list[Incise] = []
    last_end = -1
    for s, e in sorted(set(spans)):
        if s < last_end:          # overlap -> keep the first one seen
            continue
        out.append(Incise(start=s, end=e))
        last_end = e
    return out
 def detect_incises(text: str, *, names=None) -> list[Incise]:
    """Incise bounds within a line of dialogue (inversion + cast-aware nominal)."""
    spans = _inversion_spans(text)
    spans += [(s, e) for s, e, _ in _nominal_matches(text, names or set())]
    return _merge_spans(spans)
 def incise_speaker(text: str, incise: Incise, names) -> Optional[str]:
    """Explicit speaker carried by a nominal incise ("compatit Holden")."""
    for s, e, spk in _nominal_matches(text, names):
        if s == incise.start and e == incise.end:
            return spk
    return None
 def iter_incise_pieces(
    text: str, incises: list[Incise]
 ) -> list[tuple[bool, str]]:
    """Split `text` into (is_incise, sub_text) pieces using the bounds.
    Used at render time: dialogue pieces -> character voice, incise pieces ->
    narrator voice. Text is preserved except for leading/trailing whitespace.
    """
    pieces: list[tuple[bool, str]] = []
    cursor = 0
    for inc in sorted(incises, key=lambda i: i.start):
        if inc.start < cursor:    # overlap guard
            continue
        before = text[cursor:inc.start]
        if before.strip():
            pieces.append((False, before.strip()))
        body = text[inc.start:inc.end]
        if body.strip():
            pieces.append((True, body.strip()))
        cursor = inc.end
    tail = text[cursor:]
    if tail.strip():
        pieces.append((False, tail.strip()))
    return pieces
 def analyze_chapter(
    chapter: Chapter,
    ct: ChapterText,
    gemma: Gemma,
    *,
    book_chars: Optional[list[Character]] = None,
    dedup_gemma: Optional[Gemma] = None,
 ) -> tuple[ChapterAnalysis, list[Character]]:
    """Full analysis of one chapter.
    Sequence: segmentation -> character extraction -> reconciliation (dedup
    against the book's cumulative cast) -> incise annotation + seeding of the
    explicit speaker -> LLM attribution of the remaining lines -> retroactive
    pass. Lines of dialogue are persisted whole (incises = bounds).
    `book_chars`: the book's cumulative cast (canonical characters already known).
    `dedup_gemma`: if provided, settles ambiguous dedup cases.
    Returns (analysis, updated cumulative cast); the second element is the
    book's whole reconciled cast, ready to be persisted as is.
    """
    from ..casting.dedup import reconcile_characters
    segments = segment_chapter_text(ct)
    full_text = "\n".join(ct.paragraphs)
    found = extract_characters(full_text, gemma)
    # Dedup BEFORE attribution: the model will receive canonical names.
    chars, name_map = reconcile_characters(book_chars or [], found, dedup_gemma)
    # Canonical list restricted to this chapter (detected characters + POV).
    chapter_canon = {(name_map.get(c.name.strip().lower()) or c.name).strip().lower()
                     for c in found}
    chapter_chars = [c for c in chars if c.name.strip().lower() in chapter_canon]
    if chapter.pov:
        pv = chapter.pov.strip().lower()
        for c in chars:
            if (c not in chapter_chars and
                    (pv in c.name.lower()
                     or any(pv in a.lower() for a in c.aliases))):
                chapter_chars.append(c)
    # Deterministic incise annotation (bounds, non-destructive) + seeding:
    # a nominal incise that names a character fixes the speaker with certainty
    # BEFORE the LLM call (corrects the cases the small model misses).
    names = {c.name for c in chars}
    for seg in segments:
        if seg.type is not SegmentType.DIALOGUE:
            continue
        seg.incises = detect_incises(seg.text, names=names)
        for inc in seg.incises:
            spk = incise_speaker(seg.text, inc, names)
            if spk:
                seg.speaker = spk
                break
    conf = attribute_speakers(segments, gemma, characters=chapter_chars,
                              pov=chapter.pov)
    if get_settings().retro_pass_use_gemma:
        _refine_unknown_speakers(segments, gemma, characters=chapter_chars,
                                 confidence=conf)
    # Absorb leftover speakers (not in the list) as aliases (heuristic only).
    chars, _ = reconcile_characters(
        chars, [], None, speaker_names=[s.speaker for s in segments])
    # Lines of dialogue are persisted whole; incises remain bounds
    # (rendering: narrator voice). No more fragmentation at analysis time.
    analysis = ChapterAnalysis(index=chapter.index, title=ct.title,
                               segments=segments)
    return analysis, chars
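
The bounds-based splitting described in `iter_incise_pieces` can be illustrated in isolation. This is a minimal sketch, assuming an `Incise` that only carries `start`/`end` character offsets; the `Incise` dataclass and `split_pieces` below are hypothetical stand-ins, not the project's own model or function.

```python
from dataclasses import dataclass


@dataclass
class Incise:
    # Hypothetical stand-in: only the two offsets are assumed here.
    start: int
    end: int


def split_pieces(text: str, incises: list[Incise]) -> list[tuple[bool, str]]:
    """Split text into (is_incise, sub_text) pieces, skipping overlapping spans."""
    pieces: list[tuple[bool, str]] = []
    cursor = 0
    for inc in sorted(incises, key=lambda i: i.start):
        if inc.start < cursor:  # overlap guard: keep the earlier span
            continue
        before = text[cursor:inc.start].strip()
        if before:
            pieces.append((False, before))
        body = text[inc.start:inc.end].strip()
        if body:
            pieces.append((True, body))
        cursor = inc.end
    tail = text[cursor:].strip()
    if tail:
        pieces.append((False, tail))
    return pieces


text = "Je sais, dit Holden. Partons."
span = Incise(start=text.index("dit"), end=text.index("Partons"))
pieces = split_pieces(text, [span])
for is_incise, chunk in pieces:
    print("narrator " if is_incise else "character", "|", chunk)
```

Because the line of dialogue is stored whole and only the offsets are persisted, the split can be recomputed at render time without losing the original text.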
--- a/backend/inkflow/api/init.py
+++ b/backend/inkflow/api/init.py
--- a/backend/inkflow/api/app.py
+++ b/backend/inkflow/api/app.py
@@ -0,0 +1,295 @@
 """FastAPI application: drives the pipeline and serves the UI.
 All heavy routes (analysis, casting, rendering) are *queued* in the
 orchestrator and return immediately; progress is reported over
 WebSocket. Fast operations (voice preview) run in a threadpool.
 """
 from __future__ import annotations
 import asyncio
 import io
 from pathlib import Path
 from typing import Optional
 import soundfile as sf
 from fastapi import FastAPI, HTTPException, UploadFile, WebSocket, WebSocketDisconnect
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import FileResponse, Response
 from fastapi.staticfiles import StaticFiles
 from pydantic import BaseModel
 from ..config import DATA_DIR, book_data_dir, book_output_dir, ensure_dirs
 from ..epub.parser import load_book, load_chapter_text, parse_epub
 from ..models import Cast, ChapterAnalysis, Pronunciation
 from ..pipeline.orchestrator import load_state, orchestrator
 from ..settings import Settings, get_settings, save_settings
 from ..store import artifacts
 from ..util import slugify
 from .ws import manager
 app = FastAPI(title="InkFlow API")
 app.add_middleware(
    CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"],
 )
@app.on_event("startup")
 async def _startup() -> None:
    ensure_dirs()
    manager.bind_loop(asyncio.get_running_loop())
    orchestrator.set_broadcaster(manager.broadcast_threadsafe)
 # --- Helpers -----------------------------------------------------------------
 def _list_book_slugs() -> list[str]:
    if not DATA_DIR.exists():
        return []
    return sorted(p.parent.name for p in DATA_DIR.glob("*/book.json"))
 def _book_summary(slug: str) -> dict:
    book = load_book(slug)
    state = load_state(slug)
    rendered = sum(1 for r in state.render.values() if r.mp3)
    return {
        "slug": slug,
        "title": book.title,
        "author": book.author,
        "chapters": len(book.render_chapters),
        "rendered": rendered,
        "cover": f"/api/books/{slug}/cover" if book.cover_file else None,
    }
 # --- Library / upload --------------------------------------------------------
@app.get("/api/books")
 def list_books() -> list[dict]:
    return [_book_summary(s) for s in _list_book_slugs()]
@app.post("/api/books")
 async def upload_book(file: UploadFile) -> dict:
    ensure_dirs()
    uploads = DATA_DIR / "_uploads"
    uploads.mkdir(parents=True, exist_ok=True)
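    # Keep only the base name so a crafted filename cannot escape _uploads.
    name = Path(file.filename or "").name
    dest = uploads / (name if name not in ("", "..") else "livre.epub")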
    dest.write_bytes(await file.read())
    book = await asyncio.to_thread(parse_epub, dest)
    # Initialize the state.
    load_state(book.slug)
    return {"slug": book.slug, "title": book.title}
@app.get("/api/books/{slug}")
 def get_book(slug: str) -> dict:
    _require(slug)
    book = load_book(slug)
    return {"book": book.model_dump(mode="json"),
            "state": load_state(slug).model_dump(mode="json")}
@app.get("/api/books/{slug}/cover")
 def get_cover(slug: str):
    _require(slug)
    book = load_book(slug)
    if not book.cover_file:
        raise HTTPException(404, "pas de couverture")
    return FileResponse(str(book_data_dir(slug) / book.cover_file))
@app.get("/api/books/{slug}/chapters/{index}")
 def get_chapter(slug: str, index: int) -> dict:
    _require(slug)
    book = load_book(slug)
    ch = next((c for c in book.chapters if c.index == index), None)
    if ch is None:
        raise HTTPException(404, "chapitre inconnu")
    out: dict = {"chapter": ch.model_dump(mode="json")}
    apath = artifacts.analysis_path(slug, index)
    if apath.exists():
        out["analysis"] = artifacts.load_analysis(slug, index).model_dump(mode="json")
    elif ch.text_file:
        out["text"] = load_chapter_text(slug, ch).model_dump(mode="json")
    return out
@app.put("/api/books/{slug}/chapters/{index}/analysis")
 def put_analysis(slug: str, index: int, analysis: ChapterAnalysis) -> dict:
    _require(slug)
    if analysis.index != index:
        raise HTTPException(400, "index incoherent")
    artifacts.save_analysis(slug, analysis)
    return {"saved": True}
 # --- Steps (queued) ----------------------------------------------------------
 class ChaptersBody(BaseModel):
    chapters: Optional[list[int]] = None
@app.post("/api/books/{slug}/analyze")
 def analyze(slug: str, body: ChaptersBody) -> dict:
    _require(slug)
    orchestrator.run_analyze(slug, body.chapters)
    return {"queued": True}
@app.post("/api/books/{slug}/pronounce")
 def pronounce(slug: str) -> dict:
    _require(slug)
    orchestrator.run_pronounce(slug)
    return {"queued": True}
@app.post("/api/books/{slug}/cast/auto")
 def cast_auto(slug: str) -> dict:
    _require(slug)
    orchestrator.run_cast(slug)
    return {"queued": True}
@app.post("/api/books/{slug}/cast/analyze")
 def cast_analyze(slug: str, body: ChaptersBody) -> dict:
    """(Re)analyze the cast of one or more chapters, with reconciliation."""
    _require(slug)
    orchestrator.run_cast_analyze(slug, body.chapters)
    return {"queued": True}
@app.post("/api/books/{slug}/cast/dedup")
 def cast_dedup(slug: str) -> dict:
    """Deduplicate the existing cast (name variants -> aliases)."""
    _require(slug)
    orchestrator.run_dedup_cast(slug)
    return {"queued": True}
 class RenderBody(BaseModel):
    chapters: list[int]
    backend: Optional[str] = None
    mono: bool = False
@app.post("/api/books/{slug}/render")
 def render(slug: str, body: RenderBody) -> dict:
    _require(slug)
    orchestrator.run_render(slug, body.chapters, backend=body.backend, mono=body.mono)
    return {"queued": True}
 # --- Casting / pronunciation (direct read-write) -----------------------------
@app.get("/api/books/{slug}/cast")
 def get_cast(slug: str) -> dict:
    from ..casting.voicebank import load_voicebank
    _require(slug)
    return {"cast": artifacts.load_cast(slug).model_dump(mode="json"),
            "voicebank": load_voicebank().model_dump(mode="json")}
@app.put("/api/books/{slug}/cast")
 def put_cast(slug: str, cast: Cast) -> dict:
    _require(slug)
    artifacts.save_cast(slug, cast)
    return {"saved": True}
@app.get("/api/books/{slug}/pronunciation")
 def get_pron(slug: str) -> dict:
    _require(slug)
    return artifacts.load_pronunciation(slug).model_dump(mode="json")
@app.put("/api/books/{slug}/pronunciation")
 def put_pron(slug: str, pron: Pronunciation) -> dict:
    _require(slug)
    artifacts.save_pronunciation(slug, pron)
    return {"saved": True}
 # --- Global technical settings -----------------------------------------------
@app.get("/api/settings")
 def read_settings() -> dict:
    return get_settings().model_dump(mode="json")
@app.put("/api/settings")
 def write_settings(settings: Settings) -> dict:
    save_settings(settings)
    return {"saved": True}
 # --- Voicebank + preview -----------------------------------------------------
@app.get("/api/voicebank")
 def get_voicebank() -> dict:
    from ..casting.voicebank import load_voicebank
    return load_voicebank().model_dump(mode="json")
 class PreviewBody(BaseModel):
    voice_id: str
    text: str = "Bonjour, voici un aperçu de cette voix pour votre livre audio."
@app.post("/api/voicebank/preview")
 async def preview_voice(body: PreviewBody):
    from ..casting.voicebank import load_voicebank
    from ..tts.base import VoiceSpec
    entry = load_voicebank().by_id(body.voice_id)
    if entry is None:
        raise HTTPException(404, "voix inconnue")
    def _synth() -> bytes:
        from ..tts.factory import get_backend
        backend = get_backend("kokoro")
        audio, sr = backend.synthesize(body.text, VoiceSpec(preset=entry.kokoro_voice))
        buf = io.BytesIO()
        sf.write(buf, audio, sr, format="WAV")
        return buf.getvalue()
    data = await asyncio.to_thread(_synth)
    return Response(content=data, media_type="audio/wav")
@app.get("/api/books/{slug}/audio/{index}")
 def get_audio(slug: str, index: int):
    _require(slug)
    state = load_state(slug)
    rs = state.render.get(index)
    if not rs or not rs.mp3:
        raise HTTPException(404, "audio non genere")
    path = book_output_dir(load_book(slug).title) / rs.mp3
    if not path.exists():
        raise HTTPException(404, "fichier introuvable")
    return FileResponse(str(path), media_type="audio/mpeg", filename=rs.mp3)
 # --- WebSocket ---------------------------------------------------------------
@app.websocket("/ws/{slug}")
 async def ws_endpoint(ws: WebSocket, slug: str) -> None:
    await manager.connect(slug, ws)
    try:
        # Send the current state on connection.
        await ws.send_json({"type": "state", "state": load_state(slug).model_dump(mode="json")})
        while True:
            await ws.receive_text()  # keeps the connection open
    except WebSocketDisconnect:
        manager.disconnect(slug, ws)
    except Exception:  # noqa: BLE001
        manager.disconnect(slug, ws)
 def _require(slug: str) -> None:
    if not (book_data_dir(slug) / "book.json").exists():
        raise HTTPException(404, "livre inconnu")
 # --- Serve the built frontend (if present) -----------------------------------
 _FRONTEND_DIST = Path(__file__).resolve().parents[2].parent / "frontend" / "dist"
 if _FRONTEND_DIST.exists():
    app.mount("/", StaticFiles(directory=str(_FRONTEND_DIST), html=True), name="ui")
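
The "queue the job and return immediately" contract of the heavy routes can be sketched with the standard library alone. The orchestrator's real implementation is not shown in this file, so `MiniOrchestrator`, `enqueue`, and `wait` below are hypothetical names; the sketch only assumes a single worker thread draining a FIFO queue.

```python
import queue
import threading


class MiniOrchestrator:
    """Hypothetical single-worker job queue (not the project's orchestrator)."""

    def __init__(self) -> None:
        self._jobs: queue.Queue = queue.Queue()
        self.done: list[tuple[str, object]] = []
        threading.Thread(target=self._run, daemon=True).start()

    def enqueue(self, name: str, fn) -> dict:
        # Returns at once; the work happens later on the worker thread.
        self._jobs.put((name, fn))
        return {"queued": True}

    def _run(self) -> None:
        while True:
            name, fn = self._jobs.get()
            try:
                self.done.append((name, fn()))
            except Exception as exc:  # keep the worker alive on job failure
                self.done.append((name, exc))
            finally:
                self._jobs.task_done()

    def wait(self) -> None:
        # Blocks until every queued job has been processed.
        self._jobs.join()


orch = MiniOrchestrator()
reply = orch.enqueue("analyze", lambda: 1 + 1)
orch.wait()
print(reply, orch.done)
```

A single worker serializes the jobs, which is a plausible choice when each job loads a large local model; that motivation is an assumption, not something this file states.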
--- a/backend/inkflow/api/ws.py
+++ b/backend/inkflow/api/ws.py
@@ -0,0 +1,47 @@
 """WebSocket connection manager with thread-safe broadcasting.
 The orchestrator runs in a worker thread; it calls `broadcast_threadsafe`,
 which reschedules the send on the API's asyncio loop.
 """
 from __future__ import annotations
 import asyncio
 from collections import defaultdict
 from typing import Optional
 from fastapi import WebSocket
 class ConnectionManager:
    def __init__(self) -> None:
        self.active: dict[str, set[WebSocket]] = defaultdict(set)
        self._loop: Optional[asyncio.AbstractEventLoop] = None
    def bind_loop(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
    async def connect(self, slug: str, ws: WebSocket) -> None:
        await ws.accept()
        self.active[slug].add(ws)
    def disconnect(self, slug: str, ws: WebSocket) -> None:
        self.active[slug].discard(ws)
    def broadcast_threadsafe(self, slug: str, data: dict) -> None:
        """Callable from any thread (orchestrator worker)."""
        if self._loop is None:
            return
        self._loop.call_soon_threadsafe(self._dispatch, slug, data)
    def _dispatch(self, slug: str, data: dict) -> None:
        for ws in list(self.active.get(slug, ())):
            asyncio.create_task(self._safe_send(slug, ws, data))
    async def _safe_send(self, slug: str, ws: WebSocket, data: dict) -> None:
        try:
            await ws.send_json({"type": "state", "state": data})
        except Exception:  # noqa: BLE001 - connection closed
            self.disconnect(slug, ws)
 manager = ConnectionManager()
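
The hand-off from a worker thread to the event loop relies on `loop.call_soon_threadsafe`. Here is a minimal standalone sketch of that mechanism, with no WebSocket involved; `on_loop` and `worker` are illustrative names and the payload is made up.

```python
import asyncio
import threading


async def main() -> list[dict]:
    loop = asyncio.get_running_loop()
    received: list[dict] = []
    done = asyncio.Event()

    def on_loop(msg: dict) -> None:
        # Runs on the event-loop thread, so touching loop objects is safe here.
        received.append(msg)
        done.set()

    def worker() -> None:
        # Runs on another thread: only schedule, never call loop APIs directly.
        loop.call_soon_threadsafe(on_loop, {"progress": 1.0})

    t = threading.Thread(target=worker)
    t.start()
    await asyncio.wait_for(done.wait(), timeout=5)
    t.join()
    return received


result = asyncio.run(main())
print(result)
```

In `ConnectionManager`, the scheduled callback is `_dispatch`, which in turn creates one send task per connected client.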
--- a/backend/inkflow/audio/init.py
+++ b/backend/inkflow/audio/init.py
--- a/backend/inkflow/audio/postprocess.py
+++ b/backend/inkflow/audio/postprocess.py
@@ -0,0 +1,125 @@
 """Final audio assembly: concat -> normalization -> WAV -> tagged MP3.
 No pydub (broken on Python 3.13): concat/normalization in numpy, MP3
 encoding + cover via the ffmpeg CLI, tags via ffmpeg metadata.
 """
 from __future__ import annotations
 import shutil
 import subprocess
 from pathlib import Path
 from typing import Optional
 import numpy as np
 import soundfile as sf
 from ..settings import get_settings
 def _resample(audio: np.ndarray, src_sr: int, dst_sr: int) -> np.ndarray:
    if src_sr == dst_sr or audio.size == 0:
        return audio
    duration = audio.size / src_sr
    n_dst = int(round(duration * dst_sr))
    x_src = np.linspace(0.0, duration, num=audio.size, endpoint=False)
    x_dst = np.linspace(0.0, duration, num=n_dst, endpoint=False)
    return np.interp(x_dst, x_src, audio).astype(np.float32)
 def silence(seconds: float, sr: int) -> np.ndarray:
    return np.zeros(int(seconds * sr), dtype=np.float32)
 def concat_segments(
    parts: list[tuple[np.ndarray, int]],
    *,
    target_sr: Optional[int] = None,
    gap_seconds: float = 0.35,
    intra_gap_seconds: float = 0.12,
    glued: Optional[list[bool]] = None,
 ) -> tuple[np.ndarray, int]:
    """Concatenate (audio, sr) segments with a silence between each.
    `glued[i] == True` (e.g. an incise and its line of dialogue, from the same
    paragraph) inserts a short `intra_gap_seconds` silence instead of `gap_seconds`.
    """
    if target_sr is None:
        target_sr = get_settings().target_sample_rate
    gap = silence(gap_seconds, target_sr)
    intra_gap = silence(intra_gap_seconds, target_sr)
    buf: list[np.ndarray] = []
    first = True
    for i, (audio, sr) in enumerate(parts):
        if audio is None or audio.size == 0:
            continue
        if not first:
            use_intra = glued is not None and i < len(glued) and glued[i]
            buf.append(intra_gap if use_intra else gap)
        first = False
        buf.append(_resample(np.asarray(audio, dtype=np.float32), sr, target_sr))
    if not buf:
        return np.zeros(0, dtype=np.float32), target_sr
    return np.concatenate(buf), target_sr
 def normalize_loudness(audio: np.ndarray, target_dbfs: Optional[float] = None) -> np.ndarray:
    """Normalize the RMS level to target_dbfs, with a clipping guard."""
    if audio.size == 0:
        return audio
    if target_dbfs is None:
        target_dbfs = get_settings().target_dbfs
    rms = float(np.sqrt(np.mean(audio.astype(np.float64) ** 2)))
    if rms < 1e-6:
        return audio
    current_dbfs = 20.0 * np.log10(rms)
    gain = 10.0 ** ((target_dbfs - current_dbfs) / 20.0)
    out = audio * gain
    peak = float(np.max(np.abs(out))) if out.size else 0.0
    if peak > 0.99:  # simple limiter to avoid clipping
        out *= 0.99 / peak
    return out.astype(np.float32)
 def write_wav(path: str | Path, audio: np.ndarray, sr: int) -> Path:
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    sf.write(str(path), audio, sr)
    return path
 def encode_mp3(
    wav_path: str | Path,
    mp3_path: str | Path,
    *,
    bitrate: Optional[str] = None,
    title: Optional[str] = None,
    album: Optional[str] = None,
    artist: Optional[str] = None,
    track: Optional[int] = None,
    cover_path: Optional[str | Path] = None,
 ) -> Path:
    """Encode a WAV to MP3 (ffmpeg) with ID3 tags and an optional cover."""
    if bitrate is None:
        bitrate = get_settings().mp3_bitrate
    if not shutil.which("ffmpeg"):
        raise RuntimeError("ffmpeg introuvable — brew install ffmpeg")
    wav_path, mp3_path = Path(wav_path), Path(mp3_path)
    mp3_path.parent.mkdir(parents=True, exist_ok=True)
    cmd = ["ffmpeg", "-y", "-i", str(wav_path)]
    has_cover = cover_path and Path(cover_path).exists()
    if has_cover:
        cmd += ["-i", str(cover_path), "-map", "0:a", "-map", "1:v",
                "-c:v", "mjpeg", "-disposition:v", "attached_pic"]
    cmd += ["-c:a", "libmp3lame", "-b:a", bitrate]
    meta = {"title": title, "album": album, "artist": artist}
    if track is not None:
        meta["track"] = str(track)
    for key, val in meta.items():
        if val:
            cmd += ["-metadata", f"{key}={val}"]
    cmd += ["-id3v2_version", "3", str(mp3_path)]
    subprocess.run(cmd, check=True, capture_output=True)
    return mp3_path
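
The arithmetic behind `normalize_loudness` can be checked by hand. This is a pure-Python sketch of the same RMS-to-dBFS gain computation and peak guard, written without numpy so it runs anywhere; the -14 dBFS target is an illustrative value, not the project's configured `target_dbfs`.

```python
import math


def rms_dbfs(samples: list[float]) -> float:
    """RMS level of the samples in dBFS (0 dBFS = full scale)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(rms)


def normalize(samples: list[float], target_dbfs: float) -> list[float]:
    """Scale to the target RMS level, then cap the peak at 0.99."""
    gain = 10.0 ** ((target_dbfs - rms_dbfs(samples)) / 20.0)
    out = [x * gain for x in samples]
    peak = max(abs(x) for x in out)
    if peak > 0.99:  # same simple limiter idea: rescale so the peak is 0.99
        out = [x * 0.99 / peak for x in out]
    return out


# A square-ish signal at amplitude 0.1 sits at -20 dBFS RMS.
quiet = [0.1, -0.1, 0.1, -0.1]
louder = normalize(quiet, target_dbfs=-14.0)  # illustrative target
print(round(rms_dbfs(quiet), 2), round(rms_dbfs(louder), 2))
```

Note that when the limiter kicks in, the final RMS ends up below the target: the guard trades loudness accuracy for the absence of clipping.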
--- a/backend/inkflow/casting/init.py
+++ b/backend/inkflow/casting/init.py
--- a/backend/inkflow/casting/assign.py
+++ b/backend/inkflow/casting/assign.py
@@ -0,0 +1,86 @@
 """Auto-casting: assigns a distinct voice to each character.
 Deterministic strategy:
 - Narrator: native FR voice by default (`PREFERRED_NARRATOR`), otherwise the first voice.
 - Characters: voices of the same gender, distinct while any remain; beyond that,
   voices are reused and spread as evenly as possible. Unknown gender -> mixed
   pool. The ordering (sorted by name) guarantees reproducibility.
 The user can override these choices in the UI.
 """
 from __future__ import annotations
 from collections import Counter
 from typing import Optional
 from ..models import Cast, Character, Voicebank
 # Preferred narrator voice (native FR).
 PREFERRED_NARRATOR = "fr_f_siwis"
 def _pick_pool(vb: Voicebank, gender: Optional[str], narrator_id: str) -> list[str]:
    """Candidate voices: gender is STRICTLY preferred (even if that means reuse).
    Gender is only crossed when no voice of the right gender exists. The narrator
    is excluded while other options remain, to keep it distinct.
    """
    same = [e.id for e in vb.by_gender(gender)] if gender in ("male", "female") else []
    pool = same if same else [e.id for e in vb.entries]
    non_narrator = [vid for vid in pool if vid != narrator_id]
    return non_narrator or pool  # keep the narrator only if it is the sole option
 def assign_voices(
    characters: list[Character],
    vb: Voicebank,
    *,
    narrator_voice_id: Optional[str] = None,
    respect_existing: bool = False,
 ) -> Cast:
    """Return a Cast with a narrator + one voice per character (mutates the chars).
    `respect_existing=True` keeps voices already assigned (UI overrides);
    otherwise everything is recomputed (fresh auto-casting).
    """
    if not vb.entries:
        return Cast(narrator_voice_id=narrator_voice_id, characters=characters)
    narrator_id = narrator_voice_id or (
        PREFERRED_NARRATOR if vb.by_id(PREFERRED_NARRATOR) else vb.entries[0].id)
    usage: Counter[str] = Counter()
    usage[narrator_id] += 1  # the narrator already counts as one use
    for ch in sorted(characters, key=lambda c: c.name.lower()):
        if respect_existing and ch.voice_id and vb.by_id(ch.voice_id):
            usage[ch.voice_id] += 1
            continue  # respect an existing assignment (user override)
        pool = _pick_pool(vb, ch.gender, narrator_id)
        # Pick the least-used voice in the pool (so an unused voice comes first).
        best = min(pool, key=lambda vid: (usage[vid], pool.index(vid)))
        ch.voice_id = best
        usage[best] += 1
    return Cast(narrator_voice_id=narrator_id, characters=characters)
 def resolve_speaker_voice(
    speaker: str, cast: Cast, vb: Voicebank
 ) -> Optional[str]:
    """Map a speaker name (segment) to a voice id.
    Matches first by exact name/alias (fast), then as a last resort by
    heuristic token matching (e.g. a "Jim" that has not yet been absorbed as
    an alias of "James Holden").
    """
    if speaker == "narrateur":
        return cast.narrator_voice_id
    low = speaker.lower()
    for ch in cast.characters:
        if ch.name.lower() == low or low in (a.lower() for a in ch.aliases):
            return ch.voice_id
    from .dedup import heuristic_match
    match = heuristic_match(speaker, cast.characters)
    if isinstance(match, Character):
        return match.voice_id
    return None  # unknown -> rendering will fall back to the narrator
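
The "least-used voice first" rule in `assign_voices` is easy to trace on a toy pool. This is a simplified sketch that ignores gender and existing overrides; `assign_least_used` and the voice ids `v1`..`v3` are hypothetical.

```python
from collections import Counter


def assign_least_used(names: list[str], pool: list[str], narrator: str) -> dict[str, str]:
    """Give each name the least-used voice; ties go to the earliest pool entry."""
    usage: Counter[str] = Counter({narrator: 1})  # the narrator already counts
    # Exclude the narrator's voice unless it is the only one available.
    candidates = [v for v in pool if v != narrator] or pool
    out: dict[str, str] = {}
    for name in sorted(names, key=str.lower):  # stable order -> reproducible
        best = min(candidates, key=lambda v: (usage[v], candidates.index(v)))
        out[name] = best
        usage[best] += 1
    return out


voices = assign_least_used(["Naomi", "amos", "Holden"], ["v1", "v2", "v3"], narrator="v1")
print(voices)
```

With two candidate voices and three characters, the third character (in sorted order) reuses the first candidate, since both candidates have been used once and ties break on pool position.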
--- a/backend/inkflow/casting/dedup.py
+++ b/backend/inkflow/casting/dedup.py
@@ -0,0 +1,345 @@
 """Cast reconciliation: deduplication of name variants.
 Problem: the same character appears under several forms ("Holden",
 "James Holden", "James", "Jim"). Without reconciliation, each form becomes a
 separate character with its own voice -> inconsistent listening experience.
 Hybrid strategy:
 1. Heuristic (no LLM): exact match on name/alias, then token subset
    ("Holden" contained in "James Holden").
 2. Gemma settles the ambiguous cases (several compatible candidates, or a
    non-obvious variant such as "Jim" <-> "James") using the descriptions.
 Every variant encountered is kept as an `alias` of the canonical character;
 the canonical name is the most complete form seen ("James Holden"). Analysis
 artifacts (segments) are NOT modified: voice resolution at render time relies
 on the aliases (`casting/assign.py`).
 """
 from __future__ import annotations
 import re
 from typing import Optional
 from ..models import Character
 from ..settings import get_settings
 # Internal sentinels.
 _AMBIGUOUS = object()   # heuristic: several candidates -> delegate to Gemma
 _NEW = object()         # Gemma decision: new character
 # Stop words / titles ignored when matching by tokens.
 _STOPWORDS = {
    "le", "la", "les", "un", "une", "de", "du", "des", "monsieur", "madame",
    "mademoiselle", "m", "mme", "mlle", "mr", "dr", "docteur", "capitaine",
    "lieutenant", "sergent", "general", "amiral", "the", "of",
 }
 _SPLIT_RE = re.compile(r"[^\wÀ-ÿ]+")
 # Context guard (in characters) for the Gemma prompt.
 _MAX_PROMPT_CHARS = 24000
 def _norm(name: str) -> str:
    return name.strip().lower()
 def _tokens(name: str) -> set[str]:
    """Significant tokens of a name (lowercase, without titles or stop words)."""
    parts = [p for p in _SPLIT_RE.split(name.strip()) if p]
    return {p.lower() for p in parts
            if len(p) >= 2 and p.lower() not in _STOPWORDS}
 def _completeness(name: str) -> tuple[int, int]:
    """Sort key for the most "complete" name: more tokens, then longer."""
    return (len(_tokens(name)), len(name.strip()))
 def _forms(c: Character) -> list[str]:
    return [c.name, *c.aliases]
 def _token_freq(characters: list[Character], extra: Optional[list[str]] = None):
    """Count, for each token, the number of distinct surfaces containing it.
    Used to judge how distinctive a token is: "holden" present in a single
    family is safe to merge; "alex" present in several is not.
    """
    from collections import Counter
    freq: Counter[str] = Counter()
    surfaces = {_norm(f) for c in characters for f in _forms(c)}
    surfaces |= {_norm(s) for s in (extra or [])}
    for s in surfaces:
        for t in _tokens(s):
            freq[t] += 1
    return freq
 def heuristic_match(surface: str, characters: list[Character], tokfreq=None):
    """Match `surface` to a known character without an LLM (conservative).
    Returns the matching `Character`, `None` if there is none, or `_AMBIGUOUS`
    if the match is plausible but uncertain (decision left to Gemma).
    A token-subset link is considered SAFE only if the smaller side has >=2
    tokens, or if the shared tokens are globally distinctive (present in <=2
    surfaces). Otherwise the link is ambiguous (e.g. a common first name
    "Alex" shared by several characters).
    """
    s_norm = _norm(surface)
    for c in characters:
        if _norm(c.name) == s_norm or any(_norm(a) == s_norm for a in c.aliases):
            return c
    s_tok = _tokens(surface)
    if not s_tok:
        return None
    if tokfreq is None:
        tokfreq = _token_freq(characters, [surface])
    safe: list[Character] = []
    ambiguous = False
    for c in characters:
        linked = is_safe = False
        for form in _forms(c):
            f_tok = _tokens(form)
            if not f_tok or not (s_tok <= f_tok or f_tok <= s_tok):
                continue
            linked = True
            shared = s_tok & f_tok
            if min(len(s_tok), len(f_tok)) >= 2 or all(tokfreq[t] <= 2 for t in shared):
                is_safe = True
        if is_safe:
            safe.append(c)
        elif linked:
            ambiguous = True
    if len(safe) == 1 and not ambiguous:
        return safe[0]
    if safe or ambiguous:
        return _AMBIGUOUS
    return None
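
The token-subset link at the core of `heuristic_match` can be tried on its own. This is a reduced sketch: it keeps only the "one token set contains the other" test and leaves out the distinctiveness check and the ambiguity handling; `tokens`, `linked`, and the short `STOP` set are illustrative, not the module's own definitions.

```python
import re

# Abbreviated, illustrative stop list (the real module uses a longer one).
STOP = {"le", "la", "de", "dr", "capitaine"}


def tokens(name: str) -> set[str]:
    """Lowercased significant tokens: at least 2 characters, not a stop word."""
    parts = re.split(r"[^\wÀ-ÿ]+", name.strip())
    return {p.lower() for p in parts if len(p) >= 2 and p.lower() not in STOP}


def linked(a: str, b: str) -> bool:
    """True when one name's token set is contained in the other's."""
    ta, tb = tokens(a), tokens(b)
    return bool(ta and tb) and (ta <= tb or tb <= ta)


print(linked("Holden", "James Holden"))            # surname inside full name
print(linked("Capitaine Holden", "James Holden"))  # title is ignored
print(linked("Jim", "James Holden"))               # nickname: no shared token
```

The last case is why the module falls back to Gemma: a nickname such as "Jim" shares no token with "James Holden", so no purely lexical rule can link them.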
 def canonical_of(a: str, b: str) -> str:
    """Canonical form between two variants: the more complete one."""
    return a if _completeness(a) >= _completeness(b) else b
 def _absorb(
    target: Character,
    name: str,
    *,
    gender: Optional[str] = None,
    age: Optional[str] = None,
    description: Optional[str] = None,
    voice_id: Optional[str] = None,
 ) -> None:
    """Merge the variant `name` into `target` (in-place mutation).
    Fills in missing attributes, recomputes the canonical name and stores the
    other forms as aliases.
    """
    target.gender = target.gender or gender
    target.age = target.age or age
    target.description = target.description or description
    target.voice_id = target.voice_id or voice_id
    forms: dict[str, str] = {}  # norm -> original spelling (first one seen is kept)
    for f in [target.name, *target.aliases, name]:
        f = (f or "").strip()
        if f:
            forms.setdefault(_norm(f), f)
    canon = max(forms, key=lambda n: _completeness(forms[n]))
    target.name = forms[canon]
    target.aliases = sorted(v for k, v in forms.items() if k != canon)
 def _item(c) -> dict:
    """Normalize a character or a raw name into a reconciliation input item."""
    if isinstance(c, Character):
        return {"name": c.name, "gender": c.gender, "age": c.age,
                "description": c.description, "voice_id": c.voice_id}
    return {"name": str(c), "gender": None, "age": None,
            "description": None, "voice_id": None}
 def _find(chars: list[Character], name: str) -> Optional[Character]:
    n = _norm(name)
    return next((c for c in chars
                 if _norm(c.name) == n or any(_norm(a) == n for a in c.aliases)),
                None)
 def _create(chars: list[Character], it: dict, name_map: dict[str, str]) -> None:
    new = Character(name=it["name"].strip(), gender=it["gender"], age=it["age"],
                    description=it["description"], voice_id=it["voice_id"])
    chars.append(new)
    name_map[_norm(it["name"])] = new.name
 def reconcile_characters(
    book_chars: list[Character],
    new_chars,
    gemma=None,
    *,
    speaker_names: Optional[list[str]] = None,
 ) -> tuple[list[Character], dict[str, str]]:
    """Reconcile new detections into the book's cast.
    `new_chars`: characters (`Character` objects) extracted from the chapter(s).
    `speaker_names`: raw speaker forms seen in the segments (absorbed as
    aliases so that voice resolution matches at render time).
    `gemma`: if provided, settles ambiguous cases; otherwise heuristic only.
    Returns (updated canonical list, map normalized_surface_name -> canonical).
    """
    chars = [c.model_copy(deep=True) for c in book_chars]
    name_map: dict[str, str] = {}
    items = [_item(c) for c in new_chars]
    seen = {_norm(it["name"]) for it in items}
    for sp in (speaker_names or []):
        n = _norm(sp or "")
        if n and n not in seen and n not in {"narrateur", "inconnu", "?"}:
            items.append(_item(sp))
            seen.add(n)
    # Global token frequency (existing cast + incoming names) -> stable
    # distinctiveness, independent of processing order.
    tokfreq = _token_freq(chars, [it["name"] for it in items])
    pending: list[dict] = []
    for it in items:
        m = heuristic_match(it["name"], chars, tokfreq)
        if m is _AMBIGUOUS:
            pending.append(it)
        elif m is not None:
            _absorb(m, it["name"], gender=it["gender"], age=it["age"],
                    description=it["description"], voice_id=it["voice_id"])
            name_map[_norm(it["name"])] = m.name
        elif gemma is not None:
            pending.append(it)  # may be a non-obvious variant ("Jim")
        else:
            _create(chars, it, name_map)
    if pending and gemma is not None:
        decisions = _gemma_reconcile(chars, pending, gemma)
        for it in pending:
            canon = decisions.get(_norm(it["name"]))
            target = _find(chars, canon) if isinstance(canon, str) else None
            if target is None:  # Gemma answered NOUVEAU/unknown: last-resort heuristic attempt
                hm = heuristic_match(it["name"], chars, tokfreq)
                target = hm if isinstance(hm, Character) else None
            if target is not None:
                _absorb(target, it["name"], gender=it["gender"], age=it["age"],
                        description=it["description"], voice_id=it["voice_id"])
                name_map[_norm(it["name"])] = target.name
            else:
                _create(chars, it, name_map)
    elif pending:
        # Without Gemma: do not guess ambiguous cases; keep them as distinct characters.
        for it in pending:
            _create(chars, it, name_map)
    return chars, name_map
 def dedup_cast(characters: list[Character], gemma=None) -> list[Character]:
    """Collapse duplicates in an existing cast (keeps already-assigned voices).
    Two phases: (1) heuristic-only grouping (gemma=None) -> a reduced, safe
    list; (2) if `gemma` is provided, a Gemma pass over candidate names only
    (those whose tokens are strictly contained in another name's), to merge
    the variants the heuristic leaves aside (e.g. "Okoye" -> "Elvi Okoye").
    """
    base, _ = reconcile_characters([], characters, gemma=None)
    if gemma is None:
        return base
    return _gemma_merge_pass(base, gemma)
 def _gemma_merge_pass(base: list[Character], gemma) -> list[Character]:
    """Attach short forms to a full name (anchor) via Gemma.
    A deliberately constrained task (more reliable than free-form grouping):
    a "short form" is a name whose tokens are a strict subset of another
    name's tokens (e.g. "Okoye" vs "Elvi Okoye"). Gemma maps each short form
    to the EXACT canonical name of an anchor, or to "NOUVEAU". Short forms are
    processed in small batches to stay within the model's reliable range.
    """
    shorts: list[Character] = []
    anchors: list[Character] = []
    for i, c in enumerate(base):
        ts = _tokens(c.name)
        if ts and any(j != i and ts < _tokens(d.name) for j, d in enumerate(base)):
            shorts.append(c)
        else:
            anchors.append(c)
    if not shorts:
        return base
    result = [a.model_copy(deep=True) for a in anchors]
    leftovers: list[Character] = []
    for start in range(0, len(shorts), 12):
        chunk = shorts[start:start + 12]
        decisions = _gemma_reconcile(result, [_item(s) for s in chunk], gemma)
        for s in chunk:
            canon = decisions.get(_norm(s.name))
            tgt = _find(result, canon) if isinstance(canon, str) else None
            if tgt is None:
                hm = heuristic_match(s.name, result)
                tgt = hm if isinstance(hm, Character) else None
            # Safeguard: never merge two characters whose known genders differ.
            if tgt is not None and s.gender and tgt.gender and s.gender != tgt.gender:
                tgt = None
            if tgt is not None:
                _absorb(tgt, s.name, gender=s.gender, age=s.age,
                        description=s.description, voice_id=s.voice_id)
                for a in s.aliases:
                    _absorb(tgt, a)
            else:
                leftovers.append(s)
    return result + leftovers
 def _gemma_reconcile(
    chars: list[Character], pending: list[dict], gemma
 ) -> dict[str, object]:
    """Single batched call: maps each pending name to its canonical name or _NEW."""
    known = []
    for c in chars:
        al = f" (alias: {', '.join(c.aliases)})" if c.aliases else ""
        desc = f" — {c.description}" if c.description else ""
        known.append(f"- {c.name}{al}{desc}")
    new_lines = []
    for n, it in enumerate(pending):
        desc = f" — {it['description']}" if it.get("description") else ""
        new_lines.append(f"[{n}] {it['name']}{desc}")
    prompt = (
        "Personnages DEJA connus du livre :\n"
        + ("\n".join(known) if known else "(aucun)")
        + "\n\nNoms DETECTES a classer :\n" + "\n".join(new_lines)
        + "\n\nPour chaque nom detecte, indique s'il designe un personnage deja "
        "connu (donne alors son nom canonique EXACT tel qu'ecrit ci-dessus) ou "
        "s'il s'agit d'un nouveau personnage (\"NOUVEAU\"). Ne fusionne que si "
        "c'est, avec certitude, la meme personne. EN CAS DE DOUTE, ou si "
        "plusieurs personnages connus pourraient correspondre, reponds "
        "\"NOUVEAU\". Ne rapproche jamais deux personnes differentes qui "
        "partagent seulement un prenom ou un nom de famille.\n\n"
        'Reponds par un tableau JSON: '
        '[{"i":0,"canonical":"James Holden"},{"i":1,"canonical":"NOUVEAU"}]'
    )
    if len(prompt) > _MAX_PROMPT_CHARS:
        prompt = prompt[:_MAX_PROMPT_CHARS]
    result = gemma.generate_json(prompt, system=get_settings().prompt_dedup)
    decisions: dict[str, object] = {}
    for item in result:
        if not isinstance(item, dict) or "i" not in item:
            continue
        n = item["i"]
        canon = str(item.get("canonical") or "").strip()
        if isinstance(n, int) and 0 <= n < len(pending) and canon:
            decisions[_norm(pending[n]["name"])] = (
                _NEW if canon.upper() == "NOUVEAU" else canon)
    return decisions
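The short-form/anchor split used by `_gemma_merge_pass` above can be tried in isolation. The following is a minimal sketch only: `tokens` is a hypothetical stand-in for the project's `_tokens` helper (whose real implementation is not shown here), and plain strings replace `Character` objects.

```python
import re
import unicodedata


def tokens(name: str) -> frozenset[str]:
    """Hypothetical stand-in for `_tokens`: accent-free, lowercase word set."""
    flat = unicodedata.normalize("NFKD", name)
    flat = "".join(ch for ch in flat if not unicodedata.combining(ch))
    return frozenset(re.findall(r"[a-z0-9]+", flat.lower()))


def split_shorts_and_anchors(names: list[str]) -> tuple[list[str], list[str]]:
    """A name is 'short' when its tokens are a strict subset of another name's."""
    shorts: list[str] = []
    anchors: list[str] = []
    for i, name in enumerate(names):
        ts = tokens(name)
        is_short = bool(ts) and any(
            j != i and ts < tokens(other) for j, other in enumerate(names)
        )
        (shorts if is_short else anchors).append(name)
    return shorts, anchors


shorts, anchors = split_shorts_and_anchors(
    ["Elvi Okoye", "Okoye", "James Holden", "Naomi"]
)
print("short forms:", shorts)
print("anchors:", anchors)
```

Only the short forms are then submitted to the model, which keeps each request small and limits the damage a wrong answer can do.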
--- a/backend/inkflow/casting/voicebank.py
+++ b/backend/inkflow/casting/voicebank.py
@@ -0,0 +1,91 @@
 """Voice bank: a self-contained set of varied voices (gender/age).
 Each voice is backed by a Kokoro voice (identity + reference clip). The
 reference clip is generated once by reading a standard French passage; it
 serves as the timbre reference for Qwen3 cloning (final render). No external
 assets need to be sourced.
 Engine resolution:
 - Kokoro -> VoiceSpec(preset=kokoro_voice)        (fast, preview / draft)
 - Qwen3  -> VoiceSpec(ref_audio=clip, ref_text=…) (quality, cloning)
 """
 from __future__ import annotations
 from pathlib import Path
 import soundfile as sf
 from ..config import VOICEBANK_DIR
 from ..models import VoiceEntry, Voicebank
 from ..tts.base import VoiceSpec
 # Reference passage read by each voice to create its cloning clip.
 REFERENCE_TEXT = (
    "L'univers est toujours plus étrange qu'on ne le croit. "
    "Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
 )
 # Default voice set (varied in gender). ff_siwis is the only native French voice;
 # the others borrow an English timbre but read text phonemized as French.
 SEED: list[VoiceEntry] = [
    VoiceEntry(id="fr_f_siwis",  kokoro_voice="ff_siwis",   gender="female", age="adult", label="Siwis (FR)"),
    VoiceEntry(id="f_bella",     kokoro_voice="af_bella",   gender="female", age="adult", label="Bella"),
    VoiceEntry(id="f_heart",     kokoro_voice="af_heart",   gender="female", age="young", label="Heart"),
    VoiceEntry(id="f_emma",      kokoro_voice="bf_emma",    gender="female", age="adult", label="Emma"),
    VoiceEntry(id="f_nicole",    kokoro_voice="af_nicole",  gender="female", age="adult", label="Nicole"),
    VoiceEntry(id="m_fenrir",    kokoro_voice="am_fenrir",  gender="male",   age="adult", label="Fenrir"),
    VoiceEntry(id="m_michael",   kokoro_voice="am_michael", gender="male",   age="adult", label="Michael"),
    VoiceEntry(id="m_george",    kokoro_voice="bm_george",  gender="male",   age="adult", label="George"),
    VoiceEntry(id="m_lewis",     kokoro_voice="bm_lewis",   gender="male",   age="adult", label="Lewis"),
    VoiceEntry(id="m_eric",      kokoro_voice="am_eric",    gender="male",   age="young", label="Eric"),
    VoiceEntry(id="m_santa",     kokoro_voice="am_santa",   gender="male",   age="old",   label="Santa"),
 ]
 def metadata_path() -> Path:
    return VOICEBANK_DIR / "metadata.json"
 def clips_dir() -> Path:
    return VOICEBANK_DIR / "clips"
 def load_voicebank() -> Voicebank:
    path = metadata_path()
    if path.exists():
        return Voicebank.model_validate_json(path.read_text(encoding="utf-8"))
    return Voicebank(entries=list(SEED))
 def save_voicebank(vb: Voicebank) -> Path:
    VOICEBANK_DIR.mkdir(parents=True, exist_ok=True)
    metadata_path().write_text(vb.model_dump_json(indent=2), encoding="utf-8")
    return metadata_path()
 def build_voicebank(*, regenerate: bool = False) -> Voicebank:
    """Generate any missing reference clips and write metadata.json."""
    from ..tts.kokoro import KokoroBackend
    clips_dir().mkdir(parents=True, exist_ok=True)
    backend = KokoroBackend()
    entries: list[VoiceEntry] = []
    for seed in SEED:
        clip_rel = f"clips/{seed.id}.wav"
        clip_abs = VOICEBANK_DIR / clip_rel
        if regenerate or not clip_abs.exists():
            audio, sr = backend.synthesize(REFERENCE_TEXT, VoiceSpec(preset=seed.kokoro_voice))
            sf.write(str(clip_abs), audio, sr)
        entry = seed.model_copy(update={"ref_audio": clip_rel, "ref_text": REFERENCE_TEXT})
        entries.append(entry)
    vb = Voicebank(entries=entries)
    save_voicebank(vb)
    return vb
 def voice_spec_for(entry: VoiceEntry, engine: str, *, speed: float = 1.0) -> VoiceSpec:
    """Build the VoiceSpec suited to the target engine."""
    if engine == "qwen3" and entry.ref_audio:
        ref_abs = str(VOICEBANK_DIR / entry.ref_audio)
        return VoiceSpec(ref_audio=ref_abs, ref_text=entry.ref_text, speed=speed)
    return VoiceSpec(preset=entry.kokoro_voice, speed=speed)
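The engine resolution in `voice_spec_for` can be illustrated without the project's pydantic models. This is a sketch under stated assumptions: the two dataclasses are simplified stand-ins for `VoiceSpec` and `VoiceEntry`, and `bank_dir` is passed explicitly instead of reading `VOICEBANK_DIR`.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class VoiceSpec:  # simplified stand-in for inkflow.tts.base.VoiceSpec
    preset: Optional[str] = None
    ref_audio: Optional[str] = None
    ref_text: Optional[str] = None
    speed: float = 1.0


@dataclass
class VoiceEntry:  # simplified stand-in for inkflow.models.VoiceEntry
    id: str
    kokoro_voice: str
    ref_audio: Optional[str] = None
    ref_text: Optional[str] = None


def voice_spec_for(entry: VoiceEntry, engine: str, bank_dir: Path,
                   *, speed: float = 1.0) -> VoiceSpec:
    # Qwen3 clones a timbre, so it needs the reference clip and its transcript.
    if engine == "qwen3" and entry.ref_audio:
        return VoiceSpec(ref_audio=str(bank_dir / entry.ref_audio),
                         ref_text=entry.ref_text, speed=speed)
    # Any other engine, or a missing clip, falls back to the Kokoro preset.
    return VoiceSpec(preset=entry.kokoro_voice, speed=speed)


bank = Path("voicebank")
with_clip = VoiceEntry("fr_f_siwis", "ff_siwis",
                       ref_audio="clips/fr_f_siwis.wav", ref_text="Bonjour.")
no_clip = VoiceEntry("m_george", "bm_george")

cloned = voice_spec_for(with_clip, "qwen3", bank)
fallback = voice_spec_for(no_clip, "qwen3", bank)
preview = voice_spec_for(with_clip, "kokoro", bank, speed=1.1)
print(cloned, fallback, preview, sep="\n")
```

Note the silent fallback: asking for `qwen3` on an entry without a clip yields a preset-only spec, so a caller that requires cloning would have to check `ref_audio` itself.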
--- a/backend/inkflow/cli.py
+++ b/backend/inkflow/cli.py
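@@ -0,0 +1,239 @@
 """InkFlow CLI (typer).
 Commands:
 - parse     : EPUB -> book.json + chapters/chNN.json
 - info      : show the structure of an already-parsed book
 - serve     : start the API + web UI
 - analyze   : Gemma analysis of one (or all) chapter(s) -> analysis + cast
 - pronounce : propose pronunciation candidates -> pronunciation.json
 - cast      : build the voice bank (if needed) and auto-assign voices
 - render    : synthesize one chapter to MP3
 """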
 from __future__ import annotations
 from typing import Optional
 import typer
 from rich.console import Console
 from rich.table import Table
 from .config import ensure_dirs
 from .epub.parser import load_book, load_chapter_text, parse_epub
 from .models import Cast
 from .store import artifacts
 app = typer.Typer(add_completion=False, help="InkFlow : EPUB -> livre audio (local, MLX).")
 console = Console()
@app.command()
 def parse(epub_path: str, slug: Optional[str] = typer.Option(None, help="Slug interne (def: depuis le titre).")):
    """Parse an EPUB into a normalized structure."""
    ensure_dirs()
    book = parse_epub(epub_path, slug=slug)
    console.print(f"[green]Parse:[/] {book.title} — slug=[cyan]{book.slug}[/]")
    console.print(f"  {len(book.chapters)} items, {len(book.render_chapters)} a rendre.")
    _print_chapters(book)
@app.command()
 def info(slug: str):
    """Show the structure of an already-parsed book."""
    _print_chapters(load_book(slug))
@app.command()
 def serve(host: str = "127.0.0.1", port: int = 8000):
    """Start the API + web UI (serves frontend/dist if it has been built)."""
    import uvicorn
    ensure_dirs()
    console.print(f"[green]InkFlow[/] sur http://{host}:{port}")
    uvicorn.run("inkflow.api.app:app", host=host, port=port, log_level="info")
@app.command()
 def analyze(
    slug: str,
    chapter: Optional[int] = typer.Option(None, help="Index de chapitre unique (def: tous)."),
    limit: Optional[int] = typer.Option(None, help="Limiter au N premiers chapitres rendus."),
    force: bool = typer.Option(False, help="Re-analyser meme si un artefact existe."),
 ):
    """Gemma analysis: narration/dialogue segments + speakers + cast."""
    from .analysis.gemma import Gemma
    from .analysis.segmenter import analyze_chapter
    from .settings import get_settings
    book = load_book(slug)
    gemma = Gemma()
    dedup_gemma = gemma if get_settings().dedup_use_gemma else None
    cast = artifacts.load_cast(slug)
    chars = list(cast.characters)
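    targets = list(book.render_chapters)
    if chapter is not None:
        # Keep only a chapter that has a text file; load_chapter_text would
        # otherwise fail on items with text_file=None (e.g. front matter).
        targets = [c for c in book.chapters if c.index == chapter and c.text_file]
    elif limit:
        targets = targets[:limit]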
    for ch in targets:
        if not force and artifacts.analysis_path(slug, ch.index).exists():
            console.print(f"[dim]ch{ch.index:02d} deja analyse — ignore.[/]")
            continue
        ct = load_chapter_text(slug, ch)
        console.print(f"[blue]Analyse[/] ch{ch.index:02d} — {ch.title} ({ct.word_count} mots)…")
        try:
            # Deduplication happens inside analyze_chapter: `chars` receives the
            # cumulative, reconciled cast.
            analysis, chars = analyze_chapter(
                ch, ct, gemma, book_chars=chars, dedup_gemma=dedup_gemma)
        except Exception as exc:  # noqa: BLE001 - one failing chapter must not stop the run
            console.print(f"  [yellow]! echec, chapitre ignore: {exc}[/]")
            continue
        artifacts.save_analysis(slug, analysis)
        n_dlg = sum(1 for s in analysis.segments if s.type.value == "dialogue")
        console.print(f"  -> {len(analysis.segments)} segments ({n_dlg} repliques), "
                      f"{len(chars)} personnages cumules.")
    cast = Cast(narrator_voice_id=cast.narrator_voice_id, characters=chars)
    artifacts.save_cast(slug, cast)
    console.print(f"[green]Casting[/] : {len(chars)} personnages -> cast.json")
@app.command()
 def pronounce(
    slug: str,
    chapter: Optional[int] = typer.Option(None, help="Index de chapitre (def: 1er rendu)."),
 ):
    """Propose pronunciation candidates (Gemma) -> pronunciation.json."""
    from .analysis.gemma import Gemma
    from .analysis.pronunciation import merge_pronunciations, propose_pronunciations
    book = load_book(slug)
    ch = (next((c for c in book.chapters if c.index == chapter), None)
          if chapter is not None else (book.render_chapters[0] if book.render_chapters else None))
    if ch is None or not ch.text_file:
        console.print("[red]Chapitre introuvable.[/]")
        raise typer.Exit(1)
    ct = load_chapter_text(slug, ch)
    gemma = Gemma()
    with console.status("Recherche des mots a risque…"):
        new = propose_pronunciations("\n".join(ct.paragraphs), gemma)
    pron = merge_pronunciations(artifacts.load_pronunciation(slug), new)
    artifacts.save_pronunciation(slug, pron)
    table = Table("terme", "prononciation", "note")
    for e in pron.entries:
        table.add_row(e.term, e.replacement, e.note or "")
    console.print(table)
    console.print(f"[green]{len(pron.entries)} entrees[/] -> pronunciation.json")
@app.command()
 def cast(
    slug: str,
    rebuild_voicebank: bool = typer.Option(False, help="Regenere les clips de la voicebank."),
    dedup: bool = typer.Option(False, help="Deduplique d'abord les variantes de noms (heuristique)."),
    llm: bool = typer.Option(False, "--llm", help="Ajoute la passe Gemma a la dedup (moins sur)."),
 ):
    """Build the voice bank (if needed) and auto-assign voices to the cast."""
    from .casting.assign import assign_voices
    from .casting.voicebank import build_voicebank, load_voicebank
    cast = artifacts.load_cast(slug)
    if not cast.characters:
        console.print("[yellow]Aucun personnage — lance d'abord `analyze`.[/]")
        raise typer.Exit(1)
    if dedup:
        from .casting.dedup import dedup_cast  # Cast is already imported at module level
        gemma = None
        if llm:
            from .analysis.gemma import Gemma
            gemma = Gemma()
        before = len(cast.characters)
        with console.status("Deduplication du casting…"):
            chars = dedup_cast(cast.characters, gemma)
        cast = Cast(narrator_voice_id=cast.narrator_voice_id, characters=chars)
        artifacts.save_cast(slug, cast)
        console.print(f"[green]Dedup[/] : {before} -> {len(chars)} personnages.")
    vb = load_voicebank()
    if rebuild_voicebank or not vb.entries or not any(e.ref_audio for e in vb.entries):
        with console.status("Generation des clips de la voicebank…"):
            vb = build_voicebank(regenerate=rebuild_voicebank)
        console.print(f"[green]Voicebank[/] : {len(vb.entries)} voix, clips generes.")
    cast = assign_voices(cast.characters, vb, narrator_voice_id=cast.narrator_voice_id)
    artifacts.save_cast(slug, cast)
    table = Table("personnage", "genre", "voix")
    table.add_row("[narrateur]", "", cast.narrator_voice_id or "")
    for ch in cast.characters:
        table.add_row(ch.name, ch.gender or "?", ch.voice_id or "")
    console.print(table)
@app.command()
 def render(
    slug: str,
    chapter: int = typer.Argument(..., help="Index du chapitre a synthetiser."),
    backend: str = typer.Option("kokoro", help="Moteur TTS: kokoro | qwen3."),
    mono: bool = typer.Option(True, help="Mono-narrateur (sinon multi-voix via cast)."),
    max_paragraphs: Optional[int] = typer.Option(None, help="Limiter (test rapide)."),
 ):
    """Synthesize one chapter to MP3 under output/<book>/."""
    from .pipeline.render import (
        build_units_mono,
        build_units_multi,
        render_chapter_to_mp3,
    )
    from .tts.base import VoiceSpec
    from .tts.factory import get_backend
    book = load_book(slug)
    ch = next((c for c in book.chapters if c.index == chapter), None)
    if ch is None or not ch.text_file:
        console.print(f"[red]Chapitre {chapter} introuvable ou non rendu.[/]")
        raise typer.Exit(1)
    ct = load_chapter_text(slug, ch)
    if max_paragraphs:
        ct.paragraphs = ct.paragraphs[:max_paragraphs]
    tts = get_backend(backend)
    pron = artifacts.load_pronunciation(slug)
    if mono:
        units = build_units_mono(ct, tts.default_voice())
    else:
        from .casting.voicebank import load_voicebank, voice_spec_for
        from .pipeline.render import make_voice_resolver
        analysis = artifacts.load_analysis(slug, chapter)
        cast_data = artifacts.load_cast(slug)
        vb = load_voicebank()
        # Default narrator voice comes from the voice bank when available.
        narrator_entry = vb.by_id(cast_data.narrator_voice_id) if cast_data.narrator_voice_id else None
        default_voice = (voice_spec_for(narrator_entry, backend)
                         if narrator_entry else tts.default_voice())
        resolver = make_voice_resolver(cast_data, vb, backend)
        units = build_units_multi(analysis, resolver, default_voice)
    with console.status(f"Synthese de {len(units)} unites ({backend})…"):
        def _p(done, total):
            console.print(f"  unite {done}/{total}", end="\r")
        track = (book.render_chapters.index(ch) + 1) if ch in book.render_chapters else None
        mp3 = render_chapter_to_mp3(book, ch, units, tts, pron=pron, track=track, progress=_p)
    console.print(f"\n[green]MP3:[/] {mp3}")
 def _print_chapters(book) -> None:
    table = Table(show_header=True, header_style="bold")
    for col in ("idx", "kind", "render", "pov", "mots", "sortie", "titre"):
        table.add_column(col)
    for c in book.chapters:
        table.add_row(
            str(c.index), c.kind.value, "✓" if c.render else "·",
            c.pov or "", str(c.word_count), c.output_name or "",
            c.title)
    console.print(table)
 if __name__ == "__main__":
    app()
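The chapter selection performed by `analyze` (explicit index, else the first N rendered chapters, else all rendered chapters) can be isolated as a pure function. This is a sketch only: the `Chapter` dataclass is a simplified stand-in, and treating `render_chapters` as "chapters whose `render` flag is set" is an assumption about the `Book` model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chapter:  # simplified stand-in for inkflow.models.Chapter
    index: int
    render: bool


def select_targets(chapters: list[Chapter], *, chapter: Optional[int] = None,
                   limit: Optional[int] = None) -> list[Chapter]:
    # An explicit index wins and is looked up among ALL chapters.
    if chapter is not None:
        return [c for c in chapters if c.index == chapter]
    # Otherwise work on rendered chapters, optionally truncated.
    targets = [c for c in chapters if c.render]
    return targets[:limit] if limit else targets


book = [Chapter(0, False), Chapter(1, True), Chapter(2, True), Chapter(3, True)]
print([c.index for c in select_targets(book)])
print([c.index for c in select_targets(book, limit=2)])
print([c.index for c in select_targets(book, chapter=0)])
```

The last call shows that an explicit index bypasses the `render` filter, so a caller may receive a chapter that has no text to analyze.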
--- a/backend/inkflow/config.py
+++ b/backend/inkflow/config.py
@@ -0,0 +1,96 @@
 """Central configuration for InkFlow.
 All constants (paths, MLX model identifiers, default parameters) are gathered
 here so they can easily be overridden through environment variables.
 """
 from __future__ import annotations
 import os
 from pathlib import Path
 # --- Project roots -----------------------------------------------------------
 # config.py lives in backend/inkflow/, so parents[1] is backend/ and the
 # project root is one level above backend/.
 BACKEND_DIR = Path(__file__).resolve().parents[1]
 PROJECT_ROOT = BACKEND_DIR.parent
 def _env_path(var: str, default: Path) -> Path:
    return Path(os.environ.get(var, default)).expanduser().resolve()
 # Working data (per-book state: json, db, intermediate wav files)
 DATA_DIR = _env_path("INKFLOW_DATA_DIR", PROJECT_ROOT / "data")
 # Final output (1 folder per book, 1 mp3 per chapter)
 OUTPUT_DIR = _env_path("INKFLOW_OUTPUT_DIR", PROJECT_ROOT / "output")
 # Reference voice bank (clips + metadata.json)
 VOICEBANK_DIR = _env_path("INKFLOW_VOICEBANK_DIR", PROJECT_ROOT / "voicebank")
 # Provided samples
 SAMPLES_DIR = PROJECT_ROOT / "samples"
 # --- MLX models (HuggingFace mlx-community) ----------------------------------
 # Text analysis: Gemma via mlx-lm.
 GEMMA_MODEL = os.environ.get(
    "INKFLOW_GEMMA_MODEL", "mlx-community/gemma-3-4b-it-4bit"
 )
 # TTS: Qwen3-TTS (final render, cloning) and Kokoro (fast preview).
 QWEN3_TTS_MODEL = os.environ.get(
    "INKFLOW_QWEN3_MODEL", "mlx-community/Qwen3-TTS-12Hz-1.7B-Base-8bit"
 )
 KOKORO_MODEL = os.environ.get(
    "INKFLOW_KOKORO_MODEL", "mlx-community/Kokoro-82M-bf16"
 )
 # --- TTS parameters ----------------------------------------------------------
 DEFAULT_LANGUAGE = os.environ.get("INKFLOW_LANGUAGE", "French")
 # Kokoro language code (misaki): 'f' = French.
 KOKORO_LANG_CODE = os.environ.get("INKFLOW_KOKORO_LANG", "f")
 # Default Kokoro voice for previews / fast single-narrator rendering.
 KOKORO_DEFAULT_VOICE = os.environ.get("INKFLOW_KOKORO_VOICE", "ff_siwis")
 # Default Qwen3 voice (narrator) when no reference clip is provided.
 QWEN3_DEFAULT_VOICE = os.environ.get("INKFLOW_QWEN3_VOICE", "Chelsie")
 # Target sample rate for concatenation (Hz). Backends return their own
 # sample rate; postprocess resamples as needed.
 TARGET_SAMPLE_RATE = int(os.environ.get("INKFLOW_SAMPLE_RATE", "24000"))
 # Final mp3 encoding.
 MP3_BITRATE = os.environ.get("INKFLOW_MP3_BITRATE", "128k")
 # Loudness normalization target in dBFS (a rough stand-in for LUFS, applied via pydub gain).
 TARGET_DBFS = float(os.environ.get("INKFLOW_TARGET_DBFS", "-18.0"))
 def book_data_dir(book_slug: str) -> Path:
    """Working directory for a book (intermediate artifacts)."""
    return DATA_DIR / book_slug
 def book_output_dir(book_title: str) -> Path:
    """Final output directory for a book (one mp3 per chapter)."""
    return OUTPUT_DIR / book_title
 def ensure_dirs() -> None:
    for d in (DATA_DIR, OUTPUT_DIR, VOICEBANK_DIR):
        d.mkdir(parents=True, exist_ok=True)
 def setup_espeak() -> None:
    """Locate libespeak-ng for phonemizer (required by Kokoro for non-English text).
    phonemizer does not always find the library installed via brew, so
    PHONEMIZER_ESPEAK_LIBRARY is set explicitly unless it is already defined.
    """
    if os.environ.get("PHONEMIZER_ESPEAK_LIBRARY"):
        return
    candidates = [
        "/opt/homebrew/lib/libespeak-ng.dylib",
        "/usr/local/lib/libespeak-ng.dylib",
        "/opt/homebrew/lib/libespeak-ng.1.dylib",
    ]
    for path in candidates:
        if os.path.exists(path):
            os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = path
            return
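The environment-override pattern behind `_env_path` can be checked on its own. A minimal sketch follows; `INKFLOW_DEMO_DIR` is a made-up variable name used only for the demonstration, not one the project reads.

```python
import os
from pathlib import Path


def env_path(var: str, default: Path) -> Path:
    """Return the path stored in `var` if set, else `default`, made absolute."""
    return Path(os.environ.get(var, default)).expanduser().resolve()


# INKFLOW_DEMO_DIR is hypothetical: it exists only for this example.
os.environ.pop("INKFLOW_DEMO_DIR", None)
fallback = env_path("INKFLOW_DEMO_DIR", Path("data"))

os.environ["INKFLOW_DEMO_DIR"] = "custom/data"
overridden = env_path("INKFLOW_DEMO_DIR", Path("data"))

print(fallback)
print(overridden)
```

Because the module-level constants are computed at import time, an override only takes effect if the variable is exported before `inkflow.config` is first imported.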
--- a/backend/inkflow/epub/__init__.py
+++ b/backend/inkflow/epub/__init__.py
--- a/backend/inkflow/epub/parser.py
+++ b/backend/inkflow/epub/parser.py
@@ -0,0 +1,267 @@
 """EPUB parsing -> normalized book structure.
 Strategy:
 - ebooklib reads the archive (manifest + spine + ncx).
 - Reading order comes from the spine.
 - Titles come from the table of contents (ncx/nav), mapped by href.
 - The text of each document is extracted with BeautifulSoup (paragraphs).
 - Each item is classified as front / chapter / back, which decides whether it is read aloud.
 Outputs written to data/<slug>/:
 - book.json                : metadata + chapter list (Book model)
 - chapters/chNN.json       : normalized text per chapter (ChapterText model)
 - cover.<ext>              : extracted cover (if present)
 """
 from __future__ import annotations
 import re
 import warnings
 from pathlib import Path
 from typing import Optional
 from urllib.parse import unquote, urldefrag
 import ebooklib
 from bs4 import BeautifulSoup
 from ebooklib import epub
 # EPUB xhtml triggers a harmless bs4 warning; silence it.
 try:
    from bs4 import XMLParsedAsHTMLWarning
    warnings.filterwarnings("ignore", category=XMLParsedAsHTMLWarning)
 except ImportError:  # pragma: no cover
    pass
 from ..config import book_data_dir
 from ..models import Book, Chapter, ChapterKind, ChapterText
 from ..util import safe_filename, slugify
 # A chapter title starts with a number, PROLOGUE or EPILOGUE.
 _CHAPTER_RE = re.compile(r"^\s*(\d+|prologue|[ée]pilogue)\b", re.IGNORECASE)
 # Captures "<number> - <POV>" or just "<number>".
 _TITLE_PARTS_RE = re.compile(r"^\s*([^-\n]+?)(?:\s*[-–—]\s*(.+))?\s*$")
 # Minimum word count for a back-matter item (acknowledgments...) to be read aloud.
 _BACK_MATTER_MIN_WORDS = 40
 def _build_toc_titles(book: epub.EpubBook) -> dict[str, str]:
    """Map href (without fragment) -> title, flattening the ncx/nav toc."""
    titles: dict[str, str] = {}
    def walk(items) -> None:
        for it in items:
            if isinstance(it, tuple):  # (Section, [children])
                section, children = it
                if isinstance(section, epub.Link):
                    _add(section)
                walk(children)
            elif isinstance(it, list):
                walk(it)
            elif isinstance(it, epub.Link):
                _add(it)
    def _add(link: epub.Link) -> None:
        href = unquote(urldefrag(link.href)[0])
        if href and href not in titles and link.title:
            titles[href] = link.title.strip()
    walk(book.toc)
    return titles
 def _extract_paragraphs(html: bytes) -> list[str]:
    """Extract the readable paragraphs from an xhtml document."""
    soup = BeautifulSoup(html, "lxml")
    # Drop non-narrative elements.
    for tag in soup(["script", "style", "sup", "table"]):
        tag.decompose()
    paragraphs: list[str] = []
    blocks = soup.find_all(["p", "h1", "h2", "h3", "h4", "blockquote", "li"])
    if not blocks and soup.body:
        blocks = [soup.body]
    for block in blocks:
        text = block.get_text(" ", strip=True)
        text = re.sub(r"\s+", " ", text).strip()
        if text:
            paragraphs.append(text)
    return paragraphs
 def _parse_title(title: str) -> tuple[Optional[str], Optional[str]]:
    """Split a chapter title into (number, pov)."""
    m = _TITLE_PARTS_RE.match(title)
    if not m:
        return None, None
    number = (m.group(1) or "").strip() or None
    pov = (m.group(2) or "").strip() or None
    return number, pov
 def _output_name(seq: int, kind: ChapterKind, number: Optional[str], title: str) -> str:
    """mp3 name modeled on the sample's format (NN-<label>.mp3)."""
    prefix = f"{seq:02d}"
    label: str
    if kind is ChapterKind.CHAPTER and number:
        low = number.lower()
        if low == "prologue":
            label = "Prologue"
        elif low in ("epilogue", "épilogue"):
            label = "Épilogue"
        elif number.isdigit():
            label = f"Chapitre {int(number)}"
        else:
            label = number.capitalize()
    else:
        label = title
    if label.isupper():  # all-caps titles (e.g. "REMERCIEMENTS")
        label = label.capitalize()
    return safe_filename(f"{prefix}-{label}") + ".mp3"
 def _classify(ordered: list[dict]) -> None:
    """Assign kind/render to each item (mutates in place).
    front   = before the first chapter (cover, title page...)
    chapter = title matches the chapter pattern (number, prologue, epilogue)
    back    = any other item from the first chapter onward (acknowledgments,
              glossary, and also non-chapter items sitting between two chapters)
    """
    chapter_idxs = [
        i for i, it in enumerate(ordered)
        if it["title"] and _CHAPTER_RE.match(it["title"])
    ]
    first = chapter_idxs[0] if chapter_idxs else len(ordered)
    for i, it in enumerate(ordered):
        is_chapter = bool(it["title"]) and bool(_CHAPTER_RE.match(it["title"]))
        if is_chapter:
            it["kind"] = ChapterKind.CHAPTER
            it["render"] = it["word_count"] > 0
        elif i < first:
            it["kind"] = ChapterKind.FRONT
            it["render"] = False
        else:  # non-chapter item after the first chapter: treated as back matter
            it["kind"] = ChapterKind.BACK
            it["render"] = it["word_count"] >= _BACK_MATTER_MIN_WORDS
 def _extract_cover(book: epub.EpubBook, dest_dir: Path) -> Optional[str]:
    cover_item = None
    for item in book.get_items_of_type(ebooklib.ITEM_COVER):
        cover_item = item
        break
    if cover_item is None:  # fallback: an image item whose name contains "cover"
        for item in book.get_items_of_type(ebooklib.ITEM_IMAGE):
            if "cover" in item.get_name().lower():
                cover_item = item
                break
    if cover_item is None:
        return None
    ext = Path(cover_item.get_name()).suffix or ".jpg"
    dest = dest_dir / f"cover{ext}"
    dest.write_bytes(cover_item.get_content())
    return dest.name
 def parse_epub(epub_path: str | Path, slug: Optional[str] = None) -> Book:
    """Parse an EPUB and write book.json + chapters/chNN.json under data/<slug>/."""
    epub_path = Path(epub_path)
    book_ml = epub.read_epub(str(epub_path), options={"ignore_ncx": False})
    title = _meta(book_ml, "title") or epub_path.stem
    author = _meta(book_ml, "creator")
    description = _meta(book_ml, "description")
    language = _meta(book_ml, "language") or "fr"
    slug = slug or slugify(title)
    data_dir = book_data_dir(slug)
    chapters_dir = data_dir / "chapters"
    chapters_dir.mkdir(parents=True, exist_ok=True)
    toc_titles = _build_toc_titles(book_ml)
    # Documents in spine order.
    id_to_item = {it.get_id(): it for it in book_ml.get_items()}
    ordered: list[dict] = []
    for idref, _linear in book_ml.spine:
        item = id_to_item.get(idref)
        if item is None or item.get_type() != ebooklib.ITEM_DOCUMENT:
            continue
        href = unquote(item.get_name())
        paragraphs = _extract_paragraphs(item.get_content())
        title_txt = toc_titles.get(href, "")
        ordered.append({
            "item_id": idref,
            "src": href,
            "title": title_txt,
            "paragraphs": paragraphs,
            "word_count": sum(len(p.split()) for p in paragraphs),
        })
    _classify(ordered)
    cover_file = _extract_cover(book_ml, data_dir)
    chapters: list[Chapter] = []
    seq = 0  # filename prefix counter, incremented only for rendered items
    for index, it in enumerate(ordered):
        number = pov = None
        if it["kind"] is ChapterKind.CHAPTER:
            number, pov = _parse_title(it["title"])
        text_file = None
        output_name = None
        if it["render"]:
            seq += 1
            ct = ChapterText(index=index, title=it["title"] or it["src"],
                             paragraphs=it["paragraphs"])
            text_file = f"chapters/ch{index:02d}.json"
            (data_dir / text_file).write_text(
                ct.model_dump_json(indent=2), encoding="utf-8")
            output_name = _output_name(seq, it["kind"], number, it["title"] or "")
        chapters.append(Chapter(
            index=index,
            item_id=it["item_id"],
            src=it["src"],
            title=it["title"] or it["src"],
            kind=it["kind"],
            render=it["render"],
            number=number,
            pov=pov,
            word_count=it["word_count"],
            text_file=text_file,
            output_name=output_name,
        ))
    book = Book(
        slug=slug,
        title=title,
        author=author,
        language=(language[:2] if language else "fr"),
        description=description,
        cover_file=cover_file,
        chapters=chapters,
    )
    (data_dir / "book.json").write_text(
        book.model_dump_json(indent=2), encoding="utf-8")
    return book
 def _meta(book: epub.EpubBook, name: str) -> Optional[str]:
    values = book.get_metadata("DC", name)
    if values:
        return values[0][0]
    return None
 def load_book(slug: str) -> Book:
    path = book_data_dir(slug) / "book.json"
    return Book.model_validate_json(path.read_text(encoding="utf-8"))
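 def load_chapter_text(slug: str, chapter: Chapter) -> ChapterText:
    if chapter.text_file is None:
        # Only rendered chapters get a text artifact written by parse_epub.
        raise ValueError(f"chapter {chapter.index} has no text file (render=False)")
    path = book_data_dir(slug) / chapter.text_file
    return ChapterText.model_validate_json(path.read_text(encoding="utf-8"))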
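The loop in `parse_epub` keeps two counters: `index` follows every spine document, while `seq` advances only for documents that will be rendered, so MP3 prefixes stay contiguous even when front matter is skipped. A minimal sketch of that numbering, assuming a plain list of render flags; `assign_sequence` is a hypothetical helper for illustration, not part of InkFlow:

```python
from typing import Optional


def assign_sequence(render_flags: list[bool]) -> list[tuple[int, Optional[int]]]:
    """Pair each spine index with a 1-based output number, or None if skipped."""
    pairs: list[tuple[int, Optional[int]]] = []
    seq = 0  # advances only for rendered documents
    for index, render in enumerate(render_flags):
        if render:
            seq += 1
            pairs.append((index, seq))
        else:
            pairs.append((index, None))
    return pairs


# Hypothetical book: two front-matter documents, two chapters,
# one skipped back-matter page, then an epilogue.
flags = [False, False, True, True, False, True]
pairs = assign_sequence(flags)
for index, seq in pairs:
    print(index, seq)
```

The spine index is what names `chapters/chNN.json`, whereas the sequence number feeds the output file name, which is why the two can diverge.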
--- a/backend/inkflow/models.py
+++ b/backend/inkflow/models.py
@@ -0,0 +1,176 @@
 """Data schemas shared across the whole pipeline (pydantic v2).
 These models are serialized to JSON on disk (book.json, analysis/chNN.json,
 cast.json, pronunciation.json) and form the contract between pipeline
 stages. Each stage reads the previous stage's artifact and writes its own.
 """
 from __future__ import annotations
 from enum import Enum
 from typing import Optional
 from pydantic import BaseModel, Field
 class ChapterKind(str, Enum):
    FRONT = "front"      # cover, title page, publisher notices (not read aloud)
    CHAPTER = "chapter"  # prologue, numbered chapters, epilogue (read aloud)
    BACK = "back"        # acknowledgements, glossary... (read if the text is substantial)
 class Chapter(BaseModel):
    index: int                       # position in the spine (0-based)
    item_id: str                     # idref from the OPF manifest
    src: str                         # internal XHTML path
    title: str                       # raw TOC title, e.g. "1 - ELVI"
    kind: ChapterKind
    render: bool                     # should audio be synthesized?
    number: Optional[str] = None     # "1", "PROLOGUE", "EPILOGUE"...
    pov: Optional[str] = None        # point-of-view character, e.g. "ELVI"
    word_count: int = 0
    text_file: Optional[str] = None  # relative path of the text JSON (chapters/chNN.json)
    output_name: Optional[str] = None  # final MP3 name, e.g. "02-Chapitre 1.mp3"
 class Book(BaseModel):
    slug: str                        # internal identifier (data folder)
    title: str
    author: Optional[str] = None
    language: str = "fr"
    description: Optional[str] = None
    cover_file: Optional[str] = None  # file name of the extracted cover, relative to data/<slug>/
    chapters: list[Chapter] = Field(default_factory=list)
    @property
    def render_chapters(self) -> list[Chapter]:
        return [c for c in self.chapters if c.render]
 class ChapterText(BaseModel):
    """Normalized plain text of a chapter (parser output)."""
    index: int
    title: str
    paragraphs: list[str] = Field(default_factory=list)
    @property
    def word_count(self) -> int:
        return sum(len(p.split()) for p in self.paragraphs)
 # --- Analysis (Gemma stage) --------------------------------------------------
 class SegmentType(str, Enum):
    NARRATION = "narration"
    DIALOGUE = "dialogue"
 class Incise(BaseModel):
    """Bounds of a narration aside (incise) embedded in a line of dialogue.
    Character offsets into `Segment.text`: the substring `text[start:end]`
    is narration (e.g. "dit-il", "lanca Drummer") that the narrator's voice
    carries at render time, without splitting the persisted dialogue line.
    """
    start: int   # inclusive offset
    end: int     # exclusive offset
 class Segment(BaseModel):
    """Synthesis unit: a piece of text attributed to a speaker."""
    type: SegmentType
    text: str
    speaker: str = "narrateur"       # "narrateur" or a character name
    glued_to_prev: bool = False      # sub-segment from the same paragraph (incise)
                                     # -> shorter audio gap with the previous segment
    incises: list[Incise] = Field(default_factory=list)  # narrator spans INSIDE text
 class ChapterAnalysis(BaseModel):
    index: int
    title: str
    segments: list[Segment] = Field(default_factory=list)
 class Character(BaseModel):
    name: str                        # canonical name
    aliases: list[str] = Field(default_factory=list)
    gender: Optional[str] = None     # "male" | "female" | "unknown"
    age: Optional[str] = None        # "child" | "young" | "adult" | "old"
    description: Optional[str] = None
    voice_id: Optional[str] = None   # id in the voicebank (assigned during casting)
 class Cast(BaseModel):
    narrator_voice_id: Optional[str] = None
    characters: list[Character] = Field(default_factory=list)
 class VoiceEntry(BaseModel):
    """A voice in the bank, engine-agnostic.
    `kokoro_voice` is the identity (direct Kokoro rendering + reference clip);
    `ref_audio`/`ref_text` are used for Qwen3 cloning (final rendering).
    """
    id: str                          # e.g. "fr_f_siwis"
    kokoro_voice: str                # e.g. "ff_siwis"
    gender: str = "unknown"          # male | female | unknown
    age: str = "adult"               # child | young | adult | old
    lang: str = "fr"
    label: Optional[str] = None      # human-readable label
    ref_audio: Optional[str] = None  # clip path (relative to voicebank/)
    ref_text: Optional[str] = None   # transcript of the clip
 class Voicebank(BaseModel):
    entries: list[VoiceEntry] = Field(default_factory=list)
    def by_id(self, voice_id: str) -> Optional[VoiceEntry]:
        return next((e for e in self.entries if e.id == voice_id), None)
    def by_gender(self, gender: str) -> list[VoiceEntry]:
        return [e for e in self.entries if e.gender == gender]
 class PronunciationEntry(BaseModel):
    term: str                        # original spelling, e.g. "Tiamat"
    replacement: str                 # guided phonetic spelling, e.g. "Tia-mat"
    note: Optional[str] = None
    enabled: bool = True
 class Pronunciation(BaseModel):
    entries: list[PronunciationEntry] = Field(default_factory=list)
 # --- Project state (orchestration / UI) -------------------------------------
 class StageStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    ERROR = "error"
 class ChapterRenderState(BaseModel):
    index: int
    status: StageStatus = StageStatus.PENDING
    progress: float = 0.0            # 0..1
    mp3: Optional[str] = None        # output file name
    backend: Optional[str] = None
    error: Optional[str] = None
 class ProjectState(BaseModel):
    """Persistent state of a book, driven by the orchestrator and read by the UI."""
    slug: str
    title: str
    stages: dict[str, StageStatus] = Field(default_factory=dict)  # parse/analyze/cast/pronounce
    analyzed_chapters: list[int] = Field(default_factory=list)
    render: dict[int, ChapterRenderState] = Field(default_factory=dict)
    # Current job (for real-time display).
    active_stage: Optional[str] = None
    active_detail: Optional[str] = None
    active_progress: float = 0.0
    def stage(self, name: str) -> StageStatus:
        return self.stages.get(name, StageStatus.PENDING)
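The `Incise` model stores half-open character offsets into `Segment.text`, so a dialogue line can stay intact on disk and still be split into character and narrator pieces at render time. The sketch below shows one way such a split could work, assuming sorted, non-overlapping spans. `Span` and `split_by_spans` are hypothetical stand-ins written with plain dataclasses; they are not the project's `iter_incise_pieces`, whose implementation is not shown here.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Span:
    start: int  # inclusive offset
    end: int    # exclusive offset


def split_by_spans(text: str, spans: list[Span]) -> Iterator[tuple[bool, str]]:
    """Yield (is_incise, piece) pairs covering `text` from left to right.

    Assumes the spans do not overlap; they are sorted by start offset here.
    """
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.start > cursor:
            yield False, text[cursor:span.start]   # character's words
        yield True, text[span.start:span.end]      # narrator aside
        cursor = span.end
    if cursor < len(text):
        yield False, text[cursor:]                 # trailing dialogue


text = "Je viens, dit-il, demain."
start = text.index("dit-il")
pieces = list(split_by_spans(text, [Span(start, start + len("dit-il"))]))
for is_incise, piece in pieces:
    print("narrator " if is_incise else "character", repr(piece))
```

Keeping offsets rather than pre-split segments means the persisted analysis stays readable and editable as one line of dialogue.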
--- a/backend/inkflow/pipeline/__init__.py
+++ b/backend/inkflow/pipeline/__init__.py
--- a/backend/inkflow/pipeline/orchestrator.py
+++ b/backend/inkflow/pipeline/orchestrator.py
@@ -0,0 +1,364 @@
 """Orchestrator: runs pipeline stages in the background, tracks state, and
 broadcasts the full state to the UI on every change.
 - A single worker thread runs jobs serially (one Mac = one MLX workload at a
  time). Jobs are queued and control returns to the API immediately.
 - State (ProjectState) is persisted in data/<slug>/state.json, so it can be resumed.
 - Broadcasting goes through a `broadcaster` injected by the API layer (to stay
  independent of FastAPI). It receives (slug, state_dict).
 """
 from __future__ import annotations
 import queue
 import threading
 import traceback
 from pathlib import Path
 from typing import Callable, Optional
 from ..config import book_data_dir, book_output_dir
 from ..epub.parser import load_book, load_chapter_text
 from ..models import ChapterRenderState, ProjectState, StageStatus
 from ..store import artifacts
 Broadcaster = Callable[[str, dict], None]
 def state_path(slug: str) -> Path:
    return book_data_dir(slug) / "state.json"
 def load_state(slug: str) -> ProjectState:
    path = state_path(slug)
    if path.exists():
        state = ProjectState.model_validate_json(path.read_text(encoding="utf-8"))
    else:
        book = load_book(slug)
        state = ProjectState(slug=slug, title=book.title,
                             stages={"parse": StageStatus.DONE})
    return _reconcile(slug, state)
 def _reconcile(slug: str, state: ProjectState) -> ProjectState:
    """Align the state with the artifacts present on disk (robust resume).
    Lets the UI reflect work already done, even via the CLI or after a
    restart, without re-running the stages.
    """
    book = load_book(slug)
    state.stages.setdefault("parse", StageStatus.DONE)
    # Analysis: chapters that have an analysis artifact.
    analyzed = [c.index for c in book.render_chapters
                if artifacts.analysis_path(slug, c.index).exists()]
    if analyzed:
        for idx in analyzed:
            if idx not in state.analyzed_chapters:
                state.analyzed_chapters.append(idx)
        if state.stage("analyze") == StageStatus.PENDING:
            state.stages["analyze"] = (
                StageStatus.DONE if len(analyzed) == len(book.render_chapters)
                else StageStatus.RUNNING)
    # Casting: at least one voice assigned.
    cast = artifacts.load_cast(slug)
    if cast.narrator_voice_id or any(c.voice_id for c in cast.characters):
        state.stages.setdefault("cast", StageStatus.DONE)
    # Pronunciation: at least one entry.
    if artifacts.load_pronunciation(slug).entries:
        state.stages.setdefault("pronounce", StageStatus.DONE)
    # Rendering: MP3s present in the output folder.
    out_dir = book_output_dir(book.title)
    for ch in book.render_chapters:
        existing = state.render.get(ch.index)
        if existing and existing.mp3:
            continue
        if ch.output_name and (out_dir / ch.output_name).exists():
            state.render[ch.index] = ChapterRenderState(
                index=ch.index, status=StageStatus.DONE, progress=1.0,
                mp3=ch.output_name)
    return state
 class Orchestrator:
    def __init__(self) -> None:
        self._q: "queue.Queue[tuple[str, Callable[[], None]]]" = queue.Queue()
        self._worker: Optional[threading.Thread] = None
        self._broadcaster: Optional[Broadcaster] = None
        self._lock = threading.Lock()
        self.busy_slug: Optional[str] = None
    # --- infra ---------------------------------------------------------------
    def set_broadcaster(self, fn: Broadcaster) -> None:
        self._broadcaster = fn
    def _ensure_worker(self) -> None:
        # Hold the lock so concurrent enqueue() calls cannot start two workers.
        with self._lock:
            if self._worker is None or not self._worker.is_alive():
                self._worker = threading.Thread(target=self._loop, daemon=True)
                self._worker.start()
    def _loop(self) -> None:
        while True:
            slug, job = self._q.get()
            self.busy_slug = slug
            try:
                job()
            except Exception:  # noqa: BLE001
                traceback.print_exc()
            finally:
                self.busy_slug = None
                self._q.task_done()
    def _save_and_emit(self, state: ProjectState) -> None:
        path = state_path(state.slug)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(state.model_dump_json(indent=2), encoding="utf-8")
        if self._broadcaster:
            self._broadcaster(state.slug, state.model_dump(mode="json"))
    def enqueue(self, slug: str, job: Callable[[], None]) -> None:
        self._ensure_worker()
        self._q.put((slug, job))
    # --- etapes --------------------------------------------------------------
    def run_analyze(self, slug: str, chapter_indexes: Optional[list[int]] = None) -> None:
        def job() -> None:
            from ..analysis.gemma import Gemma
            from ..analysis.segmenter import analyze_chapter
            from ..models import Cast
            from ..settings import get_settings
            state = load_state(slug)
            book = load_book(slug)
            targets = [c for c in book.render_chapters
                       if chapter_indexes is None or c.index in chapter_indexes]
            state.stages["analyze"] = StageStatus.RUNNING
            state.active_stage = "analyze"
            self._save_and_emit(state)
            gemma = Gemma()
            dedup_gemma = gemma if get_settings().dedup_use_gemma else None
            cast = artifacts.load_cast(slug)
            chars = list(cast.characters)
            total = len(targets)
            for i, ch in enumerate(targets):
                state.active_detail = f"Analyse {ch.title}"
                state.active_progress = i / max(total, 1)
                self._save_and_emit(state)
                ct = load_chapter_text(slug, ch)
                try:
                    # Dedup happens inside analyze_chapter: `chars` receives the
                    # reconciled cumulative cast.
                    analysis, chars = analyze_chapter(
                        ch, ct, gemma, book_chars=chars, dedup_gemma=dedup_gemma)
                except Exception:  # noqa: BLE001 - chapter skipped, keep going
                    traceback.print_exc()
                    continue
                artifacts.save_analysis(slug, analysis)
                if ch.index not in state.analyzed_chapters:
                    state.analyzed_chapters.append(ch.index)
                self._save_and_emit(state)
            artifacts.save_cast(slug, Cast(
                narrator_voice_id=cast.narrator_voice_id, characters=chars))
            state.stages["analyze"] = StageStatus.DONE
            self._finish(state)
        self.enqueue(slug, job)
    def run_cast(self, slug: str) -> None:
        def job() -> None:
            from ..casting.assign import assign_voices
            from ..casting.voicebank import build_voicebank, load_voicebank
            state = load_state(slug)
            state.stages["cast"] = StageStatus.RUNNING
            state.active_stage = "cast"
            state.active_detail = "Preparation de la voicebank"
            self._save_and_emit(state)
            vb = load_voicebank()
            if not vb.entries or not any(e.ref_audio for e in vb.entries):
                vb = build_voicebank()
            cast = artifacts.load_cast(slug)
            cast = assign_voices(cast.characters, vb,
                                 narrator_voice_id=cast.narrator_voice_id)
            artifacts.save_cast(slug, cast)
            state.stages["cast"] = StageStatus.DONE
            self._finish(state)
        self.enqueue(slug, job)
    def run_cast_analyze(self, slug: str, chapter_indexes: Optional[list[int]] = None) -> None:
        """(Re)extract the characters of one or more chapters and reconcile them.
        Lighter than `run_analyze`: does not re-segment (existing analysis
        artifacts are left untouched). Supports chapter-level casting while
        keeping the book consistent (deduplication).
        """
        def job() -> None:
            from ..analysis.gemma import Gemma
            from ..analysis.segmenter import extract_characters
            from ..casting.dedup import reconcile_characters
            from ..models import Cast
            from ..settings import get_settings
            state = load_state(slug)
            book = load_book(slug)
            targets = [c for c in book.render_chapters
                       if chapter_indexes is None or c.index in chapter_indexes]
            state.active_stage = "cast"
            self._save_and_emit(state)
            gemma = Gemma()
            dedup_gemma = gemma if get_settings().dedup_use_gemma else None
            cast = artifacts.load_cast(slug)
            chars = list(cast.characters)
            total = len(targets)
            for i, ch in enumerate(targets):
                state.active_detail = f"Casting — {ch.title}"
                state.active_progress = i / max(total, 1)
                self._save_and_emit(state)
                ct = load_chapter_text(slug, ch)
                try:
                    found = extract_characters("\n".join(ct.paragraphs), gemma)
                    speakers: list[str] = []
                    if artifacts.analysis_path(slug, ch.index).exists():
                        analysis = artifacts.load_analysis(slug, ch.index)
                        speakers = [s.speaker for s in analysis.segments]
                    chars, _ = reconcile_characters(
                        chars, found, dedup_gemma, speaker_names=speakers)
                except Exception:  # noqa: BLE001 - chapter skipped, keep going
                    traceback.print_exc()
                    continue
                artifacts.save_cast(slug, Cast(
                    narrator_voice_id=cast.narrator_voice_id, characters=chars))
                self._save_and_emit(state)
            self._finish(state)
        self.enqueue(slug, job)
    def run_dedup_cast(self, slug: str) -> None:
        """Fold duplicates in an existing cast (Holden/James Holden...)."""
        def job() -> None:
            from ..analysis.gemma import Gemma
            from ..casting.dedup import dedup_cast
            from ..models import Cast
            from ..settings import get_settings
            state = load_state(slug)
            state.active_stage = "cast"
            state.active_detail = "Deduplication du casting"
            self._save_and_emit(state)
            cast = artifacts.load_cast(slug)
            gemma = Gemma() if get_settings().dedup_use_gemma else None
            chars = dedup_cast(cast.characters, gemma)
            artifacts.save_cast(slug, Cast(
                narrator_voice_id=cast.narrator_voice_id, characters=chars))
            self._finish(state)
        self.enqueue(slug, job)
    def run_pronounce(self, slug: str) -> None:
        def job() -> None:
            from ..analysis.gemma import Gemma
            from ..analysis.pronunciation import (
                merge_pronunciations,
                propose_pronunciations,
            )
            state = load_state(slug)
            book = load_book(slug)
            state.stages["pronounce"] = StageStatus.RUNNING
            state.active_stage = "pronounce"
            self._save_and_emit(state)
            gemma = Gemma()
            pron = artifacts.load_pronunciation(slug)
            targets = book.render_chapters[:3]  # sample: first three rendered chapters
            for i, ch in enumerate(targets):
                state.active_detail = f"Mots a risque — {ch.title}"
                state.active_progress = i / max(len(targets), 1)
                self._save_and_emit(state)
                ct = load_chapter_text(slug, ch)
                pron = merge_pronunciations(
                    pron, propose_pronunciations("\n".join(ct.paragraphs), gemma))
            artifacts.save_pronunciation(slug, pron)
            state.stages["pronounce"] = StageStatus.DONE
            self._finish(state)
        self.enqueue(slug, job)
    def run_render(self, slug: str, chapter_indexes: list[int],
                   backend: Optional[str] = None, mono: bool = False) -> None:
        from ..settings import get_settings
        backend = backend or get_settings().default_backend
        def job() -> None:
            from ..casting.voicebank import load_voicebank, voice_spec_for
            from ..pipeline.render import (
                build_units_mono,
                build_units_multi,
                make_voice_resolver,
                render_chapter_to_mp3,
            )
            from ..tts.factory import get_backend
            state = load_state(slug)
            book = load_book(slug)
            state.stages["render"] = StageStatus.RUNNING
            state.active_stage = "render"
            self._save_and_emit(state)
            tts = get_backend(backend)
            pron = artifacts.load_pronunciation(slug)
            cast = artifacts.load_cast(slug)
            vb = load_voicebank()
            render_list = [c for c in book.render_chapters if c.index in chapter_indexes]
            for ch in render_list:
                rs = state.render.get(ch.index) or ChapterRenderState(index=ch.index)
                rs.status = StageStatus.RUNNING
                rs.progress = 0.0
                rs.backend = backend
                state.render[ch.index] = rs
                state.active_detail = f"Synthese — {ch.title}"
                self._save_and_emit(state)
                try:
                    ct = load_chapter_text(slug, ch)
                    if mono or ch.index not in state.analyzed_chapters:
                        units = build_units_mono(ct, tts.default_voice())
                    else:
                        analysis = artifacts.load_analysis(slug, ch.index)
                        narr = vb.by_id(cast.narrator_voice_id) if cast.narrator_voice_id else None
                        default_voice = (voice_spec_for(narr, backend)
                                         if narr else tts.default_voice())
                        resolver = make_voice_resolver(cast, vb, backend)
                        units = build_units_multi(analysis, resolver, default_voice)
                    def _p(done: int, total: int, _rs=rs, _state=state) -> None:
                        _rs.progress = done / max(total, 1)
                        _state.active_progress = _rs.progress
                        self._save_and_emit(_state)
                    track = book.render_chapters.index(ch) + 1
                    mp3 = render_chapter_to_mp3(book, ch, units, tts, pron=pron,
                                                track=track, progress=_p)
                    rs.status = StageStatus.DONE
                    rs.progress = 1.0
                    rs.mp3 = mp3.name
                except Exception as exc:  # noqa: BLE001
                    rs.status = StageStatus.ERROR
                    rs.error = str(exc)
                self._save_and_emit(state)
            state.stages["render"] = StageStatus.DONE
            self._finish(state)
        self.enqueue(slug, job)
    def _finish(self, state: ProjectState) -> None:
        state.active_stage = None
        state.active_detail = None
        state.active_progress = 0.0
        self._save_and_emit(state)
 # Singleton shared by the API.
 orchestrator = Orchestrator()
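The orchestrator's core pattern is a single daemon thread draining a queue, so that only one heavy job runs at a time and a failing job does not kill the worker. The following is a stripped-down sketch of that pattern using only the standard library. `SerialWorker` and its `errors` list are illustrative assumptions; the real class also persists and broadcasts state, which is left out here.

```python
import queue
import threading
from typing import Callable, Optional


class SerialWorker:
    """Runs queued jobs one at a time on a lazily started daemon thread."""

    def __init__(self) -> None:
        self._q: queue.Queue = queue.Queue()
        self._thread: Optional[threading.Thread] = None
        self._lock = threading.Lock()
        self.errors: list[Exception] = []  # collected instead of printed

    def _loop(self) -> None:
        while True:
            job = self._q.get()
            try:
                job()
            except Exception as exc:  # a failing job must not stop the loop
                self.errors.append(exc)
            finally:
                self._q.task_done()

    def enqueue(self, job: Callable[[], None]) -> None:
        with self._lock:  # avoid starting two threads from concurrent callers
            if self._thread is None or not self._thread.is_alive():
                self._thread = threading.Thread(target=self._loop, daemon=True)
                self._thread.start()
        self._q.put(job)

    def wait(self) -> None:
        self._q.join()  # blocks until every queued job called task_done()


def failing_job() -> None:
    raise RuntimeError("boom")


results: list[str] = []
worker = SerialWorker()
worker.enqueue(lambda: results.append("analyze"))
worker.enqueue(failing_job)
worker.enqueue(lambda: results.append("render"))
worker.wait()
print(results, len(worker.errors))
```

Because the queue is FIFO and there is one consumer, jobs complete in submission order, which is what lets each stage rely on the artifacts written by the previous one.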
--- a/backend/inkflow/pipeline/render.py
+++ b/backend/inkflow/pipeline/render.py
@@ -0,0 +1,158 @@
 """Audio rendering of a chapter: (segments + voices) -> WAV -> MP3.
 A `RenderUnit` = a piece of text + the voice to use. We build the list of
 units (single narrator or multi-voice depending on the casting), synthesize
 each one, concatenate with silences, normalize, then encode to MP3.
 """
 from __future__ import annotations
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Callable, Optional
 from ..analysis.pronunciation import apply_pronunciation
 from ..audio.postprocess import concat_segments, encode_mp3, normalize_loudness, write_wav
 from ..config import book_data_dir, book_output_dir
 from ..models import (
    Book,
    Chapter,
    ChapterAnalysis,
    ChapterText,
    Pronunciation,
    SegmentType,
 )
 from ..tts.base import TTSBackend, VoiceSpec
 # Resolves a speaker name to a concrete voice, or None when no voice is
 # assigned (callers then fall back to a default voice).
 VoiceResolver = Callable[[str], Optional[VoiceSpec]]
 @dataclass
 class RenderUnit:
    text: str
    voice: VoiceSpec
    speaker: str = "narrateur"
    glued_to_prev: bool = False   # incise -> shorter gap with the previous unit
 def build_units_mono(ct: ChapterText, narrator: VoiceSpec) -> list[RenderUnit]:
    """Single narrator: every paragraph is read by the narrator's voice."""
    return [RenderUnit(text=p, voice=narrator) for p in ct.paragraphs if p.strip()]
 def make_voice_resolver(cast, voicebank, engine: str) -> VoiceResolver:
    """Build a speaker -> VoiceSpec resolver from the casting + the voicebank.
    Falls back to the narrator's voice if the speaker has no assigned voice.
    """
    from ..casting.assign import resolve_speaker_voice
    from ..casting.voicebank import voice_spec_for
    def resolve(speaker: str) -> Optional[VoiceSpec]:
        vid = resolve_speaker_voice(speaker, cast, voicebank)
        if vid is None:
            vid = cast.narrator_voice_id
        entry = voicebank.by_id(vid) if vid else None
        if entry is None:
            return None  # the caller substitutes its default voice
        return voice_spec_for(entry, engine)
    return resolve
 def build_units_multi(
    analysis: ChapterAnalysis,
    resolve: VoiceResolver,
    default_voice: VoiceSpec,
 ) -> list[RenderUnit]:
    """Multi-voice: narration -> narrator, dialogue -> the character's voice.
    Incises annotated on a dialogue line (offsets into the text) are split off
    here, at the last moment: the incise substring is carried by the narrator's
    voice (`glued_to_prev` shortens the silence), the rest by the character's
    voice. Dialogue lines without an incise are rendered whole.
    """
    from ..analysis.segmenter import iter_incise_pieces
    narrator = resolve("narrateur") or default_voice
    units: list[RenderUnit] = []
    for seg in analysis.segments:
        if not seg.text.strip():
            continue
        if seg.type is SegmentType.NARRATION:
            units.append(RenderUnit(text=seg.text, voice=narrator,
                                    speaker="narrateur",
                                    glued_to_prev=seg.glued_to_prev))
            continue
        char_voice = resolve(seg.speaker) or default_voice
        if not seg.incises:
            units.append(RenderUnit(text=seg.text, voice=char_voice,
                                    speaker=seg.speaker,
                                    glued_to_prev=seg.glued_to_prev))
            continue
        for k, (is_incise, piece) in enumerate(
                iter_incise_pieces(seg.text, seg.incises)):
            glued = seg.glued_to_prev if k == 0 else True
            if is_incise:
                units.append(RenderUnit(text=piece, voice=narrator,
                                        speaker="narrateur", glued_to_prev=glued))
            else:
                units.append(RenderUnit(text=piece, voice=char_voice,
                                        speaker=seg.speaker, glued_to_prev=glued))
    return units
 def render_units(
    units: list[RenderUnit],
    backend: TTSBackend,
    *,
    pron: Optional[Pronunciation] = None,
    progress: Optional[Callable[[int, int], None]] = None,
 ) -> tuple[list, int]:
    """Synthesize every unit and return (list of (audio, sr), n_units)."""
    parts = []
    total = len(units)
    for i, unit in enumerate(units):
        text = apply_pronunciation(unit.text, pron) if pron else unit.text
        audio, sr = backend.synthesize(text, unit.voice)
        parts.append((audio, sr))
        if progress:
            progress(i + 1, total)
    return parts, total
 def render_chapter_to_mp3(
    book: Book,
    chapter: Chapter,
    units: list[RenderUnit],
    backend: TTSBackend,
    *,
    pron: Optional[Pronunciation] = None,
    track: Optional[int] = None,
    progress: Optional[Callable[[int, int], None]] = None,
 ) -> Path:
    """Full pipeline for one chapter -> output/<book>/NN-...mp3."""
    parts, _ = render_units(units, backend, pron=pron, progress=progress)
    # parts is aligned 1:1 with units -> pass the incise markers along.
    audio, sr = concat_segments(parts, glued=[u.glued_to_prev for u in units])
    audio = normalize_loudness(audio)
    # Intermediate WAV in data/, final MP3 in output/.
    wav_path = book_data_dir(book.slug) / "audio" / f"ch{chapter.index:02d}.wav"
    write_wav(wav_path, audio, sr)
    out_dir = book_output_dir(book.title)
    mp3_path = out_dir / (chapter.output_name or f"ch{chapter.index:02d}.mp3")
    cover = None
    if book.cover_file:
        candidate = book_data_dir(book.slug) / book.cover_file
        cover = candidate if candidate.exists() else None
    encode_mp3(
        wav_path, mp3_path,
        title=chapter.title, album=book.title, artist=book.author,
        track=track, cover_path=cover,
    )
    return mp3_path
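Voice selection in `render.py` follows a three-step fallback: the speaker's assigned voice, then the narrator's voice, then the backend default. The sketch below reproduces that chain with plain strings in place of `VoiceSpec` objects. `make_resolver`, `pick_voice` and the voice ids are hypothetical names chosen for illustration, not the project's API.

```python
from typing import Callable, Optional

Resolver = Callable[[str], Optional[str]]


def make_resolver(assigned: dict[str, str], narrator_voice: Optional[str]) -> Resolver:
    """Return a speaker -> voice id function that falls back to the narrator."""
    def resolve(speaker: str) -> Optional[str]:
        # May still return None when no narrator voice is configured.
        return assigned.get(speaker) or narrator_voice
    return resolve


def pick_voice(speaker: str, resolve: Resolver, default_voice: str) -> str:
    """Last fallback: use the backend default when the resolver yields nothing."""
    return resolve(speaker) or default_voice


with_narrator = make_resolver({"Holden": "voice_a"}, narrator_voice="voice_n")
without_narrator = make_resolver({}, narrator_voice=None)

voices = [
    pick_voice("Holden", with_narrator, "voice_default"),    # assigned voice
    pick_voice("Naomi", with_narrator, "voice_default"),     # narrator fallback
    pick_voice("Naomi", without_narrator, "voice_default"),  # backend default
]
print(voices)
```

Letting the resolver return `None` instead of raising keeps rendering possible for books whose casting is incomplete.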
--- a/backend/inkflow/settings.py
+++ b/backend/inkflow/settings.py
@@ -0,0 +1,170 @@
 """Technical settings editable at runtime (global to the app).
 Unlike `config.py` (frozen constants read at import time, overridable only
 through environment variables at startup), this module exposes a `Settings`
 object that is *persisted* in `data/settings.json` and editable from the UI.
 Default values mirror those in `config.py`. Pipeline code calls
 `get_settings()` at execution time; saving invalidates the model caches
 (TTS backends, Gemma loading) so that new identifiers/parameters take effect
 without a restart.
 """
 from __future__ import annotations
 import threading
 from typing import Optional
 from pydantic import BaseModel, Field
 from . import config
 # --- Default system prompts (canonical source) -------------------------------
 # These strings drive the Gemma tasks. The user can edit them.
 DEFAULT_PROMPT_SPEAKERS = (
    "Tu es un assistant d'analyse litteraire. Tu identifies QUI prononce chaque "
    "replique de dialogue dans un extrait de roman en francais. Une liste des "
    "personnages du chapitre t'est fournie : choisis le locuteur dans cette "
    "liste en recopiant son nom EXACTEMENT. Appuie-toi sur la narration qui "
    "PRECEDE et qui SUIT chaque replique (incise d'attribution type 'dit "
    "Marie'), sur les vocatifs (le personnage a qui l'on s'adresse) et sur "
    "l'alternance des tours de parole. Mets 'inconnu' si tu n'es pas sur. Tu "
    "reponds UNIQUEMENT en JSON valide, sans texte autour."
 )
 DEFAULT_PROMPT_SPEAKERS_REFINE = (
    "Tu es un assistant d'analyse litteraire. On te donne des repliques dont le "
    "locuteur est reste indetermine, avec le locuteur DEJA identifie des "
    "repliques voisines. Deduis qui parle en exploitant l'alternance des tours "
    "de parole et le contexte narratif autour. Choisis le nom dans la liste des "
    "personnages fournie, en le recopiant exactement, ou 'inconnu' si vraiment "
    "indeterminable. Tu reponds UNIQUEMENT en JSON valide, sans texte autour."
 )
 DEFAULT_PROMPT_CHARACTERS = (
    "Tu es un assistant d'analyse litteraire. Tu extrais la liste des "
    "personnages d'un extrait de roman et leurs attributs vocaux. Tu reponds "
    "UNIQUEMENT en JSON valide."
 )
 DEFAULT_PROMPT_PRONUNCIATION = (
    "Tu es un assistant de preparation de livre audio en francais. Tu reperes "
    "les mots dont la prononciation par un synthetiseur vocal francais risque "
    "d'etre incorrecte (noms propres etrangers, termes de science-fiction, "
    "acronymes). Tu reponds UNIQUEMENT en JSON valide."
 )
 DEFAULT_PROMPT_INCISES = (
    "Tu es un assistant d'analyse litteraire. Tu reperes les INCISES de "
    "narration inserees dans une replique de dialogue (ex: 'dit Mamie', "
    "'repondit le capitaine'). Tu reponds UNIQUEMENT en JSON valide, sans "
    "texte autour."
 )
 DEFAULT_PROMPT_DEDUP = (
    "Tu es un assistant d'analyse litteraire. Tu rapproches les differentes "
    "facons de nommer un meme personnage (nom complet, prenom, surnom, "
    "diminutif) pour eviter les doublons dans le casting d'un livre audio. Tu "
    "ne fusionnes deux noms que si c'est, avec certitude, la meme personne. Tu "
    "reponds UNIQUEMENT en JSON valide, sans texte autour."
 )
 class Settings(BaseModel):
    """Global technical settings, persisted in data/settings.json."""
    # --- MLX models (HuggingFace identifiers) ---
    gemma_model: str = config.GEMMA_MODEL
    qwen3_model: str = config.QWEN3_TTS_MODEL
    kokoro_model: str = config.KOKORO_MODEL
    # --- Gemma generation ---
    gemma_temperature: float = Field(0.1, ge=0.0, le=2.0)
    gemma_max_tokens: int = Field(2048, ge=64, le=8192)
    # --- System prompts (analysis) ---
    prompt_speakers: str = DEFAULT_PROMPT_SPEAKERS
    prompt_speakers_refine: str = DEFAULT_PROMPT_SPEAKERS_REFINE
    prompt_characters: str = DEFAULT_PROMPT_CHARACTERS
    prompt_pronunciation: str = DEFAULT_PROMPT_PRONUNCIATION
    prompt_incises: str = DEFAULT_PROMPT_INCISES  # DEPRECATED (detection is now deterministic)
    prompt_dedup: str = DEFAULT_PROMPT_DEDUP
    # --- Incises (dialogue tags) ---
    # DEPRECATED: incise detection is now deterministic and cast-aware
    # (analysis.segmenter.detect_incises), with no Gemma fallback. The field is
    # kept so that existing settings.json files still load without error.
    split_incises_use_gemma: bool = True
    # --- Retroactive attribution (second pass over undetermined lines) ---
    # After the first pass, a targeted second pass re-resolves the lines still
    # marked 'inconnu' (or low-confidence), relying on already-identified
    # neighbours. It runs only if doubts remain, so it costs nothing otherwise.
    retro_pass_use_gemma: bool = True
    # --- Cast deduplication ---
    # Heuristic (safe, deterministic) by default. The Gemma pass additionally
    # links non-obvious variants (diminutives, titles) but, with a small local
    # model, produces wrong merges, hence opt-in.
    dedup_use_gemma: bool = False
    # --- TTS ---
    default_backend: str = "kokoro"
    language: str = config.DEFAULT_LANGUAGE
    kokoro_lang_code: str = config.KOKORO_LANG_CODE
    kokoro_default_voice: str = config.KOKORO_DEFAULT_VOICE
    qwen3_default_voice: str = config.QWEN3_DEFAULT_VOICE
    # --- Audio (final encoding) ---
    target_sample_rate: int = Field(config.TARGET_SAMPLE_RATE, ge=8000, le=48000)
    mp3_bitrate: str = config.MP3_BITRATE
    target_dbfs: float = Field(config.TARGET_DBFS, ge=-40.0, le=0.0)
 _LOCK = threading.Lock()
 _cache: Optional[Settings] = None
 def settings_path():
    return config.DATA_DIR / "settings.json"
 def get_settings() -> Settings:
    """Return the current settings (loaded from disk only once)."""
    global _cache
    with _LOCK:
        if _cache is None:
            path = settings_path()
            if path.exists():
                try:
                    _cache = Settings.model_validate_json(
                        path.read_text(encoding="utf-8"))
                except Exception:  # noqa: BLE001 - corrupted file -> defaults
                    _cache = Settings()
            else:
                _cache = Settings()
        return _cache
 def save_settings(settings: Settings) -> Settings:
    """Persist the settings and invalidate the model caches."""
    global _cache
    with _LOCK:
        _cache = settings
        path = settings_path()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(settings.model_dump_json(indent=2), encoding="utf-8")
    _invalidate_model_caches()
    return settings
 def _invalidate_model_caches() -> None:
    """Force models to reload after an identifier/parameter change.
    `get_backend` is cached by backend *name*, not by model id; without a
    purge, a changed id would be ignored. The same holds for Gemma loading.
    """
    try:
        from .tts.factory import get_backend
        get_backend.cache_clear()
    except Exception:  # noqa: BLE001
        pass
    try:
        from .analysis.gemma import _load
        _load.cache_clear()
    except Exception:  # noqa: BLE001
        pass
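The cache purge described in `_invalidate_model_caches` can be illustrated with a minimal, self-contained sketch. `FakeBackend` and the `current` dict are hypothetical stand-ins for a real backend and the settings object; only `functools.lru_cache` and its `cache_clear()` method are real APIs here.

```python
from functools import lru_cache

# Hypothetical stand-in for the persisted settings (not InkFlow's Settings class).
current = {"kokoro_model": "model-a"}


class FakeBackend:
    """Hypothetical backend that reads its model id once, at construction."""

    def __init__(self) -> None:
        self.model_id = current["kokoro_model"]


@lru_cache(maxsize=4)
def get_backend(name: str = "kokoro") -> FakeBackend:
    # Cached by backend name only: the model id is not part of the cache key.
    return FakeBackend()


first = get_backend("kokoro")
current["kokoro_model"] = "model-b"   # simulate saving new settings
stale = get_backend("kokoro")         # cache hit: still the old instance
get_backend.cache_clear()             # what the invalidation step does
fresh = get_backend("kokoro")         # rebuilt with the new model id
```

Under these assumptions, `stale` is the very same object as `first`, and only the instance created after `cache_clear()` sees the new identifier.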
--- a/backend/inkflow/store/__init__.py
+++ b/backend/inkflow/store/__init__.py
--- a/backend/inkflow/store/artifacts.py
+++ b/backend/inkflow/store/artifacts.py
@@ -0,0 +1,63 @@
 """Read/write pipeline artifacts under data/<slug>/.
 Each stage writes a JSON file that later stages read back. This is also what
 makes the pipeline resumable: a stage can detect that its artifact already exists.
 """
 from __future__ import annotations
 from pathlib import Path
 from ..config import book_data_dir
 from ..models import Cast, ChapterAnalysis, Pronunciation
 def analysis_path(slug: str, chapter_index: int) -> Path:
    return book_data_dir(slug) / "analysis" / f"ch{chapter_index:02d}.json"
 def cast_path(slug: str) -> Path:
    return book_data_dir(slug) / "cast.json"
 def pronunciation_path(slug: str) -> Path:
    return book_data_dir(slug) / "pronunciation.json"
 def save_analysis(slug: str, analysis: ChapterAnalysis) -> Path:
    path = analysis_path(slug, analysis.index)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(analysis.model_dump_json(indent=2), encoding="utf-8")
    return path
 def load_analysis(slug: str, chapter_index: int) -> ChapterAnalysis:
    path = analysis_path(slug, chapter_index)
    return ChapterAnalysis.model_validate_json(path.read_text(encoding="utf-8"))
 def save_cast(slug: str, cast: Cast) -> Path:
    path = cast_path(slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(cast.model_dump_json(indent=2), encoding="utf-8")
    return path
 def load_cast(slug: str) -> Cast:
    path = cast_path(slug)
    if not path.exists():
        return Cast()
    return Cast.model_validate_json(path.read_text(encoding="utf-8"))
 def save_pronunciation(slug: str, pron: Pronunciation) -> Path:
    path = pronunciation_path(slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(pron.model_dump_json(indent=2), encoding="utf-8")
    return path
 def load_pronunciation(slug: str) -> Pronunciation:
    path = pronunciation_path(slug)
    if not path.exists():
        return Pronunciation()
    return Pronunciation.model_validate_json(path.read_text(encoding="utf-8"))
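The "resumable" property mentioned in the module docstring can be sketched as follows. This is an illustration under assumptions, not the project's pipeline code: `analyze_if_needed`, the `root` argument and the plain-`dict` artifact are hypothetical, and only the path layout is modeled on `analysis_path`.

```python
import json
import tempfile
from pathlib import Path


def analysis_path(root: Path, slug: str, chapter_index: int) -> Path:
    # Same layout as above, with an explicit root instead of book_data_dir().
    return root / slug / "analysis" / f"ch{chapter_index:02d}.json"


def analyze_if_needed(root: Path, slug: str, chapter_index: int, analyze):
    """Return (artifact, ran): reuse the JSON on disk if it already exists."""
    path = analysis_path(root, slug, chapter_index)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8")), False
    result = analyze(chapter_index)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return result, True


calls = []


def fake_analyze(index: int) -> dict:
    # Hypothetical expensive step; records each invocation.
    calls.append(index)
    return {"index": index, "segments": []}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first, ran_first = analyze_if_needed(root, "mon-livre", 3, fake_analyze)
    second, ran_second = analyze_if_needed(root, "mon-livre", 3, fake_analyze)
```

The second call finds `ch03.json` on disk and skips the expensive step, which is the behaviour a restarted run would rely on.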
--- a/backend/inkflow/tts/__init__.py
+++ b/backend/inkflow/tts/__init__.py
--- a/backend/inkflow/tts/base.py
+++ b/backend/inkflow/tts/base.py
@@ -0,0 +1,48 @@
 """Abstraction over TTS engines (pluggable backend).
 Two implementations: Kokoro (fast, preset voices -> previews) and Qwen3-TTS
 (quality + cloning from reference audio -> final render). Both return
 mono float32 audio plus a sample rate.
 """
 from __future__ import annotations
 from abc import ABC, abstractmethod
 from dataclasses import dataclass
 from typing import Optional
 import numpy as np
@dataclass
 class VoiceSpec:
    """Describes the voice to use for a synthesis call.
    - `preset`: name of a preset voice (Kokoro: "ff_siwis"; Qwen3: "Chelsie").
    - `ref_audio` / `ref_text`: reference clip for cloning (Qwen3).
    """
    preset: Optional[str] = None
    ref_audio: Optional[str] = None
    ref_text: Optional[str] = None
    speed: float = 1.0
 class TTSBackend(ABC):
    """Common interface for all TTS engines."""
    name: str = "base"
    @abstractmethod
    def synthesize(self, text: str, voice: VoiceSpec) -> tuple[np.ndarray, int]:
        """Synthesize `text` and return (mono float32 audio, sample_rate)."""
    def default_voice(self) -> VoiceSpec:
        return VoiceSpec()
 def to_mono_float32(audio) -> np.ndarray:
    """Normalize a model output (mx.array / np / list) to mono float32."""
    arr = np.asarray(audio, dtype=np.float32)
    if arr.ndim > 1:
        # (channels, n) or (n, channels): average over the channel axis,
        # assumed to be the shorter of the first and last axes.
        arr = arr.mean(axis=0) if arr.shape[0] < arr.shape[-1] else arr.mean(axis=-1)
    return np.ascontiguousarray(arr.reshape(-1))
--- a/backend/inkflow/tts/chunk.py
+++ b/backend/inkflow/tts/chunk.py
@@ -0,0 +1,62 @@
 """Split text into synthesis-friendly chunks.
 TTS models (Kokoro in particular) truncate overly long texts. The text is
 therefore split on sentence boundaries, subject to a maximum chunk length.
 """
 from __future__ import annotations
 import re
 # Sentence end: strong punctuation followed by whitespace, or a run of newlines.
 _SENTENCE_END_RE = re.compile(r"(?<=[.!?…])\s+|\n+")
 # For very long sentences, also break after commas, semicolons and colons.
 _SOFT_BREAK_RE = re.compile(r"(?<=[,;:])\s+")
 DEFAULT_MAX_CHARS = 350
 def split_sentences(text: str) -> list[str]:
    parts = [p.strip() for p in _SENTENCE_END_RE.split(text)]
    return [p for p in parts if p]
 def _split_long(sentence: str, max_chars: int) -> list[str]:
    """Split an overlong sentence at soft breaks (, ; :), then by a hard window."""
    if len(sentence) <= max_chars:
        return [sentence]
    out: list[str] = []
    buf = ""
    for piece in _SOFT_BREAK_RE.split(sentence):
        cand = f"{buf} {piece}".strip()
        if len(cand) <= max_chars:
            buf = cand
        else:
            if buf:
                out.append(buf)
            if len(piece) <= max_chars:
                buf = piece
            else:  # word/segment longer than the window: hard cut
                for i in range(0, len(piece), max_chars):
                    out.append(piece[i:i + max_chars])
                buf = ""
    if buf:
        out.append(buf)
    return out
 def chunk_text(text: str, max_chars: int = DEFAULT_MAX_CHARS) -> list[str]:
    """Pack sentences into chunks <= max_chars; only overlong sentences are split."""
    chunks: list[str] = []
    buf = ""
    for sentence in split_sentences(text):
        for part in _split_long(sentence, max_chars):
            cand = f"{buf} {part}".strip()
            if len(cand) <= max_chars:
                buf = cand
            else:
                if buf:
                    chunks.append(buf)
                buf = part
    if buf:
        chunks.append(buf)
    return chunks
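As a condensed illustration of the two steps in this module (sentence splitting, then greedy packing), here is a self-contained sketch. It reuses the sentence-end pattern shown above but omits the soft-break fallback for overlong sentences; the helper name `pack` and the sample text are assumptions made for the example.

```python
import re

# Same pattern as _SENTENCE_END_RE above.
SENTENCE_END = re.compile(r"(?<=[.!?…])\s+|\n+")


def pack(sentences: list[str], max_chars: int) -> list[str]:
    """Greedily append sentences to the current chunk while it still fits."""
    chunks: list[str] = []
    buf = ""
    for sentence in sentences:
        cand = f"{buf} {sentence}".strip()
        if len(cand) <= max_chars:
            buf = cand
        else:
            if buf:
                chunks.append(buf)
            buf = sentence
    if buf:
        chunks.append(buf)
    return chunks


text = "Bonjour. Comment allez-vous ? Très bien, merci !\nAu revoir."
sentences = [p.strip() for p in SENTENCE_END.split(text) if p.strip()]
chunks = pack(sentences, max_chars=30)
```

Note that the space before French `?` and `!` does not trigger a split, because the lookbehind requires the punctuation mark to come immediately before the whitespace.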
--- a/backend/inkflow/tts/factory.py
+++ b/backend/inkflow/tts/factory.py
@@ -0,0 +1,20 @@
 """TTS backend selection by name (pluggable)."""
 from __future__ import annotations
 from functools import lru_cache
 from .base import TTSBackend
 BACKENDS = ("kokoro", "qwen3")
@lru_cache(maxsize=4)
 def get_backend(name: str = "kokoro") -> TTSBackend:
    name = name.lower()
    if name == "kokoro":
        from .kokoro import KokoroBackend
        return KokoroBackend()
    if name == "qwen3":
        from .qwen3 import Qwen3Backend
        return Qwen3Backend()
    raise ValueError(f"Backend TTS inconnu: {name!r} (dispo: {', '.join(BACKENDS)})")
--- a/backend/inkflow/tts/kokoro.py
+++ b/backend/inkflow/tts/kokoro.py
@@ -0,0 +1,93 @@
 """Kokoro backend (fast, preset voices), ideal for previews.
 Kokoro truncates long texts, so synthesis runs chunk by chunk (split on
 sentence boundaries) and the results are concatenated. French goes through
 espeak-ng via phonemizer.
 """
 from __future__ import annotations
 import logging
 import numpy as np
 from ..config import setup_espeak
 from ..settings import get_settings
 from .base import TTSBackend, VoiceSpec, to_mono_float32
 from .chunk import chunk_text
 logger = logging.getLogger(__name__)
 # The MLX port of Kokoro has an intermittent alignment bug (mx.random.normal
 # in the harmonic generator) that raises a broadcast_shapes error on some
 # draws. Because the failure is random, a simple retry usually suffices;
 # as a last resort the chunk is split in two.
 _KOKORO_RETRIES = 8
 class KokoroBackend(TTSBackend):
    name = "kokoro"
    def __init__(self, model_id: str | None = None, lang_code: str | None = None):
        setup_espeak()
        settings = get_settings()
        self.model_id = model_id or settings.kokoro_model
        self.lang_code = lang_code or settings.kokoro_lang_code
        self._model = None
        self._sample_rate = 24000
    def _ensure_loaded(self) -> None:
        if self._model is None:
            from mlx_audio.tts.utils import load_model
            self._model = load_model(self.model_id)
    def default_voice(self) -> VoiceSpec:
        return VoiceSpec(preset=get_settings().kokoro_default_voice)
    def synthesize(self, text: str, voice: VoiceSpec) -> tuple[np.ndarray, int]:
        self._ensure_loaded()
        preset = voice.preset or get_settings().kokoro_default_voice
        pieces: list[np.ndarray] = []
        for chunk in chunk_text(text):
            pieces.extend(self._gen_resilient(chunk, preset, voice.speed))
        if not pieces:
            return np.zeros(0, dtype=np.float32), self._sample_rate
        return np.concatenate(pieces), self._sample_rate
    def _gen_once(self, text: str, preset: str, speed: float) -> list[np.ndarray]:
        out: list[np.ndarray] = []
        for result in self._model.generate(
            text=text, voice=preset, speed=speed, lang_code=self.lang_code,
        ):
            self._sample_rate = getattr(result, "sample_rate", self._sample_rate)
            out.append(to_mono_float32(result.audio))
        return out
    def _gen_resilient(self, text: str, preset: str, speed: float,
                       depth: int = 0) -> list[np.ndarray]:
        """Generate one chunk with retries, then fall back to re-splitting."""
        for _ in range(_KOKORO_RETRIES):
            try:
                return self._gen_once(text, preset, speed)
            except Exception:  # noqa: BLE001 - intermittent vocoder bug
                continue
        # Still failing: split in two and retry each half.
        if depth < 3 and len(text) > 40:
            mid = _split_point(text)
            left = self._gen_resilient(text[:mid].strip(), preset, speed, depth + 1)
            right = self._gen_resilient(text[mid:].strip(), preset, speed, depth + 1)
            return left + right
        logger.warning("Kokoro: morceau abandonne apres echecs: %r", text[:60])
        return []
 def _split_point(text: str) -> int:
    """Split index closest to the middle (preferably at a space)."""
    mid = len(text) // 2
    left = text.rfind(" ", 0, mid)
    right = text.find(" ", mid)
    if left == -1 and right == -1:
        return mid
    if left == -1:
        return right
    if right == -1:
        return left
    return left if (mid - left) <= (right - mid) else right
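The retry-then-split strategy of `_gen_resilient` can be exercised without a TTS model. In the sketch below, `fake_generate` is a hypothetical stand-in that fails deterministically on inputs longer than 50 characters (the real failure is random), and `gen_resilient` is a simplified free-function version of the method above.

```python
def split_point(text: str) -> int:
    """Split index closest to the middle, preferably at a space."""
    mid = len(text) // 2
    left = text.rfind(" ", 0, mid)
    right = text.find(" ", mid)
    if left == -1 and right == -1:
        return mid
    if left == -1:
        return right
    if right == -1:
        return left
    return left if (mid - left) <= (right - mid) else right


def gen_resilient(text, generate, retries=3, depth=0):
    """Retry `generate`; if it keeps failing, split the text and recurse."""
    for _ in range(retries):
        try:
            return [generate(text)]
        except RuntimeError:
            continue
    if depth < 3 and len(text) > 40:
        mid = split_point(text)
        left = gen_resilient(text[:mid].strip(), generate, retries, depth + 1)
        right = gen_resilient(text[mid:].strip(), generate, retries, depth + 1)
        return left + right
    return []  # give up on this piece


def fake_generate(text: str) -> str:
    # Hypothetical model call: always fails on inputs over 50 characters.
    if len(text) > 50:
        raise RuntimeError("simulated vocoder failure")
    return text.upper()


text = "one two three four five six seven eight nine ten eleven twelve thirteen"
pieces = gen_resilient(text, fake_generate)
```

With this 71-character input the first attempts fail, the text is cut at the space nearest the middle, and both halves then succeed, so no audio would be dropped.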
--- a/backend/inkflow/tts/qwen3.py
+++ b/backend/inkflow/tts/qwen3.py
@@ -0,0 +1,58 @@
 """Qwen3-TTS backend (quality + cloning from reference audio), for the final render.
 Two modes:
 - preset voice: `voice` (e.g. "Chelsie") + `language` ("French").
 - cloning     : `ref_audio` (+ `ref_text`, the clip's transcript) to imitate a
  voice from the voicebank, assigned to a character.
 """
 from __future__ import annotations
 import numpy as np
 from ..settings import get_settings
 from .base import TTSBackend, VoiceSpec, to_mono_float32
 from .chunk import chunk_text
 # Qwen3 tolerates longer sequences than Kokoro, but the length is still capped.
 _QWEN_MAX_CHARS = 500
 class Qwen3Backend(TTSBackend):
    name = "qwen3"
    def __init__(self, model_id: str | None = None, language: str | None = None):
        settings = get_settings()
        self.model_id = model_id or settings.qwen3_model
        self.language = language or settings.language
        self._model = None
        self._sample_rate = 24000
    def _ensure_loaded(self) -> None:
        if self._model is None:
            from mlx_audio.tts.utils import load_model
            self._model = load_model(self.model_id)
    def default_voice(self) -> VoiceSpec:
        return VoiceSpec(preset=get_settings().qwen3_default_voice)
    def _gen_kwargs(self, voice: VoiceSpec) -> dict:
        kwargs: dict = {"language": self.language, "speed": voice.speed}
        if voice.ref_audio:  # cloning mode
            kwargs["ref_audio"] = voice.ref_audio
            if voice.ref_text:
                kwargs["ref_text"] = voice.ref_text
        else:                # preset-voice mode
            kwargs["voice"] = voice.preset or get_settings().qwen3_default_voice
        return kwargs
    def synthesize(self, text: str, voice: VoiceSpec) -> tuple[np.ndarray, int]:
        self._ensure_loaded()
        kwargs = self._gen_kwargs(voice)
        pieces: list[np.ndarray] = []
        for chunk in chunk_text(text, max_chars=_QWEN_MAX_CHARS):
            for result in self._model.generate(text=chunk, **kwargs):
                self._sample_rate = getattr(result, "sample_rate", self._sample_rate)
                pieces.append(to_mono_float32(result.audio))
        if not pieces:
            return np.zeros(0, dtype=np.float32), self._sample_rate
        return np.concatenate(pieces), self._sample_rate
--- a/backend/inkflow/util.py
+++ b/backend/inkflow/util.py
@@ -0,0 +1,22 @@
 """Small shared utilities (slugs, safe file names)."""
 from __future__ import annotations
 import re
 import unicodedata
 _SLUG_STRIP = re.compile(r"[^a-z0-9]+")
 _FS_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
 def slugify(text: str) -> str:
    """Lowercase ASCII slug, used for internal folder identifiers."""
    norm = unicodedata.normalize("NFKD", text)
    norm = norm.encode("ascii", "ignore").decode("ascii").lower()
    return _SLUG_STRIP.sub("-", norm).strip("-") or "livre"
 def safe_filename(name: str) -> str:
    """Sanitize a file name while keeping accents (user-facing output)."""
    name = _FS_UNSAFE.sub("", name).strip()
    name = re.sub(r"\s+", " ", name)
    return name or "sans-titre"
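To show how the two helpers differ on the same input, here is a self-contained sketch that inlines the same normalization steps. The sample title and the helper name `ascii_fold` are assumptions made for the example.

```python
import re
import unicodedata


def ascii_fold(text: str) -> str:
    """Decompose accented characters (NFKD) and drop the non-ASCII marks."""
    norm = unicodedata.normalize("NFKD", text)
    return norm.encode("ascii", "ignore").decode("ascii")


title = "L'Éveil du Léviathan : Tome 1"

# Slug path: fold to ASCII, lowercase, collapse every non-alphanumeric run.
folded = ascii_fold(title)
slug = re.sub(r"[^a-z0-9]+", "-", folded.lower()).strip("-")

# File-name path: keep accents, drop only characters unsafe on file systems.
filename = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "", title).strip()
filename = re.sub(r"\s+", " ", filename)
```

The slug is suitable for internal directory names, while the file name stays readable for the listener, accents included.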
--- a/backend/pyproject.toml
+++ b/backend/pyproject.toml
@@ -0,0 +1,40 @@
 [project]
 name = "inkflow"
 version = "0.1.0"
 description = "EPUB -> livre audio, 100% local sur Mac (MLX). Analyse Gemma + TTS Qwen3/Kokoro."
 requires-python = ">=3.11"
 dependencies = [
    # MLX (Apple Silicon)
    "mlx",
    "mlx-lm",
    "mlx-audio",
    "misaki",          # phonemizer for Kokoro (French included)
    # EPUB parsing
    "ebooklib",
    "beautifulsoup4",
    "lxml",
    # Audio
    "soundfile",       # WAV read/write
    "numpy",           # audio concatenation + normalization
    "mutagen",         # ID3 tags + cover (MP3 encoding via the ffmpeg CLI)
    # Web API
    "fastapi",
    "uvicorn[standard]",
    "websockets",
    "python-multipart", # file uploads
    # Misc
    "pydantic>=2",
    "rich",            # readable CLI logs
    "typer",           # CLI
 ]
 [project.scripts]
 inkflow = "inkflow.cli:app"
 [build-system]
 requires = ["setuptools>=68"]
 build-backend = "setuptools.build_meta"
 [tool.setuptools.packages.find]
 where = ["."]
 include = ["inkflow*"]
--- a/backend/scripts/setup_models.py
+++ b/backend/scripts/setup_models.py
@@ -0,0 +1,87 @@
 #!/usr/bin/env python
 """Check the InkFlow environment and pre-download the MLX models.
 Usage:
    python scripts/setup_models.py            # check everything + download
    python scripts/setup_models.py --check    # check without downloading
 System prerequisites: Apple Silicon, Python >= 3.11, ffmpeg (brew install ffmpeg).
 """
 from __future__ import annotations
 import argparse
 import platform
 import shutil
 import sys
 from pathlib import Path
 # Make `inkflow` importable when the script is run directly (adds backend/ to sys.path).
 sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
 from inkflow.config import (  # noqa: E402
    GEMMA_MODEL,
    KOKORO_MODEL,
    QWEN3_TTS_MODEL,
    ensure_dirs,
 )
 def check_env() -> bool:
    ok = True
    print(f"• Plateforme       : {platform.platform()} ({platform.machine()})")
    if platform.machine() != "arm64":
        print("  ! Attendu arm64 (Apple Silicon) — MLX ne sera pas optimal.")
    print(f"• Python           : {sys.version.split()[0]}")
    if sys.version_info < (3, 11):
        print("  ! Python >= 3.11 requis.")
        ok = False
    for mod in ("mlx", "mlx_lm", "mlx_audio", "ebooklib", "bs4",
                "soundfile", "mutagen", "fastapi"):
        try:
            __import__(mod)
            print(f"• import {mod:12s}: OK")
        except Exception as exc:  # noqa: BLE001
            print(f"• import {mod:12s}: ECHEC ({exc})")
            ok = False
    ff = shutil.which("ffmpeg")
    print(f"• ffmpeg           : {ff or 'INTROUVABLE — brew install ffmpeg'}")
    ok = ok and bool(ff)
    return ok
 def download_lm(model_id: str) -> None:
    from mlx_lm import load
    print(f"  -> LM   {model_id}")
    load(model_id)
 def download_tts(model_id: str) -> None:
    from mlx_audio.tts.utils import load_model
    print(f"  -> TTS  {model_id}")
    load_model(model_id)
 def main() -> int:
    ap = argparse.ArgumentParser()
    ap.add_argument("--check", action="store_true", help="verifier sans telecharger")
    args = ap.parse_args()
    ensure_dirs()
    print("== Verification de l'environnement ==")
    env_ok = check_env()
    if args.check:
        return 0 if env_ok else 1
    if not env_ok:
        print("\nEnvironnement incomplet — corrige les points ci-dessus avant de continuer.")
        return 1
    print("\n== Telechargement des modeles (peut etre long la 1re fois) ==")
    download_lm(GEMMA_MODEL)
    download_tts(KOKORO_MODEL)
    download_tts(QWEN3_TTS_MODEL)
    print("\nTout est pret.")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
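The README lists both `ffmpeg` and `espeak-ng` as prerequisites, while `check_env` above only looks for `ffmpeg`. A possible extension, given purely as a sketch: the helper name `missing_tools` and its injectable `which` parameter are hypothetical, and only `shutil.which` is a real API.

```python
import shutil


def missing_tools(tools=("ffmpeg", "espeak-ng"), which=shutil.which) -> list[str]:
    """Return the command-line tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Example with an injected lookup, so the result does not depend on the machine:
def fake_which(tool: str):
    return "/opt/homebrew/bin/ffmpeg" if tool == "ffmpeg" else None


report = missing_tools(which=fake_which)
```

Keeping the lookup injectable makes the check testable on machines where the tools are not installed.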
--- a/backend/tests/test_incises.py
+++ b/backend/tests/test_incises.py
@@ -0,0 +1,204 @@
 """Tests for deterministic incise detection.
 `detect_incises` / `incise_speaker` / `iter_incise_pieces` are pure and
 testable without Gemma. Two passes: verb-pronoun inversion ("dit-il") and a
 cast-aware nominal pass ("compatit Holden", "informa le soldat").
 """
 from __future__ import annotations
 from inkflow.analysis.segmenter import (
    detect_incises,
    incise_speaker,
    iter_incise_pieces,
 )
 NAMES = {"Holden", "Kajri", "Camina Drummer"}
 def _pieces(text: str, names=NAMES) -> list[tuple[bool, str]]:
    return iter_incise_pieces(text, detect_incises(text, names=names))
 # --- Inversion pass (verb-pronoun) ------------------------------------------
 def test_inversion_au_milieu():
    assert _pieces("James Holden, coupa-t-elle. Je sais qui vous êtes.") == [
        (False, "James Holden,"),
        (True, "coupa-t-elle."),
        (False, "Je sais qui vous êtes."),
    ]
 def test_inversion_en_fin():
    assert _pieces("C'est fini, dit-elle.") == [
        (False, "C'est fini,"),
        (True, "dit-elle."),
    ]
 def test_inversion_reflechi_exclamation():
    assert _pieces("Viens ici, s'écria-t-il !") == [
        (False, "Viens ici,"),
        (True, "s'écria-t-il !"),
    ]
 def test_inversion_fermee_par_virgule():
    assert _pieces("Pars, répondit-elle, et ne reviens pas.") == [
        (False, "Pars,"),
        (True, "répondit-elle,"),
        (False, "et ne reviens pas."),
    ]
 def test_inversion_complements_apres_pronom():
    assert _pieces("Trop tard, murmura-t-il en souriant. Partons.") == [
        (False, "Trop tard,"),
        (True, "murmura-t-il en souriant."),
        (False, "Partons."),
    ]
 def test_double_inversion():
    assert _pieces("Stop, dit-il. Non, reprit-elle.") == [
        (False, "Stop,"),
        (True, "dit-il."),
        (False, "Non,"),
        (True, "reprit-elle."),
    ]
 # --- Incise at the end of speech: the rest of the line is narration ---------
 def test_incise_apres_fin_de_phrase_va_jusqu_au_bout():
    # After "…" the speech is closed: "dit-il ... d'importance." is narration.
    text = ("Dans une minute, oui. Je voudrais juste… dit-il avec un geste vague, "
            "comme si tout cela n'avait plus d'importance.")
    assert _pieces(text) == [
        (False, "Dans une minute, oui. Je voudrais juste…"),
        (True, "dit-il avec un geste vague, comme si tout cela n'avait plus "
               "d'importance."),
    ]
 def test_incise_apres_virgule_reprend_le_dialogue():
    # After a plain comma, the dialogue resumes (in contrast with the case above).
    assert _pieces("Pars, répondit-elle, et ne reviens pas.") == [
        (False, "Pars,"),
        (True, "répondit-elle,"),
        (False, "et ne reviens pas."),
    ]
 def test_incise_nominale_apres_point_interrogation_va_au_bout():
    text = "Vraiment ? demanda-t-il en se levant. Il s'éloigna."
    assert _pieces(text) == [
        (False, "Vraiment ?"),
        (True, "demanda-t-il en se levant. Il s'éloigna."),
    ]
 # --- Nominal pass (verb + known subject) ------------------------------------
 def test_nominale_nom_propre():
    assert _pieces("Toutes mes condoléances, compatit Holden.") == [
        (False, "Toutes mes condoléances,"),
        (True, "compatit Holden."),
    ]
 def test_nominale_alias_apres_ponctuation_forte():
    # "?" as the left delimiter + subject = alias of a known character.
    assert _pieces("Flippant, cet enfoiré, hein ? lança Drummer.") == [
        (False, "Flippant, cet enfoiré, hein ?"),
        (True, "lança Drummer."),
    ]
 def test_nominale_clitic_et_nom_de_role():
    assert _pieces("Vous venez, monsieur ? lui demanda un garde.") == [
        (False, "Vous venez, monsieur ?"),
        (True, "lui demanda un garde."),
    ]
 # --- incise_speaker: seeding the explicit speaker ----------------------------
 def test_seed_speaker_nom_propre():
    text = "Toutes mes condoléances, compatit Holden."
    inc = detect_incises(text, names=NAMES)[0]
    assert incise_speaker(text, inc, NAMES) == "Holden"
 def test_seed_speaker_alias_vers_canonique():
    text = "Hein ? lança Drummer."
    inc = detect_incises(text, names=NAMES)[0]
    assert incise_speaker(text, inc, NAMES) == "Camina Drummer"
 def test_seed_speaker_role_non_nomme_est_none():
    # A role noun ("un garde") is not a cast character -> no seed.
    text = "Vous venez ? lui demanda un garde."
    inc = detect_incises(text, names=NAMES)[0]
    assert incise_speaker(text, inc, NAMES) is None
 def test_seed_speaker_inversion_est_none():
    text = "C'est fini, dit-elle."
    inc = detect_incises(text, names=NAMES)[0]
    assert incise_speaker(text, inc, NAMES) is None
 def test_seed_nom_propre_absent_du_casting():
    # The name is written in the incise -> seeded even if extraction missed it.
    text = "Bonjour, lança Drummer."
    inc = detect_incises(text, names=set())[0]
    assert incise_speaker(text, inc, set()) == "Drummer"
    assert _pieces(text, names=set()) == [
        (False, "Bonjour,"),
        (True, "lança Drummer."),
    ]
 # --- False positives that must NOT be detected ------------------------------
 def test_vocatif_adresse_pas_incise():
    # The character is being addressed; this is not an incise (no speech verb).
    text = "Vous n'avez pas l'air en mesure de rendre service, capitaine Holden."
    assert detect_incises(text, names=NAMES) == []
 def test_imperatif_sans_incise():
    assert detect_incises("Donne-le-moi.", names=NAMES) == []
 def test_pronom_tu_exclu():
    assert detect_incises("Crois-tu ?", names=NAMES) == []
 def test_replique_simple_sans_incise():
    assert detect_incises("Bonjour à tous.", names=NAMES) == []
 def test_sans_noms_inversion_seule():
    # Without a cast, the inversion pass still works.
    assert _pieces("C'est fini, dit-elle.", names=set()) == [
        (False, "C'est fini,"),
        (True, "dit-elle."),
    ]
 # --- Invariants -------------------------------------------------------------
 def test_texte_preserve_modulo_espaces():
    text = "James Holden, coupa-t-elle. Je sais qui vous êtes."
    joined = "".join(p for _, p in _pieces(text))
    assert joined.replace(" ", "") == text.replace(" ", "")
 def test_bornes_non_chevauchantes_et_triees():
    text = "Stop, dit-il. Non, reprit-elle."
    incs = detect_incises(text, names=NAMES)
    assert all(incs[i].end <= incs[i + 1].start for i in range(len(incs) - 1))
    for inc in incs:
        assert 0 <= inc.start < inc.end <= len(text)
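For readers unfamiliar with the "verb-pronoun inversion" these tests target, here is a toy pattern that captures the idea. It is an illustration only and is not the project's `detect_incises` implementation: the regex, the pronoun list and the helper name `find_inversions` are assumptions, and the real detector also decides where each incise ends.

```python
import re

# Toy pattern (assumption): a word, a hyphen, an optional euphonic "t-",
# then a third-person subject pronoun. "tu", "le", "moi" are deliberately absent.
INVERSION = re.compile(r"\b[\w']+-(?:t-)?(?:il|elle|ils|elles|on)\b")


def find_inversions(text: str) -> list[str]:
    """Return every verb-pronoun inversion found in `text`."""
    return [m.group(0) for m in INVERSION.finditer(text)]


middle = find_inversions("James Holden, coupa-t-elle. Je sais qui vous êtes.")
double = find_inversions("Stop, dit-il. Non, reprit-elle.")
reflexive = find_inversions("Viens ici, s'écria-t-il !")
imperative = find_inversions("Donne-le-moi.")
question = find_inversions("Crois-tu ?")
```

The last two inputs mirror the false-positive tests above: an imperative with clitics and a second-person question contain hyphens but no speech-tag inversion.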
--- a/frontend/dist/assets/index-CMUl6Yfl.js
+++ b/frontend/dist/assets/index-CMUl6Yfl.js
--- a/frontend/dist/assets/index-DlPmWkkU.css
+++ b/frontend/dist/assets/index-DlPmWkkU.css
--- a/frontend/dist/index.html
+++ b/frontend/dist/index.html
@@ -0,0 +1,13 @@
 <!doctype html>
 <html lang="fr">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>InkFlow — EPUB → Livre audio</title>
    <script type="module" crossorigin src="/assets/index-CMUl6Yfl.js"></script>
    <link rel="stylesheet" crossorigin href="/assets/index-DlPmWkkU.css">
  </head>
  <body>
    <div id="root"></div>
  </body>
 </html>
--- a/frontend/index.html
+++ b/frontend/index.html
@@ -0,0 +1,12 @@
 <!doctype html>
 <html lang="fr">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>InkFlow — EPUB → Livre audio</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/main.jsx"></script>
  </body>
 </html>
--- a/frontend/package-lock.json
+++ b/frontend/package-lock.json
--- a/frontend/package.json
+++ b/frontend/package.json
@@ -0,0 +1,22 @@
 {
  "name": "inkflow-frontend",
  "private": true,
  "version": "0.1.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build",
    "preview": "vite preview"
  },
  "dependencies": {
    "react": "^18.3.1",
    "react-dom": "^18.3.1"
  },
  "devDependencies": {
    "@vitejs/plugin-react": "^4.3.4",
    "autoprefixer": "^10.4.20",
    "postcss": "^8.4.49",
    "tailwindcss": "^3.4.17",
    "vite": "^6.0.7"
  }
 }
--- a/frontend/postcss.config.js
+++ b/frontend/postcss.config.js
@@ -0,0 +1,6 @@
 export default {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
 };
--- a/frontend/src/AnalysisEditor.jsx
+++ b/frontend/src/AnalysisEditor.jsx
@@ -0,0 +1,245 @@
 import React, { useEffect, useMemo, useState } from "react";
 import { api } from "./api.js";
 import { Spinner } from "./ui.jsx";
 const NARRATOR = "narrateur";
 let _seq = 0;
 const nextId = () => ++_seq;
 export default function AnalysisEditor({ slug, book, state }) {
  // Analyzed chapters (book order intersected with analyzed_chapters).
  const analyzed = useMemo(() => {
    const set = new Set(state.analyzed_chapters || []);
    return book.chapters.filter((c) => set.has(c.index));
  }, [book, state.analyzed_chapters]);
  const [index, setIndex] = useState(() => analyzed[0]?.index ?? null);
  const [analysis, setAnalysis] = useState(null); // { index, title, segments:[{_id,type,text,speaker}] }
  const [names, setNames] = useState([]); // character names for the datalist
  const [loading, setLoading] = useState(false);
  const [saved, setSaved] = useState(false);
  // Latest text selection inside a line of dialogue (for "mark as incise").
  const [sel, setSel] = useState({ id: null, start: 0, end: 0 });
  // Display filters (they do not affect what gets saved).
  const [query, setQuery] = useState("");
  const [typeFilter, setTypeFilter] = useState("all");
  const [speakerFilter, setSpeakerFilter] = useState("all");
  // If the list of analyzed chapters changes and the current index disappears.
  useEffect(() => {
    if (index == null || !analyzed.some((c) => c.index === index)) {
      setIndex(analyzed[0]?.index ?? null);
    }
  }, [analyzed]); // eslint-disable-line react-hooks/exhaustive-deps
  // Character names from the cast (loaded once per book).
  useEffect(() => {
    api.getCast(slug)
      .then((d) => setNames((d.cast?.characters || []).map((c) => c.name)))
      .catch(() => setNames([]));
  }, [slug]);
  // Load the analysis of the selected chapter.
  useEffect(() => {
    if (index == null) { setAnalysis(null); return; }
    // The cleanup sets `stale` so a late response for a previous chapter is ignored.
    let stale = false;
    setLoading(true);
    setSaved(false);
    api.getChapter(slug, index).then((d) => {
      if (stale) return;
      const a = d.analysis;
      setAnalysis(a
        ? { index: a.index, title: a.title,
            segments: (a.segments || []).map((s) => ({ ...s, _id: nextId() })) }
        : { index, title: d.chapter?.title || "", segments: null });
    }).finally(() => { if (!stale) setLoading(false); });
    return () => { stale = true; };
  }, [slug, index]);
  const speakerOptions = useMemo(() => {
    const set = new Set([NARRATOR, ...names]);
    (analysis?.segments || []).forEach((s) => s.speaker && set.add(s.speaker));
    return [...set];
  }, [names, analysis]);
  if (!analyzed.length)
    return <p className="text-ink-muted">Lancez d'abord l'<b>Analyse</b> sur un chapitre.</p>;
  const touch = (segments) => { setAnalysis((a) => ({ ...a, segments })); setSaved(false); };
  const setSeg = (id, patch) =>
    touch(analysis.segments.map((s) => {
      if (s._id !== id) return s;
      const next = { ...s, ...patch };
      if (next.type === "narration") { next.speaker = NARRATOR; next.incises = []; }
      // Text edited: drop incises that are now out of bounds.
      if (patch.text !== undefined) {
        const len = next.text.length;
        next.incises = (next.incises || []).filter(
          (inc) => inc.start < inc.end && inc.end <= len);
      }
      return next;
    }));
  // Mark the [start,end) span of a dialogue line as an incise (read in the narrator's voice).
  const addIncise = (id, start, end) =>
    touch(analysis.segments.map((s) => {
      if (s._id !== id) return s;
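      const incises = [...(s.incises || []), { start, end }]
        .sort((a, b) => a.start - b.start) // then keep only spans starting at or after the last kept span's end
        .reduce((kept, inc) => (kept.length && inc.start < kept[kept.length - 1].end ? kept : [...kept, inc]), []);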
      return { ...s, incises };
    }));
  const removeIncise = (id, i) =>
    touch(analysis.segments.map((s) =>
      s._id !== id ? s : { ...s, incises: (s.incises || []).filter((_, k) => k !== i) }));
  const removeSeg = (id) => touch(analysis.segments.filter((s) => s._id !== id));
  const insertAfter = (id) => {
    const segs = analysis.segments;
    const pos = id == null ? segs.length : segs.findIndex((s) => s._id === id) + 1;
    const next = [...segs];
    next.splice(pos, 0, { _id: nextId(), type: "narration", text: "", speaker: NARRATOR });
    touch(next);
  };
  const save = async () => {
    const payload = {
      index: analysis.index,
      title: analysis.title,
      segments: analysis.segments.map(({ _id, ...s }) => s),
    };
    await api.putAnalysis(slug, analysis.index, payload);
    setSaved(true);
  };
  const segments = analysis?.segments;
  const visible = (segments || []).filter((s) => {
    if (typeFilter !== "all" && s.type !== typeFilter) return false;
    if (speakerFilter !== "all" && s.speaker !== speakerFilter) return false;
    if (query && !s.text.toLowerCase().includes(query.toLowerCase())) return false;
    return true;
  });
  const dialogueCount = (segments || []).filter((s) => s.type === "dialogue").length;
  return (
    <div className="space-y-4">
      <datalist id="speaker-list">
        {speakerOptions.map((n) => <option key={n} value={n} />)}
      </datalist>
      {/* Control bar */}
      <div className="card flex flex-wrap items-center gap-3 p-3">
        <label className="text-sm text-ink-muted">Chapitre</label>
        <select className="input" value={index ?? ""}
          onChange={(e) => setIndex(Number(e.target.value))}>
          {analyzed.map((c) => (
            <option key={c.index} value={c.index}>{c.index} — {c.title}</option>
          ))}
        </select>
        {segments && (
          <span className="text-xs text-ink-muted">
            {segments.length} segments · {dialogueCount} dialogues
          </span>
        )}
        <button className="btn-primary ml-auto" disabled={!segments} onClick={save}>
          {saved ? "✓ enregistré" : "Enregistrer"}
        </button>
      </div>
      {loading && <p className="text-ink-muted"><Spinner /> chargement de l'analyse…</p>}
      {!loading && segments === null && (
        <p className="text-ink-muted">Ce chapitre n'a pas encore d'analyse. Lancez l'<b>Analyse</b>.</p>
      )}
      {!loading && segments && (
        <>
          {/* Display filters */}
          <div className="card flex flex-wrap items-center gap-3 p-3">
            <input className="input flex-1 min-w-[12rem]" placeholder="Rechercher dans le texte…"
              value={query} onChange={(e) => setQuery(e.target.value)} />
            <select className="input" value={typeFilter} onChange={(e) => setTypeFilter(e.target.value)}>
              <option value="all">tous types</option>
              <option value="narration">narration</option>
              <option value="dialogue">dialogue</option>
            </select>
            <select className="input" value={speakerFilter} onChange={(e) => setSpeakerFilter(e.target.value)}>
              <option value="all">tous locuteurs</option>
              {speakerOptions.map((n) => <option key={n} value={n}>{n}</option>)}
            </select>
            {visible.length !== segments.length && (
              <span className="text-xs text-ink-muted">{visible.length} affichés</span>
            )}
          </div>
          <div className="card divide-y divide-ink-edge">
            {visible.map((s) => {
              const canMark = s.type === "dialogue"
                && sel.id === s._id && sel.end > sel.start;
              const incises = s.incises || [];
              return (
              <div key={s._id} className="px-4 py-2.5">
                <div className="flex items-start gap-3">
                  <select className="input w-28 shrink-0" value={s.type}
                    onChange={(e) => setSeg(s._id, { type: e.target.value })}>
                    <option value="narration">narration</option>
                    <option value="dialogue">dialogue</option>
                  </select>
                  <textarea className="input flex-1 min-h-[2.5rem] resize-y font-serif text-sm"
                    rows={Math.min(6, Math.ceil((s.text.length || 1) / 80))}
                    value={s.text}
                    onSelect={(e) => s.type === "dialogue" && setSel({
                      id: s._id, start: e.target.selectionStart, end: e.target.selectionEnd })}
                    onChange={(e) => setSeg(s._id, { text: e.target.value })} />
                  <input className="input w-40 shrink-0" list="speaker-list"
                    placeholder="locuteur"
                    value={s.speaker} disabled={s.type === "narration"}
                    onChange={(e) => setSeg(s._id, { speaker: e.target.value })} />
                  <div className="flex shrink-0 gap-1">
                    <button className="btn-ghost" title="Insérer après"
                      onClick={() => insertAfter(s._id)}>+</button>
                    <button className="btn-ghost" title="Supprimer"
                      onClick={() => removeSeg(s._id)}>✕</button>
                  </div>
                </div>
                {/* Incises: spans of the dialogue line read by the narrator */}
                {s.type === "dialogue" && (incises.length > 0 || canMark) && (
                  <div className="mt-1.5 ml-[7.75rem] flex flex-wrap items-center gap-1.5">
                    <span className="text-[11px] uppercase tracking-wide text-ink-muted">incises</span>
                    {incises.map((inc, i) => (
                      <span key={i}
                        className="inline-flex items-center gap-1 rounded bg-ink-edge/40 px-1.5 py-0.5 text-xs"
                        title="Lu par la voix du narrateur">
                        <span className="text-ink-muted">🎙</span>
                        <span className="font-serif">{s.text.slice(inc.start, inc.end)}</span>
                        <button className="text-ink-muted hover:text-ink"
                          title="Retirer l'incise"
                          onClick={() => removeIncise(s._id, i)}>✕</button>
                      </span>
                    ))}
                    {canMark && (
                      <button className="btn-ghost text-xs"
                        onClick={() => { addIncise(s._id, sel.start, sel.end);
                          setSel({ id: null, start: 0, end: 0 }); }}>
                        + marquer la sélection
                      </button>
                    )}
                  </div>
                )}
              </div>
            ); })}
            <div className="px-4 py-2.5">
              <button className="btn-ghost" onClick={() => insertAfter(null)}>+ ajouter un segment</button>
            </div>
          </div>
        </>
      )}
    </div>
  );
 }
--- a/frontend/src/App.jsx
+++ b/frontend/src/App.jsx
@@ -0,0 +1,44 @@
 import React, { useState } from "react";
 import Library from "./Library.jsx";
 import BookView from "./BookView.jsx";
 import Settings from "./Settings.jsx";
 export default function App() {
  // Allows opening a book directly via #slug (deep link).
  const [slug, setSlug] = useState(
    () => (location.hash ? decodeURIComponent(location.hash.slice(1)) : null)
  );
  const [showSettings, setShowSettings] = useState(false);
  const goHome = () => { setShowSettings(false); setSlug(null); };
  return (
    <div className="min-h-screen bg-ink-bg text-ink-text">
      <header className="border-b border-ink-edge">
        <div className="mx-auto flex max-w-6xl items-center gap-3 px-6 py-4">
          <button onClick={goHome} className="flex items-center gap-2">
            <span className="text-2xl">🖋️</span>
            <span className="font-serif text-xl tracking-wide">
              Ink<span className="text-ink-accent">Flow</span>
            </span>
          </button>
          <span className="ml-2 hidden text-sm text-ink-muted sm:inline">
            EPUB → livre audio · local · MLX
          </span>
          <button onClick={() => setShowSettings(true)} title="Réglages techniques"
            className="ml-auto text-xl text-ink-muted hover:text-ink-text">⚙</button>
        </div>
      </header>
      <main className="mx-auto max-w-6xl px-6 py-8">
        {showSettings ? (
          <Settings onBack={goHome} />
        ) : slug ? (
          <BookView slug={slug} onBack={() => setSlug(null)} />
        ) : (
          <Library onOpen={setSlug} />
        )}
      </main>
    </div>
  );
 }
--- a/frontend/src/BookView.jsx
+++ b/frontend/src/BookView.jsx
@@ -0,0 +1,99 @@
 import React, { useEffect, useState } from "react";
 import { api, subscribeState } from "./api.js";
 import { StatusChip, ProgressBar, Spinner } from "./ui.jsx";
 import Chapters from "./Chapters.jsx";
 import AnalysisEditor from "./AnalysisEditor.jsx";
 import CastEditor from "./CastEditor.jsx";
 import PronunciationEditor from "./PronunciationEditor.jsx";
 const STAGES = [
  { key: "analyze", label: "Analyse", action: (s) => api.analyze(s), hint: "Découpe le texte, détecte les locuteurs et le casting." },
  { key: "cast", label: "Casting", action: (s) => api.castAuto(s), hint: "Attribue une voix à chaque personnage." },
  { key: "pronounce", label: "Prononciations", action: (s) => api.pronounce(s), hint: "Repère les mots à risque de mauvaise prononciation." },
 ];
 export default function BookView({ slug, onBack }) {
  const [data, setData] = useState(null);
  const [state, setState] = useState(null);
  const [tab, setTab] = useState("chapters");
  useEffect(() => {
    api.getBook(slug).then((d) => { setData(d); setState(d.state); });
    const unsub = subscribeState(slug, setState);
    return unsub;
  }, [slug]);
  if (!data) return <p className="text-ink-muted"><Spinner /> chargement…</p>;
  const { book } = data;
  const st = state || data.state;
  const busy = !!st.active_stage;
  return (
    <div className="space-y-6">
      <button onClick={onBack} className="text-sm text-ink-muted hover:text-ink-text">← Bibliothèque</button>
      <div className="flex gap-5">
        {book.cover_file && (
          <img src={api.coverUrl(slug)} alt="" className="h-44 rounded-md border border-ink-edge object-cover" />
        )}
        <div className="flex-1">
          <h1 className="font-serif text-2xl">{book.title}</h1>
          <p className="text-ink-muted">{book.author}</p>
          <p className="mt-1 text-sm text-ink-muted">{book.chapters.filter((c) => c.render).length} chapitres à narrer</p>
          {busy && (
            <div className="mt-4 max-w-md space-y-1">
              <div className="flex justify-between text-xs text-ink-accent">
                <span>{st.active_detail || st.active_stage}</span>
                <span>{Math.round((st.active_progress || 0) * 100)}%</span>
              </div>
              <ProgressBar value={st.active_progress} />
            </div>
          )}
        </div>
      </div>
      {/* Pipeline */}
      <div className="grid grid-cols-1 gap-3 sm:grid-cols-3">
        {STAGES.map((stage) => {
          const status = st.stages?.[stage.key] || "pending";
          return (
            <div key={stage.key} className="card p-4">
              <div className="flex items-center justify-between">
                <span className="font-medium">{stage.label}</span>
                <StatusChip status={status} />
              </div>
              <p className="mt-1 text-xs text-ink-muted">{stage.hint}</p>
              <button className="btn-ghost mt-3" disabled={busy}
                onClick={() => stage.action(slug)}>
                {status === "done" ? "Relancer" : "Lancer"}
              </button>
            </div>
          );
        })}
      </div>
      {/* Tabs */}
      <div className="flex gap-1 border-b border-ink-edge">
        {[
          ["chapters", "Chapitres"],
          ["analysis", "Analyse"],
          ["cast", "Casting"],
          ["pron", "Prononciation"],
        ].map(([key, label]) => (
          <button key={key} onClick={() => setTab(key)}
            className={`px-4 py-2 text-sm ${tab === key
              ? "border-b-2 border-ink-accent text-ink-text"
              : "text-ink-muted hover:text-ink-text"}`}>
            {label}
          </button>
        ))}
      </div>
      {tab === "chapters" && <Chapters slug={slug} book={book} state={st} busy={busy} />}
      {tab === "analysis" && <AnalysisEditor slug={slug} book={book} state={st} />}
      {tab === "cast" && <CastEditor slug={slug} busy={busy} />}
      {tab === "pron" && <PronunciationEditor slug={slug} />}
    </div>
  );
 }
--- a/frontend/src/CastEditor.jsx
+++ b/frontend/src/CastEditor.jsx
@@ -0,0 +1,119 @@
 import React, { useEffect, useState } from "react";
 import { api } from "./api.js";
 import { Spinner } from "./ui.jsx";
 function VoiceSelect({ voices, value, onChange }) {
  return (
    <select className="input" value={value || ""} onChange={(e) => onChange(e.target.value)}>
      <option value="">— aucune —</option>
      {voices.map((v) => (
        <option key={v.id} value={v.id}>
          {v.label || v.id} ({v.gender === "male" ? "H" : v.gender === "female" ? "F" : "?"})
        </option>
      ))}
    </select>
  );
 }
 export default function CastEditor({ slug, busy }) {
  const [cast, setCast] = useState(null);
  const [voices, setVoices] = useState([]);
  const [saved, setSaved] = useState(false);
  const [playing, setPlaying] = useState(null);
  const [msg, setMsg] = useState(null);
  const dedupPending = React.useRef(false);
  const reload = () =>
    api.getCast(slug).then((d) => { setCast(d.cast); setVoices(d.voicebank.entries); });
  useEffect(() => { reload(); }, [slug]);
  // Reload the cast when a background job (dedup / chapter casting) finishes.
  useEffect(() => {
    if (busy) return;
    reload().then(() => {
      if (dedupPending.current) {
        dedupPending.current = false;
        api.getCast(slug).then((d) =>
          setMsg(`✓ déduplication terminée — ${d.cast.characters.length} personnages`));
      }
    });
  }, [busy]);
  const dedup = async () => {
    setMsg(null);
    try {
      dedupPending.current = true;
      await api.castDedup(slug);
      setMsg("Déduplication lancée…");
    } catch (e) {
      dedupPending.current = false;
      setMsg("Échec : " + e + " (le serveur backend est-il à jour ? redémarre-le)");
    }
  };
  if (!cast) return <p className="text-ink-muted"><Spinner /> chargement du casting…</p>;
  if (!cast.characters.length)
    return <p className="text-ink-muted">Lancez d'abord l'<b>Analyse</b> puis le <b>Casting</b>.</p>;
  const update = (patch) => { setCast({ ...cast, ...patch }); setSaved(false); };
  const setChar = (name, voiceId) =>
    update({ characters: cast.characters.map((c) => c.name === name ? { ...c, voice_id: voiceId } : c) });
  const preview = async (voiceId) => {
    if (!voiceId) return;
    setPlaying(voiceId);
    try {
      const url = await api.previewVoice(voiceId, "Bonjour, voici un aperçu de cette voix.");
      const a = new Audio(url);
      a.onended = () => setPlaying(null);
      await a.play(); // awaited so a rejected play() reaches the catch below
    } catch { setPlaying(null); }
  };
  const save = async () => { await api.putCast(slug, cast); setSaved(true); };
  return (
    <div className="space-y-4">
      <div className="card flex items-center gap-3 p-3">
        <span className="text-sm text-ink-muted">Narrateur</span>
        <VoiceSelect voices={voices} value={cast.narrator_voice_id}
          onChange={(v) => update({ narrator_voice_id: v })} />
        <button className="btn-ghost" onClick={() => preview(cast.narrator_voice_id)}>
          {playing === cast.narrator_voice_id ? "♪" : "▶"} écouter
        </button>
        <button className="btn-ghost ml-auto" disabled={busy}
          title="Fusionne les variantes d'un même personnage (Holden / James Holden / James)"
          onClick={dedup}>
          {busy ? "…" : "Dédupliquer"}
        </button>
        <button className="btn-primary" onClick={save}>
          {saved ? "✓ enregistré" : "Enregistrer"}
        </button>
      </div>
      {msg && <p className="px-1 text-sm text-ink-muted">{msg}</p>}
      <div className="card divide-y divide-ink-edge">
        {cast.characters.map((c) => (
          <div key={c.name} className="flex items-center gap-3 px-4 py-2.5">
            <div className="flex-1 min-w-0">
              <p className="truncate font-serif text-sm">{c.name}</p>
              {c.aliases?.length > 0 && (
                <p className="truncate text-xs text-ink-muted">alias : {c.aliases.join(", ")}</p>
              )}
              {c.description && <p className="truncate text-xs text-ink-muted">{c.description}</p>}
            </div>
            <span className="chip bg-ink-edge text-ink-muted">
              {c.gender === "male" ? "homme" : c.gender === "female" ? "femme" : "?"}
            </span>
            <VoiceSelect voices={voices} value={c.voice_id}
              onChange={(v) => setChar(c.name, v)} />
            <button className="btn-ghost" onClick={() => preview(c.voice_id)}>
              {playing === c.voice_id ? "♪" : "▶"}
            </button>
          </div>
        ))}
      </div>
    </div>
  );
 }
--- a/frontend/src/Chapters.jsx
+++ b/frontend/src/Chapters.jsx
@@ -0,0 +1,98 @@
 import React, { useEffect, useState } from "react";
 import { api } from "./api.js";
 import { StatusChip, ProgressBar } from "./ui.jsx";
 export default function Chapters({ slug, book, state, busy }) {
  const chapters = book.chapters.filter((c) => c.render);
  const [backend, setBackend] = useState("kokoro");
  const [mono, setMono] = useState(false);
  const [selected, setSelected] = useState(() => new Set());
  // Initialize the engine from the default backend in the settings.
  useEffect(() => {
    api.getSettings().then((s) => s?.default_backend && setBackend(s.default_backend)).catch(() => {});
  }, []);
  const toggle = (idx) => {
    const next = new Set(selected);
    next.has(idx) ? next.delete(idx) : next.add(idx);
    setSelected(next);
  };
  const renderChapters = (indexes) => {
    if (!indexes.length) return;
    api.render(slug, indexes, backend, mono);
  };
  return (
    <div className="space-y-4">
      <div className="card flex flex-wrap items-center gap-3 p-3">
        <label className="text-sm text-ink-muted">Moteur</label>
        <select className="input" value={backend} onChange={(e) => setBackend(e.target.value)}>
          <option value="kokoro">Kokoro (rapide)</option>
          <option value="qwen3">Qwen3 (qualité + clonage)</option>
        </select>
        <label className="flex items-center gap-2 text-sm text-ink-muted">
          <input type="checkbox" checked={mono} onChange={(e) => setMono(e.target.checked)} />
          mono-narrateur
        </label>
        <div className="ml-auto flex gap-2">
          <button className="btn-ghost" disabled={busy || !selected.size}
            onClick={() => renderChapters([...selected].sort((a, b) => a - b))}>
            Rendre la sélection ({selected.size})
          </button>
          <button className="btn-primary" disabled={busy}
            onClick={() => renderChapters(chapters.map((c) => c.index))}>
            Rendre tout
          </button>
        </div>
      </div>
      <div className="card divide-y divide-ink-edge">
        {chapters.map((c) => {
          const rs = state.render?.[c.index] || state.render?.[String(c.index)] || {};
          const analyzed = (state.analyzed_chapters || []).includes(c.index);
          return (
            <div key={c.index} className="flex items-center gap-3 px-4 py-2.5">
              <input type="checkbox" checked={selected.has(c.index)}
                onChange={() => toggle(c.index)} />
              <div className="w-9 text-center text-xs text-ink-muted">{c.index}</div>
              <div className="flex-1 min-w-0">
                <p className="truncate font-serif text-sm">{c.title}</p>
                <div className="mt-0.5 flex items-center gap-2 text-xs text-ink-muted">
                  <span>{c.word_count} mots</span>
                  {c.pov && <span className="chip bg-ink-edge text-ink-muted">{c.pov}</span>}
                  {analyzed && <span className="text-emerald-400">analysé</span>}
                </div>
                {rs.status === "running" && (
                  <div className="mt-1.5 max-w-xs"><ProgressBar value={rs.progress} /></div>
                )}
              </div>
              {rs.status && <StatusChip status={rs.status} />}
              {rs.mp3 && (
                <>
                  <audio controls src={api.audioUrl(slug, c.index)} className="h-8" />
                  <a className="btn-ghost" href={api.audioUrl(slug, c.index)} download>↓</a>
                </>
              )}
              {!busy && (
                <>
                  <button className="btn-ghost" title={analyzed ? "Ré-analyser ce chapitre" : "Analyser ce chapitre"}
                    onClick={() => api.analyze(slug, [c.index])}>
                    {analyzed ? "Ré-analyser" : "Analyser"}
                  </button>
                  <button className="btn-ghost" title="Ré-analyser le casting de ce chapitre (sans re-segmenter)"
                    onClick={() => api.castAnalyze(slug, [c.index])}>
                    Casting
                  </button>
                  <button className="btn-ghost" title="Rendre ce chapitre"
                    onClick={() => renderChapters([c.index])}>▶</button>
                </>
              )}
            </div>
          );
        })}
      </div>
    </div>
  );
 }
--- a/frontend/src/Library.jsx
+++ b/frontend/src/Library.jsx
@@ -0,0 +1,80 @@
 import React, { useEffect, useRef, useState } from "react";
 import { api } from "./api.js";
 import { Spinner } from "./ui.jsx";
 export default function Library({ onOpen }) {
  const [books, setBooks] = useState(null);
  const [uploading, setUploading] = useState(false);
  const [error, setError] = useState(null);
  const fileRef = useRef();
  const refresh = () => api.listBooks().then(setBooks).catch((e) => setError(String(e)));
  useEffect(() => { refresh(); }, []);
  const upload = async (file) => {
    if (!file) return;
    setUploading(true);
    setError(null);
    try {
      const { slug } = await api.uploadBook(file);
      await refresh();
      onOpen(slug);
    } catch (e) {
      setError("Échec de l'import : " + e);
    } finally {
      setUploading(false);
    }
  };
  return (
    <div className="space-y-8">
      <section
        onDragOver={(e) => e.preventDefault()}
        onDrop={(e) => { e.preventDefault(); upload(e.dataTransfer.files[0]); }}
        className="card flex flex-col items-center justify-center gap-3 border-dashed py-12 text-center"
      >
        <div className="text-4xl">📖</div>
        <p className="font-serif text-lg">Déposez un fichier EPUB</p>
        <p className="text-sm text-ink-muted">ou</p>
        <button className="btn-primary" disabled={uploading}
          onClick={() => fileRef.current?.click()}>
          {uploading ? <Spinner /> : null}
          {uploading ? "Import en cours…" : "Choisir un fichier"}
        </button>
        <input ref={fileRef} type="file" accept=".epub" className="hidden"
          onChange={(e) => { upload(e.target.files[0]); e.target.value = ""; /* lets the same file be picked again */ }} />
      </section>
      {error && <p className="text-sm text-red-400">{error}</p>}
      <section>
        <h2 className="mb-3 font-serif text-lg text-ink-muted">Bibliothèque</h2>
        {books === null ? (
          <p className="text-ink-muted"><Spinner /> chargement…</p>
        ) : books.length === 0 ? (
          <p className="text-ink-muted">Aucun livre pour l'instant.</p>
        ) : (
          <div className="grid grid-cols-2 gap-4 sm:grid-cols-3 lg:grid-cols-4">
            {books.map((b) => (
              <button key={b.slug} onClick={() => onOpen(b.slug)}
                className="card group overflow-hidden text-left transition-transform hover:-translate-y-1">
                <div className="aspect-[2/3] w-full bg-ink-edge">
                  {b.cover && (
                    <img src={b.cover} alt="" className="h-full w-full object-cover" />
                  )}
                </div>
                <div className="p-3">
                  <p className="line-clamp-2 font-serif text-sm">{b.title}</p>
                  <p className="mt-1 text-xs text-ink-muted">{b.author}</p>
                  <p className="mt-2 text-xs text-ink-accent">
                    {b.rendered}/{b.chapters} chapitres rendus
                  </p>
                </div>
              </button>
            ))}
          </div>
        )}
      </section>
    </div>
  );
 }
--- a/frontend/src/PronunciationEditor.jsx
+++ b/frontend/src/PronunciationEditor.jsx
@@ -0,0 +1,59 @@
 import React, { useEffect, useState } from "react";
 import { api } from "./api.js";
 import { Spinner } from "./ui.jsx";
 export default function PronunciationEditor({ slug }) {
  const [entries, setEntries] = useState(null);
  const [saved, setSaved] = useState(false);
  useEffect(() => {
    api.getPron(slug).then((d) => setEntries(d.entries || []));
  }, [slug]);
  if (entries === null) return <p className="text-ink-muted"><Spinner /> chargement…</p>;
  const dirty = () => setSaved(false);
  const setRow = (i, patch) => {
    setEntries(entries.map((e, j) => (j === i ? { ...e, ...patch } : e)));
    dirty();
  };
  const add = () => { setEntries([...entries, { term: "", replacement: "", enabled: true }]); dirty(); };
  const remove = (i) => { setEntries(entries.filter((_, j) => j !== i)); dirty(); };
  const save = async () => {
    await api.putPron(slug, { entries: entries.filter((e) => e.term) });
    setSaved(true);
  };
  return (
    <div className="space-y-4">
      <div className="flex items-center gap-3">
        <p className="text-sm text-ink-muted">
          Corrigez la graphie des mots mal prononcés. La colonne « prononciation » remplace le terme avant la synthèse.
        </p>
        <button className="btn-ghost ml-auto" onClick={add}>+ ajouter</button>
        <button className="btn-primary" onClick={save}>{saved ? "✓ enregistré" : "Enregistrer"}</button>
      </div>
      {entries.length === 0 ? (
        <p className="text-ink-muted">Aucune entrée. Lancez l'étape <b>Prononciations</b> ou ajoutez-en.</p>
      ) : (
        <div className="card divide-y divide-ink-edge">
          <div className="grid grid-cols-[1fr_1fr_auto_auto] gap-3 px-4 py-2 text-xs uppercase text-ink-muted">
            <span>Terme</span><span>Prononciation</span><span>Actif</span><span></span>
          </div>
          {entries.map((e, i) => (
            <div key={i} className="grid grid-cols-[1fr_1fr_auto_auto] items-center gap-3 px-4 py-2">
              <input className="input" value={e.term}
                onChange={(ev) => setRow(i, { term: ev.target.value })} />
              <input className="input" value={e.replacement}
                onChange={(ev) => setRow(i, { replacement: ev.target.value })} />
              <input type="checkbox" checked={e.enabled !== false}
                onChange={(ev) => setRow(i, { enabled: ev.target.checked })} />
              <button className="text-ink-muted hover:text-red-400" onClick={() => remove(i)}>✕</button>
            </div>
          ))}
        </div>
      )}
    </div>
  );
 }
--- a/frontend/src/Settings.jsx
+++ b/frontend/src/Settings.jsx
@@ -0,0 +1,142 @@
 import React, { useEffect, useState } from "react";
 import { api } from "./api.js";
 import { Spinner } from "./ui.jsx";
 // Declarative description of the fields, grouped by section.
 const SECTIONS = [
  {
    title: "Modèles (identifiants MLX / HuggingFace)",
    hint: "Changer un identifiant recharge un autre modèle (peut déclencher un téléchargement au prochain usage).",
    fields: [
      { key: "gemma_model", label: "Gemma (analyse)", type: "text" },
      { key: "qwen3_model", label: "Qwen3-TTS (rendu)", type: "text" },
      { key: "kokoro_model", label: "Kokoro (preview)", type: "text" },
    ],
  },
  {
    title: "Génération Gemma",
    hint: "Paramètres d'échantillonnage de l'analyse (locuteurs, personnages, prononciations).",
    fields: [
      { key: "gemma_temperature", label: "Température", type: "number", step: 0.05, min: 0, max: 2 },
      { key: "gemma_max_tokens", label: "Max tokens", type: "number", step: 1, min: 64, max: 8192 },
    ],
  },
  {
    title: "Prompts système (analyse)",
    hint: "Instructions envoyées à Gemma avant chaque tâche. Le modèle doit répondre en JSON.",
    fields: [
      { key: "prompt_speakers", label: "Attribution des locuteurs", type: "textarea" },
      { key: "prompt_characters", label: "Extraction des personnages", type: "textarea" },
      { key: "prompt_pronunciation", label: "Mots à risque (prononciation)", type: "textarea" },
    ],
  },
  {
    title: "Casting (déduplication)",
    hint: "Le rapprochement des variantes de noms (Holden / James Holden / James) est heuristique et sûr. La passe Gemma ajoute les variantes non évidentes (diminutifs, titres) mais, avec un petit modèle local, produit des fusions erronées.",
    fields: [
      { key: "dedup_use_gemma", label: "Affiner la déduplication avec Gemma (moins sûr)", type: "checkbox" },
    ],
  },
  {
    title: "TTS (voix par défaut)",
    hint: "Backend et voix utilisés par défaut pour le rendu et les replis.",
    fields: [
      { key: "default_backend", label: "Backend par défaut", type: "select",
        options: [["kokoro", "Kokoro (rapide)"], ["qwen3", "Qwen3 (qualité + clonage)"]] },
      { key: "language", label: "Langue (Qwen3)", type: "text" },
      { key: "kokoro_lang_code", label: "Code langue Kokoro", type: "text" },
      { key: "kokoro_default_voice", label: "Voix Kokoro par défaut", type: "text" },
      { key: "qwen3_default_voice", label: "Voix Qwen3 par défaut", type: "text" },
    ],
  },
  {
    title: "Audio (encodage final)",
    hint: "Appliqué à la concaténation et à l'export MP3.",
    fields: [
      { key: "target_sample_rate", label: "Sample rate (Hz)", type: "number", step: 1000, min: 8000, max: 48000 },
      { key: "mp3_bitrate", label: "Bitrate MP3", type: "text" },
      { key: "target_dbfs", label: "Normalisation (dBFS)", type: "number", step: 0.5, min: -40, max: 0 },
    ],
  },
 ];
 function Field({ field, value, onChange }) {
  const common = "input w-full";
  if (field.type === "checkbox")
    return <input type="checkbox" className="h-4 w-4"
      checked={!!value} onChange={(e) => onChange(e.target.checked)} />;
  if (field.type === "textarea")
    return <textarea className={`${common} min-h-[5rem] resize-y text-sm`} rows={4}
      value={value ?? ""} onChange={(e) => onChange(e.target.value)} />;
  if (field.type === "select")
    return <select className={common} value={value ?? ""} onChange={(e) => onChange(e.target.value)}>
      {field.options.map(([v, lbl]) => <option key={v} value={v}>{lbl}</option>)}
    </select>;
  if (field.type === "number")
    return <input className={common} type="number"
      step={field.step} min={field.min} max={field.max}
      value={value ?? ""} onChange={(e) => onChange(e.target.value === "" ? "" : Number(e.target.value))} />;
  return <input className={common} type="text"
    value={value ?? ""} onChange={(e) => onChange(e.target.value)} />;
 }
 export default function Settings({ onBack }) {
  const [settings, setSettings] = useState(null);
  const [saved, setSaved] = useState(false);
  const [error, setError] = useState(null);
  useEffect(() => {
    api.getSettings().then(setSettings).catch((e) => setError(String(e)));
  }, []);
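  // A load failure leaves nothing to edit, so it replaces the page. A save failure
  // is shown inline (below) so the form and the pending edits stay on screen.
  if (!settings && error) return <p className="text-sm text-red-400">{error}</p>;
  if (!settings) return <p className="text-ink-muted"><Spinner /> chargement des réglages…</p>;
  // Functional update: two quick edits cannot overwrite each other with a stale snapshot.
  const set = (key, val) => { setSettings((s) => ({ ...s, [key]: val })); setSaved(false); };
  const save = async () => {
    setError(null);
    try { await api.putSettings(settings); setSaved(true); }
    catch (e) { setError("Échec de l'enregistrement : " + e); }
  };
  return (
    <div className="space-y-6">
      {error && <p className="text-sm text-red-400">{error}</p>}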
      <div className="flex items-center gap-3">
        <button onClick={onBack} className="text-sm text-ink-muted hover:text-ink-text">← Bibliothèque</button>
        <h1 className="font-serif text-2xl">Réglages techniques</h1>
        <button className="btn-primary ml-auto" onClick={save}>
          {saved ? "✓ enregistré" : "Enregistrer"}
        </button>
      </div>
      <p className="text-sm text-ink-muted">
        Réglages globaux appliqués à toute l'app. Les changements de modèle prennent effet au
        prochain lancement d'analyse ou de rendu.
      </p>
      {SECTIONS.map((sec) => (
        <section key={sec.title} className="card p-4 space-y-3">
          <div>
            <h2 className="font-medium">{sec.title}</h2>
            {sec.hint && <p className="text-xs text-ink-muted">{sec.hint}</p>}
          </div>
          <div className="grid gap-3">
            {sec.fields.map((f) => (
              <label key={f.key} className="grid gap-1">
                <span className="text-sm text-ink-muted">{f.label}</span>
                <Field field={f} value={settings[f.key]} onChange={(v) => set(f.key, v)} />
              </label>
            ))}
          </div>
        </section>
      ))}
      <div className="flex justify-end">
        <button className="btn-primary" onClick={save}>
          {saved ? "✓ enregistré" : "Enregistrer"}
        </button>
      </div>
    </div>
  );
 }
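Note on the number fields in `Settings.jsx`: the `min`/`max` declared in `SECTIONS` are only forwarded to the `<input>` element, which limits the stepper and marks the field invalid but still lets a typed out-of-range value reach `onChange`. A minimal sketch of how the declared bounds could be applied before saving, assuming a helper named `clampToField` (hypothetical, not part of this commit):

```javascript
// Hypothetical helper (not part of the commit): coerce a number field's value
// into the [min, max] range declared for it in SECTIONS.
function clampToField(field, value) {
  // Non-number fields and empty inputs are passed through unchanged.
  if (field.type !== "number" || value === "" || value == null) return value;
  const n = Number(value);
  if (Number.isNaN(n)) return value; // leave unparsable input for the backend to reject
  const lo = field.min ?? -Infinity;
  const hi = field.max ?? Infinity;
  return Math.min(hi, Math.max(lo, n));
}

// Field descriptors copied from the "Génération Gemma" section of SECTIONS.
const temperature = { key: "gemma_temperature", type: "number", step: 0.05, min: 0, max: 2 };
const maxTokens = { key: "gemma_max_tokens", type: "number", step: 1, min: 64, max: 8192 };

const clampedTemperature = clampToField(temperature, 3.5); // above max
const clampedMaxTokens = clampToField(maxTokens, 16);      // below min
const untouchedText = clampToField({ key: "mp3_bitrate", type: "text" }, "192k");
```

Whether clamping belongs in the UI or in the backend's settings validation is a design choice this commit does not settle; the sketch only shows the UI-side option.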
--- a/frontend/src/api.js
+++ b/frontend/src/api.js
@@ -0,0 +1,64 @@
 // InkFlow API client: fetch wrappers + WebSocket subscription to a book's state.
 async function j(url, opts) {
  const res = await fetch(url, opts);
  if (!res.ok) throw new Error(`${res.status} ${await res.text()}`);
  const ct = res.headers.get("content-type") || "";
  return ct.includes("application/json") ? res.json() : res;
 }
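 // Build fetch options for a JSON request. Only an omitted body is skipped, so
 // falsy payloads such as `false` or `0` are still serialized.
 const json = (method, body) => ({
  method,
  headers: { "Content-Type": "application/json" },
  body: body !== undefined ? JSON.stringify(body) : undefined,
 });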
 export const api = {
  listBooks: () => j("/api/books"),
  uploadBook: (file) => {
    const fd = new FormData();
    fd.append("file", file);
    return j("/api/books", { method: "POST", body: fd });
  },
  getBook: (slug) => j(`/api/books/${slug}`),
  getChapter: (slug, idx) => j(`/api/books/${slug}/chapters/${idx}`),
  putAnalysis: (slug, idx, analysis) =>
    j(`/api/books/${slug}/chapters/${idx}/analysis`, json("PUT", analysis)),
  analyze: (slug, chapters) => j(`/api/books/${slug}/analyze`, json("POST", { chapters })),
  pronounce: (slug) => j(`/api/books/${slug}/pronounce`, json("POST")),
  castAuto: (slug) => j(`/api/books/${slug}/cast/auto`, json("POST")),
  castAnalyze: (slug, chapters) =>
    j(`/api/books/${slug}/cast/analyze`, json("POST", { chapters })),
  castDedup: (slug) => j(`/api/books/${slug}/cast/dedup`, json("POST")),
  render: (slug, chapters, backend, mono) =>
    j(`/api/books/${slug}/render`, json("POST", { chapters, backend, mono })),
  getCast: (slug) => j(`/api/books/${slug}/cast`),
  putCast: (slug, cast) => j(`/api/books/${slug}/cast`, json("PUT", cast)),
  getPron: (slug) => j(`/api/books/${slug}/pronunciation`),
  putPron: (slug, pron) => j(`/api/books/${slug}/pronunciation`, json("PUT", pron)),
  getSettings: () => j("/api/settings"),
  putSettings: (settings) => j("/api/settings", json("PUT", settings)),
  audioUrl: (slug, idx) => `/api/books/${slug}/audio/${idx}`,
  coverUrl: (slug) => `/api/books/${slug}/cover`,
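  // Returns an object URL for the preview audio; the caller should release it
  // with URL.revokeObjectURL once playback is done.
  previewVoice: async (voiceId, text) => {
    const res = await fetch("/api/voicebank/preview", json("POST", { voice_id: voiceId, text }));
    if (!res.ok) throw new Error(`preview ${res.status}`);
    return URL.createObjectURL(await res.blob());
  },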
 };
 // Real-time subscription to a book's state. Reconnects automatically.
 export function subscribeState(slug, onState) {
  let ws, timer, closed = false;
  const connect = () => {
    if (closed) return; // unsubscribed while a reconnect was pending
    const proto = location.protocol === "https:" ? "wss" : "ws";
    ws = new WebSocket(`${proto}://${location.host}/ws/${slug}`);
    ws.onmessage = (e) => {
      const msg = JSON.parse(e.data);
      if (msg.type === "state") onState(msg.state);
    };
    ws.onclose = () => { if (!closed) timer = setTimeout(connect, 1500); };
  };
  connect();
  // Unsubscribe: stop reconnecting, cancel any pending retry, close the socket.
  return () => { closed = true; clearTimeout(timer); ws && ws.close(); };
 }
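Note on the `json` helper in `api.js`: it produces the options object passed to `fetch`, and several endpoints (`pronounce`, `castAuto`, `castDedup`) call it with no body at all. The sketch below restates the helper so it runs on its own and shows the two shapes it yields; the `mp3_bitrate` value is a made-up example, not a documented default.

```javascript
// Standalone restatement of the helper for the two cases used in api.js
// (an object body, or no body at all).
const json = (method, body) => ({
  method,
  headers: { "Content-Type": "application/json" },
  body: body === undefined ? undefined : JSON.stringify(body),
});

// Body-less POST, as used by pronounce / castAuto / castDedup.
const noBody = json("POST");

// PUT with a payload, as used by putSettings; the settings object is illustrative.
const withBody = json("PUT", { mp3_bitrate: "192k" });
```

`fetch` ignores a `body` of `undefined`, so the body-less form sends an empty request that still carries the JSON content type.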
--- a/frontend/src/index.css
+++ b/frontend/src/index.css
@@ -0,0 +1,37 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
 :root {
  color-scheme: dark;
 }
 body {
  margin: 0;
  background: #14110f;
  color: #ede4d8;
  font-family: system-ui, -apple-system, "Segoe UI", sans-serif;
 }
@layer components {
  .btn {
    @apply inline-flex items-center gap-2 rounded-md px-3 py-1.5 text-sm font-medium
           transition-colors disabled:opacity-40 disabled:cursor-not-allowed;
  }
  .btn-primary {
    @apply btn bg-ink-accent text-ink-bg hover:bg-ink-accent2;
  }
  .btn-ghost {
    @apply btn border border-ink-edge text-ink-text hover:bg-ink-edge;
  }
  .card {
    @apply rounded-lg border border-ink-edge bg-ink-panel;
  }
  .chip {
    @apply inline-flex items-center rounded-full px-2 py-0.5 text-xs font-medium;
  }
  .input {
    @apply rounded-md border border-ink-edge bg-ink-bg px-2 py-1 text-sm
           text-ink-text outline-none focus:border-ink-accent;
  }
 }
--- a/frontend/src/main.jsx
+++ b/frontend/src/main.jsx
@@ -0,0 +1,6 @@
 import React from "react";
 import { createRoot } from "react-dom/client";
 import App from "./App.jsx";
 import "./index.css";
 createRoot(document.getElementById("root")).render(<App />);
--- a/frontend/src/ui.jsx
+++ b/frontend/src/ui.jsx
@@ -0,0 +1,35 @@
 // Small shared widgets.
 import React from "react";
 const STATUS_STYLE = {
  done: "bg-emerald-900/50 text-emerald-300",
  running: "bg-ink-accent/20 text-ink-accent",
  error: "bg-red-900/50 text-red-300",
  pending: "bg-ink-edge text-ink-muted",
 };
 const STATUS_LABEL = { done: "terminé", running: "en cours", error: "erreur", pending: "en attente" };
 export function StatusChip({ status }) {
  return (
    <span className={`chip ${STATUS_STYLE[status] || STATUS_STYLE.pending}`}>
      {STATUS_LABEL[status] || status}
    </span>
  );
 }
 export function ProgressBar({ value }) {
  // `value` is a 0..1 fraction; clamp it so a bad value cannot overflow the track.
  const pct = Math.round(Math.min(1, Math.max(0, value || 0)) * 100);
  return (
    <div className="h-1.5 w-full overflow-hidden rounded-full bg-ink-edge">
      <div
        className="h-full bg-ink-accent transition-all duration-300"
        style={{ width: `${pct}%` }}
      />
    </div>
  );
 }
 export function Spinner() {
  return (
    <span className="inline-block h-3.5 w-3.5 animate-spin rounded-full border-2 border-ink-accent border-t-transparent" />
  );
 }
--- a/frontend/tailwind.config.js
+++ b/frontend/tailwind.config.js
@@ -0,0 +1,23 @@
 /** @type {import('tailwindcss').Config} */
 export default {
  content: ["./index.html", "./src/**/*.{js,jsx}"],
  theme: {
    extend: {
      colors: {
        ink: {
          bg: "#14110f",
          panel: "#1d1916",
          edge: "#2c2622",
          muted: "#9a8c7d",
          text: "#ede4d8",
          accent: "#d9a441",
          accent2: "#b9763f",
        },
      },
      fontFamily: {
        serif: ["Georgia", "Cambria", "serif"],
      },
    },
  },
  plugins: [],
 };
--- a/frontend/vite.config.js
+++ b/frontend/vite.config.js
@@ -0,0 +1,14 @@
 import { defineConfig } from "vite";
 import react from "@vitejs/plugin-react";
 // In dev, the UI runs on port 5173 and proxies API/WS traffic to the backend (port 8000).
 export default defineConfig({
  plugins: [react()],
  server: {
    port: 5173,
    proxy: {
      "/api": { target: "http://127.0.0.1:8000", changeOrigin: true },
      "/ws": { target: "ws://127.0.0.1:8000", ws: true },
    },
  },
 });
--- a/voicebank/clips/f_bella.wav
+++ b/voicebank/clips/f_bella.wav
--- a/voicebank/clips/f_emma.wav
+++ b/voicebank/clips/f_emma.wav
--- a/voicebank/clips/f_heart.wav
+++ b/voicebank/clips/f_heart.wav
--- a/voicebank/clips/f_nicole.wav
+++ b/voicebank/clips/f_nicole.wav
--- a/voicebank/clips/fr_f_siwis.wav
+++ b/voicebank/clips/fr_f_siwis.wav
--- a/voicebank/clips/m_eric.wav
+++ b/voicebank/clips/m_eric.wav
--- a/voicebank/clips/m_fenrir.wav
+++ b/voicebank/clips/m_fenrir.wav
--- a/voicebank/clips/m_george.wav
+++ b/voicebank/clips/m_george.wav
--- a/voicebank/clips/m_lewis.wav
+++ b/voicebank/clips/m_lewis.wav
--- a/voicebank/clips/m_michael.wav
+++ b/voicebank/clips/m_michael.wav
--- a/voicebank/clips/m_santa.wav
+++ b/voicebank/clips/m_santa.wav
--- a/voicebank/metadata.json
+++ b/voicebank/metadata.json
@@ -0,0 +1,114 @@
 {
  "entries": [
    {
      "id": "fr_f_siwis",
      "kokoro_voice": "ff_siwis",
      "gender": "female",
      "age": "adult",
      "lang": "fr",
      "label": "Siwis (FR)",
      "ref_audio": "clips/fr_f_siwis.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "f_bella",
      "kokoro_voice": "af_bella",
      "gender": "female",
      "age": "adult",
      "lang": "fr",
      "label": "Bella",
      "ref_audio": "clips/f_bella.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "f_heart",
      "kokoro_voice": "af_heart",
      "gender": "female",
      "age": "young",
      "lang": "fr",
      "label": "Heart",
      "ref_audio": "clips/f_heart.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "f_emma",
      "kokoro_voice": "bf_emma",
      "gender": "female",
      "age": "adult",
      "lang": "fr",
      "label": "Emma",
      "ref_audio": "clips/f_emma.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "f_nicole",
      "kokoro_voice": "af_nicole",
      "gender": "female",
      "age": "adult",
      "lang": "fr",
      "label": "Nicole",
      "ref_audio": "clips/f_nicole.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "m_fenrir",
      "kokoro_voice": "am_fenrir",
      "gender": "male",
      "age": "adult",
      "lang": "fr",
      "label": "Fenrir",
      "ref_audio": "clips/m_fenrir.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "m_michael",
      "kokoro_voice": "am_michael",
      "gender": "male",
      "age": "adult",
      "lang": "fr",
      "label": "Michael",
      "ref_audio": "clips/m_michael.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "m_george",
      "kokoro_voice": "bm_george",
      "gender": "male",
      "age": "adult",
      "lang": "fr",
      "label": "George",
      "ref_audio": "clips/m_george.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "m_lewis",
      "kokoro_voice": "bm_lewis",
      "gender": "male",
      "age": "adult",
      "lang": "fr",
      "label": "Lewis",
      "ref_audio": "clips/m_lewis.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "m_eric",
      "kokoro_voice": "am_eric",
      "gender": "male",
      "age": "young",
      "lang": "fr",
      "label": "Eric",
      "ref_audio": "clips/m_eric.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    },
    {
      "id": "m_santa",
      "kokoro_voice": "am_santa",
      "gender": "male",
      "age": "old",
      "lang": "fr",
      "label": "Santa",
      "ref_audio": "clips/m_santa.wav",
      "ref_text": "L'univers est toujours plus étrange qu'on ne le croit. Chaque nouvelle merveille pose les bases d'une découverte plus éblouissante encore."
    }
  ]
 }
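Note on `voicebank/metadata.json`: every entry above carries the same eight keys, and its `ref_audio` follows the pattern `clips/<id>.wav`. That pattern is an observation from the data, not a documented rule. A minimal sketch of a consistency check built on that assumption, with `checkVoicebank` and `REQUIRED_KEYS` as hypothetical names that do not appear in the commit:

```javascript
// Hypothetical consistency check for voicebank/metadata.json (not part of the commit).
const REQUIRED_KEYS = [
  "id", "kokoro_voice", "gender", "age", "lang", "label", "ref_audio", "ref_text",
];

function checkVoicebank(metadata) {
  const problems = [];
  const seen = new Set();
  for (const entry of metadata.entries) {
    const id = entry.id ?? "?";
    // Every key must be present and non-empty.
    for (const key of REQUIRED_KEYS) {
      if (!entry[key]) problems.push(`${id}: missing ${key}`);
    }
    // Ids must be unique, since they name both the entry and its clip.
    if (seen.has(entry.id)) problems.push(`${id}: duplicate id`);
    seen.add(entry.id);
    // Assumed convention: the reference clip is stored as clips/<id>.wav.
    if (entry.ref_audio !== `clips/${entry.id}.wav`) {
      problems.push(`${id}: ref_audio does not match clips/<id>.wav`);
    }
  }
  return problems;
}

// Two entries in the shape used above; ref_text is shortened and the second
// ref_audio is deliberately wrong to show a reported problem.
const sample = {
  entries: [
    { id: "fr_f_siwis", kokoro_voice: "ff_siwis", gender: "female", age: "adult", lang: "fr",
      label: "Siwis (FR)", ref_audio: "clips/fr_f_siwis.wav",
      ref_text: "L'univers est toujours plus étrange qu'on ne le croit." },
    { id: "m_santa", kokoro_voice: "am_santa", gender: "male", age: "old", lang: "fr",
      label: "Santa", ref_audio: "clips/m_eric.wav",
      ref_text: "L'univers est toujours plus étrange qu'on ne le croit." },
  ],
};

const problems = checkVoicebank(sample);
```

A check like this could run in a test or at startup; it does not verify that the clip files exist on disk or that `ref_text` matches what is spoken in them.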