espnet_draft_home_page

Draft for the espnet home page project. Page: https://masao-someki.github.io/espnet_draft_home_page/

https://github.com/masao-someki/espnet_draft_home_page

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic links in README
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (0.4%) to scientific vocabulary
Last synced: 10 months ago · JSON representation ·

Repository

Draft for the espnet home page project. Page: https://masao-someki.github.io/espnet_draft_home_page/

Basic Info
  • Host: GitHub
  • Owner: Masao-Someki
  • Language: HTML
  • Default Branch: master
  • Homepage:
  • Size: 300 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Citation

Owner

  • Name: Masao Someki
  • Login: Masao-Someki
  • Kind: user
  • Location: Tokyo

hobby programmer

Citation (citations.html)

<!doctype html>
<html lang="en-US" data-theme="light">
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width,initial-scale=1" />
    <meta name="generator" content="VuePress 2.0.0-rc.14" />
    <meta name="theme" content="VuePress Theme Hope 2.0.0-rc.51" />
    <style>
      html {
        background: var(--bg-color, #fff);
      }

      html[data-theme="dark"] {
        background: var(--bg-color, #1d1e1f);
      }

      body {
        background: var(--bg-color);
      }
    </style>
    <script>
      const userMode = localStorage.getItem("vuepress-theme-hope-scheme");
      const systemDarkMode =
        window.matchMedia &&
        window.matchMedia("(prefers-color-scheme: dark)").matches;

      if (userMode === "dark" || (userMode !== "light" && systemDarkMode)) {
        document.documentElement.setAttribute("data-theme", "dark");
      }
    </script>
    <link rel="icon" href="/espnet_draft_home_page/assets/image/espnet.png"><title>Additional citations</title><meta name="description" content="A documentation for ESPnet">
    <link rel="preload" href="/espnet_draft_home_page/assets/style-CiXYLHjk.css" as="style"><link rel="stylesheet" href="/espnet_draft_home_page/assets/style-CiXYLHjk.css">
    <link rel="modulepreload" href="/espnet_draft_home_page/assets/app-DtMTUqbx.js">
    
  </head>
  <body>
    <div id="app"><!--[--><!--[--><!--[--><span tabindex="-1"></span><a href="#main-content" class="vp-skip-link sr-only">Skip to main content</a><!--]--><!--[--><div class="theme-container external-link-icon"><!--[--><header id="navbar" class="vp-navbar"><div class="vp-navbar-start"><button type="button" class="vp-toggle-sidebar-button" title="Toggle Sidebar"><span class="icon"></span></button><!----><!--[--><a class="route-link vp-brand" href="/espnet_draft_home_page/"><img class="vp-nav-logo" src="/espnet_draft_home_page/assets/image/espnet_logo1.png" alt><!----><!----></a><!--]--><!----></div><div class="vp-navbar-center"><!----><!--[--><nav class="vp-nav-links"><div class="vp-nav-item hide-in-mobile"><div class="vp-dropdown-wrapper"><button type="button" class="vp-dropdown-title" aria-label="Demos"><!--[--><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:laptop-code" width="1em" height="1em"></iconify-icon>Demos<!--]--><span class="arrow"></span><ul class="vp-dropdown"><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/notebook/" aria-label="Roadmap"><!---->Roadmap<!----></a></li><li class="vp-dropdown-item"><h4 class="vp-dropdown-subtitle">ESPnet2</h4><ul class="vp-dropdown-subitems"><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/notebook/ESPnet2/Demo/" aria-label="Demo"><!---->Demo<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/notebook/ESPnet2/Course/" aria-label="Course"><!---->Course<!----></a></li></ul></li><li class="vp-dropdown-item"><h4 class="vp-dropdown-subtitle">ESPnet-EZ</h4><ul class="vp-dropdown-subitems"><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/notebook/ESPnetEZ/" aria-label="ESPnet EZ"><!---->ESPnet EZ<!----></a></li></ul></li><li class="vp-dropdown-item"><h4 class="vp-dropdown-subtitle">ESPnet1 (Legacy)</h4><ul class="vp-dropdown-subitems"><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/notebook/ESPnet1/" aria-label="ESPnet1"><!---->ESPnet1<!----></a></li></ul></li></ul></button></div></div><div class="vp-nav-item hide-in-mobile"><div class="vp-dropdown-wrapper"><button type="button" class="vp-dropdown-title" aria-label="Recipes"><!--[--><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:mug-hot" width="1em" height="1em"></iconify-icon>Recipes<!--]--><span class="arrow"></span><ul class="vp-dropdown"><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/" aria-label="What is a recipe template?"><!---->What is a recipe template?<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/asr1.html" aria-label="Automatic Speech Recognition (Multi-tasking)"><!---->Automatic Speech Recognition (Multi-tasking)<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/asr2.html" aria-label="Automatic Speech Recognition with Discrete Units"><!---->Automatic Speech Recognition with Discrete Units<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/asvspoof1.html" aria-label="Speaker Verification Spoofing and Countermeasures"><!---->Speaker Verification Spoofing and Countermeasures<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/codec1.html" aria-label="Speech Codec"><!---->Speech Codec<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/diar1.html" aria-label="Speaker Diarisation"><!---->Speaker Diarisation<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/enh1.html" aria-label="Speech Enhancement"><!---->Speech Enhancement<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/enh_asr1.html" aria-label="Speech Recognition with Speech Enhancement"><!---->Speech Recognition with Speech Enhancement<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/enh_diar1.html" aria-label="Speaker Diarisation with Speech Enhancement"><!---->Speaker Diarisation with Speech Enhancement<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/enh_st1.html" aria-label="Speech-to-Text Translation with Speech Enhancement"><!---->Speech-to-Text Translation with Speech Enhancement<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/lm1.html" aria-label="Language Modeling"><!---->Language Modeling<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/mt1.html" aria-label="Machine Translation"><!---->Machine Translation<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/s2st1.html" aria-label="Speech-to-Speech Translation"><!---->Speech-to-Speech Translation<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/s2t1.html" aria-label="Weakly-supervised Learning (Speech-to-Text)"><!---->Weakly-supervised Learning (Speech-to-Text)<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/slu1.html" aria-label="Spoken Language Understanding"><!---->Spoken Language Understanding<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/speechlm1.html" aria-label="Speech Language Model"><!---->Speech Language Model<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/spk1.html" aria-label="Speaker Representation"><!---->Speaker Representation<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/ssl1.html" aria-label="Self-supervised Learning"><!---->Self-supervised Learning<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/st1.html" aria-label="Speech-to-Text Translation"><!---->Speech-to-Text Translation<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/svs1.html" aria-label="Singing Voice Synthesis"><!---->Singing Voice Synthesis<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/tts1.html" aria-label="Text-to-Speech"><!---->Text-to-Speech<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/tts2.html" aria-label="Text-to-Speech with Discrete Units"><!---->Text-to-Speech with Discrete Units<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/recipe/uasr1.html" aria-label="Unsupervised Automatic Speech Recognition"><!---->Unsupervised Automatic Speech Recognition<!----></a></li></ul></button></div></div><div class="vp-nav-item hide-in-mobile"><div class="vp-dropdown-wrapper"><button type="button" class="vp-dropdown-title" aria-label="Python API"><!--[--><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:book" width="1em" height="1em"></iconify-icon>Python API<!--]--><span class="arrow"></span><ul class="vp-dropdown"><li class="vp-dropdown-item"><h4 class="vp-dropdown-subtitle">espnet</h4><ul class="vp-dropdown-subitems"><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/asr/" aria-label="asr"><!---->asr<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/distributed/" aria-label="distributed"><!---->distributed<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/lm/" aria-label="lm"><!---->lm<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/mt/" aria-label="mt"><!---->mt<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/nets/" aria-label="nets"><!---->nets<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/optimizer/" aria-label="optimizer"><!---->optimizer<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/scheduler/" aria-label="scheduler"><!---->scheduler<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/st/" aria-label="st"><!---->st<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/transform/" aria-label="transform"><!---->transform<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/tts/" aria-label="tts"><!---->tts<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/utils/" aria-label="utils"><!---->utils<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet/vc/" aria-label="vc"><!---->vc<!----></a></li></ul></li><li class="vp-dropdown-item"><h4 class="vp-dropdown-subtitle">espnet2</h4><ul class="vp-dropdown-subitems"><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/asr/" aria-label="asr"><!---->asr<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/asr_transducer/" aria-label="asr_transducer"><!---->asr_transducer<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/asvspoof/" aria-label="asvspoof"><!---->asvspoof<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/diar/" aria-label="diar"><!---->diar<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/enh/" aria-label="enh"><!---->enh<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/fileio/" aria-label="fileio"><!---->fileio<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/fst/" aria-label="fst"><!---->fst<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/gan_codec/" aria-label="gan_codec"><!---->gan_codec<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/gan_svs/" aria-label="gan_svs"><!---->gan_svs<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/gan_tts/" aria-label="gan_tts"><!---->gan_tts<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/hubert/" aria-label="hubert"><!---->hubert<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/iterators/" aria-label="iterators"><!---->iterators<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/layers/" aria-label="layers"><!---->layers<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/lm/" aria-label="lm"><!---->lm<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/main_funcs/" aria-label="main_funcs"><!---->main_funcs<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/mt/" aria-label="mt"><!---->mt<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/optimizers/" aria-label="optimizers"><!---->optimizers<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/s2st/" aria-label="s2st"><!---->s2st<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/s2t/" aria-label="s2t"><!---->s2t<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/samplers/" aria-label="samplers"><!---->samplers<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/schedulers/" aria-label="schedulers"><!---->schedulers<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/slu/" aria-label="slu"><!---->slu<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/speechlm/" aria-label="speechlm"><!---->speechlm<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/spk/" aria-label="spk"><!---->spk<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/st/" aria-label="st"><!---->st<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/svs/" aria-label="svs"><!---->svs<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/tasks/" aria-label="tasks"><!---->tasks<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/text/" aria-label="text"><!---->text<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/torch_utils/" aria-label="torch_utils"><!---->torch_utils<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/train/" aria-label="train"><!---->train<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/tts/" aria-label="tts"><!---->tts<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/tts2/" aria-label="tts2"><!---->tts2<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/uasr/" aria-label="uasr"><!---->uasr<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnet2/utils/" aria-label="utils"><!---->utils<!----></a></li></ul></li><li class="vp-dropdown-item"><h4 class="vp-dropdown-subtitle">espnetez</h4><ul class="vp-dropdown-subitems"><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/config/" aria-label="config"><!---->config<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/data/" aria-label="data"><!---->data<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/dataloader/" aria-label="dataloader"><!---->dataloader<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/dataset/" aria-label="dataset"><!---->dataset<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/preprocess/" aria-label="preprocess"><!---->preprocess<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/task/" aria-label="task"><!---->task<!----></a></li><li class="vp-dropdown-subitem"><a class="route-link auto-link" href="/espnet_draft_home_page/guide/espnetez/trainer/" aria-label="trainer"><!---->trainer<!----></a></li></ul></li></ul></button></div></div><div class="vp-nav-item hide-in-mobile"><div class="vp-dropdown-wrapper"><button type="button" class="vp-dropdown-title" aria-label="Shell API"><!--[--><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:wrench" width="1em" height="1em"></iconify-icon>Shell API<!--]--><span class="arrow"></span><ul class="vp-dropdown"><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/tools/espnet2_bin/" aria-label="espnet2_bin"><!---->espnet2_bin<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/tools/espnet_bin/" aria-label="espnet_bin"><!---->espnet_bin<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/tools/spm/" aria-label="spm"><!---->spm<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/tools/utils/" aria-label="utils"><!---->utils<!----></a></li><li class="vp-dropdown-item"><a class="route-link auto-link" href="/espnet_draft_home_page/tools/utils_py/" aria-label="utils_py"><!---->utils_py<!----></a></li></ul></button></div></div></nav><!--]--><!----></div><div class="vp-navbar-end"><!----><!--[--><!----><div class="vp-nav-item vp-action"><a class="vp-action-link" href="https://github.com/espnet/espnet" target="_blank" rel="noopener noreferrer" aria-label="GitHub"><svg xmlns="http://www.w3.org/2000/svg" class="icon github-icon" viewBox="0 0 1024 1024" fill="currentColor" aria-label="github icon" name="github" style="width:1.25rem;height:1.25rem;vertical-align:middle;"><path d="M511.957 21.333C241.024 21.333 21.333 240.981 21.333 512c0 216.832 140.544 400.725 335.574 465.664 24.49 4.395 32.256-10.07 32.256-23.083 0-11.69.256-44.245 0-85.205-136.448 29.61-164.736-64.64-164.736-64.64-22.315-56.704-54.4-71.765-54.4-71.765-44.587-30.464 3.285-29.824 3.285-29.824 49.195 3.413 75.179 50.517 75.179 50.517 43.776 75.008 114.816 53.333 142.762 40.79 4.523-31.66 17.152-53.377 31.19-65.537-108.971-12.458-223.488-54.485-223.488-242.602 0-53.547 19.114-97.323 50.517-131.67-5.035-12.33-21.93-62.293 4.779-129.834 0 0 41.258-13.184 134.912 50.346a469.803 469.803 0 0 1 122.88-16.554c41.642.213 83.626 5.632 122.88 16.554 93.653-63.488 134.784-50.346 134.784-50.346 26.752 67.541 9.898 117.504 4.864 129.834 31.402 34.347 50.474 78.123 50.474 131.67 0 188.586-114.73 230.016-224.042 242.09 17.578 15.232 33.578 44.672 33.578 90.454v135.85c0 13.142 7.936 27.606 32.854 22.87C862.25 912.597 1002.667 728.747 1002.667 512c0-271.019-219.648-490.667-490.71-490.667z"></path></svg></a></div><!----><!--[--><button type="button" class="search-pro-button" aria-label="Search"><svg xmlns="http://www.w3.org/2000/svg" class="icon search-icon" viewBox="0 0 1024 1024" fill="currentColor" aria-label="search icon" name="search"><path d="M192 480a256 256 0 1 1 512 0 256 256 0 0 1-512 0m631.776 362.496-143.2-143.168A318.464 318.464 0 0 0 768 480c0-176.736-143.264-320-320-320S128 303.264 128 480s143.264 320 320 320a318.016 318.016 0 0 0 184.16-58.592l146.336 146.368c12.512 12.48 32.768 12.48 45.28 0 12.48-12.512 12.48-32.768 0-45.28"></path></svg><div class="search-pro-placeholder">Search</div><div class="search-pro-key-hints"><kbd class="search-pro-key">Ctrl</kbd><kbd class="search-pro-key">K</kbd></div></button><!--]--><!--]--><!----><button type="button" class="vp-toggle-navbar-button" aria-label="Toggle Navbar" aria-expanded="false" aria-controls="nav-screen"><span><span class="vp-top"></span><span class="vp-middle"></span><span class="vp-bottom"></span></span></button></div></header><!----><!--]--><!----><div class="toggle-sidebar-wrapper"><span class="arrow start"></span></div><aside id="sidebar" class="vp-sidebar"><!----><ul class="vp-sidebar-links"><li><section class="vp-sidebar-group"><p class="vp-sidebar-header"><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:laptop-code" width="1em" height="1em"></iconify-icon><span class="vp-sidebar-title">Demos</span><!----></p><ul class="vp-sidebar-links"><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/notebook/" aria-label="ESPnet Notebooks"><!---->ESPnet Notebooks<!----></a></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">ESPnet EZ</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">ESPnet1</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">ESPnet2</span><span class="vp-arrow end"></span></button><!----></section></li></ul></section></li><li><section class="vp-sidebar-group"><p class="vp-sidebar-header"><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:mug-hot" width="1em" height="1em"></iconify-icon><span class="vp-sidebar-title">Recipes</span><!----></p><ul class="vp-sidebar-links"><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/" aria-label="Recipe Template"><!---->Recipe Template<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/asr1.html" aria-label="Automatic Speech Recognition (Multi-tasking)"><!---->Automatic Speech Recognition (Multi-tasking)<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/asr2.html" aria-label="Automatic Speech Recognition with Discrete Units"><!---->Automatic Speech Recognition with Discrete Units<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/lm1.html" aria-label="Language Modeling"><!---->Language Modeling<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/mt1.html" aria-label="Machine Translation"><!---->Machine Translation<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/ssl1.html" aria-label="Self-supervised Learning"><!---->Self-supervised Learning<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/svs1.html" aria-label="Singing Voice Synthesis"><!---->Singing Voice Synthesis<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/diar1.html" aria-label="Speaker Diarisation"><!---->Speaker Diarisation<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/enh_diar1.html" aria-label="Speaker Diarisation with Speech Enhancement"><!---->Speaker Diarisation with Speech Enhancement<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/spk1.html" aria-label="Speaker Representation"><!---->Speaker Representation<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/asvspoof1.html" aria-label="Speaker Verification Spoofing and Countermeasures"><!---->Speaker Verification Spoofing and Countermeasures<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/codec1.html" aria-label="Speech Codec"><!---->Speech Codec<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/enh1.html" aria-label="Speech Enhancement"><!---->Speech Enhancement<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/speechlm1.html" aria-label="Speech Language Model"><!---->Speech Language Model<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/enh_asr1.html" aria-label="Speech Recognition with Speech Enhancement"><!---->Speech Recognition with Speech Enhancement<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/s2st1.html" aria-label="Speech-to-Speech Translation"><!---->Speech-to-Speech Translation<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/st1.html" aria-label="Speech-to-Text Translation"><!---->Speech-to-Text Translation<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/enh_st1.html" aria-label="Speech-to-Text Translation with Speech Enhancement"><!---->Speech-to-Text Translation with Speech Enhancement<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/slu1.html" aria-label="Spoken Language Understanding"><!---->Spoken Language Understanding<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/tts1.html" aria-label="Text-to-Speech"><!---->Text-to-Speech<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/tts2.html" aria-label="Text-to-Speech with Discrete Units"><!---->Text-to-Speech with Discrete Units<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/uasr1.html" aria-label="Unsupervised Automatic Speech Recognition"><!---->Unsupervised Automatic Speech Recognition<!----></a></li><li><a class="route-link auto-link vp-sidebar-link" href="/espnet_draft_home_page/recipe/s2t1.html" aria-label="Weakly-supervised Learning (Speech-to-Text)"><!---->Weakly-supervised Learning (Speech-to-Text)<!----></a></li></ul></section></li><li><section class="vp-sidebar-group"><p class="vp-sidebar-header"><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:book" width="1em" height="1em"></iconify-icon><span class="vp-sidebar-title">Python API</span><!----></p><ul class="vp-sidebar-links"><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Espnet</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Espnet2</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Espnetez</span><span class="vp-arrow end"></span></button><!----></section></li></ul></section></li><li><section class="vp-sidebar-group"><p class="vp-sidebar-header"><iconify-icon class="font-icon icon" style="" mode="style" inline icon="fa-solid:wrench" width="1em" height="1em"></iconify-icon><span class="vp-sidebar-title">Shell API</span><!----></p><ul class="vp-sidebar-links"><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Espnet Bin</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Espnet2 Bin</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Spm</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Utils</span><span class="vp-arrow end"></span></button><!----></section></li><li><section class="vp-sidebar-group"><button class="vp-sidebar-header clickable" type="button"><!----><span class="vp-sidebar-title">Utils Py</span><span class="vp-arrow end"></span></button><!----></section></li></ul></section></li></ul><!----></aside><!--[--><main id="main-content" class="vp-page"><!--[--><!----><!----><nav class="vp-breadcrumb disable"></nav><div class="vp-page-title"><h1><!---->Additional citations</h1><div class="page-info"><!----><!----><!----><!----><span class="page-reading-time-info" aria-label="Reading Time⌛" data-balloon-pos="up"><svg xmlns="http://www.w3.org/2000/svg" class="icon timer-icon" viewBox="0 0 1024 1024" fill="currentColor" aria-label="timer icon" name="timer"><path d="M799.387 122.15c4.402-2.978 7.38-7.897 7.38-13.463v-1.165c0-8.933-7.38-16.312-16.312-16.312H256.33c-8.933 0-16.311 7.38-16.311 16.312v1.165c0 5.825 2.977 10.874 7.637 13.592 4.143 194.44 97.22 354.963 220.201 392.763-122.204 37.542-214.893 196.511-220.2 389.397-4.661 5.049-7.638 11.651-7.638 19.03v5.825h566.49v-5.825c0-7.379-2.849-13.981-7.509-18.9-5.049-193.016-97.867-351.985-220.2-389.527 123.24-37.67 216.446-198.453 220.588-392.892zM531.16 450.445v352.632c117.674 1.553 211.787 40.778 211.787 88.676H304.097c0-48.286 95.149-87.382 213.728-88.676V450.445c-93.077-3.107-167.901-81.297-167.901-177.093 0-8.803 6.99-15.793 15.793-15.793 8.803 0 15.794 6.99 15.794 15.793 0 80.261 63.69 145.635 142.01 145.635s142.011-65.374 142.011-145.635c0-8.803 6.99-15.793 15.794-15.793s15.793 6.99 15.793 15.793c0 95.019-73.789 172.82-165.96 177.093z"></path></svg><span>About 3 min</span><meta property="timeRequired" content="PT3M"></span><!----><!----></div><hr></div><!----><!----><div class="theme-hope-content"><h1 id="additional-citations" tabindex="-1"><a class="header-anchor" href="#additional-citations"><span>Additional citations</span></a></h1><nav class="table-of-contents"><ul><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#espnet-ez" class="router-link-active router-link-exact-active">ESPnet-EZ</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#weakly-supervised-learning" class="router-link-active router-link-exact-active">Weakly-supervised Learning</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#self-supervised-learning" class="router-link-active router-link-exact-active">Self-supervised Learning</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#speaker-embeddings" class="router-link-active router-link-exact-active">Speaker Embeddings</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#tts" class="router-link-active router-link-exact-active">TTS</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#speech-translation" class="router-link-active router-link-exact-active">Speech Translation</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#speech-enhancement" class="router-link-active router-link-exact-active">Speech Enhancement</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#spoken-language-understanding" class="router-link-active router-link-exact-active">Spoken Language Understanding</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#singing-voice-synthesis" class="router-link-active router-link-exact-active">Singing Voice Synthesis</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#unsupervised-asr" class="router-link-active router-link-exact-active">Unsupervised ASR</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#speech-summarization" class="router-link-active router-link-exact-active">Speech summarization</a></li><li><a aria-current="page" href="/espnet_draft_home_page/citations.html#exporting-models-to-onnx" class="router-link-active router-link-exact-active">Exporting models to ONNX</a></li></ul></nav><h2 id="espnet-ez" tabindex="-1"><a class="header-anchor" href="#espnet-ez"><span>ESPnet-EZ</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{hayashi2020espnet,</span></span>
<span class="line"><span>  title={ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration},</span></span>
<span class="line"><span>  author={Masao Someki and Kwanghee Choi and Siddhant Arora and William Chen and Samuele Cornell and Jionghao Han and Yifan Peng and Jiatong Shi and Vaibhav Srivastav and Shinji Watanabe},</span></span>
<span class="line"><span>  booktitle={SLT},</span></span>
<span class="line"><span>  year={2024},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="weakly-supervised-learning" tabindex="-1"><a class="header-anchor" href="#weakly-supervised-learning"><span>Weakly-supervised Learning</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span># OWSM</span></span>
<span class="line"><span>@inproceedings{peng2023reproducing,</span></span>
<span class="line"><span>  title={Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data},</span></span>
<span class="line"><span>  author={Peng, Yifan and Tian, Jinchuan and Yan, Brian and Berrebbi, Dan and Chang, Xuankai and Li, Xinjian and Shi, Jiatong and Arora, Siddhant and Chen, William and Sharma, Roshan and others},</span></span>
<span class="line"><span>  booktitle={ASRU},</span></span>
<span class="line"><span>  year={2023},</span></span>
<span class="line"><span>}</span></span>
<span class="line"><span></span></span>
<span class="line"><span></span></span>
<span class="line"><span># OWSM v3.1</span></span>
<span class="line"><span>@inproceedings{peng2023reproducing,</span></span>
<span class="line"><span>  title={OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer},</span></span>
<span class="line"><span>  author={Peng, Yifan and Tian, Jinchuan and Chen, William and Arora, Siddhant and Yan, Brian and Sudo, Yui and Shakeel, Muhammad and Choi, Kwanghee and Shi, Jiatong and Chang, Xuankai and others},</span></span>
<span class="line"><span>  booktitle={Interspeech},</span></span>
<span class="line"><span>  year={2024},</span></span>
<span class="line"><span>}</span></span>
<span class="line"><span></span></span>
<span class="line"><span># OWSM v3.2</span></span>
<span class="line"><span>@inproceedings{tian2024effects,</span></span>
<span class="line"><span>  title={On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models},</span></span>
<span class="line"><span>  author={Tian, Jinchuan and Peng, Yifan and Chen, William and Choi, Kwanghee and Livescu, Karen and Watanabe, Shinji},</span></span>
<span class="line"><span>  booktitle={Interspeech},</span></span>
<span class="line"><span>  year={2024},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="self-supervised-learning" tabindex="-1"><a class="header-anchor" href="#self-supervised-learning"><span>Self-supervised Learning</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span># HuBERT reproduction</span></span>
<span class="line"><span>@inproceedings{chen2023reducing,</span></span>
<span class="line"><span>  title={Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute},</span></span>
<span class="line"><span>  author={William Chen and Xuankai Chang and Yifan Peng and Zhaoheng Ni and Soumi Maiti and Shinji Watanabe},</span></span>
<span class="line"><span>  booktitle={Interspeech},</span></span>
<span class="line"><span>  year={2023},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="speaker-embeddings" tabindex="-1"><a class="header-anchor" href="#speaker-embeddings"><span>Speaker Embeddings</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{jung2024espnet,</span></span>
<span class="line"><span>  title={ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models},</span></span>
<span class="line"><span>  author={Jung, Jee-weon and Zhang, Wangyou and Shi, Jiatong and Aldeneh, Zakaria and Higuchi, Takuya and Theobald, Barry-John and Abdelaziz, Ahmed Hussen and Watanabe, Shinji},</span></span>
<span class="line"><span>  booktitle={Interspeech},</span></span>
<span class="line"><span>  year={2024},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="tts" tabindex="-1"><a class="header-anchor" href="#tts"><span>TTS</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{hayashi2020espnet,</span></span>
<span class="line"><span>  title={ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit},</span></span>
<span class="line"><span>  author={Hayashi, Tomoki and Yamamoto, Ryuichi and Inoue, Katsuki and Yoshimura, Takenori and Watanabe, Shinji and Toda, Tomoki and Takeda, Kazuya and Zhang, Yu and Tan, Xu},</span></span>
<span class="line"><span>  booktitle={ICASSP},</span></span>
<span class="line"><span>  year={2020},</span></span>
<span class="line"><span>}</span></span>
<span class="line"><span></span></span>
<span class="line"><span>@article{hayashi2021espnet2,</span></span>
<span class="line"><span>  title={ESPnet2-TTS: Extending the Edge of TTS Research},</span></span>
<span class="line"><span>  author={Hayashi, Tomoki and Yamamoto, Ryuichi and Yoshimura, Takenori and Wu, Peter and Shi, Jiatong and Saeki, Takaaki and Ju, Yooncheol and Yasuda, Yusuke and Takamichi, Shinnosuke and Watanabe, Shinji},</span></span>
<span class="line"><span>  journal={arXiv preprint arXiv:2110.07840},</span></span>
<span class="line"><span>  year={2021}</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="speech-translation" tabindex="-1"><a class="header-anchor" href="#speech-translation"><span>Speech Translation</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{inaguma2020espnet,</span></span>
<span class="line"><span>  title={ESPnet-ST: All-in-one speech translation toolkit},</span></span>
<span class="line"><span>  author={Inaguma, Hirofumi and Kiyono, Shun and Duh, Kevin and Karita, Shigeki and Soplin, Nelson Enrique Yalta and Hayashi, Tomoki and Watanabe, Shinji},</span></span>
<span class="line"><span>  booktitle={ACL Demos},</span></span>
<span class="line"><span>  year={2020},</span></span>
<span class="line"><span>}</span></span>
<span class="line"><span></span></span>
<span class="line"><span>@inproceedings{yan2023espnet,</span></span>
<span class="line"><span>  title={ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit},</span></span>
<span class="line"><span>  author={Yan, Brian and Shi, Jiatong and Tang, Yun and Inaguma, Hirofumi and Peng, Yifan and Dalmia, Siddharth and Polak, Peter and Fernandes, Patrick and Berrebbi, Dan and Hayashi, Tomoki and Zhang, Xiaohui and Ni, Zhaoheng and Hira, Moto and Maiti, Soumi and Pino, Juan and Watanabe, Shinji},</span></span>
<span class="line"><span>  booktitle={ACL Demos},</span></span>
<span class="line"><span>  year={2023},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="speech-enhancement" tabindex="-1"><a class="header-anchor" href="#speech-enhancement"><span>Speech Enhancement</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{li2020espnetse,</span></span>
<span class="line"><span>  title={ESPnet-SE: End-to-End Speech Enhancement and Separation Toolkit Designed for ASR Integration},</span></span>
<span class="line"><span>  author={Chenda Li and Jing Shi and Wangyou Zhang and Aswin Shanmugam Subramanian and Xuankai Chang and Naoyuki Kamo and Moto Hira and Tomoki Hayashi and Christoph Boeddeker and Zhuo Chen and Shinji Watanabe},</span></span>
<span class="line"><span>  booktitle={SLT},</span></span>
<span class="line"><span>  year={2021},</span></span>
<span class="line"><span>}</span></span>
<span class="line"><span></span></span>
<span class="line"><span>@inproceedings{lu2022espnetsep,</span></span>
<span class="line"><span>  author={Yen-Ju Lu and Xuankai Chang and Chenda Li and Wangyou Zhang and Samuele Cornell and Zhaoheng Ni and Yoshiki Masuyama and Brian Yan and Robin Scheibler and Zhong-Qiu Wang and Yu Tsao and Yanmin Qian and Shinji Watanabe},</span></span>
<span class="line"><span>  title={ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding},</span></span>
<span class="line"><span>  booktitle={Interspeech},</span></span>
<span class="line"><span>  year={2022},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="spoken-language-understanding" tabindex="-1"><a class="header-anchor" href="#spoken-language-understanding"><span>Spoken Language Understanding</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{arora2021espnet,</span></span>
<span class="line"><span>  title={ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet},</span></span>
<span class="line"><span>  author={Arora, Siddhant and Dalmia, Siddharth and Denisov, Pavel and Chang, Xuankai and Ueda, Yushi and Peng, Yifan and Zhang, Yuekai and Kumar, Sujay and Ganesan, Karthik and Yan, Brian and Vu, Ngoc Thang and Black, Alan W and Watanabe, Shinji},</span></span>
<span class="line"><span>  booktitle={ICASSP},</span></span>
<span class="line"><span>  year={2022},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="singing-voice-synthesis" tabindex="-1"><a class="header-anchor" href="#singing-voice-synthesis"><span>Singing Voice Synthesis</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{shi2022muskits,</span></span>
<span class="line"><span>  author={Shi, Jiatong and Guo, Shuai and Qian, Tao and Huo, Nan and Hayashi, Tomoki and Wu, Yuning and Xu, Frank and Chang, Xuankai and Li, Huazhe and Wu, Peter and Watanabe, Shinji and Jin, Qin},</span></span>
<span class="line"><span>  title={Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis},</span></span>
<span class="line"><span>  booktitle={Interspeech},</span></span>
<span class="line"><span>  year={2022},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="unsupervised-asr" tabindex="-1"><a class="header-anchor" href="#unsupervised-asr"><span>Unsupervised ASR</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{gao2022euro,</span></span>
<span class="line"><span>  title={EURO: ESPnet Unsupervised ASR Open-source Toolkit},</span></span>
<span class="line"><span>  author={Gao, Dongji and Shi, Jiatong and Chuang, Shun-Po and Garcia, Leibny Paola and Lee, Hung-yi and Watanabe, Shinji and Khudanpur, Sanjeev},</span></span>
<span class="line"><span>  booktitle={ICASSP},</span></span>
<span class="line"><span>  year={2023}</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="speech-summarization" tabindex="-1"><a class="header-anchor" href="#speech-summarization"><span>Speech summarization</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{sharma2023espnet,</span></span>
<span class="line"><span>  title={Espnet-Summ: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems},</span></span>
<span class="line"><span>  author={Sharma, Roshan and Chen, William and Kano, Takatomo and Sharma, Ruchira and Arora, Siddhant and Watanabe, Shinji and Ogawa, Atsunori and Delcroix, Marc and Singh, Rita and Raj, Bhiksha},</span></span>
<span class="line"><span>  booktitle={ASRU},</span></span>
<span class="line"><span>  year={2023},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div><h2 id="exporting-models-to-onnx" tabindex="-1"><a class="header-anchor" href="#exporting-models-to-onnx"><span>Exporting models to ONNX</span></a></h2><div class="language- line-numbers-mode" data-highlighter="shiki" data-ext="" data-title="" style="--shiki-light:#24292e;--shiki-dark:#abb2bf;--shiki-light-bg:#fff;--shiki-dark-bg:#282c34;"><pre class="shiki shiki-themes github-light one-dark-pro vp-code"><code><span class="line"><span>@inproceedings{someki2022espnet,</span></span>
<span class="line"><span>  title={ESPnet-ONNX: Bridging a Gap Between Research and Production},</span></span>
<span class="line"><span>  author={Someki, Masao and Higuchi, Yosuke and Hayashi, Tomoki and Watanabe, Shinji},</span></span>
<span class="line"><span>  booktitle={APSIPA ASC},</span></span>
<span class="line"><span>  year={2022},</span></span>
<span class="line"><span>}</span></span></code></pre><div class="line-numbers" aria-hidden="true" style="counter-reset:line-number 0;"><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div><div class="line-number"></div></div></div></div><!----><footer class="vp-page-meta"><!----><div class="vp-meta-item git-info"><!----><!----></div></footer><!----><!----><!----><!--]--></main><!--]--><footer class="vp-footer-wrapper"><div class="vp-footer">Copyright © 2024 ESPnet Community. All rights reserved.</div><!----></footer></div><!--]--><!--]--><!--[--><!----><!----><!--]--><!--]--></div>
    <script type="module" src="/espnet_draft_home_page/assets/app-DtMTUqbx.js" defer></script>
  </body>
</html>

GitHub Events

Total
  • Push event: 3
Last Year
  • Push event: 3