ARIKI Yasuo
Research Center for Urban Safety and Security
Research Fellow

Researcher basic information

■ Research Keyword
  • Dialogue systems
  • Natural language processing
  • Speech recognition
  • Image recognition
  • Video processing
  • Disaster information system
■ Research Areas
  • Informatics / Intelligent robotics
  • Informatics / Perceptual information processing
■ Committee History
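  • Apr. 2001 - Mar. 2016, Acoustical Society of Japan, Councilor
  • Apr. 2000 - Mar. 2016, IEICE (Institute of Electronics, Information and Communication Engineers), Textbook Committee member
  • Apr. 2012 - Mar. 2013, Acoustical Society of Japan, Kansai Chapter Chair
  • Apr. 2011 - Mar. 2012, Acoustical Society of Japan, Kansai Chapter Vice Chair
  • Apr. 2008 - Mar. 2010, IEICE, Speech Technical Committee Chair
  • Apr. 1994 - Sep. 2003, Information Processing Society of Japan, Spoken Language Processing research liaison group, liaison member
  • May 1998 - Apr. 2000, IEICE, Speech Technical Committee Vice Chair
  • May 1996 - Apr. 2000, IEICE, Transactions D-II Editorial Committee member
  • May 1996 - Apr. 1999, IEICE, Pattern Recognition and Media Understanding Technical Committee member
  • May 1993 - Apr. 1998, IEICE, Speech Technical Committee member
  • Apr. 1995 - Mar. 1997, Institute of Image Electronics Engineers of Japan, Regional Director
  • Apr. 1992 - Mar. 1994, Acoustical Society of Japan, Kansai Chapter Councilor
  • Apr. 1991 - Mar. 1993, Information Processing Society of Japan, Kansai Chapter Secretary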

Research activity information

■ Award
  • 2015 電子情報通信学会, 電子情報通信学会 PRMU研究会ポスター賞, 視覚障碍者のための一人称ビジョンを用いた交差点上の自己位置・進行方向推定
    KAWAGUCHI SATOSHI, ENAMI NAOKO, ARIKI YASUO
    Japan society

  • Feb. 2014 電子情報通信学会, 電子情報通信学会 PRMU研究会ポスター賞, コンテクストに基づくChannel特徴を用いた歩行者検出
    髙柳 陽平, ENAMI NAOKO, ARIKI YASUO
    Japan society

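  • Nov. 2009 IEICE (Institute of Electronics, Information and Communication Engineers), Fellow, Pioneering research on the integrated processing of speech and image information
    ARIKI Yasuo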

  • Aug. 2009 International Conference on Multimedia, Information Technology and its Applications, Distinguished Paper Award, Generic Object Recognition using CRF by Incorporating BoF as Global Features
    OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo

  • Jun. 2008 IEEE ICME, IEEE ICME 2008 The Best Paper Award, GRAPH CUTS BY USING LOCAL TEXTURE FEATURES OF WAVELET COEFFICIENT FOR IMAGE SEGMENTATION
    Fukuda Keita, Takiguchi Tetsuya, Ariki Yasuo

  • May 2002 電子情報通信学会オフィス研究会, オフィス研究賞, アクティブ探索を用いた映像編集支援のためのショットサイズ自動判定
    熊野雅仁, 林義文, ARIKI YASUO, UEHARA KUNIAKI, 下條真司, 春藤憲司, 塚田清志
    Japan society

■ Paper
  • ITO Ryosuke, TAKIGUCHI Tetsuya, HIRATA Mitsuhiro, MORI Yumiko, HOTTA Satoko, ARIKI Yasuo
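    In counseling about personal worries, the "awareness-promoting questions" asked by the listener play a very important role for the client. Through such questions, the client reflects deeply on their own inner state and gains new perspectives, which deepens their understanding of the problem and encourages self-directed action more than simply being offered a solution. However, giving a dialogue system this awareness-promoting capability is not easy, because the process of eliciting awareness differs with the type and cause of the worry and requires complex reasoning. This paper addresses the problem by proposing an attentive-listening dialogue system that promotes awareness in clients with diverse worries by using Strategic Chain-of-Thought, which performs reasoning based on strategic knowledge generated by a large language model. In experiments, we verify the usefulness of the dialogue system using worries about child-rearing as the subject.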
    The Japanese Society for Artificial Intelligence, Nov. 2024, JSAI Technical Report, SIG-SLUD, 102, 80 - 85, Japanese

  • Weihao Zhuang, Tristan Hascoet, Xunquan Chen, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Now Publishers, 2023, APSIPA Transactions on Signal and Information Processing, 12(1)
    Scientific journal

  • Tristan Hascoet, Quentin Febvre, Weihao Zhuang, Yasuo Ariki, Tetsuya Takiguchi
    2023, EURASIP J. Image Video Process., 2023(1), 1 - 1
    Scientific journal

  • Xue Qiang, Takiguchi Tetsuya, Ariki Yasuo
    Generation-based dialogue systems tend to produce generic responses. To improve response diversity, a response retrieved by a retrieval-based model can be fed to the generation-based model as a reference response, so that the generation-based model can produce highly diverse responses. However, prior work shows that generation-based dialogue systems often ignore the reference response, producing responses unrelated to it. In this work, we propose the Dialogue-Filling method, which can utilize 100% of the reference response text by masking the response sentences with a text-filling technique. We built variants of the Dialogue-Filling method with the DialoGPT model. Experiments on the DailyDialog dataset demonstrate that our Dialogue-Filling method outperforms the baseline method on the dialogue generation task.
    The Japanese Society for Artificial Intelligence, May 2022, Transactions of the Japanese Society for Artificial Intelligence, 37(3), IDS-C_1 - 9, Japanese

  • XUE Qiang, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In recent years, generation-based dialogue systems using state-of-the-art (SoTA) transformer-based models have demonstrated impressive performance in simulating human-like conversations. Many generation-based dialogue systems use sequential generation, which generates response words one at a time from left to right according to the model's output distribution, using a decoding strategy such as greedy decoding. However, it is difficult to control the content of responses generated this way, although parameters such as minimum and maximum length can be controlled. To address this, inspired by Three Topics Talk, a form of impromptu storytelling built on three given topics, we propose a new response generation method that generates the text preceding and following a specified piece of knowledge (topic). The dialogue system using our proposed method has been validated to generate significantly more diverse and correct responses than baseline approaches.
    The Japanese Society for Artificial Intelligence, 2022, Proceedings of the Annual Conference of JSAI, JSAI2022, 3Yin221 - 3Yin221, Japanese

  • Yuki Takashima, Ryoichi Takashima, Ryota Tsunoda, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuaki Motoyama
    Dec. 2021, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021(1), English
    Scientific journal

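  • SARA Kazutaka, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In recent years, there has been active research on giving neural dialogue systems access to external knowledge such as documents and knowledge graphs. However, realizing a dialogue system with this capability requires multiple knowledge-retrieval modules in addition to the usual response generation module, which complicates training and inference for the whole system and increases its total number of parameters. In this study, we therefore propose a framework in which all of these modules use a pre-trained language generation model and can be trained and run in a Text-to-Text manner. By using multi-task learning with Adapter layers, the proposed method can reduce the number of parameters of the whole system. A comparison using automatic evaluation showed that the proposed method generates better responses than a dialogue system trained with a standard Seq2Seq approach.
    The Japanese Society for Artificial Intelligence, Nov. 2021, JSAI Technical Report, SIG-SLUD, 93, 44 - 49, Japanese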

  • Kazuaki Furumai, Tetsuya Takiguchi, Yasuo Ariki
    Springer Science and Business Media Deutschland GmbH, 2021, Lecture Notes in Electrical Engineering, 714, 267 - 275, English
    In book

  • ASO Taisei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The Japanese Society for Artificial Intelligence, Nov. 2020, JSAI Technical Report, SIG-SLUD, 90, 11, Japanese

  • SARA Kazutaka, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The Japanese Society for Artificial Intelligence, Nov. 2020, JSAI Technical Report, SIG-SLUD, 90, 06, Japanese

  • Weihao Zhuang, Tristan Hascoet, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Institute of Electrical and Electronics Engineers Inc., Oct. 2020, 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020, 893 - 894, English
    International conference proceedings

  • Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Institute of Electrical and Electronics Engineers Inc., May 2020, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2020-, 6104 - 6108, English
    International conference proceedings

  • Yuki Takashima, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    International Speech Communication Association, 2020, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2020-, 4796 - 4800, English
    International conference proceedings

  • Tristan Hascoet, Yihao Zhang, Andreas Persch, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    IEEE, 2020, 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 5545 - 5552
    International conference proceedings

  • FURUMAI Kazuaki, ARIKI Yasuo, TAKIGUCHI Tetsuya
    The Japanese Society for Artificial Intelligence, Nov. 2019, JSAI Technical Report, SIG-SLUD, 87, 25, Japanese

  • Tristan Hascoet, Xuejiao Deng, Kiyoto Tai, Yuji Adachi, Sachiko Nakamura, Tomoko Hayashi, Mari Sugiyama, Yasuo Ariki, Tetsuya Takiguchi
    Institute of Electrical and Electronics Engineers Inc., Oct. 2019, Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019, 4216 - 4225, English
    International conference proceedings

  • Tristan Hascoet, Quentin Febvre, Weihao Zhuang, Yasuo Ariki, Tetsuya Takiguchi
    Institute of Electrical and Electronics Engineers Inc., Oct. 2019, Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019, 2049 - 2052, English
    International conference proceedings

  • Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    Oct. 2019, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 27(10), 1535 - 1548, English
    [Refereed]
    Scientific journal

  • Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Springer, Aug. 2019, EURASIP Journal on Audio, Speech, and Music Processing, DOI: 10.1186/s13636-019-0160-1, 1 - 11, English
    [Refereed]
    Scientific journal

  • Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi
    IEEE Computer Society, Jun. 2019, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-, 9545 - 9553, English
    International conference proceedings

  • Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Institute of Electrical and Electronics Engineers Inc., May 2019, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2019-, 6395 - 6399, English
    International conference proceedings

  • KATAGIRI Keiko, TAKIGUCHI Tetsuya, MATSUYOSHI Yuki, ARIKI Yasuo, TAKI Kazuo
    言語処理学会, Mar. 2019, 言語処理学会 第25回年次大会 発表論文集, 1133 - 1136, Japanese
    [Refereed]
    Scientific journal

  • 複数データベースを使用したend-to-end構音障害者音声認識
    TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 日本音響学会2019年春季研究発表会講演論文集, 869 - 872, Japanese
    Research society

  • 議論システムにおける言語モデルを用いた賛成/反対意見の自動生成手法の検討
    FURUMAI Kazuaki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 日本音響学会2019年春季研究発表会講演論文集, 957 - 960, Japanese
    Research society

  • ユーザーの発話意図理解に基づくインタビュー発話の生成
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 日本音響学会2019年春季研究発表会講演論文集, 963 - 966, Japanese
    Research society

  • マルチタスク学習による雑談対話システムへの知識付与
    ASO Taisei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 日本音響学会2019年春季研究発表会講演論文集, 961 - 962, Japanese
    Research society

  • ゼロショット学習を用いた一般物体セグメンテーション
    TANIDA Keiichi, Tristan Hascoet, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 情報処理学会第81回全国大会講演論文集, 549 - 550, Japanese
    Research society

  • Speech Prosody Conversion using Sequence Generative Adversarial Nets with Continuous Wavelet Transform F0 features
    Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2019, 日本音響学会2019年春季研究発表会講演論文集, 1125 - 1128, English
    Research society

  • Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2019, APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 8, English
    [Refereed]
    Scientific journal

  • End-to-end構音障害者音声認識のための複数データベースを用いたデータ拡張
    TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 電子情報通信学会技術研究報告, 118(497), 335 - 340, Japanese
    Symposium

  • Affinity graphを用いた神経細胞画像セグメンテーション
    KOYAMA Emi, Tristan Hascoet, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2019, 情報処理学会第81回全国大会講演論文集, 543 - 544, Japanese
    Research society

  • Exemplar-based Lip-to-Speech Synthesis Using Convolutional Neural Networks
    Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2019, International Workshop on Frontiers of Computer Vision, English
    [Refereed]
    International conference proceedings

  • Entropy policy for supervoxel agglomeration of neurite segmentation
    Tristan Hascoet, Baptiste Metge, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2019, International Workshop on Frontiers of Computer Vision, English
    [Refereed]
    International conference proceedings

  • Yuki Takashima, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Institute of Electrical and Electronics Engineers (IEEE), 2019, IEEE Access, 7, 164320 - 164326
    [Refereed]
    Scientific journal

  • Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    2019, IEEE ACM Trans. Audio Speech Lang. Process., 27(10), 1535 - 1548
    [Refereed]
    Scientific journal

  • Semantic embeddings of generic objects for zero-shot learning
    Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi
    Jan. 2019, EURASIP Journal on Image and Video Processing, English
    [Refereed]
    Scientific journal

  • Investigation of Brain Magnetic Fields Associated with Sound Imagery : Speech and Pure Tone with Similar Envelopes
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, SOETA Yoshiharu, NAKAGAWA Seiji
    電子情報通信学会, Mar. 2018, 電子情報通信学会技術研究報告, 117(517), 81 - 86, Japanese
    Symposium

  • 非負値タッカー分解によるNMF辞書学習に基づく非パラレル声質変換
    TAKASHIMA Yuki, YANO Hajime, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 211 - 214, Japanese
    Research society

  • 非負値行列因子分解を用いた脳磁界データから音声の復元
    YANO Saori, TAKIGUCHI Tetsuya, ARIKI Yasuo, SOETA Yoshiharu, NAKAGAWA Seiji
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 2018, 305 - 308, Japanese
    Research society

  • 単語の分散表現を用いた意味予測に基づく雑談応答生成
    FURUMAI Kazuaki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 169 - 172, Japanese
    Research society

  • 構音障害者を対象としたDNN音声合成に関する言語特徴量の検討
    KITAMURA Tsuyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 267 - 270, Japanese
    Research society

  • 構音障害者の少量学習データによる音声合成の検討
    NANZAKA Ryuka, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 275 - 278, Japanese
    Research society

  • 顔画像特徴量を用いた統計的手法によるF0推定
    RA Rina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 309 - 312, Japanese
    Research society

  • 音想起に伴う脳磁界反応:等しいエンベロープをもつ音声と純音の比較
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, SOETA Yoshiharu, NAKAGAWA Seiji
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 2018, 1291 - 1294, Japanese
    Research society

  • 音声明瞭度に関連した大脳皮質活動の時空間的遷移
    SAGA Naoki, YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, SOETA Yoshiharu, NAKAGAWA Seiji
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 2018, 1329 - 1332, Japanese
    Research society

  • ハイスピード映像からの音源復元のための物体振動抽出手法の検討
    YASUMI Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 597 - 600, Japanese
    Research society

  • ニュース情報検索システム「NetTv」のための議論対話システムー賛否判定と根拠推定に基づく議論ー
    MARUMOTO Rikito, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 201 - 204, Japanese
    Research society

  • エアコン音の聴感印象推定のためのコヒーレンス解析に基づく脳活動特徴量抽出
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 2018, 755 - 758, Japanese
    Research society

  • Visually grounded word embeddings for zero-shot learning of visual categories
    Hascoet Tristan, Yasuo Ariki, Tetsuya Takiguchi
    Mar. 2018, IPSJ SIG-CVIM, 1 - 4, English
    Symposium

  • LipNet構造を用いた唇画像から音声への変換
    ITOH Daiki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 347 - 350, Japanese
    Research society

  • EMOTIONAL VOICE CONVERSION WITH WAVELET TRANSFORM USING DUAL SUPERVISED ADVERSARIAL NETWORKS
    Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 365 - 368, English
    Research society

  • Convolutional Neural Networksによる物体の微小振動からの音声復元
    FUSE Yohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 593 - 596, Japanese
    Research society

  • Attention-based LSTMを用いた音声質問応答システムにおけるユーザーの質問意図理解
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2018, 日本音響学会2018年春季研究発表会講演論文集, 173 - 176, Japanese
    Research society

  • Zero-shot learning using dictionary definitions
    Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi
    Feb. 2018, International Workshop on Frontiers of Computer Vision, 4 pages, English
    [Refereed]
    International conference proceedings

  • Satellite Image Semantic Segmentation Using Fully Convolutional Network
    Atsushi Yoshihara, Tristan Hascoet, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2018, International Workshop on Frontiers of Computer Vision, 4 pages, English
    [Refereed]
    International conference proceedings

  • Estimation of Object Functions Using Visual Attention
    Ryunosuke Azuma, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2018, International Workshop on Frontiers of Computer Vision, 4 pages, English
    [Refereed]
    International conference proceedings

  • 非負値行列因子分解に基づく構音障害者音声の高域付加の検討
    TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2018, 日本音響学会2018年秋季研究発表会講演論文集, 1309 - 1312, Japanese
    Research society

  • 脳磁界データの空間的特徴を考慮した想起音声の識別
    YANO Saori, TAKIGUCHI Tetsuya, ARIKI Yasuo, SOETA Yoshiharu, NAKAGAWA Seiji
    2018, 日本音響学会2018年秋季研究発表会講演論文集, 2018, 885 - 888, Japanese
    Research society

  • 議論システムにおける賛成/反対意見の生成手法の検討
    FURUMAI Kazuaki, ARIKI Yasuo, TAKIGUCHI Tetsuya
    2018, 人工知能学会 言語・音声理解と対話処理研究会, 82 - 83, Japanese
    Symposium

  • 議論システムにおける賛成/反対意見の生成のための発話のベクトル化手法の検討
    FURUMAI Kazuaki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2018, 日本音響学会2018年秋季研究発表会講演論文集, 1033 - 1036, Japanese
    Research society

  • ユーザーの発話意図理解に基づくインタビュー発話の生成に向けて
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2018, 人工知能学会 言語・音声理解と対話処理研究会, 84 - 85, Japanese
    Symposium

  • User's Intention Understanding in Question-Answering System Using Attention-based LSTM
    Yuki Matsuyoshi, Tetsuya Takiguchi, Yasuo Ariki
    2018, APSIPA, 1752 - 1755, English
    [Refereed]
    International conference proceedings

  • Yuki Takashima, Hajime Yano, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    IEEE, 2018, IEEE ICASSP, 5294 - 5298, English
    [Refereed]
    International conference proceedings

  • Neutral-to-Emotional Voice Conversion with Latent Representations of F0 using Generative Adversarial Networks
    Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    2018, 日本音響学会2018年秋季研究発表会講演論文集, 1191 - 1194, Japanese
    Research society

  • Multilinear Discriminant Analysisを用いた聴感印象推定のための脳活動特徴量抽出
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    2018, 日本音響学会2018年秋季研究発表会講演論文集, 2018, 381 - 384, Japanese
    Research society

  • Debate Dialog for News Question Answering System ‘NetTv’ -Debate Based on Claim and Reason Estimation-
    Rikito Marumoto, Katsuyuki Tanaka, Tetsuya Takiguchi, Yasuo Ariki
    2018, International Workshop on Spoken Dialog System Technology, English
    [Refereed]
    International conference proceedings

  • CycleGANに基づくノンパラレル声質変換を用いた構音障害者音声合成
    NANZAKA Ryuka, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2018, 日本音響学会2018年秋季研究発表会講演論文集, 1185 - 1188, Japanese
    Research society

  • Chat Response Generation Based on Semantic Prediction Using Distributed Representations of Words
    Kazuaki Furumai, Tetsuya Takiguchi, Yasuo Ariki
    2018, International Workshop on Spoken Dialog System Technology, English
    [Refereed]
    International conference proceedings

  • Attention-based LSTMを用いた意図理解とキーワード抽出の統合による質問応答システム
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2018, 電子情報通信学会技術研究報告, 118(198), 9 - 14, Japanese
    Symposium

  • Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    Nov. 2017, SIGNAL IMAGE AND VIDEO PROCESSING, 11(8), 1485 - 1492, English
    [Refereed]
    Scientific journal

  • Discrimination and Feature Estimation of Brain Magnetic Field Data Associated with Japanese Speech Sound Imagery
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    電子情報通信学会, Aug. 2017, 電子情報通信学会技術研究報告, 117(189), 39 - 43, Japanese
    Symposium

  • Extraction of brain activities related to impressions induced by HVAC sound using discriminant non-negative tensor factorization
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    電子情報通信学会, Aug. 2017, 電子情報通信学会技術研究報告, 117(189), 61 - 66, Japanese
    Symposium

  • Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    Aug. 2017, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2017, 1 - 13, English
    [Refereed]
    Scientific journal

  • Shihomi Uzawa, Tetsuya Takiguchi, Yasuo Ariki, Seiji Nakagawa
    Brain-computer interface (BCI) technologies, which enable direct communication between the brain and external devices, have been developed. BCI technology can be utilized in neural prosthetics to restore impaired movement, including speech production. However, most of the BCI systems that have been developed are of the "P300-speller" type, which can only detect the object at which a user directs their attention. To develop more versatile BCI systems that can detect a user's intention or thoughts, the brain responses associated with verbal imagery need to be clarified. In this study, the brain magnetic fields associated with auditory verbal imagery and speech hearing were recorded using magnetoencephalography (MEG) in 8 healthy adults. Although the magnetic fields lagged slightly and were long-lasting, significant deflections were observed in the temporal regions even for verbal imagery, as they were for actual speech hearing. The sources of the deflections were localized in the association auditory cortices. Cross-correlations were calculated between envelopes of the imagined/presented speech sound and the evoked brain responses in the temporal areas. Measurable correlations were obtained for the presented speech sound; however, no significant correlations were observed for the imagined speech sound. These results indicate that auditory verbal imagery activates at least the auditory cortex and generates some observable neural responses.
    Jul. 2017, Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2017, 2542 - 2545, English, International magazine
    [Refereed]
    International conference proceedings

  • 話者性を維持した構音障害者のためのHMM音声合成システム
    UEDA Reina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会, Mar. 2017, 電子情報通信学会技術研究報告, 116(477), 301 - 306, Japanese
    Symposium

  • 料理アシスト対話システムにおけるユーザ発話のクラス分類
    YAMADA Yoji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 159 - 162, Japanese
    Research society

  • 脳磁界計測を用いたエアコン音の聴感印象推定の試み ―比較判断を用いた印象予測モデルの学習―
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 2017, 539 - 542, Japanese
    Research society

  • 脳磁界計測による音声明瞭度に関連した皮質活動の推定
    SAGA Naoki, YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 2017, 1515 - 1518, Japanese
    Research society

  • 適応型Gaussian-Gaussian RBMを用いた構音障害者音声認識
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 95 - 98, Japanese
    Research society

  • 声質変換のための音素識別的特徴量
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 359 - 362, Japanese
    Research society

  • 声質変換における非周期性指標の影響とその評価
    ITOH Daiki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 363 - 366, Japanese
    Research society

  • 最尤変換による唇動画像からの音声生成
    RA Rina, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 373 - 376, Japanese
    Research society

  • 構音障害者音声認識のための適応型restricted Boltzmann machineを用いた特徴量抽出
    TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会, Mar. 2017, 電子情報通信学会技術研究報告, 116(477), 321 - 326, Japanese
    Symposium

  • 構音障害者のための話者性を維持したHMM音声合成システムの提案
    UEDA Reina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 267 - 270, Japanese
    Research society

  • 構音障害者のためのDurationを含んだ統計的声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会, Mar. 2017, 電子情報通信学会技術研究報告, 116(477), 307 - 312, Japanese
    Symposium

  • 音源復元のための映像中の微小振動方向の解析
    YASUMI Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 557 - 558, Japanese
    Research society

  • 音の想起に伴う脳磁界反応:想起音の基礎パラメータの影響の検討
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 2017, 1523 - 1526, Japanese
    Research society

  • ユーザーに対話的なサポートを行うシステム -オセロゲームの場合について-
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 163 - 166, Japanese
    Research society

  • ニュース情報検索「NetTv」における質問種別の推定
    MARUMOTO Rikito, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 155 - 158, Japanese
    Research society

  • DNNを用いた聴覚障害者の音声合成の検討
    KITAMURA Tsuyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 285 - 288, Japanese
    Research society

  • Arbitrary-scales continuous wavelet transform for emotional voice conversion
    RA Chouketsu, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2017, 日本音響学会2017年春季研究発表会講演論文集, 377 - 380, English
    Research society

  • Visual Sound Recovery Using Momentary Phase Variations
    Yusuke Yasumi, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2017, International Workshop on Frontiers of Computer Vision, 1 - 4, English
    [Refereed]
    International conference proceedings

  • Feature Extraction and Classification of Multispectral Imagery by Using Convolutional Neural Network
    Atsushi Yoshihara, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2017, International Workshop on Frontiers of Computer Vision, 1 - 4, English
    [Refereed]
    International conference proceedings

  • Estimation of Object Functions Focusing on Feature of Object Parts
    Ryunosuke Azuma, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2017, International Workshop on Frontiers of Computer Vision, 1 - 4, English
    [Refereed]
    International conference proceedings

  • Jinhui Chen, Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    2017, COMPUTER VISION - ACCV 2016 WORKSHOPS, PT III, 10118, 517 - 530, English
    [Refereed]
    International conference proceedings

  • 脳磁界データによる想起音声の識別 -次元数削減による精度向上の検討-
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 2017, 337 - 340, Japanese
    Research society

  • 人の理解や習熟をサポートする音声質問応答システム
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2017, 人工知能学会 言語・音声理解と対話処理研究会, 90 - 91, Japanese
    Symposium

  • 深層学習による位相情報を考慮した音声合成の検討
    I Konjun, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 281 - 284, Japanese
    Research society

  • 重度難聴者音声認識のためのDeep Canonical Correlation Analysisを用いた音響特徴量抽出の検討
    TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 119 - 122, Japanese
    Research society

  • 音声明瞭度に関連した脳磁界計測 -聴覚野および運動野における活動源解析-
    SAGA Naoki, YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, SOETA Yoshiharu, NAKAGAWA Seiji
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 2017, 683 - 686, Japanese
    Research society

  • ユーザー支援を目的とした音声質問応答システム
    MATSUYOSHI Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 141 - 144, Japanese
    Research society

  • ニュース情報検索システム「NetTv」における議論対話システム実現のためのユーザ主張・根拠の推定
    MARUMOTO Rikito, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2017, 人工知能学会 言語・音声理解と対話処理研究会, 92 - 93, Japanese
    Symposium

  • エアコン音の聴感印象推定のための比較判断を考慮した脳活動特徴量抽出
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 2017, 573 - 576, Japanese
    Research society

  • Visual-to-Speech Conversion Based on Maximum Likelihood Estimation
    Rina Ra, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2017, IAPR International Conference on Machine Vision Applications, 488 - 491, English
    [Refereed]
    International conference proceedings

  • Semantic Web and Zero-Shot Learning of Large Scale Visual Classes
    Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi
    2017, First International Workshop on Symbolic-Neural Learning, 1 - 6, English
    [Refereed]
    International conference proceedings

  • Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    International Speech Communication Association, 2017, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017-, 3374 - 3378, English
    [Refereed]
    International conference proceedings

  • Individuality-Preserving Speech Synthesis System for Hearing Loss Using Deep Neural Networks
    Tsuyoshi Kitamura, Tetsuya Takiguchi, Yasuo Ariki, Kiyohiro Omori
    2017, 1st International Workshop on Challenges in Hearing Assistive Technology, 95 - 99, English
    [Refereed]
    International conference proceedings

  • Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    2017, The Second Workshop on Human Identification in Multimedia, 657 - 662, English
    [Refereed]
    International conference proceedings

  • YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    The evaluation of subjective impressions induced by environmental sounds using neurophysiological indices has been proposed in recent years. In this paper, we focus on the evaluation of HVAC (heating, ventilation and air conditioning) sounds and construct models that predict, from brain activity, the subjective coolness and preference induced by time-varying HVAC sound. First, magnetoencephalographic (MEG) measurements were carried out to record brain activity while participants listened to HVAC sounds in a paired-comparison task. Second, feature vectors representing time-frequency components of brain activity over the whole head were extracted from the MEG data using time-frequency analysis and nonnegative tensor factorization (NTF). Third, two kinds of predictive models were constructed from the brain feature vectors and the comparative judgments on pairs of stimuli, using either a regression model or an SVM-based method. Evaluation experiments show that the SVM-based method is more effective than the regression model.
    Japanese Society for Medical and Biological Engineering, 2017, Transactions of Japanese Society for Medical and Biological Engineering, 55(0), 522 - 523, English
    Research society

  • Emotional Voice Conversion with Adaptive Scales F0 Based on Wavelet Transform Using Limited Amount of Emotional Data.
    Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    2017, Interspeech, 3399 - 3403, English
    [Refereed]
    International conference proceedings

  • Emotional Voice Conversion with Adaptive Scales F0 Based on Wavelet Transform Using Limited Amount of Emotional Data.
    Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    ISCA, 2017, 日本音響学会2017年秋季研究発表会講演論文集, 3399 - 3403, English
    International conference proceedings

  • CNN-LSTMを用いた唇画像から音声への変換
    ITOH Daiki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2017, 日本音響学会2017年秋季研究発表会講演論文集, 305 - 308, Japanese
    Research society

  • Audio-Visual Speech Recognition for a Person with Severe Hearing Loss Using Deep Canonical Correlation Analysis
    Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki, Kiyohiro Omori
    2017, 1st International Workshop on Challenges in Hearing Assistive Technology, 71 - 81, English
    [Refereed]
    International conference proceedings

  • Expression Recognition with Ri-HOG Cascade
    Jinhui Chen, Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    Nov. 2016, Workshop on Computer Vision for Affective Computing, 1 - 14, English
    [Refereed]
    International conference proceedings

  • 話速補正に基づく話者性を維持した構音障害者のための音声合成システム
    UEDA Reina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 229 - 232, Japanese
    Research society

  • 複素NMFを用いた声質変換の検討
    I Konjun, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 277 - 280, Japanese
    Research society

  • 非負値行列因子分解に基づく声質変換のためのGraph Embeddingを用いたパラレル辞書学習
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 155 - 158, Japanese
    Research society

  • 非負値行列因子を用いたマルチモーダル声質変換における画像特徴量の検討
    RA Rina, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 273 - 276, Japanese
    Research society

  • 脳磁界計測を用いたエアコン音の聴感印象推定の試み -非負値テンソル分解による関連脳活動の抽出-
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 2016, 673 - 676, Japanese
    Research society

  • 脳磁界データからの想起音声の判別に係る特徴量の推定 -ウェーブレット変換とSVMによる解析-
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 2016, 621 - 624, Japanese
    Research society

  • Factored 3-Way Restricted Boltzmann Machine を用いたマルチモーダル音声認識の検討
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2016, 日本音響学会2016年秋季研究発表会講演論文集, 109 - 112, Japanese
    Research society

  • Dysarthric Speech Modification Using Parallel Utterance Based on Non-negative Temporal Decomposition
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Sep. 2016, Workshop on Speech and Language Processing for Assistive Technologies, 75 - 79, English
    [Refereed]
    International conference proceedings

  • Extraction of brain activity related to auditory impressions induced by HVAC sound using non-negative tensor decomposition
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, HOTEHAMA Takuya, KAMIYA Masaru, NAKAGAWA Seiji
    電子情報通信学会, Aug. 2016, 電子情報通信学会技術研究報告, 116(189), 37 - 40, Japanese
    Symposium

  • SIFT Boosting for Handwriting Recognition
    CHEN Jinhui, KAMIHIGASHI Takashi, ITOH Munehiko, TAKATSUKI Yasuo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2016, MIRU 2016, PS2-48, English
    International conference proceedings

  • Discriminative Graph-embedded Non-negative Matrix Factorizationを用いた声質変換のためのパラレル辞書学習
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会, Aug. 2016, 電子情報通信学会技術研究報告, 116(189), 59 - 64, Japanese
    Symposium

  • Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Jul. 2016, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 24(7), 1175 - 1184, English
    [Refereed]
    Scientific journal

  • 音声想起に伴う誘発脳磁界の時空間的特性
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    日本生体磁気学会, Jun. 2016, 第31回日本生体磁気学会大会論文集, 29(1), 104 - 105, Japanese
    Research society

  • エアコン音の聴感印象と自発脳磁界のERS/ERDの関係
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, HOTEHAMA Takuya, KAMIYA Masaru, NAKAGAWA Seiji
    Jun. 2016, 第31回日本生体磁気学会大会論文集, 29(1), 74 - 75, Japanese
    Research society

  • Katsuyuki Tanaka, Tetsuya Takiguchi, Yasuo Ariki
    May 2016, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E99D(5), 1375 - 1383, English
    [Refereed]
    Scientific journal

  • 音素選択型スペクトル補正に基づく話者性を維持した構音障害者のための音声合成システム
    UEDA Reina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 299 - 302, Japanese
    Research society

  • 音声想起による誘発脳磁界の計測
    UZAWA Shihomi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAGAWA Seiji
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 531 - 532, Japanese
    Research society

  • ハイスピード映像中の物体振動を利用したvisual microphoneの検討
    YASUMI Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 1309 - 1310, Japanese
    Research society

  • タスク指向型対話システムにおける強化学習とニューラルネットワークの比較
    YAMADA Yoji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 99 - 102, Japanese
    Research society

  • スパースパラレル学習を用いたマルチモーダル声質変換
    MASAKA Kenta, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 321 - 324, Japanese
    Research society

  • エアコン音の聴感印象関連領域の探索 -脳磁界の時間周波数解析に基づく推定-
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, HOTEHAMA Takuya, KAMIYA Masaru, NAKAGAWA Seiji
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 511 - 514, Japanese
    Research society

  • Restricted Boltzmann Machine を用いた話者性・雑音を考慮したモデリングの検討
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 299 - 302, Japanese
    Research society

  • Emotional Speech Conversion Using Deep Neural Networks
    LUO Zhaojie, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 351 - 354, English
    Research society

  • Alternating Direction Method of MultipliersによるNMF声質変換のためのパラレル辞書学習
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 325 - 328, Japanese
    Research society

  • ADMMを用いたNMFによる雑音環境下での少量パラレルデータ声質変換
    I Konjun, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2016, 日本音響学会2016年春季研究発表会講演論文集, 333 - 336, Japanese
    Research society

  • Estimation of Object Functions Using Convolutional Neural Network
    KITANO Yosuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2016, Korea-Japan joint Workshop on Frontiers of Computer Vision, English
    [Refereed]
    International conference proceedings

  • Ryo Aihara, Kenta Masaka, Tetsuya Takiguchi, Yasuo Ariki
    2016, COMPUTER AND INFORMATION SCIENCE, 656, 27 - 40, English
    [Refereed]
    International conference proceedings

  • Ozasa Yuko, Ariki Yasuo
    This paper addresses the problem of a robot identifying the object that a human asks it to bring by voice, given a set of objects that both the human and the robot can see. In this setting, the human uses an expression consisting of one or more attributes, such as a color or an object name. We propose an identification method that uses color and object names, drawing on multimodal information from speech and images.
    The Institute of Image Electronics Engineers of Japan, 2016, The Journal of the Institute of Image Electronics Engineers of Japan, 45(1), 105 - 111, Japanese
    [Refereed]
    Scientific journal

  • Phone Labeling Based on the Probabilistic Representation for Dysarthric Speech Recognition
    Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2016, American Journal of Signal Processing, 6(1), 19 - 23, English
    [Refereed]
    Scientific journal

  • SEMI-NON-NEGATIVE MATRIX FACTORIZATION USING ALTERNATING DIRECTION METHOD OF MULTIPLIERS FOR VOICE CONVERSION
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2016, 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 5170 - 5174, English
    [Refereed]
    International conference proceedings

  • MODELING DEEP BIDIRECTIONAL RELATIONSHIPS FOR IMAGE CLASSIFICATION AND GENERATION
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2016, 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 1327 - 1331, English
    [Refereed]
    International conference proceedings

  • Selection of an Optimum Random Matrix Using a Genetic Algorithm for Acoustic Feature Extraction
    Yuichiro Kataoka, Toru Nakashika, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2016, 2016 IEEE/ACIS 15TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 983 - 988, English
    [Refereed]
    International conference proceedings

  • Lip Reading Using a Dynamic Feature of Lip Images and Convolutional Neural Networks
    Yiting Li, Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2016, 2016 IEEE/ACIS 15TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 971 - 976, English
    [Refereed]
    International conference proceedings

  • Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    IEEE Computer Society, 2016, International Conference on Computer and Information Science, 1 - 5, English
    [Refereed]
    International conference proceedings

  • Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2016, 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5, 292 - 296, English
    [Refereed]
    International conference proceedings

  • Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    ISCA, 2016, ISCA Speech Synthesis Workshop, 140 - 145, English
    [Refereed]
    International conference proceedings

  • Yuki Takashima, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono
    2016, 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5, 277 - 281, English
    [Refereed]
    International conference proceedings

  • Jinhui Chen, Zhaojie Luo, Tetsuya Takiguchi, Yasuo Ariki
    2016, EURASIP J. Image and Video Processing, 2016, 37 - 37, English
    [Refereed]
    Scientific journal

  • Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Nov. 2015, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, English
    [Refereed]
    Scientific journal

  • Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Sep. 2015, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, English
    [Refereed]
    Scientific journal

  • Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Sep. 2015, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2749 - 2753, English
    [Refereed]
    Scientific journal

  • Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Association for Computing Machinery, May 2015, ACM Transactions on Accessible Computing, 6(4), 1 - 17, English
    [Refereed]
    Scientific journal

  • Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Institute of Electrical and Electronics Engineers Inc., Mar. 2015, IEEE Transactions on Audio, Speech and Language Processing, 23(3), 580 - 587, English
    [Refereed]
    Scientific journal

  • Yuko Ozasa, Mikio Nakano, Yasuo Ariki, Naoto Iwahashi
    Mar. 2015, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E98D(3), 704 - 711, English
    [Refereed]
    Scientific journal

  • Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2015, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 23(3), 580 - 587, English
    [Refereed]
    Scientific journal

  • Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2015, EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 1 - 12, English
    [Refereed]
    Scientific journal

  • Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2015, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 1 - 12, English
    [Refereed]
    Scientific journal

  • Jinhui Chen, Yosuke Kitano, Yiting Li, Tetsuya Takiguchi, Yasuo Ariki
    2015, COMPUTER VISION - ACCV 2014 WORKSHOPS, PT II, 9009, 629 - 643, English
    [Refereed]
    International conference proceedings

  • Many-to-many Voice Conversion Based on Multiple Non-negative Matrix Factorization
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2015, 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2749 - 2753, English
    [Refereed]
    International conference proceedings

  • SPARSE NONLINEAR REPRESENTATION FOR VOICE CONVERSION
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME), English
    [Refereed]
    International conference proceedings

  • FEATURE EXTRACTION USING PRE-TRAINED CONVOLUTIVE BOTTLENECK NETS FOR DYSARTHRIC SPEECH RECOGNITION
    Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 1411 - 1415, English
    [Refereed]
    International conference proceedings

  • Shunsuke Numano, Naoko Enami, Yasuo Ariki
    2015, COMPUTER VISION - ACCV 2014 WORKSHOPS, PT II, 9009, 658 - 671, English
    [Refereed]
    International conference proceedings

  • 話者適応に基づく日本人英語発話の認識、合成
    UEDA Reina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 381 - 382, Japanese
    Research society

  • 非負値行列因子分解に基づく唇動画像からの音声生成
    MASAKA Kenta, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 389 - 392, Japanese
    Research society

  • 脳磁界計測を用いたエアコン音の聴感印象推定の試み -線形回帰による関連脳活動の抽出-
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, HOTEHAMA Takuya, KAMIYA Masaru, NAKAGAWA Seiji
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 485 - 488, Japanese
    Research society

  • 任意話者を対象としたExemplar-based声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, 115(253), 1 - 6, Japanese
    Symposium

  • 適応型 Restricted Boltzmann Machine を用いたパラレルデータフリーな任意話者声質変換
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 279 - 282, Japanese
    Research society

  • 状態空間の分割と状態遷移の学習に基づく Parallel POMDPの評価
    YAMADA Yoji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, 115(253), 39 - 43, Japanese
    Symposium

  • 状態空間の分割と状態遷移の学習に基づくParallel POMDP
    YAMADA Yoji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 185 - 188, Japanese
    Research society

  • 少量のパラレルデータを用いたNon-negative Matrix Factorizationによる雑音環境下の声質変換
    FUJII Takao, AIHARA Ryo, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 393 - 396, Japanese
    Research society

  • 視覚障碍者のための一人称ビジョンを用いた交差点上の自己位置・進行方向推定
    KAWAGUCHI Satoshi, ENAMI Naoko, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, Japanese
    Symposium

  • 構音障害者音声認識のための混合正規分布に基づく音素ラベリングの検討
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, 115(99), 71 - 76, Japanese
    Symposium

  • 構音障害者音声認識のための確率表現に基づく音素ラベリングの検討
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 1243 - 1246, Japanese
    Research society

  • 階層的POMDPを用いた商品検索型音声対話システムの検討
    YAMADA Yoji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 193 - 196, Japanese
    Research society

  • 音楽経験の分析に基づく演奏映像における視覚的顕著性マップモデル
    NUMANO Syunsuke, ENAMI Naoko, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, Japanese
    Symposium

  • 一般物体認識に基づく音声で指示された物体の選択法
    NISHIMURA Hitoshi, OZASA Yuko, ARIKI Yasuo, NAKANO Mikio
    2015, 電子情報通信学会論文誌, J98-D(9), 1265 - 1276, Japanese
    [Refereed]
    Scientific journal

  • 一人称ビジョンを用いた視覚障碍者道路横断支援システムの検討
    KAWAGUCHI Satoshi, ENAMI Naoko, ARIKI Yasuo
    2015, 情報処理学会技術研究報告, Japanese
    Symposium

  • π-CAVEを用いた歩行時の下視野測定システムの開発
    NIWA Yudai, ENAMI Naoko, YASUOKA Akiko, WADA Honoka, KITA Shinichi, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, Japanese
    Symposium

  • β-NMFを用いた唇動画像からの音声生成
    MASAKA Kenta, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 285 - 288, Japanese
    Research society

  • スペクトル補正に基づく話者性を維持した構音障害者のための音声合成システム
    UEDA Reina, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 267 - 270, Japanese
    Research society

  • エアコン音の時間変動が主観印象および大脳皮質活動に及ぼす影響
    YANO Hajime, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, HOTEHAMA Takuya, NAKAGAWA Seiji
    2015, 日本音響学会2015年春季研究発表会講演論文集, 503 - 504, Japanese
    Research society

  • Word-Error Correction of Continuous Speech Recognition based on Normalized Relevance Distance
    FUSAYASU Yohei, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, International Joint Conference on Artificial Intelligence, English
    [Refereed]
    International conference proceedings

  • Top-Down Feature Extraction from Musical Score for Visual Attention in Music Videos
    NUMANO Syunsuke, ENAMI Naoko, ARIKI Yasuo
    2015, Korea-Japan joint Workshop on Frontiers of Computer Vision, English
    [Refereed]
    International conference proceedings

  • SPOKEN DIALOGUE SYSTEM FOR PRODUCT RECOMMENDATION USING HIERARCHICAL POMDP
    YAMADA Yoji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, MLSLP, English
    [Refereed]
    International conference proceedings

  • Sparse Nonlinear Representation for Voice Conversion
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, IEEE ICME, English
    [Refereed]
    International conference proceedings

  • Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    2015, APSIPA, 196 - 199, English
    [Refereed]
    International conference proceedings

  • Relationships between Subjective Auditory Impression and Brain Cortical Activities for Time-varying HVAC Sound
    YANO Hajime, HOTEHAMA Takuya, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masaru, NAKAGAWA Seiji
    2015, IEEE EMBC, 37-6LB2, 1 - 4, English
    [Refereed]
    International conference proceedings

  • Parallel-Data-Free, Many-To-Many Voice Conversion Using an Adaptive Restricted Boltzmann Machine
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, MLSLP, English
    [Refereed]
    International conference proceedings

  • Normalized Similarity Distance を用いた音声認識の誤り訂正
    FUSAYASU Yohei, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 31 - 34, Japanese
    Research society

  • Normalized Relevance Distance を用いた音声認識の誤り訂正
    FUSAYASU Yohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 163 - 166, Japanese
    Research society

  • NOISE-ROBUST VOICE CONVERSION USING A SMALL PARALLEL DATA BASED ON NON-NEGATIVE MATRIX FACTORIZATION
    Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 315 - 319, English
    [Refereed]
    International conference proceedings

  • MULTITHREADING ADABOOST FRAMEWORK FOR OBJECT RECOGNITION
    Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 1235 - 1239, English
    [Refereed]
    International conference proceedings

  • Multiple Non-negative Matrix Factorizationに基づく多対多声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年秋季研究発表会講演論文集, 227 - 230, Japanese
    Research society

  • Multiple Non-negative Matrix Factorizationに基づく多対一声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 275 - 278, Japanese
    Research society

  • MANY-TO-ONE VOICE CONVERSION USING EXEMPLAR-BASED SPARSE REPRESENTATION
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), English
    [Refereed]
    International conference proceedings

  • LIP-TO-SPEECH SYNTHESIS USING LOCALITY-CONSTRAINT NON-NEGATIVE MATRIX FACTORIZATION
    AIHARA Ryo, MASAKA Kenta, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, MLSLP, English
    [Refereed]
    International conference proceedings

  • Integrated GIS, Remote Sensing and Survey Data for Damage Assessment of Buildings in Tsunami Event, Ishinomaki City, Japan
    POURSABER Mohammad, ARIKI Yasuo
    2015, Journal of Geographic Information System, English
    [Refereed]
    Scientific journal

  • Individuality-Preserving Voice Reconstruction for Articulation Disorders Using Text-to-Speech Synthesis
    Reina Ueda, Tetsuya Takiguchi, Yasuo Ariki
    2015, ICMI'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 343 - 346, English
    [Refereed]
    International conference proceedings

  • Individuality-Preserving Spectrum Modification for Articulation Disorders Using Phone Selective Synthesis
    UEDA Reina, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, SLPAT, English
    [Refereed]
    International conference proceedings

  • Home Appliance Control Using Speech Recognition for a Person with an Articulation Disorder
    AIHARA Ryo, TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, ISEM, English
    [Refereed]
    International conference proceedings

  • FEATURE EXTRACTION USING PRE-TRAINED CONVOLUTIVE BOTTLENECK NETS FOR DYSARTHRIC SPEECH RECOGNITION
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, EUSIPCO, 1426 - 1430, English
    [Refereed]
    International conference proceedings

  • Facial Expression Recognition with Multithreaded Cascade of Rotation-invariant HOG
    Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 636 - 642, English
    [Refereed]
    International conference proceedings

  • Estimation of Tsunami Hazard Vulnerability Factors by Integrating Remote Sensing, GIS and AHP based Assessment
    POURSABER Mohammad, ARIKI Yasuo
    2015, Open Access Library Journal, English
    [Refereed]
    Scientific journal

  • Detection of Facial Parts via Deformable Part Model Using Part Annotation
    Kazuhiro Nishida, Naoko Enami, Yasuo Ariki
    2015, 2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 192 - 195, English
    [Refereed]
    International conference proceedings

  • Deformable Part Modelを用いた顔部品検出
    NISHIDA Kazuhiro, ENAMI Naoko, ARIKI Yasuo
    2015, 電子情報通信学会技術研究報告, Japanese
    Symposium

  • Deep Boltzmann Machine を用いた音素ラベル情報推定
    TAKASHIMA Yuki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2015, 日本音響学会2015年春季研究発表会講演論文集, 3 - 6, Japanese
    Research society

  • Convolutional Neural Networkを用いた重度難聴者のマルチモーダル音声認識
    KAKIHARA Yasuhiro, TAKIGUCHI Tetsuya, ARIKI Yasuo, MITANI Nobuyuki, OMORI Kiyohiro, NAKAZONO Kaoru
    2015, 日本音響学会2015年春季研究発表会講演論文集, 197 - 200, Japanese
    Research society

  • Jinhui Chen, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2015, ICMR'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 443 - 446, English
    [Refereed]
    International conference proceedings

  • Yuki Takashima, Yasuhiro Kakihara, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki, Nobuyuki Mitani, Kiyohiro Omori, Kaoru Nakazono
    Information Processing Society of Japan, 2015, IPSJ Transactions on Computer Vision and Applications, 7, 64 - 68, English
    [Refereed]
    Scientific journal

  • Lu Li, Xingyu Wang, Guoqiang Wang
    2015, MATHEMATICAL PROBLEMS IN ENGINEERING, 115(346), 13 - 18, English
    Scientific journal

  • ACTIVITY-MAPPING NON-NEGATIVE MATRIX FACTORIZATION FOR EXEMPLAR-BASED VOICE CONVERSION
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 4899 - 4903, English
    [Refereed]
    International conference proceedings

  • Investigation of Classification Using Pitch Features for Children with Autism Spectrum Disorders and Typically Developing Children
    KAKIHARA Yasuhiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jan. 2015, American Journal of Signal Processing, 5(1), 1 - 5, English
    [Refereed]
    Scientific journal

  • Estimation of Object Functions Using Deformable Part Model
    Yosuke Kitano, Tetsuya Takiguchi, Yasuo Ariki
    2015, 2015 21ST KOREA-JAPAN JOINT WORKSHOP ON FRONTIERS OF COMPUTER VISION, English
    [Refereed]
    International conference proceedings

  • Color Saliency for Object Identification
    Yuko Ozasa, Naoko Enami, Yasuo Ariki
    2015, 2015 21ST KOREA-JAPAN JOINT WORKSHOP ON FRONTIERS OF COMPUTER VISION, English
    [Refereed]
    International conference proceedings

  • Error Correction of Automatic Speech Recognition Based on Normalized Web Distance
    BYAMBAKHISHIG Enkhbolor, TANAKA Katsuyuki, AIHARA Ryo, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2014, Proceedings of the 15th Conference of the International Speech Communication Association (Interspeech 2014), English
    [Refereed]
    International conference proceedings

  • Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Jun. 2014, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E97D(6), 1403 - 1410, English
    [Refereed]
    Scientific journal

  • Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Jun. 2014, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E97D(6), 1411 - 1418, English
    [Refereed]
    Scientific journal

  • Parallel Dictionary Learning Using a Joint Density Restricted Boltzmann Machine for Sparse-Representation-Based Voice Conversion
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jun. 2014, Advances in Computer Science and Engineering, 12(2), 101 - 117, English
    [Refereed]
    Scientific journal

  • 話者適応を用いたNMFによる声質変換
    FUJII Takao, AIHARA Ryo, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
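    This paper proposes an NMF-based voice conversion method that uses speaker adaptation. The conventional NMF-based voice conversion methods we have proposed assume parallel data in which the input and output speakers utter the same content, so a large amount of corresponding data must be prepared in advance for any given speaker. We therefore propose a method that generates the output speaker's dictionary from the input speaker's dictionary by using only a small amount of the output speaker's speech for dictionary adaptation. Evaluation experiments demonstrate the effectiveness of the proposed speaker-adaptation method.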
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 421 - 424, Japanese
    Research society

  • 様々なRandom行列を用いた構音障害者の音声特徴量抽出
    KATAOKA Yuichiro, YOSHIOKA Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo
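    The proposed method transforms speech features using random projection matrices generated from various distributions and examines how the recognition results change. Speech recognition is performed with each set of features, and the individual recognition results are combined by voting to obtain the best recognition result.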
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 241 - 242, Japanese
    Research society

  • 声質変換のための Restricted Boltzmann Machine を用いたパラレル辞書の学習法
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
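    This paper proposes a voice conversion method using a joint RBM (restricted Boltzmann machine) so that the creation and selection of parallel dictionaries in sparse-representation-based voice conversion can be handled in a unified framework.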
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 415 - 416, Japanese
    Research society

  • 辞書選択型NMFを用いた構音障害者の話者性を維持した声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
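    This paper proposes individuality-preserving voice conversion for people with athetoid-type articulation disorders, based on NMF voice conversion with dictionary selection. Among the output speaker's category dictionaries, only the consonant category dictionaries use the spectra of a non-disabled speaker, while the vowel category dictionaries use the spectra of the speaker with the disorder, so that the converted voice preserves that speaker's individuality. Section 2 describes the conventional NMF voice conversion method. Section 3 presents the proposed method, Section 4 compares it with conventional GMM- and NMF-based voice conversion methods, and Section 5 concludes the paper.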
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 459 - 462, Japanese
    Research society

  • ピッチ特徴量を用いた自閉症スペクトラム障害児と定型発達児の識別
    KAKIHARA Yasuhiro, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAI Yasushi, TAKADA Satoshi
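    This study uses pitch features as input to an SVM to discriminate between children with autism spectrum disorders and typically developing children. As pitch features, 12 statistics are computed for each of the pitch sequence obtained from the speech data and its delta sequence. Discrimination experiments were conducted both with segment division and on a per-word basis.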
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 467 - 470, Japanese
    Research society

  • Normalized web distanceを用いた音声認識誤り訂正法
    BYAMBAKHISHIG Enkhbolor, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
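    This paper points out two problems with conventional speech recognition error correction based on confusion networks: the degradation of short-distance correction caused by null transitions, and the need for a corpus to compute context scores. To solve these problems, we aim to reduce recognition errors with the following two approaches. First, we use Normalized Web Distance as a long-distance context score that also takes distant words into account for correction. Because Normalized Web Distance can use various databases such as the World Wide Web and search engines as its training corpus, no corpus needs to be prepared and the computation is simple. Second, in N-gram learning, which is effective for short-distance correction, we efficiently remove the harmful null transitions from the test data. By improving its effect in this way, speech recognition…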
    Mar. 2014, 第8回音声ドキュメント処理ワークショップ, 1 - 7, Japanese
    Symposium

  • NMFに基づく音声と画像情報を用いた雑音下声質変換
    MASAKA Kenta, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
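    This paper proposes a method that incorporates lip image features into NMF-based voice conversion, which is robust in noisy environments. A noise dictionary is constructed from the non-speech segments before and after the input utterance, and the noisy input speech is represented sparsely over the input speech dictionary and the noise dictionary. From the weight matrix estimated from the input speech and the dictionaries, only the weights associated with the speech dictionary are extracted and linearly combined with an output speech dictionary built from speech samples of the output speaker. The method further improves conversion accuracy by introducing a lip image dictionary obtained from the input speaker's image features.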
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 417 - 420, Japanese
    Research society

  • Convolutive Bottleneck Network 特徴量を用いた構音障害者の音声認識
    YOSHIOKA Toshiya, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
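    This paper conducts recognition experiments using acoustic models trained on speakers with articulation disorders, with the aim of realizing speech recognition for such speakers. In addition, to address a problem specific to these speakers, namely that their utterances fluctuate easily because of muscle tension, we propose a feature extraction method using a CNN with a bottleneck structure (a convolutive bottleneck network, CBN).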
    日本音響学会, Mar. 2014, 日本音響学会2014年春季研究発表会講演論文集, 237 - 240, Japanese
    Research society

  • 演奏視聴時における演奏熟練者と非熟練者の視線情報の分析
    沼野 俊亮, ENAMI NAOKO, ARIKI YASUO
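    Accurately estimating a worker's skill level is important for work support. This requires an objective index of skill, but such an index is difficult to define for tasks, such as playing a musical instrument, that involve sensibility as well as technique.
    This paper aims to estimate piano performance skill from gaze information, which correlates with human thought processes and psychological states. By analyzing the gaze information of skilled and unskilled performers while they watch performances, we examine objective indices of skill and propose a discrimination method.
    Feb. 2014, 電子情報通信学会, 信学技報, 113(431), 93 - 94, Japanese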
    Symposium

  • 一人称カメラと街並画像データベースの対応付けによる交差点上の歩行者位置・進行方向推定
    川口 智士, ENAMI NAOKO, ARIKI YASUO
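    This paper proposes a method for estimating a pedestrian's position and direction of travel at an intersection, for pedestrian support. Because of GPS positioning errors, it is difficult to estimate a pedestrian's position and direction of travel at an intersection. We therefore estimate the pedestrian's position at the intersection by matching panoramic images of building walls generated from Google Street View against panoramic images of building walls generated from first-person camera images captured while walking. Camera pose information obtained by Structure-from-Motion is used for the direction of travel and for image correction. Experiments with first-person camera images captured in real environments demonstrate the effectiveness of the proposed method.
    Feb. 2014, 電子情報通信学会, 信学技報, 113(431), 91 - 92, Japanese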
    Symposium

  • コンテクストに基づくChannel特徴を用いた歩行者検出
    髙柳 陽平, ENAMI NAOKO, ARIKI YASUO
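    This paper proposes a pedestrian detection method based on classifier training with a context model of pedestrians and background.
    Existing pedestrian detection methods improve accuracy by training classifiers on multiple features, but this increases detection cost.
    In this paper, the context model of pedestrians and background is built from saliency maps and used to estimate weights when constructing a cascaded AdaBoost classifier. This makes it possible to estimate the pedestrian likelihood of weak classifier candidates independently of the background and the pedestrian's pose.
    Experiments on the INRIA Pedestrian Dataset show that the detection rate can be improved without reducing detection speed compared with conventional methods.
    Feb. 2014, 電子情報通信学会, 信学技報, 113(431), 103 - 104, Japanese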
    Symposium

  • Hierarchical Sparse Representation for Object Recognition
    NAKASHIKA Toru, OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2014, Transactions on Machine Learning and Artificial Intelligence, 2(1), 46 - 60, English
    [Refereed]
    Scientific journal

  • Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2014, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014(5), 1 - 10, English
    [Refereed]
    Scientific journal

  • YURIMOTO Mizuki, ENAMI Naoko, ARIKI Yasuo
    In this paper, we propose a method for estimating the flow of each lane and the vehicle's position within its lane from images captured by a car-mounted stereo camera, for active navigation. Its effectiveness is evaluated on the Karlsruhe dataset of images captured by car-mounted stereo cameras in real environments.
    The Institute of Image Information and Television Engineers, 2014, PROCEEDINGS OF THE ITE ANNUAL CONVENTION, 2014(0), 3-2-1 - 3-2-2, Japanese
    [Refereed]

  • 話者適応型 Restricted Boltzmann Machine を用いた声質変換の検討
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(365), 165 - 170, Japanese
    Symposium

  • 話者適応を用いたNMFによる雑音環境下の声質変換
    FUJII Takao, AIHARA Ryo, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会, 2014, 日本音響学会2014年秋季研究発表会講演論文集, 345 - 348, Japanese
    Research society

  • 話者依存型 Recurrent Temporal Restricted Boltzmann Machine を用いた声質変換
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会, 2014, 日本音響学会2014年秋季研究発表会講演論文集, 219 - 222, Japanese
    Research society

  • 物体特定のための顕著性
    OZASA Yuko, ENAMI Naoko, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(230), 19 - 24, Japanese
    Symposium

  • 発話に不自由のある聴覚障害者の発話音声認識の検討
    KAKIHARA Yasuhiro, TAKIGUCHI Tetsuya, ARIKI Yasuo, MITANI Nobuyuki, OMORI Kiyohiro
    2014, 日本音響学会2014年秋季研究発表会講演論文集, 109 - 110, Japanese
    Research society

  • 色属性による物体特定のための顕著性
    OZASA Yuko, ENAMI Naoko, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(356), 79 - 83, Japanese
    Symposium

  • 雑音環境下における特徴重み付マルチモーダル声質変換
    MASAKA Kenta, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(365), 87 - 92, Japanese
    Symposium

  • 遺伝的アルゴリズムを用いた構音障害者の音声特徴量抽出に最適なランダム行列の生成
    KATAOKA Yuichiro, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会, 2014, 日本音響学会2014年秋季研究発表会講演論文集, 83 - 86, Japanese
    Research society

  • ハイスピードカメラ画像を用いたマルチモーダルNMF声質変換
    MASAKA Kenta, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会, 2014, 日本音響学会2014年秋季研究発表会講演論文集, 349 - 352, Japanese
    Research society

  • スパース表現に基づく声質変換のための結合型 restricted Boltzmann machine
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(52), 343 - 348, Japanese
    Symposium

  • スパース辞書学習による構音障害者の話者性を維持した声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(91), 39 - 44, Japanese
    Symposium

  • アクティビティマッピングによる非負値行列因子分解を用いた声質変換
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会, 2014, 日本音響学会2014年秋季研究発表会講演論文集, 223 - 226, Japanese
    Research society

  • VOICE CONVERSION IN TIME-INVARIANT SPEAKER-INDEPENDENT SPACE
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 7939 - 7943, English
    [Refereed]
    International conference proceedings

  • VOICE CONVERSION BASED ON NON-NEGATIVE MATRIX FACTORIZATION USING PHONEME-CATEGORIZED DICTIONARY
    Ryo Aihara, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 7944 - 7948, English
    [Refereed]
    International conference proceedings

  • Mohammad Reza Poursaber, Yasuo Ariki, Nemat Hassani, Mohammad Safi
    2014, LAND SURFACE REMOTE SENSING II, 9260, English
    [Refereed]
    International conference proceedings

  • Hitoshi Nishimura, Yuko Ozasa, Yasuo Ariki, Mikio Nakano
    2014, 2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 477 - 482, English
    [Refereed]
    International conference proceedings

  • Hitoshi Nishimura, Yuko Ozasa, Yasuo Ariki, Mikio Nakano
    ACM, 2014, Workshop on Multimodal, Multi-Party, Real-World Human-Robot Interaction, 23 - 24, English
    [Refereed]
    International conference proceedings

  • Mohammad Reza Poursaber, Yasuo Ariki, Mohammad Safi
    2014, EARTH RESOURCES AND ENVIRONMENTAL REMOTE SENSING/GIS APPLICATIONS V, 9245, English
    [Refereed]
    International conference proceedings

  • Parallel Dictionary Learning Using a Joint Density Restricted Boltzmann Machine for Sparse-Representation-Based Voice Conversion
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, Advances in Computer Science and Engineering, 12(2) (2), 101 - 117, English
    [Refereed]
    Scientific journal

  • Novel Continuous-multi-class Cascade for Real-Time Emotional Recognition
    CHEN Jinhui, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, Workshops CV4AC, 1 - 15, English
    [Refereed]
    International conference proceedings

  • A Method for Correcting Speech Recognition Errors Using Normalized Web Distance
    BYAMBAKHISHING E, TANAKA Katsuyuki, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, 第28回人工知能学会全国大会論文集, 1 - 4, Japanese
    Research society

  • Many-to-One Voice Conversion Using Multiple Non-negative Matrix Factorization
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, 電子情報通信学会技術研究報告, 114(365) (365), 75 - 80, Japanese
    Symposium

  • MULTIMODAL VOICE CONVERSION USING NON-NEGATIVE MATRIX FACTORIZATION IN NOISY ENVIRONMENTS
    Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 1561 - 1565, English
    [Refereed]
    International conference proceedings

  • Multimodal Exemplar-based Voice Conversion using Lip Features in Noisy Environments
    Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    2014, 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 1159 - 1163, English
    [Refereed]
    International conference proceedings

  • Individuality-preserving Voice Conversion for Articulation Disorders Using Dictionary Selective Non-negative Matrix Factorization
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, SLPAT, 29 - 37, English
    [Refereed]
    International conference proceedings

  • High-Order Sequence Modeling Using Speaker-Dependent Recurrent Temporal Restricted Boltzmann Machines for Voice Conversion
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, Interspeech, 2278 - 2282, English
    [Refereed]
    International conference proceedings

  • Exemplar-based Emotional Voice Conversion Using Non-negative Matrix Factorization
    Ryo Aihara, Reina Ueda, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 1 - 4, English
    [Refereed]
    International conference proceedings

  • Dysarthric Speech Recognition Using a Convolutive Bottleneck Network
    Toru Nakashika, Toshiya Yoshioka, Tetsuya Takiguchi, Yasuo Ariki, Stefan Duffner, Christophe Garcia
    2014, 2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 505 - 509, English
    [Refereed]
    International conference proceedings

  • Depth Spatial Pyramid: a Pooling Method for 3D-Object Recognition
    NAKASHIKA Toru, HORI Takafumi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2014, Advances in Computer Science and Engineering, 12(1) (1), 15 - 30, English
    [Refereed]
    Scientific journal

  • Convolutive Bottleneck Network with Dropout for Dysarthric Speech Recognition
    NAKASHIKA Toru, YOSHIOKA Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo, DUFFNER Stefan, GARCIA Christophe
    2014, Transactions on Machine Learning and Artificial Intelligence, 2(2) (2), 46 - 60, English
    [Refereed]
    Scientific journal

  • A Robust Learning Algorithm Based on SURF and PSM for Facial Expression Recognition
    Jinhui Chen, Xiaoyan Lin, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 1352 - 1357, English
    [Refereed]
    International conference proceedings

  • Toru Nakashika, Takafumi Hori, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 4224 - 4228, English
    [Refereed]
    International conference proceedings

  • VOICE CONVERSION BASED ON NON-NEGATIVE MATRIX FACTORIZATION USING PHONEME-CATEGORIZED DICTIONARY
    Ryo Aihara, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2014, 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 7894 - 7898, English
    [Refereed]
    International conference proceedings

  • Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Acoustical Society of Japan, 2014, Acoustical Science and Technology, 35(4) (4), 181 - 191, English
    [Refereed]
    Scientific journal

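  • Voice Conversion Using Speaker-Dependent Conditional Restricted Boltzmann Machines
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This study proposes a voice conversion method that uses a conditional restricted Boltzmann machine (CRBM) for each speaker. The aim is to form speaker-dependent spaces in which phonological information and temporal variation are suppressed and speaker individuality is emphasized relative to the original acoustic feature space, making it easier to convert the voice quality of the source speaker into that of the target speaker. In the proposed method, the CRBMs of the source and target speakers are first trained independently on training data prepared for each speaker (which need not be parallel data). Next, acoustic features from a small amount of parallel data are projected through the respective CRBMs into speaker-dependent high-dimensional spaces (forward inference of the CRBM), and the resulting high-order features are converted with a neural network (NN). The features obtained from the NN conversion can be converted back to the original acoustic features by backward inference of the CRBM.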
    電子情報通信学会, Dec. 2013, 電子情報通信学会技術研究報告, 113(366) (366), 83 - 88, Japanese
    Symposium

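  • Voice Conversion for Persons with Articulation Disorders Using Dictionary-Selective Non-negative Matrix Factorization
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This study targets persons with articulation disorders caused by athetoid cerebral palsy and aims to convert their unstable speech, which results from involuntary muscle movements, into more intelligible speech. The most common conventional voice conversion approach has been a statistical method based on Gaussian mixture models (GMMs). Because this approach has been studied mainly for speaker conversion, applying GMM voice conversion to the speech of a person with an articulation disorder and converting it into the speech of a non-disabled speaker replaces the speaker's individuality with that of another person. To meet the need of persons with disabilities to "speak in a voice of their own", this study uses exemplar-based voice conversion with non-negative matrix factorization (NMF), which differs from conventional statistical-model-based voice conversion, to convert speech into more intelligible speech while preserving speaker individuality. So far, NMF voice conversion has had the problem that the phoneme of an input speech frame does not necessarily match that of the basis selected from the dictionary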
    電子情報通信学会, Dec. 2013, 電子情報通信学会技術研究報告, 113(366) (366), 71 - 76, Japanese
    Symposium

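  • NMF-Based Voice Conversion Considering Segment Features in Noisy Environments
    FUJII Takao, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This report proposes a noise-robust voice conversion method based on NMF. A parallel dictionary is constructed whose exemplars are the speech features of the same utterances by the source and target speakers. In addition, a noise dictionary is constructed from the non-speech segments before and after the input utterance, and the noisy input speech is expressed as a linear combination of the source speech dictionary and the noise dictionary. Of the weight matrix estimated from the input speech and the dictionaries, only the weights for the speech dictionary are extracted and linearly combined with the target speech dictionary built from the target speaker's speech samples. The method introduces segment features into NMF to further improve the accuracy of weight matrix estimation. Experimental results demonstrate the effectiveness of the proposed method on noisy speech.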
    電子情報通信学会, Dec. 2013, 電子情報通信学会技術研究報告, 113(366) (366), 77 - 82, Japanese
    Symposium

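  • Discrimination between Children with Autism Spectrum Disorder and Typically Developing Children Using Pitch Features
    KAKIHARA Yasuhiro, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAI Yasushi, TAKADA Satoshi
    In recent years, the increasing incidence of autism spectrum disorder has attracted attention. Autism spectrum disorder collectively refers to autistic disorder, Asperger's disorder, and pervasive developmental disorder not otherwise specified. Because these disorders arise from diverse causes, fundamental treatment is considered difficult, but the effectiveness of early intervention through support specialized for these disorders has been reported. Aiming at early detection of autism spectrum disorder from an acoustic standpoint, this study conducted discrimination experiments using pitch features as input to an SVM. The pitch features are 12 statistics computed for each of the pitch sequence obtained from the speech data and its delta sequence: the 25th, 50th, and 75th percentiles, the differences between the 25th and 50th and between the 50th and 75th percentiles, the mean, standard deviation, kurtosis, skewness, maximum, minimum, and range. Three discrimination experiments were conducted: word-by-word discrimination, discrimination by segment division, and discrimination by feature division.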
    電子情報通信学会, Dec. 2013, 電子情報通信学会技術研究報告, 113(366) (366), 35 - 40, Japanese
    Symposium

  • Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Oct. 2013, IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, E96A(10) (10), 1946 - 1953, English
    [Refereed]
    Scientific journal

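  • Voice Conversion Using Non-negative Matrix Factorization Based on Dictionary Selection
    AIHARA Ryo, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper addresses speaker conversion using speech spectra as features, the most common task in voice conversion, and proposes introducing a dictionary selection technique to improve the accuracy of NMF-based voice conversion. Previously, all frames of the parallel data were used directly as dictionary bases, which made the dictionary very large. As a result, the phoneme of an input speech frame did not necessarily match that of the basis selected from the source speaker dictionary. This paper therefore builds sub-dictionaries by dividing the source and target speaker dictionaries into phoneme categories. Phoneme categories are recognized using NMF, and voice conversion is performed by mapping on the selected sub-dictionary.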
    日本音響学会, Sep. 2013, 日本音響学会2013年秋季研究発表会講演論文集, 1473 - 1476, Japanese
    Research society

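  • Voice Conversion Using Deep Learning That Considers Temporal Variation
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This study proposes a method that captures the temporal variation of speech with a Conditional Restricted Boltzmann Machine and performs voice conversion within a deep learning framework.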
    日本音響学会, Sep. 2013, 日本音響学会2013年秋季研究発表会講演論文集, 1471 - 1472, Japanese
    Research society

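  • Voice Conversion in Noisy Environments Using NMF Considering Segment Features
    FUJII Takao, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes a noise-robust voice conversion method based on NMF. A parallel dictionary is constructed whose exemplars are the speech features of the same utterances by the source and target speakers. In addition, a noise dictionary is constructed from the non-speech segments before and after the input utterance, and the noisy input speech is given a sparse representation over the source speech dictionary and the noise dictionary. Of the weight matrix estimated from the input speech and the dictionaries, only the weights for the speech dictionary are extracted and linearly combined with the target speech dictionary built from the target speaker's speech samples. The method also introduces segment features into NMF to further improve the accuracy of weight matrix estimation. Experiments demonstrate the effectiveness of the proposed method on noisy speech.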
    日本音響学会, Sep. 2013, 日本音響学会2013年秋季研究発表会講演論文集, 337 - 340, Japanese
    Research society

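  • Acoustic Discrimination between Children with Autism Spectrum Disorder and Typically Developing Children Using MKL-SVM
    KAKIHARA Yasuhiro, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAI Yasushi, TAKADA Satoshi
    Aiming at early detection of autism spectrum disorder from an acoustic standpoint, this paper uses MKL-SVM to acoustically discriminate between children with autism spectrum disorder and typically developing children.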
    日本音響学会, Sep. 2013, 日本音響学会2013年秋季研究発表会講演論文集, 397 - 400, Japanese
    Research society

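  • Speech Recognition for Persons with Articulation Disorders Using Convolutional Neural Networks
    YOSHIOKA Toshiya, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The proposed method constructs Convolutional Neural Networks (CNN) whose input layer takes two-dimensional features obtained from the speech spectrogram and whose output layer is a vector with the phoneme information of the input as its elements, and uses them for feature extraction.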
    日本音響学会, Sep. 2013, 日本音響学会2013年秋季研究発表会講演論文集, 167 - 168, Japanese
    Research society

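  • Player Tracking Using a Temporal Situation Graph in Monocular Soccer Video
    ITOH Hiroki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    To track players in soccer video robustly against occlusion, this study proposes a new player tracking method based on a particle filter guided by a temporal situation graph. Conventional particle-filter-based player tracking does not use the positions of multiple players across video frames, so once a target is lost it is difficult to find it again. If the positions of multiple players are represented as a temporal situation graph and the particle filter is run under its guidance, false detections of players can be expected to decrease substantially even when occlusion occurs. In the evaluation experiments, tracking was performed on real fixed-viewpoint monocular soccer video, comparing player tracking by a particle filter without the temporal situation graph (the conventional method) with player tracking by the proposed particle filter with the temporal situation graph. As a result, compared with the conventional method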
    電子情報通信学会, Aug. 2013, 電子情報通信学会論文誌, J96-D(8) (8), 1854 - 1864, Japanese
    [Refereed]
    Scientific journal

  • Robust Feature Extraction to Utterance Fluctuation of Articulation Disorders Based on Random Projection
    YOSHIOKA Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    We investigated the speech recognition of a person with an articulation disorder resulting from the athetoid type of cerebral palsy. The articulation of the first speech tends to become unstable due to strain on speech-related muscles, and that causes degradation of speech recognition. In this paper, we introduce a robust feature extraction method based on PCA (Principal Compon
    Aug. 2013, 4th Workshop on Speech and Language Processing for Assistive Technologies, 129 - 133, English
    [Refereed]
    International conference proceedings

  • Noise-Robust Voice Conversion Based on Spectral Mapping on Sparse Space
    TAKASHIMA Ryoichi, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a voice conversion (VC) technique for noisy environments based on a sparse representation of speech. In our previous work, we discussed an exemplar-based VC technique for noisy environments. In that report, source exemplars and target exemplars are extracted from the parallel training data, having the same texts uttered by the source and target speakers. The
    International Speech Communication Association, Aug. 2013, 8th Speech Synthesis Workshop, 71 - 75, English
    [Refereed]
    International conference proceedings

  • Individuality-Preserving Voice Conversion for Articulation Disorders Using Locality-Constrained NMF
    AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    We present in this paper a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movements of such speakers are limited by their athetoid symptoms, and their consonants are often unstable or unclear, which makes it difficult for them to communicate. In this paper, exemplar-based spectral conversion using Non-negative
    Aug. 2013, 4th Workshop on Speech and Language Processing for Assistive Technologies, 3 - 8, English
    [Refereed]
    International conference proceedings

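  • Voice Conversion Using Non-negative Matrix Factorization in Noisy Environments
    FUJII Takao, AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes a noise-robust voice conversion method based on sparse coding. A parallel dictionary is constructed whose exemplars are the speech features of the same utterances by the source and target speakers. In addition, a noise dictionary is constructed from the non-speech segments before and after the input utterance, and the noisy input speech is given a sparse representation over the source speech dictionary and the noise dictionary. Of the weight matrix estimated from the input speech and the dictionaries, only the weights for the speech dictionary are extracted and linearly combined with the target speech dictionary built from the target speaker's speech samples. Furthermore, to bring the result closer to the target speaker's speech, GMM-based conversion is applied to the resulting features to produce the converted speech of the target speaker. Experiments demonstrate the effectiveness of the proposed method on noisy speech.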
    システム制御情報学会, May 2013, システム制御情報学会研究発表講演会講演論文集, (114-5) (114-5), 1 - 6, Japanese
    Research society

  • Unknown Object Identification Using Category Visual Words with Rejection Function
    TANAKA Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In this paper, we introduce an identification method for unknown category objects. Most popular conventional methods in object recognition use Bag of Features (BoF) that represents the image as an appearance frequency histogram of common visual words by quantizing SIFT features. However, this method is unable to identify unknown objects because the common visual words cannot re
    IAPR, May 2013, International Conference on Machine Vision Applications, 375 - 378, English
    [Refereed]
    International conference proceedings

  • ISHII Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAI Yasushi, TAKADA Satoshi
    神戸大学都市安全研究センター, Mar. 2013, 神戸大学都市安全研究センター研究報告, (17) (17), 97 - 104, Japanese
    [Refereed]

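  • Individuality-Preserving Voice Conversion for Persons with Articulation Disorders Using Non-negative Matrix Factorization
    AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This study proposes individuality-preserving voice conversion for persons with articulation disorders of the athetoid type, a form of cerebral palsy. Because athetosis produces tension during intentional movements, the speech of affected persons, particularly consonants, becomes unstable. This paper applies exemplar-based voice conversion using non-negative matrix factorization (NMF) to the speech of persons with articulation disorders, aiming to convert unstable speech into more intelligible speech.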
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 333 - 336, Japanese
    Research society

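  • A Study of Acoustic Feature Selection for Discriminating between Children with Autism Spectrum Disorder and Typically Developing Children
    ISHII Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAI Yasushi, TAKADA Satoshi
    This paper reports the results of discrimination experiments from an acoustic standpoint, aimed at early detection and early intervention, for children with autism spectrum disorder ranging from kindergarten to the fourth grade of elementary school.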
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 141 - 142, Japanese
    Research society

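  • Sparse Coding Voice Conversion in Noisy Environments
    FUJII Takao, AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes a noise-robust voice conversion method based on sparse coding. A parallel dictionary is constructed whose exemplars are the speech features of the same utterances by the source and target speakers. In addition, a noise dictionary is constructed from the non-speech segments before and after the input utterance, and the noisy input speech is given a sparse representation over the source speech dictionary and the noise dictionary. Of the weight matrix estimated from the input speech and the dictionaries, only the weights for the speech dictionary are extracted and linearly combined with the target speech dictionary built from the target speaker's speech samples. Furthermore, to bring the result closer to the target speaker's speech, GMM-based conversion is applied to the resulting features to produce the converted speech of the target speaker. Experiments demonstrate the effectiveness of the proposed method on noisy speech.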
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 529 - 532, Japanese
    Research society

  • Single-Channel Two-Talker Localization Using Model Composition
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
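    This paper proposes a method for estimating the source positions of two talkers using a single microphone. We have previously proposed a single-microphone source localization method based on discriminating acoustic transfer functions, but it assumed only one talker. Building on that framework, this paper newly introduces acoustic model composition to localize two talkers with a single microphone. In the proposed method, the acoustic transfer function of the observed signal at each position is estimated in advance and its model is trained. The trained acoustic transfer function models are then composed with each talker's acoustic model to create models of the mixed speech signal for each position of the multiple talkers. Then, for evaluation speech in which two talkers speak simultaneously, the position of each talker is estimated by comparing the likelihoods of the mixed-signal models composed for each combination of positions. Through two-talker localization experiments
    The Institute of Electronics, Information and Communication Engineers, Mar. 2013, The IEICE transactions on information and systems (Japanese edition), 96(3) (3), 675 - 685, Japanese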
    [Refereed]
    Scientific journal

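  • Recognition of Dysarthric Speech and Detection of Misrecognized Words Using Random Projection
    YOSHIOKA Toshiya, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This study aims to realize speech recognition for persons with articulation disorders caused by athetoid cerebral palsy. They have difficulty controlling their muscles during intentional movements or when under tension, which is accompanied by involuntary movements called athetosis. The speaking style of persons with athetoid articulation disorders differs greatly from that of non-disabled speakers, and recognition accuracy drops markedly. Random projection is a space mapping technique characterized by defining each element of its transformation matrix as a random value drawn from a probability distribution. The proposed method transforms speech features using multiple random projection matrices. Speech recognition is performed with each set of features, and the recognition results are integrated by voting to obtain the best result. In addition, a correctness judgment method based on the voting results is introduced.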
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 139 - 140, Japanese
    Research society

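  • Voice Conversion Based on Mapping in a Sparse Basis Space
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    For the voice conversion method based on sparse representation of speech that we have proposed previously, this paper proposes an NMF framework that learns a subspace in which the source and target speech can be represented by the same activity, and a method that performs voice conversion by mapping in this space.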
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 533 - 536, Japanese
    Research society

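  • Object Recognition with Multimodal Information Using Web Images
    NISHIMURA Hitoshi, OZASA Yuko, ARIKI Yasuo, NAKANO Mikio
    When a robot works in a living environment, it must at a minimum accomplish the object grasping task of grasping the object indicated by the user. Ozasa et al. proposed a method that integrates speech and image information as an object recognition method for the object grasping task. Their method has the problem that both an image model and a speech model are required for object recognition. To solve this problem, we propose an object recognition method based on multimodal information using Web images. The method assumes that speech models are already available thanks to the development of large-vocabulary dictionaries, and supplements the image models needed for recognition from the Web.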
    電子情報通信学会, Mar. 2013, 電子情報通信学会総合大会, Japanese
    Research society

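  • Two-Step Correction of Speech Recognition Errors Using Syntax and Context Information
    NAKATANI Ryohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes a method for automatic speech recognition error correction on a Confusion Network using long-distance context scores assigned to each word as features. Methods that correct speech recognition errors using long-distance context information assigned to each word as features have been proposed, but assigning it word by word depends heavily on the recognition accuracy of the surrounding words. It is therefore undesirable to assign long-distance context information to recognition results containing many errors. In this study, in order to use long-distance context information as a feature for error correction, errors are first corrected using N-gram information to reduce misrecognitions. Long-distance context scores are then assigned and a second correction step is performed, improving speech recognition accuracy. Experiments show that the proposed two-step correction allows long-distance context information to be used more effectively as a feature for error correction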
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 221 - 224, Japanese
    Research society

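  • A Study of Mixed Musical Sound Analysis with a Harmonic Structure Matrix Using Specmurt
    NISHIMURA Daiki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Much of the music we hear is a mixture in which various instruments sound at the same time. However, the Specmurt method can analyze only multi-pitch sounds of a single instrument. We therefore extend conventional Specmurt and propose a new method that analyzes the pitches separated by instrument from a mixture of multiple instruments.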
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 843 - 844, Japanese
    Research society

  • Sparseness Criteria of F0-Frequencies Selection for Specmurt-Based Multi-Pitch Analysis without Modeling Harmonic Structure
    NISHIMURA Daiki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper introduces a multi-pitch analysis method using specmurt analysis without modeling the common harmonic structure pattern. Specmurt analysis is based on the idea that the fundamental frequency distribution is expressed as a deconvolution of the observed spectrum by the common harmonic structure pattern. To analyze the fundamental frequency distribution, the common harm
    Research Institute of Signal Processing, Mar. 2013, Journal of Signal Processing, 17(2) (2), 29 - 38, English
    [Refereed]
    Scientific journal

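  • A Study of Voice Conversion Using Low-Dimensional Space Representations by Deep Belief Nets
    NAKASHIKA Toru, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposed a voice conversion method that combines DBNs and an NN to perform nonlinear conversion in a low-dimensional space from which speaker individuality has been removed. Subjective and objective evaluation experiments were conducted, and the method showed high accuracy in both.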
    日本音響学会, Mar. 2013, 日本音響学会2013年春季研究発表会, 517 - 520, Japanese
    Research society

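  • 3D Object Recognition by LLC Using a Depth Spatial Pyramid
    HORI Takahiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In recent years, the advent of high-precision RGB-D cameras has made it easy to measure high-quality 3D information (color and depth). Conventional object recognition methods that use such data employ depth information only for extracting local features. That is, although acquiring depth information makes it possible to grasp the overall shape of an object, its use remains partial. The proposed method therefore represents overall object shape with a depth spatial pyramid based on depth information. Specifically, a feature representation that includes depth topological information is realized through the coordinate positions of feature points in the depth spatial pyramid. In addition, HONV (Histogram of Oriented Normal Vectors) is used as the 3D local feature extracted from depth images, and feature coding uses the locality constraint in the feature-space coordinate system via LLC (Locality-con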
    電子情報通信学会, Feb. 2013, 電子情報通信学会技術研究報告, 43 - 48, Japanese
    Symposium

  • Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2013, JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 133(2) (2), 891 - 901, English
    [Refereed]
    Scientific journal

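  • Construction of a Dynamic Saliency Map Model for Human Detection
    TAKAYANAGI YOUHEI, OZASA YUKO, ENAMI NAOKO, ARIKI YASUO
    To improve the accuracy of human detection in video against unlearned backgrounds, we propose a method for constructing a dynamic saliency map model. A saliency map extracts regions of an image that attract human visual attention, but the effective saliency model differs depending on the scene. In this study, in addition to static feature maps, the amount of shape change is extracted as a dynamic feature, and a dynamic saliency map model suited to human detection is represented from the dynamic feature maps. Next, classifiers are built with AdaBoost from HOG features, which are appearance features, and from the dynamic saliency map, enabling the selection of highly salient features. To confirm the effectiveness of the proposed method, it was compared with conventional saliency models using video with unlearned backgrounds.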
    電子情報通信学会, Jan. 2013, 電子情報通信学会技術研究報告, Japanese
    Symposium

  • SPARSE REPRESENTATION FOR OUTLIERS SUPPRESSION IN SEMI-SUPERVISED IMAGE ANNOTATION
    Toru Nakashika, Takeshi Okumura, Tetsuya Takiguchi, Yasuo Ariki
    2013, 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2080 - 2083, English
    [Refereed]
    International conference proceedings

  • Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Acoustical Society of Japan, 2013, Acoustical Science and Technology, 34(3) (3), 176 - 186, English
    [Refereed]
    Scientific journal

  • PREDICTION OF UNLEARNED POSITION BASED ON LOCAL REGRESSION FOR SINGLE-CHANNEL TALKER LOCALIZATION USING ACOUSTIC TRANSFER FUNCTION
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2013, 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 4295 - 4299, English
    [Refereed]
    International conference proceedings

  • INDIVIDUALITY-PRESERVING VOICE CONVERSION FOR ARTICULATION DISORDERS BASED ON NON-NEGATIVE MATRIX FACTORIZATION
    Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2013, 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 8037 - 8040, English
    [Refereed]
    International conference proceedings

  • Voice Conversion in High-order Eigen Space Using Deep Belief Nets
    Toru Nakashika, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2013, 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 369 - 372, English
    [Refereed]
    International conference proceedings

  • Two-step Correction of Speech Recognition Errors Based on N-gram and Long Contextual Information
    Ryohei Nakatani, Tetsuya Takiguchi, Yasuo Ariki
    2013, 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 3714 - 3717, English
    [Refereed]
    International conference proceedings

  • Exemplar-based Individuality-Preserving Voice Conversion for Articulation Disorders in Noisy Environments
    Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2013, 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 3604 - 3608, English
    [Refereed]
    International conference proceedings

  • Jinhui Chen, Yasuo Ariki, Tetsuya Takiguchi
    ACM, 2013, MM 2013 - Proceedings of the 2013 ACM Multimedia Conference, 661 - 664, English
    [Refereed]
    International conference proceedings

  • Hitoshi Nishimura, Yuko Ozasa, Yasuo Ariki, Mikio Nakano
    It is an important task for a robot to bring objects requested by a human via voice. In order to achieve the task, object recognition using speech and images is needed. Ozasa et al. have proposed a method for object recognition that integrates speech and image information. Although this method requires both speech (word) and image models, the speech models are automatically c
    IAPR, 2013, Asian Conference on Pattern Recognition, 657 - 661, English
    [Refereed]
    International conference proceedings

  • Voice Conversion based on Non-negative Matrix Factorization in Noisy Environments
    Takao Fujii, Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2013, 2013 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION (SII), 495 - 498, English
    [Refereed]
    International conference proceedings

  • Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2013, Proceedings - 2013 International Conference on Signal-Image Technology and Internet-Based Systems, SITIS 2013, 38 - 42, English
    [Refereed]
    International conference proceedings

  • Hiroki Itoh, Tetsuya Takiguchi, Yasuo Ariki
    2013, Proceedings - 2013 International Conference on Signal-Image Technology and Internet-Based Systems, SITIS 2013, 14 - 21, English
    [Refereed]
    International conference proceedings

  • Acoustic Feature Selection Utilizing Multiple Kernel Learning for Classification of Children with Autism Spectrum and Typically Developing Children
    Yasuhiro Kakihara, Tetsuya Takiguchi, Yasuo Ariki, Yasushi Nakai, Satoshi Takada
    2013, 2013 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION (SII), 490 - 494, English
    [Refereed]
    International conference proceedings

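  • Toward Integrated Processing of Speech and Image Information
    ARIKI Yasuo
    This paper outlines the author's research on multimedia processing of documents, images, video, and speech, and on multimodal processing that integrates modalities such as vision and hearing. It then describes research on recognizing the situation of a scene and the intentions of people based on data and information obtained from multiple media and modalities. Finally, it describes new kinds of processing that arise from applying the same techniques to both speech and images.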
    電子情報通信学会, Dec. 2012, 電子情報通信学会技術研究報告, 112(369) (369), 27 - 32, Japanese
    [Invited]
    Symposium

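  • Interpolation of Unlearned Positions Based on Local Regression for Single-Channel Sound Source Localization Using Acoustic Transfer Functions
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Focusing on the fact that the acoustic transfer function of observed speech depends on the talker's position, we have previously proposed a method that localizes a sound source with a single microphone by discriminating acoustic transfer functions. However, that method requires the acoustic transfer function to be trained in advance for each anticipated source position, making it difficult to estimate positions that were not trained. This paper therefore examines a method that trains a regression model from acoustic transfer functions to positions using the acoustic transfer functions of a limited set of positions, and estimates unlearned positions with that regression model. The regression models used are multiple regression analysis, which is linear, and GPR (Gaussian Process Regression) and SVR (Support Vector Regression), which are nonlinear. In addition, as a training scheme, local regression, which trains the regression model only from training samples similar to the evaluation data, is examined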
    電子情報通信学会, Dec. 2012, 電子情報通信学会技術研究報告, 112(369) (369), 75 - 80, Japanese
    Symposium

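  • Two-Step Correction of Speech Recognition Results Based on Syntax and Semantics
    NAKATANI Ryohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes an automatic speech recognition error correction method on a Confusion Network that uses long-distance context scores assigned to each word as features. Methods that correct speech recognition errors using word-level long-distance context information as features have been proposed, but assigning it word by word depends heavily on the recognition accuracy of the surrounding words. It is therefore undesirable to assign long-distance context information to recognition results containing many errors. In order to use context information as a feature for error correction, this paper first performs error correction using syntax to reduce misrecognitions. Long-distance context scores are then assigned and a second correction step is performed, with the aim of further improving speech recognition accuracy.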
    電子情報通信学会, Dec. 2012, 電子情報通信学会技術研究報告, 112(369) (369), 149 - 154, Japanese
    Symposium

  • Towards Domain Independent Why Text Segment Classification Based on Bag of Function Words
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Increased attention has been focused on question answering (QA) technology as next generation search since it improves the usability of information acquisition from web. However, not much research has been conducted on “non-factoid-QA”, especially on Why Question Answering (Why-QA). In this paper, we introduce a machine learning approach to automatically construct a classifier
    Dec. 2012, Australasian Joint Conference on Artificial Intelligence, English
    [Refereed]
    International conference proceedings

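  • Speech Conversion from Lip Information Using Sparse Coding
    AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The technique of reading utterance content from lip movements is called lip reading and is used as a means of communication by people with hearing and speech impairments. This study uses sparse coding to convert lip video into the corresponding speech without text information. Lip information and speech information are extracted in advance from video of utterances with audio, and each is learned as a dictionary, that is, a set of bases. The two dictionary matrices share the same time sequence and constitute parallel data. Lip information extracted from silent input video is represented by sparse coding as a linear sum of a small number of bases. By replacing the bases selected from the lip dictionary matrix with the corresponding bases of the speech dictionary, speech is output as a linear sum of speech bases. In this paper, conversion was performed for vowels, which are considered identifiable from lip information.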
    電子情報通信学会, Dec. 2012, 電子情報通信学会技術研究報告, 112(369) (369), 119 - 124, Japanese
    Symposium

  • GMM-Based Emotional Voice Conversion Using Spectrum and Prosody Features
    AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    We propose Gaussian Mixture Model (GMM)-based emotional voice conversion using spectrum and prosody features. In recent years, speech recognition and synthesis techniques have been developed, and an emotional voice conversion technique is required for synthesizing more expressive voices. The common emotional conversion was based on transformation of neutral prosody to emotional
    Scientific & Academic Publishing, Oct. 2012, American Journal of Signal Processing, 2(5) (5), 134 - 138, English
    [Refereed]
    Scientific journal

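  • Voice Conversion for Persons with Articulation Disorders Using Non-negative Matrix Factorization
    AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In recent years, information technology has increasingly been applied to the welfare field. The range of applications is broad, including sign language recognition based on image recognition, text-to-speech systems, and alaryngeal speech conversion. This study focuses on persons with articulation disorders caused by cerebral palsy and aims to make their speech more intelligible by converting it into that of a non-disabled speaker.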
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 331 - 334, Japanese
    Research society

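  • Multi-Pitch Analysis by Specmurt Using F0 Frequency Selection Based on a Weighted Norm Criterion
    NISHIMURA Daiki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper demonstrated the effectiveness of multi-pitch analysis by Specmurt that takes sparsity into account through a weighted norm, without modeling the common harmonic structure. The method requires no timbre training and can analyze multi-pitch sounds without prior knowledge such as the number of simultaneous notes.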
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 781 - 784, Japanese
    Research society

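  • Tendencies in Phoneme Recognition Errors for Persons with Articulation Disorders
    YOSHIOKA Toshiya, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Focusing on the phoneme system of persons with articulation disorders, this paper conducted phoneme recognition experiments and examined the error tendencies. Phoneme recognition experiments with three persons with articulation disorders confirmed that the phonemes with reduced accuracy were similar for both vowels and consonants. Several error tendencies were also observed for the phonemes with reduced accuracy.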
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 140 - 141, Japanese
    Research society

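  • Discrimination between Children with Autism and Typically Developing Children Using Acoustic Features
    ISHII Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKAI Yasushi, TAKADA Satoshi
    This paper reports the results of discrimination experiments from an acoustic standpoint, aimed at early detection, for children with autism ranging from kindergarten to the fourth grade of elementary school.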
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 117 - 118, Japanese
    Research society

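  • Voice Conversion in Noisy Environments Using Sparse Representation
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposed a method that uses a parallel dictionary built from the source speaker's parallel data and a noise dictionary built from the input speech. Noisy input speech is given a sparse representation over the source speaker dictionary and the noise dictionary, and is converted into the target speaker's speech by linearly combining the exemplars in the target speaker dictionary according to the activity matrix of the source speaker dictionary.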
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 213 - 216, Japanese
    Research society

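  • A Study of Features for Speech Recognition Error Correction Using CRF
    NAKATANI Ryohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    We have proposed a method that uses Conditional Random Fields (CRF) to correct errors in recognition results in large-vocabulary continuous speech recognition. Long-distance linguistic information and other information were used as features, but they did not have much effect. In this paper, long-distance linguistic information is therefore combined with other information and used as a new feature for error correction. We report that this improved the word error rate compared with using long-distance linguistic information alone.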
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 141 - 142, Japanese
    Research society

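  • Automatic Music Genre Classification by Local Feature Integration Using Convolutional Neural Networks
    NAKASHIKA Toru, Garcia Christophe, TAKIGUCHI Tetsuya, ARIKI Yasuo
    With the recent advance of computers, digital music content has grown explosively, and it is becoming difficult to organize and search music data on the web and on personal devices. Against this background, automatic music genre classification, which automatically clusters similar music, is being actively studied. Following the latter approach, this paper proposes a method that uses the GLCM (Gray Level Co-occurrence Matrix), an image feature computed from each map, as the feature and classifies music genres while integrating multiple GLCMs with Convolutional Neural Networks (ConvNets).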
    日本音響学会, Sep. 2012, 日本音響学会2012年秋季研究発表会, 789 - 790, Japanese
    Research society

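  • Hand Shape Recognition Using 3D Active Appearance Models
    YAMASHITA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This study proposes a method for recognizing complex hand shapes that uses 3D models, intended as gesture input for devices such as smart TVs. Conventional gesture recognition requires the hand to face the camera and cannot handle arbitrary hand tilt. Using 3D Active Appearance Models, hand shape tracking that handles any orientation is realized. RGB images and depth information of the target were acquired with the Kinect, a high-precision depth image sensor, and used to train and test the models. By using multiple 3D-AAMs, complex finger shapes could be recognized robustly against changes in orientation.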
    情報処理学会, Aug. 2012, 画像の認識・理解シンポジウム, Japanese
    Symposium

  • Detection of the greeting motion based on Cooccurrence of the motion using Continuous DP Matching
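    ENAMI NAOKO, ARIKI YASUO
    This paper proposes a method for detecting greeting motions using co-occurrence between motions. Existing motion recognition has been realized by learning predetermined motions and building motion models, but communicative motions accompanied by speech vary greatly between individuals, and their meaning and form differ with context, so it is difficult to identify a motion model. We therefore note that co-occurrence exists between motions during communication and assume that co-occurrence also exists in the timing at which greeting motions occur. Furthermore, greeting motion intensity is defined as a measure of how strongly a motion correlates with a greeting motion. This intensity is computed from the amount of change in human shape, which is calculated from depth images using HOG features. Then, by matching time series of greeting motion intensity with continuous matching, points where a greeting motion with high intensity and co-occurrence takes place are detected. In this paper, multiple greeting motions and motions other than greetings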
    Aug. 2012, 15th Meeting on Image Recognition and Understanding, Japanese
    Symposium

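  • Player Tracking Using a Temporal Situation Graph in Monocular Soccer Video
    ITOH Hiroki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    To track players in soccer video robustly against occlusion, this study proposes a new player tracking method based on a particle filter guided by a temporal situation graph. Conventional particle-filter-based player tracking does not use the positions of multiple players across video frames, so once a target is lost it is difficult to find it again. If the positions of multiple players are represented as a temporal situation graph and the particle filter is run under its guidance, false detections of players can be expected to decrease substantially even when occlusion occurs. In the evaluation experiments, tracking experiments were performed on real fixed-viewpoint monocular soccer video, comparing player tracking by a particle filter without the temporal situation graph (the conventional method) with player tracking by the proposed particle filter with the temporal situation graph. As a result, compared with the conventional method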
    情報処理学会, Aug. 2012, 画像の認識・理解シンポジウム, Japanese
    [Refereed]
    Symposium

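  • Super-Resolution Using Self-Reduced Images and Gaussian Mixture Models
    OGAWA Yuki, HORI Takahiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In recent years, super-resolution has been actively studied in computer vision. This paper proposes super-resolution using a conversion function based on a Gaussian mixture model (GMM). The conversion function that converts a low-resolution image into a high-resolution image is built from a GMM trained on the input image and a self-reduced version of the input image. Applying the conversion function to the input image yields a high-resolution image. In addition, super-resolution using a conversion function that employs PLS (Partial Least Squares) as well as the GMM is also proposed. Because only the input image is used, a large number of training images is not needed, unlike conventional methods. In a comparison with conventional methods, both proposed methods (GMM only and GMM+PLS) achieved better evaluation scores and produced sharper images, confirming the effectiveness of the proposed approach.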
    情報処理学会, Aug. 2012, 画像の認識・理解シンポジウム, Japanese
    Symposium

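  • Iterative Adaptation of AAM Based on Training Image Selection
    TAKAYANAGI Yohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The Active Appearance Model (AAM) is well suited to face tracking as a way of obtaining facial feature points. However, when an AAM is used to track an unknown person, using too much training data loses individual characteristics and creates many local optima, which lowers tracking accuracy; at present, facial feature points can be obtained accurately only for persons included in training. To solve this problem, this study keeps the training data separated by person and computes the similarity between the unknown person and each training person with Gaussian Mixture Models (GMM). The number of training images taken from each training person is set according to this similarity, and an AAM is built from the training data collected in this way to obtain feature points. Furthermore, by repeatedly rebuilding the AAM according to the similarity between the obtained feature points and the training data,
    Information Processing Society of Japan, Aug. 2012, Meeting on Image Recognition and Understanding, Japanese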
    Symposium

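  • Unknown Object Discrimination Method for Learning Unknown Objects Using Multimodal Information
    OZASA Yuko, NAKANO Mikio, HASEGAWA Yuji, NAKAMURA Tomoaki, NAGAI Takayuki, IWAHASHI Naoto, ARIKI Yasuo
    This paper proposes a method that uses multimodal information to discriminate unknown objects, for the purpose of learning them. We show that, with the proposed method, a robot may be able to judge whether an object is unknown or known at the moment the user indicates it, and to learn the unknown object.
    Aug. 2012, Annual Conference of the Robotics Society of Japan, Japanese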
    Symposium

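  • Unknown Object Discrimination by Category-Specific Visual Words Using Web Images
    TANAKA Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes category-specific Visual Words that can discriminate unknown objects as well as known ones. The most widely used object recognition approach is the Bag of Features (BoF) method, which quantizes local features such as SIFT (Scale-Invariant Feature Transform) to build a codebook called Visual Words and represents an image as a histogram of their occurrence frequencies. However, this method can be applied only to known objects, so BoF is not well suited to object recognition that includes unknown objects. From this viewpoint, this paper proposes category-specific Visual Words that can also represent objects of unknown categories, together with an object recognition method based on them. In 10-class object recognition, the proposed method
    Information Processing Society of Japan, Aug. 2012, Meeting on Image Recognition and Understanding, Japanese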
    Symposium

  • Unknown Object Detection Using Multimodal Information Integrated by Kernel Logistic Regression
    OZASA Yuko, ARIKI Yasuo, IWAHASHI Naoto, NAKANO Mikio
    This paper presents a new method to detect unknown objects and their unknown names in object manipulation through man-robot dialog. In the method, the detection is carried out by using the information of object images and user’s speech in an integrated way. Originality of the method is to use kernel logistic regression and multiclass logistic regression for the discrimination b
    Aug. 2012, Meeting on Image Recognition and Understanding, English
    Symposium

  • Facial Age Estimation Based on KNN-SVR Regression and AAM Parameters
    Songzhu Gao, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Age estimation is the determination of a person’s age based on biometric features. It is an important technique to estimate age from facial pictures automatically in Computer Vision. The application using age estimation for interface, robot, and human interaction is expected. In recent years, many approaches for age estimation were proposed while the results were not ideal. To
    Information Processing Society of Japan, Aug. 2012, Meeting on Image Recognition and Understanding, English
    Symposium

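  • Face-Orientation-Independent Utterance Recognition Using AAM
    KOMAI Yuto, YANG Nan, ARIKI Yasuo, TAKIGUCHI Tetsuya
    Multimodal speech recognition, which combines audio with lip video information, can recognize speech in noisy environments. However, recognition accuracy from lip information degrades sharply when the face turns sideways, so conventional lip reading has mostly been limited to frontal-face utterances. This study proposes a lip-reading method that uses an Active Appearance Model to convert faces at various angles to a frontal view. By selectively applying regression models for face orientation, the method can convert a turned face at an arbitrary angle to a frontal face while suppressing the mismatch in variation between frontal and turned faces. In experiments, training used only frontal utterances and recognition was performed at three angles: frontal, 15 degrees to the side, and 30 degrees to the side. Recognition accuracy improved over the conventional method at all three angles.
    Information Processing Society of Japan, Aug. 2012, Meeting on Image Recognition and Understanding, Japanese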
    Symposium

  • Generic Object Recognition Based on CRF Incorporating BoF as Global Features
    OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Generic object recognition using a computer has become a necessity in various fields, such as robot vision and image retrieval in recent years. Conventional methods use conditional random field (CRF) that recognizes the class of each region using the features extracted from the local regions and the class co-occurrence between the adjoining regions. However, there is a problem
    Jun. 2012, Far East Journal of Electronics and Communications, 8(2), 85 - 96, English
    [Refereed]
    Scientific journal

  • Audio-Visual Speech Recognition Using AAM-Based Visual Features
    KOMAI Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    As one of the techniques for robust speech recognition under noisy environments, audio-visual speech recognition (AVSR) using lip dynamic scene information together with audio information is attracting attention, and the research has made strides in recent years. However, in visual speech recognition (VSR), when a face turns sideways, the shape of the lip as viewed by the camer
    May 2012, Advances in Computer Science and Engineering, 8(2), 123 - 137, English
    [Refereed]
    Scientific journal

  • 高塚 智敬, 高島 遼一, 滝口 哲也, 有木 康雄
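    Research Center for Urban Safety and Security, Kobe University, Mar. 2012, Research Report of the Research Center for Urban Safety and Security, Kobe University, (16), 123 - 128, Japanese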
    [Refereed]

  • Integrated Multimodal Information for Detection of Unknown Objects and Unknown Names
    OZASA Yuko, IWAHASHI Naoto, TAKIGUCHI Tetsuya, ARIKI Yasuo, NAKANO Mikio
    Mar. 2012, NCSP, pp. 631-634, English
    [Refereed]
    International conference proceedings

  • Gaze Estimation Using 3D Active Appearance Models
    NAKAMATSU Yukari, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2012, NCSP, pp. 112-115, English
    [Refereed]
    International conference proceedings

  • Towards Domain Independent Why Text Segment Classification Based on Bag of Function Words
    TANAKA KATSUYUKI, TAKIGUCHI TETSUYA, ARIKI YASUO
    2012, The Australasian Joint Conference on Artificial Intelligence, 469 - 480, English
    [Refereed]
    International conference proceedings

  • SUPER-RESOLUTION BY GMM BASED CONVERSION USING SELF-REDUCTION IMAGE
    Yuki Ogawa, Yasuo Ariki, Tetsuya Takiguchi
    2012, 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 1285 - 1288, English
    [Refereed]
    International conference proceedings

  • GENERIC OBJECT RECOGNITION BY GRAPH STRUCTURAL EXPRESSION
    Takahiro Hori, Tetsuya Takiguchi, Yasuo Ariki
    2012, 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 1021 - 1024, English
    [Refereed]
    International conference proceedings

  • A NEW MULTIPLE-KERNEL-LEARNING WEIGHTING METHOD FOR LOCALIZING HUMAN BRAIN MAGNETIC ACTIVITY
    T. Takiguchi, T. Imada, R. Takashima, Y. Ariki, J. -F. L. Lin, P. K. Kuhl, M. Kawakatsu, M. Kotani
    2012, 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 761 - 764, English
    [Refereed]
    International conference proceedings

  • ACOUSTIC MODEL TRANSFORMATIONS BASED ON RANDOM PROJECTIONS
    Tetsuya Takiguchi, Mariko Yoshii, Yasuo Ariki, Jeff Bilmes
    2012, 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 1933 - 1936, English
    [Refereed]
    International conference proceedings

  • Mohammadreza Poursaber, Yasuo Ariki, Mohammad Safi
    2012, EARTH RESOURCES AND ENVIRONMENTAL REMOTE SENSING/GIS APPLICATIONS III, 8538, English
    [Refereed]
    International conference proceedings

  • Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2012, 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 1842 - 1845, English
    [Refereed]
    International conference proceedings

  • Yuto Komai, Nan Yang, Tetsuya Takiguchi, Yasuo Ariki
    ACM, 2012, MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia, 1161 - 1164, English
    [Refereed]
    International conference proceedings

  • Yuko Ozasa, Yasuo Ariki, Mikio Nakano, Naoto Iwahashi
    This paper presents a new method to detect unknown objects and their unknown names in object manipulation through man-robot dialog. In the method, the detection is carried out by using the information of object images and user's speech in an integrated way. Originality of the method is to use logistic regression for the discrimination between unknown and known objects. The accu
    Springer, 2012, ACCV, 85 - 96, English
    [Refereed]
    International conference proceedings

  • 3D Tracking of Soccer Players Using Time-Situation Graph in Monocular Image Sequence
    Hiroki Itoh, Tetsuya Takiguchi, Yasuo Ariki
    2012, 2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2532 - 2536, English
    [Refereed]
    International conference proceedings

  • Yuki Ogawa, Takahiro Hori, Tetsuya Takiguchi, Yasuo Ariki
    2012, 2012 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 298 - 301, English
    [Refereed]
    International conference proceedings

  • Robust Feature Extraction to Utterance Fluctuations Due to Articulation Disorders Based on Sparse Expression
    Toshiya Yoshioka, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2012, 2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 4 pages, English
    [Refereed]
    International conference proceedings

  • EXEMPLAR-BASED VOICE CONVERSION IN NOISY ENVIRONMENT
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2012, 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 313 - 317, English
    [Refereed]
    International conference proceedings

  • Consonant Enhancement for Articulation Disorders Based on Non-negative Matrix Factorization
    Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2012, 2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 4 pages, English
    [Refereed]
    International conference proceedings

  • An AdaBoost-Based Weighting Method for Localizing Human Brain Magnetic Activity
    T. Takiguchi, R. Takashima, Y. Ariki, T. Imada, J. -F. L. Lin, P. K. Kuhl, M. Kawakatsu, M. Kotani
    2012, 2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 4 pages, English
    [Refereed]
    International conference proceedings

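  • Towards Domain Independent Why Text Segment Classification by Bag of Grammar
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper proposes a method for building a classifier that identifies Why text segments, as a technology for enabling Why-type question answering, one form of non-factoid question answering. Specifically, we focus on the grammatical information in text segments and build the classifier by learning their feature patterns with a Support Vector Machine, a machine learning technique. This yields a Why text segment classifier that works effectively on text segments from any domain, making Why-type question answering possible in an open domain such as the Web. To evaluate the proposed method, we built a classifier from a training dataset consisting of answers from Yahoo! Chiebukuro and ran experiments; the resulting classifier achieved an F-measure of 0.661 and an accuracy of 63.25%. This alleviates problems of conventional Why-type question answering, namely the effort of writing rules, the domain dependence of classifiers, and the difficulty of obtaining labeled training data, and enables Why text segment classification with higher discriminative ability.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2011, Transactions of the Institute of Electronics, Information and Communication Engineers, Vol. J94-D, No. 12, 2047 - 2057, Japanese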
    [Refereed]
    Scientific journal

  • Constrained Spectrum Generation Using A Probabilistic Spectrum Envelope for Mixed Music Analysis
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Oct. 2011, ISMIR, pp. 181-184, English
    [Refereed]
    International conference proceedings

  • Kosuke Mizuno, Hiroki Noguchi, Guangji He, Yosuke Terachi, Tetsuya Kamino, Tsuyoshi Fujinaga, Shintaro Izumi, Yasuo Ariki, Hiroshi Kawaguchi, Masahiko Yoshimoto
    Apr. 2011, IEICE TRANSACTIONS ON ELECTRONICS, E94C(4), 448 - 457, English
    [Refereed]
    Scientific journal

  • Tracking of Multiple Soccer Players Using a 3D Particle Filter Based on Detector Confidence
    NISHINO Takuro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2011, Advances in Computer Science and Engineering, Volume 6, Issue 1, pp. 93 - 10, English
    [Refereed]
    Scientific journal

  • Topic tracking language model for speech recognition
    WATANABE Shinji, IWATA Tomoharu, HORI Takaaki, SAKO Atsushi, ARIKI Yasuo
    Feb. 2011, Computer Speech and Language, Vol. 25, Issue 2, pp. 440–461, English
    [Refereed]
    Scientific journal

  • Gaze Estimation Using Regression Analysis and AAMs Parameters Selected Based on Information Criterion
    Manabu Takatani, Yasuo Ariki, Tetsuya Takiguchi
    2011, COMPUTER VISION - ACCV 2010 WORKSHOPS, PT I, 6468, 400 - 409, English
    [Refereed]
    International conference proceedings

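  • Automatic Construction of a Why Text Segment Classifier with Low Domain Dependence Using Bag of Grammar
    TANAKA KATSUYUKI, TAKIGUCHI TETSUYA, ARIKI YASUO
    2011, Transactions of the Institute of Electronics, Information and Communication Engineers, J94-D(12), 2047 - 2057, Japanese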
    [Refereed]
    Scientific journal

  • Image Annotation with Concept Level Feature Using PLSA plus CCA
    Yu Zheng, Tetsuya Takiguchi, Yasuo Ariki
    2011, ADVANCES IN MULTIMEDIA MODELING, PT II, 6524, 454 - 464, English
    [Refereed]
    International conference proceedings

  • GENERIC OBJECT RECOGNITION USING AUTOMATIC REGION EXTRACTION AND DIMENSIONAL FEATURE INTEGRATION UTILIZING MULTIPLE KERNEL LEARNING
    Toru Nakashika, Akira Suga, Tetsuya Takiguchi, Yasuo Ariki
    2011, 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 1229 - 1232, English
    [Refereed]
    International conference proceedings

  • FEATURE SELECTION BASED ON MULTIPLE KERNEL LEARNING FOR SINGLE-CHANNEL SOUND SOURCE LOCALIZATION USING THE ACOUSTIC TRANSFER FUNCTION
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2011, 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2696 - 2699, English
    [Refereed]
    International conference proceedings

  • Single-channel Head Orientation Estimation Based on Discrimination of Acoustic Transfer Function
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2011, 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, pp. 2721-2724, 2732 - 2735, English
    [Refereed]
    International conference proceedings

  • Probabilistic Spectrum Envelope: Categorized Audio-features Representation for NMF-based Sound Decomposition
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2011, 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, pp. 1765-1768, 1776 - 1779, English
    [Refereed]
    International conference proceedings

  • Audio-Visual Speech Recognition Based on AAM Parameter and Phoneme Analysis of Visual Feature
    Yuto Komai, Yasuo Ariki, Tetsuya Takiguchi
    2011, ADVANCES IN IMAGE AND VIDEO TECHNOLOGY, PT I, 7087, 97 - 108, English
    [Refereed]
    International conference proceedings

  • 3D Human Pose Estimation from a Monocular Image Using Model Fitting in Eigenspaces
    BO Geli, ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Nov. 2010, Journal of Software Engineering and Applications, Volume 3, Number 11, pp. 1060-, English
    [Refereed]
    Scientific journal

  • Speech Synthesis by Modeling Harmonics Structure with Multiple Function
    NAKASHIKA Toru, TACHIBANA Ryuki, NISHIMURA Masafumi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Sep. 2010, Interspeech2010, pp. 945-948, English
    [Refereed]
    International conference proceedings

  • Sudden Noise Reduction Based on GMM with Noise Power Estimation
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Apr. 2010, Journal of Software Engineering and Applications, Volume 3, Number 4, pp. 341-34, English
    [Refereed]
    Scientific journal

  • Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    Feb. 2010, JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 127(2), 902 - 908, English
    [Refereed]
    Scientific journal

  • Jerome Revaud, Guillaume Lavoué, Yasuo Ariki, Atilla Baskurt
    Institute of Electrical and Electronics Engineers Inc., 2010, Proceedings - International Conference on Pattern Recognition, 754 - 757, English
    International conference proceedings

  • Human Action Recognition Using HDP by Integrating Motion and Location Information
    Yasuo Ariki, Takuya Tonaru, Tetsuya Takiguchi
    2010, COMPUTER VISION - ACCV 2009, PT II, 5995, 291 - +, English
    [Refereed]
    International conference proceedings

  • STRUCTURING A GENE NETWORK USING A MULTIRESOLUTION INDEPENDENCE TEST
    Takayuki Yamamoto, Tetsuya Takiguchi, Yasuo Ariki
    2010, 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 538 - 541, English
    [Refereed]
    International conference proceedings

  • HMM-BASED SEPARATION OF ACOUSTIC TRANSFER FUNCTION FOR SINGLE-CHANNEL SOUND SOURCE LOCALIZATION
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2010, 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2830 - 2833, English
    [Refereed]
    International conference proceedings

  • EVALUATION OF RANDOM-PROJECTION-BASED FEATURE COMBINATION ON SPEECH RECOGNITION
    Tetsuya Takiguchi, Jeff Bilmes, Mariko Yoshii, Yasuo Ariki
    2010, 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2150 - 2153, English
    [Refereed]
    International conference proceedings

  • Takeshi Okumura, Tetsuya Takiguchi, Yasuo Ariki
    2010, Proceedings - International Conference on Pattern Recognition, 3025 - 3028, English
    [Refereed]
    International conference proceedings

  • Why Text Segment Classification Based on Part of Speech Feature Selection
    Iulia Nagy, Katsuyuki Tanaka, Yasuo Ariki
    2010, DISCOVERY SCIENCE, DS 2010, 6332, 87 - 101, English
    [Refereed]
    International conference proceedings

  • Chikoto Miyamoto, Yuto Komai, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li
    2010, 2010 IEEE International Workshop on Multimedia Signal Processing, MMSP2010, 517 - 520, English
    [Refereed]
    International conference proceedings

  • 3D Human Posture Estimation Based on Linear Regression of HOG Features from Monocular Images
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Nov. 2009, Advances in Computer Science and Engineering, Volume 3, Issue 3, pp. 175-186, English
    [Refereed]
    Scientific journal

  • Echo Canceller for Multi-Loudspeakers Based on Maximum Likelihood Using an Acoustic Model
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Oct. 2009, Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference, pp. 246-249, English
    [Refereed]
    International conference proceedings

  • SPEECH FEATURE EXTRACTION USING WEIGHTED HIGHER-ORDER LOCAL AUTO-CORRELATION
    ARIKI Yasuo, TAKIGUCHI Tetsuya, MUROI Takashi, TAKASHIMA Ryoichi
    Aug. 2009, Far East Journal of Electronics and Communications, Volume 3, Issue 2, pp. 125 - 1, English
    [Refereed]
    Scientific journal

  • Situation Recognition Using 3D Positional Information of Ball from Monocular Soccer Image Sequence
    NISHINO Takuro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2009, The 2009 International Conference on Multimedia, Information Technology and its Applications, pp. 109-112, English
    [Refereed]
    International conference proceedings

  • Generic Object Recognition using CRF by Incorporating BoF as Global Features
    OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2009, The 2009 International Conference on Multimedia, Information Technology and its Applications, pp. 49-52, English
    [Refereed]
    International conference proceedings

  • Estimation of Ground Surface Displacement from Microwave Radar Images by Using Phase-only Correlation
    MIZUNO Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2009, The 2009 International Conference on Multimedia, Information Technology and its Applications, pp. 205-206, English
    [Refereed]
    International conference proceedings

  • Keita Fukuda, Tetsuya Takiguchi, Yasuo Ariki
    Jul. 2009, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E92D(7), 1453 - 1461, English
    [Refereed]
    Scientific journal

  • Keita Fukuda, Tetsuya Takiguchi, Yasuo Ariki
    2009, Digest of Technical Papers - IEEE International Conference on Consumer Electronics, 36 - 37, English
    [Refereed]
    International conference proceedings

  • Tomoko Okada, Tetsuya Takiguchi, Yasuo Ariki
    2009, Digest of Technical Papers - IEEE International Conference on Consumer Electronics, 637 - 638, English
    [Refereed]
    International conference proceedings

  • Tetsuya Takiguchi, Yuji Sumida, Ryoichi Takashima, Yasuo Ariki
    2009, EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, Volume 2009 (2009), Article ID, English
    [Refereed]
    Scientific journal

  • Hyunsin Park, Tetsuya Takiguchi, Yasuo Ariki
    2009, EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, Volume 2009 (2009), Article ID, English
    [Refereed]
    Scientific journal

  • Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, I-Chao Li, Toshitaka Nakabayashi
    Academy Publisher, 2009, Journal of Multimedia, 4(4), 254 - 261, English
    [Refereed]
    Scientific journal

  • Pose Robust and Person Independent Facial Expressions Recognition Using AAM Selection
    Tomoko Okada, Tetsuya Takiguchi, Yasuo Ariki
    2009, ISCE: 2009 IEEE 13TH INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS, VOLS 1 AND 2, pp. 637-638, 668 - +, English
    [Refereed]
    International conference proceedings

  • Automatic Segmentation of Object Region Using Graph Cuts Based on Saliency Maps and AdaBoost
    Keita Fukuda, Tetsuya Takiguchi, Yasuo Ariki
    2009, ISCE: 2009 IEEE 13TH INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS, VOLS 1 AND 2, pp. 36-37, 412 - +, English
    [Refereed]
    International conference proceedings

  • Monaural Sound-Source-Direction Estimation Using the Acoustic Transfer Function of an Active Microphone
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2009, FUSION: 2009 12TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION, VOLS 1-4, 48 - 53, English
    [Refereed]
    International conference proceedings

  • SINGLE-CHANNEL MULTI-TALKER-LOCALIZATION BASED ON MAXIMUM LIKELIHOOD
    Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2009, 2009 IEEE/SP 15TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING, VOLS 1 AND 2, 461 - 464, English
    [Refereed]
    International conference proceedings

  • MATHEMATICAL MODELING OF HARMONIC-TIMBRE STRUCTURE WITH MULTI-BETA-DISTRIBUTION
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    2009, 2009 IEEE/SP 15TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING, VOLS 1 AND 2, pp. 769-772, 768 - 771, English
    [Refereed]
    International conference proceedings

  • System Request Detection in Human Conversation Based on Multi-Resolution Gabor Wavelet Features
    Tomoyuki Yamagata, Tetsuya Takiguchi, Yasuo Ariki
    2009, INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, pp. 256-259, 284 - 287, English
    [Refereed]
    International conference proceedings

  • Gradient-Based Acoustic Features for Speech Recognition
    Takashi Muroi, Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki
    2009, 2009 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ISPACS 2009), 445 - 448, English
    [Refereed]
    International conference proceedings

  • Improvement of In-Car Speech Recognition by Acoustic Echo Canceller with Maximum Likelihood
    Koga Kentaro, Fukuda Shinji, Takiguchi Tetsuya, Ariki Yasuo
    Nov. 2008, 15th World Congress on ITS, CD-ROM, English
    [Refereed]
    International conference proceedings

  • Tagging Video Contents Based on Interest Estimation from Facial Expression
    MIYAHARA Masanori, AOKI Masaki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The amount of video content available to users has become enormous in recent years, making it increasingly difficult for users to find the content they like. We therefore propose a system that films the user watching video content, estimates the user's level of interest from facial expression, and tags the content accordingly for use in program recommendation. Elastic Bunch Graph Matching extracts facial feature points from the captured face and identifies the individual, and Support Vector Machines then estimate the interest class for the identified person. There are four interest classes, Neutral, Positive, Negative, and Rejective, and tags are recorded frame by frame in synchronization with the video content. In evaluation experiments, interest class estimation achieved an average recall of 86.73% and an average precision of 86.67%.
    Information Processing Society of Japan, Oct. 2008, Journal of Information Processing Society of Japan, Vol. 49, No. 10, 3694 - 3702, Japanese
    [Refereed]
    Scientific journal

  • 井上 淳一, 滝口 哲也, 有木 康雄
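    Research Center for Urban Safety and Security, Kobe University, Mar. 2008, Research Report of the Research Center for Urban Safety and Security, Kobe University, 12, 91 - 102, Japanese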

  • Multiple Classifier Based on Fuzzy C-Means for a Flower Image Retrieval
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2008, NCSP, pp. 76-79, English
    [Refereed]
    International conference proceedings

  • Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2008, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E91D(3), 522 - 528, English
    [Refereed]
    Scientific journal

  • TAKIGUCHI Tetsuya, TAKASHIMA Ryoichi, ARIKI Yasuo
    Research Center for Urban Safety and Security, Kobe University, Mar. 2008, NCSP, pp. 9-12(12) (12), 103 - 108, Japanese
    [Refereed]
    International conference proceedings

  • Research center for urban safety and security kobe university
    Kentaro Koga, Shinji Fukuda, Tetsuya Takiguchi, Yasuo Ariki
    2008, 15th World Congress on Intelligent Transport Systems and ITS America Annual Meeting 2008, 5, 3316 - 3319, English
    International conference proceedings

  • Automating Why Text Segment Classification and Answer Extraction with AdaBoost/LogitBoost
    TANAKA KATSUYUKI, TAKIGUCHI TETSUYA, ARIKI YASUO
    Conventional question-answering systems are mainly factoid systems, which give fact-based answers to questions involving What, Where, and Who. Few studies address Why-type questions that ask for a cause ("Why is ...?") or How-type questions that ask for a method ("How can ... be done?"). This study proposes a method that uses machine learning to automatically classify text segments in documents on the Internet as Why or not, and to identify the answer sentences using Cases distinguished by the relative positions of the fact sentence and the reason sentence within a segment. Why classification reached an F-measure of about 80%, and the F-measure of each class in answer extraction also improved.
    Information Processing Society of Japan, 2008, Journal of Information Processing Society of Japan, 49(6), 2234 - 2242, Japanese
    [Refereed]
    Scientific journal

  • Tagging video contents with positive/negative interest based on user's facial expression
    Masanori Miyahara, Masaki Aoki, Tetsuya Takiguchi, Yasuo Ariki
    2008, ADVANCES IN MULTIMEDIA MODELING, PROCEEDINGS, 4903, 210 - +, English
    [Refereed]
    International conference proceedings

  • Speaker independent phoneme recognition based on Fisher weight map
    Takashi Muroi, Tetsuya Takiguchi, Yasuo Ariki
    2008, MUE: 2008 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND UBIQUITOUS ENGINEERING, PROCEEDINGS, 253 - 257, English
    [Refereed]
    International conference proceedings

  • Tetsuya Takiguchi, Tomoyuki Yamagata, Atsushi Sako, Nobuyuki Miyake, Jerome Revaud, Yasuo Ariki
    2008, MUE: 2008 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND UBIQUITOUS ENGINEERING, PROCEEDINGS, pp. 304-309, 304 - +, English
    [Refereed]
    International conference proceedings

  • Tetsuya Takiguchi, Jun Adachi, Yasuo Ariki
    2008, MUE: 2008 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND UBIQUITOUS ENGINEERING, PROCEEDINGS, pp. 282-287, 282 - +, English
    [Refereed]
    International conference proceedings

  • Integration of phoneme-subspaces using ICA for speech feature extraction and recognition
    Hyunsin Park, Tetsuya Takiguchi, Yasuo Ariki
    2008, 2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS, pp. 148-151, 149 - 152, English
    [Refereed]
    International conference proceedings

  • Active microphone with parabolic reflection board for estimation of sound source direction
    Tetsuya Takiguchi, Ryoichi Takashima, Yasuo Ariki
    2008, 2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS, pp. 65-68, 66 - 69, English
    [Refereed]
    International conference proceedings

  • GRAPH CUTS BY USING LOCAL TEXTURE FEATURES OF WAVELET COEFFICIENT FOR IMAGE SEGMENTATION
    Keita Fukuda, Tetsuya Takiguchi, Yasuo Ariki
    2008, 2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, pp. 881-884, 881 - +, English
    [Refereed]
    International conference proceedings

  • DIGITAL CAMERA WORK FOR SOCCER VIDEO PRODUCTION WITH EVENT RECOGNITION AND ACCURATE BALL TRACKING BY SWITCHING SEARCH METHOD
    Yasuo Ariki, Tetsuya Takiguchi, Kazuki Yano
    2008, 2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, pp. 889-892, 889 - +, English
    [Refereed]
    International conference proceedings

  • Takashi Muroi, Tetsuya Takiguchi, Yasuo Ariki
    2008, Proceedings - 2008 International Conference on Multimedia and Ubiquitous Engineering, MUE 2008, Vol. 1, No. 3, pp. 61-70, 253 - 257, English
    [Refereed]
    International conference proceedings

  • Tetsuya Takiguchi, Tomoyuki Yamagata, Atsushi Sako, Nobuyuki Miyake, Jerome Revaud, Yasuo Ariki
    2008, MUE: 2008 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND UBIQUITOUS ENGINEERING, PROCEEDINGS, Vol. 1, No. 3, pp. 81-90, 304 - +, English
    [Refereed]
    International conference proceedings

  • Tetsuya Takiguchi, Jun Adachi, Yasuo Ariki
    2008, MUE: 2008 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND UBIQUITOUS ENGINEERING, PROCEEDINGS, Vol. 1, No. 3, pp. 71-80, 282 - +, English
    [Refereed]
    International conference proceedings

  • Sudden Noise Reduction Based on GMM with Noise Power Estimation
    Nobuyuki Miyake, Tetsuya Takiguchi, Yasuo Ariki
    2008, INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 403 - 406, English
    [Refereed]
    International conference proceedings

  • Integration of Metamodel and Acoustic Model for Speech Recognition
    Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li, Toshitaka Nakabayashi
    2008, INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, pp. 2234-2237, 2234 - +, English
    [Refereed]
    International conference proceedings

  • Object Recognition and Segmentation Using SIFT and Graph Cuts
    Akira Suga, Keita Fukuda, Tetsuya Takiguchi, Yasuo Ariki
    2008, 19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, CD-ROM, 1179 - +, English
    [Refereed]
    International conference proceedings

  • 3D Human Posture Estimation Using the HOG Features from Monocular Image
    Katsunori Onishi, Tetsuya Takiguchi, Yasuo Ariki
    2008, 19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, CD-ROM, 1466 - +, English
    [Refereed]
    International conference proceedings

  • Improved Camera Work Analysis Method for Online System Using Luminance-Projection Correlation and Split Tensor Histogram
    ARIKI Yasuo, KUMANO Masahito, UEHARA Kuniaki
    Aug. 2007, The Institute of Image Information and Television Engineers, Vol.61,No.8,pp.1159-1167, Japanese
    [Refereed]
    Scientific journal

  • Masakiyo Fujimoto, Yasuo Ariki
    Mar. 2007, Systems and Computers in Japan, 38(3), 23 - 38, English
    Scientific journal

  • 有木 康雄, 滝口 哲也, 松政 宏典
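    Research Center for Urban Safety and Security, Kobe University, Mar. 2007, Research Report of the Research Center for Urban Safety and Security, Kobe University, 11, 191 - 196, Japanese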

  • Toshiya Ohkubo, Tetsuya Takiguchi, Yasuo Ariki
    Mar. 2007, Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, 46(2-3), 123 - 131, English
    [Refereed]
    Scientific journal

  • 実時間カメラワーク評価に基づく単一ショット訓練指向型オンライン映像処理ナビゲーションシステム ~映像文法を背景とした映像撮影学習システムに向けて~
    Kumano Masahito, Ariki Yasuo, Uehara Kuniaki
    2007, 映像情報メディア学会誌, Vol.61, No.8, pp.1150-1158(8) (8), 1150 - 1158, Japanese
    [Refereed]
    Scientific journal

  • 輝度投影相関と二分化テンソルヒストグラムを併用したオンライン処理向けカメラワーク解析法の精度向上 ~訓練指向型オンライン映像撮影ナビゲーションシステム~
    Kumano Masahito, Ariki Yasuo, Uehara Kuniaki
    2007, 映像情報メディア学会誌, Vol.61, No.8, pp.1159-1167, Japanese
    [Refereed]
    Scientific journal

  • Tetsuya Takiguchi, Yasuo Ariki
    Academy Publisher, 2007, Journal of Multimedia, 2(5) (5), 13 - 18, English
    [Refereed]
    Scientific journal

  • Masahito Kumano, Yasuo Ariki, Kuniaki Uehara
    Inst. of Image Information and Television Engineers, 2007, Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers, 61(8) (8), 1159 - 1167, Japanese
    [Refereed]
    Scientific journal

  • Noise detection and classification in speech signals with boosting
    Nobuyuki Miyake, Tetsuya Takiguchi, Yasuo Ariki
    2007, 2007 IEEE/SP 14TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING, VOLS 1 AND 2, pp. 778-782, 778 - 782, English
    [Refereed]
    International conference proceedings

  • Estimation of room acoustic transfer function using speech model
    Tetsuya Takiguchi, Yuji Sumida, Yasuo Ariki
    2007, 2007 IEEE/SP 14TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING, VOLS 1 AND 2, pp. 336-340, 336 - 340, English
    [Refereed]
    International conference proceedings

  • Masaki Aoki, Ken Masuda, Hiroyoshi Matsuda, Tetsuya Takiguchi, Yasuo Ariki
    2007, Proceedings of the ACM International Multimedia Conference and Exhibition, pp. 561-564, 561 - 564, English
    [Refereed]
    International conference proceedings

  • System Request Detection in Conversation Based on Acoustic and Speaker Alternation Features
    Tomoyuki Yamagata, Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki
    2007, INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, pp. 2789-2792, 2776 - +, English
    [Refereed]
    International conference proceedings

  • PCA-Based Feature Extraction for Fluctuation in Speaking Style of Articulation Disorders
    Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li, Toshitaka Nakabayashi
    2007, INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, pp. 1150-1153, 1425 - +, English
    [Refereed]
    International conference proceedings

  • Language Modeling using PLSA-Based Topic HMM
    Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki
    2007, INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, pp. 606-609, 2924 - +, English
    [Refereed]
    International conference proceedings

  • A study on robust feature extraction using kernel PCA in reverberant environments
    TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jun. 2006, IPSJ Transactions, Vol. 47, No. 6, pp. 1767-1773, Japanese
    [Refereed]
    Scientific journal

  • 有木 康雄, 滝口 哲也, 住田 雄司
    神戸大学都市安全研究センター, Mar. 2006, 神戸大学都市安全研究センター研究報告, 10, 117 - 124, Japanese

  • T Takiguchi, M Nishimura, Y Ariki
    Mar. 2006, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, E89D(3) (3), 908 - 914, English
    [Refereed]
    Scientific journal

  • Segmentation of Goods Catalog Video into Individual Goods Section by Combination of Speech and Image Information
    FUJIMOTO Masakiyo, ARIKI Yasuo, MATSUMOTO Hiroshi
    2006, 電子情報通信学会論文誌, Vol. J89-DII, No. 2, pp. 292-3, Japanese
    [Refereed]
    Scientific journal

  • Person Recognition for News Videos through Multi-modal Interaction
    FUJIMOTO Masakiyo, ARIKI Yasuo, DOSHITA Shuji
    2006, 日本音響学会論文誌, Vol. 62, No. 3, pp. 182-192, Japanese
    [Refereed]
    Scientific journal

  • Robust feature extraction using kernel PCA
    Tetsuya Takiguchi, Yasuo Ariki
    2006, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, pp. 509-512, 509 - 512, English
    [Refereed]
    International conference proceedings

  • Online training-oriented video shooting navigation system based on real-time camerawork evaluation
    M. Kumano, K. Uehara, Y. Ariki
    2006, 2006 IEEE International Conference on Multimedia and Expo - ICME 2006, Vols 1-5, Proceedings, CD-ROM, pp.1281-1284, 1281 - 1284, English
    [Refereed]
    International conference proceedings

  • Phoneme Recognition Based on Fisher Weight Map to Higher-Order Local Auto-Correlation
    Yasuo Ariki, Shunsuke Kato, Tetsuya Takiguchi
    2006, INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, pp. 377-380, 377 - 380, English
    [Refereed]
    International conference proceedings

  • Automatic production system of soccer sports video by digital camera work based on situation recognition
    Yasuo Ariki, Shintaro Kubota, Masahito Kumano
    2006, ISM 2006: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, PROCEEDINGS, pp.851-858, 851 - 858, English
    [Refereed]
    International conference proceedings

  • Yasuo Ariki, Jun Ogata, Masakiyo Fujimoto, Kiyoshi Tsukada
    Jul. 2005, Systems and Computers in Japan, 36(8) (8), 40 - 48, English
    Scientific journal

  • 有木 康雄, 滝口 哲也, 大久保 俊也
    神戸大学都市安全研究センター, Mar. 2005, 神戸大学都市安全研究センター研究報告, 9, 179 - 185, Japanese

  • Masahito Kumano, Yasuo Ariki, Kiyoshi Tsukada
    Inst. of Image Information and Television Engineers, 2005, Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers, 59(2) (2), 271 - 278, Japanese
    Scientific journal

  • 熊野 雅仁, 有木 康雄, 塚田 清志
    To replay baseball highlight scenes in live broadcasts to baseball fans outside, image processing, such as analysis, meta information extraction, and automatic editing, has to be performed in real time. This paper proposes high-speed image processing that automatically extracts PC (Pitcher and Catcher) scenes from live broadcasts of a baseball game in real time using a feature mining technique as a part of baseball highlight scene delivery. This method achieves an F-measure of 97.2% and a processing speed 30 times faster than actual time.
    The Institute of Image Information and Television Engineers, 2005, 映像情報メディア学会誌, 59, 1, 77-84(1) (1), 77 - 84, Japanese
    [Refereed]
    Scientific journal

  • ボールと選手に着目したディジタルカメラワークの実現法 -ディジタルシューティングによるサッカー解説映像生成システムに向けて-
    熊野 雅仁, 有木 康雄, 塚田 清志
    2005, 映像情報メディア学会誌, 59, 2, 271-278, Japanese
    [Refereed]
    Scientific journal

  • Structuring baseball live games based on speech recognition using task dependent knowledge and emotion state recognition
    A Sako, Y Ariki
    2005, 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5, 1049-1052, 1049 - 1052, English
    [Refereed]
    International conference proceedings

  • Situation Based Speech Recognition for Structuring Baseball Live Games
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    2005, Interspeech, pp. 3453-3456, English
    [Refereed]
    International conference proceedings

  • Yasuo Ariki, Tetsuya Takiguchi, Atsushi Sako
    2005, Proceedings of the 13th ACM International Conference on Multimedia, MM 2005, pp.355-358, 355 - 358, English
    [Refereed]
    International conference proceedings

  • GMMに基づく音声信号推定法と時間領域SVDに基づく音声強調法の併用による雑音下音声認識
    藤本 雅清, 有木 康雄
    2005, 電子情報通信学会論文誌, J88-D-II, 2, 250-265, Japanese
    [Refereed]
    Scientific journal

  • Additive and Convolutive Noise Suppression Method Based on GMM and EM Algorithm
    FUJIMOTO Masakiyo, ARIKI Yasuo
    2005, 電子情報通信学会論文誌, Vol. J88-DII, No. 7, pp. 1093-, Japanese
    [Refereed]
    Scientific journal

  • Two-channel-based noise reduction in a complex spectrum plane for hands-free communication system
    T Ohkubo, T Takiguchi, Y Ariki
    2005, ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2005, PT 2, 3768, 923 - 934, English
    [Refereed]
    Scientific journal

  • Masakiyo Fujimoto, Yasuo Ariki
    John Wiley and Sons Inc., 2004, Systems and Computers in Japan, 35(3) (3), 46 - 57, English
    Scientific journal

  • 音響・言語モデルの適応処理によるスポーツ実況中継の音声認識
    有木 康雄, 緒方 淳, 藤本 雅清, 塚田 清志
    2004, 電子情報通信学会論文誌, J87-D-II, 6, 1208-1215, Japanese
    [Refereed]
    Scientific journal

  • Video shooting navigation system by real-time useful shot discrimination based on video grammar
    K Uehara, M Amano, Y Ariki, M Kumano
    2004, 2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1-3, CD-ROM, 583 - 586, English
    [Refereed]
    International conference proceedings

  • Structuring of Baseball Live Games Based on Speech Recognition Using Task Dependent Knowledge
    SAKO Atsushi, ARIKI Yasuo
    2004, Interspeech 2004 ICSLP, 446-449, English
    [Refereed]
    Scientific journal

  • Robust speech recognition in additive and channel noise environments using GMM and EM algorithm
    M Fujimoto, Y Ariki
    2004, 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS, I-941-I-944, 941 - 944, English
    [Refereed]
    International conference proceedings

  • Automatic extraction of PC scenes based on feature mining for a real time delivery system of baseball highlight scenes
    M Kumano, Y Ariki, K Tsukada, S Hamaguchi, H Kiyose
    2004, 2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1-3, CD-ROM, 277 - 280, English
    [Refereed]
    International conference proceedings

  • A method of digital camera work focused on players and a ball - Toward automatic contents production system of commentary soccer video by digital shooting
    M Kumano, Y Ariki, K Tsukada
    2004, ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2004, PT 3, PROCEEDINGS, 3333, 466 - 473, English
    [Refereed]
    Scientific journal

  • Yasuo Ariki, Masahito Kumano, Kiyoshi Tsukada
    Association for Computing Machinery, Inc, Nov. 2003, Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, MIR 2003, 209 - 214, English
    [Refereed]
    International conference proceedings

  • Masakiyo Fujimoto, Yasuo Ariki, Shuji Doshita
    Nov. 2003, Acoustical Science and Technology, 24(6) (6), 379 - 381, English
    [Refereed]
    Scientific journal

  • Masahito Kumano, Yasuo Ariki, Kenji Shunto, Kiyoshi Tsukada
    Inst. of Image Information and Television Engineers, 2003, Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers, 57(7) (7), 829 - 839, Japanese
    Scientific journal

  • 日本語話し言葉音声認識のための音節に基づく音響モデリング
    緒方 淳, 有木 康雄
    2003, 電子情報通信学会論文誌, Vol.J86-D-II No.11 1523-1530(11) (11), 1523 - 1530, Japanese
    [Refereed]
    Scientific journal

  • 映像文法に基づく映像編集支援システム
    天野 美紀, 上原 邦昭, 熊野 雅仁, 有木 康雄, 下条 真司, 春藤 憲司, 塚田 清志
    2003, 情報処理学会論文誌, 44(03) 915-924(3) (3), 915 - 924, Japanese
    [Refereed]
    Scientific journal

  • 映像文法に基づいた映像編集支援システムのための使用可能なショット区間の自動抽出
    熊野 雅仁, 有木 康雄, 春藤 憲司, 塚田 清志
    2003, 映像情報メディア学会誌, Vol.57 No.7 829-839, Japanese
    [Refereed]
    Scientific journal

  • マルチメディア情報の高次処理
    馬場口 登, 上原 邦昭, 有木 康雄
    2003, 人工知能学会誌, Vol.18,No.3,pp.307-316, Japanese
    [Refereed]
    Scientific journal

  • Topic Segmentation and Retrieval System for Lecture Videos Based on Spontaneous Speech Recognition
    YAMAMOTO Natsuo, OGATA Jun, ARIKI Yasuo
    2003, EuroSpeech2003, 961-964, English
    [Refereed]
    Scientific journal

  • Syllable-Based Acoustic Modeling for Japanese Spontaneous Speech Recognition
    OGATA Jun, ARIKI Yasuo
    2003, EuroSpeech2003, 2513-2516, 2513 - 2516, English
    [Refereed]
    Scientific journal

  • Speaker Naming System by Associating Speech and Speaker Recognition Results
    ARIKI Yasuo, NISHIDA M
    2003, 2003 ISCA Workshop on Multilingual Spoken Document Retrieval(MSDR2003), 61-66, English
    [Refereed]
    Scientific journal

  • Live Speech Recognition in Sports Games by Adaptation of Acoustic Model and Language Model
    ARIKI Yasuo, SHIGEMORI Takeru, KANEKO Tsuyoshi, OGATA Jun, FUJIMOTO Masakiyo
    2003, EuroSpeech2003, 1453-1456, English
    [Refereed]
    Scientific journal

  • Human Information Retrieval Based on Face Recognition in Video Image through Multi-modal Interaction Using Speech and Hand Pointing Action
    ARIKI Yasuo, FUJIMOTO Masakiyo, YAMAMOTO Natsuo, KUMANO Masahito
    2003, HCI International 2003, Vol.II 586-590, English
    [Refereed]
    Scientific journal

  • Full Automatic Segmentation of Goods Catalog Video into Individual Goods Section by Integrating Speech and Image Information
    FUJIMOTO Masakiyo, ARIKI Yasuo, MATSUMOTO Hiroshi
    2003, 3rd International Workshop on Content-Based Multimedia Indexing (CBMI'03), 35-40, English
    [Refereed]
    Scientific journal

  • Combination of Temporal Domain SVD Based Speech Enhancement and GMM Based Speech Estimation for ASR in Noise - Evaluation on the AURORA2 Task -
    FUJIMOTO Masakiyo, ARIKI Yasuo
    2003, EuroSpeech2003, 1781-1784, English
    [Refereed]
    Scientific journal

  • AUTOMATIC SHOT SIZE INDEXING FOR A VIDEO EDITING SUPPORT SYSTEM
    KUMANO Masahito, ARIKI Yasuo, TSUKADA Kiyoshi, SHUNTO Kenji
    2003, 3rd International Workshop on Content-Based Multimedia Indexing (CBMI'03), 57-62, English
    [Refereed]
    Scientific journal

  • M Kumano, Y Ariki, K Uehara, S Shimojo, K Shunto, K Tsukada
    2003, ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 86(11) (11), 61 - 71, English
    [Refereed]
    Scientific journal

  • 映像文法のためのカット先読み機構を備えた自動ダイジェスト生成システム
    西澤 尚宏, 鎌原 淳三, 春藤 憲司, 塚田 清志, 有木 康雄, 上原 邦昭, 下條真司, 宮原 秀夫
    Mar. 2002, Japanese
    Research institution

  • Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the AURORA2 tasks
    M. Fujimoto, Y. Ariki
    International Speech Communication Association, 2002, 7th International Conference on Spoken Language Processing, ICSLP 2002, 465 - 468, English
    International conference proceedings

  • M. Fujimoto, Y. Ariki
    Institute of Electrical and Electronics Engineers Inc., 2002, Proceedings of 2002 IEEE Workshop on Multimedia Signal Processing, MMSP 2002, 268 - 271, English
    International conference proceedings

  • Video editing support system based on video grammar and content analysis
    Masahito Kumano, Yasuo Ariki, Miki Amano, Kuniaki Uehara, Kenji Shunto, Kiyoshi Tsukada
    2002, Proceedings - International Conference on Pattern Recognition, 16(2) (2), 1031 - 1036, English
    International conference proceedings

  • An Advanced Multimedia Content Processing for the Broadband Internet Services
    Shimojo, S, Nishio, S, Tanaka, K, Ariki, Y, Uehara, K, Tsukamoto, M, Arikawa, M, Tajima, K, Harumoto, K
    Jul. 2001, English
    [Invited]
    Research society

  • Speech recognition under musical environments using kalman filter and iterative MLLR adaptation
    M. Fujimoto, Y. Ariki
    International Speech Communication Association, 2001, EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology, 1879 - 1882, English
    International conference proceedings

  • Continuous speech recognition under non-stationary musical environments based on speech state transition model
    M. Fujimoto, Y. Ariki
    2001, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1, 297 - 300, English
    International conference proceedings

  • Shojiro Nishio, Katsumi Tanaka, Yasuo Ariki, Shinji Shimojo, Masahiko Tsukamoto, Masatoshi Arikawa, Keishi Tajima, Kaname Harumoto
    [Migrated from the Faculty of Engineering publication data]
    Sep. 2000, Proceedings of ADBIS-DASFAA Symposium on Advances in Databases and Information Systems, J. Stuller, et al., Eds.: Lecture Notes in Computer Science 1884, Springer, English
    [Invited]
    International conference proceedings

  • Multimedia technologies for structuring and retrieval of TV news
    Y Ariki
    2000, NEW GENERATION COMPUTING, 18(4) (4), 341 - 357, English
    [Refereed]
    Scientific journal

  • Noisy speech recognition using noise reduction method based on Kalman filter
    M Fujimoto, Y Ariki
    2000, 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 1727 - 1730, English
    [Refereed]
    International conference proceedings

  • ARIKI Yasuo
    The Institute of Image Information and Television Engineers, Jan. 1999, The Journal of the Institute of Television Engineers of Japan, 53(1) (1), 34 - 40, Japanese

■ MISC
  • Dysarthric Speech Recognition Based on Deep Metric Learning
    高島 悠樹, 高島 遼一, 滝口 哲也, 有木 康雄
    電子情報通信学会, 02 Mar. 2020, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 119(439) (439), 181 - 186, Japanese

  • Dysarthric Speech Recognition Based on Deep Metric Learning
    高島 悠樹, 高島 遼一, 滝口 哲也, 有木 康雄
    電子情報通信学会, 02 Mar. 2020, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 119(440) (440), 181 - 186, Japanese

  • Differentiable Programmingを用いた強化学習の最適化
    黄 伊莎, トリスタン ハスクウェト, 高島 遼一, 滝口 哲也, 有木 康雄
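    Machine learning and functional programming share many similarities, and in recent years the idea of differentiable programming, which links the two, has emerged. This approach differs substantially from earlier ones in that gradients can be obtained by adjusting parameters directly. It is therefore expected to be applicable to a variety of fields, such as physical simulation. Using reinforcement learning benchmarks, we compare the DQN method with the differentiable programming approach, and we investigate and explain the effect and learning dynamics of differentiable programming.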
    20 Feb. 2020, 第82回全国大会講演論文集, 2020(1) (1), 267 - 268, Japanese

  • ニューロンセグメンテーションにおけるマルチドメイン学習による汎化性能の改善
    長谷川 貴大, Tristan Hascoet, 高島 遼一, 滝口 哲也, 有木 康雄
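    In connectomics, the study of mapping neural circuits across the whole brain, it is important to identify individual neurons in electron microscopy images of the brain. Because both acquiring and annotating data are very costly when segmenting neurons automatically with deep learning, transfer learning is one of the leading options. In this paper, we use a deep learning model called U-Net to examine the generalization performance of a model trained on public datasets from multiple domains. We also attempt to improve accuracy while reducing the cost of transfer learning on the dataset of the target domain.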
    20 Feb. 2020, 第82回全国大会講演論文集, 2020(1) (1), 169 - 170, Japanese

  • Discriminative features in brain magnetic fields during auditory speech sound imagery using convolutional neural networks
    矢野彩緒里, 高島遼一, 滝口哲也, 有木康雄, 添田喜治, 中川誠司
    2020, 日本音響学会研究発表会講演論文集(CD-ROM), 2020

  • Transfer Learning Using the Speech Data of Persons with Dysarthria Speaking Different Languages for Dysarthric Speech Recognition
    高島 悠樹, 高島 遼一, 滝口 哲也, 有木 康雄
    電子情報通信学会, 26 Oct. 2019, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 119(251) (251), 45 - 50, Japanese

  • 脳磁界データによる音声の識別―想起時と聴取時の比較―
    矢野彩緒里, 高島遼一, 滝口哲也, 有木康雄, 添田喜治, 中川誠司
    21 Aug. 2019, 日本音響学会研究発表会講演論文集(CD-ROM), 2019, ROMBUNNO.3‐P‐13, Japanese

  • 音声明瞭度に関連した脳磁界反応:聴覚野および知覚性言語野における解析
    嵯峨直樹, 矢野肇, 滝口哲也, 有木康雄, 添田喜治, 中川誠司
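    Using magnetoencephalography, we investigated responses related to the intelligibility of perceived speech, focusing on the auditory cortex and the sensory language area.
    White noise at 50 dB was used as background noise, and the sound pressure level of Japanese monosyllables was adjusted so that the SN ratio relative to the noise level was 0, 6, 12, 18, or 24 dB. Using these stimuli, we conducted an intelligibility test and magnetoencephalographic measurements. Sources of brain activity were estimated from the evoked magnetic field data, and changes in activation strength in the auditory cortex and the sensory language area were examined. The results suggest possible processing of linguistic information at latencies of 250 to 500 ms in the posterior part of the right superior temporal gyrus.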
    Sep. 2018, 日本音響学会2018年秋季研究発表会講演論文集, 485 - 488, Japanese
    Summary national conference

  • Alternating Direction Method of Multipliersを用いた声質変換のためのパラレル辞書学習 (音声) -- (第17回音声言語シンポジウム)
    相原 龍, 滝口 哲也, 有木 康雄
    電子情報通信学会, 02 Dec. 2015, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 115(346) (346), 13 - 18, Japanese

  • Detection of facial parts using deformable part model
    西田 和博, 榎並 直子, 有木 康雄
    日本音響学会, 18 Jun. 2015, 聴覚研究会資料 = Proceedings of the auditory research meeting, 45(4) (4), 211 - 216, Japanese

  • Detection of facial parts using deformable part model
    西田 和博, 榎並 直子, 有木 康雄
    電子情報通信学会, 18 Jun. 2015, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 115(100) (100), 7 - 12, Japanese

  • Detection of facial parts using deformable part model
    西田 和博, 榎並 直子, 有木 康雄
    電子情報通信学会, 18 Jun. 2015, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 115(99) (99), 7 - 12, Japanese

  • Detection of facial parts using deformable part model
    西田 和博, 榎並 直子, 有木 康雄
    電子情報通信学会, 18 Jun. 2015, 電子情報通信学会技術研究報告 = IEICE technical report : 信学技報, 115(98) (98), 7 - 12, Japanese

  • 脳磁界計測によるエアコン音の"涼しさ"の印象評価の試み
    矢野 肇, 保手浜 拓也, 滝口 哲也, 有木 康雄, 神谷 勝, 中川 誠司
    日本生体磁気学会, Jun. 2015, 日本生体磁気学会誌, 28(1) (1), 106 - 107, Japanese

  • Voice Conversion Using Speaker Adaptive Restricted Boltzmann Machine
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    Voice conversion (VC) is a technique where only speaker-specific information in source speech is converted while keeping phonological information. The technique can be applied to various tasks such as speaker-identity conversion, emotion conversion and aid to speaking for people with articulation disorders. Most of the existing VC methods rely on parallel data (pairs of speech data from source and target speakers uttering the same articles). However, this approach involves several problems; firstly, the data used for the training is limited to the pre-defined articles. Secondly, the use of the trained model is limited only to the speaker pair used in the training. In this paper, we propose a novel probabilistic model called an adaptive restricted Boltzmann machine (ARBM) for VC between arbitrary speakers without use of parallel data. This model consists of a visible-unit and a hidden-unit layer with the speaker-dependent connection. In this paper, we report our experimental results of arbitrary-speaker VC using our model, an ARBM.
    Information Processing Society of Japan (IPSJ), 08 Dec. 2014, IPSJ SIG Notes, 2014(30) (30), 1 - 6, Japanese

  • Multimodal Voice Conversion using Weighted Features in Noisy Environments
    Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Voice conversion is a technique for converting specific information in speech while maintaining the other information, such as linguistic information. This technique has been applied to various tasks, for example speaker conversion, emotion conversion, and speaking assistance. The GMM-based method is the conventional VC method and is widely used. In noisy environments, the GMM-based method cannot convert the speech well, because it cannot model the noisy signal well. Therefore, we have been researching a noise-robust VC method using Non-negative Matrix Factorization (NMF). In this paper, we propose a multimodal VC method that improves the noise robustness of our previous exemplar-based VC method. Furthermore, we introduce the combination weight between audio and visual features and formulate a new cost function in order to estimate the audio-visual exemplars. By using the joint audio-visual features as source features, the VC performance is improved compared to a previous audio-input exemplar-based VC method. The effectiveness of this method was confirmed by comparing it with that of the conventional audio-input NMF-based method and the conventional GMM-based method.
    Information Processing Society of Japan (IPSJ), 08 Dec. 2014, IPSJ SIG Notes, 2014(17) (17), 1 - 6, Japanese

  • Many-to-one Voice Conversion using Multiple Non-negative Matrix Factorization
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    Voice conversion (VC) is being widely researched in the field of speech processing because of increased interest in using such processing in applications such as personalized Text-To-Speech systems. The statistical approach using a Gaussian Mixture Model (GMM) is widely researched in VC, and eigen-voice GMM enables one-to-many and many-to-one VC from multiple training data sets. We present in this paper an exemplar-based VC method using Non-negative Matrix Factorization (NMF), which is different from conventional statistical VC. NMF-based VC has advantages of noise robustness and naturalness of converted voice compared to GMM-based VC. However, because NMF-based VC is based on parallel training data of the source and target speakers, we cannot convert the voice of arbitrary speakers in this framework. In this paper, we propose a many-to-one VC using Multiple Non-negative Matrix Factorization (Multi-NMF). By using Multi-NMF, an arbitrary speaker's voice is converted to the target speaker's voice without any training data from the input speaker. We assume that this method is flexible because we can adapt it to many-to-many VC or voice quality control.
    Information Processing Society of Japan (IPSJ), 08 Dec. 2014, IPSJ SIG Notes, 2014(15) (15), 1 - 6, Japanese

  • Individuality-preserving Voice Conversion for Articulation Disorders Using Sparse Dictionary Learning
    相原 龍, 滝口 哲也, 有木 康雄
    日本音響学会, 19 Jun. 2014, 聴覚研究会資料 = Proceedings of the auditory research meeting, 44(5) (5), 283 - 288, Japanese

  • A joint restricted Boltzmann machine for dictionary learning in sparse-representation-based voice conversion
    中鹿 亘, 滝口 哲也, 有木 康雄
    In voice conversion, sparse-representation-based methods have recently been garnering attention because they are relatively unaffected by over-fitting and over-smoothing problems. In these approaches, voice conversion is achieved by estimating a sparse vector that determines which dictionaries of the target speaker should be used, calculated from the matching of the input vector and dictionaries of the source speaker. The sparse-representation-based voice conversion methods can be broadly divided into two approaches: 1) an approach that uses raw acoustic features in the training data as parallel dictionaries, and 2) an approach that trains parallel dictionaries from the training data. Our approach belongs to the latter; we systematically estimate the parallel dictionaries using a restricted Boltzmann machine (RBM), a fundamental technology commonly used in deep learning. Through voice-conversion experiments, we confirmed the high performance of our method, comparing it with the conventional Gaussian mixture model (GMM)-based approach and a non-negative matrix factorization (NMF)-based approach, which is based on sparse representation.
    17 May 2014, 研究報告音楽情報科学(MUS), 2014(66) (66), 1 - 6, Japanese

  • Speaker-dependent conditional restricted Boltzmann machine for voice conversion
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In this paper, we present a voice conversion (VC) method that utilizes conditional restricted Boltzmann machines (CRBMs) for each speaker to obtain time-invariant speaker-independent spaces where voice features are converted more easily than those in an original acoustic feature space. First, we train two CRBMs for a source and target speaker independently using speaker-dependent training data (without the need to parallelize the training data). Then, a small number of parallel data are fed into each CRBM and the high-order features produced by the CRBMs are used to train a concatenating neural network (NN) between the two CRBMs. Finally, the entire network (the two CRBMs and the NN) is fine-tuned using the acoustic parallel data. Through voice-conversion experiments, we confirmed the high performance of our method in terms of objective and subjective evaluations, comparing it with conventional GMM, NN, and speaker-dependent DBN approaches.
    The Institute of Electronics, Information and Communication Engineers, 19 Dec. 2013, IEICE technical report. Speech, 113(366) (366), 83 - 88, Japanese

  • Voice Conversion for Articulation Disorders Using Dictionary Selective Non-negative Matrix Factorization
    Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    We present in this paper a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movement of such speakers is limited by their athetoid symptoms, and their consonants are often unstable or unclear, which makes it difficult for them to communicate. In our previous method, exemplar-based spectral conversion using Non-negative Matrix Factorization (NMF) was applied to a voice with an articulation disorder. To preserve the speaker's individuality, we used a combined dictionary that is constructed from the source speaker's vowels and target speaker's consonants. However, this exemplar-based approach needs to hold all the training exemplars (frames), and it may cause mismatching of phonemes between input signals and selected exemplars. In this paper, in order to reduce the mismatching of phoneme alignment, we propose a phoneme-categorized sub-dictionary and a dictionary selection method using NMF. By using the sub-dictionary, the performance of VC is improved compared to a conventional NMF-based VC. The effectiveness of this method was confirmed by comparing its effectiveness with that of a conventional Gaussian Mixture Model (GMM)-based method and a conventional NMF-based method.
    Information Processing Society of Japan (IPSJ), 12 Dec. 2013, IPSJ SIG Notes, 2013(12) (12), 1 - 6, Japanese

  • Voice Conversion based on Non-negative Matrix Factorization with Segment Features in Noisy Environments
    Takao Fujii, Ryo Aihara, Tetsuya Takiguchi, Yasuo Ariki
    This paper presents a voice conversion based on NMF for noisy environments. We prepared parallel exemplars that consist of the source and target exemplars, which have the same texts uttered by the source and target speakers. The input source signal is decomposed into the source exemplars, noise exemplars obtained from the input signal, and their weights. Then, the converted signal is obtained by calculating the linear combination of the target exemplars and the weights which are calculated using the source exemplars. In the proposed method, segment features are used for the voice conversion technique based on NMF in order to improve the accuracy of the weight estimation. The effectiveness of this method was confirmed by comparing its effectiveness with that of a conventional method.
    Information Processing Society of Japan (IPSJ), 12 Dec. 2013, IPSJ SIG Notes, 2013(13) (13), 1 - 6, Japanese

  • Classification of Children with Autism Spectrum and Typically Developing Children Using Pitch Features
    Yasuhiro Kakihara, Tetsuya Takiguchi, Yasuo Ariki, Yasushi Nakai, Satoshi Takada
    Recent investigations have demonstrated that early support specialized for autism spectrum disorder, such as the Picture Exchange Communication System (PECS), Applied Behavior Analysis (ABA), and Social Skills Training (SST), is effective. This paper reports the result of a classification experiment carried out using pitch features for children with autism spectrum disorder. Pitch features consist of 24 dimensions, such as 25th, 50th, 75th percentiles, 25-50 percentile difference, 50-75 percentile difference, mean, standard deviation, kurtosis, skewness, maximum, minimum, and range.
    Information Processing Society of Japan (IPSJ), 12 Dec. 2013, IPSJ SIG Notes, 2013(6) (6), 1 - 6, Japanese

  • Speaker-dependent conditional restricted Boltzmann machine for voice conversion
    Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
    In this paper, we present a voice conversion (VC) method that utilizes conditional restricted Boltzmann machines (CRBMs) for each speaker to obtain time-invariant speaker-independent spaces where voice features are converted more easily than those in an original acoustic feature space. First, we train two CRBMs for a source and target speaker independently using speaker-dependent training data (without the need to parallelize the training data). Then, a small number of parallel data are fed into each CRBM and the high-order features produced by the CRBMs are used to train a concatenating neural network (NN) between the two CRBMs. Finally, the entire network (the two CRBMs and the NN) is fine-tuned using the acoustic parallel data. Through voice-conversion experiments, we confirmed the high performance of our method in terms of objective and subjective evaluations, comparing it with conventional GMM, NN, and speaker-dependent DBN approaches.
    Information Processing Society of Japan (IPSJ), 12 Dec. 2013, IPSJ SIG Notes, 2013(14) (14), 1 - 6, Japanese

  • Proposal and evaluation of detour multipath data gathering protocol for wireless sensor networks
    藤田 圭佑, 高木 由美, 太田 能, 玉置 久, 有木 康雄
    神戸大学都市安全研究センター, Mar. 2013, 神戸大学都市安全研究センター研究報告, (17) (17), 269 - 278, Japanese

  • Two-step Correction of the Speech Recognition Result based on Syntax and Semantics
    中谷 良平, 滝口 哲也, 有木 康雄
    This paper presents a method for automatically correcting speech recognition errors on a confusion network, using a long-distance context score assigned to each word as a feature. Methods that correct recognition errors using per-word long-distance context information have already been proposed. However, a context score assigned to each word depends heavily on the recognition accuracy of the surrounding words, so it is undesirable to assign long-distance context information to a recognition result that contains many errors. Therefore, in this paper, recognition errors are first reduced by an error correction step that uses syntactic features. Long-distance context scores are then assigned to the corrected result, and the residual recognition errors are corrected in a second step using those scores as features, with the aim of further improving speech recognition accuracy.
    13 Dec. 2012, 研究報告音声言語情報処理(SLP), 2012(26) (26), 1 - 6, Japanese

  • Interpolation of unlearned position based on local regression for single-channel talker localization using acoustic transfer function
    高島 遼一, 滝口 哲也, 有木 康雄
    This paper presents a sound source (talker) localization method using only a single microphone. In our previous work, we noted that the acoustic transfer function of observed speech depends on the talker's position, and proposed a single-channel sound source localization method based on discriminating the acoustic transfer function. However, that method requires the acoustic transfer function to be trained in advance for each possible position, and it is difficult to estimate positions that have not been trained. In this paper, we discuss a single-channel talker localization method in which a regression model that predicts the position from the acoustic transfer function is trained using the acoustic transfer functions of a limited set of positions and is then used to estimate untrained positions. As regression models, we use multiple regression analysis (linear) and Gaussian Process Regression (GPR) and Support Vector Regression (SVR) (non-linear). For training, we examine local regression, which trains the regression model using only the training samples that are similar to the evaluation data. The effectiveness of this method has been confirmed by talker localization experiments performed in different room environments.
    13 Dec. 2012, 研究報告音声言語情報処理(SLP), 2012(14) (14), 1 - 6, Japanese

  • Sparse Coding-Based Voice Conversion from Lip Information
    相原 龍, 高島 遼一, 滝口 哲也, 有木 康雄
    The technology of recognizing speech content from lip motion is called lip reading, or visual speech recognition (VSR), and is an important means of communication for people with hearing or speech impairments. In this paper, we propose a sparse-coding-based voice conversion method that converts lip video into the corresponding speech without text information. Lip information and voice information are extracted in advance from videos of utterances that include audio, and each is learned as a dictionary, that is, a set of bases. The two dictionaries share the same time sequence and are therefore parallel data. Lip information extracted from an input silent video is represented by sparse coding as a linear combination of a small number of bases in the lip dictionary. The selected lip bases are replaced with the corresponding bases in the voice dictionary, and the voice is output as a linear combination of the voice bases. In this paper, we performed conversion for vowels, which are considered to be identifiable from lip information.
    13 Dec. 2012, 研究報告音声言語情報処理(SLP), 2012(21) (21), 1 - 6, Japanese

  • Discrimination of Unknown Objects from Known Objects Using Multimodal Information
    OZASA Yuko, ARIKI Yasuo, IWAHASHI Naoto, NAKANO Mikio
    Mar. 2012, PRMU, pp. 247-252, English
    Report scientific journal

  • 未知語とその周辺単語の音声認識誤りを考慮したCRFによる音声認識誤り訂正
    NAKATANI Ryohei, IWAHASHI Naoto, NAKANO Mikio, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a fully automatic word-error correction on a confusion network by employing out-of-vocabulary word modeling. In usual speech recognition, there is a problem that speech recognition systems incorrectly recognize OOV words and their neighboring words. In this paper, we add hybrid word/syllable recognition to the speech recognizer in order to make it recognize OOV words and to reduce the recognition error around OOV words. Then, we propose a CRF-based word-error correction method using acoustic and linguistic features. The proposed method can not only recognize OOV words but also correct the words neighboring OOV words.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2011, IEICE Speech Committee, SP2011-94,No.24,pp.139-144(365) (365), 139 - 144, Japanese
    Report scientific journal

  • グラフ構造表現による一般物体認識
    HORI Takahiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2011, PRMU, PRMU2011-127,pp.19-24, Japanese
    Report scientific journal

  • Detecting Unknown Objects and Unknown Names Using Multimodal Information
    OZASA Yuko, IWAHASHI Naoto, HORI Takahiro, NAKATANI Ryohei, ARIKI Yasuo, NAKANO Mikio
    Dec. 2011, SI2011, pp. 1629-1639, English
    Report scientific journal

  • H-013 Generic Object Recognition by 3D Feature-Based Structural Expression
    Hori Takahiro, Iwahashi Naoto, Nakano Mikio, Ariki Yasuo
    Forum on Information Technology, 07 Sep. 2011, 情報科学技術フォーラム講演論文集, 10(3) (3), 131 - 132, Japanese

  • 確率スペクトル包絡を用いた混合音解析における制約付きスペクトル生成法の検討
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    NMF (Non-negative matrix factorization) has been one of the most widely-used techniques for signal analysis in recent years. In particular, the supervised type of NMF is garnering much attention in source separation or signal analysis with respect to the analysis accuracy and speed. Because such methods require all the possible samples for the analysis, it is hard to build a practical analysis system. To analyze signals properly even when short of samples, we proposed a probabilistic approach called PSE (probabilistic spectrum envelope) so far, in which spectrum envelopes belonging to an auditory category are randomly generated, and the spectrum is used as a part of supervised basis matrix of NMF. However, this method has a difficulty in obtaining the optimum solution due to a lot of flexibility. In this paper, we propose a new PSE method with sparseness and density constraints which efficiently lead to the more appropriate solution.
    The Institute of Electronics, Information and Communication Engineers, Jul. 2011, IEICE Speech Committee, SP2011-50,pp. 51-56(153) (153), 51 - 56, Japanese
    Report scientific journal

  • グラフ-ベクトル変換を用いたグラフ構造表現による一般物体認識
    HORI Takahiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2011, MIRU, pp.387-394, Japanese
    Report scientific journal

  • CSP係数の識別に基づく話者の頭部方向推定の検討
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2011, IEICE Speech Committee, SP2011-51,pp. 57-62, Japanese
    Report scientific journal

  • ARCOによる顔検出を併用した人誤検出の棄却について
    YAMASHITA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2011, MIRU, pp.608-614, Japanese
    Report scientific journal

  • AAMによる顔方位を考慮した発話認識
    KOMAI Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2011, MIRU, pp.534-539, Japanese
    Report scientific journal

  • 3次元ActiveAppearanceModel を利用した視線方向推定
    NAKAMATSU Yukari, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2011, MIRU, pp.572-579, Japanese
    Report scientific journal

  • Estimation of Head Orientation Based on Discrimination of Acoustic Transfer Functions
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a talker's head orientation estimation method using only a single microphone, where phoneme HMMs (Hidden Markov Models) of clean speech are introduced to separate the acoustic transfer function at the user's position and head orientation. The frame sequence of the acoustic transfer function is estimated by maximizing the likelihood of training data uttered from a given position with a given head orientation. Using the separated frame sequence data, the user's position and the head orientation are trained by Support Vector Machine (SVM) in advance. Then, for each test utterance, the frame sequence of the acoustic transfer function is separated based on the maximum likelihood estimation using the label sequence obtained from the phoneme recognition, and the user's position and head orientation are estimated by discriminating the separated acoustic transfer function using SVM. The effectiveness of this method has been confirmed by talker localization and head orientation estimation experiments performed in a real environment.
    The Institute of Electronics, Information and Communication Engineers, 05 May 2011, IEICE technical report, 111(27) (27), 167 - 172, Japanese

  • Estimation of Head Orientation Based on Discrimination of Acoustic Transfer Functions
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a talker's head orientation estimation method using only a single microphone, where phoneme HMMs (Hidden Markov Models) of clean speech are introduced to separate the acoustic transfer function at the user's position and head orientation. The frame sequence of the acoustic transfer function is estimated by maximizing the likelihood of training data uttered from a given position with a given head orientation. Using the separated frame sequence data, the user's position and the head orientation are trained by Support Vector Machine (SVM) in advance. Then, for each test utterance, the frame sequence of the acoustic transfer function is separated based on the maximum likelihood estimation using the label sequence obtained from the phoneme recognition, and the user's position and head orientation are estimated by discriminating the separated acoustic transfer function using SVM. The effectiveness of this method has been confirmed by talker localization and head orientation estimation experiments performed in a real environment.
    The Institute of Electronics, Information and Communication Engineers, 05 May 2011, IEICE technical report, 111(26) (26), 167 - 172, Japanese

  • Estimation of Head Orientation Based on Discrimination of Acoustic Transfer Functions
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a talker's head orientation estimation method using only a single microphone, where phoneme HMMs (Hidden Markov Models) of clean speech are introduced to separate the acoustic transfer function at the user's position and head orientation. The frame sequence of the acoustic transfer function is estimated by maximizing the likelihood of training data uttered from a given position with a given head orientation. Using the separated frame sequence data, the user's position and the head orientation are trained by Support Vector Machine (SVM) in advance. Then, for each test utterance, the frame sequence of the acoustic transfer function is separated based on the maximum likelihood estimation using the label sequence obtained from the phoneme recognition, and the user's position and head orientation are estimated by discriminating the separated acoustic transfer function using SVM. The effectiveness of this method has been confirmed by talker localization and head orientation estimation experiments performed in a real environment.
    The Institute of Electronics, Information and Communication Engineers, 05 May 2011, IEICE technical report, 111(28) (28), 167 - 172, Japanese

  • Confusion Networkを用いたCRFによる音声認識誤り訂正
    NAKATANI Ryohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2011, SDPW2011, 6 pages, Japanese
    Report scientific journal

  • 確率スペクトル包絡に基づくNMF 基底生成モデルを用いた混合楽音解析
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    NMF (Non-negative Matrix Factorization) based approaches have been attracting much attention in musical signal analysis in recent years. They are roughly classified into two groups: supervised (exemplar-based) NMF, in which a large number of sound source samples are prepared in advance for analyzing a signal, and unsupervised NMF, in which signals are analyzed under some constraints without learning any samples beforehand. However, because the former requires all the possible basis samples, it is generally hard to build a practical system. The latter tends to produce unintended results because it merely decomposes the signal mechanically. In this paper, we propose a novel signal analysis method that combines NMF with a probabilistic approach. First, a spectrum envelope common to each instrument category, called a probabilistic spectrum envelope (PSE), is learned statistically using a Gaussian-process-based approach. In the analysis stage, basis vectors of NMF are randomly generated along the PSE, and the best-fitting bases are searched for by combining supervised NMF with a genetic algorithm. Finally, the musical signal is analyzed from the resulting activity matrix. The experimental results indicated that the method is robust against unknown sound sources not included in the training data and can properly analyze signals containing multiple sources.
    情報処理学会, Feb. 2011, IPSJ-SIGMUS, Vol.2011-MUS-89,No.18, pp. 1-6(18) (18), 1 - 6, Japanese
    Report scientific journal

  • Feature selection for single-channel sound source localization using the acoustic transfer function
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a sound source (talker) localization method using only a single microphone. In our previous work, we discussed the single-channel sound source localization method, where the acoustic transfer function from a user's position is estimated by using a Hidden Markov Model (HMM) of clean speech in the cepstral domain. In this paper, each cepstral dimension of the acoustic transfer function is newly selected in order to select the cepstral dimensions having information that is useful for classifying the user's position. Then, we propose a feature selection method for the cepstral parameter using Multiple Kernel Learning (MKL) to define the base kernels for each cepstral dimension (scalar) of the acoustic transfer function.
    The Institute of Electronics, Information and Communication Engineers, 20 Jan. 2011, IEICE technical report, 110(401) (401), 49 - 54, Japanese
    Report scientific journal

  • Development for Collaborative Integration of Speech and Image Recognition : 3. Audio-Visual Speech Recognition
    有木 康雄, 駒井 祐人
    情報処理学会, 15 Jan. 2011, 情報処理, 52(1) (1), 87 - 94, Japanese

  • 基底の反復生成と教師ありNMFを用いた信号解析
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2010, IEICE Speech Committee, SP2010-102,pp. 195-200, Japanese
    Report scientific journal

  • 階層的強化学習を適用したPOMDPによる音声対話制御
    KISHIMOTO Yasuhide, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2010, IEICE Speech Committee, SP2010-98,pp. 121-126, Japanese
    Report scientific journal

  • Bag of Grammarとルールベース手法を用いたドメイン依存性の少ないハイブリッド型Whyテキストセグメント判定
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The main focus of this research is to improve the accuracy of Why Text Segment classification with the Bag of Grammar method by making use of why-type keyword rules, which are extracted manually to construct a rule-based dictionary, in a Rule Based method. The idea behind using the Rule Based method is that it can supplement the why-type rules that are not covered by Bag of Grammar. We examined two different ways of exploiting the Rule Based method to build a better Why Text Segment classifier. They differ in how the Rule Based method is combined with the Bag of Grammar method. The first model simply combines the features used in the Bag of Grammar and Rule Based methods into one feature space and builds a classifier on that feature space. The second model explores schemes for combining the different classifiers built by each method to boost classification accuracy. The experiments showed that some of the combination methods provide an effective way of constructing a more accurate Why Text Segment classifier.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2010, IEICE Speech Committee, SP2010-97,pp. 103-108(356) (356), 103 - 108, Japanese
    Report scientific journal

  • Buried Markov Modelを用いた構音障害者の音声認識の検討
    MIYAMOTO Chikoto, KOMAI Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao
    Recently, the accuracy of speaker-independent speech recognition has been remarkably improved by use of stochastic modeling of speech. However, there has been very little research on orally-challenged people, such as those with speech impediments. Therefore we have tried to build the acoustic model for a person with articulation disorders. The articulation of speech tends to become unstable due to strain on speech-related muscles, and that causes degradation of speech recognition. Therefore, we consider temporal dependence to solve this problem. Though HMM makes it possible to recognize clear utterance with high accuracy, the speech including the noise or the continuous utterance causes degradation of speech recognition. To solve this problem, J. Bilmes proposed buried Markov model which contains the conditional independence between the observation nodes. In this paper, we perform phone recognition experiments using buried Markov model.
    The Institute of Electronics, Information and Communication Engineers, Oct. 2010, IEICE Speech Committee, SP2010-57, pp. 69-74(220) (220), 69 - 74, Japanese
    Report scientific journal

  • 物体領域特徴の自動選定とマルチカーネル学習を用いた特徴統合による一般物体認識
    NAKASHIKA Toru, SUGA Akira, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, OS8-2, pp. 1404-1411, Japanese
    Report scientific journal

  • 複数尤度を用いた3次元パーティクルフィルタによる選手の追跡
    NISHINO Takuro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS1-39, pp. 307-312, Japanese
    Report scientific journal

  • 地面位置の推定に基づく2次元画像からの擬似3次元復元
    ISHIBASHI Kaoru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS2-36, pp. 1011-1016, Japanese
    Report scientific journal

  • 唇領域のAAMを用いた発話認識における画像特徴量の音素解析
    KOMAI Yuto, MIYAMOTO Chikoto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS3-31,pp. 1771-1778, Japanese
    Report scientific journal

  • 視点移動カメラにおけるカメラキャリブレーション
    SOWA Tomoya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS2-37,pp.1017-1022, Japanese
    Report scientific journal

  • 高周波強調処理と入力画像の利用による学習型超解像
    OGAWA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS2-35, pp. 1004-1010, Japanese
    Report scientific journal

  • 固有空間でのモデルフィッティングによる単眼画像からの人体3次元姿勢推定
    ONISHI Katsunori, BO Geli, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS3-30, pp. 589-594, Japanese
    Report scientific journal

  • 階層的領域分割法に基づく木構造条件付確率場による一般物体認識
    OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS3-32, pp. 1779-1783, Japanese
    Report scientific journal

  • 階層的強化学習を適用したPOMDPによるカーナビゲーションシステムの音声対話制御
    KISHIMOTO Yasuhide, ARIKI Yasuo, TAKIGUCHI Tetsuya
    In this paper, we propose a dialogue manager for car navigation systems using Partially Observable Markov Decision Processes (POMDPs), which can handle ambiguous information. It can manage the dialogue even when speech recognition errors are caused by noise inside the car. We also propose a variation of the classic POMDP that incorporates hierarchical reinforcement learning, which can deal with larger tasks than a traditional system. The results confirm that the proposed method outperforms a handcrafted dialogue manager.
    The Institute of Electronics, Information and Communication Engineers, Jul. 2010, IEICE Speech Committee, SP2010-43, pp. 49-54(143) (143), 49 - 54, Japanese
    Report scientific journal

  • Image Annotation by Concept Level Search Using PLSA
    ZHENG Yu, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS1-41, pp. 319-324, Japanese
    Report scientific journal

  • Gaussian Processes for RegressionとAAMパラメータによる視線方向認識
    TAKATANI Manabu, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2010, MIRU, IS-40, pp. 315-318, Japanese
    Report scientific journal

  • バイラテラルフィルタによる実雑音下音声認識のための音声特徴量抽出
    YAMADA Kenshiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jun. 2010, IEICE Speech Committee, SP2010-29,pp. 43-48, Japanese
    Report scientific journal

  • D-11-57 Learning-Based Super-Resolution Using Wavelet Transform
    Ogawa Yuki, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 02 Mar. 2010, Proceedings of the IEICE General Conference, 2010(2) (2), 57 - 57, Japanese

  • D-12-70 Generic Object Recognition by Tree Conditional Random Field based on Hierarchical Segmentation
    Okumura Takeshi, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 02 Mar. 2010, Proceedings of the IEICE General Conference, 2010(2) (2), 181 - 181, Japanese

  • D-12-91 Soccer Player Tracking Using 3D Particle Filter and Earth Mover's Distance
    Nishino Takuro, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 02 Mar. 2010, Proceedings of the IEICE General Conference, 2010(2) (2), 202 - 202, Japanese

  • AAMを用いた唇領域特徴による音声発話認識
    KOMAI Yuto, MIYAMOTO Chikoto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jan. 2010, 電子情報通信学会技術研究報告, CQ2009-107,PRMU2009-206,SP2009, Japanese
    Report scientific journal

  • Front end for automatic speech recognition
    ARIKI Yasuo
    The Acoustical Society of Japan (ASJ), 25 Dec. 2009, The Journal of the Acoustical Society of Japan, 66(1) (1), 13 - 17, Japanese

  • 多重ベータ混合モデルを用いた調波時間構造のモデル化による音声合成の検討
    NAKASHIKA Toru, TACHIBANA Ryuki, NISHIMURA Masafumi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2009, 第11回音声言語シンポジウム, SP2009-93,No. 29,pp. 165-170, Japanese
    Report scientific journal

  • ランダムプロジェクションを用いた音響モデルの線形変換
    YOSHII Mariko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2009, 第79回音声言語情報処理研究会, 2009-SLP-79,No. 22,pp. 123-128, Japanese
    Report scientific journal

  • Buried Markov Modelを用いた音声認識モデルの構築法の検討
    YAMAMOTO Takayuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2009, 電子情報通信学会,音声研究会, 2009-SLP-79,No. 21,pp. 1-6, Japanese
    Report scientific journal

  • AAMを用いた顔方位にロバストな唇領域特徴抽出と音声特徴による構音障害者の音声認識
    MIYAMOTO Chikoto, KOMAI Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    We investigated the speech recognition of a person with articulation disorders resulting from athetoid cerebral palsy. The articulation of speech tends to become unstable due to strain on speech-related muscles, which degrades speech recognition. We therefore use multiple acoustic frames as an acoustic feature to address this problem. Further, in a real environment, speech recognition systems do not perform sufficiently well because of noise. In addition to acoustic features, visual features are used to increase noise robustness in a real environment. However, recognition is hindered by the speaker's tendency toward unsteady head movement. We investigate a pose-robust audio-visual speech recognition method using an Active Appearance Model (AAM) to solve this problem.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2009, 第11回音声言語シンポジウム, SP2009-93,pp. 195-200(356) (356), 195 - 200, Japanese
    Report scientific journal

  • 構音障害者の音声認識における動的特徴量の考察
    MIYAMOTO Chikoto, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    Oct. 2009, 電子情報通信学会,音声研究会, SP2009-55,pp.37-42, Japanese
    Report scientific journal

  • Bottom-upとTop-downアプローチの組み合わせによる単眼画像からの人体3次元姿勢推定
    大西克則, 滝口哲也, 有木康雄
    29 Sep. 2009, 平成21年度情報処理学会関西支部支部大会講演論文集, 2009, Japanese

  • H-011 Content Analysis based on Human Face Images
    Okada Tomoko, Takiguchi Tetsuya, Ariki Yasuo
    Forum on Information Technology, 20 Aug. 2009, 情報科学技術フォーラム講演論文集, 8(3) (3), 117 - 118, Japanese

  • H-006 Estimation of Ground Surface Displacement from SAR Satellite Image Using High-Accuracy Image Matching
    Mizuno Yusuke, Takiguchi Tetsuya, Ariki Yasuo
    Forum on Information Technology, 20 Aug. 2009, 情報科学技術フォーラム講演論文集, 8(3) (3), 107 - 108, Japanese

  • 複数特徴量の重み付け統合による一般物体認識
    SUGA Akira, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2009, 画像の理解・認識シンポジウム, MIRU2009, IS1-29, pp. 589-594, Japanese
    Report scientific journal

  • 単眼サッカー映像におけるボールの3次元位置情報を用いた状況認識
    NISHINO Takuro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2009, 画像の理解・認識シンポジウム, MIRU2009, IS2-61, pp.1269-1276, Japanese
    Report scientific journal

  • 大域的特徴としてBoFを導入したCRFによる一般物体認識
    OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2009, 画像の理解・認識シンポジウム, MIRU2009, OS4-2, pp.95-102, 95 - 102, Japanese
    [Refereed]
    Report scientific journal

  • 回帰分析とパーティクルフィルタを用いた単眼画像からの人体3次元姿勢推定
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2009, 画像の理解・認識シンポジウム, MIRU2009, IS3-43, pp. 1668-167, Japanese
    Report scientific journal

  • 過学習を考慮したAAMパラメータの選択と回帰分析による顔・視線方向同時推定
    TAKATANI Manabu, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2009, 画像の理解・認識シンポジウム, MIRU2009, IS1-60, pp. 769-776, Japanese
    Report scientific journal

  • ランダムプロジェクションを用いた音声特徴量変換
    YOSHII Mariko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2009, 電子情報通信学会,音声研究会, SP2009-41,pp. 1-6, Japanese
    Report scientific journal

  • 尤度最大化に基づくエコー推定を用いた車室内マルチスピーカ音響エコーキャンセラの検討
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In this paper, as a key technology for improving speech recognition systems in car environments, we propose a single-microphone-based acoustic echo canceller that selects an optimum cancellation result based on echo estimation with maximum likelihood, using an acoustic model for signals from multiple loudspeakers. The results of experiments conducted on speech superimposed on music show that the proposed canceller can improve the S/N ratio and the speech recognition rate compared to a canceller based on the NLMS algorithm, where the signals from multiple loudspeakers are measured by a single microphone.
    The Institute of Electronics, Information and Communication Engineers, May 2009, 電子情報通信学会,音声研究会, SP2009-14,pp. 45-48(57) (57), 45 - 48, Japanese
    Report scientific journal

  • D-12-23 Pose Robust and Person Independent Facial Expressions Recognition using AAM Model Selection
    Okada Tomoko, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 04 Mar. 2009, Proceedings of the IEICE General Conference, 2009(2) (2), 132 - 132, Japanese

  • D-12-76 GENERIC OBJECT RECOGNITION BASED ON WEIGHTED INTEGRATION OF MULTIPLE FEATURE
    Suga Akira, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 04 Mar. 2009, Proceedings of the IEICE General Conference, 2009(2) (2), 185 - 185, Japanese

  • D-12-104 Ball and Player Positional Estimation in 3D from Monocular Image Sequence
    Nishino Takuro, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 04 Mar. 2009, Proceedings of the IEICE General Conference, 2009(2) (2), 213 - 213, Japanese

  • D-12-122 3D Human Pose Estimation Integrating Bottom-Up and Top-Down Approach from Monocular Image
    Onishi Katsunori, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 04 Mar. 2009, Proceedings of the IEICE General Conference, 2009(2) (2), 231 - 231, Japanese

  • D-12-112 FACE AND GAZE ANGLE ESTIMATION USING AAM AND REGRESSION
    Takatani Manabu, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 04 Mar. 2009, Proceedings of the IEICE General Conference, 2009(2) (2), 221 - 221, Japanese

  • Dysarthric speech recognition using speech enhancement
    Miyamoto Chikoto, Takiguchi Tetsuya, Ariki Yasuo
    Kobe University, Mar. 2009, Report of Research Center for Urban Safety and Security Kobe University, 13, 75 - 80, Japanese

  • Grammar-gramとGrammarVerb-gramを用いたドメイン非依存型Whyテキストセグメント判定と回答抽出
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Mar. 2009, 第14回 Webインテリジェンスとインタラクション研究会WI2, pp. 89-94, Japanese
    Report scientific journal

  • Extracting Meta-information for Sports Live Games Based on Speech and Situation Recognition
    佐古 淳, 滝口 哲也, 有木 康雄
    Recently, it has become possible to own and access a large quantity of multimedia content. To retrieve exactly what we want from a large collection of content, meta-information for retrieval must be attached. We focus on live sports broadcasts, especially baseball commentary, as an example of multimedia content, and aim to extract meta-information from the commentary speech using speech recognition. We define two kinds of meta-information for baseball: events, which represent what is happening at the moment, and situations, which are the accumulation of events. An event occurs or a situation changes first, and the announcer then speaks based on the event and the situation. In this paper, we extend speech recognition, which estimates only a word sequence, and propose a technique that concurrently estimates a word sequence, an event sequence, and a situation sequence from commentary speech. The formulation yields an event-dependent acoustic model, a situation transition model, an event estimation model, and a situation-dependent language model. The word sequence and the meta-information are estimated jointly by integrating these models in a probabilistic framework. In the experiments, the proposed method achieved an event detection F-measure of 0.87, an event accuracy of 0.86, and a situation accuracy of 0.77. We also discuss the contribution of each model to meta-information annotation performance and the relationship between the speech recognition rate and that performance.
    情報処理学会, 15 Feb. 2009, 情報処理学会論文誌, 50(2) (2), 563 - 574, Japanese

  • 音声・状況の同時認識に基づく野球実況中継へのメタ情報付与
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2009, 第3回音声ドキュメント処理ワークショップ, pp. 59-64, Japanese
    Report scientific journal

  • Multi-class AdaBoost
    Ji Zhu, Hui Zou, Saharon Rosset, Trevor Hastie
    2009, STATISTICS AND ITS INTERFACE, 2(3) (3), 349 - 360, English
    Report scientific journal

  • 複数の言語情報を用いたCRFによる音声認識誤りの検出
    MATSUMOTO Tomohiko, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jan. 2009, 電子情報通信学会音声研究会, pp. 7-12, Japanese
    Report scientific journal

  • Language Model Adaptation by Topic Model Based on Sequence of Words
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    It is important to consider semantics, both to reduce recognition errors of a kind that humans do not make and to understand meaning and content. To address these problems, Latent Semantic Analysis (LSA) and Probabilistic LSA have been proposed. However, these methods are based on bag-of-words techniques. More sophisticated analysis needs to consider the sequence of words in a document. In this paper, we propose a method based on Kernel PCA and the Dynamic Time Alignment Kernel in order to consider a sequence of words. Preliminary experimental results show that the proposed method can clearly separate a sequence of right turn/left turn prots data. Moreover, experimental results on a language corpus show a reduction in perplexity.
    Information Processing Society of Japan (IPSJ), 02 Dec. 2008, IPSJ SIG Notes, 2008(123) (123), 249 - 254, Japanese

  • Speech Recognition by Topic Models with Continuous / Discontinuous Topic Changes
    SAKO Atsushi, ARIKI Yasuo, IWATA Tomoharu, WATANABE Shinji, HORI Takaaki
    In this paper, we propose topic models with continuous/discontinuous topic changes and describe experiments using the MIT OpenCourseWare corpus. In a real environment, acoustic and language features vary from moment to moment depending on speakers, speaking styles, or topic changes. To accommodate these changes, speech recognition with incremental tracking of changing environments has attracted attention. We propose a language model adaptation technique based on an Online Topic Model for continuous topic changes, and a technique based on a Topic HMM for discontinuous topic changes. The experimental results showed improvements in word error rate with these topic models. Moreover, by tracking temporal changes of topics, the proposed methods outperformed batch adaptation of the language model using the whole speech recognition results.
    Information Processing Society of Japan (IPSJ), 02 Dec. 2008, IPSJ SIG Notes, 2008(123) (123), 55 - 60, Japanese

  • Language Model Adaptation by Topic Model Based on Sequence of Words
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    It is important to consider semantics, both to reduce recognition errors of a kind that humans do not make and to understand meaning and content. To address these problems, Latent Semantic Analysis (LSA) and Probabilistic LSA have been proposed. However, these methods are based on bag-of-words techniques. More sophisticated analysis needs to consider the sequence of words in a document. In this paper, we propose a method based on Kernel PCA and the Dynamic Time Alignment Kernel in order to consider a sequence of words. Preliminary experimental results show that the proposed method can clearly separate a sequence of right turn/left turn prots data. Moreover, experimental results on a language corpus show a reduction in perplexity.
    The Institute of Electronics, Information and Communication Engineers, 02 Dec. 2008, IEICE technical report, 108(337) (337), 249 - 254, Japanese

  • Speech Recognition by Topic Models with Continuous/Discontinuous Topic Changes
    SAKO Atsushi, ARIKI Yasuo, IWATA Tomoharu, WATANABE Shinji, HORI Takaaki
    In this paper, we propose topic models with continuous/discontinuous topic changes and describe experiments using the MIT OpenCourseWare corpus. In a real environment, acoustic and language features vary from moment to moment depending on speakers, speaking styles, or topic changes. To accommodate these changes, speech recognition with incremental tracking of changing environments has attracted attention. We propose a language model adaptation technique based on an Online Topic Model for continuous topic changes, and a technique based on a Topic HMM for discontinuous topic changes. The experimental results showed improvements in word error rate with these topic models. Moreover, by tracking temporal changes of topics, the proposed methods outperformed batch adaptation of the language model using the whole speech recognition results.
    The Institute of Electronics, Information and Communication Engineers, 02 Dec. 2008, IEICE technical report, 108(337) (337), 55 - 60, Japanese

  • 多重解像度独立性検定を用いた遺伝子ネットワークの構築
    YAMAMOTO Takayuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2008, 情報処理学会バイオ情報学研究会研究報告, pp.115-118, Japanese
    Report scientific journal

  • 制約付き非負行列因子分解を用いた音声特徴抽出の検討
    PARK Hyunshin, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2008, 第10回音声言語シンポジウム, pp.43-48, Japanese
    Report scientific journal

  • 音声の動的特徴のモデルを使った突発性雑音の除去
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2008, 第10回音声言語シンポジウム, pp.191-196, Japanese
    Report scientific journal

  • スペクトル平面における勾配ヒストグラムに基づく音声特徴量の検討
    MUROI Takashi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2008, 第10回音声言語シンポジウム, pp.161-166, Japanese
    Report scientific journal

  • SIFTとGraph Cuts を用いた物体認識及びセグメンテーション
    SUGA Akira, FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2008, 画像の認識・理解シンポジウムMIRU2008, pp.611-616, 611 - 616, Japanese
    Report scientific journal

  • PrefixSpan を用いた人物の日常行動抽出
    TONARU Takuya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2008, 画像の認識・理解シンポジウムMIRU2008, pp.508-513, Japanese
    Report scientific journal

  • HOG特徴に基づく単眼画像からの人体3 次元姿勢推定
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2008, 画像の認識・理解シンポジウムMIRU2008, pp.960-965, 960 - 965, Japanese
    Report scientific journal

  • AdaBoostとSaliency Mapを用いたGraph Cutsによる花弁領域の自動抽出法
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jul. 2008, 画像の認識・理解シンポジウムMIRU2008, pp.796-801, 796 - 801, Japanese
    Report scientific journal

  • メタモデルと音響モデルの統合による構音障害者の音声認識
    MATSUMASA Hironori, TAKIGUCHI Tetsuya, ARIKI Yasuo, Li Ichao, NAKABAYASHI Toshitaka
    May 2008, 電子情報通信学会技術研究報告WIT2008, pp. 37-42, Japanese
    Report scientific journal

  • NetTv: Cross-Platform Video Retrieval and QA System with Speech Interface
    Tanaka Katsuyuki, Takiguchi Tetsuya, Ariki Yasuo
    The objective of this research is to construct a video search mechanism and speech interface for a multimedia cross-platform, namely TV and the Internet, which requires the capability to deal with dynamic content. The current NetTv enables users to search both recorded TV content and news on the Internet simply by speaking keywords as a query; the videos related to the spoken keywords are then retrieved. The system also provides a simple keyword-based QA system to answer various questions that may occur to users while watching the retrieved videos. In this way, NetTv improves the usability of video searching and viewing in a hands-free way.
    The Institute of Electronics, Information and Communication Engineers, May 2008, 電子情報通信学会技術研究報告SP2008, pp.31-36(67) (67), 31 - 36, Japanese
    Report scientific journal

  • D-12-5 Extraction of Human Daily Activities from videos as Action Sequences using PrefixSpan
    Tonaru Takuya, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 05 Mar. 2008, Proceedings of the IEICE General Conference, 2008(2) (2), 136 - 136, Japanese

  • D-12-121 Graph Cuts by using Local Texture Features of Wavelet Coefficient for Image Segmentation
    Fukuda Keita, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 05 Mar. 2008, Proceedings of the IEICE General Conference, 2008(2) (2), 252 - 252, Japanese

  • D-12-122 OBJECT RECOGNITION AND SEGMENTATION USING SIFT AND GRAPH CUTS
    Suga Akira, Fukuda Keita, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 05 Mar. 2008, Proceedings of the IEICE General Conference, 2008(2) (2), 253 - 253, Japanese

  • ニュース検索タスクにおけるシステム要求と雑談の判別
    SAKO Atsushi, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2008, 第2回音声ドキュメント処理ワークショップ, pp. 67-72, Japanese
    Report scientific journal

  • 弱識別器にSVMを用いたAdaBoostの検討
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Voice activity detection (VAD), which separates speech from non-speech in noisy speech, is an important problem for speech recognition. The proposed method constructs AdaBoost using SVMs as weak learners for separating speech and non-speech. AdaBoost is an iterative algorithm that combines simple classification rules to produce a highly accurate classification rule. Although AdaBoost generally uses CART as weak learners, the proposed method uses SVMs, which can make good estimates through margin maximization and the kernel method. Because of this, we can expect more sophisticated classification while keeping the SVM's generalization capability. We report experimental results comparing a single SVM, AdaBoost with CART, and the proposed method on the CENSREC-1-C VAD database.
    Information Processing Society of Japan (IPSJ), Dec. 2007, 第9回音声言語シンポジウム, SP2007-120, pp.109-114(129) (129), 109 - 114, Japanese
    Report scientific journal

  • 顔表情からの関心度推定に基づく映像コンテンツへのタギング
    MIYAHARA Masanori, AOKI Masaki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Recently, so many videos have become available that it is hard for people to choose what to watch. To solve this problem, we propose a tagging system for video content based on facial expression that can be used for video content recommendation. The viewer's face, captured by a camera, is extracted by Elastic Bunch Graph Matching, and the interest class is estimated by Support Vector Machines. The interest classes are Neutral, Positive, Negative, and Rejective. They are recorded as "interest tags" in synchronization with the video content. Experiments achieved an average recall rate of 87.61% and an average precision rate of 88.03%.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2007, 電子情報通信学会技術研究報告, PRMU2007-137, pp. 13-18(384) (384), 13 - 18, Japanese
    Report scientific journal

  • 画像セグメンテーションにおけるウェーブレット係数の局所テクスチャ特徴量を用いたGraph Cuts
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2007, 電子情報通信学会技術研究報告, PRMU2007-138, pp. 19-24(384) (384), 19 - 24, Japanese
    Report scientific journal

  • 音素部分空間の統合による音声特徴量抽出の検討
    PARK Hyunshin, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In this paper, we propose a speech feature extraction method that estimates a subspace for each phoneme and integrates the subspaces within a feature extraction framework based on pre-learning. The most commonly used feature for speech recognition is MFCC, which is computed by applying the DCT to the mel-scale filter bank output. This feature space does not depend on the target speech data set and is uniquely determined. To make a speech recognition system fit for practical use, noise that is latent in the observed data and useless for recognition must be removed. MFCC is used in combination with other noise removal methods, but performance degradation is inescapable if unexpected noise is mixed into the observed data. Consequently, subspaces (projection matrices) that extract only phonemic information are estimated by pre-learning with observed data. Specifically, PCA or LDA is applied to each phoneme data set to estimate each phoneme subspace. All phoneme subspaces are then integrated by PCA. The integrated subspace carries the phonemic information of the target speech data set and extracts only that information. In the evaluation, we modeled phoneme HMMs with the proposed feature and carried out isolated word recognition experiments. The results showed that the proposed method is effective compared to conventional methods.
    Information Processing Society of Japan (IPSJ), Dec. 2007, 第9回音声言語シンポジウム, SP2007-145, pp. 289-294(129) (129), 241 - 246, Japanese
    Report scientific journal

  • 音声認識との統合によるシステム要求検出
    SAKO Atsushi, YAMAGATA Tomoyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2007, 第9回音声言語シンポジウム, SP2007-120, pp. 143-148, Japanese
    Report scientific journal

  • 音声GMMと雑音重み推定を用いた雑音除去
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2007, 第9回音声言語シンポジウム, SP2007-100, pp. 25-30(129) (129), 25 - 30, Japanese
    Report scientific journal

  • 韻律及び話者交代情報を用いたシステム要求検出
    YAMAGATA Tomoyuki, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2007, 第9回音声言語シンポジウム, SP2007-145, pp. 289-294, Japanese
    Report scientific journal

  • J-002 Tagging for Video Contents Based on User's Facial Expression
    Miyahara Masanori, Aoki Masaki, Takiguchi Tetsuya, Ariki Yasuo
    Forum on Information Technology, 22 Aug. 2007, 情報科学技術フォーラム一般講演論文集, 6(3) (3), 389 - 390, Japanese

  • H-015 Eye Detection Using PCA Correlation Filter.
    SUZUKI Akiko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Forum on Information Technology, 22 Aug. 2007, 情報科学技術フォーラム一般講演論文集, 6(3) (3), 37 - 38, Japanese

  • 探索手法の切り替えを用いたサッカー映像におけるボール追跡システム
    YANO Kazuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2007, 画像の認識・理解シンポジウム, MIRU2007, IS-3-22, pp. 1052-10, Japanese
    Report scientific journal

  • 固定カメラ映像からの音声・画像情報を用いた映像コンテンツの生成
    ADACHI Jun, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2007, 画像認識・理解シンポジウム, MIRU2007, IS2-08, pp. 750-755, Japanese
    Report scientific journal

  • マルチ識別器を用いた画像検索による花図鑑システム
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Aug. 2007, 画像認識・理解シンポジウム, MIRU2007, IS-5-21, pp. 1498-15, Japanese
    Report scientific journal

  • EBGMを用いた唇の形状抽出による発話区間の検出
    MASUDA Ken, AOKI Masaki, MATSUDA Hiroyoshi, ARIKI Yasuo, TAKIGUCHI Tetsuya
    Aug. 2007, 画像の認識・理解シンポジウム, MIRU2007, IS-4-08, pp. 1189-11, 1189 - 1194, Japanese
    Report scientific journal

  • 情報家電操作における脳性麻痺構音障害者の音声認識評価
    MATSUMASA Hironori, TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    May 2007, 電子情報通信学会技術研究報告WIT, WIT2007-7, pp. 33-38, Japanese
    Report scientific journal

  • 音素PCAを用いた残響下における音声特徴量抽出
    PARK Hyunshin, TAKIGUCHI Tetsuya, ARIKI Yasuo
    May 2007, 電子情報通信学会技術研究報告, SP2007-1, pp. 1-6, Japanese
    Report scientific journal

  • D-12-18 Construction of the Flower Image Search System Using Multi Classifier
    Fukuda Keita, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2007, Proceedings of the IEICE General Conference, 2007(2) (2), 134 - 134, Japanese

  • D-11-86 Driver's Face Azimuth Judgment in Infrared Image
    Inoue Junichi, Takiguchi Tetsuya, Ariki Yasuo, Koga Kentarou
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2007, Proceedings of the IEICE General Conference, 2007(2) (2), 86 - 86, Japanese

  • D-12-88 A Fast Algorithm for Eye Detection Using Two-Dimensional CSP with Multitemplates
    Suzuki Akiko, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2007, Proceedings of the IEICE General Conference, 2007(2) (2), 204 - 204, Japanese

  • D-12-40 自動映像生成のためのパーティクルフィルタによるボールの追跡(D-12.パターン認識・メディア理解,一般講演)
    矢野 一樹, 滝口 哲也, 有木 康雄
    一般社団法人電子情報通信学会, 07 Mar. 2007, 電子情報通信学会総合大会講演論文集, 2007(2) (2), 156 - 156, Japanese

  • D-12-80 Facial Expressions Recognition based on Combination of Movement and Distance Variation between Facial Feature Points
    Miyahara Masanori, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2007, Proceedings of the IEICE General Conference, 2007(2) (2), 196 - 196, Japanese

  • D-14-17 Image Content Generation Using Voice Information from Fixed Camera
    Adachi Jun, Takiguchi Tetsuya, Ariki Yasuo
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2007, Proceedings of the IEICE General Conference, 2007(1) (1), 153 - 153, Japanese

  • Meta Data Generation for Multimedia Using Speech Information
    ARIKI Yasuo
    Feb. 2007, 第1回音声ドキュメント処理ワークショップ, pp.41-46, Japanese
    Introduction scientific journal

  • ブースティングを用いた野球実況中継に対するメタデータの作成
    SAKO Jun, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2007, 第1回音声ドキュメント処理ワークショップ, pp. 121-126, 115 - 120, Japanese
    Report scientific journal

  • トピックモデルとタスクの知識を用いた言語モデルによる野球実況中継の構造化
    SAKO Jun, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Feb. 2007, 第1回音声ドキュメント処理ワークショップ, pp. 115-120, Japanese
    Report scientific journal

  • 構音障害者の音声認識の検討
    MATSUMASA Hironori, TAKIGUCHI Tetsuya, ARIKI Yasuo, RI Ichao, NAKABAYASHI Toshitaka
    Jan. 2007, 電子情報通信学会技術研究報告, WIT2006-75,pp13-18, Japanese
    Report scientific journal

  • NetTv:NetNewsとテレビ放送のクロスプラットホームにおける動画のインデキシングと音声検索
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Jan. 2007, 情報処理学会データベースシステム研究会研究報告, 2007-DBS-141, pp.59-66, 59 - 66, Japanese
    Report scientific journal

  • Noise Detection with Multi-class AdaBoost
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    A noise signal decreases the speech recognition rate. Noise reduction is therefore important, and it requires estimating the noise signal. However, estimating noise is difficult when the noise occurs suddenly during speech. We propose a method for detecting and identifying noise that occurs suddenly during speech. Its effectiveness is confirmed at SNRs from -5 to 5 dB for a noise duration of 200 ms.
    Information Processing Society of Japan (IPSJ), 21 Dec. 2006, IPSJ SIG Notes, 2006(136) (136), 7 - 12, Japanese

  • Noise Detection with Multi-class AdaBoost
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    A noise signal decreases the speech recognition rate. Noise reduction is therefore important, and it requires estimating the noise signal. However, estimating noise is difficult when the noise occurs suddenly during speech. We propose a method for detecting and identifying noise that occurs suddenly during speech. Its effectiveness is confirmed at SNRs from -5 to 5 dB for a noise duration of 200 ms.
    The Institute of Electronics, Information and Communication Engineers, 14 Dec. 2006, IEICE technical report, 106(443) (443), 7 - 12, Japanese

  • 局所特徴量によるフィッシャー重みマップに基づく音素認識
    KATO Shunsuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    In this paper, we propose a new feature extraction method based on higher-order local auto-correlation (HLAC) and Fisher weight map (FWM). Widely used MFCC features lack temporal dynamics. To solve this problem, 35 types of local auto-correlation features are computed within two-dimensional local regions. These local features are accumulated over more global regions by weighting high scores on the discriminative areas where the typical features among all phonemes are well expressed. This score map is called Fisher weight map. We verified the effectiveness of the HLAC and FWM through total phoneme recognition.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2006, 第8回音声言語シンポジウム, SIG-SLP64, pp. 19-24(444) (444), 19 - 24, Japanese
    Report scientific journal

  • 音響モデルを利用したシングルチャネルによる音源方向推定
    SUMIDA Yuji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    This paper presents a voice localization method using only a single microphone, in which a GMM (Gaussian Mixture Model) of clean speech is introduced to estimate the acoustic transfer function from any user's position. The sequence of acoustic transfer functions is estimated by maximizing the likelihood of training data (only several words) uttered from an unknown position, where cepstral parameters are used because they effectively represent useful clean speech information. Using the sequence data of the acoustic transfer function, a GMM of the acoustic transfer function is created to deal with the influence of a long impulse response. Its effectiveness is confirmed by voice (talker) direction experiments in a room environment.
    The Institute of Electronics, Information and Communication Engineers, Dec. 2006, 電子情報通信学会技術研究報告, EA2006-90, pp. 7-11(432) (432), 7 - 11, Japanese
    Report scientific journal

  • AdaBoostを用いたシステムへの問い合わせと雑談の判別
    SAKO Jun, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2006, 第8回音声言語シンポジウム, SIG-SLP64, pp. 19-24, Japanese
    Report scientific journal

  • 3次キュムラントのBispectrumとMFCCの統合による音声区間検出の検討
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Dec. 2006, 電子情報通信学会技術研究報告, SP2006-85, pp. 89-94, Japanese
    Report scientific journal

  • 3次キュムラント音声特徴を用いた音声区間検出
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    The separation of speech and non-speech events is an important problem for speech recognition. In clean conditions, energy or zero-crossing features work well. However, traditional voice activity detection (VAD) is not robust in noisy conditions where the speech signal is seriously contaminated by noise. A robust VAD algorithm based on determining the speech/non-speech bispectra of third-order auto-cumulants has been proposed. In this paper, we investigate the effectiveness of integrating MFCC with the bispectra of third-order auto-cumulants. Experimental results show that the proposed algorithm is effective.
    The Institute of Electronics, Information and Communication Engineers, Sep. 2006, 電子情報通信学会技術研究報告, SIP, pp. 37-42(263) (263), 37 - 42, Japanese
    Report scientific journal

  • I_022 A Fast Algorithm for Eye Detection Using Two-Dimensional CSP
    SUZUKI Akiko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Forum on Information Technology, 21 Aug. 2006, 情報科学技術フォーラム一般講演論文集, 5(3) (3), 49 - 50, Japanese

  • 唇領域の動静判定と音声・雑音判定の統合に基づく発話区間の検出
    MASUDA Ken, MATSUDA Hiroyoshi, INOUE Junichi, ARIKI Yasuo, TAKIGUCHI Tetsuya
    Jul. 2006, 画像認識・理解シンポジウム, pp. 934-939, Japanese
    Report scientific journal

  • D-14-7 AdaBoostと音声・唇GMMによる発話区間検出(D-14.音声・聴覚,一般講演)
    松田 博義, 増田 健, 滝口 哲也, 有木 康雄, 神谷 昌宏
    一般社団法人電子情報通信学会, 08 Mar. 2006, 電子情報通信学会総合大会講演論文集, 2006(1) (1), 131 - 131, Japanese

  • J-012 Automatic production Method with Personal Adaptation for soccer-game videos
    Kubota Shintaro, Ariki Yasuo, Tsukada Kiyoshi
    Forum on Information Technology, 22 Aug. 2005, 情報科学技術フォーラム一般講演論文集, 4(3) (3), 199 - 202, Japanese

  • Automatic Production of Soccer Sports Video by Digital Camera Work Based on Situation Recognition of Ball and Players
    KUBOTA Shintaro, ARIKI Yasuo, KUMANO Masahito
    Jul. 2005, 画像の認識・理解シンポジウム, IS3-117, pp. 1145-1151, Japanese
    [Refereed]
    Report scientific journal

  • Noise reduction using 2-channel microphone in complex spectrum plane
    OHKUBO Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    08 Mar. 2005, 日本音響学会研究発表会講演論文集, 2005(1) (1), 123 - 124, Japanese

  • Structuring the Baseball Game Based on Word Cooccurrences after Speech Recognition
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    08 Mar. 2005, 日本音響学会研究発表会講演論文集, 2005(1) (1), 149 - 150, Japanese

  • Automatic Production Method Based on Preference Classification For Soccer-Game Videos
    KUBOTA Shintaro, ARIKI Yasuo, TSUKADA Kiyoshi
    2005, 電子情報通信学会技術研究報告, PRMU2005-115, pp. 7-12, Japanese
    Report scientific journal

  • A Study on Conversational TV with Contextual Awareness
    TAKIGUCHI Tetsuya, ARIKI Yasuo, SAKO Atsushi
    2005, 音声言語情報処理研究会, SLP2005-58, pp. 25-30, Japanese
    Report scientific journal

  • A Study on Robust Feature Extraction using Kernel PCA
    TAKIGUCHI Tetsuya, ARIKI Yasuo
    2005, 音声言語情報処理研究会, SLP-59, pp. 175-180, Japanese
    Report scientific journal

  • D-12-170 Study on a Method of Digital Camera Work Focused on Players and a Ball : Toward Automatic Soccer Video Contents Production from HD Video
    Iwamoto Takeshi, Kumano Masahito, Ariki Yasuo, Tsukada Kiyoshi
    The Institute of Electronics, Information and Communication Engineers, 08 Mar. 2004, Proceedings of the IEICE General Conference, 2004(2) (2), 336 - 336, Japanese

  • Survey and Study on Urban Information System for Disaster
    Ariki Yasuo
    Kobe University, Mar. 2004, Report of Research Center for Urban Safety and Security Kobe University, 8, 205 - 211, Japanese

  • 知識を用いた音声認識による野球実況中継の構造化
    佐古 淳, 有木 康雄
    2004, 第6回音声言語シンポジウム SP2004, 136, 85-90, Japanese
    [Refereed]
    Others

  • 映像文法に基づいた実時間使用可能ショット識別による撮影ナビゲーションシステム
    熊野 雅仁, 天野 美紀, 有木 康雄, 塚田 清志
    In this paper, we propose a video shooting navigation system that discriminates useful shots in real time based on video grammar, in order to support and instruct users in shooting good shots for later editing. Because the processing speed must be very high, the system uses a gray value projection method to extract the camerawork parameters in real time. From the results of the camerawork and gray value analysis, the shots are classified into 14 states, and the system issues alarms and instructions about the usefulness or uselessness of each shot in real time, just after it is shot. Users can thereby efficiently retake a shot that is necessary for later editing.
    The Institute of Electronics, Information and Communication Engineers, 2004, 電子情報通信学会技術研究報告, PRMU, パターン認識・メディア理解, 104, 369, 1-6(369) (369), 1 - 6, Japanese
    [Refereed]
    Others

  • ボールと選手に着目したディジタルカメラワークの実現法 -ディジタルシューティングによるサッカー解説映像生成システムに向けて-
    熊野 雅仁, 岩本 健, 有木 康雄, 塚田 清志
    2004, 画像の認識・理解シンポジウム(MIRU2004), SUP-C1-12, Ⅱ-341-Ⅱ-346, Japanese
    [Refereed]
    Others

  • A Study on Spontaneous Speech Recognition Incorporating Pronunciation Variation and Acoustic Error Tendency
    OGATA Jun, ARIKI Yasuo
    啓学出版, 18 Mar. 2003, 日本音響学会研究発表会講演論文集, 2003(1) (1), 9 - 10, Japanese

  • Highlight scene detection in sports live using speech recognition
    KANEKO T., SHIGEMORI T., OGATA J., FUJIMOTO M., ARIKI Y., TSUKADA K., HAMAGUCHI S., KIYOSE H.
    18 Mar. 2003, 日本音響学会研究発表会講演論文集, 2003(1) (1), 189 - 190, Japanese

  • 野球中継のハイライトシーン実時間配信を目的としたPCシーンの自動検出
    熊野 雅仁, 神崎 伸夫, 藤本 雅清, 有木 康雄, 塚田 清志, 濱口 伸, 清瀬 基
    2003, 電子情報通信学会,パターン認識・メディア理解, PRMU2003-18 27-34, Japanese
    Others

  • 時間領域SVDとGMMに基づく音声信号推定法の統合による雑音下音声認識
    藤本 雅清, 有木 康雄
    2003, 情報処理学会研究報告, SLP-45-2 7-12, Japanese
    Others

  • 音響・言語適応処理を用いたスポーツ実況中継音声の認識 ~ハイライトシーン検出への応用~
    重森 猛, 金子 剛志, 緒方 淳, 藤本 雅清, 有木 康雄, 塚田 清志, 濱口 伸, 清瀬 基
    This paper proposes a method to automatically extract keywords from baseball radio speech through LVCSR for highlight scene retrieval. For robust recognition, we employed acoustic and language model adaptation. In acoustic model adaptation, supervised and unsupervised adaptations were carried out using MLLR+MAP. This two-level adaptation improved word accuracy by 28%. In language model adaptation, language model merging and pronunciation modification were carried out. This adaptation showed a 13% improvement in word accuracy. Finally, by integrating both adaptations, a 38% improvement in word accuracy was achieved.
    The Institute of Electronics, Information and Communication Engineers, 2003, 電子情報通信学会,音声研究会, SP2003-166 33-40(618) (618), 33 - 40, Japanese
    Others

  • GMMに基づく音声信号推定法の改良と、実走行車内音声による評価
    藤本 雅清, 有木 康雄
    2003, 情報処理学会研究報告, SLP-47-16 83-88, Japanese
    Others

  • GMMとEMアルゴリズムを用いた加法性雑音及び乗法性歪みの抑圧
    藤本 雅清, 有木 康雄
    2003, 電子情報通信学会,音声研究会, SP2003-117 25-30, Japanese
    Others

  • Noise Robust Speech Recognition Using GMM Based Speech Estimation Method
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, a noise robust speech recognition method is proposed by combining temporal domain singular value decomposition (SVD) based speech enhancement and Gaussian mixture model (GMM) based speech estimation. The critical bottleneck of the GMM based approach is the noise estimation problem. To address it, we investigated adaptive noise estimation in the GMM based approach. Furthermore, to obtain higher recognition accuracy, we employed a temporal domain SVD based speech enhancement method as the pre-processing module of the GMM based approach. In addition, to further reduce the influence of the noise included in the noisy speech, we introduce an adaptive over-subtraction factor into the temporal domain SVD based speech enhancement. In evaluation on the AURORA2 tasks, our method showed significant improvement in recognition accuracy under all noise conditions.
    Information Processing Society of Japan (IPSJ), 16 Dec. 2002, IPSJ SIG Notes, 2002(121) (121), 25 - 30, Japanese

  • Noise Robust Speech Recognition Using GMM Based Speech Estimation Method
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, a noise robust speech recognition method is proposed by combining temporal domain singular value decomposition (SVD) based speech enhancement and Gaussian mixture model (GMM) based speech estimation. The critical bottleneck of the GMM based approach is the noise estimation problem. To address it, we investigated adaptive noise estimation in the GMM based approach. Furthermore, to obtain higher recognition accuracy, we employed a temporal domain SVD based speech enhancement method as the pre-processing module of the GMM based approach. In addition, to further reduce the influence of the noise included in the noisy speech, we introduce an adaptive over-subtraction factor into the temporal domain SVD based speech enhancement. In evaluation on the AURORA2 tasks, our method showed significant improvement in recognition accuracy under all noise conditions.
    The Institute of Electronics, Information and Communication Engineers, 12 Dec. 2002, IEICE technical report. Natural language understanding and models of communication, 102(527) (527), 25 - 30, Japanese

  • Unsupervised Acoustic Model Adaptation Based on Phoneme Error Minimization
    OGATA Jun, ARIKI Yasuo
    The Institute of Electronics, Information and Communication Engineers, 01 Dec. 2002, The Transactions of the Institute of Electronics,Information and Communication Engineers., 85(12) (12), 1771 - 1780, Japanese

  • I-95 Player Tracking in Soccer Game Based on Normalized Correlation Method Using Divided Template
    Kanzaki Nobuo, Ariki Yasuo
    Forum on Information Technology, 13 Sep. 2002, 情報科学技術フォーラム一般講演論文集, 2002(3) (3), 189 - 190, Japanese

  • K-60 Inquiry system of the utterance contents and unknown persons in news videos
    Inoue Toru, Fujimoto Masakiyo, Yamamoto Natsuo, Ariki Yasuo, Kumano Masahito, Doshita Syuji
    Forum on Information Technology, 13 Sep. 2002, 情報科学技術フォーラム一般講演論文集, 2002(3) (3), 487 - 488, Japanese

  • I-47 Automatic Useful Shot Extraction and Indexing for a Video Editing Support System Based on Video Grammar
    Kumano Masahito, Ariki Yasuo
    Forum on Information Technology, 13 Sep. 2002, 情報科学技術フォーラム一般講演論文集, 2002(3) (3), 93 - 94, Japanese

  • Speaker Name Indexing System by Integrating Speech Recognition and Speaker Recognition
    NISHIDA Masafumi, OGATA Jun, ARIKI Yasuo
    The purpose of this study is to retrieve a video clip where a specific speaker talks about some topics, for example, "I would like to watch a video clip where President Clinton talks about information super highway". In order to retrieve the speaker name and the spoken contents simultaneously, it is required to detect speaker changes, index the speaker name to the obtained speaker section and extract important words. In this study, the speaker changes are detected by performing the speaker segmentation and a speaker model is automatically constructed. A phrase suggesting the speaker change as well as the speaker name in a news speech data is extracted by large vocabulary continuous speech recognition and word spotting technique. Thus, the extracted speaker names are automatically indexed to the speaker section obtained by the speaker segmentation. Therefore, we can simultaneously retrieve the speaker name and the spoken contents based on the speaker name indexing and the important words extracted by the large vocabulary continuous speech recognition.
    Information Processing Society of Japan (IPSJ), 15 Jul. 2002, Transactions of Information Processing Society of Japan, 43(7) (7), 2205 - 2213, Japanese

  • A Study on Lecture Video Structuring by Topic Segmentation
    YAMAMOTO Natsuo, OGATA Jun, ARIKI Yasuo
    In this paper, we study a method for segmenting continuous lecture speech into topics. A lecture has few changes of subject, and it is difficult to judge their boundaries. To solve this problem, we matched the lecture speech with the lecture text based on the table of contents and achieved topic segmentation performance averaging 93.7%. Incorporating this method, we constructed a system in which users can view the part of a lecture corresponding to an entry in the table of contents by specifying chapters or sections, or the part corresponding to an index word by specifying that word.
    Information Processing Society of Japan (IPSJ), 12 Jul. 2002, IPSJ SIG Notes, 2002(65) (65), 59 - 64, Japanese

  • Noisy Speech Recognition Based on Noise Reduction and Acoustic Model Adaptation : An Evaluation on the AURORA2 Tasks
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we evaluate a noisy speech recognition method based on noise reduction and acoustic model adaptation on the AURORA2 tasks. For noise reduction, we employed two methods. One is an Adaptive Sub-Band Spectral Subtraction (ASBSS) method, which can optimize the noise subtraction rate according to the SNR in each frequency band at each frame. The other is a Kalman filtering estimation method, which re-estimates accurate speech spectra from those estimated by ASBSS. Accurate speech spectra were estimated by combining these two methods. A noise reduction method usually degrades the recognition rate because of spectral distortion caused by residual noise and over-estimation arising during noise reduction. To solve this problem, the acoustic models are adapted to the spectral distortion by unsupervised MLLR adaptation. In the evaluation on the AURORA2 tasks, our method showed a significant improvement in recognition accuracy for both the clean training condition and the multi training condition.
    Information Processing Society of Japan (IPSJ), 12 Jul. 2002, IPSJ SIG Notes, 2002(65) (65), 71 - 76, Japanese

  • Automatic Shot Size Discrimination for a Video Editing Support System
    KUMANO Masahito, ARIKI Yasuo, UEHARA Kuniaki, SHIMOJO Shinji, SHUNTO Kenji, TSUKADA Kiyoshi
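    With the arrival of the digital age, a shortage of video content has become a problem. Solving it requires making editing, the most time-consuming task in video content production, more efficient. In general, broadcast video is expressed according to video grammar, a set of universal rules for conveying video content unambiguously. Therefore, to edit efficiently, a video editing support system that reflects this video grammar must be developed. To realize such a system, index information must be attached in advance to the raw footage shot by the camera operator so that the video grammar can be applied. Among the rules of video grammar, those concerning the connection of shots are especially important. From this standpoint, this paper proposes a video editing support system based on video grammar, together with a method for automatically assigning shot sizes.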
    The Institute of Electronics, Information and Communication Engineers, 01 Jul. 2002, The Transactions of the Institute of Electronics,Information and Communication Engineers., 85(7) (7), 592 - 602, Japanese

  • Automatic Useful Shot Extraction for a Video Editing Support System
    KUMANO Masahito, ARIKI Yasuo
    In the coming digital age, a lack of video content will become a serious problem. To solve this problem, efficient video editing is required because editing consumes a great deal of work. In video editing, useful shots have to be extracted from raw video materials for broadcasting. This shot extraction is inefficient and accounts for most of the video editing work. This paper proposes a method to automatically extract the useful shots.
    The Institute of Electronics, Information and Communication Engineers, 21 Jun. 2002, Technical report of IEICE. PRMU, 102(156) (156), 1 - 8, Japanese

  • Ball and Player Tracking in Soccer Game Based on Normalized Correlation Method Using Divided Template
    KANZAKI Nobuo, ARIKI Yasuo
    For content-based retrieval in video, it is necessary to describe the video contents. In particular, the positions of moving objects in an image are an important content description. So far, the normalized correlation method has been widely used for moving object tracking. However, it is difficult to track an object correctly with this method because, matching the whole template, it is sensitive to local intensity changes and partial occlusion. To solve this problem, we propose ball and player tracking based on a normalized correlation method with a divided template. The divided template copes with local intensity changes and partial occlusion.
    The Institute of Electronics, Information and Communication Engineers, 20 Jun. 2002, Technical report of IEICE. PRMU, 102(155) (155), 51 - 56, Japanese

  • Person Recognition and Retrieval in News Video through Multimodal Interaction (Theme: General)
    FUJIMOTO Masakiyo, YAMAMOTO Natsuo, ARIKI Yasuo
    人工知能学会, 07 Jun. 2002, 言語・音声理解と対話処理研究会, 35, 7 - 13, Japanese

  • A Study on Noise Robust Hands-Free Speech Recognition Using Microphone Array and Kalman Filter
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we investigate hands-free speech recognition as a front-end system for conversational TV. Conversational TV is a machine conversation system that retrieves information of interest when the user asks the TV for it. To realize natural machine conversation in which the user is not conscious of the microphone, hands-free speech recognition is required. In the hands-free speech recognition system, the direction of the arriving signal is estimated using a microphone array, and the desired signal is enhanced by beamforming. Then, the voice activity region is detected automatically from the continuously observed signal. Furthermore, by applying noise reduction and noise adaptation, the enhanced speech signal is recognized accurately.
    The Institute of Electronics, Information and Communication Engineers, 19 Apr. 2002, Technical report of IEICE. EA, 102(33) (33), 13 - 18, Japanese

  • Goods Catalog Video Segmentation and Indexing Based on Integration of Speech and Video Caption Recognition
    FUJIMOTO Masakiyo, ARIKI Yasuo, MATSUMOTO Hiroshi
    18 Mar. 2002, 日本音響学会研究発表会講演論文集, 2002(1) (1), 143 - 144, Japanese

  • Unsupervised Adaptation of an Acoustic Model Based on Decoding Strategies Using Word and Phoneme Posterior Probabilities
    OGATA Jun, ARIKI Yasuo
    18 Mar. 2002, 日本音響学会研究発表会講演論文集, 2002(1) (1), 137 - 138, Japanese

  • A CALL System with Segmentation and Evaluation Function of an User Utterance
    Ikari Shingo, Sano Teruki, Ogata Jun, Ariki Yasuo
    In communication learning of a second language, three abilities have to be improved: listening, speaking, and writing. In this sense, it is important to evaluate the user's pronunciation ability and to detect mispronunciations in CALL (Computer-Assisted Language Learning) systems. In this paper, we propose three functions (segmentation, phrasing, and dictation) for a CALL system using speech recognition technology. In experiments, the system was evaluated based on the results of a questionnaire given to ten learners.
    Information Processing Society of Japan (IPSJ), 01 Feb. 2002, IPSJ SIG Notes, 2002(10) (10), 7 - 12, Japanese

  • Effectiveness of An Expanded Dictionary in Information Retrieval System by Keyword Spotting
    YAMAMOTO Tetsuya, OGATA Jun, ARIKI Yasuo
    An information retrieval system with voice input for broadcast news is investigated. Our purpose is to design a system in which robust speech recognition is possible for user inquiries with comparatively high flexibility. The transcription result of news speech is used to construct a user's keyword dictionary for inquiries. To solve the problem of the time difference between training data and evaluation data, an N-gram language model was created from news stories in the newest Web data and adapted to the evaluation data. In addition, the widely used vector space model and the LSI method were investigated to deal with out-of-vocabulary words. Through the experiments, some effect was verified. A simulation experiment of keyword spotting was conducted, and the validity of the system was shown.
    The Institute of Electronics, Information and Communication Engineers, 18 Jan. 2002, IEICE technical report. Speech, 101(604) (604), 41 - 46, Japanese

  • Speech Recognition under Noisy Environments Using Speech Signal Estimation Method Based on Kalman Filter
    FUJIMOTO Masakiyo, ARIKI Yasuo
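    In this study, we propose a Kalman-filter-based speech signal estimation method as preprocessing for speech recognition in noisy environments. Conventionally, the Kalman filter has been unsuitable for real-time processing because of its enormous computational cost. We therefore propose a speech signal estimation method suited to real-time use by reducing the computational cost of the Kalman filter and speeding up its computation. To evaluate the proposed method, we conducted word recognition experiments using speech signals extracted from noise-corrupted speech and compared word recognition accuracy with the conventional Spectral Subtraction and Parallel Model Combination methods. We also evaluated the noise compensation range of the proposed method to show that it can automatically handle various stationary noises without manually changing the filter parameters according to conditions such as speaker, noise type, and SNR. As a result, the proposed method achieved high word recognition rates even for noises on which the conventional methods yield low recognition rates. In particular, the proposed method was confirmed to be effective at low SNR.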
    The Institute of Electronics, Information and Communication Engineers, 01 Jan. 2002, The Transactions of the Institute of Electronics,Information and Communication Engineers., 85(1) (1), 1 - 11, Japanese

  • Noise Robust Speech Recognition by Integration of MLLR Adaptation and Feature Extraction for Noise Reduced Speech
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we investigate a noise-robust acoustic feature for our previously proposed noise-robust speech recognition method, which uses Kalman filtering for speech signal estimation and iterative unsupervised MLLR adaptation. As the noise-robust acoustic feature, we employed root cepstral coefficients and compared the results with conventionally used MFCCs in terms of speech recognition accuracy. Furthermore, we investigate the number of phoneme clusters in MLLR adaptation. To evaluate the proposed method, we carried out large vocabulary continuous speech recognition experiments under 3 types of music. As a result, the proposed method showed a significant improvement in word accuracy.
    Information Processing Society of Japan (IPSJ), 20 Dec. 2001, IPSJ SIG Notes, 2001(123) (123), 57 - 62, Japanese

  • Unsupervised Adaptation of an Acoustic Model Using Confidence Measures Based on Phoneme Posterior Probabilities
    OGATA Jun, ARIKI Yasuo
    In this paper, we study an accurate unsupervised adaptation method for spontaneous speech recognition. In an unsupervised adaptation framework, the effectiveness of the adaptation process is greatly affected by mis-recognized labels. Therefore, selecting the adaptation data guided by confidence measures is effective in unsupervised adaptation. We propose a phoneme error minimization framework for accurate phoneme labels and the use of phoneme-level confidence measures for improved unsupervised adaptation. Experimental results showed that the proposed method could reduce the mis-recognized labels in the adaptation process and consequently improved the adaptation accuracy. Furthermore, selecting the adaptation data using the phoneme confidence measures improved the adaptation accuracy.
    The Institute of Electronics, Information and Communication Engineers, 14 Dec. 2001, IEICE technical report. Natural language understanding and models of communication, 101(521) (521), 19 - 24, Japanese

  • Noise Robust Speech Recognition by Integration of MLLR Adaptation and Feature Extraction for Noise Reduced Speech
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we investigate a noise-robust acoustic feature for our previously proposed noise-robust speech recognition method, which uses Kalman filtering for speech signal estimation and iterative unsupervised MLLR adaptation. As the noise-robust acoustic feature, we employed root cepstral coefficients and compared the results with conventionally used MFCCs in terms of speech recognition accuracy. Furthermore, we investigate the number of phoneme clusters in MLLR adaptation. To evaluate the proposed method, we carried out large vocabulary continuous speech recognition experiments under 3 types of music. As a result, the proposed method showed a significant improvement in word accuracy.
    The Institute of Electronics, Information and Communication Engineers, 13 Dec. 2001, IEICE technical report. Natural language understanding and models of communication, 101(520) (520), 57 - 62, Japanese

  • An Efficient N-Best Search Method Using Best-Word Back-off Connection in Large Vocabulary Continuous Speech Recognition
    OGATA Jun, ARIKI Yasuo
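    This paper proposes a fast N-best search method for large vocabulary continuous speech recognition. Lexical tree search, commonly used in large vocabulary continuous speech recognition, is an efficient search algorithm, but factorizing bigram probabilities poses problems in terms of required memory and processing time. We first confirmed that, by using a search network that takes into account the back-off connection constraints of the bigram language model, the memory required for bigram factorization can be greatly reduced and recognition is possible without affecting the overall processing time. Then, aiming to speed up large vocabulary continuous speech recognition, we propose a method called best-word back-off connection, which extends the above search network. In this method, back-off connections are made only to the word with the maximum likelihood in a given frame. Experiments confirmed that the proposed method can reduce the overall processing time to less than half with almost no loss in recognition rate.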
    The Institute of Electronics, Information and Communication Engineers, 01 Dec. 2001, The Transactions of the Institute of Electronics,Information and Communication Engineers., 84(12) (12), 2489 - 2500, Japanese

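  • Segmentation of Goods Catalog Video Based on Telop Character Recognition (Session 1: 2D Image Technology and Applications)
    FUJIMOTO Masakiyo, MISHIMA Takashi, ARIKI Yasuo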
    画像電子学会, 22 Nov. 2001, 研究会講演予稿, 190, 9 - 14, Japanese

  • A Study on Topic Segmentation for Lecture Speech
    YAMAMOTO Natsuo, TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    01 Oct. 2001, 日本音響学会研究発表会講演論文集, 2001(2) (2), 191 - 192, Japanese

  • A study on Confidence Measures for Unsupervised Acoustic Model Adaptation
    OGATA Jun, ARIKI Yasuo
    01 Oct. 2001, 日本音響学会研究発表会講演論文集, 2001(2) (2), 93 - 94, Japanese

  • Effectiveness of N-best search method Using Maximum Back-off Connection in Lecture Speech Recognition.
    OGATA Jun, ARIKI Yasuo
    01 Oct. 2001, 日本音響学会研究発表会講演論文集, 2001(2) (2), 97 - 98, Japanese

  • Continuous Speech Recognition under Non-stationary Noisy Environments by Combination of Model Adaptation and Noise Reduction.
    FUJIMOTO Masakiyo, ARIKI Yasuo
    01 Oct. 2001, 日本音響学会研究発表会講演論文集, 2001(2) (2), 35 - 36, Japanese

  • Information Retrieval based on Spoken Dialogue by Keyword Spotting using Automatic Expanded Dictionary
    YAMAMOTO Tetsuya, OGATA Jun, ARIKI Yasuo
    01 Oct. 2001, 日本音響学会研究発表会講演論文集, 2001(2) (2), 61 - 62, Japanese

  • Automatic Shot Size Discrimination Using Active Search for a Video Editing Support
    KUMANO Masahito, HAYASHI Yoshifumi, ARIKI Yasuo, Uehara Kuniaki, SHIMOJO Shinji, SHUNTO Kenji, TSUKADA Kiyoshi
    Video grammars are used in broadcast video as universal rules to convey video content unambiguously. The minimum unit in the video grammar is a shot. The shot size is classified into three relative types: loose shot, shooting objects from a long distance; middle shot, shooting from a short distance; and tight shot, shooting close up. This paper proposes a method to automatically discriminate the shot size using an active search method.
    The Institute of Electronics, Information and Communication Engineers, 13 Sep. 2001, Technical report of IEICE. OFC, 101(298) (298), 31 - 38, Japanese

  • Shot size discrimination using active search for video editing support
    KUMANO Masahito, HAYASHI Yoshifumi, SAKAE Shingo, Ariki Yasuo, SHUNTO Kenji, TSUKADA Kiyoshi
    The Institute of Electronics, Information and Communication Engineers, 29 Aug. 2001, Proceedings of the Society Conference of IEICE, 2001, 195 - 195, Japanese

  • A Method to Segment Goods Catalog Video into Individual Sections Based on Integration of Speech and Image Information
    FUJIMOTO Masakiyo, MISHIMA Takashi, ARIKI Yasuo, MATSUMOTO Hiroshi
    The Institute of Electronics, Information and Communication Engineers, 29 Aug. 2001, Proceedings of the Society Conference of IEICE, 2001, 244 - 244, Japanese

  • Cross Media Passage Level Retrieval - Access Method to Spoken Documents by Telop and CG Flip Character Strings as Queries -
    TAKAO Seiichi, ARIKI Yasuo, OGATA Jun
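    In recent years, the growth in broadcast channels has begun to generate a large volume of news video, and viewers now want to watch only the news programs that interest them. News retrieval systems, and appropriate indexing of news, have therefore become necessary. In this study, noting that the telop and CG flip character strings appearing in news video summarize the content of the news program, we built a system that attaches these telop and CG flip character strings to news video as indices. When indexing video, the question is how to define the length of the time interval to which an index is attached. When an article is long, or when a single article contains multiple topics, attaching indices at the article level is undesirable. In this study, therefore, indices are attached not to articles but to passages, which are units that express content.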
    The Institute of Electronics, Information and Communication Engineers, 01 Aug. 2001, The Transactions of the Institute of Electronics,Information and Communication Engineers., 84(8) (8), 1809 - 1816, Japanese

  • A Study on Speech Recognition and Structuring for Lectures
    OGATA Jun, YAMAMOTO Natsuo, TAKAO Seiichi, ARIKI Yasuo
    In this paper, we study a method for segmenting continuous lecture speech into topics. In topic segmentation, extraction of topic words (keywords) is important. We selected the keywords from the indices of the lecture text and added them to the language model as an unknown-word category. As a result, the keywords were recognized accurately, and an F-measure of 49.7% was achieved in the topic segmentation experiments.
    Information Processing Society of Japan (IPSJ), 13 Jul. 2001, IPSJ SIG Notes, 2001(68) (68), 79 - 84, Japanese

  • A Study on A Method to Segment Goods Catalog Video into Individual Sections Based on Keyword Spotting
    FUJIMOTO Masakiyo, TAKAO Seiichi, ARIKI Yasuo, MATSUMOTO Hiroshi
    In this paper, we propose a method to segment goods catalog video into individual sections and index them. The proposed method uses keyword spotting, which extracts keywords from the noise-reduced speech signal in the goods catalog video. Extracting keywords by keyword spotting requires a goods name dictionary. In this paper, we study a method to generate the goods name dictionary automatically from the video captions in the goods catalog video. In the experiments, the proposed method segmented the individual goods sections with approximately 82% accuracy when the goods name dictionary was available, and with approximately 60% accuracy when the goods name dictionary was generated automatically.
    Information Processing Society of Japan (IPSJ), 13 Jul. 2001, IPSJ SIG Notes, 2001(68) (68), 49 - 54, Japanese

  • Continuous Speech Recognition under Non-stationary Noisy Environments Using Kalman Filter and Iterative MLLR adaptation
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we propose a speech recognition method for non-stationary noisy environments using a Kalman filtering speech signal estimation method and iterative unsupervised MLLR adaptation. The proposed method estimates the speech signal under non-stationary noisy environments, such as background music, by applying a speech state transition model to Kalman filtering estimation. The speech state transition model represents the state transition of the speech component in non-stationary noisy speech and is modeled using a Taylor expansion. In this model, the state transition of noise is estimated by linear predictive estimation. Furthermore, to obtain higher recognition accuracy, we adapt the acoustic models, using iterative unsupervised MLLR adaptation, to speech spectra distorted by the residual noise of Kalman filtering. To evaluate the proposed method, we carried out large vocabulary continuous speech recognition experiments under 3 types of music. As a result, the proposed method obtained a significant improvement in word accuracy.
    The Institute of Electronics, Information and Communication Engineers, 21 Jun. 2001, IEICE technical report. Speech, 101(155) (155), 7 - 14, Japanese

  • Voice conversion using subspace method and Gaussian mixture model
    INOUE Toru, NISHIDA Masafumi, FUJIMOTO Masakiyo, ARIKI Yasuo
    In voice conversion, if phonetic information and speaker information included in speech data are separated, similar voice to a target speaker will be obtained by exchanging speaker information. In this paper, we propose a voice conversion method from an original speaker to a target speaker by a subspace method. At first, a speaker space and a phonetic space are separately constructed for each speaker and the speaker spaces are exchanged between the original speaker and the target speaker. The proposed method was experimentally compared with the conversion method based on GMM. As a result, the proposed method was shown to be superior to GMM in terms of subjective evaluation.
    The Institute of Electronics, Information and Communication Engineers, 17 May 2001, IEICE technical report. Speech, 101(86) (86), 1 - 6, Japanese

  • Information Retrieval based on Spoken Dialogue by Keyword Spotting for TV Broadcasting Speech
    YAMAMOTO Tetsuya, OGATA Jun, ARIKI Yasuo
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2001, Proceedings of the IEICE General Conference, 2001(1) (1), 277 - 278, Japanese

  • Study on a Video Editing System Based on Video Grammar and Video Analysis
    TOKUHIRA Masatsune, NAGATA Hiroyasu, YAMAGUCHI Satoshi, YAMAMOTO Tetsuya, KUMANO Masahito, ARIKI Yasuo, SHUNTO Kenji, TSUKADA Kiyoshi
    The Institute of Electronics, Information and Communication Engineers, 07 Mar. 2001, Proceedings of the IEICE General Conference, 2001(2) (2), 363 - 364, Japanese

  • A Comparison of Confidence Measures for Improved Speech Recognition.
    OGATA Jun, ARIKI Yasuo
    01 Mar. 2001, 日本音響学会研究発表会講演論文集, 2001(1) (1), 15 - 16, Japanese

  • Pronunciation Evaluation and Mispronunciation Detection in English Learning.
    SAKAGUCHI Fukutaro, OGATA Jun, ARIKI Yasuo
    01 Mar. 2001, 日本音響学会研究発表会講演論文集, 2001(1) (1), 151 - 152, Japanese

  • A Study on Noise Reduction Method for Continuous Speech Recognition under Non-stationary Noisy Environments Based on Estimating Speech State Transition.
    FUJIMOTO Masakiyo, ARIKI Yasuo
    01 Mar. 2001, 日本音響学会研究発表会講演論文集, 2001(1) (1), 73 - 74, Japanese

  • Pronunciation Evaluation and Mistake Detection of Spoken Word in English Learning
    SAKAGUCHI Fukutaro, OGATA Jun, ARIKI Yasuo
    In communication learning of a second language, three abilities have to be improved: listening, speaking, and writing. In this sense, it is important to evaluate the user's pronunciation ability and detect word pronunciation mistakes in CALL (Computer-Assisted Language Learning) systems. In this paper, a method is proposed to detect word pronunciation mistakes and evaluate the user's pronunciation ability under the guidance of the CALL system. First, Japanese phoneme HMMs and English phoneme HMMs are mixed and used to evaluate the user's pronunciation ability. Then the word section is extracted based on forced alignment between the text and a spectral frame sequence. Word pronunciation mistakes are detected by comparing the likelihood score of the user's pronunciation with the likelihood score of a native speaker's pronunciation on the extracted word section.
    The Institute of Electronics, Information and Communication Engineers, 19 Jan. 2001, IEICE technical report. Speech, 100(595) (595), 49 - 56, Japanese

  • Figures Indexing for a Video Editing Support System
    NAGATA Hiroyasu, TOKUHIRA Masatsune, YAMAGUCHI Satoshi, YAMAMOTO Tetsuya, KUMANO Masahito, ARIKI Yasuo, SHUNTO Kenji, TSUKADA Kiyoshi
    The Institute of Electronics, Information and Communication Engineers, 2001, Proceedings of the IEICE General Conference, 305, 305 - 305, Japanese

  • A method to segment Goods Catalog Video into Individual Good Sections Based on Telop Recognition
    TAKAO Seiichi, ARIKI Yasuo, MATSUMOTO Hiroshi
    The Institute of Electronics, Information and Communication Engineers, 2001, Proceedings of the IEICE General Conference, 361 - 362, Japanese

  • Automatic Shot Size Discrimination Using Active Search for a Video Editing Support
    KUMANO Masahito, HAYASHI Yoshifumi, ARIKI Yasuo, Uehara Kuniaki, SHIMOJO Shinji, SHUNTO Kenji, TSUKADA Kiyoshi
    Video grammars are used in broadcast video as universal rules to convey video content unambiguously. The minimum unit in the video grammar is a shot. The shot size is classified into three relative types: loose shot, shooting objects from a long distance; middle shot, shooting from a short distance; and tight shot, shooting close up. This paper proposes a method to automatically discriminate the shot size using an active search method.
    The Institute of Image Information and Television Engineers, 2001, ITE Technical Report, 25(0) (0), 31 - 38, Japanese

  • Speech Recognition under Non-stationary Noisy Environments Using Signal Estimation Method Based on Speech State Transition Model
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we propose a non-stationary noise reduction method based on a speech state transition model. The proposed method estimates the speech signal under non-stationary noisy environments, such as background music, by applying the speech state transition model to Kalman filtering estimation. The speech state transition model represents the state transition of the speech component in non-stationary noisy speech and is modeled using a Taylor expansion. In this model, the state transition of the noise component is estimated by linear predictive estimation. To evaluate the proposed method, we carried out large vocabulary continuous speech recognition experiments under 3 types of music and compared the results with the conventionally used Parallel Model Combination (PMC) method in terms of word accuracy rate. As a result, the proposed method obtained a word accuracy rate superior to that of PMC.
    Information Processing Society of Japan (IPSJ), 21 Dec. 2000, IPSJ SIG Notes, 2000(119) (119), 19 - 24, Japanese

  • A Comparison of Confidence Measures for Improved Speech Recognition
    OGATA Jun, ARIKI Yasuo
    In this paper, we investigate several confidence measures calculated from word graphs for improved speech recognition. For confidence estimation, two main methods are compared: one is based on the number of hypotheses in the word graph, and the other is based on word posterior probabilities. We implemented them in an iterative decoding method based on confidence estimation and word graph reconstruction, and evaluated them on an LVCSR task.
    Information Processing Society of Japan (IPSJ), 21 Dec. 2000, IPSJ SIG Notes, 2000(119) (119), 113 - 118, Japanese

  • Speech Recognition under Non-stationary Noisy Environments Using Signal Estimation Method Based on Speech State Transition Model
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we propose a non-stationary noise reduction method based on a speech state transition model. The proposed method estimates the speech signal under non-stationary noisy environments, such as background music, by applying the speech state transition model to Kalman filtering estimation. The speech state transition model represents the state transition of the speech component in non-stationary noisy speech and is modeled using a Taylor expansion. In this model, the state transition of the noise component is estimated by linear predictive estimation. To evaluate the proposed method, we carried out large vocabulary continuous speech recognition experiments under 3 types of music and compared the results with the conventionally used Parallel Model Combination (PMC) method in terms of word accuracy rate. As a result, the proposed method obtained a word accuracy rate superior to that of PMC.
    The Institute of Electronics, Information and Communication Engineers, 14 Dec. 2000, IEICE technical report. Natural language understanding and models of communication, 100(520) (520), 19 - 24, Japanese

  • A Comparison of Confidence Measures for Improved Speech Recognition
    OGATA Jun, ARIKI Yasuo
    In this paper, we investigate several confidence measures calculated from word graphs for improved speech recognition. For confidence estimation, two main methods are compared: one is based on the number of hypotheses in the word graph, and the other is based on word posterior probabilities. We implemented them in an iterative decoding method based on confidence estimation and word graph reconstruction, and evaluated them on an LVCSR task.
    The Institute of Electronics, Information and Communication Engineers, 14 Dec. 2000, IEICE technical report. Natural language understanding and models of communication, 100(520) (520), 113 - 118, Japanese

  • Effectiveness of PMC for Speaker Recognition under Real Environments
    YAMASHITA Takayuki, NISHIDA Masafumi, FUJIMOTO Masakiyo, ARIKI Yasuo
    01 Sep. 2000, 日本音響学会研究発表会講演論文集, 2000(2) (2), 119 - 120, Japanese

  • A Study on Noise Reduction Method Considering Time Variation of Noise
    FUJIMOTO Masakiyo, OGATA Jun, ARIKI Yasuo
    01 Sep. 2000, 日本音響学会研究発表会講演論文集, 2000(2) (2), 123 - 124, Japanese

  • Comparison of Passage Level Spoken Document Retrieval Methods
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    01 Sep. 2000, 日本音響学会研究発表会講演論文集, 2000(2) (2), 139 - 140, Japanese

  • Speech Recognition Improvement Using Iterative Decoding Based on Confidence Measure
    OGATA Jun, ARIKI Yasuo
    01 Sep. 2000, 日本音響学会研究発表会講演論文集, 2000(2) (2), 71 - 72, Japanese

  • Organization of Video Image by Information Integration of Telop Character and Speech Dictation
    TAKAO Seiichi, ARIKI Yasuo
    To organize video, it must first be segmented into individual topics. In this paper, we study automatic topic segmentation. Conventional topic segmentation techniques require a large amount of training data that changes every day. In practice this is impossible, because differences in time and in topic distribution arise between the training data and the test data. To solve these problems, we propose a method of word space learning that integrates information on the sections where telops appear with the speech transcription. Its effectiveness was shown by carrying out topic segmentation based on the word space learning method.
    Information Processing Society of Japan (IPSJ), 26 Jul. 2000, IPSJ SIG Notes, 2000(69) (69), 377 - 382, Japanese

  • Organization of Video Image by Information Integration of Telop Character and Speech Dictation
    TAKAO Seiichi, ARIKI Yasuo
    To organize video, it must first be segmented into individual topics. In this paper, we study automatic topic segmentation. Conventional topic segmentation techniques require a large amount of training data that changes every day. In practice this is impossible, because differences in time and in topic distribution arise between the training data and the test data. To solve these problems, we propose a method of word space learning that integrates information on the sections where telops appear with the speech transcription. Its effectiveness was shown by carrying out topic segmentation based on the word space learning method.
    The Institute of Electronics, Information and Communication Engineers, 21 Jul. 2000, IEICE technical report. Data engineering, 100(228) (228), 1 - 6, Japanese

  • A Study on Confidence Based Decoder for Improved Speech Recognition
    OGATA Jun, ARIKI Yasuo
    In this paper, we study and evaluate a confidence-based decoding method for improved speech recognition. A word graph is constructed as an intermediate result in our 2-pass decoder. Confidence values are calculated from the word graph and evaluated in word graph rescoring. We propose an iterative decoding method incorporating a confidence-based search and word graph reconstruction. We evaluated the proposed method on an LVCSR task. As a result, a slight improvement in word accuracy was observed compared to the standard 2-pass method.
    Information Processing Society of Japan (IPSJ), 14 Jul. 2000, IPSJ SIG Notes, 2000(64) (64), 1 - 6, Japanese

  • A Study of Moving Object Extraction System Using Spatio Temporal Network
    TOKUHIRA Masatsune, ARIKI Yasuo
    In this paper, we propose a new method to extract moving objects by focusing on the continuity of moving objects in the spatio-temporal domain. The method aims to solve the following three problems, which are difficult for conventional moving object extraction methods: (1) extracting an object consistently as a unit of movement without object models; (2) keeping computational complexity practical; (3) detecting occlusion caused by overlapping moving objects and tracking them robustly. The method first extracts isochromatic line information that adapts to the input image by dynamically updating the bin clustering model while accumulating histogram features. It then extracts and tracks moving objects by integrating the regions divided by isochromatic lines on the basis of motion vectors.
    The Institute of Electronics, Information and Communication Engineers, 07 Jul. 2000, Technical report of IEICE. Multimedia and virtual environment, 100(184) (184), 61 - 66, Japanese

  • Automatic Viewpoints Extraction and Structuring in News Speech Article Database
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    Recently, TV news programs have come to be broadcast from all over the world owing to the digitization of broadcasting. In this situation, TV viewers want to select and watch the news articles that interest them most. To satisfy this requirement, a system is needed that can retrieve news documents related to the viewer's interests. So far, only news articles related to the viewer's queries have been retrieved. Consequently, it is difficult to retrieve related news articles when viewers cannot state their queries clearly and definitely. To solve this problem, in this paper we propose a system that structures the news speech article database by automatically extracting viewpoints and classifying the articles based on the extracted viewpoints. The experimental results showed that the system is effective when viewers cannot state their queries clearly and definitely.
    The Institute of Electronics, Information and Communication Engineers, 02 May 2000, IEICE technical report. Data engineering, 100(31) (31), 89 - 96, Japanese

  • Fast Implementation of LVCSR incorporating back-off connection
    OGATA Jun, ARIKI Yasuo
    01 Mar. 2000, 日本音響学会研究発表会講演論文集, 2000(1) (1), 43 - 44, Japanese

  • Study on Retrieval Methods Robust for Error Words Caused by Speech Recognition
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    01 Mar. 2000, 日本音響学会研究発表会講演論文集, 2000(1) (1), 61 - 62, Japanese

  • Integration of Dynamic and Static Features for Speaker Verification Using Subspace Method -Study on Priori Decision of Subspace Dimension-
    NISHIDA Masafumi, ARIKI Yasuo
    01 Mar. 2000, 日本音響学会研究発表会講演論文集, 2000(1) (1), 97 - 98, Japanese

  • A Proposal of Noise Reduction Method Based on Fast Kalman Filter for Noisy Speech Recognition
    FUJIMOTO Masakiyo, ARIKI Yasuo
    01 Mar. 2000, 日本音響学会研究発表会講演論文集, 2000(1) (1), 5 - 6, Japanese

  • A Study on Network Structure of Lexical Tree Search
    OGATA Jun, ARIKI Yasuo
    In this paper, we propose an efficient network structure for lexical tree search and evaluate the proposed method. First, we describe some problems concerning processing time and memory requirements in lexical tree search, and then we discuss a method to solve these problems. We propose a new type of network structure in which only the word with the best partial score in each frame is linked to the back-off connection. The experimental results showed that this method can reduce the processing time by about half without increasing errors.
    The Institute of Electronics, Information and Communication Engineers, 21 Jan. 2000, IEICE technical report. Speech, 99(577) (577), 35 - 40, Japanese

  • Noisy Speech Recognition Using Noise Reduction Method Based on Kalman Filter
    FUJIMOTO Masakiyo, ARIKI Yasuo
    In this paper, we propose a noise reduction method based on the Kalman filter for noisy speech recognition. Because the Kalman filter requires a huge amount of computation, it has not been used for real-time processing. We propose a noise reduction method using a fast Kalman filter that greatly reduces the computation and achieves processing in 1.5 to 2.0 times real time without losing accuracy. To evaluate the proposed method, we carried out experiments to extract clean speech signals from noisy speech and compared our results with conventional Spectral Subtraction and Parallel Model Combination (PMC) in terms of word recognition accuracy. As a result, the proposed method obtained a word recognition rate equal or superior to that of PMC.
    Information Processing Society of Japan (IPSJ), 20 Dec. 1999, IPSJ SIG Notes, 1999(108) (108), 73 - 78, Japanese

  • Comparison of Retrieval Methods to News Speech
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    Recently, TV news programs have come to be broadcast from all over the world owing to the digitization of broadcasting. In this situation, TV viewers want to select and watch the most interesting news. To satisfy this requirement, a news database has to be constructed that has automatic topic segmentation and retrieval functions. In this paper, we focus on topic retrieval. Conventional term weighting methods and vector space models are not applicable to spoken document retrieval because of erroneous words caused by speech recognition. To solve this problem, we propose mutual information considering TF-IDF as a new term weighting method, and a word space model as a new vector space model.
    The Institute of Electronics, Information and Communication Engineers, 20 Dec. 1999, IEICE technical report. Natural language understanding and models of communication, 99(523) (523), 97 - 102, Japanese

  • Pattern Recognition Viewed from Media Analysis
    ARIKI Yasuo
    Multimedia information will be distributed far more widely as digitization advances. We need fast access methods to such digitized information for our intellectual activities. For this purpose, multimedia information has to be structured and indexed by integrating media analysis methods for speech, acoustics, characters and video images. In this paper, speech dictation, topic extraction, speaker and music retrieval, and fast retrieval methods for acoustic signals are described with an emphasis on their requirements. In the same way, image structuring methods based on cut detection and camera work extraction, motion recognition and event detection, video summarization and video handling are described with an emphasis on their requirements.
    The Institute of Electronics, Information and Communication Engineers, 16 Dec. 1999, Technical report of IEICE. PRMU, 99(514) (514), 43 - 50, Japanese

  • A Study on Noisy Speech Recognition by Kalman Filter
    FUJIMOTO Masakiyo, ARIKI Yasuo
    01 Sep. 1999, 日本音響学会研究発表会講演論文集, 1999(2) (2), 31 - 32, Japanese

  • News Article Classification by Speech Dictation for Automatically Extracted Announcer Utterance
    NISHIDA Masafumi, OGATA Jun, ARIKI Yasuo
    In order to construct a news database with a video on demand (VOD) function, news articles must be classified into topics. In this paper, we propose a system which can automatically dictate news speech, extract keywords and classify news articles into topics based on the extracted keywords. We employed the χ^2 method to select keywords and to compute the degree of association between keywords and topics. We also propose dictating only the announcer utterances for classifying the news articles, because this saves dictation time. To segment the announcer speech sections from those of other speakers, we propose a speaker verification method based on the subspace method. For 48 NHK news articles, we carried out announcer utterance extraction, speech dictation and article classification. As a result, we reduced the dictation time by restricting dictation to the announcer utterances without losing classification accuracy.
    Information Processing Society of Japan (IPSJ), 15 Apr. 1999, Transactions of Information Processing Society of Japan, 40(4) (4), 1482 - 1490, Japanese

  • A study on Noisy Speech Recognition Using Improved HMM Composition
    HATAKE Tsubasa, ARIKI Yasuo
    The Institute of Electronics, Information and Communication Engineers, 08 Mar. 1999, Proceedings of the IEICE General Conference, 1999(1) (1), 255 - 255, Japanese

  • Comparison of Methods to Decide Word Importance in Unsupervised Topic Segmentation for News Speech Articles
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    01 Mar. 1999, 日本音響学会研究発表会講演論文集, 1999(1) (1), 171 - 172, Japanese

  • Organization and Retrieval of Video Data
    TANAKA Katsumi, ARIKI Yasuo, UEHARA Kuniaki
    This paper focuses on the problem of how to organize and retrieve video data effectively. First, we identify several issues to be solved. Next, we give an overview of our current research results together with a brief survey of the research area of video databases. We especially describe the following research results concerned with the organization and retrieval of video data, obtained under the Grant-in-Aid for Scientific Research on Priority Area "Advanced Databases" of the Japanese Ministry of Education: Instance-Based Video Annotation Models, Self-Organization of Video Data, and A Query Model for Fragmentally Indexed Video.
    The Institute of Electronics, Information and Communication Engineers, 25 Jan. 1999, IEICE Transactions on Information and Systems, 82(1) (1), 34 - 44, English

  • Automatic Classification of TV News Articles by Spoken Word Spotting
    ARIKI Yasuo
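    To let viewers watch only the news they most want to see from among many TV news programs, a news database is needed that can extract keywords from news articles and, based on them, classify the articles into topics such as politics and economics. In this paper, we propose a method that applies word spotting to the newscaster's speech, automatically extracts keywords related to the article content, and automatically classifies TV news articles into topics. To extract keywords from speech, we compared several previously proposed spoken word spotting methods both theoretically and experimentally, and adopted the one with few false alarms and a short processing time. For article classification, the contribution rate of each keyword to each topic is computed in advance using a "classification table index" that describes the relationship between keywords and topics. Multiplying this by the keyword existence probability obtained by spoken word spotting gives the classification probability of the article, and the article is assigned to the topic with the highest classification probability. We applied the method to 26 days of NHK news and confirmed the effectiveness of the article classification.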
    The Institute of Electronics, Information and Communication Engineers, Jan. 1999, The Transactions of the Institute of Electronics,Information and Communication Engineers., 82(1) (1), 223 - 233, Japanese

  • Study on Speech Feature Extraction based on KLT
    TOKUHIRA Masatsune, ARIKI Yasuo
    This paper presents a new feature extraction method for speech recognition based on the KLT. We examined the new method alongside several other feature extraction methods classified as "dynamic feature extraction methods". We found that the new feature extraction method improves recognition performance.
    Information Processing Society of Japan (IPSJ), 10 Dec. 1998, IPSJ SIG Notes, 1998(114) (114), 159 - 166, Japanese

  • Topic Segmentation and Classification to News Speech
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    In order to construct a news database with a video on demand (VOD) function, news articles must be classified into topics. So far, article classification has been applied to articles segmented in advance from the continuous news speech. However, when the articles are not segmented, it has been difficult to apply the classification to the continuous news speech. From this viewpoint, we propose a method for segmenting and classifying news articles from continuous news speech.
    The Institute of Electronics, Information and Communication Engineers, 10 Dec. 1998, IEICE technical report. Natural language understanding and models of communication, 98(460) (460), 55 - 62, Japanese

  • Automatic Classification of TV Sports News Video by Multiple Subspace Method
    SUGIYAMA Yoshiaki, ARIKI Yasuo
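    In video such as sports news, constraints on camera position and similar factors restrict where objects appear in the image. Representative images that characterize a particular sport therefore exist, and sports news articles can be classified on that basis into sports categories such as tennis and baseball. In this study, we propose a method that classifies sports news articles from global features of the scene composition, without explicitly recognizing physical objects in the image. Using the multiple subspace method as the classifier, we obtained an article classification rate of 98.6%.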
    The Institute of Electronics, Information and Communication Engineers, Sep. 1998, The Transactions of the Institute of Electronics,Information and Communication Engineers., 81(9) (9), 2112 - 2119, Japanese

  • Study on Topic Segmentation of News Speech
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    01 Sep. 1998, 日本音響学会研究発表会講演論文集, 1998(2) (2), 157 - 158, Japanese

  • Study on Speech Feature Extraction by Statistical Spectral Analysis
    TOKUHIRA Masatsune, ARIKI Yasuo
    01 Sep. 1998, 日本音響学会研究発表会講演論文集, 1998(2) (2), 121 - 122, Japanese

  • Speaker Verification by Subspace Method
    NISHIDA Masafumi, ARIKI Yasuo
    01 Sep. 1998, 日本音響学会研究発表会講演論文集, 1998(2) (2), 111 - 112, Japanese

  • Comparison of Keyword Selection for Classification of News Speech Articles
    TAKAO Seiichi, OGATA Jun, ARIKI Yasuo
    In order to construct a news database with a video on demand (VOD) function, news articles must be classified into topics. In this study, we implemented and compared keyword selection methods such as χ^2, mutual information and TF-IDF. The selected keywords are used to classify the articles after news speech dictation. Furthermore, we compared the classification methods that use the selected keywords.
    Information Processing Society of Japan (IPSJ), 24 Jul. 1998, IPSJ SIG Notes, 1998(68) (68), 75 - 82, Japanese

  • Comparison of Dictation and Word Spotting Techniques in Classification of News Speech Articles
    OGATA Jun, ARIKI Yasuo
    In order to construct a news database with a function of video on demand (VOD), it is required to classify news articles into topics. A news article is composed of speech to convey the contents, characters to summarize them and images to convey the situation, so that keyword spotting from the news speech is a key technique to classify the news articles. In article classification, a sequence of keywords is searched and their probabilities are computed from the news speech. Then probabilities of 10 topics are computed using the keyword probabilities and topic contribution of the keywords which are computed in advance. This paper describes comparison of dictation and word spotting techniques on accuracy of article classification.
    The Institute of Electronics, Information and Communication Engineers, 12 Jun. 1998, IEICE technical report. Speech, 98(106) (106), 67 - 72, Japanese

  • News Dictation and Article Classification for Automatically Extracted Announcer Utterance
    OGATA Jun, NISHIDA Masafumi, ARIKI Yasuo
    In order to construct a news database with a video on demand (VOD) function, news articles must be classified into topics. In this study, we describe a system which can dictate news speech, extract keywords and classify news articles based on the extracted keywords. We propose that it is sufficient to dictate only the announcer utterances for classifying the news articles, and that this reduces the processing time and increases the classification accuracy. As an experiment, we compared the classification performance of news articles in two cases: dictating only the automatically extracted announcer utterances, and dictating the whole speech, which includes reporter and interviewer utterances.
    Information Processing Society of Japan (IPSJ), 28 May 1998, IPSJ SIG Notes, 1998(49) (49), 55 - 60, Japanese

  • Improvement of Telop Character Recognition by Character Segmentation Improvement
    KATAYAMA Masao, ARIKI Yasuo
    The Institute of Electronics, Information and Communication Engineers, 06 Mar. 1998, Proceedings of the IEICE General Conference, 1998(2) (2), 236 - 236, Japanese

  • Extraction Improvement of Multiple Faces with Orientation and Size Invariance
    ISHIKAWA Noriyuki, ARIKI Yasuo
    The Institute of Electronics, Information and Communication Engineers, 06 Mar. 1998, Proceedings of the IEICE General Conference, 1998(2) (2), 338 - 338, Japanese

  • Classification of News Speech Articles by Dictation Using Word-Bigram
    OGATA Jun, MORI Haru, ARIKI Yasuo
    01 Mar. 1998, 日本音響学会研究発表会講演論文集, 1998(1) (1), 151 - 152, Japanese

  • Human information retrieval by face extraction and recognition on TV news images by subspace method
    ARIKI Y.
    1998, ACCV 97

  • Facial Region Tracking and Training Method by Subspace Projection
    SUGIYAMA Yoshiaki, ARIKI Yasuo
    Automatic facial region tracking and recognition in movie databases are strongly required for scene retrieval systems that locate a person's position and identify the person's name for scene indexing. In this paper, we propose a new framework of frame selection by subspace projection for face model construction. Our idea for facial frame selection builds on the success of the orientation-invariant face tracking and modeling using the subspace method that we proposed previously.
    The Institute of Electronics, Information and Communication Engineers, 21 Nov. 1997, Technical report of IEICE. PRMU, 97(387) (387), 77 - 82, Japanese

  • Scene Cut Detection and Article Extraction in News Video Based on Clustering of DCT Feature
    ARIKI Yasuo
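    In this paper, we propose a method for automatically extracting individual articles from news video. Each frame of the news video is compressed by the discrete cosine transform (DCT), and scene cuts are detected using the resulting DCT features. Conventional cut detection methods rely on differences between adjacent frames, so they produce false detections when the brightness of part or all of the image changes. We solve this problem by clustering the frames of the news video, based on the property that consecutive frames within the same scene are similar. News video has a syntactic structure of "moving from the studio to the scene and returning to the studio." This structure appears as a loop in the set of detected cut-point frames, so we estimate the studio shots by loop detection and extract the articles. In experiments on 30 days of NHK news, we obtained a cut detection rate of 87.9% and an article extraction rate of 99.2%. We also ran article extraction experiments on 10 days of news from three commercial broadcasters and demonstrated the method's effectiveness.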
    The Institute of Electronics, Information and Communication Engineers, 25 Sep. 1997, The Transactions of the Institute of Electronics,Information and Communication Engineers., 80(9) (9), 2421 - 2427, Japanese

  • Voice Conversion Based on Exchange of Speaker Subspaces
    ARIKI Yasuo, NISHIMURA Kanto, FUJIMOTO Masakiyo
    This paper proposes a voice conversion method for spontaneous speech by exchanging speaker subspaces between an input speaker and the target speaker, after separating speaker characteristics and phonetic information included in the speech. The speaker subspaces of the input speaker and the target speaker are constructed to maximize their corresponding axes by canonical correlation analysis (CCA). In addition, the axes of each speaker subspace are orthonormalized in order to synthesize the speech. Namely, the CCA is extended to orthogonal canonical correlation analysis. In the speech analysis, power-spectrum envelope (PSE) analysis is employed to accurately extract the poles and zeros of the spectrum. In the speech synthesis, on the other hand, the impulse response convolution method is employed to improve the voice quality.
    The Institute of Electronics, Information and Communication Engineers, 19 Jun. 1997, IEICE technical report. Speech, 97(114) (114), 17 - 24, Japanese

  • 動画像データベース:内容記述とコンテンツによる構造化 (特集 データベース研究最前線--高度データベースプロジェクト)
    田中 克己, 上原 邦昭, 有木 康雄
    サイエンス社, May 1997, Computer today, 14(3) (3), 34 - 39, Japanese

  • Voice Conversion Based on Exchange of Speaker Spaces.
    FUJIMOTO Masakiyo, ARIKI Yasuo
    01 Mar. 1997, 日本音響学会研究発表会講演論文集, 1997(1) (1), 265 - 266, Japanese

  • Speaker Normalization and Recognition Based on Speaker Subspace Projection
    ARIKI Yasuo
    01 Mar. 1997, 日本音響学会研究発表会講演論文集, 1997(1) (1), 23 - 26, Japanese

  • Integration of Face and Speaker Recognition by Normalized Multiple Feature Subspace Method
    ISHIKAWA Noriyuki, ARIKI Yasuo
    In this paper, we propose a modified CLAFIC method in face recognition and speaker recognition. This method is derived by modifying a CLAFIC method commonly used in the pattern recognition by a subspace method. The modified CLAFIC method translates subspace origin to the centroid of all the training data so that it can discriminate categories more precisely. We carried out face and speaker recognition by this modified CLAFIC method and showed the effectiveness. We also propose in this paper an integration method of face and speaker recognition results in the subspace method. Input vectors are projected to the face and speaker subspaces, and then their squared lengths of the projected vectors are added. We carried out an integration experiment of the face and speaker recognition and showed the effectiveness of this method.
    The Institute of Electronics, Information and Communication Engineers, 28 Jun. 1996, Technical report of IEICE. PRMU, 96(141) (141), 31 - 38, Japanese

  • Orientation Invariant Face Extraction and Recognition based on Subspace Method
    Komatsu Yoshie, Ariki Yasuo
    This paper describes a method to recognize human faces by a subspace method regardless of their orientation. The method represents human facial images in a subspace spanned by the eigenvectors computed by KL-expansion of the facial images. In facial subspace training of an individual person, the facial images are taken at every 5, 10, or 15 degrees from the right to the left side of the face over 180 degrees. We investigated the relation between the recognition rate and the number of facial images used in the training. We also compared the recognition rates of one subspace (all orientations) and three subspaces (right, front, and left orientations). Using these subspaces, we evaluated facial region extraction and recognition regardless of orientation.
    The Institute of Electronics, Information and Communication Engineers, 1996, IEICE technical report. Pattern recognition and understanding, 95, 7 - 14, Japanese

  • A Study on Multi-Subspace Method for Handwritten Kanji Character Recognition
    MOTEGI Yuji, ARIKI Yasuo
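    For handwritten character recognition, statistical methods and neural-network-based methods, among others, have been proposed. Neural-network-based methods have the drawback that training takes a long time as the number of classification categories grows. In this study, we apply the subspace method, which does not require concern about training time or training convergence, to handwritten character recognition. Conventionally, the subspace method assigns one subspace to each category. Here we report that assigning multiple subspaces to each category improved classification ability. Because multiple subspaces are assigned to a single category, we call this approach the multi-subspace method.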
    The Institute of Electronics, Information and Communication Engineers, 1994, 信学'94秋大, 312, 312 - 312, Japanese

  • Speaker Recognition based on Subspace Methods
    ARIKI Y.
    1994, ICSLP94

  • Mixture density HMMs with two-level transition
    Ariki Yasuo
    Acoustical Society of Japan, 1993, Journal of the Acoustical Society of Japan, 14(4) (4), 279 - 280, English

  • Phoneme Recognition Improvement in Concatenated HMM Training
    Ariki Yasuo, Doshita Shuji
    INSTITUTION FOR PHONETIC SCIENCES UNIVERSITY OF KYOTO, 1993, 音声科学研究 = Studia phonologica, 27, 55 - 65, English

  • Effectiveness of Time Duration Constraints in English Phoneme Recognition by HMM
    ARIKI Yasuo
    電子情報通信学会情報・システムソサイエティ, 25 Dec. 1992, The Transactions of the Institute of Electronics,Information and Communication Engineers., 75(12) (12), 1933 - 2001, Japanese

■ Books And Other Publications
  • Evaluation of an Active Microphone with a Parabolic Reflection Board for Monaural Sound-Source-Direction Estimation (Chapter on Soundscape Semiotics - Localisation and Categorisation. Book edited by Hervé Glotin)
    TAKIGUCHI Tetsuya, TAKASHIMA Ryoichi, ARIKI Yasuo
    Joint editor, I-Tech Education and Publishing, Feb. 2014, English, In this chapter, we introduce the concept of an active microphone that achieves a good combination of active-operation and signal processing. The active microphone has a parabolic reflection board, which is extremely simple in construction. The reflector and its associated microphone rotate together, perform signal processing, and seek to locate the direction of the sound source., ISBN: 9789535112266
    Scholarly book

  • ディジタル信号処理
    ARIKI YASUO, TAKIGUCHI TETSUYA, KAJIKAWA YOSHINOBU, BANNO HIDEKI, MANO KAZUNORI, TAKAHASHI MASANOBU
    Joint work, オーム社, Jan. 2013, Japanese, ISBN: 9784274213052
    Textbook

  • Single-Channel Sound Source Localization Based on Discrimination of Acoustic Transfer Functions, Chapter on "Advances in Sound Localization" Book edited by Powel Strumillo
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Joint work, Intech Open Publisher, Mar. 2011, English
    Scholarly book

  • Video Editing Based on Situation Awareness from Voice Information and Face Emotion, Chapter on "Digital Video." Book edited by Floriano De Rango.
    TAKIGUCHI Tetsuya, ADACHI Jun, ARIKI Yasuo
    Joint work, I-Tech Education and Publishing, Feb. 2010, English
    Scholarly book

  • 3D Human Posture Estimation Using HOG Features of Monocular Images, Chapter on "Pattern Recognition." Book edited by Peng-Yeng Yin.
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Joint work, I-Tech Education and Publishing, Oct. 2009, English
    Scholarly book

  • System Request Utterance Detection Based on Acoustic and Linguistic Features
    Takiguchi Tetsuya, Sako Atsushi, Yamagata Tomoyuki, Ariki Yasuo
    Joint work, I-Tech Education and Publishing, Nov. 2008, English
    Scholarly book

  • Voice and Noise Detection with AdaBoost
    TAKIGUCHI Tetsuya, MIYAKE Nobuyuki, MATSUDA Hiroyoshi, ARIKI Yasuo
    Joint work, I-Tech Education and Publishing, 2007, English
    Scholarly book

  • Spoken Language Systems
    ARIKI Yasuo
    Joint work, Ohmsha, 2005, English
    Scholarly book

  • 情報の構造化と検索 (岩波講座マルチメディア情報学; 第8巻)
    西尾章治郎, 田中克己, UEHARA KUNIAKI, ARIKI YASUO, 加藤俊一, 河野浩之
    Joint work, 岩波書店, Mar. 2000, Japanese, ISBN: 4000109685
    Scholarly book

  • 情報メディア工学
    柳田益造, ARIKI YASUO, 八村広三郎
    Joint work, オーム社, Jun. 1999, Japanese, ISBN: 427413184X
    Textbook

  • パターン認識・理解の新しい展開に向けて
    小川英光, 古井貞煕, ARIKI YASUO
    Joint work, 電子情報通信学会, Mar. 1994, Japanese, ISBN: 488552119X
    Scholarly book

  • Hidden Markov Models for Speech Recognition
    X D Huang, Yasuo Ariki, M A Jack
    Joint work, Edinburgh University Press, Sep. 1990, English
    Scholarly book

■ Lectures, oral presentations, etc.
  • 物体振動を用いた畳み込みニューラルネットワークによる音源復元
    FUSE Yohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第21回画像の認識・理解シンポジウム, 2018, Japanese, Domestic conference
    Poster presentation

  • ハイスピードカメラ画像を用いた唇動画像からの音声生成
    TAKASHIMA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第21回画像の認識・理解シンポジウム, 2018, Japanese, Domestic conference
    Poster presentation

  • Knowledge graph embeddings for Zero-Shot Learning
    Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi
    第21回画像の認識・理解シンポジウム, 2018, Japanese, Domestic conference
    Poster presentation

  • 災害応急対策支援を目的とした衛星画像の被覆分類精度向上について
    YOSHIHARA Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第20回画像の認識・理解シンポジウム, 2017, Japanese, Domestic conference
    Poster presentation

  • Automation of hospital patients’ leftover food quantity estimation
    Tristan Hascoet, Yasuo Ariki, Tetsuya Takiguchi
    第20回画像の認識・理解シンポジウム, 2017, English, Domestic conference
    Poster presentation

  • 没入型バーチャルリアリティ空間における足元知覚の計測システムの開発
    NAKATANI Masashi, ENAMI Naoko, NIWA Yuudai, YASUOKA Akiko, WADA Honoka, KITA Shinichi, ARIKI Yasuo
    画像の認識・理解シンポジウム, Aug. 2016, Japanese, 電子情報通信学会, 浜松, Domestic conference
    Poster presentation

  • 衛星画像解析と地図情報の統合による被害状況地図の作成
    YOSHIHARA Atsushi, SASAJIMA Keisuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, Aug. 2016, Japanese, Domestic conference
    Poster presentation

  • 映像中の変動の大きな物体に対する音源復元のための物体振動抽出手法の検討
    YASUMI Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, Aug. 2016, Japanese, Domestic conference
    Poster presentation

  • SIFT Boosting for Handwriting Recognition
    CHEN Jinhui, KAMIHIGASHI Takashi, ITOH Munehiko, TAKATSUKI Yasuo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, Aug. 2016, English, Domestic conference
    Poster presentation

  • Object-Based Geo-Eye Satellite Image Segmentation for Tsunami Disaster Map Preparation
    Mohammad Reza Poursaber, Yasuo Ariki, Tetsuya Takiguchi, Atsushi Yoshihara, Mohammad Safi
    画像の認識・理解シンポジウム, Aug. 2016, English, Domestic conference
    Poster presentation

  • Convolutional Neural Networksを用いた物体の機能推定
    AZUMA Ryunosuke, KITANO Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, Aug. 2016, Japanese, Domestic conference
    Poster presentation

  • 適応型 Restricted Boltzmann Machine を用いたパラレルデータフリーな任意話者声質変換
    NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2015年春季研究発表会, Mar. 2015, Japanese, Domestic conference
    Oral presentation

  • 少量のパラレルデータを用いたNon-negative Matrix Factorizationによる雑音環境下の声質変換
    藤井 貴生, 相原 龍, NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2015年春季研究発表会, Mar. 2015, Japanese, Domestic conference
    Oral presentation

  • Deep Boltzmann Machine を用いた音素ラベル情報推定
    高島 悠樹, NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2015年春季研究発表会, Mar. 2015, Japanese, Domestic conference
    Oral presentation

  • 色名顕著性による物体特定
    OZASA Yuko, ENAMI Naoko, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2015, Japanese, Domestic conference
    Poster presentation

  • 色特徴を用いた追い抜き車両の特定
    AZUMA Ryunosuke, ENAMI Naoko, OZASA Yuko, YURIMOTO Mizuki, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2015, Japanese, Domestic conference
    Poster presentation

  • アノテーションに基づくDeformable Part Modelによる顔部品検出
    NISHIDA Kazuhiro, ENAMI Naoko, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2015, Japanese, Domestic conference
    Poster presentation

  • Modeling Deep Bidirectional Relationships for Image Classification and Generation
    NAKASHIKA Toru, Tetsuya Takiguchi, Yasuo Ariki
    画像の認識・理解シンポジウム, 2015, Japanese, Domestic conference
    Poster presentation

  • Deformable Part Modelを用いた物体の機能推定
    KITANO Yosuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2015, Japanese, Domestic conference
    Poster presentation

  • Convolutional Bottleneck Networks を用いた重度難聴者のマルチモーダル音声認識
    TAKASHIMA Yuki, KAKIHARA Yasuhiro, AIHARA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo, MITANI Nobuyuki, OMORI Kiyohiro, NAKAZONO Kaoru
    画像の認識・理解シンポジウム, 2015, Japanese, Domestic conference
    Poster presentation

  • A Robust Multi-classification Algorithm Using Learning SURF Cascade for Emotional Recognition
    Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    画像の認識・理解シンポジウム, 2015, English, Domestic conference
    Poster presentation

  • 話者適応型 Restricted Boltzmann Machine を用いた声質変換の検討
    NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2014年秋季研電子情報通信学会技術研究報告究発表会, Dec. 2014, Japanese, Domestic conference
    Oral presentation

  • 話者適応を用いたNMFによる雑音環境下の声質変換
    藤井 貴生, 相原 龍, NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2014年秋季研究発表会, Sep. 2014, Japanese, Domestic conference
    Oral presentation

  • 話者依存型 Recurrent Temporal Restricted Boltzmann Machine を用いた声質変換
    NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2014年秋季研究発表会, Sep. 2014, Japanese, Domestic conference
    Oral presentation

  • 遺伝的アルゴリズムを用いた 構音障害者の音声特徴量抽出に最適なランダム行列の生成
    片岡 悠一郎, NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    日本音響学会2014年秋季研究発表会, Sep. 2014, Japanese, Domestic conference
    Oral presentation

  • スパース表現に基づく声質変換のための結合型 restricted Boltzmann machine
    NAKASHIKA TORU, TAKIGUCHI TETSUYA, ARIKI YASUO
    電子情報通信学会技術研究報告, May 2014, Japanese, Domestic conference
    Oral presentation

  • 音声・画像処理の共通点と統合・変換処理について
    ARIKI Yasuo
    情報処理学会東海支部主催講演会, Mar. 2014, Japanese, 情報処理学会東海支部, 豊橋, Describes the commonalities between speech and image processing and their integration and conversion processing., Domestic conference
    [Invited]
    Invited oral presentation

  • 物体の機能発現を可能とする属性情報の抽出
    KITANO Yosuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • 視覚障碍者の歩行支援のための交差点上の歩行者位置・進行方向推定
    KAWAGUCHI Satoshi, ENAMI Naoko, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • ボールと選手のHMMを統合したサッカー映像のイベント認識
    WANG Hejin, ITOH Hiroki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • Web画像を用いた一般物体認識と指示発話の音声認識を統合した物体選択法
    NISHIMURA Hitoshi, OZASA Yuko, ARIKI Yasuo, NAKANO Mikio
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • The Level of Skill Model for Piano Performance:Analyzing Gaze on Music Videos
    NUMANO Shunsuke, ENAMI Naoko, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • Modeling Context of Pedestrian and Background in Pedestrian Detection
    ENAMI Naoko, TAKAYANAGI Yohei, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • A Robust Learning Algorithm Based on SURF and PSM for Facial Expressions Recognition
    Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • AAMによる顔方位に依存しない連続発話認識
    LI Yiting, YANG Nan, TAKIGUCHI Tetsuya, ARIKI Yasuo
    画像の認識・理解シンポジウム, 2014, Japanese, Domestic conference
    Poster presentation

  • 人検出のためのDifference of Gaussianに基づくHOG特徴量選択
    髙柳 陽平, ENAMI NAOKO, ARIKI YASUO
    画像の認識・理解シンポジウム, Aug. 2013, Japanese, Domestic conference
    Poster presentation

  • Accurate Vehicle Localization using Flow Estimation for Navigation System
    百合本 瑞規, ENAMI NAOKO, ARIKI YASUO
    画像の認識・理解シンポジウム, Aug. 2013, Japanese, Domestic conference
    Poster presentation

  • 物体の機能に基づく認識
    TANAKA Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第16回画像の認識・理解シンポジウム, Jul. 2013, Japanese, 情報処理学会CVIM研究会, 東京, This study investigates recognition based on an object's function rather than on image pattern recognition of the object., Domestic conference
    Poster presentation

  • サッカー映像におけるホイッスル音声情報を利用したイベント検出
    ITOH Hiroki, TAKIGUCHI Tetsuya, ARIKI Yasuo
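    第16回画像の認識・理解シンポジウム, Jul. 2013, Japanese, 情報処理学会CVIM研究会, 東京, This study proposes an event detection method for soccer, a sport that is popular worldwide. Here, an event means an out-of-play situation in a soccer match, such as a goal kick, a corner kick, or a goal., Domestic conference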
    Poster presentation

  • RGB-D based 3D-Object Recognition by LLC using Depth Spatial Pyramid
    NAKASHIKA Toru, HORI Takahiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第16回画像の認識・理解シンポジウム, Jul. 2013, English, 情報処理学会CVIM研究会, 東京, In our proposed approach, the overall object shape is captured by the depth spatial pyramid based on depth information. In more detail, multiple features within each sub-region of the depth spatial pyramid are pooled. As a result, the feature representation including the depth topological information is constructed. We use not only SIFT, but also histograms of oriented normal v, Domestic conference
    Poster presentation

  • Object Recognition by Integrated Information Using Speech and Web Images
    NISHIMURA Hitoshi, OZASA Yuko, ARIKI Yasuo, NAKANO Mikio
    第16回画像の認識・理解シンポジウム, Jul. 2013, English, 情報処理学会CVIM研究会, 東京, In this paper, instead of the manual construction, we propose an automatic image model construction method for object recognition using Web images. The effectiveness of the proposed method is verified in the object recognition by integrating speech and image features., Domestic conference
    Poster presentation

  • Image Classification Based on CodeBook on CodeBooks
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第16回画像の認識・理解シンポジウム, Jul. 2013, English, 情報処理学会CVIM研究会, 東京, In this paper, we propose a novel image classification approach, Locality-constrained Linear Coding with codebook on codebooks. The flow of our proposed method is: i) generate a class codebook from each class using local descriptors of the class, ii) generate a global codebook based on class codebooks, and iii) encode local descriptors to codes with LLC based on the global cod, Domestic conference
    Poster presentation

  • Human Emotions Estimation Using Combination of 3D Average Face and LUT-AdaBoost
    CHEN Jinhui, ARIKI Yasuo, TAKIGUCHI Tetsuya
    第16回画像の認識・理解シンポジウム, Jul. 2013, English, 情報処理学会CVIM研究会, 東京, One of the most crucial techniques associated with Computer Vision is technology that deals with facial recognition, especially, the automatic estimation of human emotions. However, in real-time facial expression recognition, when a face turns sideways, the expressional feature extraction becomes difficult as the view of camera changes and recognition accuracy degrades signific, Domestic conference
    Poster presentation

  • High-frequency Restoration using Deep Belief Nets for Super-resolution
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第16回画像の認識・理解シンポジウム, Jul. 2013, English, 情報処理学会CVIM研究会, 東京, In this paper, we propose a novel super-resolution method using DBNs to restore the missing high-frequencies, motivated by the above-mentioned characteristics of DBNs. In our approach, a low-resolved image is first scaled up to the prescribed size by using bicubic interpolation, and the high-frequency information is estimated by inference of trained DBNs. The networks are train, Domestic conference
    Poster presentation

  • AAMを用いた音声・画像による連続発話認識への構想
    YANG Nan, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第16回画像の認識・理解シンポジウム, Jul. 2013, Japanese, 情報処理学会CVIM研究会, 東京, This study proposes an image feature extraction method using AAM parameters for multimodal continuous speech recognition., Domestic conference
    Poster presentation

  • 単眼サッカー映像における時間状況グラフを用いた選手の3次元追跡
    ITOH Hiroki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2012, Japanese, 電子情報通信学会, 岡山, Domestic conference
    Poster presentation

  • 視点移動カメラで撮影したサッカー映像中のボール追跡とイベント検出
    SOWA Tomoya, ARIKI Yasuo, TAKIGUCHI Tetsuya
    IEICE, Mar. 2012, Japanese, 電子情報通信学会, 岡山, Domestic conference
    Poster presentation

  • 使用履歴に基づくユーザー嗜好を考慮したPOMDPによる音声対話システム
    FUJIKAWA Kenji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Poster presentation

  • 構音障害者を対象としたSSMを用いた音声認識の検討
    ISHII Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Poster presentation

  • 学習画像の選択に基づくAAMの繰り返し適応
    TAKAYANAGI Yohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2012, Japanese, 電子情報通信学会, 岡山, Domestic conference
    Poster presentation

  • 音響尤度を用いたマルチスピーカ音響エコーキャンセラの検討
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Oral presentation

  • 音響伝達特性を用いたシングルチャネル音源位置推定における未学習位置の推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Oral presentation

  • スペクトルと韻律を特徴量としたGMMによる感情音声変換
    AIHARA Ryo, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Oral presentation

  • スパース表現に基づく構音障害者の発話スタイル変動にロバストな特徴量抽出
    YOSHIOKA Toshiya, TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Poster presentation

  • Web画像を用いたカテゴリ別Visual Wordsによる一般物体認識
    TANAKA Yuto, ARIKI Yasuo, TAKIGUCHI Tetsuya
    IEICE, Mar. 2012, Japanese, 電子情報通信学会, 岡山, Domestic conference
    Poster presentation

  • Random Projection を用いた構音障害者の音声認識
    TAKATSUKA Tomonori, TAKIGUCHI Tetsuya, ARIKI Yasuo, RI Yoshiaki
    ASJ 2012 Spring meeting, Mar. 2012, Japanese, 日本音響学会, 神奈川, Domestic conference
    Poster presentation

  • Human Emotions Estimation by Adaboost Based on User's Facial Expression and Average Face from Different Directions
    CHEN Jinhui, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2012, English, 電子情報通信学会, 岡山, Domestic conference
    Poster presentation

  • An AdaBoost-Based Weighting Method for Localizing Human Brain Magnetic Activity
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAWAKATSU Masaki, KOTANI Makoto
    ASJ 2012 Spring meeting, Mar. 2012, English, 日本音響学会, 神奈川, Domestic conference
    Oral presentation

  • Age Estimation Based on Gaussian Process Regression of AAM Parameters Using Hollywood Database
    GAO Songzhu, ARIKI Yasuo, TAKIGUCHI Tetsuya
    IEICE, Mar. 2012, Japanese, 電子情報通信学会, 岡山, Domestic conference
    Poster presentation

  • 尤度最大化に基づくエコー推定を用いたマルチスピーカ音響エコーキャンセラの検討
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Poster presentation

  • 未知語モデルを用いたCRFに基づく音声認識誤り訂正
    NAKATANI Ryohei, IWAHASHI Naoto, NAKANO Mikio, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Oral presentation

  • 文脈特徴を用いたCRFによる音声認識誤り訂正
    NAKATANI Ryohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Poster presentation

  • 構音障害者を対象とした混合正規分布モデルに基づく統計的声質変換に関する研究
    ISHII Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Poster presentation

  • 音響伝達特性を用いた単一マイクロホンによる話者の頭部方向の推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Oral presentation

  • スパース性基準によるF0 周波数選択を用いたSpecmurt による多重音解析
    NISHIMURA Daiki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Oral presentation

  • 3次元特徴量を用いた構造表現による一般物体認識
    HORI Takahiro, IWAHASHI Naoto, NAKANO Mikio, ARIKI Yasuo
    FIT2011, Sep. 2011, Japanese, 情報処理学会, 函館, Domestic conference
    Oral presentation

  • 2ch マイクによるCSP 係数の識別に基づく話者の頭部方向の推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Autumn meeting, Sep. 2011, Japanese, 日本音響学会, 島根, Domestic conference
    Oral presentation

  • 固有空間法による構音障害者の母音声質変換の検討
    ISHII Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2011, Japanese, 電子情報通信学会, 東京, Domestic conference
    Others

  • 確率スペクトルを用いた基底生成モデルとNMFによる混合楽音解析
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Spring meeting, Mar. 2011, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 音響伝達特性の判別に基づく単一チャネル音源位置推定におけるMKL-SVMを用いた特徴量重みの自動学習
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Spring meeting, Mar. 2011, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • スパース性を考慮したSpecmurtによる多重音解析
    NISHIMURA Daiki, NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Spring meeting, Mar. 2011, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • グラフ-ベクトル変換を用いたグラフ構造表現による一般物体認識
    HORI Takahiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2011, Japanese, 電子情報通信学会, 東京, Domestic conference
    Others

  • CRFとConfusion Networkを用いた音声認識誤り訂正
    NAKATANI Ryohei, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2011 Spring meeting, Mar. 2011, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • ARCOを特徴量とする顔検出の併用による人誤検出の棄却
    YAMASHITA Ryo, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2011, Japanese, 電子情報通信学会, 東京, Domestic conference
    Others

  • 2+3次元Active Appearance Modelを用いた視線方向推定
    NAKAMATSU Yukari, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE, Mar. 2011, Japanese, 電子情報通信学会, 東京, Domestic conference
    Others

  • 音響伝達特性を用いた単一チャネル音源位置推定における特徴量選択の検討
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Autumn meeting, Sep. 2010, Japanese, 日本音響学会, 大阪, Domestic conference
    Oral presentation

  • バイラテラルフィルタによる雑音重畳音声の認識効果に関する検討
    YAMADA Kenshiro, ARIKI Yasuo, TAKIGUCHI Tetsuya
    ASJ 2010 Autumn meeting, Sep. 2010, Japanese, 日本音響学会, 大阪, Domestic conference
    Poster presentation

  • NMFと基底モデルを用いた多重楽音解析
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Autumn meeting, Sep. 2010, Japanese, 日本音響学会, 大阪, Domestic conference
    Poster presentation

  • MKLによる構音障害者の音声特徴量評価
    TAKATSUKA Tomonori, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao
    ASJ 2010 Autumn meeting, Sep. 2010, Japanese, 日本音響学会, 大阪, Domestic conference
    Poster presentation

  • Buried Markov Modelを用いた構音障害者の音声認識の検討
    MIYAMOTO Chikoto, KOMAI Yuto, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao
    ASJ 2010 Autumn meeting, Sep. 2010, Japanese, 日本音響学会, 大阪, Domestic conference
    Poster presentation

  • 部分観測マルコフ決定過程を用いたカーナビゲーションシステムにおける音声対話
    KISHIMOTO Yasuhide, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 多重関数を用いた調波時間スペクトル形状のモデル化による音声合成
    NAKASHIKA Toru, TACHIBANA Ryuki, NISHIMURA Masafumi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 識別的言語モデルに基づくConfusion Network上での音声認識誤り訂正
    MATSUMOTO Tomohiko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 残響適応パラメータを用いた単一チャネル音源位置推定の検討
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 雑音環境下音声認識のためのバイラテラルフィルタを用いた音声特徴量抽出
    YAMADA Keishiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 階層的領域分割法に基づく木構造条件付確率場による一般物体認識
    OKUMURA Takeshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE2010, Mar. 2010, Japanese, 電子情報通信学会, 仙台, Domestic conference
    Oral presentation

  • ランダムプロジェクションを用いた音響モデルの線形変換
    YOSHII Mariko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • ウェーブレット変換を用いた学習型の超解像
    OGAWA Yuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE2010, Mar. 2010, Japanese, 電子情報通信学会, 仙台, Domestic conference
    Oral presentation

  • PLSA による構音障害者の音素体系構築の検討
    TAKATSUKA Norihiro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • Buried Markov Model の構造構築における独立性検定法の検討
    YAMAMOTO Takayuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2010 Spring Meeting, Mar. 2010, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 3次元パーティクルフィルタとEMDを用いた選手の追跡
    NISHINO Takuro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    IEICE2010, Mar. 2010, Japanese, 電子情報通信学会, 仙台, Domestic conference
    Oral presentation

  • 話題追従型言語モデルについての考察
    WATANABE Shinji, IWATA Tomoharu, HORI Takaaki, SAKO Atsushi, ARIKI Yasuo
    ASJ 2009 Autumn Meeting, Sep. 2009, Japanese, 日本音響学会, 郡山, Domestic conference
    Oral presentation

  • 複数特徴量の重み付け統合による一般物体認識
    SUGA Akira, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Information Processing Society of Japan Kansai Branch, Sep. 2009, Japanese, 情報処理学会関西支部, 神戸, Domestic conference
    Oral presentation

  • 多重ベータ分布を用いた音色形状の数理モデリングによる楽器音生成
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Autumn Meeting, Sep. 2009, Japanese, 日本音響学会, 郡山, Domestic conference
    Poster presentation

  • 人物の顔画像情報に基づくコンテンツの解析
    OKADA Tomoko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    FIT2009, Sep. 2009, Japanese, 情報処理学会など, 仙台, Domestic conference
    Oral presentation

  • 高精度画像マッチングを用いたSAR衛星画像からの地表変位推定
    MIZUNO Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    FIT2009, Sep. 2009, Japanese, 情報処理学会など, 仙台, Domestic conference
    Oral presentation

  • 局所特徴量を用いた構音障害者の音声認識の検討
    MIYAMOTO Chikoto, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    ASJ 2009 Autumn Meeting, Sep. 2009, Japanese, 日本音響学会, 郡山, Domestic conference
    Poster presentation

  • Random Projection を用いた音声特徴量抽出におけるRandom Matrix の統合
    YOSHII Mariko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Autumn Meeting, Sep. 2009, Japanese, 日本音響学会, 郡山, Domestic conference
    Poster presentation

  • HMMを用いた音響伝達特性の推定と音源位置推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Autumn Meeting, Sep. 2009, Japanese, 日本音響学会, 郡山, Domestic conference
    Poster presentation

  • Buried Markov Model を用いた音声認識モデルの検討
    YAMAMOTO Takayuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Autumn Meeting, Sep. 2009, Japanese, 日本音響学会, 郡山, Domestic conference
    Poster presentation

  • Bottom-upとTop-downアプローチの組み合わせによる単眼画像からの人体3次元姿勢推定
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Information Processing Society of Japan Kansai Branch, Sep. 2009, Japanese, 情報処理学会関西支部, 神戸, Domestic conference
    Oral presentation

  • 震災時の用水確保に向けた水道事業体と地域社会との協同のあり方~「緊急時の用水確保に対する研究会」の活動~
    齋藤 雅彦, 沖村 孝, 有木 康雄, 平山 修久, 鍬田 泰子
    第60回全国水道研究発表会講演集, May 2009, Japanese, 社団法人 日本水道協会, 埼玉, Domestic conference
    Oral presentation

  • 震災時の用水確保に向けた水道事業体と地域社会との共同のあり方-「緊急時の用水確保に対する研究会」の活動-
    SAITO Masahiko, OKIMURA Takashi, ARIKI Yasuo, HIRAYAMA Nagahisa, KUWATA Yasuko
    Japan Water Works Association, May 2009, Japanese, 日本水道協会, 埼玉, Domestic conference
    Oral presentation

  • 官民協働による緊急時の用水確保に向けた取り組み事例報告 ―「緊急時の用水確保に対する研究会」における実践的事例―
    OKUMURA Yoshihiro, OKIMURA Takashi, ARIKI Yasuo, HIRAYAMA Nagahisa, IMAI Mitsuhiko, MATSUOKA Mikio, DOI Syoichi
    Japan Water Works Association, May 2009, Japanese, 日本水道協会, 埼玉, Domestic conference
    Oral presentation

  • 尤度最大化基準を用いたエコー推定に基づく車室内マルチスピーカ音響エコーキャンセラの検討
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 複数特徴量の重み付け統合による一般物体認識
    SUGA Akira, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Proceedings of the 2009 IEICE General Conference, Mar. 2009, Japanese, 電子情報通信学会, 松山市, Domestic conference
    Oral presentation

  • 複数の言語情報を用いたCRFによる音声認識誤りの検出
    MATSUMOTO Tomohiko, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 単眼動画像におけるボールと選手の3次元位置推定
    NISHINO Takuro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Proceedings of the 2009 IEICE General Conference, Mar. 2009, Japanese, 電子情報通信学会, 松山市, Domestic conference
    Oral presentation

  • 多重ベータ分布による音色形状モデルを用いた多重楽音の解析
    NAKASHIKA Toru, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 雑談中の潜在的話題遷移を考慮したユーザーの意図推定の検討
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 構音障害者の連続音声認識の検討
    MIYAMOTO Chikoto, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 勾配ヒストグラムに基づく時間-周波数特徴を用いた単語認識
    MUROI Takashi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 音響伝達特性モデルを用いたシングルチャネル音源位置推定の検討
    TAKASHIMA Ryoichi, SUMIDA Yuji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • 位相限定相関法を用いたマイクロ波レーダからの地表変位推定
    MIZUNO Yusuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Proceedings of the 2009 IEICE General Conference, Mar. 2009, Japanese, 電子情報通信学会, 松山市, Domestic conference
    Oral presentation

  • ランダムプロジェクションを用いた音声特徴量抽出
    YOSHII Mariko, TAKIGUCHI Tetsuya, ARIKI Yasuo, Jeff Bilmes
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • パラボラ反射板による音響伝達特性の変化を用いたシングルチャネル音源方向推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2009 Spring meeting, Mar. 2009, Japanese, 日本音響学会, 東京, Domestic conference
    Poster presentation

  • Bottom-UpとTop-Down アプローチの統合による単眼画像からの人体3次元姿勢推定
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Proceedings of the 2009 IEICE General Conference, Mar. 2009, Japanese, 電子情報通信学会, 松山市, Domestic conference
    Oral presentation

  • AAMのモデル選択による方位に頑健な不特定人物の顔表情認識
    OKADA Tomoko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Proceedings of the 2009 IEICE General Conference, Mar. 2009, Japanese, 電子情報通信学会, 松山市, Domestic conference
    Oral presentation

  • AAMと回帰分析による視線、顔方向同時推定
    TAKATANI Manabu, TAKIGUCHI Tetsuya, ARIKI Yasuo
    Proceedings of the 2009 IEICE General Conference, Mar. 2009, Japanese, 電子情報通信学会, 松山市, Domestic conference
    Oral presentation

  • 顔表情クラスタリングによる映像コンテンツへのタギング
    MIYAHARA Masanori, AOKI Masaki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    情報処理学会平成20年度関西支部大会, Oct. 2008, Japanese, 情報処理学会, 京都, Domestic conference
    Oral presentation

  • SIFTとGraph Cuts を用いた物体認識及びセグメンテーション
    SUGA Akira, FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    情報処理学会平成20年度関西支部大会, Oct. 2008, Japanese, 情報処理学会, 京都, Domestic conference
    Oral presentation

  • HOG特徴に基づく単眼画像からの人体3次元姿勢推定
    ONISHI Katsunori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    情報処理学会平成20年度関西支部大会, Oct. 2008, Japanese, 情報処理学会, 京都, Domestic conference
    Oral presentation

  • AdaBoostとSaliency Mapを用いたGraph Cutsによる物体領域の自動抽出法
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    情報処理学会平成20年度関西支部大会, Oct. 2008, Japanese, 情報処理学会, 京都, Domestic conference
    Oral presentation

  • 勾配に基づく特徴量を用いた音声認識の検討
    MUROI Takashi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2008 Autumn Meeting, Sep. 2008, Japanese, 日本音響学会, 福岡, Domestic conference
    Poster presentation

  • メタモデルと音響モデルの統合による構音障害者の音声認識
    MATSUMASA Hironori, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    ASJ 2008 Autumn Meeting, Sep. 2008, Japanese, 日本音響学会, 福岡, Domestic conference
    Poster presentation

  • アクティブマイクロフォンによる音響伝達特性を用いたシングルチャネル音源方向推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2008 Autumn Meeting, Sep. 2008, Japanese, 日本音響学会, 福岡, Domestic conference
    Poster presentation

  • MDL基準とICAを用いた統合音素部分空間による音声特徴量抽出の検討
    PARK Hyunshin, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2008 Autumn Meeting, Sep. 2008, Japanese, 日本音響学会, 福岡, Domestic conference
    Poster presentation

  • GMMに基づく音声特徴量の時間変動を考慮した突発性雑音の除去
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2008 Autumn Meeting, Sep. 2008, Japanese, 日本音響学会, 福岡, Domestic conference
    Poster presentation

  • DP-Kernel PCAを用いた発話系列への意図ラベリングの検討
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    ASJ 2008 Autumn Meeting, Sep. 2008, Japanese, 日本音響学会, 福岡, Domestic conference
    Poster presentation

  • 話者正規化に基づく構音障害者の音声認識
    MATSUMASA Hironori, TAKIGUCHI Tetsuya, ARIKI Yasuo, LI Ichao, NAKABAYASHI Toshitaka
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • 尤度最大化基準を用いたエコー推定に基づく車室内音響エコーキャンセラの検討
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • 動的計画法に基づく文脈の変化を考慮したLSAの検討
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • 単一マイクロホンを用いた音響伝達特性の尤度判定による音源位置推定
    SUMIDA Yuji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • 音声特徴量抽出のための音素部分空間統合法の検討
    PARK Hyunshin, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • パラボラ反射板を用いたアクティブマイクロフォンによる音源方向推定
    TAKASHIMA Ryoichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • Wavelet係数の局所テクスチャ特徴量を用いたGraph Cutsによる画像セグメンテーション
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2008, Japanese, 電子情報通信学会, 北九州, Domestic conference
    Oral presentation

  • SVMとCARTの組み合わせによるAdaBoostを用いた音声区間検出
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • SIFTとGraph Cutsを用いた物体認識及びセグメンテーション
    SUGA Akira, FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2008, Japanese, 電子情報通信学会, 北九州, Domestic conference
    Oral presentation

  • PrefixSpanを用いた映像における人物の日常行動抽出
    TONARU Takuya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2008, Japanese, 電子情報通信学会, 北九州, Domestic conference
    Oral presentation

  • LSAに基づくOne-Class SVMを用いた音声認識仮説の検証
    MATSUMOTO Tomohiko, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • FBANKとGabor Waveletを用いたシステムへの問い合わせと雑談の判別
    YAMAGATA Tomoyuki, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2008年春季研究発表会, Mar. 2008, Japanese, 日本音響学会, 千葉, Domestic conference
    Poster presentation

  • 被災家屋内の人の検出と救助の為の3次元環境地図作成に関する考察
    INOUE Junichi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電気関係学会関西支部連合大会, Nov. 2007, Japanese, 電気学会, 神戸市, Domestic conference
    Oral presentation

  • 話者交替を考慮したシステムへの問い合わせと雑談の判別
    YAMAGATA Tomoyuki, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年秋季研究発表会, Sep. 2007, Japanese, 日本音響学会, 甲府市, Domestic conference
    Poster presentation

  • 音声区間検出を用いた音響エコーキャンセラにおける音声歪み低減の試み
    KOGA Kentaro, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年秋季研究発表会, Sep. 2007, Japanese, 日本音響学会, 甲府市, Domestic conference
    Poster presentation

  • フィッシャー重みマップに基づく音声特徴量のロバストネスに関する考察
    MUROI Takashi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年秋季研究発表会, Sep. 2007, Japanese, 日本音響学会, 甲府市, Domestic conference
    Poster presentation

  • PCA相関フィルタによる目領域の探索
    SUZUKI Akiko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第6回情報科学技術フォーラム, Sep. 2007, Japanese, 情報処理学会, 豊田市, Domestic conference
    Oral presentation

  • PCAを用いた音素ベクトルによる音声特徴量抽出の検討
    PARK Hyunshin, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年秋季研究発表会, Sep. 2007, Japanese, 日本音響学会, 甲府市, Domestic conference
    Poster presentation

  • 3次キュムラントバイスペクトラム特徴とReal AdaBoostによる音声区間検出
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年秋季研究発表会, Sep. 2007, Japanese, 日本音響学会, 甲府市, Domestic conference
    Poster presentation

  • 赤外線映像におけるドライバの方位判定
    INOUE Junichi, TAKIGUCHI Tetsuya, ARIKI Yasuo, KOGA Kentaroh
    電子情報通信学会総合大会, Mar. 2007, Japanese, 電子情報通信学会, 名古屋, Domestic conference
    Oral presentation

  • 自動映像生成のためのパーティクルフィルタによるボールの追跡
    YANO Kazuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2007, Japanese, 電子情報通信学会, 名古屋, Domestic conference
    Oral presentation

  • 構音障害者の音声認識の検討
    MATSUMASA Hironori, TAKIGUCHI Tetsuya, ARIKI Yasuo, RI Ichao, NAKABAYASHI Toshitaka
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • 固定カメラ映像からの音声情報を用いた映像コンテンツ生成
    ADACHI Jun, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2007, Japanese, 電子情報通信学会, 名古屋, Domestic conference
    Oral presentation

  • 顔特徴点移動量・点間距離変化量の組み合わせに基づく顔表情認識
    MIYAHARA Masanori, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2007, Japanese, 電子情報通信学会, 名古屋, Domestic conference
    Oral presentation

  • マルチ識別器を用いた花画像検索システムの構築
    FUKUDA Keita, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2007, Japanese, 電子情報通信学会, 名古屋, Domestic conference
    Oral presentation

  • マルチテンプレート型二次元CSPによる高速目領域探索
    SUZUKI Akiko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    電子情報通信学会総合大会, Mar. 2007, Japanese, 電子情報通信学会, 名古屋, Domestic conference
    Oral presentation

  • ブースティングとキーワードフィルタリングによるシステム要求検出
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • フィッシャー重みマップに基づく不特定話者音素認識の検討
    KATO Shunsuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • SVMを用いたシステムへの問い合わせと雑談の判別
    YAMAGATA Tomoyuki, SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • NetTv:NetNewsとテレビ放送のクロスプラットホームにおける音声検索
    TANAKA Katsuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • AdaBoostを用いた雑音の検出と識別
    MIYAKE Nobuyuki, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • 3次キュムラントのバイスペクトラムとPCAによる音声区間検出
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • 2ch マイクロフォン間の振幅補正を考慮した複素スペクトル平面上での雑音除去
    OHKUBO Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2007年春季研究発表会, Mar. 2007, Japanese, 日本音響学会, 東京, Domestic conference
    Oral presentation

  • 構音障害者の音声認識の検討
    松政 宏典, 滝口 哲也, 有木 康雄, 李 義昭, 中林 稔堯
    電子情報通信学会 第34回福祉情報工学研究会, Jan. 2007, Japanese, 立命館大学 びわこ・くさつキャンパス, Domestic conference
    Oral presentation

  • 二次元CSPによる目領域探索の高速化
    SUZUKI Akiko, TAKIGUCHI Tetsuya, ARIKI Yasuo
    第5回情報科学技術フォーラム, Sep. 2006, Japanese, 情報処理学会, 福岡, Domestic conference
    Oral presentation

  • 二次の射影法とスペクトルサブトラクションを用いた音響エコー抑圧
    OHKUBO Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2006年秋季研究発表会, Sep. 2006, Japanese, 日本音響学会, 金沢市, Domestic conference
    Oral presentation

  • 音響モデルを利用したシングルチャネルによる音源方向推定の検討
    SUMIDA Yuji, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2006年秋季研究発表会, Sep. 2006, Japanese, 日本音響学会, 金沢市, Domestic conference
    Oral presentation

  • Real Adaboostによる音声区間検出
    MATSUDA Hiroyoshi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会2006年秋季研究発表会, Sep. 2006, Japanese, 日本音響学会, 金沢市, Domestic conference
    Oral presentation

  • Phoneme Recognition by Local Features Using Pairwise Discriminant Fisher-Weight-Map
    KATO Shunsuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会春季研究発表会, 2006, Japanese, 日本音響学会, 日本, Domestic conference
    Poster presentation

  • A design and an evaluation of emotional speech database for in-car situation awareness
    TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会春季研究発表会, 2006, Japanese, 日本音響学会, 日本, Domestic conference
    Oral presentation

  • Studies on language model construction using topic-hmm based on PLSA
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会春季研究発表会, 2006, Japanese, 日本音響学会, 日本, Domestic conference
    Poster presentation

  • Speech Detection with Adaboost and Speech-Lip GMM
    MATSUDA Hiroyoshi, MASUDA Ken, TAKIGUCHI Tetsuya, ARIKI Yasuo, KAMIYA Masahiro
    電子情報通信学会総合大会, 2006, Japanese, 電子情報通信学会, 東京, Domestic conference
    Oral presentation

  • A study about noise reduction in real environment using a 2-channel microphone in a complex spectrum plane
    OHKUBO Toshiya, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会秋季研究発表会, 2005, Japanese, 日本音響学会, 日本, Domestic conference
    Poster presentation

  • Automatic production method with personal adaptation for soccer-game videos
    KUBOTA Shintaro, ARIKI Yasuo, TSUKADA Kiyoshi
    第4回情報科学技術フォーラムFIT, 2005, Japanese, 情報処理学会, 東京, Domestic conference
    Oral presentation

  • Studies on state dependent speech recognition based on phrases
    SAKO Atsushi, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会秋季研究発表会, 2005, Japanese, 日本音響学会, 日本, Domestic conference
    Poster presentation

  • Phoneme Recognition by Higher-Order Local Auto-Correlation Features Using Fisher-Weight-Map
    KATO Shunsuke, TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会秋季研究発表会, 2005, Japanese, 日本音響学会, 日本, Domestic conference
    Poster presentation

  • A Study on Reverberant Speech Recognition Using Kernel PCA
    TAKIGUCHI Tetsuya, ARIKI Yasuo
    日本音響学会秋季研究発表会, 2005, Japanese, 日本音響学会, 日本, Domestic conference
    Poster presentation

  • 野球中継におけるハイライトシーン・リアルタイム配信システムのためのPCショット判定領域自動設定法
    熊野 雅仁, 神崎 伸夫, 藤本 雅清, 有木 康雄, 塚田 清志, 濱口 伸, 清瀬 基
    FIT(情報科学技術フォーラム)2003, 2003, Japanese, 情報処理学会, 未記入, Domestic conference
    Oral presentation

  • 発音変形と音響的誤り傾向を考慮した話し言葉音声認識の検討
    緒方 淳, 有木 康雄
    日本音響学会,2003年春季研究発表会, 2003, Japanese, 日本音響学会, 未記入, Domestic conference
    Oral presentation

  • 指先追跡による人物ポインティングを用いた実時間情報検索~マルチモーダル対話型TVに向けて~
    佐古 淳, 熊野 雅仁, 藤本 雅清, 有木 康雄
    第2回情報科学技術フォーラム, 2003, Japanese, 情報処理学会, 未記入, Domestic conference
    Oral presentation

  • 雑音に頑健な音声認識のための時間領域SVD とGMM に基づく音声信号推定
    藤本 雅清, 有木 康雄
    日本音響学会,2003年春季研究発表会, 2003, Japanese, 日本音響学会, 未記入, Domestic conference
    Oral presentation

  • 改良型GMM Based Wiener Filterを用いた実走行車内音声の認識
    藤本 雅清, 有木 康雄
    第2回情報科学技術フォーラム, 2003, Japanese, 情報処理学会, 未記入, Domestic conference
    Oral presentation

  • 音声情報処理を用いたスポーツ実況中継におけるハイライトシーン検出
    金子 剛志, 重森 猛, 緒方 淳, 藤本 雅清, 有木 康雄, 塚田 清志, 濱口 伸, 清瀬 基
    日本音響学会,2003年春季研究発表会, 2003, Japanese, 日本音響学会, 未記入, Domestic conference
    Oral presentation

  • マルチメディア教材を目指した英語リスニング学習システムの開発
    山内 豊, 緒方 淳, 有木 康雄
    電子情報通信学会,総合大会, 2003, Japanese, 電子情報通信学会, 未記入, Domestic conference
    Oral presentation

  • GMMとEMアルゴリズムを用いた加法性雑音及び乗法性歪みに頑健な音声認識 - 実走行車内音声(AURORA3)データベースによる評価 -
    藤本 雅清, 有木 康雄
    日本音響学会,平成15年度秋季研究発表会, 2003, Japanese, 日本音響学会, 未記入, Domestic conference
    Oral presentation

  • 映像文法のためのカット先読み機構を備えた自動ダイジェスト生成システム
    NISHIZAWA Naohiro, KAMAHARA Junzo, SHUNTOH Takeshi, TSUKADA Kiyoshi, ARIKI Yasuo, UEHARA Kuniaki
    電子情報通信学会第13回データ工学ワークショップ, Mar. 2002, Japanese, 電子情報通信学会データ工学研究専門委員会, 倉敷国際ホテル, Domestic conference
    Oral presentation

■ Affiliated Academic Society
  • 言語処理学会
    Feb. 2022 - Present

  • IEEE
    Jan. 1987 - Present

  • 人工知能学会
    Apr. 1986 - Present

  • 日本音響学会
    Apr. 1980 - Present

  • 情報処理学会
    Oct. 1976 - Present

  • 電子情報通信学会
    Apr. 1976 - Present

  • 日本データベース学会
    Apr. 2002 - Mar. 2016

  • 映像情報メディア学会
    Apr. 2000 - Mar. 2016

  • 画像電子学会
    Apr. 1993 - Mar. 2016

■ Research Themes
  • 羽森 茂之
    科学研究費補助金/基盤研究(A), Apr. 2017 - Mar. 2021
    Competitive research funding

  • 有木 康雄
    学術研究助成基金助成金/基盤研究(C), Apr. 2017 - Mar. 2020, Principal investigator
    Competitive research funding

  • 羽森 茂之
    学術研究助成基金助成金/挑戦的研究(萌芽), Jun. 2017 - Mar. 2019
    Competitive research funding

  • 有木 康雄
    科学研究費一部基金/基盤研究(B), Apr. 2014 - Mar. 2017, Principal investigator
    Competitive research funding

  • 滝口 哲也
    科学研究費一部基金/基盤研究(B)特設, Apr. 2013 - Mar. 2017
    Competitive research funding

  • 嶋田 博行
    学術研究助成基金助成金/基盤研究(C), Apr. 2013 - Mar. 2016
    Competitive research funding

  • 有木 康雄
    科学研究費補助金/萌芽研究, Apr. 2012 - Mar. 2015, Principal investigator
    Competitive research funding

  • 頭脳循環「健常者・障がい者の意図認識によるユニバーサルコミュニケーションの研究」
    有木 康雄
    頭脳循環を活性化する若手研究者海外派遣プログラム, 2012, Principal investigator
    Competitive research funding

  • 頭脳循環「健常者・障がい者の意図認識によるユニバーサルコミュニケーションの研究」
    有木 康雄
    頭脳循環を活性化する若手研究者海外派遣プログラム, 2011, Principal investigator
    Competitive research funding

  • 学生支援GP「地域に根ざし人に学ぶ共生的人間力」
    有木 康雄
    新たな社会的ニーズに対応した学生支援プログラム, 2011, Principal investigator
    Competitive research funding

  • 頭脳循環「健常者・障がい者の意図認識によるユニバーサルコミュニケーションの研究」
    有木 康雄
    頭脳循環を活性化する若手研究者海外派遣プログラム, 2010, Principal investigator
    Competitive research funding

  • 学生支援GP「地域に根ざし人に学ぶ共生的人間力」
    有木 康雄
    2010, Principal investigator
    Competitive research funding

  • 学生支援GP「地域に根ざし人に学ぶ共生的人間力」
    有木 康雄
    2009, Principal investigator
    Competitive research funding

  • 学生支援GP「地域に根ざし人に学ぶ共生的人間力」
    有木 康雄
    2008, Principal investigator
    Competitive research funding

  • 有木 康雄
    科学研究費補助金/萌芽研究, 2006, Principal investigator
    Competitive research funding

  • 吉本 雅彦
    科学研究費補助金/基盤研究(A), 2006
    Competitive research funding

  • 音素・単語・フレーズの同時スポッティングによる対話音声の解析評価
    有木 康雄
    日本学術振興会, 科学研究費助成事業, 重点領域研究, 龍谷大学, 1995 - 1995
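    When humans listen to speech and understand its content, they are thought not to listen constantly at the sentence level but to select among a hierarchy of levels such as phonemes, words, phrases, and sentences. In spoken dialogue between humans and machines as well, because the grammar of dialogue is never complete, a method that analyzes only the parts that can be analyzed, rather than parsing completely at the sentence level, and then joins them together to fill in the meaning is considered both effective and feasible. Words, phrases, and partial sentences can be considered as analyzable units; this research restricts itself to words and phrases, and aims to extract (spot) them from continuous speech and to evaluate dialogue speech. Word spotting is a technique that extracts only known words from continuous speech while discriminating between known and unknown words. Research through fiscal year 1994 (Heisei 6) showed that discriminating known words from unknown words is equivalent to computing "the posterior probability of the event that a known word ends at a given time in the continuous speech". Because this posterior probability is computed only after the entire continuous speech has been input, real-time processing is difficult. This research studied a method that performs word spotting by computing the posterior probability of known words frame-synchronously, without waiting for the end of the utterance. Because this method can use forward likelihoods, it may make a real-time algorithm feasible. The proposed method was compared with the method studied through fiscal year 1994, and the analysis methods for dialogue speech were evaluated. It was also compared and evaluated against the representative conventional word spotting methods of AT&T, BBN, and NEC.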

  • Developmental Research of Drawing Image Processor for Translation, Automatic Transformation and Fair Printing
    SAKAI Toshiyuki, OKADA Yoshihiro, MINOH Michihiko, ARIKI Yasuo
    Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research, Grant-in-Aid for Developmental Scientific Research, Kyoto University, 1985 - 1986
    The purpose of this research is to develop a drawing image processor that can translate words or phrases contained in drawing images from one language to another, according to human specification of image size, line width, character font size, etc. Drawing images are first fed from an image scanner to a data flow processor. The data flow processor extracts features such as connected regions and vertical or horizontal lines. The features are then transmitted from the data flow processor to a host processor. The host processor identifies the character strings and extracts the connecting points. These structural data of the input drawing images are transformed according to the specification. Then, an output image is generated using the transformed structural data and the input image. This drawing image processor includes segmentation and labeling of constituents, as well as translation, transformation, editing and fair printing of drawing images. In 1985, we developed the first version of the drawing image processor. In 1986, we improved and strengthened the processor to a practical level in the following points. 1. Errors occur frequently in structural analysis when image quality decreases due to characters touching lines or discontinuities in lines. To solve this problem, a knowledge-based system was constructed by incorporating knowledge concerning "line", "character", "dotted line" and "arrow". 2. Graphics software was developed to increase the quality of output drawing images by using CAD techniques. Fair printing is obtained with this software. 3. Subpattern recognition of Chinese characters in low-quality drawings was studied.

■ Industrial Property Rights
  • 物体認識システム及び物体認識方法
    ARIKI YASUO, 小篠 裕子, 西村 仁志
    14/190,539, 26 Feb. 2014, 大学長, 9508019, 29 Nov. 2016
    Patent right

  • 物体分類装置、物体分類方法、物体認識装置及び物体認識方法
    ARIKI YASUO, 小篠 裕子, 堀 貴博, 中谷 良平
    特願2011-282103, 22 Dec. 2011, 大学長, 特許5828552, 30 Oct. 2015
    Patent right

  • 物体分類装置、物体分類方法、物体認識装置及び物体認識方法(アメリカ)
    ARIKI YASUO, 小篠 裕子, 堀 貴博, 中谷 良平
    13/724,220, 21 Dec. 2012, 大学長, US8873868, 28 Oct. 2014
    Patent right

  • 雑音検出装置および雑音検出方法
    TAKIGUCHI TETSUYA, ARIKI YASUO, 三宅 信之
    特願2006-336336, 13 Dec. 2006, 大学長, 特許4787979, 29 Jul. 2011
    Patent right
