<Notes> A Method for Constructing Parallel Corpus by Using Over-the-Top Media Service

SEO, Mincheol

https://doi.org/10.14989/281542

Files in this item:

File	Description	Size	Format
kulr41_69.pdf		1.71 MB	Adobe PDF

Title:	<研究ノート>OTTサービスを利用したパラレルコーパスの構築方法
Other titles:	<Notes>A Method for Constructing Parallel Corpus by Using Over-the-Top Media Service
Author:	徐, 敏徹
Alternative author name:	SEO, Mincheol
Keywords:	Netflix; subtitles; copyright; quasi-spoken language; language resource
Issue date:	31-Dec-2022
Publisher:	Department of Linguistics, Graduate School of Letters, Kyoto University
Journal title:	京都大学言語学研究 (Kyoto University Linguistic Research)
Volume:	41
Start page:	69
End page:	91
Abstract:	This paper introduces a method for constructing Japanese-Korean and Korean-Japanese parallel corpora from the subtitles of an over-the-top (OTT) media service, and highlights the points that need attention in doing so. Building a parallel corpus from an OTT service requires choosing a service that provides subtitles of good quality and in sufficient quantity. In this paper, Netflix was selected as the OTT service, and Japanese and Korean (translated) subtitles were collected simultaneously using Language Reactor. Subtitles collected from OTT services may have problems such as repetition, correction, and relay translation, so caution is required when using them for linguistic research.
Rights:	© Department of Linguistics, Graduate School of Letters, Kyoto University
DOI:	10.14989/281542
URI:	http://hdl.handle.net/2433/281542
Appears in collections:	No. 41