<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model</title> <!-- common meta tags --> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta http-equiv="X-UA-Compatible" content="ie=edge"> <meta name="title" content="Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model"> <meta name="description" content="This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. This technique plays an essential role in downstream applications in Japanese natural language processing systems because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. 
For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres"/> <meta name="keywords" content="Morphological analysis, Hiragana texts, Bi-LSTM CRF model, Fine-tuning, Domain adaptation"/> <!-- end common meta tags --> <!-- Dublin Core(DC) meta tags --> <meta name="dc.title" content="Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model"> <meta name="citation_authors" content="Jun Izutsu"> <meta name="citation_authors" content="Kanako Komiya"> <meta name="dc.type" content="Article"> <meta name="dc.source" content="Computer Science & Information Technology (CS & IT) Vol. 11, No.23"> <meta name="dc.date" content="2021/12/24"> <meta name="dc.identifier" content="10.5121/csit.2021.112310"> <meta name="dc.publisher" content="AIRCC Publishing Corporation"> <meta name="dc.rights" content="http://creativecommons.org/licenses/by/3.0/"> <meta name="dc.format" content="application/pdf"> <meta name="dc.language" content="en"> <meta name="dc.description" content="This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. This technique plays an essential role in downstream applications in Japanese natural language processing systems because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. 
For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres."/> <meta name="dc.subject" content="Morphological analysis"> <meta name="dc.subject" content="Hiragana texts"> <meta name="dc.subject" content="Bi-LSTM CRF model"> <meta name="dc.subject" content="Fine-tuning"> <meta name="dc.subject" content="Domain adaptation"> <!-- End Dublin Core(DC) meta tags --> <!-- Prism meta tags --> <meta name="prism.publicationName" content="Computer Science & Information Technology (CS & IT)"> <meta name="prism.publicationDate" content="2021/12/24"> <meta name="prism.volume" content="11"> <meta name="prism.number" content="23"> <meta name="prism.section" content="Article"> <meta name="prism.startingPage" content="123"> <!-- End Prism meta tags --> <!-- citation meta tags --> <meta name="citation_journal_title" content="Computer Science & Information Technology (CS & IT)"> <meta name="citation_publisher" content="AIRCC Publishing Corporation"> <meta name="citation_authors" content="Jun Izutsu and Kanako Komiya"> <meta name="citation_title" content="Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model"> <meta name="citation_online_date" content="2021/12/24"> <meta name="citation_issue" content="11"> <meta name="citation_firstpage" content="123"> <meta name="citation_authors" content="Jun Izutsu"> <meta name="citation_authors" content="Kanako Komiya"> <meta name="citation_doi" content="10.5121/csit.2021.112310"> <meta name="citation_abstract_html_url" content="https://aircconline.com/csit/abstract/v11n23/csit112310.html"> <meta name="citation_pdf_url" content="https://aircconline.com/csit/papers/vol11/csit112310.pdf"> <!-- end citation meta tags --> <!-- Og meta tags --> <meta property="og:site_name" content="AIRCC" /> <meta property="og:type" content="article" /> <meta property="og:url" 
content="https://aircconline.com/csit/abstract/v11n23/csit112310.html"> <meta property="og:title" content="Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model"> <meta property="og:description" content="This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. This technique plays an essential role in downstream applications in Japanese natural language processing systems because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres."/> <!-- end og meta tags --> <!-- INDEX meta tags --> <meta name="google-site-verification" content="t8rHIcM8EfjIqfQzQ0IdYIiA9JxDD0uUZAitBCzsOIw" /> <meta name="yandex-verification" content="e3d2d5a32c7241f4" /> <!-- end INDEX meta tags --> <link rel="icon" type="image/ico" href="../img/ico.ico"/> <link rel="stylesheet" type="text/css" href="../main1.css" media="screen" /> <style type="text/css"> a{ color:white; text-decoration:none; line-height:20px; } ul li a{ font-weight:bold; color:#000; list-style:none; text-decoration:none; size:10px;} .imagess { height:90px; text-align:left; margin:0px 5px 2px 8px; float:right; border:none; } #left p { font-family: CALIBRI; font-size: 16px; margin-left: 20px; font-weight: 500; } .right { margin-right: 20px; } #button{ float: left; font-size: 14px; 
margin-left: 10px; height: 28px; width: auto; background-color: #1e86c6; } </style> </head> <body> <div class="font"> <div id="wap"> <div id="page"> <div id="top"> <form action="https://airccj.org/csecfp/library/Search.php" method="get" target="_blank" > <table width="100%" cellspacing="0" cellpadding="0" > <tr class="search_input"> <td width="665" align="right"> </td> <td width="236" > <input name="title" type="text" value="Enter the paper title" class="search_textbox" onclick="if(this.value=='Enter the paper title'){this.value=''}" onblur="if(this.value==''){this.value='Enter the paper title'}" /> </td> <td width="59"> <input type="image" src="../img/go.gif" /> </td> </tr> <tr> <td colspan="3" valign="top"><img src="../img/top1.gif" alt="Academy & Industry Research Collaboration Center (AIRCC)" /></td> </tr> </table> </form> </div> <div id="font-face"> <div id="menu"> <a href="http://airccse.org">Home</a> <a href="http://airccse.org/journal.html">Journals</a> <a href="http://airccse.org/ethics.html">Ethics</a> <a href="http://airccse.org/conference.html">Conferences</a> <a href="http://airccse.org/past.html">Past Events</a> <a href="http://airccse.org/b.html">Submission</a> </div> <div id="content"> <div id="left"> <h2 class="lighter"><font size="2">Volume 11, Number 23, December 2021</font></h2> <h4 style="text-align:center;height:auto;"><a>Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model</a></h4> <h3> Authors</h3> <p class="#left right" style="text-align:">Jun Izutsu<sup>1</sup> and Kanako Komiya<sup>2</sup>, <sup>1</sup>Ibaraki University, Japan, <sup>2</sup>Tokyo University of Agriculture and Technology, Japan</p> <h3> Abstract</h3> <p class="#left right" style="text-align:justify">This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. 
Morphological analysis is a technique that divides text into words and assigns information such as parts of speech to each word. It plays an essential role in downstream applications of Japanese natural language processing because Japanese text has no delimiters between words. Hiragana is a Japanese phonographic script used in texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because Hiragana-only text offers fewer cues for word segmentation. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning a model trained on ordinary Japanese text and examined how training data drawn from texts of various genres influences the results. </p> <h3> Keywords</h3> <p class="#left right" style="text-align:justify">Morphological analysis, Hiragana texts, Bi-LSTM CRF model, Fine-tuning, Domain adaptation.</p><br> <button type="button" id="button"><a target="blank" href="/csit/papers/vol11/csit112310.pdf">Full Text</a></button> <button type="button" id="button"><a href="http://airccse.org/csit/V11N23.html">Volume 11, Number 23</a></button> <br><br><br><br><br> </div> <div id="right"> <div class="menu_right"> <ul> <li id="id"><a href="http://airccse.org/editorial.html">Editorial Board</a></li> <li><a href="http://airccse.org/arch.html">Archives</a></li> <li><a href="http://airccse.org/indexing.html">Indexing</a></li> <li><a href="http://airccse.org/faq.html" target="_blank">FAQ</a></li> </ul> </div> <div class="clear_left"></div> <br> </div> <div class="clear"></div> <div id="footer"> <table width="100%" > <tr> <td width="46%" class="F_menu"><a href="http://airccse.org/subscription.html">Subscription</a> <a href="http://airccse.org/membership.html">Membership</a> <a href="http://airccse.org/cscp.html">AIRCC CSCP</a> <a href="http://airccse.org/acontact.html">Contact Us</a> </td> <td width="54%" 
align="right"><a href="http://airccse.org/index.php"><img src="/csit/abstract/img/logo.gif" alt="" width="21" height="24" /></a><a href="http://www.facebook.com/AIRCCSE"><img src="/csit/abstract/img/facebook.jpeg" alt="" width="21" height="24" /></a><a href="https://twitter.com/AIRCCFP"><img src="/csit/abstract/img/twitter.jpeg" alt="" width="21" height="24" /></a><a href="http://cfptech.wordpress.com/"><img src="/csit/abstract/img/index1.jpeg" alt="" width="21" height="24" /></a></td> </tr> <tr><td height="25" colspan="2"> <p align="center">All Rights Reserved ® AIRCC</p> </td></tr> </table> </div> </div> </div> </div> </div> </div> </body> </html>