Pruning False Unknown Words to Improve Chinese Word Segmentation
http://hdl.handle.net/2065/567
Name / File | License | Action |
---|---|---|
oral-11.pdf (490.4 kB) | | |
Item type | Conference Paper(1) | |||||
---|---|---|---|---|---|---|
Release date | 2008-04-28 | |||||
Title | ||||||
Language | en | |||||
Title | Pruning False Unknown Words to Improve Chinese Word Segmentation | |||||
Language | ||||||
Language | eng | |||||
Resource type | ||||||
Resource type identifier | http://purl.org/coar/resource_type/c_5794 | |||||
Resource type | conference paper | |||||
Authors | Goh, Chooi-Ling; 浅原, 正幸; 松本, 裕治 | |||||
Author alternative names | Asahara, Masayuki; Matsumoto, Yuji | |||||
Abstract | ||||||
Description type | Abstract | |||||
Description | During unknown word detection in Chinese word segmentation, many of the detected word candidates are invalid. These false unknown word candidates degrade the overall segmentation accuracy because they affect the segmentation accuracy of known words. Therefore, we propose eliminating as many invalid word candidates as possible through a pruning process. Our experiments show that cutting down the invalid unknown word candidates improves the segmentation accuracy of known words and hence the overall segmentation accuracy. | |||||
Bibliographic information | p. 139-150, issue date 2005-11-16 | |||||
Subject | ||||||
Subject scheme | NDC | |||||
Subject | 801.06 | |||||
Subject | ||||||
Subject scheme | LCSH | |||||
Subject | Computational linguistics--Congresses | |||||
Publisher | ||||||
Publisher | Logico-Linguistic Society of Japan | |||||
Series | ||||||
Series | Oral Session | |||||
Related URI | ||||||
Related URI | http://www.decode.waseda.ac.jp/PACLIC18/ | |||||
Data type | ||||||
Description type | Other | |||||
Description | text | |||||
HDL URI | ||||||
http://hdl.handle.net/2065/567 |