The Web as a Parallel Corpus

    Posted on 2014-4-27 14:28
    Philip Resnik (University of Maryland), Noah A. Smith (Johns Hopkins University)
    Parallel corpora have become an essential resource for work in multilingual natural language processing. In this article, we report on our work using the STRAND system for mining parallel text on the World Wide Web, first reviewing the original algorithm and results and then presenting a set of significant enhancements. These enhancements include the use of supervised learning based on structural features of documents to improve classification performance, a new content-based measure of translational equivalence, and adaptation of the system to take advantage of the Internet Archive for mining parallel text from the Web on a large scale. Finally, the value of these techniques is demonstrated in the construction of a significant parallel corpus for a low-density language pair.
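
The abstract does not say which structural features are used, so the following is only a rough sketch of the general idea, not the STRAND algorithm itself. It assumes that two pages which are translations of each other tend to share the same HTML layout, reduces each page to a sequence of tag tokens, and scores how well the two sequences match. The names `TagLinearizer`, `linearize` and `structural_similarity` are made up for this illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagLinearizer(HTMLParser):
    """Flatten an HTML page into a sequence of structural tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tokens.append(f"</{tag}>")

    def handle_data(self, data):
        # Text content is language dependent, so keep only a placeholder.
        if data.strip():
            self.tokens.append("TEXT")


def linearize(html):
    """Return the list of structural tokens for one HTML string."""
    parser = TagLinearizer()
    parser.feed(html)
    parser.close()
    return parser.tokens


def structural_similarity(html_a, html_b):
    """Score in [0, 1]; 1.0 means the two token sequences are identical."""
    matcher = SequenceMatcher(None, linearize(html_a), linearize(html_b),
                              autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    english = "<html><body><h1>Hello</h1><p>Good morning</p></body></html>"
    french = "<html><body><h1>Bonjour</h1><p>Bonjour a tous</p></body></html>"
    unrelated = "<html><body><ul><li>a</li><li>b</li></ul></body></html>"
    print(structural_similarity(english, french))
    print(structural_similarity(english, unrelated))
```

A score like this could serve as one input feature for a supervised classifier that decides whether a candidate page pair is a translation pair. Which features and which classifier work well in practice is something the full article would have to answer.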

    Reposted from: http://www.nlpir.org/?action-viewnews-itemid-143