Title from disc label.
"LDC2013T23."
Data type: Text.
Data sources: Broadcast conversation, broadcast news.
Application: Automatic content extraction, content-based retrieval, machine translation, tagging.
Authors: Xuansong Li, Stephen Grimes, Stephanie Strassel.
Summary:
"... contains 179,842 tokens of word aligned Chinese and English parallel text enriched with linguistic tags. This material was used as training data in the DARPA GALE (Global Autonomous Language Exploitation) program."--LDC online catalogue.
This resource is supported by the Institute of Museum and Library Services under the provisions of the Library Services and Technology Act, as administered by the State Library of Iowa.