RCPCE Profession-specific Corpora

  • Hong Kong Corpus of Spoken English (HKCSE) (1 million words)
    The HKCSE was made available online in June 2008. It comprises four equal-sized sub-corpora: academic, business, conversation and public spoken discourses.

  • Hong Kong Corpus of Surveying and Construction Engineering (HKCSCE) (5.7 million words)
    The HKCSCE was compiled in 2009 with the help of colleagues from the Department of Land Surveying and Geo-Informatics at PolyU, professional bodies, the Hong Kong SAR Government, and related private organisations. Like the other profession-specific corpora, the HKCSCE was compiled to enhance the understanding of real-world language use in this industry in Hong Kong, in particular the study of the patterns of language use and their meanings.

  • Hong Kong Engineering Corpus (HKEC) (9.2 million words)
    The Hong Kong Institution of Engineers (HKIE) is both an advisor to the project and one of its data providers. The project began in April 2007 and was the first large-scale project to collect texts that are representative of the English of the engineering sector in Hong Kong. The HKEC was compiled to enhance the understanding of real-world language use in the engineering industry in Hong Kong, in particular the study of the patterns of language use and their meanings. As of December 2009, the HKEC contained about 9.2 million words, drawn from texts provided by over 60 different companies, government departments, and engineering-related associations and institutions. The HKEC, which was made publicly available in early 2009, is an educational and research resource for engineering professionals, academics and students alike.

  • Hong Kong Financial Services Corpus (HKFSC) (7.3 million words)
    The HKFSC project began in May 2006. It aims to investigate the key words and phraseology of distinctive financial services fields and text types, and to explore the implications of the findings for professionals in the financial services field in particular, and for learners and teachers of business English in general. The web-based HKFSC offers basic search functions through a user-friendly interface. AIA and other professional bodies in the financial services sector collaborate on the project as advisors. In July 2007, the 6.7-million-word HKFSC was released. The HKFSC consists primarily of written texts, for instance the 2005 annual reports of nineteen major constituent companies of the Hong Kong Hang Seng Index, twelve IPO prospectuses from 2005, fund reports, insurance product descriptions, and speeches made between 1997 and 2006 by Securities and Futures Commission officials and HKSAR Financial Secretaries. In early 2008, additional search functions were incorporated to enable more sophisticated uses of the HKFSC. In 2009, more texts were added to the corpus, and it is now complete.

  • Hong Kong Budget Speeches Corpus 1997-2017 (279,209 words)
    The RCPCE collected every Budget Speech between 1997 and 2017 to make a unique corpus of this particular genre.

  • Hong Kong Policy Address Speeches Corpus 1997-2017 (285,025 words)
    The RCPCE collected every Policy Address Speech between 1997 and 2017 to make a unique corpus of this particular genre.

  • Corpus of Research Articles 2007 (CRA2007) (5.6 million words)
    The corpus enables users to investigate the patterns of language use in research articles from 39 disciplines.

  • Hong Kong Corpus of Corporate Governance Reports (HKCCGR) (1 million words)
    The one-million-word Hong Kong Corpus of Corporate Governance Reports (HKCCGR) consists of the corporate governance reports of 217 companies listed on the Hong Kong Stock Exchange. These companies were carefully chosen to reflect the weighting of the four main sectors listed on the exchange (i.e. finance, utilities, property, and commercial and industrial). The moves that make up the corporate governance reports were identified, and 25 sub-corpora were compiled on the basis of these moves.