This website contains resources and information related to the acWaC (academic Web-as-Corpus) project. The project aims to provide a pool of corpus resources for studying institutional academic English, i.e. the wide range of texts that higher education institutions produce in English to communicate with their stakeholders (students, staff, alumni, etc.) and that are likely to feature prominently on their websites. Our long-term goal is to produce language resources (e.g. lexical databases, writing aids) to assist professional writers and translators working with this specialized variety of English.

Within the project, we created acWaC-EU, a corpus of English-language web pages crawled from the websites of European universities. It is a large (nearly 90 million words) monolingual comparable corpus that supports comparison of native and lingua franca (ELF, English as a Lingua Franca) varieties of English in the institutional academic domain. The acWaC-EU page contains information on the composition of the corpus and the scripts used to create it.

For further information on acWaC-EU and/or the acWaC project, please visit the publications section.

Reference

Ferraresi, A. and Bernardini, S. (forthcoming). The academic Web-as-Corpus. In Evert, S., Stemle, E., and Rayson, P. (eds). Proceedings of the 8th Web as Corpus workshop (WAC-8), Lancaster, UK.