Assalamualaikum w.b.t
Today we learned about Corpus Linguistics. Such an interesting term, isn’t it? Corpus Linguistics is the study of language as expressed in samples of ‘real world’ text, known as corpora. It is an approach to deriving a set of abstract rules by which a natural language is governed, or by which it relates to another language. The work was originally done by hand, but corpora are now largely derived by an automated process.
The word ‘corpus’ comes from the Latin word for ‘body’. It may be used to refer to any text in written or spoken form. In modern linguistics, the term refers to a large collection of texts, held in machine-readable form, that represents a sample of a particular variety or use of language. The scope of corpus linguistics covers the possible words, structures and uses in a language, how probable they are to occur, and the description and explanation of the nature, structure and use of language in areas such as language acquisition, variation and change.
Several types of corpora are available nowadays. They may contain written or spoken (transcribed) language, modern or old texts, texts from one language or several languages, and whole books, newspapers, journals, speeches or extracts of varying length. Corpus Linguistics is now seen as the study of linguistic phenomena through large collections of machine-readable texts, or corpora. These are used in a number of research areas, ranging from the descriptive study of the syntax of a language to language learning. The availability of corpora that are similar in structure is a valuable resource for researchers interested in comparing different language varieties. Interestingly, there is also a Quranic Corpus. We Muslims can surely benefit from this insightful resource by studying it closely.
As we are learning Computer Assisted Language Learning, the role of computers in Corpus Linguistics is of course essential. Computers are used to store huge amounts of text, retrieve them quickly, retrieve words, phrases or whole texts in context, sort linguistic items, increase reliability in searching, counting and sorting linguistic items, and provide accurate probabilities of occurrence for specific linguistic items.
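To make the counting and probability part more concrete, here is a minimal sketch in Python. It is not the software we used in class. The sample sentence and the function names `tokenize` and `relative_frequencies` are made up for illustration, and the tokenizer is a simple assumption that only handles plain English letters.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(text):
    """Return (frequencies, counts): each word's share of all tokens, and its raw count."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    # Relative frequency = occurrences of the word / total number of tokens.
    freqs = {word: n / total for word, n in counts.items()}
    return freqs, counts


# Hypothetical mini-corpus, just for demonstration.
sample = "The cat sat on the mat. The dog sat too."
freqs, counts = relative_frequencies(sample)

# Sort words from most to least frequent, as a corpus tool would.
for word, n in counts.most_common(3):
    print(word, n, round(freqs[word], 2))
```

A real corpus would be read from files and would hold millions of words, but the idea is the same: count each item, then divide by the total to estimate how likely it is to occur.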
Some of the corpus-related research areas are Computational Linguistics, Historical Linguistics, Lexicography, Machine Translation, Natural Language Processing (NLP), Social Psychology, Sociolinguistics, Stylistics, and many more interesting branches of study.
Later, we learned about something called a concordancer, which is an example of software used for corpus linguistics. Madam Rozina showed us a few examples of concordance programs and gave some simple demonstrations of how to use them. Using a concordancer, we can do amazing things such as finding out how many times the word ‘Muhammad’ or ‘Islam’ appears in the Quran. We were so thrilled to use the software in class and search for our own names!
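For anyone curious about what a concordancer does behind the scenes, the basic idea can be sketched in a few lines of Python. This is only a rough illustration, not how the programs shown in class actually work. The `concordance` function, its `width` parameter and the sample text are all hypothetical.

```python
import re


def concordance(text, keyword, width=3):
    """Return (left context, match, right context) for each occurrence of keyword.

    Matching ignores case, and `width` is the number of words of context on each side.
    """
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, just for demonstration.
sample = "Language is rule governed. A corpus shows how language is really used."
hits = concordance(sample, "language", width=2)

print(len(hits), "occurrences")
for left, word, right in hits:
    # Line up the keyword in the middle, the way concordance lines are usually shown.
    print(f"{left:>20} [{word}] {right}")
```

The number of hits is the word count, and each printed line shows the keyword in its context, which is what makes a concordance more useful than a plain word count.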