ICAME CORPUS COLLECTION - INFORMATION

 

Kolhapur Corpus


A million-word corpus of printed Indian English texts. See the article by S.V. Shastri, ICAME Journal 12, pp. 15-26.

Example:
        **[txt. a01**] 
0010A01 **<<*3Politics of Job Reservations*0**>> $**[begin leader comment,
        begin 
0020A01 underscoring**] *3^The Bihar Government did not foresee or forestall 
0030A01 the complications that_ followed its decision to_ reserve jobs for
0031A01 backward 
0040A01 classes. ^The present violence in the State has raised the controversy 
0050A01 over the criterion for backwardness-- whether it should be caste or 
0060A01 economic conditions.*0 **[end underscoring, end leader comment**] 
0070A01 $^WHY has the Bihar Government*'s decision to_ reserve jobs for
        backward
0080A01 classes led to a violent outburst? ^It is not such an original idea 
0090A01 that it should have triggered demonstrations and riots or attracted
        all-India 
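
The line references in the example above can be separated from the running text mechanically. The following is a minimal Python sketch, assuming (from the sample alone, not from the corpus manual) that each reference consists of a four-digit line number followed by a three-character text identifier such as A01, and that a line without a reference continues the preceding line. The names LINE_REF and parse_lines are illustrative and are not part of any ICAME tool.

```python
import re

# Assumed layout, inferred only from the sample: four digits (line number),
# then a letter and two digits (text identifier), one whitespace character,
# then the text of the line.
LINE_REF = re.compile(r"^(\d{4})([A-Z]\d{2})\s(.*)$")


def parse_lines(raw_lines):
    """Return a list of (line_number, text_id, text) tuples.

    Lines without a reference prefix are appended to the previous record;
    such lines are ignored if no record has been seen yet.
    """
    records = []
    for raw in raw_lines:
        line = raw.rstrip("\n")
        match = LINE_REF.match(line)
        if match:
            number, text_id, text = match.groups()
            records.append([number, text_id, text.strip()])
        elif records and line.strip():
            # Continuation line: join it to the text of the last record.
            records[-1][2] = (records[-1][2] + " " + line.strip()).strip()
    return [tuple(record) for record in records]
```

Calling parse_lines on the lines of the example yields one record per reference, for instance a record with line number "0031" and text identifier "A01" whose text is "backward". The markup codes inside the text (such as *3, *0, ^ and $) are left untouched, since their meaning is defined in the corpus manual rather than here.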

 

Kolhapur Corpus, WordCruncher version


This is an indexed version of the Kolhapur Corpus. It can only be used with WordCruncher.

Example: 
|CCatg.A|EA01|P1  ***3Politics of Job Reservations*0** 
   |P2  *3The Bihar Government did not foresee or forestall the complications 
that_ followed its decision to_ reserve jobs for backward classes. |P3  The 
present violence in the State has raised the controversy over the criterion for 
backwardness -- whether it should be caste or economic conditions.*0 
   |P4  WHY has the Bihar Government`s decision to_ reserve jobs for backward 
classes led to a violent outburst?  |P5  It is not such an original idea that 
it should have triggered demonstrations and riots or attracted all-India
attention.
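
The reference codes in the example above can be picked out in the same spirit. The sketch below is Python and rests on assumptions drawn from the sample alone: that a vertical bar introduces a code, that a code is one capital letter followed by a value running up to the next bar or whitespace, and that a code stays in force until the same letter appears again. What the letters C, E and P stand for is not stated here, so the sketch treats them as opaque labels. It applies only to plain text of the kind shown in the example, not to the indexed files that WordCruncher reads, and the names CODE and split_units are illustrative.

```python
import re

# Assumed code shape: "|" + one capital letter + a value with no "|" or spaces.
CODE = re.compile(r"\|([A-Z])([^|\s]+)")


def split_units(text):
    """Return a list of (codes, unit_text) pairs.

    codes is a dict holding the most recent value seen for each code letter
    at the point where unit_text begins.
    """
    units = []
    state = {}
    last_end = 0
    for match in CODE.finditer(text):
        # Text between the previous code and this one, whitespace-normalised.
        chunk = " ".join(text[last_end:match.start()].split())
        if chunk:
            units.append((dict(state), chunk))
        state[match.group(1)] = match.group(2)
        last_end = match.end()
    tail = " ".join(text[last_end:].split())
    if tail:
        units.append((dict(state), tail))
    return units
```

Applied to the example, the first unit carries the codes C, E and P with the values "Catg.A", "A01" and "1", and each later unit differs only in its P value.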

 


Conditions on the use of ICAME corpus material

The primary purposes of the International Computer Archive of Modern English (ICAME) are:

  1. collecting and distributing information on (i) English language material available for computer processing; and (ii) linguistic research completed or in progress on this material;
  2. compiling an archive of corpora to be located at the University of Bergen, from where copies of the material can be obtained at cost.

The following conditions govern the use of corpus material distributed through ICAME:

  1. No copies of corpora, or parts of corpora, are to be distributed under any circumstances without the written permission of ICAME.
  2. Print-outs of corpora, or parts thereof, are to be used for bona fide research of a non-profit nature. Holders of copies of corpora may not reproduce any texts, or parts of texts, for any purpose other than scholarly research without getting the written permission of the individual copyright holders, as listed in the manual or record sheet accompanying the corpus in question. (For material where there is no known copyright holder, the person(s) who originally prepared the material in computerized form will be regarded as the copyright holder(s).)
  3. Commercial publishers and other non-academic organizations wishing to make use of part or all of a corpus or a print-out thereof must obtain permission from all the individual copyright holders involved.
  4. The person(s) who originally prepared the material in computerized form must be acknowledged in every subsequent use of it.

Use of ICAME texts within an institution

Although ICAME texts cannot be used or distributed outside the institution that placed the order, they can be used freely within that institution (department, faculty, university) for research and teaching. To prevent commercial or profit-making use of the material, it is advisable to limit access to registered computer users within the institution. How this is done may vary from one institution to another.