
ICAME CORPUS COLLECTION - INFORMATION
London-Lund Corpus, text, original version
The London-Lund Corpus contains samples of educated spoken British
English, in orthographic transcription with detailed prosodic
marking. It consists of 100 `texts', each of some 5,000 running
words. The text categories represented are spontaneous conversation,
spontaneous commentary, spontaneous and prepared oration, etc.
Example:
1 1 1 10 1 1 B 11 ((of ^Spanish)) . graph\ology# /
1 1 1 20 1 1 A 11 ^w=ell# . /
1 1 1 30 1 1 A 11 ((if)) did ^y/ou _set _that# - /
1 1 1 40 1 1 B 11 ^well !J\oe and _I# /
1 1 1 50 1 1 B 11 ^set it betw\een _us# /
1 1 1 60 1 1 B 11 ^actually !Joe 'set the :p\aper# /
1 1 1 70 1 1 B 20 and *((3 to 4 sylls))* /
1 1 1 80 1 1 A 11 *^w=ell# . /
London-Lund Corpus, WordCruncher version I
The text contains prosodic codes; these have been stripped from the
index/word list.
Example text:
|C1
|T1
|S0
<1 B> ((of ^Spanish)) . graph\ology#
<2 A> ^w=ell# .
<3 A> ((if)) did ^y/ou _set _that# -
<4 B> ^well !J\oe and _I#
<5 B> ^set it betw\een _us#
<6 B> ^actually !Joe 'set the :p\aper#
<7 B> and *((3 to 4 sylls))*
<8 A> *^w=ell# .
<9 A> "^m/\ay* I _ask#
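The example text above can be read line by line. The following is a minimal Python sketch, not part of the ICAME distribution. It assumes, from the sample alone, that each utterance line has the shape "<number speaker> text" and that lines beginning with "|" are reference codes that can be skipped. The names UNIT_RE and parse_units are hypothetical.

```python
import re

# Assumed line shape, inferred from the sample only: "<number speaker> text".
UNIT_RE = re.compile(r"^<(\d+)\s+([A-Za-z]+)>\s*(.*)$")


def parse_units(lines):
    """Yield (number, speaker, text) for each "<n X> ..." line."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("|"):
            # Reference-code lines such as |C1, |T1, |S0 are skipped here.
            continue
        match = UNIT_RE.match(line)
        if match:
            yield int(match.group(1)), match.group(2), match.group(3)


sample = [
    "|C1",
    "|T1",
    "|S0",
    r"<1 B> ((of ^Spanish)) . graph\ology#",
    "<2 A> ^w=ell# .",
]
units = list(parse_units(sample))
```

The prosodic codes are left untouched in the returned text, so the same reader could serve either WordCruncher version.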
Example index/word list:
1 ab
1 ababa
1 aback
2 abacus
2 abandon
1 abatement
1 abbas
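As an illustration of what stripping the prosodic codes involves, here is a minimal Python sketch that builds a "count word" list from a few utterances. It is not the procedure used to produce the WordCruncher index. It assumes that every non-letter character is a code, which is a simplification: genuine apostrophes would be lost, and transcriber comments such as ((3 to 4 sylls)) would need separate handling. The names strip_codes and word_list are hypothetical.

```python
import re
from collections import Counter


def strip_codes(text):
    """Remove every character that is not a letter or whitespace, then lowercase.

    Assumption: all non-letter characters are prosodic or transcription codes.
    """
    return re.sub(r"[^A-Za-z\s]", "", text).lower()


def word_list(utterances):
    """Return alphabetically sorted (word, count) pairs for the stripped text."""
    counts = Counter()
    for text in utterances:
        counts.update(strip_codes(text).split())
    return sorted(counts.items())


sample = [
    r"((of ^Spanish)) . graph\ology#",
    "^w=ell# .",
    r"^well !J\oe and _I#",
    r"^set it betw\een _us#",
]

# Print in the same "count word" layout as the example index.
for word, count in word_list(sample):
    print(count, word)
```

With the codes removed, "^w=ell#" and "^well" fall under the same entry, which is the practical effect of a stripped index.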
London-Lund Corpus, WordCruncher version II
The text contains prosodic codes, and these have been kept in the
index/word list.
Example text:
|C1
|T1
|S0
<1 B> ((of ^Spanish)) . graph\ology#
<2 A> ^w=ell# .
<3 A> ((if)) did ^y/ou _set _that# -
<4 B> ^well !J\oe and _I#
<5 B> ^set it betw\een _us#
<6 B> ^actually !Joe 'set the :p\aper#
<7 B> and *((3 to 4 sylls))*
<8 A> *^w=ell# .
<9 A> "^m/\ay* I _ask#
Example index/word list:
11441 a
2 a!!b\out
1 a!!b\ove
2 a!!g\ain
1 a!!g\o
2 a!!gr\ee
1 a!!gr\eeable
Conditions on the use of ICAME corpus material
The primary purposes of the International Computer Archive of
Modern English (ICAME) are:
- collecting and distributing information on (i)
English language material available for computer processing; and
(ii) linguistic research completed or in progress on this
material;
- compiling an archive of corpora to be located at the
University of Bergen, from where copies of the material can be
obtained at cost.
The following conditions govern the use of corpus material
distributed through ICAME:
- No copies of corpora, or parts of corpora, are to be
distributed under any circumstances without the written permission
of ICAME.
- Print-outs of corpora, or parts thereof, are to be used for
bona fide research of a non-profit nature. Holders of copies of
corpora may not reproduce any texts, or parts of texts, for any
purpose other than scholarly research without getting the written
permission of the individual copyright holders, as listed in the
manual or record sheet accompanying the corpus in question. (For
material where there is no known copyright holder, the person(s)
who originally prepared the material in computerized form will be
regarded as the copyright holder(s).)
- Commercial publishers and other non-academic organizations
wishing to make use of part or all of a corpus or a print-out
thereof must obtain permission from all the individual copyright
holders involved.
- The person(s) who originally prepared the material in
computerized form must be acknowledged in every subsequent use of
it.
Use of ICAME texts within an institution
Though ICAME texts may not be used or distributed outside the
institution making the order, they can be freely used within the
institution (department, faculty, university) for the purposes of
research and teaching. To prevent any use of the material for
commercial or profit-making purposes, it is advisable to limit
access to registered computer users within the institution. How
this is done may vary from one institution to another.
