SpecGram Vol CL, No 1
A Corpus-Linguistic Approach to Demography
Morten H. Albert
Rose College
This paper develops a corpus-linguistic approach to the demography
of North American cities. In a groundbreaking study, Chomsky (1957:17)
convincingly showed that it can be proven on linguistic grounds alone
that more people live in New York than in Dayton, Ohio. Unfortunately,
Chomsky did not go on to develop his corpus methods any further.
In the present study, all occurrences of the search string "I
live in" were extracted from the American National Corpus. The
word forms immediately to the right of the search string were extracted
from the concordance and ranked according to frequency. Table 1 shows
the ten items with the highest token frequencies:
Table 1: America's 10 largest cities
CITY | TOKENS
New York | 4223
Los Angeles | 3986
Harmony | 3669
Chicago | 3478
London | 3300
Houston | 2906
Philadelphia | 2495
Tokyo | 2335
Phoenix | 1399
The Past | 599
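The extraction and ranking procedure described above can be sketched roughly as follows. This is a minimal illustration, not the procedure actually run on the American National Corpus: the function name `right_collocates`, the toy sample sentences, and the heuristic of treating a run of capitalized words as one place name are all assumptions made for the example.

```python
import re
from collections import Counter


def right_collocates(texts, search="I live in", top_n=10):
    """Rank the items immediately to the right of `search` by token frequency.

    Assumption: a place name is a run of capitalized words, so that
    "New York" counts as one item. The paper does not specify this.
    """
    pattern = re.compile(
        re.escape(search) + r"\s+([A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*)"
    )
    counts = Counter()
    for text in texts:
        # Each match contributes one token for the captured place name.
        counts.update(m.group(1) for m in pattern.finditer(text))
    return counts.most_common(top_n)


# Hypothetical toy corpus, for illustration only.
sample = [
    "I live in New York. I live in Harmony.",
    "Well, I live in New York now.",
    "I live in The Past",
]
ranking = right_collocates(sample)
```

The capitalization heuristic is crude and will over-extend a match whenever a capitalized word directly follows the place name, so a real concordancer would need a more careful definition of "word form".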
The table reveals a number of surprising results. Besides previously
unknown cities, a number of cities previously believed to lie
elsewhere turn out to be located in North America. It will be up to
geographers to face the challenge and actually put these on the
American map.
To obtain demographic information about these cities, the token
numbers must be related to the numbers of inhabitants. To calculate
these figures, it is necessary to know the exact number of inhabitants
of one city. This number, divided by its token frequency, gives us the
token-inhabitant ratio that we need. Chomsky (personal communication)
estimates the population of Dayton, Ohio at 166,179. Dividing this figure
by Dayton's token frequency of 88 gives the corpus-linguistic demographical
constant (CLDC), which is exactly
1888.4
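For readers who wish to check the arithmetic, the calculation can be sketched as below. The Dayton figures and the token counts are taken from the text and from Table 1; the helper name `estimate_population` and the rounding to whole inhabitants are assumptions of this sketch, and the resulting estimates are only what the method implies, not figures reported in the paper.

```python
# Reference city: values as given in the text.
DAYTON_POPULATION = 166_179
DAYTON_TOKENS = 88

# Corpus-linguistic demographical constant: inhabitants per token.
CLDC = DAYTON_POPULATION / DAYTON_TOKENS  # rounds to 1888.4 at one decimal

# Token frequencies from Table 1.
TABLE_1 = {
    "New York": 4223,
    "Los Angeles": 3986,
    "Harmony": 3669,
    "Chicago": 3478,
    "London": 3300,
    "Houston": 2906,
    "Philadelphia": 2495,
    "Tokyo": 2335,
    "Phoenix": 1399,
    "The Past": 599,
}


def estimate_population(tokens, cldc=CLDC):
    """Scale a token frequency by the CLDC (hypothetical helper)."""
    return round(tokens * cldc)


estimates = {city: estimate_population(n) for city, n in TABLE_1.items()}

for city, population in estimates.items():
    print(f"{city}: {population}")
```

Because the estimate is a simple linear scaling, the population ranking is identical to the token ranking in Table 1.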
With this figure in mind, I believe that corpus-based armchair
demography has a number of advantages over more traditional methods of
statistical demography. Not only is it fast and economical, it also
points toward new phenomena for which traditional demography has no
account.