GrounderUK

Joined 24 January 2015

Hello. You can leave me a message on my talk page. If you have a comment about the contents of this, my user page, please feel free to make a careful change here, if that's easier for you. Please include a link to your talk page. If you expect a reply, please also leave a comment on my talk page. I am here to help (among other things).

My mother tongue is British English. I speak several European languages, so I might call myself un citoyen européen de l'anglophonie. My professional career has included writing and reviewing many hundreds of documents by or for people for whom English is a second or third language (for a few, perhaps a higher ordinal applies). I have been an en.wikipedia and en.wiktionary contributor on an occasional basis for many years.

Wikipedia:Babel
 en This user is a native speaker of the English language.

Keeping it simpler

Keeping it simple is hard. I just aim to make it simpler.

Word choice

Simpler words are usually more common than less simple alternatives, but not always. If the natural word to use is simple enough, using a more common word could be less simple. For example, "walk" is quite a common word (ranking around 2000) but "go", "on" and "foot" are all more common. Should I write "If you go on foot to work..." or "If you walk to work..."? What about "Take the dog for a go on foot!"? Actually, the verb "walk" [215] is slightly more common than the noun "foot" [214] but the form "walk" of the verb [66] is less common than the singular of the noun "foot" [73]. And if we look at the string "walk" [×244], we see that "foot" [×176] is a little less common and "{go} on foot" [×0.27] is almost rare. (The braces { and } around "go" just indicate that all forms of the verb are counted, so it includes "went on foot" and "going on foot", for example. The string "walk" does not include strings beginning "walk"; "{walk}" [×611] is just plain common: about the same as the word "above" [×609].)

Putting words together

A word is never going to make a good encyclopedia by itself. A list of words is hardly any better. A series of simple sentences might be good enough.

Putting sentences together

A simple series of simple sentences made up of simple words is not a simple encyclopedia. The sentences need to be organised. Sentences about the same thing belong in the same article. They may even belong in the same sentence. Putting two or more sentences into the same sentence is called conjunction. Simple conjunction uses words like "and" or "but", or punctuation like colons and semi-colons; the result is a less simple sentence. Less simple conjunction (of sentences) makes one sentence less important than another. This is called subordination. The result may be a main clause and a subordinate clause, which it is in this sentence...

Paragraphs, sub-sections and sections

The topic of an article will be the sub-topic of some other article, to which it should link (early on). The sub-topics of an article should be in the article itself, with a link to any article which has the sub-topic as its topic (typically a Main Article). Two uncommon words are "proximate" [×1] and "ultimate" [×60] [1]. If an article does not link to its proximate category by use of a Category page, there should be a phrase in the first sentence or two of the article's Lead Section. For example, "William Shakespeare was a person from Warwickshire in England who wrote plays and poems, mainly between 1590 and 1613." (There is no category for Warwickshire playwright/poets of the late 16th and early 17th centuries, perhaps because there would be only one notable person in that category.)

Note

Numbers in brackets show how often a word has been used in texts contained in the British National Corpus. I use an × to show relative frequency. "[×60]" means that a word has been used sixty times as often as a word followed by "[×1]". For more common words, I may simply use a number [493]. This is the number of times the word occurs (on average) for each million words in an earlier version of the corpus which has a Creative Commons licence. The word "number" has a frequency of 493, according to my source.[2] Because the corpus contains about one hundred million words, this means that it occurs there about 49,300 times ($\tfrac{493}{1{,}000{,}000} \times 100{,}000{,}000 = 49{,}300$).
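The arithmetic in this note can be sketched in a few lines of Python. This is only an illustration: the function names are my own invention, and the corpus size of one hundred million words is the rounded figure given above, not an exact count.

```python
# Rounded corpus size from the note above (an approximation, not an exact count).
CORPUS_SIZE = 100_000_000


def estimated_occurrences(per_million: float, corpus_size: int = CORPUS_SIZE) -> float:
    """Turn a per-million frequency into an approximate count for the whole corpus."""
    return per_million * corpus_size / 1_000_000


def relative_frequency(freq_a: float, freq_b: float) -> float:
    """Say how many times as often word A is used as word B."""
    return freq_a / freq_b


# The word "number" has a frequency of 493 per million words.
print(estimated_occurrences(493))  # 49300.0

# The string "walk" [×244] compared with "foot" [×176].
print(round(relative_frequency(244, 176), 2))  # 1.39
```

The second result says that the string "walk" is used a little under one and a half times as often as "foot".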

I have been using the cqpweb website to query the British National Corpus (XML Edition). Sadly, I cannot find any licensing or citation links there, but the creator does request a reference to his article in "published research" (so I hope my citation on this page will do for now).

BNC itself provides a bibliographic reference: "The British National Corpus, version 3 (BNC XML Edition). 2007. Distributed by Bodleian Libraries, University of Oxford, on behalf of the BNC Consortium. URL: http://www.natcorp.ox.ac.uk/". It is my belief, however, that its licensing relates to the text within the Corpus, not statistical analysis of those texts (or any subset of them). Put simply, I can tell you how often a word is used but I cannot quote an example from the Corpus [×7]. In case I am wrong: "Data cited herein have been extracted from the British National Corpus, distributed by the University of Oxford on behalf of the BNC Consortium. All rights in the texts cited are reserved." (You will note that no texts have, in fact, been cited.)

I am working on a spreadsheet version of Creative Commons frequency data [2] ("BNC1994") and I shall share a read-only link when it is finished. For now, you may find the draft web version v0.1 of some use. Please feel free to ask if you would like some help. Or leave comments on my talk page.

References

1. Hardie, A (2012). "CQPweb - combining power, flexibility and usability in a corpus analysis tool" (PDF). cqpweb.lancs.ac.uk/usr/index.php?ui=who_the_hell. International Journal of Corpus Linguistics 17 (3). pp. 380–409. Archived from the original (PDF) on 2020-06-16. Retrieved 2020-06-07.
2. "Companion Website for: Word Frequencies in Written and Spoken English: based on the British National Corpus". Archived from the original on 6 November 2019. Retrieved 2020-04-05.