Audience Dialogue

Sustainable English - a vocabulary

While writing our page on Global English we discovered that the best vocabulary size for students of English to aim at is around 15,000 words. With a vocabulary of that size, you should have a sustainable knowledge of English. That means that when you find a word you don't know, you can usually work out its meaning from the context of the sentence. In a document of average difficulty, there will be fewer than 2 words in every 100 that you do not know.

Andrei Sokolov, who is learning English, read that web page, and emailed us to ask "Where can I find a list of the 15,000 commonest words in English?" A good question. So we searched, but could not find such a list. There are several published word frequency lists with around 2,000 to 8,000 different words, but we couldn't find any with 15,000.

So we decided to compile a list of the 15,000 commonest words. The most complete source publicly available was the British National Corpus (known as the BNC): a count of 100 million words - equivalent to the full text of about a thousand books. So we downloaded a list of the commonest words in the BNC, and began to extract the commonest 15,000. This was much more difficult than we'd foreseen. It took a lot of work to convert the BNC format to something easily usable by learners of English. But now it's done, and here are two versions of the word list:

Word list in alphabetical order

Word list in descending frequency order

A warning, before you click on those: they are fairly big files (192K each). Also, they are text files, not HTML, so they will try to open in your text editor. If you have an old version of Notepad, Simple Text, etc., you may get an error message saying that the file is too big to open. In that case, save the file to your hard disk, then open it with a spreadsheet (such as Excel). Tell Excel (or whichever program you use) that this is a tab-delimited file, and it will open easily. If you don't have a spreadsheet, a word processor (even Wordpad - a Windows accessory program) will also work.
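If you prefer a short program to a spreadsheet, a tab-delimited file is also easy to read with a few lines of Python. The sketch below is only an illustration: it assumes that each line holds a word followed by a tab and a whole-number count, which may not match the exact column layout of our files, and the sample words and counts are made up. The function name read_word_list is ours, not part of any library.

```python
import csv
import io

def read_word_list(handle):
    """Read tab-delimited lines into (word, count) pairs.

    Assumed layout (check it against the real file): word, tab, count.
    Lines that do not fit that layout are skipped.
    """
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 2:
            continue  # blank or one-column line
        word = fields[0].strip()
        count = fields[1].strip()
        if not word or not count.isdigit():
            continue  # heading line or unexpected content
        rows.append((word, int(count)))
    return rows

# Made-up sample standing in for the downloaded file.
sample = "word\tcount\nthe\t300\nof\t200\nand\t100\n"
words = read_word_list(io.StringIO(sample))
print(words)
```

To read a saved copy of the list instead of the sample, open the file with open(path, encoding="utf-8", newline="") and pass the result to read_word_list.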

Using the spreadsheet, you can go through the list and separate it into three categories: words you know already, words that you almost know or want to know soon, and words you'll leave till later. On the frequency-ordered list, the words you know already will probably be near the top, the words you partly know will be in the middle, and the words you don't know will be towards the end.

A suggestion: look for common words that you don't know yet. By learning those words first, you'll quickly improve your ability to read texts in English. You could even set up three worksheets on the spreadsheet, and separate the words into those three groups, gradually moving them from one worksheet to another as you come to know them. But if you are having no trouble reading this page, you probably know most of the 15,000 words already.
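The three-group approach can also be sketched in Python, for readers who would rather not use worksheets. This is a minimal illustration, not a finished tool: the group names, the function names and the sample words and counts are all made up for the example, and you would supply your own sets of words you know and words you are learning.

```python
def sort_into_groups(words, known, learning):
    """Split (word, count) pairs into three groups, like three worksheets."""
    groups = {"known": [], "learning": [], "later": []}
    for word, count in words:
        if word in known:
            groups["known"].append((word, count))
        elif word in learning:
            groups["learning"].append((word, count))
        else:
            groups["later"].append((word, count))
    return groups

def commonest_unknown(groups, limit=10):
    """Return the most frequent words that are not yet in the known group."""
    unknown = groups["learning"] + groups["later"]
    unknown.sort(key=lambda pair: pair[1], reverse=True)
    return unknown[:limit]

# Made-up sample data: (word, count) pairs and one learner's progress.
words = [("the", 300), ("house", 120), ("sustain", 40),
         ("corpus", 15), ("lexeme", 5)]
known = {"the", "house"}
learning = {"sustain"}

groups = sort_into_groups(words, known, learning)
print(commonest_unknown(groups, limit=2))
```

As you learn a word, add it to the known set and run the sorting again; the word then moves from one group to another, just as it would move between worksheets.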

The words in this list (and plenty more) can also be found on the Explore Engrams database, which uses the BNC. However, we've heavily edited the BNC list, to make it more useful for learners of global English. The main edits are...

More details of how we converted the BNC extract to a word frequency list.