Content analysis is a method for summarizing any form of content by counting various aspects of the content. This enables a more objective evaluation than comparing content based on the impressions of a listener. For example, an impressionistic summary of a TV program is not content analysis. Nor is a book review: it’s an evaluation.
Content analysis, though it often analyses written words, is a quantitative method. The results of content analysis are numbers and percentages. After doing a content analysis, you might make a statement such as "27% of programs on Radio Lukole in April 2003 mentioned at least one aspect of peacebuilding, compared with only 3% of the programs in 2001."
Though it may seem crude and simplistic to make such statements, the counting serves two purposes:
Also, the fact that programs have been counted implies that somebody has listened to every program on the station: content analysis is always thorough.
As you’ll see below, content analysis can actually be a lot more subtle than the above example. There’s plenty of scope for human judgement in assigning relevance to content.
The content that is analysed can be in any form to begin with, but is often converted into written words before it is analysed. The original source can be printed publications, broadcast programs, other recordings, the internet, or live situations. All this content is something that people have created. You can’t do content analysis of (say) the weather - but if somebody writes a report predicting the weather, you can do a content analysis of that.
All this is content...
Newspaper items, magazine articles, books, catalogues
Web pages, advertisements, billboards, posters, graffiti
Radio programs, news items, TV programs
Photos, drawings, videos, films, music
Speeches, interviews, plays, concerts
Gestures, rooms, products in shops
That’s one way of looking at content. Another way is to divide content into two types: media content and audience content. Just about everything in the above list is media content. But when you get feedback from audience members, that’s audience content. Audience content can be either private or public. Private audience content includes:
Public audience content comes from communication between all the audience members, such as:
The analysis of private audience content, in verbal form, is covered in chapter 12, on depth interviews. Therefore this chapter will focus mainly on public audience content and on media content.
If you’re also doing audience research, the main reason for doing content analysis as well is to be able to make links between causes (e.g. program content) and effects (e.g. audience size). If you do an audience survey, but you don’t systematically relate the survey findings to your program output, you won’t know why your audience might have increased or decreased. You might guess, when the survey results first appear, but a thorough content analysis is much better than a guess.
For a media organization, the main purpose of content analysis is to evaluate and improve its programming. All media organizations are trying to achieve some purpose. For commercial media, the purpose is simple: to make money, and survive. For public and community-owned media, there are usually several purposes, sometimes conflicting - but each individual program tends to have one main purpose.
As a simple commercial example, the purpose of an advertisement is to promote the use of the product it is advertising: first by increasing awareness, then by increasing sales. The purpose of a documentary on AIDS in southern Africa might be to increase awareness of ways of preventing AIDS, and in the end to reduce the level of AIDS. Often, as this example has shown, there is not a single purpose, but a chain of them, with each step leading to the next.
Using audience research to evaluate the effects (or outcome) of a media project is the second half of the process. The first half is to measure the causes (or inputs) - and that is done by content analysis. For example, in the 1970s a lot of research was done on the effects of broadcasting violence on TV. If people saw crimes committed on TV, did that make them more likely to commit crimes? In this case, the effects were crime rates, often measured from police statistics. The problem was to link the effects to the possible causes. The question was not simply "does seeing crime on TV make people commit crimes?" but "What types of crime on TV (if any) make what types of people (if any) commit crimes, in what situations?" UNESCO in the 1970s produced a report summarizing about 3,000 separate studies of this issue - and most of those studies used some form of content analysis.
When you study causes and effects, as in the above example, you can see how content analysis differs from audience research:
The entire process, linking causes to effects, is known as evaluation.
Content analysis has six main stages, each described by one section of this chapter:
Section 6 gives some examples, illustrating the wide range of uses for content analysis.
Content is huge: the world contains a near-infinite amount of content. It’s rare that an area of interest has so little content that you can analyse it all. Even when you do analyse the whole of something (e.g. all the pictures in one issue of a magazine) you will usually want to generalize those findings to a broader context (such as all the issues of that magazine). In other words, you are hoping that the issue you selected is a representative sample. Like audience research, content analysis involves sampling, as explained in chapter 2. But with content analysis, you’re sampling content, not people. The body of information you draw the sample from is often called a corpus – Latin for body.
Unless you want to look at very fine distinctions, you don’t need a huge sample. The same principles apply for content analysis as for surveys: most of the time, a sample between 100 and 2000 items is enough - as long as it is fully representative. For radio and TV, the easiest way to sample is by time. How would you sample programs during a month? With 30 days, you might decide on a sample of 120 items. Programs vary greatly in length, so use quarter-hours as the sampling unit instead of whole programs. That’s 4 quarter-hours each day for a month. Obviously you need to vary the time periods to make sure that all times of day are covered. An easy way to do this, assuming you’re on air from 6 am to midnight, is to make a sampling plan like this:
and so on. After 18 days you’ll have covered all quarter-hours. After 30 days you’ll have covered most of them twice. If that might introduce some bias, you could keep sampling for another 6 days, to finish two cycles. Alternatively, you could use an 18-minute period instead of 15, and still finish in 30 days.
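The rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the starting pattern (four samples spaced evenly through the day, each shifted one period later on the following day) is one possible plan, not necessarily the one this chapter originally showed, and the function name and parameters are made up for the example. It also assumes the number of periods in the day divides evenly by the number of samples per day.

```python
from datetime import datetime, timedelta

def sampling_plan(days=30, start_hour=6, hours_on_air=18, period_min=15, per_day=4):
    """Return {day: [start times]} rotating through every period of the broadcast day."""
    slots = hours_on_air * 60 // period_min       # 72 quarter-hours between 6 am and midnight
    stride = slots // per_day                     # gap (in slots) between one day's samples
    base = datetime(2003, 4, 1, start_hour, 0)    # placeholder date, only the time matters
    plan = {}
    for day in range(1, days + 1):
        offset = (day - 1) % stride               # shift one period later each day
        starts = []
        for k in range(per_day):
            slot = offset + k * stride
            start = base + timedelta(minutes=slot * period_min)
            starts.append(start.strftime("%H:%M"))
        plan[day] = starts
    return plan

plan = sampling_plan()
print(plan[1])   # ['06:00', '10:30', '15:00', '19:30']
print(plan[2])   # ['06:15', '10:45', '15:15', '19:45']
```

With the default 15-minute periods, the cycle repeats after 18 days. Calling `sampling_plan(period_min=18)` gives 60 periods a day, so a full cycle takes 15 days and a 30-day month covers exactly two cycles.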
With print media, the same principles apply, but it doesn’t make sense to base the sample on time of day. Instead, use page and column numbers. Actually, it’s a lot easier with print media, because you don’t need to organize somebody (or program a computer) to record the on-air program at regular intervals.
When you set out to do content analysis, the first thing to acknowledge is that it’s impossible to be comprehensive. No matter how hard you try, you can’t analyse content in all possible ways. I’ll demonstrate, with an example. Let’s say that you manage a radio station. It’s on air for 18 hours a day, and no one person seems to know exactly what is broadcast on each program. So you decide that during April all programs will be taped. Then you will listen to the tapes and do a content analysis.
First problem: 18 hours a day, for 30 days, is 540 hours. If you work a 40-hour week, it will take almost 14 weeks to play the tapes back. But that’s only listening - without pausing for content analysis! So instead, you get the tapes transcribed. Most people speak about 8,000 words per hour. Thus your transcript could run to more than 4 million words – about 40 books the size of this one.
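The arithmetic behind these figures is easy to check. The short sketch below simply multiplies out the numbers given in the text. The constant names are invented for the example, and the word count is an upper limit because it assumes every hour on air is filled with speech.

```python
HOURS_PER_DAY = 18        # on-air hours per day, from the example
DAYS = 30                 # days in April
WORK_WEEK_HOURS = 40      # one person's working week
WORDS_PER_HOUR = 8_000    # the rough speaking rate quoted in the text

total_hours = HOURS_PER_DAY * DAYS                 # hours of tape to play back
weeks_to_listen = total_hours / WORK_WEEK_HOURS    # working weeks of listening only
max_words = total_hours * WORDS_PER_HOUR           # upper limit, if every hour were speech

print(total_hours, weeks_to_listen, max_words)
```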
Now the content analysis can begin! You make a detailed analysis: hundreds of pages of tables and summaries. When you’ve finished (a year later?) somebody asks you a simple question, such as "What percentage of the time are women’s voices heard on this station?"
If you haven’t anticipated that question, you’ll have to go back to the transcript and laboriously calculate the answer. You find that the sex of the speaker hasn’t always been recorded. You make an estimate (only a few days’ work, if you’re lucky) then you’re asked a follow-up question, such as "How much of that time is speech, and how much is singing?"
Oops! The transcriber didn’t bother to include the lyrics of the songs broadcast. Now you’ll have to go back and listen to all those tapes again!
This example shows the importance of knowing what you’re looking for when you do content analysis. Forget about trying to cover everything, because (a) there’s too much content around, and (b) it can be analysed in an infinite number of ways. Without having a clear focus, you can waste a lot of time analysing unimportant aspects of content. The focus needs to be clearly defined before you begin work.
An example of a focus is: "We’ll do a content analysis of a sample of programs (including networked programs, and songs) broadcast on Radio Lukole in April 2003, with a focus on describing conflict and the way it is managed."
To be able to count content, your corpus needs to be divided into a number of units, roughly similar in size. There’s no limit to the number of units in a corpus, but in general the larger the unit, the fewer units you need. If the units you are counting vary greatly in length, and if you are looking for the presence of some theme, a long unit will have a greater chance of including that theme than will a short unit. If the longest units are many times the size of the shortest, you may need to change the unit - perhaps "per thousand words" instead of "per web page." Similarly, if you are analysing interviews that vary greatly in length, a time-based unit may be more appropriate than "per interview."
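To show what a "per thousand words" unit looks like in practice, here is a minimal Python sketch. It assumes the theme can be spotted with a simple keyword list, which is a big simplification: real coding usually needs human judgement. The function name, the keyword set and the sample text are all hypothetical.

```python
import re

def mentions_per_thousand_words(text, keywords):
    """Rate of keyword mentions per 1,000 words, so long and short units can be compared."""
    words = re.findall(r"[a-z']+", text.lower())   # crude split into lower-case words
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in keywords)
    return 1000 * hits / len(words)

# A made-up 100-word unit containing two mentions of the theme
sample = "peace " * 2 + "talk " * 98
print(mentions_per_thousand_words(sample, {"peace", "conflict"}))   # 20.0
```

Because the result is a rate, a long web page and a short one can be compared directly, which a simple "does this page mention the theme?" count doesn't allow.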
Depending on the size of your basic unit, you’ll need to take a different approach to coding. The main options are (from shortest to longest):
The longer the unit, the more difficult and subjective is the work of coding it as a whole. Consider breaking a document into smaller units, and coding each small unit separately. However, if it’s necessary to be able to link different parts of the document together, this won’t make sense.
When you are analysing audience content (not media content) the unit will normally be based on the data collection format and/or the software used to store the responses. The types of audience content most commonly produced from research data are
In any of these cases, the unit can be either a person or a comment. Survey analysis is always based on individuals, but content analysis is usually based on comments. Most of the time this difference doesn’t affect the findings, but if some people make far more comments than others, and these two groups give different kinds of comments, it will be best to use individuals as the unit.
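The difference between the two units is easiest to see with a small made-up example. In the sketch below, each comment is stored as a (person, code) pair. The data, the codes and the function names are all hypothetical. One talkative respondent makes six comments about music, while three others each make one comment about news.

```python
from collections import Counter

def share_by_comment(comments):
    """Percentage of all comments that carry each code."""
    counts = Counter(code for _, code in comments)
    return {code: 100 * n / len(comments) for code, n in counts.items()}

def share_by_person(comments):
    """Percentage of people who mentioned each code at least once."""
    people = {person for person, _ in comments}
    mentioned = Counter(code for _, code in set(comments))   # one count per person per code
    return {code: 100 * n / len(people) for code, n in mentioned.items()}

comments = [("A", "music")] * 6 + [("B", "news"), ("C", "news"), ("D", "news")]
print(share_by_comment(comments))   # music is about 67% of comments
print(share_by_person(comments))    # but only 25% of people mentioned music
```

Counted by comment, music looks like the dominant theme. Counted by person, news does. Keeping the person identifier with every comment lets you produce both tables.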
Usually the corpus is a set of the basic units: for example, a set of 13 episodes in a TV series, an 85-message discussion on an email listserv over several months, 500 respondents’ answers to a survey question - and so on. What varies is (a) the number of units in the corpus, and (b) the size of the units.
Differences in these figures will require different approaches to content analysis. If you are studying the use of language, focusing on the usage of new words, you will need to use a large corpus - a million words or so - but the size of the unit you are studying is tiny: just a single word. The word frequencies can easily be compared using software such as WordSmith.
At the other extreme, a literary scholar might be studying the influence of one writer on another. The unit might be a whole play, but the number of units might be quite small - perhaps the 38 plays of Shakespeare compared with the 7 plays of Marlowe. If the unit is a whole play, and the focus is the literary style, a lot of human judgement will be needed. Though the total size of the corpus could be much the same as with the previous example, far more work is needed when the content unit is large - because detailed judgements will have to be made to summarize each play.
Often, some units overlap other units. For example, if you ask viewers of a TV program what they like most about it, some will give one response, and others may give a dozen. Is your unit the person or the response? (Our experience: it’s best to keep track of both types of unit, because you won’t know till later whether using one type of unit will produce a different pattern of responses.)