Audience Dialogue

Know Your Audience: chapter 16
Content analysis - part 4 of 4

7. Software for content analysis

When all the preparation for content analysis has been done, the counting is usually the quickest part - especially if all the data is in a computer file and software does the counting.
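To show how little work the counting itself can be, here is a minimal sketch in Python. It assumes every unit of content has already been given one code from your coding frame; the code labels and the data are invented for illustration.

```python
from collections import Counter

# Hypothetical coded units: each unit of content has already been
# assigned one code from the coding frame (labels are invented).
coded_units = ["politics", "sport", "politics", "weather", "sport", "politics"]

counts = Counter(coded_units)
total = sum(counts.values())

# Print each code with its count and its share of all units,
# most common code first.
for code, n in counts.most_common():
    print(f"{code:10s} {n:3d} {100 * n / total:5.1f}%")
```

The slow part is everything that comes before this step: defining the units, building the coding frame, and doing the coding.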

Software is an important tool for content analysis, but this page has mentioned it only briefly. Because software and links to it are constantly changing, we have a separate page on content analysis software. For the type of content analysis that you can use to analyse long interviews, our page on qualitative software also has some useful references, covering software such as Nud*ist and Atlas TI.

8. Coming to conclusions

An important part of any content analysis is to study the content that is not there: what was not said. This sounds impossible, doesn’t it? How can you study content that’s not there? Actually, it’s not hard, because there’s always an implicit comparison. The content you found in the analysis can be compared with the content that you (or the audience) expected - or it can be compared with another set of content.

It’s when you compare two corpora (plural of corpus) that content analysis becomes most useful. This can be done either by doing two content analyses at once (using different corpora but the same principles) or comparing your own content analysis with one that somebody else has done. If the same coding frame is used for both, it makes the comparison much simpler.
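As a rough sketch of such a comparison, the Python below applies one coding frame to two small corpora and prints the share of each code side by side. The corpora, the code labels and the function name `code_shares` are all invented for illustration.

```python
from collections import Counter

def code_shares(coded_units, coding_frame):
    """Return the percentage of units falling under each code in the frame."""
    counts = Counter(coded_units)
    total = len(coded_units)
    return {code: 100 * counts[code] / total for code in coding_frame}

# The same coding frame is applied to both corpora (labels are invented).
coding_frame = ["politics", "sport", "weather"]
corpus_a = ["politics", "politics", "sport", "weather"]  # e.g. one station
corpus_b = ["sport", "sport", "sport", "politics"]       # e.g. another station

shares_a = code_shares(corpus_a, coding_frame)
shares_b = code_shares(corpus_b, coding_frame)

# A code with a share of 0.0 is content that is "not there" in that corpus.
for code in coding_frame:
    diff = shares_a[code] - shares_b[code]
    print(f"{code:10s} A {shares_a[code]:5.1f}%  B {shares_b[code]:5.1f}%  diff {diff:+6.1f}")
```

Because both corpora are reported against the same list of codes, a code that never appears in one of them still shows up in the comparison, with a share of zero.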

Comparisons can be:

...and so on. Making a comparison between two matched sets of data will often produce very interesting results.

There’s no need to limit comparisons to two corpora: any number of content analyses can be compared, as long as they used the same principles. With two or three corpora, results are usually very clear, and with 10 or more, a few corpora usually stand out as different - but with about 4 to 9, the comparisons can become rather messy.

These comparisons are usually made using cross-tabulation (cross-tabs) and statistical significance testing. A common problem is that unless the sample sizes were huge, there are often too few entries in each cell to make statistical sense of the data. Though it’s always possible to group similar categories together - and easy, if you were using a hierarchical coding frame - the sharpness of comparison can be lost, and the results, though statistically significant, may not have much practical meaning.
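The sparse-cell problem can be seen in a small sketch. The Python below computes a chi-square statistic for a cross-tab by hand and counts the cells whose expected count is under 5, a common rule of thumb for "too few entries" (that threshold is an assumption, not something this page prescribes). The counts are invented, and the grouping of two codes under one parent is a hypothetical hierarchical coding frame.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for a cross-tab.

    `table` is a list of rows, one per corpus, each listing counts per code.
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof, expected

# Invented counts: rows are two corpora, columns are four detailed codes.
detailed = [[30, 4, 3, 23],
            [20, 2, 6, 32]]

# Hypothetical hierarchy: the two middle codes share a parent code,
# so their columns are added together.
grouped = [[row[0], row[1] + row[2], row[3]] for row in detailed]

for name, table in [("detailed", detailed), ("grouped", grouped)]:
    stat, dof, expected = chi_square(table)
    sparse = sum(e < 5 for row in expected for e in row)
    print(f"{name}: chi-square={stat:.2f}, df={dof}, "
          f"cells with expected count < 5: {sparse}")
```

Grouping removes the sparse cells, but it also removes the distinction between the two codes that were merged, which is the loss of sharpness described above. To judge significance, compare the statistic with a chi-square table for the given degrees of freedom, or use a statistics package.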

Reporting on a content analysis

Ten people could analyse the same set of content, and arrive at completely different conclusions. For example, the same set of TV programs analysed to find violent content could also be analysed to study the way in which conversations are presented on TV. The focus of the study controls what you notice about the content. Therefore any content analysis report should begin with a clear explanation of the focus – what was included, what was excluded, and why.

No matter how many variables you use in a content analysis, you can’t guarantee that you haven’t omitted an important aspect of the data. Statistical summaries are often highly unsatisfying, especially if you’re not testing a simple hypothesis. If readers don’t understand how you’ve summarized the content, they will be unlikely to accept your conclusions.

Therefore it’s important to select and present some key units of content in the report. This can be done when you describe the coding frame. For each code, find and reproduce a highly typical (and real) example. This is easier when units are short (e.g. radio news items). When units are long (such as whole TV programs) you will need to summarize them, which means omitting aspects that are important to some people. If you used judges (who will have read through the content already), ask them to select typical examples. If the judges achieve a consensus on this, it will add credibility to your findings.

Another approach is to cite critical cases: examples that only just fitted one code rather than another - together with arguments on why you chose the code you did. This helps readers understand your analysis, as well as making the data more vivid.

Explain your coding principles

When you read a content analysis report, the process can seem forbiddingly objective - something like "37.3% of the references to Company A were highly favourable, and only 6.4% highly unfavourable." However, this seeming objectivity can actually be very subjective. For example, who decides whether a reference is "highly favourable" or just "favourable"? Different people may have different views on this - all equally valid. More difficult still, can you count a reference to Company A if it’s not mentioned by name, but by some degree of association? For example, "All petrochemical factories on the Port River are serious polluters." Does that count as a reference to Company A if its factory is on that river? And what if many of the audience may not know that fact?

The point is that precise-looking percentages are created from many small assumptions. A high-quality report will list all the assumptions made, give the reasons why those assumptions were made, and also discuss how varying some assumptions would affect the results.
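One way to show how an assumption affects the results is to recompute the same percentage under each version of that assumption. The Python sketch below does this for the question of whether implied references to Company A should be counted. The data, the field names and the function `share` are invented for illustration.

```python
# Invented coded references. "explicit" is True when Company A is named,
# False when it is only implied (e.g. by the location of its factory).
references = [
    {"explicit": True,  "rating": "highly favourable"},
    {"explicit": True,  "rating": "favourable"},
    {"explicit": True,  "rating": "highly unfavourable"},
    {"explicit": False, "rating": "highly unfavourable"},
    {"explicit": False, "rating": "highly unfavourable"},
]

def share(refs, rating, include_implicit):
    """Percentage of counted references that carry the given rating."""
    counted = [r for r in refs if r["explicit"] or include_implicit]
    matching = [r for r in counted if r["rating"] == rating]
    return 100 * len(matching) / len(counted)

# Report the same figure under both versions of the assumption.
for include_implicit in (False, True):
    pct = share(references, "highly unfavourable", include_implicit)
    print(f"implicit references counted: {include_implicit!s:5}  "
          f"highly unfavourable: {pct:.1f}%")
```

With these invented figures, the share of highly unfavourable references changes from a third to well over half, depending only on one coding decision. Reporting both figures lets readers see how much of the result rests on that decision.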


Content analysis can produce quite trivial results, particularly when the units are small. The findings of content analysis become much more meaningful when units are large (e.g. whole TV or radio programs) and when those findings can be compared with audience research findings. Unfortunately, the larger the units, the more work the content analysis requires.

When used by itself, content analysis can seem a shallow technique. But it becomes much more useful when it’s done together with audience research. You will be in a position to make statements along the lines of "the audience want this, but they’re getting that." When backed by strong data, such statements are very difficult to disagree with.
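A statement of that kind can be backed by a simple side-by-side table. The Python sketch below sets the share of content under each code against the share of the audience wanting that topic, then lists the gaps. All figures are invented, and it assumes the audience research used the same categories as the coding frame and that each respondent named a single topic.

```python
# Invented percentages for illustration only.
# content_share: share of broadcast units per code (from the content analysis).
# audience_share: share of respondents wanting each topic (from audience research).
content_share = {"politics": 50.0, "sport": 30.0, "local news": 20.0}
audience_share = {"politics": 25.0, "sport": 30.0, "local news": 45.0}

# Positive gap: the audience wants more of this topic than it is getting.
gaps = {code: audience_share[code] - content_share[code] for code in content_share}

# Largest shortfall first.
for code, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{code:12s} wanted {audience_share[code]:5.1f}%  "
          f"supplied {content_share[code]:5.1f}%  gap {gap:+6.1f}")
```

The two columns measure different things (people and units of content), so the gaps are best read as a rough indication of where supply and demand diverge rather than as exact quantities.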