Posted on July 22, 2011 in Public Health, Statistical Methods

# Basic Statistics for Epidemiology: Prevalence vs Incidence

Author Sara Muller, University of Keele (based on material from Health Knowledge)

Epidemiology vs Clinical Medicine

There are lots of definitions of epidemiology, but the one below is fairly comprehensive:

“…the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control of health problems”

In short, epidemiology is the who, what, when, why, how and where of disease. It doesn’t involve experimentation or clinical trials, so it is mainly of interest to public health: it is simply about what happens.

A concept most people are probably familiar with is ‘clinical medicine’, where interest is in the individual patient and you want to see large and clinically relevant changes in that patient. In a clinical trial, you might be interested in these relatively large changes in a select group of people. For example: will this painkiller reduce the level of pain in this patient so that they are comfortable?

Epidemiology differs from clinical medicine because the unit of interest is the population, not the individual. In epidemiology, we are interested in small changes that have an effect at the population level. For example, a slight rise in your cholesterol leads to only a very slight increase in your risk of dying from coronary heart disease (CHD), but if everyone in a population increased their cholesterol level very slightly, there would be a big jump in the total number of CHD deaths in the population.

Factors Affecting Health

There are many factors that can affect health.

At the biological level, our genetic heritage may make us susceptible to specific conditions such as hypertension, sickle cell anaemia, cancer and haemophilia.

Then there is how we live – how much we exercise, what we eat, what we drink and what we might smoke. That some societies seem to live longer than others probably reflects differences in the interaction of these factors.

Then there is the environment. Hippocrates, ‘the father of medicine’, was one of the first to identify the environment as a determinant of health. He noted the impact of the seasons and the influence of where people lived. It was a prophetic suggestion, later borne out in modern society. Along with the innovations of the Industrial Revolution came urbanisation and, in particular, problems of squalid housing, poor sanitation and polluted water supplies. Concern at the combined effect of these factors gave rise to the first Public Health Act of 1848.

More recently, the shrinking of the earth through advances in transport has made it easier than ever for individuals – and diseases – to move freely across the globe. Whilst economic progress has many advantages, it can have both positive and negative impacts on health.

We often have to think about these factors more in epidemiology and public health than in other areas of medicine because we do not control for them. In a randomised controlled trial (RCT), by contrast, we randomise out many of their effects.

Quantifying disease in populations

Absolute counts of the incidence of disease are important, but relying on counts alone makes comparisons difficult: the bigger a population, the larger the number of cases is likely to be. One way around this is to consider the number of cases in relation to the population as a whole. Several approaches are possible, and we shall review some of the more frequently encountered methods.

Ratio (also known as odds)

Calculating the ratio is the foundation for the calculation of many other measures.

$\text{ratio} = a/b$

where $a$ and $b$ are counts of two separate groups (e.g. the number of males and the number of females). The same person is not counted in both groups, and the size of either group, or both, can change.

It allows the number of people with a disease in one population to be compared with the number of people with the disease in another population. It doesn’t have any units.

Let’s take an example of the number of deaths in males and females. We have the total number of deaths in each gender.

2005 all cause mortality

Females                      269,368

Males                          243,324

Divide the number of deaths in females by the number of deaths in males to get the female-to-male death ratio.

$269,368/243,324 = 1.107$
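As a minimal sketch, the same ratio can be reproduced in Python. The function name `ratio` is an illustrative choice, not something defined in the source; the counts are the 2005 all-cause mortality figures quoted above.

```python
def ratio(a: float, b: float) -> float:
    """Ratio of two separate counts (no one is counted in both groups)."""
    if b == 0:
        raise ValueError("the second count must be non-zero")
    return a / b


# 2005 all-cause mortality counts from the text
female_deaths = 269_368
male_deaths = 243_324

female_to_male = ratio(female_deaths, male_deaths)
print(f"Female-to-male death ratio: {female_to_male:.3f}")
```

A value above 1 means more deaths were recorded in females than in males; it says nothing yet about the size of either population.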

Proportion

Proportion is the fraction of a population who have a characteristic of interest.  Those who are included in the numerator are also included in the denominator.  As the numerator is a subgroup of the denominator the value is always between 0 and 1.  Values can be expressed as a percentage by multiplying by 100.  Again this is a dimensionless quantity, so it has no units.

Examples of proportion include:

• Proportion of men in a population
• Proportion of people in a sample with diabetes

Using the same example as before, let’s consider the proportion of deaths that are in females, rather than the ratio of female to male deaths.

The proportion of deaths that were in females:

$269368/512692 = 0.525$

We can alternatively present this as 52.5%.
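The key difference from a ratio is that the numerator is part of the denominator. A short sketch under that assumption, with the hypothetical helper `proportion` checking that the subgroup cannot exceed the total:

```python
def proportion(subgroup: int, total: int) -> float:
    """Fraction of `total` that falls in `subgroup` (always between 0 and 1)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= subgroup <= total:
        raise ValueError("subgroup must be part of the total")
    return subgroup / total


female_deaths = 269_368
male_deaths = 243_324
all_deaths = female_deaths + male_deaths  # females are included in the denominator

p_female = proportion(female_deaths, all_deaths)
print(f"Proportion of deaths in females: {p_female:.3f} ({p_female:.1%})")
```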

Rate

Rates provide a common time frame and unit of population. They allow a direct comparison of the frequency of disease. The rate calculation has two requirements: (1) a time frame; and (2) a unit of population.

$\dfrac{\text{Frequency of observed event in a given time period}}{\text{Total number in whom event might occur in that time period}}$

The denominator includes all those who are eligible to appear in the numerator i.e., those at risk of the observed event.

The time frame is important, especially if your event of interest is death: everyone will die at some point, so the time period has to be defined to get a sensible answer.

An example is based around a group of 742 people with knee pain. They were followed up for 3 years after completing a survey. During the follow-up, the group as a whole consulted their GPs 202 times for knee surgery.

Rate of consultation for knee surgery was:

$202/(742 \times 3) = 202/2226 = 0.091$ consultations per person per year.
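The denominator here is person-time: 742 people each followed for 3 years. The sketch below makes that explicit. It assumes, as the example does, that every person contributed the full 3 years of follow-up; the function name is illustrative.

```python
def rate_per_person_year(events: int, people: int, years: float) -> float:
    """Events divided by total person-years of follow-up."""
    person_years = people * years  # assumes everyone is followed for the full period
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return events / person_years


consultation_rate = rate_per_person_year(events=202, people=742, years=3)
print(f"{consultation_rate:.3f} consultations per person per year")
```

Writing the multiplication as a separate step avoids the precedence trap in `202/742*3`, which would divide first and then multiply.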

Prevalence Rate

The prevalence rate is a measure of the proportion of a population affected by a specific condition in a specified time period. Prevalence itself is the number of people in a population who have the disease of interest in a particular time period. On its own this count isn’t very useful: the larger the population, the more people are likely to have the disease. So a prevalence rate (often referred to simply as prevalence) is what we really want to know.

$\dfrac{\text{Number of cases of disease in given time period}}{\text{Total number in population in that time period}}$

Be careful to specify your total population properly in terms of socio-demographic and environmental factors. In looking at the prevalence of cervical cancer, you wouldn’t want to include men in the denominator as the prevalence estimate would be far too low.

Prevalences are usually given as one of the following ‘types’:

1. Point prevalence: relates to prevalence at a specific point in time – Did you have an asthma attack on Monday?
2. Period prevalence: relates to prevalence over a defined period of time – Did you have an asthma attack in January?

An example is based on the number of responders to a survey reporting knee pain in the past month.

All responders to a survey aged 50+                                                   7,878

Responders reporting knee pain in the past month                           2,874

The one-month period prevalence rate of knee pain in the responders to the survey:

$2874/7878 = 0.365$

Alternatively, this can be presented as 36.5% or 365 per 1,000 responders.
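The three presentations (proportion, percentage, per 1,000) are the same number on different scales. A small sketch using the counts from the calculation above (2,874 of 7,878); `prevalence_rate` and its `per` parameter are illustrative names, not from the source.

```python
def prevalence_rate(cases: int, population: int, per: int = 1) -> float:
    """Cases divided by the population, scaled to `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= cases <= population:
        raise ValueError("cases must be part of the population")
    return cases / population * per


knee_pain_cases = 2874   # responders reporting knee pain in the past month
responders = 7878        # all responders aged 50+

prev = prevalence_rate(knee_pain_cases, responders)
prev_per_1000 = prevalence_rate(knee_pain_cases, responders, per=1000)

print(f"Prevalence: {prev:.3f} = {prev:.1%} = {prev_per_1000:.0f} per 1,000 responders")
```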

Incidence Rate

Incidence is the number of new cases of a disease in a population.  As when considering prevalence, because of different population sizes, it is usual to consider a rate.

$\dfrac{\text{Number of new cases of disease in given time period}}{\text{Total number in population at risk in that time period}}$

For the calculation:

• The time period should be specified, as the number of incident cases can be made arbitrarily large or small depending on the length of the time period being considered.
• ‘At Risk’ population.  People who already have the disease at the start of the time period are not included in the denominator – if they already have the disease, they are not at risk of developing it.

Depending on your question, you might want to consider first-ever instances of disease, or just new cases during your time period. For example, if you consider flu, you might want any flu where that episode started during your 1-month time period of interest, or you might want any flu that started during your 1-month time period and where that was the first time that person had had flu. This decision will change who is a member of your ‘at risk’ population. Other ways of defining your ‘at risk’ group might be age, immunisation status or gender (don’t look at cervical cancer in men!).

Responders to BHPS with new CVD in 2007                                       264

Responders to BHPS with no new CVD in 2007                               13,097

The one-year incidence of CVD in responders to this survey is:

$264/(13097 + 264) = 264/13361 = 0.020$ per year

Alternatively, this can be expressed as 2.0% per year or 20 per 1,000 responders per year.
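The calculation can be sketched in Python as follows, using the figures from the calculation above (264 new cases, 13,097 with no new CVD). The sketch assumes both groups were free of CVD at the start of 2007, so together they form the ‘at risk’ population; the function name is illustrative.

```python
def incidence_rate(new_cases: int, at_risk: int, per: int = 1) -> float:
    """New cases in the period divided by the population at risk, scaled to `per`."""
    if at_risk <= 0:
        raise ValueError("at-risk population must be positive")
    if not 0 <= new_cases <= at_risk:
        raise ValueError("new cases must come from the at-risk population")
    return new_cases / at_risk * per


new_cvd = 264        # responders with new CVD in 2007
no_new_cvd = 13097   # responders with no new CVD in 2007

# Everyone who could have become a new case, whether or not they did
at_risk = new_cvd + no_new_cvd

incidence = incidence_rate(new_cvd, at_risk)
incidence_per_1000 = incidence_rate(new_cvd, at_risk, per=1000)

print(f"One-year incidence: {incidence:.3f} = {incidence_per_1000:.0f} per 1,000 per year")
```

Adding the two groups before dividing is the step the parentheses protect: the new cases belong in the denominator as well as the numerator.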

Prevalence vs Incidence

So hopefully the difference between prevalence and incidence rates is clear, but they are related, along with the average duration of disease. The following relationships hold:

Low incidence, long duration – chronic diseases, e.g. asthma, diabetes

High incidence, short duration – acute, common diseases, e.g. cold, chicken pox in children

Preventative measures, e.g. vaccination and public health campaigns, might lower incidence, whereas clinical interventions may either decrease disease duration or decrease mortality, the latter resulting in an increase in disease duration.
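One common way to express the link between the three quantities, not stated explicitly in this article, is the steady-state approximation: prevalence ≈ incidence rate × average duration. It holds roughly when prevalence is low and the population is stable. The sketch below uses made-up numbers purely for illustration; they are not taken from any dataset.

```python
def approx_prevalence(incidence_per_person_year: float, mean_duration_years: float) -> float:
    """Steady-state approximation: prevalence ~ incidence rate x average duration.

    Only a rough guide; assumes low prevalence and a stable population.
    """
    return incidence_per_person_year * mean_duration_years


# Hypothetical chronic disease: few new cases, but each lasts many years
chronic = approx_prevalence(incidence_per_person_year=0.002, mean_duration_years=20)

# Hypothetical acute disease: many new cases, each lasting about a week
acute = approx_prevalence(incidence_per_person_year=0.5, mean_duration_years=7 / 365)

print(f"Chronic example: approximate prevalence {chronic:.1%}")
print(f"Acute example:   approximate prevalence {acute:.1%}")
```

Under these assumed figures the chronic disease has the higher prevalence despite an incidence 250 times lower, which is why an intervention that lowers mortality without curing the disease can raise prevalence.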

This whole process can be seen in the following diagram.

Prevalence rates are generally used to describe the extent of a disease in a particular population, whereas incidence rates look at the rate at which new cases of disease develop. Whilst prevalence can be affected by how long people live with a condition, incidence does not take this into account.

Prevalence is descriptive, often demonstrating public health ‘need’. Incidence, on the other hand, is useful for studying the causes of disease (the aetiology) or for looking at the order in which events occur. In diseases with long durations and very low levels of incidence, there may be little difference between prevalence and incidence.