Statistical Methods – Health and Public Health Knowledgeblog
http://health.knowledgeblog.org
Enhancing your Health Knowledge
Wed, 09 May 2012 15:18:56 +0000

Disease Prevalence Modelling
http://health.knowledgeblog.org/2011/07/25/disease-prevalence-modelling/
Mon, 25 Jul 2011 12:55:23 +0000

Author: David Lamb, Salford NHS

(based on Health Knowledge materials)

Disease prevalence modelling estimates the number of people with a disease or risk factor in a population when direct evidence, such as local surveys, is not available.  In January 2011 the Association of Public Health Observatories (APHO) published a technical report on prevalence modelling.  It outlines why such models are needed, the methods, validation procedures, projections and forecasts using models, and their strengths and limitations.  APHO also produces prevalence models for cancer, dementia, diabetes and many other conditions, which are available for download.

In this video, David outlines some examples of disease prevalence modelling whilst highlighting the information sources, statistical and validation methods and how the estimates were used.

Survival Analysis for Public Health
http://health.knowledgeblog.org/2011/07/25/survival-analysis-for-public-health/
Mon, 25 Jul 2011 11:39:39 +0000

Author: Dr. Richard Emsley, Post Doctoral Research Fellow, University of Manchester

Survival analysis encompasses a variety of methods for analysing the timing of events. The variable we are interested in is the time from a well-defined start point to the occurrence of a particular event or endpoint, most typically death.  However, survival analysis can be used in many kinds of scenarios that have a defined start point: e.g., in the social sciences it is used to analyse events such as the birth of children, divorce and marriage.

This video provides an introduction to survival analysis for public health intelligence analysts by Dr. Richard Emsley.  He covers two main topics: survival analysis and the definitions of survival rates.  The exercises that Richard covers will be added into the e-Lab.

Basic Statistics for Epidemiology: Risk
http://health.knowledgeblog.org/2011/07/22/basic-statistics-for-epidemiology-risk/
Fri, 22 Jul 2011 16:32:44 +0000

Author: Sara Muller, University of Keele

(based on Health Knowledge materials)

Risk has much the same meaning in epidemiology as it does in everyday usage – it is about chance.  It is defined by Unwin et al. as “the probability that an event will occur”.

It is often used to compare the risk of an event between groups.  There are lots of ways to define the groups you might want to compare, for example by socio-demographic factors, or by exposure to factors that may cause the disease.

There are several measures of risk, and we will deal with each in turn in this article.

  1. Absolute risk = incidence rate
  2. Relative risk
  3. Attributable risk
  4. Odds Ratio

A worked example deals with the association between smoking and cancer.

                      Lung Cancer
                      Yes     No
Ever Smoked    Yes    70      60
               No     20      90

Absolute Risk

Absolute risk of lung cancer by smoking status

Smokers: \[ \frac{70}{70+60} = \frac{70}{130} = 0.538 \]

Non-smokers: \[ \frac{20}{20+90} = \frac{20}{110} = 0.181 \]
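The same calculation can be sketched in a few lines of Python. This is only an illustration using the counts from the table above; the constant and function names are our own and are not part of the original material.

```python
# Counts from the 2x2 table in the worked example.
SMOKERS_WITH_CANCER = 70
SMOKERS_WITHOUT_CANCER = 60
NON_SMOKERS_WITH_CANCER = 20
NON_SMOKERS_WITHOUT_CANCER = 90


def absolute_risk(cases: int, non_cases: int) -> float:
    """Return cases divided by everyone in the group (cases + non-cases)."""
    total = cases + non_cases
    if total == 0:
        raise ValueError("group is empty")
    return cases / total


risk_smokers = absolute_risk(SMOKERS_WITH_CANCER, SMOKERS_WITHOUT_CANCER)
risk_non_smokers = absolute_risk(NON_SMOKERS_WITH_CANCER, NON_SMOKERS_WITHOUT_CANCER)

print(f"Smokers:     {risk_smokers:.4f}")      # 0.5385
print(f"Non-smokers: {risk_non_smokers:.4f}")  # 0.1818
```

Note that the denominator is the whole group (cases plus non-cases), not just the non-cases; dividing by non-cases alone would give odds rather than risk.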

Relative Risk

The relative risk is the ratio of the absolute risks (incidence rates) in two groups.  Relative risk measures the strength of association between an exposure and a disease.  Groups are usually defined by exposure to a potential determinant/cause of the disease, but can be defined by other characteristics, such as gender.

\[ \text{Relative risk} = \frac{\text{incidence rate of disease in group with exposure}}{\text{incidence rate of disease in group without exposure}} \]

If the result is:

<1        exposure decreases risk of disease

=1        exposure has no effect on risk of disease

>1        exposure increases risk of disease

Using the absolute risks above, the relative risk would be:

\[ \frac{0.538}{0.181} = 2.97 \]

Those people who have ever smoked are about 3 times as likely to die of lung cancer over a 15-year period as those who have never smoked.
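As a rough sketch, the relative risk and its interpretation can be computed directly from the four cell counts. The function names below are illustrative assumptions, not part of the original material. Because the code works from unrounded risks, it prints 2.96 rather than the 2.97 obtained from the rounded figures.

```python
import math


def relative_risk(exposed_cases, exposed_non_cases,
                  unexposed_cases, unexposed_non_cases):
    """Ratio of the risk in the exposed group to the risk in the unexposed group."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_non_cases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_non_cases)
    return risk_exposed / risk_unexposed


def interpret(rr):
    """Map a relative risk onto the three cases: below 1, equal to 1, above 1."""
    if math.isclose(rr, 1.0):
        return "exposure has no effect on risk of disease"
    if rr < 1.0:
        return "exposure decreases risk of disease"
    return "exposure increases risk of disease"


rr = relative_risk(70, 60, 20, 90)
print(f"{rr:.2f}: {interpret(rr)}")  # 2.96: exposure increases risk of disease
```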

Attributable Risk

Attributable risk measures the amount of disease risk in the population (or just in the exposed group) that can be ‘attributed’ to the exposure.  It is calculated as a difference between incidence rates, so it can be expressed in any of the same ways as a proportion.

AR population = incidence rate population – incidence rate non-exposed

AR exposed = incidence rate exposed – incidence rate non-exposed

An example of attributable risk is below:

Population incidence rate = \[ \frac{70+20}{70+20+60+90} = \frac{90}{240} = 0.375 \]

AR population = \[0.375 – 0.181 = 0.194\]

AR exposed = \[0.538 – 0.181 = 0.357\]

This suggests that, over the 15-year follow-up period, an excess risk of about 19 lung cancer deaths per 100 people in the population, and about 36 per 100 smokers, can be attributed to smoking.
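The two attributable risks can be sketched in Python as below. The function name and the dictionary keys are our own illustrative choices. The code uses unrounded rates, so the third decimal place may differ slightly from a calculation based on rounded intermediate values.

```python
def attributable_risks(a, b, c, d):
    """a, b: cases and non-cases among the exposed; c, d: among the unexposed."""
    rate_exposed = a / (a + b)
    rate_unexposed = c / (c + d)
    rate_population = (a + c) / (a + b + c + d)
    return {
        # Excess risk in the whole population compared with the unexposed.
        "population": rate_population - rate_unexposed,
        # Excess risk in the exposed group compared with the unexposed.
        "exposed": rate_exposed - rate_unexposed,
    }


ar = attributable_risks(70, 60, 20, 90)
print(f"AR population: {ar['population']:.3f}")  # 0.193
print(f"AR exposed:    {ar['exposed']:.3f}")     # 0.357
```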

Odds Ratio

The odds ratio compares the odds of an event of interest between two groups.  It is not the same as a risk ratio, although it will give similar results if the disease is rare.  It is often used in case-control studies.

Odds of exposure among cases/odds of exposure among controls.

If we continue the example above, the odds ratio calculation would be as follows:

Odds of exposure among cases = \[ \frac{70}{20} \]

Odds of exposure among controls = \[ \frac{60}{90} \]

Odds ratio = \[ \frac{70/20}{60/90} = \frac{70 \times 90}{20 \times 60} = 5.25 \]

Those people who have ever smoked had 5.25 times the odds of developing lung cancer in the 15-year follow-up period compared with those who had never smoked.
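A short Python sketch of the odds ratio is given below, with function names that are our own assumptions. It also shows that the ratio of exposure odds (cases versus controls) and the ratio of disease odds (exposed versus unexposed) reduce to the same cross-product of the 2x2 table.

```python
import math


def odds_ratio_exposure(exposed_cases, exposed_controls,
                        unexposed_cases, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


def odds_ratio_disease(exposed_cases, exposed_controls,
                       unexposed_cases, unexposed_controls):
    """Odds of disease in the exposed divided by odds of disease in the unexposed."""
    odds_exposed = exposed_cases / exposed_controls
    odds_unexposed = unexposed_cases / unexposed_controls
    return odds_exposed / odds_unexposed


or_exposure = odds_ratio_exposure(70, 60, 20, 90)
or_disease = odds_ratio_disease(70, 60, 20, 90)
cross_product = (70 * 90) / (20 * 60)

print(f"{or_exposure:.2f} {or_disease:.2f} {cross_product:.2f}")  # 5.25 5.25 5.25
```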

Basic Statistics for Epidemiology: Prevalence vs Incidence
http://health.knowledgeblog.org/2011/07/22/basic-statistics-for-epidemiology/
Fri, 22 Jul 2011 13:28:39 +0000

Author: Sara Muller, University of Keele (based on material from Health Knowledge)

Epidemiology vs Clinical Medicine

There are lots of definitions of epidemiology, but the one below is fairly comprehensive:

“…the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control of health problems”

In short, epidemiology is the who, what, when, why, how and where of disease.  It doesn’t involve any experimentation or clinical trials; it is about what actually happens in populations, which is why it is of particular interest to public health.

A concept most people are probably familiar with is ‘clinical medicine’, where interest is in the individual patient and you want to see large and clinically relevant changes in that patient. In the case of a clinical trial, you might be interested in these relatively large changes in a select group of people. For example: Will this painkiller reduce the level of pain in this patient so that they are comfortable?

Epidemiology differs from clinical medicine, because the unit of interest is the population, not the individual. In epidemiology, we are interested in small changes that have an effect at the population level. For example, a slight rise in your cholesterol leads to only a very slight increase in your risk of dying from coronary heart disease (CHD), but if everyone in a population increased their cholesterol level very slightly, there would be a big jump in the total number of CHD deaths in the population.

Factors Affecting Health

There are many factors that can affect health.

At the biological level, our genetic heritage may make us susceptible to specific conditions such as hypertension, sickle cell anaemia, cancer and haemophilia.

Then there is how we live – how much we exercise, what we eat, what we drink and what we might smoke. That some societies seem to live longer than others probably reflects differences in the interaction of these factors.

Then there is the environment. Hippocrates, ‘the father of medicine’, was one of the first to identify the environment as a determinant of health. He noted the impact of the seasons and the influence of where people lived.  It was a prophetic suggestion, later borne out in modern society. Along with the innovations of the Industrial Revolution came urbanisation and, in particular, problems of squalid housing, poor sanitation and polluted water supplies. Concern at the combined effect of these factors gave rise to the first Public Health Act of 1848.

More recently, the shrinking of the earth through advances in transport has made it easier than ever for individuals – and diseases – to move freely across the globe. Whilst economic progress has many advantages, there can be both positive and negative impacts on health.

We often have to think about these factors more in epidemiology and public health than in other areas of medicine because we do not control for them; in a randomised controlled trial (RCT), by contrast, randomisation removes many of their effects.

Quantification of disease in a population

Absolute counts of the incidence of disease are important, but relying on counts alone makes comparisons difficult: the bigger a population, the larger the number of cases is likely to be. One way to get around this is to consider the number of cases in relation to the population as a whole. Several approaches are possible and we shall review some of the more frequently encountered methods.

Ratio (also known as odds)

e-Lab link: http://elabdemo.nweh.org.uk

Calculating the ratio is the foundation for the calculation of many other measures.

a number/another number

where the two numbers are separate, e.g. number of males and number of females: the same person is not counted in both groups, and the size of either or both groups can change.

It allows the number of people with a disease in one population to be compared with the number in another population.  A ratio has no units.

Let’s take an example of the number of deaths in males and females. We have the total number of deaths in each gender.

2005 all cause mortality

Females                      269,368

Males                          243,324

Divide number of deaths in females, by number of deaths in males to get the female to male death ratio.

\[ 269,368/243,324 = 1.107 \]

Proportion

Proportion is the fraction of a population who have a characteristic of interest.  Those who are included in the numerator are also included in the denominator.  As the numerator is a subgroup of the denominator the value is always between 0 and 1.  Values can be expressed as a percentage by multiplying by 100.  Again this is a dimensionless quantity, so it has no units.

Examples of proportion include:

  • Proportion of men in a population
  • Proportion of people in a sample with diabetes

Using the same example as before, let’s consider the proportion of deaths that are in females, rather than the ratio of female to male deaths.

The proportion of deaths that were in females:

\[ 269368/512692 = 0.525 \]

We can alternatively present this as 52.5%.
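To make the difference between the ratio and the proportion concrete, here is a minimal Python sketch using the 2005 death counts quoted above. The variable names are our own. In the ratio, the two groups are separate; in the proportion, the numerator is part of the denominator.

```python
# 2005 all-cause mortality counts from the example above.
female_deaths = 269_368
male_deaths = 243_324

# Ratio: one group divided by a separate group.
female_to_male_ratio = female_deaths / male_deaths

# Proportion: the numerator is included in the denominator.
total_deaths = female_deaths + male_deaths
female_proportion = female_deaths / total_deaths

print(f"Ratio:      {female_to_male_ratio:.3f}")  # 1.107
print(f"Proportion: {female_proportion:.3f}")     # 0.525
print(f"Percentage: {female_proportion:.1%}")     # 52.5%
```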

Rate

Rates provide a common time frame and unit of population. They allow a direct comparison of frequency of disease. The two requirements for the rate calculation are: (1) a time frame; and (2) a unit of population.

\[ \text{Rate} = \frac{\text{frequency of observed event in a given time period}}{\text{total number in whom event might occur in that time period}} \]

The denominator includes all those who are eligible to appear in the numerator, i.e. those at risk of the observed event.

The time frame is important, especially if your event of interest is death: everyone will die at some point, so the time period has to be defined to get a sensible answer.

An example is based around a group of 742 people with knee pain. They were followed up for 3 years after completing a survey. During the follow-up, the group as a whole consulted their GPs 202 times for knee surgery.

Rate of consultation for knee surgery was:

\[ \frac{202}{742 \times 3} = \frac{202}{2226} = 0.091 \text{ consultations per person per year} \]
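The consultation rate can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes, as the example does, that all 742 people were followed for the full 3 years, so the denominator is simply people multiplied by years.

```python
def rate_per_person_year(events: int, people: int, years_followed: float) -> float:
    """Events divided by person-years, assuming everyone is followed for the full period."""
    person_years = people * years_followed
    return events / person_years


consultation_rate = rate_per_person_year(events=202, people=742, years_followed=3)
print(f"{consultation_rate:.3f} consultations per person per year")  # 0.091
```

Multiplying the people by the years before dividing matters here: evaluating 202/742*3 from left to right would give about 0.82 instead.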

Prevalence Rate

The prevalence rate is a measure of the proportion of a population affected by a specific condition in a specified time period.  Prevalence itself is the number of people in a population who have the disease of interest in a particular time period.  On its own this count isn’t very useful, because the larger the population, the more people are likely to have the disease, so a prevalence rate (often referred to simply as prevalence) is what we really want to know.

\[ \text{Prevalence rate} = \frac{\text{number of cases of disease in given time period}}{\text{total number in population in that time period}} \]

Be careful to specify your total population properly in terms of socio-demographic and environmental factors. In looking at the prevalence of cervical cancer, you wouldn’t want to include men in the denominator as the prevalence estimate would be far too low.

Prevalences are usually given in three ‘types’:

  1. Point prevalence: relates to prevalence with respect to a specific point in time – Did you have an asthma attack on Monday?
  2. Period prevalence: relates to prevalence over a defined period of time – Did you have an asthma attack in January?
  3. Lifetime prevalence: Have you ever had an asthma attack?

An example is based on the number of responders to a survey reporting knee pain in the past month.

All responders to a survey aged 50+                                                   7878

Responders reporting knee pain in the past month                           2874

The one-month prevalence rate of knee pain among responders to the survey:

\[ \frac{2874}{7878} = 0.365 \]

Alternatively, this can be presented as 36.5% or 365 per 1,000 responders.
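As a minimal sketch, the prevalence calculation and its alternative presentations look like this in Python. The counts are those used in the formula above (2874 out of 7878); the function and variable names are our own.

```python
def prevalence_rate(cases: int, population: int) -> float:
    """Proportion of the population with the condition in the time period."""
    if cases > population:
        raise ValueError("cases cannot exceed the population they are drawn from")
    return cases / population


knee_pain_prevalence = prevalence_rate(cases=2874, population=7878)

print(f"{knee_pain_prevalence:.3f}")                    # 0.365
print(f"{knee_pain_prevalence:.1%}")                    # 36.5%
print(f"{knee_pain_prevalence * 1000:.0f} per 1,000")   # 365 per 1,000
```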

Incidence Rate

Incidence is the number of new cases of a disease in a population.  As when considering prevalence, because of different population sizes, it is usual to consider a rate.

\[ \text{Incidence rate} = \frac{\text{number of new cases of disease in given time period}}{\text{total number in population at risk in that time period}} \]

For the calculation:

  • The time period should be specified, as the number of incident cases can be made arbitrarily large or small depending on the length of the time period being considered.
  • ‘At Risk’ population.  People who already have the disease at the start of the time period are not included in the denominator – if they already have the disease, they are not at risk of developing it.

Depending on your question, you might want to consider only first-ever instances of disease, or any new episode during your time period.  For example, if you consider flu, you might want any flu where that episode started during your 1-month time period of interest, or you might want only flu that started during your 1-month time period and where that was the first time that person had had flu.  This decision will change who is a member of your ‘at risk’ population.  Other ways of defining your ‘at risk’ group might be age, immunisation status or gender (don’t look at cervical cancer in men!).

Responders to BHPS with new CVD in 2007                                       264

Responders to BHPS with no new CVD in 2007                               13097

The one-year incidence of CVD in responders to this survey is:

\[ \frac{264}{13097+264} = 0.020 \text{ per year} \]

Alternatively, this can be expressed as 2.0% per year or 20 per 1,000 responders per year.
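The incidence calculation can be sketched in Python as below, using the counts from the formula above (264 new cases and 13097 responders with no new CVD). The names are illustrative, and the sketch assumes that both groups were free of CVD at the start of 2007, so that together they form the ‘at risk’ population.

```python
def incidence_rate(new_cases: int, at_risk_without_event: int) -> float:
    """New cases divided by everyone who was at risk at the start of the period."""
    population_at_risk = new_cases + at_risk_without_event
    return new_cases / population_at_risk


# Assumption: people who already had CVD before 2007 are not in either count.
cvd_incidence = incidence_rate(new_cases=264, at_risk_without_event=13_097)

print(f"{cvd_incidence:.3f} per year")                      # 0.020 per year
print(f"{cvd_incidence * 1000:.0f} per 1,000 per year")     # 20 per 1,000 per year
```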

Prevalence vs Incidence

So hopefully the difference between prevalence and incidence rates is clear, but they are related – along with the average duration of disease.  The following relationships hold:

Low incidence, long duration – chronic diseases, e.g. asthma, diabetes

High incidence, short duration – acute, common diseases, e.g. cold, chicken pox in children
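One common way to express the link between the three quantities is the steady-state approximation, in which the prevalence odds equal the incidence rate multiplied by the average duration. The Python sketch below assumes a stable population with constant incidence and duration; the function name and the input figures are hypothetical and chosen only to illustrate the two patterns above, not taken from any real data.

```python
def steady_state_prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Prevalence implied by P / (1 - P) = incidence x duration in a stable population."""
    prevalence_odds = incidence_per_year * mean_duration_years
    return prevalence_odds / (1 + prevalence_odds)


# Hypothetical chronic disease: few new cases, but each lasts a long time.
chronic = steady_state_prevalence(incidence_per_year=0.005, mean_duration_years=20)

# Hypothetical acute disease: many new cases, each lasting about one week.
acute = steady_state_prevalence(incidence_per_year=0.5, mean_duration_years=1 / 52)

print(f"Chronic: {chronic:.3f}")  # 0.091
print(f"Acute:   {acute:.3f}")    # 0.010
```

In this illustration the chronic disease has one hundredth of the incidence of the acute disease, yet roughly nine times the prevalence, because cases accumulate over its long duration.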

Preventative measures, e.g. vaccination and public health campaigns, might lower incidence, whereas clinical interventions may decrease duration, or may decrease mortality and so increase disease duration.

This whole process can be seen in the following diagram.

Prevalence rates are generally used to describe the extent of a disease in a particular population, whereas incidence rates look at the rate at which new cases of disease develop.  Whilst prevalence can be affected by how long people live with a condition, incidence does not take this into account.

Prevalence is descriptive, often demonstrating public health ‘need’. On the other hand, incidence is useful for studying the causes of disease (the aetiology) or to look at the order in which events occur.  In diseases with long durations and very low levels of incidence, there may be little difference between prevalence and incidence.
