
Infection projections: how the spread of Ebola is calculated

The World Health Organization (WHO) estimates that the actual number of cases in Guinea is about 1.5 times the reported figure, with corresponding factors of 2 for Sierra Leone and 2.5 for Liberia.

(The WHO hasn’t published the methodology used to estimate these figures, so they may be little more than guesses.)
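To make the arithmetic concrete, here is a minimal sketch of how such correction factors would be applied. The three factors are the ones quoted above; the reported counts are hypothetical round numbers chosen for illustration, not real surveillance data, and the function name is an assumption of this example.

```python
# Correction factors quoted in the text: estimated actual cases / reported cases.
CORRECTION_FACTORS = {"Guinea": 1.5, "Sierra Leone": 2.0, "Liberia": 2.5}


def estimate_actual_cases(reported):
    """Scale each country's reported case count by its correction factor."""
    return {
        country: count * CORRECTION_FACTORS[country]
        for country, count in reported.items()
    }


# Hypothetical reported counts, for illustration only (not real data).
example_reported = {"Guinea": 1000, "Sierra Leone": 2000, "Liberia": 3000}
estimated = estimate_actual_cases(example_reported)

for country, value in estimated.items():
    print(f"{country}: reported {example_reported[country]}, estimated {value:.0f}")
```

Because the factors are themselves uncertain, any estimate produced this way inherits that uncertainty.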

Putting the pieces together
Bioinformatics plays a key role in detecting, monitoring and responding to emerging infectious diseases (EIDs).

In the case of Ebola, the bioinformatics community has responded rapidly. For example, the current outbreak of Ebola in Sierra Leone was first detected in May, but by September a study reported sequencing 99 Ebola virus genomes from 78 patients diagnosed with the disease between late May and mid-June.

The process of sequencing a genome involves assembling many thousands of short sequences, which are fragments obtained from all over the genome. Algorithms for assembling genome sequences detect overlaps between fragments, then align and merge them to reconstruct the sequence of the whole genome.
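The overlap-and-merge idea can be sketched with a toy greedy assembler. This is an illustration only, not the method used in the studies mentioned here: it assumes error-free fragments that all come from the same strand, and the function names and the `min_len` threshold are choices made for this example. Production assemblers also have to cope with sequencing errors, repeats and reverse complements.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are wholly contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Toy fragments cut from the made-up sequence "ATGCGTACGTTAGC".
example_fragments = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
contigs = greedy_assemble(example_fragments)
print(contigs)
```

Greedy merging is easy to follow but can be misled by repeated regions, which is one reason practical assemblers use more elaborate graph-based approaches.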

Bioinformaticians have been developing and refining algorithms for sequence assembly since the late 1980s, and are constantly adapting them so they can handle new sequencing technologies and ever-larger scales of assembly.

By the end of September this year, the UC Santa Cruz Genomics Institute had released a new Ebola genome browser with an alignment of 148 individual viral genomes, including 102 from the current outbreak. This was a monumental effort: UCSC researcher Jim Kent led a team that reportedly worked around the clock in the last week of September to produce the browser.

Such genome browsers will undoubtedly accelerate global efforts to develop a vaccine and antiserum.

One of the advantages of having whole genome sequences is that they can be used to reconstruct the family tree (phylogeny) of Ebola viruses, and trace the course of the outbreak. Reconstructions of this kind can provide important insights into the successes and failures of current management strategies.
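As a rough illustration of how a family tree can be derived from aligned sequences, the sketch below computes the proportion of differing sites between each pair of sequences and then groups the closest ones first (average-linkage clustering, often called UPGMA). The sequences and labels are invented for this example and are far shorter than a real viral genome.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Average-linkage clustering; returns the tree topology as nested tuples."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in seqs]
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }

    def cluster_distance(c1, c2):
        # Mean distance over all pairs of members, one from each cluster.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Invented aligned sequences, for illustration only.
example_seqs = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TCGAACTT",
    "D": "TCGAACTA",
}
tree = upgma(example_seqs)
print(tree)
```

This simple distance-based method assumes that all lineages change at the same rate; real outbreak reconstructions typically rely on likelihood-based or Bayesian methods.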

They can also be used to estimate parameters that govern how rapidly the virus spreads, in terms of both number of cases and geographic range.

These can, in turn, be used to forecast the future course of the epidemic and predict the impact of various management strategies.
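A much-simplified sketch of the forecasting step is shown below. It assumes plain exponential growth and fits the growth rate to case counts by least squares on the logarithm of the counts. This is a stand-in for the richer models used in practice, which may draw on genome data as well, and the counts here are synthetic values generated to double every 20 days, not real figures.

```python
import math


def fit_exponential_growth(days, cases):
    """Least-squares fit of log(cases) = log(c0) + r * day; returns (c0, r)."""
    logs = [math.log(c) for c in cases]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    r = sxy / sxx
    c0 = math.exp(mean_y - r * mean_x)
    return c0, r


def forecast(c0, r, day):
    """Projected case count on the given day under exponential growth."""
    return c0 * math.exp(r * day)


def doubling_time(r):
    """Days needed for the case count to double at growth rate r."""
    return math.log(2) / r


# Synthetic counts that double every 20 days (illustration only, not real data).
example_days = [0, 10, 20, 30]
example_cases = [100 * 2 ** (d / 20) for d in example_days]

c0, r = fit_exponential_growth(example_days, example_cases)
print(f"growth rate per day: {r:.4f}")
print(f"doubling time in days: {doubling_time(r):.1f}")
print(f"projected cases on day 40: {forecast(c0, r, 40):.0f}")
```

Exponential projections of this kind are only plausible over short horizons, since interventions and changes in behaviour alter the growth rate.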

There are many other ways in which bioinformatics contributes to the management of EIDs. Genomic sequence analyses can lead to a better understanding of the biology of a disease, the features that make it pathogenic, and potential drug targets or clinical interventions.

A recent survey of Australian life sciences conducted by Bioinformatics Resource Australia EMBL (BRAEMBL) found that bioinformatics is seen by many laboratory scientists as core to their work, but also identified marked community concern about a lack of expertise and access to expertise in bioinformatics.

In light of the importance of bioinformatics in managing EIDs, and its growing role in facilitating research in the life sciences more generally, it is important that students and early career researchers from mathematics, statistics, computer science and biology are attracted into this field, and receive world-class training in its practice and implementation.

Bioinformatics is, and will continue to be, a core component of the international response to Ebola and other EIDs, and patients, medical staff and those close to them need all the help they can get.

The annual BioInfoSummer conference and training workshop is an initiative of the Australian Mathematical Sciences Institute. This year the event is being hosted by Monash University on 1-5 December.

Jonathan Keith is Senior Lecturer, School of Mathematical Sciences at Monash University. This story is published courtesy of The Conversation (under Creative Commons-Attribution/No derivatives).
