In this presentation, I will guide you through the first steps of the drug discovery process. We will look at how drugs interact with proteins and at novel ways of determining the structure of proteins.
Q) What happens when a disease or clinical condition without a suitable medical product is found?
A) A drug discovery program is initiated!
A drug discovery program can be broken down into the following steps.
The “Basic Research” and “Lead Discovery” sections are circled because they are the parts I will focus on in this presentation. Typically, initial research within academia generates a hypothesis that inhibiting or activating a certain protein, and with it the associated protein pathway, produces a therapeutic effect for the disease or condition the drug discovery program is targeting. Based on this hypothesis, there is an intensive search for a drug-like small molecule that can progress into the preclinical stage and, if successful, the clinical stage.
One of the most important steps in the early stages is target identification and validation. Here, a target is a macromolecule such as a protein, gene, or RNA. A good target must be druggable. In other words, the target must be accessible to small molecules and, upon binding, must display a measurable biological response.
Q) How are targets identified?
A) Targets can be identified through data mining, computer simulations, and sometimes pure chance!
Data mining refers to the use of bioinformatics to identify and prioritize potential disease targets. Such data comes from existing publications and patent information, gene expression data, proteomics data, transgenic phenotyping and compound profiling data.
In terms of computer simulations, many molecular modeling and docking programs are available. Such software uses advanced algorithms to model intramolecular interactions within macromolecules as well as intermolecular interactions between macromolecules, such as proteins, and their substrates, such as small drug molecules. During my internship at a biotechnology company last summer, I was fortunate enough to be introduced to three tools that work in conjunction with each other to predict how potential small drug molecules dock to the active sites of target proteins.
UCSF Chimera is a molecular modeling system that allows for interactive visualization and analysis of molecular data. During my internship, I used UCSF Chimera to prepare proteins and small drug molecules for docking as well as to see the results of the simulated docking.
RCSB Protein Data Bank is an archive of the 3D shapes of macromolecules such as proteins and nucleic acids as well as their associated small drug molecules and substrates. Over the summer, I used this resource to locate and load proteins and ligands of interest into UCSF Chimera and AutoDock Vina.
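To make the idea of "loading a structure" a little more concrete, here is a minimal sketch of reading atom coordinates from a PDB-format file in Python. It assumes the classic fixed-width PDB text format and the RCSB file-download address pattern. The function names, the placeholder ID, and the two synthetic records are my own illustration, not part of Chimera or of any entry used in this presentation.

```python
from typing import List, Tuple

# (atom name, residue name, x, y, z); a simplified record for illustration
Atom = Tuple[str, str, float, float, float]


def rcsb_download_url(pdb_id: str) -> str:
    """Build the RCSB download address for a four-character PDB ID."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def parse_atoms(pdb_text: str, records=("ATOM", "HETATM")) -> List[Atom]:
    """Read coordinates from the fixed-width columns of PDB atom records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line[:6].strip() in records and len(line) >= 54:
            atoms.append((
                line[12:16].strip(),   # atom name, columns 13-16
                line[17:20].strip(),   # residue name, columns 18-20
                float(line[30:38]),    # x, columns 31-38
                float(line[38:46]),    # y, columns 39-46
                float(line[46:54]),    # z, columns 47-54
            ))
    return atoms


def centroid(atoms: List[Atom]) -> Tuple[float, float, float]:
    """Average position of the atoms, e.g. a rough center for a ligand."""
    n = len(atoms)
    return (sum(a[2] for a in atoms) / n,
            sum(a[3] for a in atoms) / n,
            sum(a[4] for a in atoms) / n)


def fake_record(record, serial, name, resname, x, y, z) -> str:
    """Format one synthetic PDB line so the columns line up (made-up data)."""
    return (f"{record:<6}{serial:>5} {name:<4} {resname:>3} A{1:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Two made-up records: one protein atom (ATOM) and one ligand atom (HETATM).
sample = "\n".join([
    fake_record("ATOM", 1, "N", "MET", 1.0, 2.0, 3.0),
    fake_record("HETATM", 2, "C1", "LIG", 3.0, 4.0, 5.0),
    "END",
])
atoms = parse_atoms(sample)
print(rcsb_download_url("1abc"))  # "1abc" is only a placeholder ID
print(atoms)
print(centroid(atoms))
```

In practice a tool such as UCSF Chimera does this parsing for you. The sketch only shows that a structure file is, at its core, a list of atoms with 3D coordinates.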
AutoDock Vina is a program used to predict ligand-protein docking conformations.
Below are results of flexible docking of ligands to the crystal structures of proteins using AutoDock Vina.
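For readers curious about what a docking run looks like in practice, the following is a rough sketch of how an AutoDock Vina command line could be assembled in Python. The option names are the ones I know from Vina's command-line interface. The file names, box center, and box size are placeholder assumptions, not the settings behind the results above, and the command is only built and printed, never executed.

```python
import shlex
from typing import List, Optional, Sequence


def vina_command(receptor: str,
                 ligand: str,
                 center: Sequence[float],
                 size: Sequence[float],
                 out: str = "docked.pdbqt",
                 exhaustiveness: int = 8,
                 num_modes: int = 9,
                 flex: Optional[str] = None) -> List[str]:
    """Assemble (but do not run) a Vina command for one ligand.

    receptor, ligand and flex are PDBQT files prepared beforehand;
    center and size describe the search box in angstroms.
    """
    cmd = ["vina", "--receptor", receptor, "--ligand", ligand]
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", f"{c:.3f}", f"--size_{axis}", f"{s:.3f}"]
    if flex is not None:
        # Optional file with receptor side chains that may move during docking
        cmd += ["--flex", flex]
    cmd += ["--out", out,
            "--exhaustiveness", str(exhaustiveness),
            "--num_modes", str(num_modes)]
    return cmd


# Placeholder inputs: hypothetical file names and a 20 angstrom cubic box.
cmd = vina_command("receptor.pdbqt", "ligand.pdbqt",
                   center=(1.0, 2.0, 3.0), size=(20, 20, 20))
print(shlex.join(cmd))
```

The search box is the key design choice here: it should cover the active site of the target protein, which is why the structure of that site has to be known before docking.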
Q) But what if we don't know the structure of the target protein?
A) Before we can use AutoDock Vina to predict ligand-protein docking conformations, we need to know the structure of the active site of the target protein. When that structure is unknown, one can run computer simulations to work out which of the hundreds of possible protein structures is correct.
Let’s take a step back first. What is a protein?
Proteins are basically the workhorses of all cells. There are many different varieties of proteins with different functions, but all are made up of long chains of amino acids. Proteins are therefore polymers, and the amino acids are their monomers. Each amino acid consists of a main-chain portion, which connects to neighboring monomers through peptide bonds, and a side chain (R group). The R group determines the chemical properties of the amino acid: it can be acidic or basic, and hydrophobic or hydrophilic. A protein’s final folded structure is based on these chemical properties.
Chains of amino acids don’t like to stay stretched out in a long line; their most stable form is a compact “blob”. Proteins usually fold by themselves, but sometimes they need help from other proteins. Each protein has a unique structure that defines its function (copies of the same protein fold the same way to reach identical structures). Substrates and ligands must have a “lock and key” fit with the active site of the protein. Without a “lock and key” fit, the protein pathway associated with that specific protein cannot be activated.
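The grouping of amino acids by side-chain chemistry can be written down as a small lookup table. The sketch below uses one common simplified textbook grouping of the 20 standard amino acids by one-letter code. Borderline residues such as glycine, cysteine, and tyrosine are classified differently in different sources, so treat the table as an assumption. The example sequence is made up.

```python
from collections import Counter

# One simplified grouping of the 20 standard amino acids (one-letter codes).
GROUPS = {
    "hydrophobic": set("AVLIMFWPG"),
    "polar": set("STNQYC"),       # hydrophilic, uncharged side chains
    "acidic": set("DE"),          # hydrophilic, negatively charged
    "basic": set("KRH"),          # hydrophilic, positively charged
}


def residue_class(residue: str) -> str:
    """Return the group name for a single one-letter amino acid code."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def composition(sequence: str) -> Counter:
    """Count how many residues of each chemical group a sequence contains."""
    return Counter(residue_class(r) for r in sequence)


# A made-up ten-residue sequence, purely for illustration.
print(composition("MKTAYIAKQR"))
```

Real folding programs use far more detailed chemistry than four labels, but even this coarse view hints at why hydrophobic stretches tend to end up buried inside the folded “blob”.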
This is exactly why we need to know the structure of a protein: knowing the structure tells us how to target the protein with drugs. Figuring out which of the many possible protein structures is correct has been described as one of the biggest challenges in molecular biology. There are computer programs that simulate the folding of proteins. Over the years, these programs have improved in both the computing power available to them and the algorithms they use. However, these simulations are still bottlenecked by high computing costs and by the approximations of molecular forces they have to make.
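To give a feel for why the search is so expensive, here is a toy sketch using the HP lattice model, a textbook simplification that is not what the programs mentioned here actually compute. Each residue is either hydrophobic (H) or polar (P), the chain is laid out on a 2D grid, and a fold scores better the more non-neighboring H residues end up side by side. The sequences below are made up, and rotations and mirror images are counted as separate folds.

```python
from typing import Iterator, List, Optional, Tuple

Point = Tuple[int, int]
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def energy(seq: str, path: List[Point]) -> int:
    """Return minus the number of H-H contacts between non-consecutive residues."""
    index_at = {p: i for i, p in enumerate(path)}
    contacts = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only right and up so that each neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def enumerate_folds(seq: str) -> Iterator[List[Point]]:
    """Yield every self-avoiding placement of the chain, starting at the origin."""
    def extend(path, occupied):
        if len(path) == len(seq):
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)
    yield from extend([(0, 0)], {(0, 0)})


def best_fold(seq: str) -> Tuple[int, int, Optional[List[Point]]]:
    """Brute force: return (number of folds tried, best energy, one best fold)."""
    count, best_e, best_path = 0, 0, None
    for path in enumerate_folds(seq):
        count += 1
        e = energy(seq, path)
        if best_path is None or e < best_e:
            best_e, best_path = e, path
    return count, best_e, best_path


for seq in ("HPPH", "HPHPPH", "HPHPPHHPH"):
    count, e, _ = best_fold(seq)
    print(len(seq), "residues:", count, "folds tried, best energy", e)
```

Even on a flat grid, the number of folds grows roughly threefold with each added residue. Real proteins have hundreds of residues moving in continuous 3D space, so exhaustive search is out of the question and approximations become necessary.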
Q) How do we improve protein folding computer simulation algorithms?
A) Use humans!
Foldit is a crowdsourcing computer game, developed by the University of Washington’s Center for Game Science, that allows players to contribute directly to scientific research. It takes advantage of human puzzle-solving intuition and has people compete against each other to solve protein structures.
The game started off as an experimental research project to see whether the puzzle-solving power of a crowd of human players could be harnessed to discover new protein structures. Over the years, it grew larger and larger as more players joined. This growth in the player base can be attributed to Foldit’s developers using gamification to make the game more appealing to the general public. As proteins are modified, scores are calculated based on how well folded the structures are. A list of high scores is maintained for each puzzle, challenging other players to find a better way to fold a specific protein.
Already, Foldit has proven itself to be a powerful tool for scientific research. A study has shown that Foldit player solutions can often outperform state-of-the-art computational methods. To codify the various strategies players use to successfully fold proteins, the game developers added tools that allow players to encode their folding strategies as “recipes” and share them with other players. Players then modify existing “recipes” and redistribute them. Through this, there was, and still is, continual improvement in the power of the “recipes”. Over the period of the study, players developed over 5,400 different “recipes”, with the most successful spreading quickly throughout the player population. During the study, two “recipes” became particularly dominant. When researchers ran benchmark calculations on the algorithms encoded in these two “recipes”, they found that the algorithms outperformed previously published methods. Algorithms encoded in such “recipes” can therefore be used to improve existing computer simulation algorithms.
Another success story
Until 2011, numerous attempts had been made to solve the crystal structure of the M-PMV retroviral protease. However, all of these attempts, which used a wide range of computational methods, failed to produce a model of sufficient quality. The players of Foldit were then given this puzzle to solve. Remarkably, a team of players came up with a solution within 10 days. Their model was of sufficient quality for further structure determination.
Why should we care about this success?
Retroviral proteases are critical for viral maturation and proliferation, which makes them important targets for antiretroviral drug development. Through the refined crystal structure of the M-PMV retroviral protease, researchers gained better insight into how to target retroviral diseases such as AIDS using small drug molecules.