2.3
#58
ANALYSES OF ENTRIES IN THE HLA LIGAND/MOTIF DATABASE AND THE PREDICTIVE ALGORITHMS EMPOWERED BY THESE DATA.
Muthuraman Sathiamurthy, Heather D. Hickman, Joshua W. Cavett and William H. Hildebrand, PhD. Oklahoma City, OK, USA, Univ of Oklahoma Health Sciences Center, 73104, Department of Microbiology/Immunology.

We maintain an NIH-funded HLA Ligand Database that is publicly available on the World Wide Web at http://hlaligand.ouhsc.edu. The main goal of the database is to give researchers and clinicians organized access to (1) the complex and rapidly expanding HLA ligand and motif dataset, and (2) predictive algorithms that use the database to identify new peptide epitopes. Here we discuss the parameters of the dataset and address the question “How many ligands are required to support accurate predictions of peptide binding to HLA?”
The HLA epitope binding prediction algorithm available on the database predicts HLA-binding epitopes within an input sequence using an unweighted, dynamic amino acid frequency table. Because the table is rebuilt from the ligands currently in the database, the key factor driving algorithm accuracy is the number of endogenous ligands databased for each allele. To determine the minimum number of endogenous ligands needed for accurate prediction, we supplied the prediction algorithm with randomly selected A*0201 peptides in additive groups (for example, groups of 10). We then entered the HIV gag protein and tested the algorithm’s ability to predict A*0201 binding of the well-characterized gag epitope SLYNTVATL (SL9). Multiple randomized trials were carried out with random peptide sets of different sizes, and a median dataset size was determined for SL9 prediction. These experiments allow us to extrapolate the minimum number of ligands necessary for algorithm accuracy. A complete understanding of the minimum dataset for accurate ligand prediction will allow better algorithms for the study of minor histocompatibility antigens, cancer CTL epitopes, and viral ligands.
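The titration experiment described above can be sketched roughly as follows. This is a minimal illustration, not the database's actual implementation: the scoring rule (summing raw per-position residue counts), the success criterion (the target epitope ranking within the top `top_n` windows of the protein), and all function names are assumptions introduced here for illustration.

```python
import random
import statistics
from collections import Counter


def build_frequency_table(ligands, length=9):
    """Count, per position, how often each amino acid occurs (unweighted)."""
    table = [Counter() for _ in range(length)]
    for peptide in ligands:
        if len(peptide) != length:
            continue  # only ligands of the requested length contribute
        for pos, residue in enumerate(peptide):
            table[pos][residue] += 1
    return table


def score(peptide, table):
    """Assumed scoring rule: sum of the raw counts of each residue at its position."""
    return sum(table[pos][residue] for pos, residue in enumerate(peptide))


def rank_of(target, protein, table):
    """Rank of the target among all distinct windows of the protein (1 = best)."""
    k = len(target)
    windows = {protein[i:i + k] for i in range(len(protein) - k + 1)}
    target_score = score(target, table)
    return 1 + sum(1 for w in windows if score(w, table) > target_score)


def min_ligands_for_prediction(ligands, protein, target, step=10, top_n=1, rng=None):
    """Add randomly ordered ligands in groups of `step` until the target is predicted."""
    rng = rng or random.Random()
    shuffled = list(ligands)
    rng.shuffle(shuffled)
    for n in range(step, len(shuffled) + 1, step):
        table = build_frequency_table(shuffled[:n], len(target))
        if rank_of(target, protein, table) <= top_n:
            return n
    return None  # target never reached the top_n with the available ligands


def median_min_ligands(ligands, protein, target, trials=100, seed=0, **kwargs):
    """Median dataset size over repeated randomized trials (failed trials are skipped)."""
    rng = random.Random(seed)
    sizes = [min_ligands_for_prediction(ligands, protein, target, rng=rng, **kwargs)
             for _ in range(trials)]
    sizes = [s for s in sizes if s is not None]
    return statistics.median(sizes) if sizes else None
```

In this sketch, `ligands` would be the list of databased A*0201 ligands, `protein` the HIV gag sequence, and `target` the SL9 peptide; none of those data are reproduced here. Skipping failed trials when taking the median is a simplification and would bias the estimate downward if many trials fail.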