EPGD

Version 0.2, Updated May 2008

Gene duplication is common in eukaryotic genomes and provides new genetic material for mutation, drift and selection to act on. Here we describe a procedure to collect duplicated genes (paralogs) from 26 available eukaryotic genomes, to pre-calculate several evolutionary indexes (evolutionary rate, synonymous distance/clock, transition redundant exchange clock and so on) for each paralog family, and to identify block or segmental duplications (paralogons). Based on these results, we constructed an internet-accessible Eukaryotic Paralog Group Database (EPGD; http://epgd.biosino.org/EPGD/). The database is gene-centered and organized by paralog family, and it focuses on paralogs and the duplication events that occurred during evolution. The paralog families and paralogons can be searched by text or sequence and downloaded from the website as plain text files. The database should be useful to both experimentalists and bioinformaticians studying duplication events or paralog families.
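
To make the idea of a "paralog family" concrete, the following is a minimal sketch of how pairwise paralog relationships could be merged into families. It assumes single-linkage grouping (any two genes connected by a chain of paralog pairs end up in the same family), which is a common convention but is not stated to be the procedure EPGD uses. The function name `group_paralog_families` and the gene identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_paralog_families(pairs):
    """Merge pairwise paralog relationships into families.

    Assumption: single-linkage grouping via union-find. This is an
    illustrative sketch, not the documented EPGD procedure.
    """
    parent = {}

    def find(gene):
        # Register unseen genes as their own root.
        parent.setdefault(gene, gene)
        # Walk up to the root, shortening the path as we go.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    # Union the two genes of every paralog pair.
    for gene_a, gene_b in pairs:
        root_a, root_b = find(gene_a), find(gene_b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each family under its root.
    families = defaultdict(set)
    for gene in list(parent):
        families[find(gene)].add(gene)

    # Return sorted lists so the result is deterministic.
    return sorted(sorted(members) for members in families.values())


# Hypothetical gene identifiers, for illustration only.
example_pairs = [("geneA", "geneB"), ("geneB", "geneC"), ("geneD", "geneE")]
example_families = group_paralog_families(example_pairs)
print(example_families)
```

With the hypothetical pairs above, geneA, geneB and geneC fall into one family because geneB links the other two, while geneD and geneE form a second family.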


Citation:
Guohui Ding, Yan Sun, Hong Li, Zheng Wang, Haiwei Fan, Chuan Wang, Dan Yang, Yixue Li. EPGD: a comprehensive web resource for integrating and displaying eukaryotic paralog/paralogon information. Nucleic Acids Research, 2008, 36(Database issue): D255-D262; doi:10.1093/nar/gkm924.