Article (Scientific journals)
A procedure to recruit members to enlarge protein family databases--the building of UECOG (UniRef-Enriched COG Database) as a model.
Fernandes, G. R.; Barbosa Da Silva, Adriano; Prosdocimi, F. et al.
2008, in Genetics and Molecular Research, 7 (3), p. 910-24
Peer Reviewed verified by ORBi
 

Files


Full Text
uecog.pdf
Publisher postprint (887.57 kB)


Details



Keywords :
Computational Biology/methods; Databases, Protein; Reproducibility of Results
Abstract :
A procedure to recruit members to enlarge protein family databases is described here. The procedure makes use of the UniRef50 clusters produced by UniProt. Current family entries are used to recruit additional members from the UniRef50 clusters to which they belong. Only those additional UniRef50 members that are not fragments and whose length is within a restricted range relative to the original entry are recruited. The enriched dataset is then limited to genomes from selected clades. We used the COG database, which is used for genome annotation and for studies of phylogenetics and gene evolution, as a model. To validate the method, the UniRef-Enriched COG0151 (UECOG) was tested with distinct procedures that compare recruited members with their recruiters: PSI-BLAST, secondary structure overlap (SOV), Seed Linkage, COGnitor, shared domain content, and neighbor-joining single-linkage. The first four procedures agreed in their validations. Presently, the UniRef50-based recruitment procedure enriches the COG database for Archaea, Bacteria and its subgroups Actinobacteria, Firmicutes, Proteobacteria, and other bacteria by 2.2-, 8.0-, 7.0-, 8.8-, 8.7-, and 4.2-fold, respectively, in terms of sequences, and also considerably increases the number of species.
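The recruitment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Protein record, the recruit function, and the 20% length tolerance are all hypothetical, since the abstract does not state the actual length range or data format. The sketch only shows the three filters named above (same UniRef50 cluster as a current family entry, not a fragment, length close to the recruiter) followed by the clade restriction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    """Hypothetical record; field names are assumptions, not UniProt's schema."""
    accession: str
    length: int
    is_fragment: bool
    clade: str
    cluster_id: str  # identifier of the UniRef50 cluster the protein belongs to


def recruit(seeds, candidates, allowed_clades, length_tolerance=0.2):
    """Return candidates that share a cluster with a seed and pass the filters.

    length_tolerance is an assumed value: a candidate is kept when its length
    differs from at least one recruiting seed by no more than this fraction
    of the seed's length.
    """
    # Index current family entries (seeds) by their cluster.
    seeds_by_cluster = {}
    for seed in seeds:
        seeds_by_cluster.setdefault(seed.cluster_id, []).append(seed)
    seed_accessions = {seed.accession for seed in seeds}

    recruited = []
    for cand in candidates:
        # Skip existing family members and fragments.
        if cand.accession in seed_accessions or cand.is_fragment:
            continue
        # Restrict the enriched set to the selected clades.
        if cand.clade not in allowed_clades:
            continue
        # Keep only candidates whose length is close to a recruiter's length.
        recruiters = seeds_by_cluster.get(cand.cluster_id, [])
        if any(abs(cand.length - s.length) <= length_tolerance * s.length
               for s in recruiters):
            recruited.append(cand)
    return recruited
```

With one seed of length 100 in cluster "U1", a non-fragment bacterial candidate of length 110 in the same cluster would be recruited under these assumed settings, while a fragment, a candidate of length 150, a candidate from an unselected clade, or a member of a different cluster would not.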
Disciplines :
Biotechnology
Author, co-author :
Fernandes, G. R.
Barbosa Da Silva, Adriano ;  University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Prosdocimi, F.
Pena, I. A.
Santana-Santos, L.
Coelho Junior, O.
Barbosa-Silva, A.
Velloso, H. M.
Mudado, M. A.
Natale, D. A.
Faria-Campos, A. C.
Aguiar, S. C. V.
Ortega, J. M.
External co-authors :
yes
Language :
English
Title :
A procedure to recruit members to enlarge protein family databases--the building of UECOG (UniRef-Enriched COG Database) as a model.
Publication date :
2008
Journal title :
Genetics and Molecular Research
ISSN :
1676-5680
Publisher :
Fundacao de Pesquisas Cientificas de Ribeirao Preto, Brazil
Volume :
7
Issue :
3
Pages :
910-24
Peer reviewed :
Peer Reviewed verified by ORBi
Focus Area :
Systems Biomedicine
Available on ORBilu :
since 13 April 2016

Statistics


Number of views
156 (3 by Unilu)
Number of downloads
120 (0 by Unilu)

Scopus citations®
5
Scopus citations® without self-citations
0
WoS citations
5
