PBIL

This is a lightened version of BIBI, rewritten in Python. BIBI is the software environment for sequence-based identification of prokaryotes (Bacteria and Archaea). The first version of this bacterial DNA sequence identification web tool was written by Devulder et al. (2003).
BIBI light edition now uses the Python seqclass (JP Flandrois), the tree class (JP Flandrois and Emmanuelle Dantony, 2005), Sequence colourist (Stephane Vellay and JP Flandrois) and parts of T4Bi (JP Flandrois, 2005). The main developers are now Jean-Pierre Flandrois (database construction and general scripts) and Manolo Gouy (tree visualisation programs). The BIBI front end is version 0.2 (Dec 06) and the database builder is version 3 (Dec 07). For any question: flandrs at biomserv.univ-lyon1.fr
You may try betalactamase identification (plain but working).

Sending an unknown sequence to BIBI

Enter the QUERY sequence
/!\ Fasta only (first line begins with >, other line(s) = sequence) /!\      BiBile How To     BiBile usage statistics
Read about Data Bases Classification
BIBI's parameters. GENES in DATABASES are: SSU-rDNA-16S (Bacteria and Archaea); gyrB, recA, sodA, rpoB, tmRNA, tuf, groES, groEL, dnaK, dnaJ, fusA (Bacteria); groel2-hsp65 (Actinobacteria). Databases are called "stringent" or "lax" (depending on whether the names are nomenclature compliant or not). TS databases contain only TS + complete genome sequences.
Sequence databases (currently compiled 18/Apr/2009). Extraction ratio from GenBank and Evolution of extractions
Nb seqs to align : Seq_Id :
Alignment : Phylogeny :
                                                                                                                  
Test sequences
NEWS
SERVER improvements: Speed: the server is now a Dell PowerEdge 2 x quad-core (Dec 2007). The server is now on a 1 Gb connection (June 2007); hopefully you will notice the change. The computer now restarts automatically if accidentally stopped. A sister server has been built in case of main server failure, but the switchover is not automatic.

DATABASES improvements:
Version 3 of BIBIdb is now in use. The improvements are VERY important: the Type Strain extraction is 5 to 10% more effective and false TS have been removed. Recognition of serotypes, biotypes and pathovars is better and clearer (10-12-2007).
New databases have been included. Databases with a wide spectrum of species now come first in the choice list; the other databases have a limited spectrum of species and their names begin with "ux_" (2007-05-15).
The ~X~ tag appears again to call the user's attention to putative nomenclature errors (15-04-2007).
A TS-only bank has been added for Archaea (13-04-2007).
The post-extraction process has been completely rewritten to correct all the irrelevant names: rejected or corrected names etc. are replaced by the most recent correct name; in these cases an Md flag is added and, if several sequences share the same bankId, a number follows (26-01-2007).

/!\ Note: your browser has to allow opening a new window to see the results.
PROGRAM improvements: A bug affecting the HTML tree in some cases has been corrected (08/01/2007). Version 3.0 of the database builder has been running since 10-12-2007.
Important: Unfortunately, in rare cases the topology algorithm does not identify some unexpected positions of the query (the query is alone outside the tree); please look at the tree. This will be corrected, I hope soon.
Please send questions and bug reports to flandrs at biomserv.univ-lyon1.fr
QUICK HELP The NEW graphic How To
BIBIle is a web tool designed to identify bacterial sequences among databases.
Although this web tool is a "work in progress", this version has been thoroughly tested and may be used, of course with a critical analysis of the results, as for any automated system of sequence interpretation.
Usage is very simple: submit a query sequence (FASTA format, see the tests if needed), choose the sequence database and click. The SSU-rDNA-16S_stringent test-set contains indications for use in various cases.
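The FASTA requirement for the query (first line begins with >, other lines hold the sequence) can be checked before submission. The following is a minimal sketch, not BIBIle's own validation: the function name `is_fasta` and the accepted alphabet (IUPAC nucleotide codes plus the gap character) are assumptions made for illustration.

```python
def is_fasta(text: str) -> bool:
    """Return True if text looks like a single-record nucleotide FASTA query.

    Assumption: only IUPAC nucleotide codes and '-' are accepted as
    sequence characters; the real server may be more or less tolerant.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    # A valid query needs a header line plus at least one sequence line.
    if len(lines) < 2 or not lines[0].startswith(">"):
        return False
    allowed = set("ACGTUNRYKMSWBDHV-")
    # Every remaining line must contain sequence characters only,
    # which also rejects a second '>' header (multi-record input).
    return all(set(ln.upper()) <= allowed for ln in lines[1:])
```

For example, `is_fasta(">query\nACGTNNAC")` accepts the input, while a bare sequence without a header line is rejected.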
You can choose to include gaps (phylogenetic trees are usually better this way; this is the default) or to exclude gaps.
The names of the sequence databases indicate the scope (Bacteria, Actinobacteria ...), the gene (e.g. SSU-rDNA-16S) and compliance (stringent) or non-compliance (lax) with taxonomy and bacterial nomenclature.
The stringent choice leads to a taxonomically relevant identification. Type strains are identified (but some may be missing due to incomplete descriptions in GenBank).
Note: the databases are now updated monthly and improvement of the type strain (TS) detection is ongoing.
After some computing time, BIBIle returns a phylogenetic tree and some basic comments, including the names of the sequences sharing the query node (this is based on node topology, not distances).
The [Alignment-Cleaner] button launches a cleaning program that removes the ill-aligned or trailing sub-fragments, which essentially result from sequencing uncertainties or alignment deficiencies at the beginning and end of the sequences. The N-rich ends of the sequences are also removed.
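The removal of N-rich ends can be illustrated as follows. This is only a sketch of the general idea, not the algorithm used by the cleaner: the function name `trim_n_rich_ends`, the sliding-window approach, the window size and the threshold are all assumptions.

```python
def trim_n_rich_ends(seq: str, window: int = 10, max_n_frac: float = 0.2) -> str:
    """Trim both ends of seq while the terminal window is rich in N.

    Assumed rule: a terminal window of `window` bases is "N-rich" when
    more than `max_n_frac` of its bases are N. Internal N's are kept.
    """
    def n_frac(chunk: str) -> float:
        return chunk.upper().count("N") / len(chunk)

    start, end = 0, len(seq)
    # Advance the start while the leading window is N-rich.
    while end - start >= window and n_frac(seq[start:start + window]) > max_n_frac:
        start += 1
    # Pull the end back while the trailing window is N-rich.
    while end - start >= window and n_frac(seq[end - window:end]) > max_n_frac:
        end -= 1
    # Remove any N still sitting exactly at either end.
    return seq[start:end].strip("Nn")
```

A sequence without N at its ends is returned unchanged, so the function can be applied to every sequence of an alignment input without a prior check.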
One sequence in the SSU-rDNA-16S_stringent test-set illustrates the process particularly well.
A test for biodiversity is done to prevent a "one species only" selection
A topographical analysis of the position of the Type Strains (within the tree or outside) is done. A button lets you add the type strain sequences initially missing from the tree. This is restricted to sequences that are sufficiently close to the query (some do not have any segment in common with the query, others may be wrongly identified). You now have direct access to the GenBank description of the TS NOT included in the tree.
The optimum alignment is obtained with the [alignment cleaner] option, so it may be wise to use it before adding extra TS; but extra TS expand the tree, and this could be interesting too...
Biovars, serovars etc. are identified at the end of the strain description (e.g. Mycobacterium parascrofulaceum i AY337275 II); the sequences from our lab have a "Ly" tag (this is for our own use).
Type strains are indicated in the tree by T.
Sequence identity with the query is indicated by the color of the symbol: (>99%) (>98%) (>95%) (>90%) (<90%)
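The five identity classes above can be expressed as a small lookup. This is a sketch only: the function name `identity_class` is hypothetical, and the handling of values falling exactly on a threshold (for example exactly 90%) is an assumption, since the legend does not specify it.

```python
def identity_class(percent_identity: float) -> str:
    """Map a percent identity with the query to one of the legend classes."""
    # Thresholds taken from the legend, checked from the highest down.
    for threshold, label in ((99.0, ">99%"), (98.0, ">98%"),
                             (95.0, ">95%"), (90.0, ">90%")):
        if percent_identity > threshold:
            return label
    # Assumption: values of exactly 90% or less fall in the last class.
    return "<90%"
```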
The position in the BLAST result is indicated by a superscript number at the end of the line [¹].
This is a typical description in the tree: Corynebacterium coyleae i AB234868¹  Try the links (to the DSMZ site and to the PBIL GenBank description), and see the symbol indicating the % identity and the ¹ position in the BLAST results.
A "<" indicates a host (a GenBank feature), as here: •Bartonella bacilliformis i DQ179110 <"Homo_sapiens"
Endosymbiont cases are indicated using braces following the bank ID: <{Drosophila melanogaster}
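For readers who post-process the tree labels, the host and endosymbiont annotations can be extracted with a regular expression. This is a sketch based only on the two examples above; the function name `parse_host` and the returned labels are hypothetical, and other label variants may exist.

```python
import re

# "<" followed either by a double-quoted host or by a name in braces.
HOST_RE = re.compile(r'<\s*(?:"(?P<host>[^"]*)"|\{(?P<endo>[^}]*)\})')

def parse_host(label: str):
    """Return ("host", name) or ("endosymbiont", name), or None if absent."""
    match = HOST_RE.search(label)
    if match is None:
        return None
    if match.group("host") is not None:
        return ("host", match.group("host"))
    return ("endosymbiont", match.group("endo"))
```

Applied to the Bartonella bacilliformis line above, the function yields the pair `("host", "Homo_sapiens")`; a label without "<" yields `None`.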
You also have access to the PDF version of the tree and to the alignments (complete or reduced to the length of the query).
BIBIle uses the "T4BiFasta-like" description of the sequence: >Genus~species[~subsp~subspecies]~[?/v/Xn]~[T/(N==i)]~GB-Id[=biovar, serovar...,[Ly]][Md[1...n]/[]] to condense all the main descriptors in the first line. As sequences used for bacteria identification are short (<1500 bp), the second line contains the whole nucleotide sequence (this does not respect the 80-character rule). You may see the characteristic ~ in some parts of the answer and technical annexes. Note that "?" means "not in the approved lists of bacterial names" and, on the contrary, v means "validated". X indicates that the sequence is falsely attributed to the species, and the number is a confidence criterion. T is for TS; N or i (in the tree) indicates "non-type strains". Md indicates a name modification.
BIBIle uses Muscle as alignment tool, Clustal for phylogenetic reconstruction, and the newicktotxt and newicktopdf programs of Manolo Gouy.
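The header layout described above can be split into its fields on the ~ separator. The sketch below is an illustration written from the pattern alone, not code from BIBIle: the names `T4BiHeader` and `parse_t4bi_header` are hypothetical, and everything after "=" (biovar, serovar, Ly, Md...) is kept as raw text because its exact layout is not fully specified here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class T4BiHeader:
    genus: str
    species: str
    subspecies: Optional[str]
    name_status: str  # "?", "v", or "X" followed by a number
    strain_type: str  # "T", "N" or "i"
    accession: str    # the GenBank identifier
    extra: str        # raw text after "=", left unparsed

def parse_t4bi_header(line: str) -> T4BiHeader:
    """Split a T4BiFasta-like first line into its main descriptors."""
    fields = line.strip().lstrip(">").split("~")
    if len(fields) < 5:
        raise ValueError(f"too few ~-separated fields: {line!r}")
    genus, species, rest = fields[0], fields[1], fields[2:]
    subspecies = None
    # Optional "~subsp~subspecies" pair right after the species.
    if rest[0] == "subsp":
        subspecies, rest = rest[1], rest[2:]
    if len(rest) != 3:
        raise ValueError(f"unexpected field count: {line!r}")
    status, strain_type, id_part = rest
    accession, _, extra = id_part.partition("=")
    return T4BiHeader(genus, species, subspecies, status,
                      strain_type, accession, extra)
```

As a usage example, a made-up header such as `>Mycobacterium~parascrofulaceum~v~i~AY337275=II` (assembled here from the pattern, not copied from a BIBIle database) gives genus `Mycobacterium`, strain type `i`, accession `AY337275` and extra `II`.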

BIBIle is still under active development but has been extensively tested; its identification power is at least equal to BIBI's.
To Do: critical: This version is stabilized; only major bugs are corrected. A major remaining question is whether the [Alignment-Cleaner] should be used first in the next version. No critical bugs or missing functions identified.
To Do: improvement, major: 1) Build the database concerning the bad quality of some identities in GenBank 3) Collapse the tree in regions containing the same Ids
To Do: improvement, minor: 1) Improve the printer-friendly summary; 2) simplify the PDF workflow
Important note: Our objective is functionality, not a heavy and beautiful site!