Wednesday, 11 December 2013



Protein Data Bank


The Protein Data Bank (PDB) is a repository for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography or NMR spectroscopy and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organisations (PDBe, PDBj, and RCSB). The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB.
The PDB is a key resource in areas of structural biology, such as structural genomics. Most major scientific journals, and some funding agencies, such as the NIH in the USA, now require scientists to submit their structure data to the PDB. If the contents of the PDB are thought of as primary data, then there are hundreds of derived (i.e., secondary) databases that categorize the data differently. For example, both SCOP and CATH categorize structures according to type of structure and assumed evolutionary relations; GO categorizes structures based on genes.

These are the links to some Protein Data Bank websites:

How is the information stored and queried?


Search Field              Example
PDB ID                    4HHB, 2MHR
Deposition/Release Date   September 1 1996
Contain Chain Type        Protein: Ignore, Enzyme: Yes, DNA
Citation Author           S.S. Taylor
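
To make the PDB ID search field a little more concrete, here is a minimal Python sketch that checks whether the text typed into it looks like a list of PDB IDs. It assumes the classic four-character ID format (a digit from 1 to 9 followed by three letters or digits). The function names are made up for this post and are not part of any PDB website or library.

```python
import re
from typing import List

# Classic PDB IDs: a digit 1-9 followed by three alphanumeric characters.
PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")


def is_valid_pdb_id(text: str) -> bool:
    """Return True if text looks like a classic four-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(text.strip()) is not None


def parse_pdb_id_list(field: str) -> List[str]:
    """Split a comma-separated search field into upper-cased, valid IDs."""
    candidates = [part.strip().upper() for part in field.split(",")]
    return [pdb_id for pdb_id in candidates if is_valid_pdb_id(pdb_id)]


if __name__ == "__main__":
    # The example value from the search-field table above.
    print(parse_pdb_id_list("4HHB, 2MHR"))
```

Anything that does not match the pattern is silently dropped here; a real search form would more likely report it back to the user.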

Here are some examples of protein entries:

 HtrA

Molecule: Putative serine protease
Polymer: 1, Type: protein
Chains: A
Fragment: PDZ domain, UNP residues 266-390
Organism: Streptococcus pneumoniae
Gene Name: SP_2239

 Aminopeptidase

Molecule: METHIONINE AMINOPEPTIDASE
Polymer: 1, Type: protein
Chains: A
EC#: 3.4.11.18
Details: COMPLEX WITH T-BUTANOL AND COBALT
Organism: Homo sapiens
Gene Names: METAP2 MNPEP P67EIF2


 Carboxypeptidase

Molecule: CARBOXYPEPTIDASE A
Polymer: 1, Type: protein
Chains: A
EC#: 3.4.17.1
Organism: Bos taurus
Gene Names: CPA1 CPA


 Collagenase

Molecule: COLLAGENASE
Polymer: 1, Type: protein
Chains: A
EC#: 3.4.24.3
Fragment: RESIDUES 119-880
Organism: Clostridium histolyticum
Gene Name: colG


 Subtilisin

Molecule: Tk-subtilisin
Polymer: 1, Type: protein
Chains: A
Mutation: S324A, D372A
Organism: Thermococcus kodakarensis KOD1
Gene Name: TK1675
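
If you want to play with these entries in code, one option is to keep them as small records and filter on the EC number. The sketch below is only an illustration: the `Entry` class and the helper function are invented for this post, and only a few of the fields listed above are copied in. It relies on the fact that EC numbers are hierarchical, with EC 3.4 being the peptidase subclass of the hydrolases.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """A hypothetical, cut-down record for one of the entries above."""
    name: str
    molecule: str
    organism: str
    ec_number: Optional[str] = None  # not every entry lists an EC number


ENTRIES = [
    Entry("HtrA", "Putative serine protease", "Streptococcus pneumoniae"),
    Entry("Aminopeptidase", "METHIONINE AMINOPEPTIDASE", "Homo sapiens", "3.4.11.18"),
    Entry("Carboxypeptidase", "CARBOXYPEPTIDASE A", "Bos taurus", "3.4.17.1"),
    Entry("Collagenase", "COLLAGENASE", "Clostridium histolyticum", "3.4.24.3"),
    Entry("Subtilisin", "Tk-subtilisin", "Thermococcus kodakarensis KOD1"),
]


def with_ec_prefix(entries: List[Entry], prefix: str) -> List[Entry]:
    """Return entries whose EC number starts with the given levels, e.g. '3.4'."""
    wanted = prefix.split(".")
    matches = []
    for entry in entries:
        if entry.ec_number is None:
            continue
        # Compare level by level so that '3.4' does not match '3.41.x.x'.
        if entry.ec_number.split(".")[:len(wanted)] == wanted:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    for entry in with_ec_prefix(ENTRIES, "3.4"):
        print(entry.name, entry.ec_number, entry.organism)
```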





Enjoy Learning The Protein

















Tuesday, 10 December 2013


The Simplified Molecular-Input Line-Entry System, or SMILES, is a specification in the form of a line notation for describing the structure of chemical molecules using short ASCII strings. SMILES strings can be imported by most molecule editors for conversion back into two-dimensional drawings or three-dimensional models of the molecules.

The term SMILES refers to a line notation for encoding molecular structures and specific instances should strictly be called SMILES strings. However, the term SMILES is also commonly used to refer to both a single SMILES string and a number of SMILES strings; the exact meaning is usually apparent from the context. The terms Canonical and Isomeric can lead to some confusion when applied to SMILES. The terms describe different attributes of SMILES strings and are not mutually exclusive.


The term Canonical SMILES refers to the version of the SMILES specification that includes rules for ensuring that each distinct chemical molecule has a single unique SMILES representation. The term Isomeric SMILES refers to the version of the SMILES specification that includes extensions to support the specification of isotopes, chirality, and configuration about double bonds.


Graph-based Definition
In terms of a graph-based computational procedure, SMILES is a string obtained by printing the symbol nodes encountered in a depth-first tree traversal of a chemical graph. The chemical graph is first trimmed to remove hydrogen atoms and cycles are broken to turn it into a spanning tree. Where cycles have been broken, numeric suffix labels are included to indicate the connected nodes. Parentheses are used to indicate points of branching on the tree.
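
The procedure described above can be sketched in a few lines of Python. This is a toy illustration, not a real SMILES writer: it assumes the hydrogens have already been removed, that the graph is connected, and it ignores aromaticity, charges, stereochemistry and canonical ordering. The function name and the input layout (a dictionary of atom symbols plus a list of bonds) are choices made for this example.

```python
from typing import Dict, List, Tuple

# Single bonds are left implicit; double and triple bonds get a symbol.
BOND_SYMBOLS = {1: "", 2: "=", 3: "#"}


def write_smiles(atoms: Dict[int, str],
                 bonds: List[Tuple[int, int, int]],
                 start: int = 0) -> str:
    """Write a (non-canonical) SMILES string for a hydrogen-free graph."""
    neighbours = {atom: [] for atom in atoms}
    for a, b, order in bonds:
        neighbours[a].append((b, order))
        neighbours[b].append((a, order))

    visited = set()
    seen_bonds = set()
    children = {atom: [] for atom in atoms}     # spanning-tree bonds
    ring_labels = {atom: [] for atom in atoms}  # ring-closure suffixes
    ring_count = 0

    def explore(atom: int) -> None:
        """Pass 1: depth-first search that breaks cycles into a tree."""
        nonlocal ring_count
        visited.add(atom)
        for nbr, order in neighbours[atom]:
            bond = frozenset((atom, nbr))
            if bond in seen_bonds:
                continue
            seen_bonds.add(bond)
            if nbr in visited:
                # This bond closes a ring: label both of its atoms.
                ring_count += 1
                label = str(ring_count) if ring_count < 10 else "%" + str(ring_count)
                ring_labels[nbr].append(label)
                ring_labels[atom].append(BOND_SYMBOLS[order] + label)
            else:
                children[atom].append((nbr, order))
                explore(nbr)

    def emit(atom: int) -> str:
        """Pass 2: print the tree, with parentheses around side branches."""
        text = atoms[atom] + "".join(ring_labels[atom])
        kids = children[atom]
        for i, (child, order) in enumerate(kids):
            branch = BOND_SYMBOLS[order] + emit(child)
            text += "(" + branch + ")" if i < len(kids) - 1 else branch
        return text

    explore(start)
    return emit(start)


if __name__ == "__main__":
    # Cyclohexane: six carbons joined in a ring by single bonds.
    ring_atoms = {i: "C" for i in range(6)}
    ring_bonds = [(i, (i + 1) % 6, 1) for i in range(6)]
    print(write_smiles(ring_atoms, ring_bonds))
```

Two passes are used because a ring-closure label has to be printed next to the first atom of the ring, and that label is only known once the search has come back around to it.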

These are some of the bond symbols that can be used in SMILES:

SMILES BONDS   SYMBOL
SINGLE         -
DOUBLE         =
TRIPLE         #


Here are some examples of SMILES strings and the names of the molecules they describe:

SMILES      NAME
CC          Ethane
O=C=O       Carbon dioxide
C#N         Hydrogen cyanide
CCN(CC)CC   Triethylamine
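
To see how much information these short strings carry, here is a small Python sketch that works out the molecular formula of each example. It is deliberately limited: it only understands the atoms C, N and O, the three bond symbols and parentheses, and it assumes the usual valences (4, 3 and 2) to fill in the hydrogens that SMILES leaves out. Rings, charges and bracket atoms are not handled, and the function name is just one chosen for this post.

```python
from collections import Counter

DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def molecular_formula(smiles: str) -> str:
    """Return the formula of a simple, ring-free SMILES string."""
    symbols = []     # element of each heavy atom, in the order read
    used = []        # total bond order already used by each atom
    branches = []    # atoms at which a '(' was opened
    previous = None  # atom that the next atom will be bonded to
    pending = 1      # order of the next bond (single unless stated)

    for ch in smiles:
        if ch in DEFAULT_VALENCE:
            symbols.append(ch)
            used.append(0)
            current = len(symbols) - 1
            if previous is not None:
                used[previous] += pending
                used[current] += pending
            previous = current
            pending = 1
        elif ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
        elif ch == "(":
            branches.append(previous)
        elif ch == ")":
            previous = branches.pop()
        else:
            raise ValueError("unsupported SMILES character: " + ch)

    counts = Counter(symbols)
    # Implicit hydrogens fill whatever valence the written bonds leave free.
    counts["H"] = sum(DEFAULT_VALENCE[s] - u for s, u in zip(symbols, used))

    # Hill order: carbon, then hydrogen, then the rest alphabetically.
    if "C" in counts:
        order = ["C", "H"] + sorted(k for k in counts if k not in ("C", "H"))
    else:
        order = sorted(counts)
    return "".join(k + (str(counts[k]) if counts[k] > 1 else "")
                   for k in order if counts[k] > 0)


if __name__ == "__main__":
    for smiles in ["CC", "O=C=O", "C#N", "CCN(CC)CC"]:
        print(smiles, molecular_formula(smiles))
```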

Here are some links to tutorials on how to use SMILES:

Here are some examples of SMILES notation using ChemSketch:



Example of a complex SMILES string using ChemSketch:



HAVE FUN LEARNING SMILES :)