Center for Bioinformatics at CBClickbank.com

Bioinformatics Glossary

 

( A-C )   ( D-H )   ( I-M )   ( N-R )   ( S-T )   ( U-Z )

 

S

 

Secondary structure (protein)

The local organization of the peptide backbone of a protein that results from hydrogen bonding, e.g. the alpha helix or the beta-pleated sheet.

 

Selectivity

Selectivity of bioinformatics similarity search algorithms is defined as the significance threshold for reporting database sequence matches. As an example, for BLAST searches, the parameter E is interpreted as the upper bound on the expected frequency of chance occurrence of a match within the context of the entire database search. E may be thought of as the number of matches one expects to observe by chance alone during the database search.
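As a rough illustration of how E relates to a score, the sketch below uses the Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the normalized bit score and m and n are the query and database lengths. The function names and the default cutoff are assumptions made for this example; real BLAST implementations use edge-corrected effective lengths, so reported values will differ.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance matches scoring at least bit_score.

    Uses E = m * n * 2**(-S') with raw (not effective) lengths,
    which is a simplification of what BLAST actually reports.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def passes_threshold(bit_score: float, query_len: int, db_len: int,
                     e_cutoff: float = 10.0) -> bool:
    """Report a match only if its E value is at or below the cutoff."""
    return expected_hits(bit_score, query_len, db_len) <= e_cutoff


if __name__ == "__main__":
    # A 1024-residue query against a 1024-residue database:
    # a 20-bit match is expected about once by chance.
    print(expected_hits(20, 1024, 1024))
    print(passes_threshold(40, 1024, 1024, e_cutoff=1e-3))
```

Lowering the cutoff makes the search more selective: fewer chance matches are reported, at the cost of possibly discarding weak but genuine ones.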

 

Sense strand

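The strand of double-stranded DNA whose sequence matches that of the RNA transcript (with T in place of U); it is also called the coding strand. The complementary antisense strand acts as the template for RNA synthesis. Typically only one gene product is produced per gene, corresponding to the sense strand only. (Some viruses have open reading frames in both the sense and the antisense strands).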

 

Sensitivity

Sensitivity of bioinformatics similarity search algorithms centers on two questions. First, how well can the method detect biologically meaningful relationships between two related sequences in the presence of mutations and sequencing errors? Second, how does the heuristic nature of the algorithm affect the probability that a matching sequence will not be detected? At the user's discretion, the speed of most similarity search programs can be sacrificed in exchange for greater sensitivity, with an emphasis on detecting lower-scoring matches.

 

Sequence Tagged Site (STS)

A unique sequence from a known chromosomal location that can be amplified by PCR. STSs act as physical markers for genomic mapping and cloning.

 

Sexual PCR (Molecular Diversity)

Sexual PCR is a form of PCR in which similar, but not identical, DNA sequences are reassembled to obtain novel juxtapositions, simulating the result of genetic recombination. The result is an array of related genes, some of which may possess improved characteristics. By repeated rounds of recombination, selection and PCR-based amplification, vastly improved gene products, such as enzymes with greater activity, may be generated and selected.

 

Shotgun cloning

The cloning of an entire gene segment or genome by generating a random set of fragments using restriction endonucleases to create a gene library that can be subsequently mapped and sequenced to reconstruct the entire genome.

 

Similarity (homology) search

Given a newly sequenced gene, there are two main approaches to the prediction of structure and function from the amino acid sequence. Homology methods are the most powerful and are based on the detection of significant extended sequence similarity to a protein of known structure, or of a sequence pattern characteristic of a protein family. Statistical methods are less successful but more general and are based on the derivation of structural preference values for single residues, pairs of residues, short oligopeptides or short sequence patterns. The transfer of structure/function information to a potentially homologous protein is straightforward when the sequence similarity is high and extended in length, but the assessment of the structural significance of sequence similarity can be difficult when sequence similarity is weak or restricted to a short region.

 

Signal sequence (leader sequence)

A short sequence added to the amino-terminal end of a polypeptide chain that forms an amphipathic helix allowing the nascent polypeptide to migrate through membranes such as the endoplasmic reticulum or the cell membrane. It is cleaved from the polypeptide after the protein has crossed the membrane.

 

Single nucleotide polymorphisms (SNPs)

Variations of single base pairs scattered throughout the human genome that serve as measures of the genetic diversity in humans. Early estimates put the number of SNPs in the human genome at about 1 million; many millions have since been catalogued. SNPs are useful markers for gene mapping studies.

 

Single-pass sequencing

Rapid sequencing of large segments of the genome of an organism by isolating as many expressed (cDNA) sequences as possible and performing single sequencer runs on their 5' or 3' ends. Single-pass sequencing typically results in individual, error-prone sequencing reads of 400-700 bases, depending on the type of sequencer used. However, if many of these are generated from numerous clones from different tissues, they may be overlapped and assembled to remove the errors and generate a contiguous sequence for the entire expressed gene.
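The error-removal step can be pictured as a per-column majority vote over overlapping reads. The sketch below assumes the reads have already been placed at known offsets (the hard part, overlap detection, is not shown); the `consensus` function and the toy reads are invented for illustration.

```python
from collections import Counter


def consensus(reads):
    """Majority-vote consensus from (offset, sequence) pairs.

    Each read is assumed to be already aligned without gaps at the
    given offset. Positions covered by no read are written as 'N'.
    Ties are resolved in favour of the base seen first.
    """
    length = max(offset + len(seq) for offset, seq in reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in reads:
        for i, base in enumerate(seq.upper()):
            columns[offset + i][base] += 1
    return "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )


if __name__ == "__main__":
    reads = [
        (0, "ACGTAC"),
        (2, "GTACGG"),
        (0, "ACGAAC"),  # contains one sequencing error at position 3
    ]
    print(consensus(reads))
```

With three reads covering position 3, the single erroneous base is outvoted, which is the sense in which overlapping many error-prone reads removes errors.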

 

Site

Sites in sequences can be located either in DNA (e.g. binding sites, cleavage sites) or in proteins. In order to identify a site in DNA, ambiguity symbols are used to allow several different symbols at one position. Proteins, however, need a different mechanism (see Pattern). Restriction enzyme cleavage sites, for instance, have the following properties: limited length (typically, less than 20 base pairs); definition of the cleavage site and its appearance (3', 5' overhang or blunt); definition of the binding site.
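One way to see how ambiguity symbols work is to expand each IUPAC nucleotide code into a regular-expression character class and scan a sequence with it. This is a minimal sketch; the helper names are assumptions, and the recognition sequence GTYRAC (the HincII site) is used only as an example.

```python
import re

# IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def site_to_regex(site: str) -> str:
    """Translate a site written with ambiguity symbols into a regex."""
    return "".join(IUPAC[symbol] for symbol in site.upper())


def find_sites(site: str, dna: str):
    """Return (position, matched sequence) for every occurrence,
    including overlapping ones (hence the lookahead)."""
    pattern = re.compile("(?=(" + site_to_regex(site) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    print(find_sites("GTYRAC", "AAGTCGACTTGTTAACGG"))
```

Only the forward strand is searched here; a site that is not palindromic would also need to be searched on the reverse complement.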

 

Southern blotting

A procedure for the identification of DNA by transferring fragments separated on an agarose gel to a nitrocellulose filter, where they can be hybridized with a complementary "probe" sequence.

Splice site

The sequence found at the 5' and 3' ends of an intron, at the exon/intron boundaries, usually defined by a consensus sequence:

5' (C/A)AG|GT(A/G)AGT---------(T/C)N(C/T)AG|G 3'

N represents any nucleotide; parentheses enclose alternative nucleotides at the indicated positions; | marks an exon/intron boundary, so the intron runs from the GT to the AG.

 

Splice form

By using alternative splicing, a single message precursor from DNA can generate an entire family of mRNAs and proteins. This can be utilized to create specificity in cell-cell or cell-ligand interactions. A cell may produce a given protein, but it will be a different splice form of the protein than that produced by an adjacent cell. In this manner, the two cells have the potential to interact differently with other cells or molecules. Two places where this has been extremely important are the production of cell-surface specificity proteins in the immune and nervous systems.

 

Splicing

The joining together of separate DNA or RNA component parts. For example, RNA splicing in eukaryotes involves the removal of introns and the stitching together of the exons from the pre-mRNA transcript before maturation.

 

Solvent accessibility

The surface area (typically measured in square angstroms) of a biological molecule, usually a protein, that is exposed to solvent in its native, folded form. Determining the solvent accessibility of a protein helps define which amino acids in its molecular sequence are on the exterior of the molecule, and thus available to participate in interactions with other molecules.

 

Structural gene

A gene that encodes a protein or RNA product other than a regulatory factor, such as a structural protein or an enzyme (cf. Regulatory gene).

 

Structure prediction

Algorithms that predict the secondary, tertiary and sometimes even quaternary structure of proteins from their sequences. Determining protein structure from sequence has been dubbed "the second half of the Genetic Code" since it is the folded tertiary structure of a protein that governs how it functions as a gene product. As yet most structure prediction methods are only partially successful, and typically work best for certain well-defined classes of proteins.

 

Substitution matrix

A table of scores for aligning each pair of amino acids, derived from a model of protein evolution at the sequence level. Widely used families are frequently called Dayhoff, MDM (Mutation Data Matrix), PAM (Percent Accepted Mutation) or BLOSUM matrices. The Dayhoff-type (MDM, PAM) matrices are derived from global alignments of closely related sequences, and matrices for greater evolutionary distances are extrapolated from those for lesser ones. BLOSUM matrices are instead derived directly from ungapped blocks of aligned sequences at different levels of similarity.
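The extrapolation step for Dayhoff-type matrices can be sketched as repeated multiplication of a one-step mutation probability matrix, followed by conversion to log-odds scores. The example below uses a made-up two-letter alphabet and made-up probabilities purely to keep the arithmetic visible; it is not real PAM data, and the function names are assumptions.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def extrapolate(step_matrix, steps):
    """Raise a one-step mutation probability matrix to the given power.

    step_matrix[i][j] is the probability that letter i is replaced by
    letter j over one unit of evolutionary distance.
    """
    n = len(step_matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, step_matrix)
    return result


def log_odds(prob_matrix, background):
    """Score = 10 * log10(P(i -> j) / f_j), the Dayhoff-style scaling."""
    return [[10.0 * math.log10(p / background[j]) for j, p in enumerate(row)]
            for row in prob_matrix]


if __name__ == "__main__":
    toy_step = [[0.99, 0.01],   # hypothetical 2-letter alphabet
                [0.01, 0.99]]
    toy_background = [0.5, 0.5]
    for distance in (1, 50, 250):
        scores = log_odds(extrapolate(toy_step, distance), toy_background)
        print(distance, [[round(s, 2) for s in row] for row in scores])
```

As the distance grows, the probabilities drift toward the background frequencies and the scores shrink toward zero, which is why distant-relationship matrices are more tolerant of substitutions.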

 

Subtraction library

A cDNA library that contains only cDNAs uniquely expressed in a given cell or tissue. For example, T cells and B cells will express many common RNAs, as well as a very small percentage that are unique to T cells or to B cells respectively. To make a T cell subtraction library, the cDNA from a T cell library is hybridized with a vast excess of B cell RNA. The commonly expressed genes will result in RNA-cDNA hybrids which can be removed (or subtracted) to leave only T cell specific cDNAs.

 

 

T

 

Tentative Consensus (TC)

A sequence derived from an EST cluster that represents part or all of a complete gene. TCs are usually determined by clustering ESTs while allowing for sequencing errors, artefacts such as chimeric clones, and naturally occurring biological phenomena such as alternative splicing. Creation of a cluster allows one to generate a consensus sequence and then identify a long open reading frame, which would suggest the possibility of that consensus representing a bona fide gene.
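The last step, looking for a long open reading frame in a consensus, can be sketched as follows. This is a simplified illustration that scans only the forward strand and only considers ORFs that begin with ATG and end at an in-frame stop codon; the function name and the test sequence are invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> str:
    """Return the longest ATG...stop open reading frame on the forward
    strand, including the stop codon, or '' if none is found."""
    seq = seq.upper()
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start until an in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orf = seq[start:pos + 3]
                if len(orf) > len(best):
                    best = orf
                break
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATTTTAGGATGTAA"))
```

A practical pipeline would also scan the reverse complement and tolerate ORFs truncated at the ends of a partial consensus.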

 

Tentative Human Consensus sequences (THCs)

A consensus sequence generated from human EST fragments. THCs may be validated by comparison against databases of known human gene sequences, human genomic sequences, or by identification of the ORFs or other sequence features contained within the consensus as belonging to a known human gene product.

 

Tertiary structure

Folding of a protein chain via interactions of its side chains, including formation of disulphide bonds between cysteine residues.

 

Thymine

A pyrimidine base found in DNA; in RNA it is generally replaced by uracil.

 

Tissue

Section of an organ that consists of a largely homogeneous population of cell types. Since many organs are multifunctional, they have developed highly specialized cell types to perform different functions. Identifying the section of an organ that is homogeneous for a particular cell type ensures that the gene expression profiles extracted from those cells will accurately represent the class of cells that make up the tissue.

 

Transcript

The single-stranded mRNA chain that is assembled from a gene template.

 

Transcription

The assembly of complementary single-stranded RNA on a DNA template.
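Complementarity to the template can be shown in a few lines. The sketch below assumes the template strand is supplied 5' to 3', so the RNA (also written 5' to 3') is its reverse complement with U in place of T; the function name is an assumption for this example.

```python
# DNA template base -> complementary RNA base.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the RNA (5' to 3') complementary to a DNA template strand
    that is itself written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template_5to3.upper()))


if __name__ == "__main__":
    print(transcribe("TTAGCCCAT"))
```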

 

Transcription factors

A group of regulatory proteins that are required for transcription in eukaryotes. Transcription factors bind to the promoter region of a gene and facilitate transcription by RNA polymerase.

 

Transfer RNA (tRNA)

A small RNA molecule that recognizes a specific amino acid, transports it to a specific codon in the mRNA, and positions it properly in the nascent polypeptide chain.

 

Transformation

A genetic alteration to a cell as a result of the incorporation of DNA from a genetically different cell or virus; can also refer to the introduction of DNA into bacterial cells for genetic manipulation.

 

Transgene

A foreign gene that is introduced into a cell or whole organism (e.g. transgenic mice) for therapeutic or experimental purposes.

 

Translation

The process of converting RNA to protein by the assembly of a polypeptide chain from an mRNA molecule at the ribosome.
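The codon-by-codon reading of an mRNA can be sketched with the standard genetic code. The example below starts at the first AUG and stops at the first in-frame stop codon; it is a simplification (no alternative start codons or non-standard codes), and the function name is an assumption.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon.

    Returns one-letter amino acid codes; '' if there is no AUG.
    """
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGGCUUUUAAGG"))
```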

 

Transmembrane region

The region of a transmembrane protein that actually spans the membrane. Transmembrane regions are usually hydrophobic in order to be thermodynamically compatible with the lipid bilayer portion of the membrane. They may consist of either alpha-helical or beta-strand secondary structure elements, but in either case the external residues (the ones facing the membrane) are predominantly hydrophobic, while the internal residues may be polar or hydrophilic (as in the case of a pore or channel). One common transmembrane structural domain is the seven-helix bundle seen in G-protein-coupled receptors.
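Because membrane-spanning helices are hydrophobic, a classic way to look for them is a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, the values commonly quoted for that method; the function names and test sequence are invented, and modern predictors are considerably more elaborate.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(protein: str, window: int = 19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(protein: str, window: int = 19, cutoff: float = 1.6):
    """Start positions of windows hydrophobic enough to span a membrane."""
    return [i for i, mean in enumerate(window_means(protein, window))
            if mean >= cutoff]


if __name__ == "__main__":
    toy = "DDDD" + "L" * 19 + "KKKK"  # hypothetical sequence
    print(candidate_tm_windows(toy))
```

This approach suits alpha-helical segments; beta-strand transmembrane regions alternate hydrophobic and polar residues and are usually missed by a simple window average.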

 
