In molecular biology, a transcription factor (sometimes called a
sequence-specific DNA binding factor) is a protein that binds to
specific DNA sequences using DNA binding domains and is part of the
system that controls the transfer (or transcription) of genetic
information from DNA to RNA.[1][2] Transcription factors perform this
function alone or in a complex with other proteins, by promoting (as
an activator) or blocking (as a repressor) the recruitment of RNA
polymerase, the enzyme that transcribes genetic information from DNA
into RNA.[3][4][5]
Transcription factor glossary
- transcription - copying of DNA by RNA polymerase into messenger RNA
- factor - a substance, such as a protein, that contributes to the
  cause of a specific biochemical reaction or bodily process
- transcriptional regulation - controlling the rate of gene
  transcription, for example by helping or hindering RNA polymerase
  binding to DNA
- upregulation, activation, or promotion - increase the rate of gene
  transcription
- downregulation, repression, or suppression - decrease the rate of
  gene transcription
- coactivator - a protein which works with transcription factors to
  increase the rate of gene transcription
- corepressor - a protein which works with transcription factors to
  decrease the rate of gene transcription
Biological roles
Transcription factors are one of the groups of proteins that
read and interpret the genetic "blueprint" in the DNA. They bind
DNA and help initiate a program of increased or decreased gene
transcription. As such, they are vital for many important
cellular processes. Below are some of the important functions
and biological roles transcription factors are involved in:
- Response to intercellular signals: Cells can communicate with each
  other by releasing molecules that produce signaling cascades within
  another receptive cell. If the signal requires upregulation or
  downregulation of genes in the recipient cell, transcription
  factors will often be downstream in the signaling cascade. Estrogen
  signaling is an example of a fairly short signaling cascade that
  involves the estrogen receptor transcription factor: estrogen is
  secreted by tissues such as the ovaries and placenta, crosses the
  cell membrane of the recipient cell, and is bound by the estrogen
  receptor in the cell's cytoplasm. The estrogen receptor then moves
  to the cell's nucleus and binds to its DNA binding sites, changing
  the transcriptional regulation of the associated genes.
- Response to environment: Not only do transcription factors act
  downstream of signaling cascades related to biological stimuli, but
  they can also be downstream of signaling cascades involved in
  environmental stimuli. Examples include heat shock factor (HSF),
  which upregulates genes necessary for survival at higher
  temperatures, hypoxia inducible factor (HIF), which upregulates
  genes necessary for cell survival in low oxygen environments, and
  sterol regulatory element binding protein (SREBP), which helps
  maintain proper lipid levels in the cell.
- Cell cycle control: Many transcription factors, especially some
  that are oncogenes or tumor suppressors, help regulate the cell
  cycle and as such determine how large a cell will get and when it
  can divide into two daughter cells. One example is the Myc
  oncogene, which has important roles in cell growth and apoptosis.
Regulation of transcription factor activity
It is common in biology for important processes to have multiple
layers of regulation and control. This is just as true of
transcription factors: not only do they regulate the rate of
transcription, and thereby the amounts of gene products (RNA and
protein) available to the cell, but they are themselves regulated.
Below is a brief synopsis of some of the ways that the activity of
transcription factors can be regulated:
- Transcription factor synthesis: Transcription factors (like all
  proteins) are transcribed from a gene on a chromosome into RNA, and
  then the RNA is translated into protein. Any of these steps can be
  regulated to affect the production (and thus activity) of a
  transcription factor. One interesting implication of this is that
  transcription factors can regulate themselves. For example, in a
  negative feedback loop, the transcription factor acts as its own
  repressor: if the transcription factor protein binds the DNA of its
  own gene, it will down-regulate the production of more of itself.
  This is one mechanism to maintain low levels of a transcription
  factor in a cell.
- Localization to the nucleus: In eukaryotes, transcription factors
  (like most proteins) are transcribed in the nucleus but are then
  translated in the cell's cytoplasm. Many proteins that are active
  in the nucleus contain nuclear localization signals that direct
  them to the nucleus. For many transcription factors, however,
  import into the nucleus is a key point in their regulation.
  Important classes of transcription factors, such as some nuclear
  receptors, must first bind a ligand while in the cytoplasm before
  they can relocate to the nucleus.
- Activation via chemical modifications or ligand binding: Not only
  is ligand binding able to influence where a transcription factor is
  located within a cell, but it can also affect whether the
  transcription factor is in an active state and capable of binding
  DNA or other cofactors. Another way that a transcription factor can
  be activated is by chemical modification of the transcription
  factor itself. For example, many transcription factors such as STAT
  proteins must be phosphorylated before they can bind DNA.
- Accessibility of DNA binding site: In eukaryotes, genes that are
  not being actively transcribed are often located in
  heterochromatin. Heterochromatin consists of regions of chromosomes
  that are heavily compacted by tightly bundling the DNA onto
  histones and then organizing the histones into compact chromatin
  fibers. DNA within heterochromatin is inaccessible to many
  transcription factors. For the transcription factor to bind to its
  DNA binding site, the heterochromatin must first be converted to
  euchromatin, usually via histone modifications. A transcription
  factor's DNA binding site may also be inaccessible if the site is
  already occupied by another transcription factor. Pairs of
  transcription factors can play antagonistic roles (activator versus
  repressor) in the regulation of the same gene.
- Availability of other cofactors/transcription factors needed for a
  complex: Most transcription factors don't work alone. Often, for
  gene transcription to occur, a number of transcription factors must
  bind to DNA regulatory sequences. This collection of transcription
  factors in turn recruits intermediary proteins such as cofactors
  that allow efficient recruitment of the preinitiation complex and
  RNA polymerase. Thus, for a single transcription factor to initiate
  transcription, all of these other proteins must also be present and
  the transcription factor must be in a state where it can bind to
  them if necessary.
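The negative feedback loop mentioned under transcription factor
synthesis can be illustrated with a toy simulation. The sketch below is
not taken from the cited literature: the Hill-type repression term, the
function name simulate_tf, and every parameter value (beta, K, n,
gamma) are assumptions chosen only to show the qualitative effect.

```python
def simulate_tf(beta=10.0, K=1.0, n=2, gamma=1.0,
                autorepress=True, dt=0.01, steps=5000):
    """Euler integration of dx/dt = production(x) - gamma * x.

    x      : concentration of the transcription factor (arbitrary units)
    beta   : maximal production rate (assumed value)
    K, n   : repression threshold and Hill coefficient (assumed values)
    gamma  : first-order degradation/dilution rate (assumed value)
    """
    x = 0.0
    for _ in range(steps):
        if autorepress:
            # The factor represses its own gene: production falls as x rises.
            production = beta / (1.0 + (x / K) ** n)
        else:
            # No feedback: production is constant.
            production = beta
        x += dt * (production - gamma * x)
    return x


unregulated = simulate_tf(autorepress=False)
autorepressed = simulate_tf(autorepress=True)
print(f"no feedback:       {unregulated:.2f}")
print(f"negative feedback: {autorepressed:.2f}")
```

With these assumed parameters the unregulated factor settles near
beta/gamma, while the self-repressing factor settles at a much lower
level, which is the behaviour the text describes.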
Structure
Schematic diagram of the amino acid sequence (amino terminus to the
left and carboxylic acid terminus to the right) of a prototypical
transcription factor which contains (1) a DNA-binding domain (DBD),
(2) a signal sensing domain (SSD), and (3) a transactivation domain
(TAD). The order of placement and the number of domains may differ in
various types of transcription factors. In addition, the
transactivation and signal sensing functions are frequently contained
within the same domain.
Transcription factors are modular in structure and contain the
following domains:[1]
- DNA-binding domain (DBD), which attaches to specific sequences of
  DNA (enhancer or promoter sequences) adjacent to regulated genes.
  DNA sequences which bind transcription factors are often referred
  to as response elements.
- Trans-activating domain (TAD), which contains binding sites for
  other proteins such as transcription coregulators. These binding
  sites are frequently referred to as activation functions (AFs).[6]
- An optional signal sensing domain (SSD) (e.g., a ligand binding
  domain), which senses external signals and in response transmits
  these signals to the rest of the transcription complex, resulting
  in up- or down-regulation of gene expression. Alternatively, the
  DBD and signal sensing domains may reside on separate proteins that
  associate within the transcription complex to regulate gene
  expression.
DNA binding domains
The portion (domain) of the transcription factor that binds DNA is
called its DNA binding domain. Below is a partial list of some of the
major families of DNA-binding domains/transcription factors:
Other proteins that play crucial roles in the regulation of
transcription (for example coactivators, chromatin remodelers,
histone acetylases, deacetylases, kinases, and methylases) are not
classified as transcription factors because they lack DNA binding
domains.[7]
Transcription factor binding sites/response elements
The DNA sequence that a transcription factor binds to is called a
transcription factor binding site or response element. Chemically,
transcription factors usually interact with their binding sites using
a combination of hydrogen bonds and van der Waals forces. Due to the
nature of these chemical interactions, most transcription factors
bind DNA in a sequence-specific manner. However, not all bases in the
transcription factor binding site may actually interact with the
transcription factor. In addition, some of these interactions may be
weaker than others. Thus, transcription factors don't bind just one
sequence but are capable of binding a subset of closely related
sequences, each with a different strength of interaction.
For example, although the consensus binding site for the TATA binding
protein (TBP) is:
TATAAAA
the TBP transcription factor can also bind similar sequences such as:
TATATAT or TATATAA
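The idea of a consensus site with tolerated variants can be sketched as
a simple mismatch scan. This is an illustration only: counting
mismatches is a crude stand-in for binding strength, the helper names
hamming and find_sites are hypothetical, and the default tolerance of
two mismatches is an assumption chosen so that both variants quoted
above are accepted. Only the given strand is scanned.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(sequence, consensus="TATAAAA", max_mismatches=2):
    """Return (position, window, mismatches) for every window of the
    sequence that differs from the consensus at no more than
    max_mismatches positions (0-based positions, forward strand only)."""
    k = len(consensus)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        d = hamming(window, consensus)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


# The two variants quoted in the text, compared with the consensus.
for variant in ("TATATAT", "TATATAA"):
    print(variant, hamming(variant, "TATAAAA"))

# A made-up test sequence containing one exact consensus match.
print(find_sites("GGCTATAAAAGGC"))
```

A real binding-site model would weight each position differently (for
example with a position weight matrix), since the text notes that some
contacts are weaker than others.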
Because transcription factors can bind a set of related sequences,
and these sequences tend to be short, potential transcription factor
binding sites can occur by chance in any sufficiently long DNA
sequence. It is unlikely, however, that a transcription factor binds
all compatible sequences in the genome of the cell. Other
constraints, such as DNA accessibility in the cell or availability of
cofactors, may also help dictate where a transcription factor will
actually bind. Thus, even given the genome sequence, it is still
difficult to predict where a transcription factor will actually bind
in a living cell.
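The claim that short sites arise by chance can be made concrete with a
back-of-the-envelope estimate. The sketch below assumes a random
sequence in which the four bases are equally likely and independent,
which real genomes are not; the function name expected_chance_sites and
the genome length used in the example (about three billion bases, one
strand) are illustrative assumptions.

```python
from math import comb


def expected_chance_sites(genome_length, k, max_mismatches=0):
    """Expected number of windows of length k that match a given site
    with at most max_mismatches substitutions, in a random sequence
    with equal, independent base frequencies (a simplifying assumption)."""
    # Number of distinct k-mers within the allowed mismatch distance.
    n_variants = sum(comb(k, j) * 3 ** j for j in range(max_mismatches + 1))
    # Probability that a random window is one of those k-mers.
    p = n_variants / 4 ** k
    return (genome_length - k + 1) * p


# A 7-base site in roughly 3e9 bases (assumed genome length, one strand).
print(expected_chance_sites(3_000_000_000, 7))
print(expected_chance_sites(3_000_000_000, 7, max_mismatches=1))
```

Under these assumptions a 7-base site is expected to occur by chance
many thousands of times, far more often than there are functional
sites, which is why sequence alone is a weak predictor of binding.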
Classes
Mechanistic
There are three mechanistic classes of transcription factors:
- General transcription factors are involved in the formation of a
  preinitiation complex. The most common are abbreviated as TFIIA,
  TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. They are ubiquitous and
  interact with the core promoter region surrounding the
  transcription start site(s) of all class II genes.[8]
- Upstream transcription factors are proteins that bind somewhere
  upstream of the initiation site to stimulate or repress
  transcription.
- Inducible transcription factors are similar to upstream
  transcription factors but require activation or inhibition.
Functional
Alternatively transcription factors have been classified
according to their regulatory function:[7]
- I. constitutively active - present in all cells at all times -
  general transcription factors, Sp1, NF1, CCAAT
- II. conditionally active - requires activation
  - II.A developmental (cell specific) - expression is tightly
    controlled, but once expressed requires no additional activation
    - GATA, HNF, PIT-1, MyoD, Myf5, Hox, Winged Helix
  - II.B signal dependent - requires external signal for activation
    - II.B.1 extracellular ligand dependent - nuclear receptors
    - II.B.2 intracellular ligand dependent - activated by small
      intracellular molecules - SREBP, p53, orphan nuclear receptors
    - II.B.3 cell membrane receptor dependent - second messenger
      signaling cascades resulting in the phosphorylation of the
      transcription factor
      - II.B.3.a resident nuclear factors - reside in the nucleus
        regardless of activation state - CREB, AP-1, Mef2
      - II.B.3.b latent cytoplasmic factors - the inactive form
        resides in the cytoplasm, but when activated is translocated
        into the nucleus - STAT, R-SMAD, NF-kB, Notch, TUBBY, NFAT
Roles and Conservation in Different Organisms
Transcription factors are essential for the regulation of gene
expression and consequently are found in all living organisms. The
number of transcription factors found within an organism increases
with genome size, and larger genomes tend to have more transcription
factors per gene.[9]
There are approximately 2600 proteins in the human genome that
contain DNA-binding domains, and most of these are presumed to
function as transcription factors.[10] Therefore approximately 10% of
genes in the genome code for transcription factors, which makes this
family the single largest family of human proteins. Furthermore,
genes are often flanked by several binding sites for distinct
transcription factors, and efficient expression of each of these
genes requires the cooperative action of several different
transcription factors (see for example hepatocyte nuclear factors).
Hence the combinatorial use of a subset of the approximately 2000
human transcription factors easily accounts for the unique regulation
of each gene in the human genome during development.[7]
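The combinatorial argument above can be checked with simple counting.
The sketch below takes the figure of roughly 2000 factors from the text
and counts unordered sets of factors; it ignores binding-site order,
spacing, and cell type, so it is only a rough illustration. The name
combinations_up_to is hypothetical.

```python
from math import comb

N_FACTORS = 2000  # approximate figure quoted in the text


def combinations_up_to(n_factors, max_size):
    """Map each set size k (1..max_size) to the number of distinct
    unordered sets of k factors that can be drawn from n_factors."""
    return {k: comb(n_factors, k) for k in range(1, max_size + 1)}


counts = combinations_up_to(N_FACTORS, 3)
for size, n_sets in counts.items():
    print(f"sets of {size} factor(s): {n_sets:,}")
```

Even pairs of factors already give about two million distinct
combinations, well above the gene count implied by the text's own
figures (2600 factors being about 10% of genes).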
Transcription factors and human disease
Due to their important roles in development, intercellular signaling,
and cell cycle, some human diseases have been associated with
mutations in transcription factors. Below are a few of the more
well-studied examples:
- Rett syndrome: Mutations in the MECP2 transcription factor are
  associated with Rett syndrome, a neurodevelopmental disorder.
- Developmental verbal dyspraxia: Mutations in the FOXP2
  transcription factor are associated with developmental verbal
  dyspraxia, a disease in which individuals are unable to produce the
  finely coordinated movements required for speech.
- Cancer: Many transcription factors are tumor suppressors or
  oncogenes, and thus mutations or aberrant regulation of them are
  associated with cancer. For example, Li-Fraumeni syndrome is caused
  by mutations in the tumor suppressor p53.
Classification of Transcription Factors
Transcription factors are often classified based on the
similarity of their DNA binding domains:[11][12][13]
- 1 Superclass: Basic Domains (Basic-helix-loop-helix)
  - 1.1 Class: Leucine zipper factors (bZIP)
    - 1.1.1 Family: AP-1(-like) components; includes c-Fos/c-Jun
    - 1.1.2 Family: CREB
    - 1.1.3 Family: C/EBP-like factors
    - 1.1.4 Family: bZIP / PAR
    - 1.1.5 Family: Plant G-box binding factors
    - 1.1.6 Family: ZIP only
  - 1.2 Class: Helix-loop-helix factors (bHLH)
    - 1.2.1 Family: Ubiquitous (class A) factors
    - 1.2.2 Family: Myogenic transcription factors (MyoD)
    - 1.2.3 Family: Achaete-Scute
    - 1.2.4 Family: Tal/Twist/Atonal/Hen
  - 1.3 Class: Helix-loop-helix / leucine zipper factors (bHLH-ZIP)
    - 1.3.1 Family: Ubiquitous bHLH-ZIP factors; includes USF (USF1,
      USF2); SREBP (SREBP)
    - 1.3.2 Family: Cell-cycle controlling factors; includes c-Myc
  - 1.4 Class: NF-1
    - 1.4.1 Family: NF-1 (NFIC)
  - 1.5 Class: RF-X
  - 1.6 Class: bHSH
- 2 Superclass: Zinc-coordinating DNA-binding domains
  - 2.1 Class: Cys4 zinc finger of nuclear receptor type
  - 2.2 Class: diverse Cys4 zinc fingers
  - 2.3 Class: Cys2His2 zinc finger domain
    - 2.3.1 Family: Ubiquitous factors; includes TFIIIA, Sp1
    - 2.3.2 Family: Developmental / cell cycle regulators; includes
      Krüppel
    - 2.3.4 Family: Large factors with NF-6B-like binding properties
  - 2.4 Class: Cys6 cysteine-zinc cluster
  - 2.5 Class: Zinc fingers of alternating composition
- 3 Superclass: Helix-turn-helix
  - 3.1 Class: Homeo domain
    - 3.1.1 Family: Homeo domain only; includes Ubx
    - 3.1.2 Family: POU domain factors; includes Oct
    - 3.1.3 Family: Homeo domain with LIM region
    - 3.1.4 Family: Homeo domain plus zinc finger motifs
  - 3.2 Class: Paired box
    - 3.2.1 Family: Paired plus homeo domain
    - 3.2.2 Family: Paired domain only
  - 3.3 Class: Fork head / winged helix
    - 3.3.1 Family: Developmental regulators; includes forkhead
    - 3.3.2 Family: Tissue-specific regulators
    - 3.3.3 Family: Cell-cycle controlling factors
    - 3.3.0 Family: Other regulators
  - 3.4 Class: Heat Shock Factors
  - 3.5 Class: Tryptophan clusters
  - 3.6 Class: TEA domain
- 4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
  - 4.1 Class: RHR (Rel homology region)
  - 4.2 Class: STAT
  - 4.3 Class: p53
  - 4.4 Class: MADS box
    - 4.4.1 Family: Regulators of differentiation; includes Mef2
    - 4.4.2 Family: Responders to external signals; includes SRF
      (serum response factor)
  - 4.5 Class: beta-Barrel alpha-helix transcription factors
  - 4.6 Class: TATA binding proteins
    - 4.6.1 Family: TBP
  - 4.7 Class: HMG
    - 4.7.1 Family: SOX genes, SRY
    - 4.7.2 Family: TCF-1 (TCF1)
    - 4.7.3 Family: HMG2-related, SSRP1 (SSRP1)
    - 4.7.5 Family: MATA
  - 4.8 Class: Heteromeric CCAAT factors
    - 4.8.1 Family: Heteromeric CCAAT factors
  - 4.9 Class: Grainyhead
  - 4.10 Class: Cold-shock domain factors
  - 4.11 Class: Runt
- 0 Superclass: Other Transcription Factors
  - 0.1 Class: Copper fist proteins
  - 0.2 Class: HMGI(Y) (HMGA1)
  - 0.3 Class: Pocket domain
  - 0.4 Class: E1A-like factors
  - 0.5 Class: AP-2/EREBP-related factors
References
- [1] Latchman DS (1997). "Transcription factors: an overview". Int.
  J. Biochem. Cell Biol. 29 (12): 1305-12.
  DOI:10.1016/S1357-2725(97)00085-X. PMID 9570129.
- [2] Karin M (1990). "Too many transcription factors: positive and
  negative interactions". New Biol. 2 (2): 126-31. PMID 2128034.
- [3] Roeder RG (1996). "The role of general initiation factors in
  transcription by RNA polymerase II". Trends Biochem. Sci. 21 (9):
  327-35. DOI:10.1016/0968-0004(96)10050-5. PMID 8870495.
- [4] Nikolov DB, Burley SK (1997). "RNA polymerase II transcription
  initiation: a structural view". Proc. Natl. Acad. Sci. U.S.A. 94
  (1): 15-22. DOI:10.1073/pnas.94.1.15. PMID 8990153.
- [5] Lee TI, Young RA (2000). "Transcription of eukaryotic
  protein-coding genes". Annu. Rev. Genet. 34: 77-137.
  DOI:10.1146/annurev.genet.34.1.77. PMID 11092823.
- [6] Wärnmark A, Treuter E, Wright AP, Gustafsson J-Å (2003).
  "Activation functions 1 and 2 of nuclear receptors: molecular
  strategies for transcriptional activation". Mol. Endocrinol. 17
  (10): 1901-9. DOI:10.1210/me.2002-0384. PMID 12893880.
- [7] Brivanlou AH, Darnell JE (2002). "Signal transduction and the
  control of gene expression". Science 295 (5556): 813-8.
  DOI:10.1126/science.1066355. PMID 11823631.
- [8] Orphanides G, Lagrange T, Reinberg D (1996). "The general
  transcription factors of RNA polymerase II". Genes Dev. 10 (21):
  2657-83. DOI:10.1101/gad.10.21.2657. PMID 8946909.
- [9] van Nimwegen E (2003). "Scaling laws in the functional content
  of genomes". Trends Genet. 19 (9): 479-84.
  DOI:10.1016/S0168-9525(03)00203-8. PMID 12957540.
- [10] Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA
  (2004). "Structure and evolution of transcriptional regulatory
  networks". Curr. Opin. Struct. Biol. 14 (3): 283-91.
  DOI:10.1016/j.sbi.2004.05.004. PMID 15193307.
- [11] Stegmaier P, Kel AE, Wingender E (2004). "Systematic
  DNA-binding domain classification of transcription factors". Genome
  informatics. International Conference on Genome Informatics 15 (2):
  276-86. PMID 15706513.
- [12] Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S,
  Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, Hornischer K, Voss
  N, Stegmaier P, Lewicki-Potapov B, Saxel H, Kel AE, Wingender E
  (2006). "TRANSFAC® and its module TRANSCompel®: transcriptional
  gene regulation in eukaryotes". Nucleic Acids Res. 34 (Database
  issue): D108-10. DOI:10.1093/nar/gkj143. PMID 16381825.
- [13] TRANSFAC® database. Retrieved on 2007-08-05.
- [14] Singer, Susan R.; Gilbert, Scott F. (2006). Developmental
  Biology. Sunderland, Mass: Sinauer Associates. ISBN 0-87893-250-X.
Transcription factors and intracellular receptors
- General transcription factors: TFIIA, TFIIB, TFIID, TFIIE, TFIIF,
  TFIIH
- Basic-helix-loop-helix: AhR, BMAL-CLOCK, E2F, HIF, Myc, Pax (PAX3,
  PAX6), Twist, Myogenic regulatory factors (MyoD, Myogenin, MYF5,
  MYF6)
- Basic leucine zipper: C/EBP, CREB, AP-1 (c-Fos, c-Jun), Activating
  transcription factor
- Basic helix-loop-helix leucine zipper: MITF, SREBP
- Nuclear receptors:
  - subfamily 1: Thyroid hormone, CAR, FXR, LXR, PPAR (α, γ), PXR,
    RAR, ROR, Rev-ErbA, VDR
  - subfamily 2: COUP-TF, Ear-2, HNF4, PNR, RXR (α), Testicular
    receptor, TLX
  - subfamily 3: Steroid hormone (Estrogen, Estrogen related,
    Glucocorticoid, Mineralocorticoid, Progesterone, Androgen)
  - subfamily 4: NUR (NGFIB, NURR1, NOR1)
  - subfamily 5: SF1, LRH-1
  - subfamily 6: GCNF
  - subfamily 0: DAX1, SHP
- Winged-helix transcription factors: FOX proteins (FOXP1, FOXP2,
  FOXP3)
- Zinc finger/protein: Gli1, Gli2, Gli3, KLF (Sp1), Zbtb7, Zif268
- Other families: CAP, CBF (RUNX1, RUNX2), GATA (GATA1), NANOG, NF-kB
  (NFKB1, NFKB2, RELA, RELB), Rho/Sigma, R-SMAD, Sox2, POU domain
  (PIT-1, BRN-3, Octamer transcription factor: 2, 4), STAT (1, 2, 3,
  4, 5, 6)