What is the biological significance of finding palindromes in DNA sequence?

I found a function called palindromes in MATLAB that finds palindromes in a DNA sequence. What is the biological purpose of this function? What is the biological significance of finding palindromes in DNA sequences?

My knowledge of biology is extremely limited, but this is what I know of palindromic sequences:

Palindromic sequences are their own reverse complements. I have seen many restriction sites be palindromic. Also, some Transcription Factor Binding sites are palindromic. The canonical E-box site, for example, can be expressed as CANNTG.

Also, these palindromes, when occurring in pairs with a sequence between the occurrences, can double back and bind to each other (when the DNA is transcribed to RNA, say), and this would result in the RNA stem and loop structure.
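The "own reverse complement" definition above can be checked mechanically. The following is a minimal Python sketch, not the MATLAB function from the question; the names `reverse_complement`, `is_palindromic` and `find_palindromes` are hypothetical, and it assumes an unambiguous A/C/G/T alphabet and a fixed window length.

```python
# Minimal sketch: a DNA palindrome is a sequence equal to its own reverse complement.
# Assumes only the letters A, C, G, T (no ambiguity codes such as N).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    seq = seq.upper()
    return seq == reverse_complement(seq)


def find_palindromes(seq: str, length: int = 6):
    """Return (start, subsequence) for every palindromic window of the given length."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        if is_palindromic(window):
            hits.append((i, window))
    return hits
```

For example, `is_palindromic("GAATTC")` holds for the EcoRI site, while an arbitrary sequence such as `GATTAC` does not pass the check.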

Palindromic sequences are important in bioinformatics because they help us extract patterns from genomic sequences. Here is an example from HIV, which is a very important application.

HIV has a nasty requirement: the virus must keep the cell alive. It does that by inserting itself into the host's genome. We know from biology that cutting and pasting of DNA works best when the sequences at either end of the cut site are identical. In fact, we usually look for palindromes, sequences that read the same backward and forward. We know the virus prefers to integrate at a palindromic sequence.

Therefore we should look at those palindromic patterns for HIV. Once we find those regions, we might do an alignment with known HIV sequences.

Please Google "restriction sites" for more information.

This information is mainly from my bioinformatics class notes. Palindromes or inverted repeats in a genome can help in:

  • Predicting regions in RNA that are self-complementary and that, therefore, have the potential of forming secondary structure.
  • Identifying binding sites of regulatory proteins involved in the regulation of transcription. This is because the binding protein sometimes has a couple of monomers that bind to both sequences (original and inverted), resulting in a stronger interaction than in cases where only one monomer binds.
  • Inverted repeats of small length are also present in some transposons (though I am not really aware of their function there).
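The first point in the list, predicting self-complementary regions, can be illustrated with a naive search for inverted repeats separated by a short spacer. This is only a sketch under stated assumptions: the function name `find_stem_loops` and its parameters are hypothetical, it works on DNA letters rather than RNA, it requires perfect complementarity, and it ignores thermodynamics entirely. Real secondary-structure prediction tools are far more elaborate.

```python
# Naive sketch: find a short "stem" followed, after a spacer ("loop"),
# by its reverse complement. Perfect matches only; A/C/G/T alphabet assumed.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_stem_loops(seq: str, stem: int = 4, min_loop: int = 3, max_loop: int = 8):
    """Return (start, left_arm, loop, right_arm) for each inverted repeat found."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)  # what the right arm must look like
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop            # start of the candidate right arm
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, left, seq[i + stem:j], seq[j:j + stem]))
    return hits
```

With the toy input `GACTAAAAAGTC`, the arms `GACT` and `AGTC` are reverse complements of each other and are separated by a four-base spacer, so the sketch reports one candidate.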

I also came across a section on function of inverted repeats (palindromes) on google books that discusses the topic in more detail.

Biotechnology: Principles and Processes Important Extra Questions Very Short Answer Type

Question 1.
What is genetic engineering?
Genetic engineering. It is a technique for artificially and deliberately modifying DNA (genes) to suit human needs. It is also called recombinant DNA technology or DNA splicing. It is a kind of biotechnology.

Question 2.
Define recombinant DNA.
Recombinant DNA. They are molecules of DNA that are formed through genetic recombination methods.

Question 3.
What is the role of restriction endonuclease?
Restriction endonucleases are enzymes that cleave double-stranded DNA at specific recognition sites.

Question 4.
What are BACs and YACs? (CBSE 2016)
BACs and YACs are artificial chromosomes from bacteria and yeast efficient for gene transfer. They are vectors.

Question 5.
Name the soil bacterium which contains the gene for production of endotoxins.
Bacillus thuringiensis.

Question 6.
Name a technique by which DNA fragments can be separated. (CBSE Delhi 2008)
Gel electrophoresis.

Question 7.
What is the principle of Gel electrophoresis?
DNA fragments are negatively charged, so they move towards the anode under an electric field through the matrix (usually agarose). The gel matrix acts as a sieve, and the DNA fragments resolve according to their size.

Question 8.
Name the compound used for staining DNA to be used in Recombinant Technology. What is the colour of such stained DNA?
The compound used for staining DNA is ethidium bromide. Stained DNA appears orange when exposed to UV light.

Question 9.
Name the technique for vectorless direct gene transfer.
Gene gun.

Question 10.
What is the role of ‘Ori’ in any plasmid?
The plasmid is prokaryotic circular DNA which has a sequence of nucleotides from where the replication starts. This is called the origin of replication.

Role of Ori is to start replication. Also, the copy number of linked DNA is controlled by Ori.

Question 11.
Do normal E. coli cells have any gene for resistance against antibiotics?

Question 12.
What is the function of TPA?
TPA (Tissue plasminogen activator) dissolves blood clots after a heart attack and stroke.

Question 13.
Give an example in which recombinant DNA technology has provided a broad range of tool in the diagnosis of diseases.
Construction of probes, which are short segments of single-stranded DNA attached to a radioactive or fluorescent marker.

Question 14.
Give the full form of PCR. Who developed it? (CBSE Delhi 2013)
PCR stands for polymerase chain reaction. It was developed by Kary Mullis in 1985.

Question 15.
What is the source of DNA polymerase, i.e. Taq polymerase? (CBSE Outside Delhi 2013)
Taq polymerase is isolated from the bacterium Thermus aquaticus.

Question 16.
Define “melting of target DNA”.
The target DNA containing the sequence to be amplified is heat-denatured (around 94°C for 15 seconds) to separate its complementary strands. This process is called melting of target DNA.

Question 17.
Expand ELISA. Write one application. (CBSE Delhi 2013)
ELISA: Enzyme-Linked Immunosorbent Assay. Application: It is used for the diagnosis of AIDS.

Question 18.
What are transgenic animals? Give one example. (CBSE Outside Delhi 2016)
Transgenic animals: The animals obtained by genetic engineering containing transgenes are known as transgenic animals.
Example. Transgenic cow ‘Rosie’.

Question 19.
How many PCR cycles are adequate for proper amplification of DNA segment?
20-30 cycles.
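To see why 20-30 cycles are usually enough, recall that under ideal conditions each cycle doubles the number of target copies. The sketch below is illustrative only: the function name `pcr_copies` and the `efficiency` parameter are assumptions made for this example, and real reactions fall short of perfect doubling, especially in later cycles.

```python
# Sketch of idealised PCR amplification: N = N0 * (1 + efficiency) ** cycles.
# efficiency = 1.0 means perfect doubling every cycle (an idealisation).
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Estimated number of target copies after the given number of cycles."""
    return initial_copies * (1 + efficiency) ** cycles


for n in (20, 25, 30):
    print(n, "cycles:", int(pcr_copies(1, n)), "copies from one template")
```

Starting from a single template, perfect doubling gives about a million copies after 20 cycles and about a billion after 30.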

Question 20.
Define gene therapy.
Gene Therapy: It is the replacement of a faulty gene by normal healthy functional gene.

Question 21.
What is the importance of gene bank?
It provides a stock from which genes can be obtained for improving the varieties or used in genetic engineering.

Question 22.
What can be the source of thermostable DNA polymerase?
Thermostable DNA polymerase is obtained from the bacterium Thermus aquaticus.

Question 23.
What are selectable markers?
Genes that make it possible to select transformed cells from non-transformed cells are called selectable markers.

Question 24.
Why is enzyme cellulase used for isolating genetic material from plant cells and not from animal cells? (CBSE 2010)
Cellulase is used for breaking the cell wall of plant cells whereas animal cells lack a cell wall. The cell wall is made of cellulose which can be broken down by cellulase.

Question 25.
Give one example each of transgenic plant and transgenic animal.

Question 26.
What would be the molar concentration of human DNA in a human cell?
Humans have 3 M of DNA per cell, i.e. the molar concentration is 3.

Question 27.
Do eukaryotic cells have restriction endonucleases?
No. Restriction endonucleases belong to the restriction-modification defence systems of prokaryotes. Eukaryotic cells rely on other nucleases for proofreading and DNA repair during DNA replication.

Question 28.
Name a technique by which DNA fragments can be separated. (CBSE Delhi 2008)
Gel electrophoresis.

Question 29.
Biotechnological techniques can help to diagnose the pathogen much before the symptoms of the disease appear in the patient. Suggest any two such techniques. (CBSE Outside Delhi 2019)
PCR – Polymerase chain reaction
ELISA – Enzyme-linked immunosorbent assay

Question 30.
Why is it not possible for an alien DNA to become part of chromosome anywhere along its length and replicate? (CBSE 2014)
For multiplication of any alien DNA, it needs to be a part of a chromosome which has a specific sequence known as the origin of replication.

Question 31.
Mention the type of host cells suitable for the gene guns to introduce an alien DNA. (CBSE Delhi 2014)
Plant cells.

Question 32.
Name the enzymes that are used for isolation of DNA from bacterial and fungal cells for rDNA technology. (CBSE 2014)
Lysozyme for bacterial cell and chitinase for the fungal cell.

Question 33.
What is EcoRI? How does EcoRI differ from an exonuclease? (CBSE Outside Delhi 2015)
EcoRI is a restriction endonuclease which cuts both strands of DNA at a specific position within the palindromic sequence 5′ GAATTC 3′, while an exonuclease removes nucleotides from the ends of DNA strands.

Biotechnology: Principles and Processes Important Extra Questions Short Answer Type

Question 1.
(i) While cloning vectors, which of the two will be preferred by biotechnologists, bacteriophages or plasmids. Justify with reason.
Biotechnologists prefer bacteriophages for cloning over plasmids because they have very high copy numbers of their genome within the bacterial cells whereas some plasmids may have only one or two copies per cell and others may have 15-100 copies per cell. Phage vectors are more efficient than plasmids for cloning large DNA fragments.

(ii) Name the first transgenic cow developed and state the improvement in the quality of the product produced by it. (CBSE Sample paper 2018-19)
Transgenic cow Rosie produced human protein-enriched milk (2.4 grams per litre).

Question 2.
What are the two core techniques that enabled the birth of biotechnology?
The two core techniques that enabled the birth of modern biotechnology are:

  1. Genetic engineering techniques to alter the chemistry of genetic material (DNA and RNA), to introduce these into host organisms and thus change the phenotype of the host organism.
  2. Maintenance of sterile (microbial contamination-free) ambience in chemical engineering processes to enable the growth of only the desired microbe/eukaryotic cell in large quantities for the manufacture of biotechnological products like antibiotics, vaccines, enzymes, etc.

Question 3.
Make a list of tools of recombinant DNA technology. (CBSE Delhi, 2011)
Key tools of recombinant DNA technology:

  1. Restriction enzymes
  2. Polymerase enzymes
  3. Ligases
  4. Vectors
  5. Host organism.

Question 4.
What does EcoRI signify? How is its name derived?
EcoRI signifies the name of restriction endonuclease:

  1. First capital letter of the name that comes from the genus Escherichia is ‘E’.
  2. The next two small letters come from the species name coli of the prokaryotic cell from which the enzyme is isolated, i.e. ‘co’.
  3. The letter R is derived from the name of the strain, i.e. Escherichia coli RY13.
  4. The Roman number indicates the order in which enzymes were isolated from that strain of bacteria.

Question 5.
What are recognition sequences or recognition sites?
The sites recognised by restriction endonucleases are called recognition sites. The recognition sequences are different and specific for the different restriction endonucleases. These sequences are palindromic in nature.

Question 6.
Define vector. Give the properties of a “Good Vector”.
A vector is a DNA molecule that has the ability to replicate in an appropriate host cell, and into which the DNA fragment to be cloned is integrated for cloning.

A good vector must have the following properties:

  • It should have an origin of replication so that it is able to replicate autonomously.
  • It should be easy to isolate and purify.
  • It should get easily introduced into the host cells.

Question 7.
What is the difference between cloning and expression vectors?
All vectors that are used for propagation of DNA inserts in a suitable host are called cloning vectors. When a vector is designed for the expression of, i.e. production of the protein specified by, the DNA insert, it is termed as an expression vector.

Question 8.
What do you understand by the term selectable marker?
Selectable marker:

  1. A marker is a gene which helps in selecting those host cells which contain the vector (transformant) and eliminating the non-transformants. It selectively permits the growth of transformants.
  2. Common selectable markers for E. coli include the genes encoding resistance to antibiotics such as ampicillin, chloramphenicol, tetracycline and kanamycin, or the gene for β-galactosidase, which can be identified by a colour reaction. Normal E. coli cells do not carry resistance against any of these antibiotics.

Question 9.
Explain the principle that helps in separation of DNA fragments in Gel electrophoresis. (CBSE Delhi 2009 C)
Gel electrophoresis is a technique for separating molecules such as DNA, RNA and proteins on the basis of their size. Under the influence of an electric field, the molecules migrate towards the electrode bearing the opposite charge: positively charged molecules move towards the cathode (negative electrode) and negatively charged molecules towards the anode. As they move through the medium or matrix, the molecules are separated on the basis of their size.

Question 10.
Give the applications of PCR technology. (CBSE, Delhi 2013)

  1. Amplification of DNA and RNA.
  2. Determination of orientation and location of restriction fragments relative to one another.
  3. Detection of genetic diseases such as sickle cell anaemia, phenylketonuria and muscular dystrophy.

Question 11.
Why is “Agrobacterium-mediated genetic transformation” in plants described as natural genetic engineer of plants? (CBSE Delhi 2011)
Agrobacterium tumefaciens is a plant pathogenic bacterium which can transfer part of its plasmid DNA when it infects host plants. Agrobacterium produces crown gall in most dicotyledonous plants. These bacteria contain large tumour-inducing plasmids (Ti plasmids) which pass their tumour-causing genes into the genome of the host plant. Thus gene transfer happens in nature without human involvement, and hence Agrobacterium-mediated genetic transformation is described as natural genetic engineering in plants.

Question 12.
Differentiate gene therapy and gene cloning.
Differences between gene therapy and gene cloning:

Gene therapy: It is the replacement and/or alteration of defective genes responsible for hereditary diseases by normal genes.
Gene cloning: It is the technique of obtaining identical copies of a particular segment of DNA or a gene.

Question 13.
From what you have learnt, can you tell whether enzymes are bigger or DNA is bigger in molecular size? How did you know?
DNA molecules are bigger in size as compared to the molecular size of enzymes. Enzymes are proteins. Protein synthesis occurs from a small portion of DNA called genes.

Question 14.
How does restriction endonuclease work? (CBSE Delhi 2013, 2014, Outside Delhi 2019)
What are molecular scissors? Explain their role. (CBSE 2009)
Restriction endonuclease enzymes are called molecular scissors which can cut double-stranded DNA at specific sites.

Role of restriction endonuclease:

  • Restriction endonuclease inspects the length of DNA sequence.
  • It finds specific recognition sequence, i.e. palindromic nucleotide sequence in DNA.
  • These enzymes cut the strand of DNA a little away from the centre of palindromic sites.
  • Thus restriction endonucleases leave overhanging stretches called sticky ends on each strand.

Question 15.
How and why is the bacterium Thermus aquaticus employed in recombinant DNA technology? Explain. (CBSE Delhi 2009)
The bacterium Thermus aquaticus is employed in recombinant DNA technology because it has a thermostable DNA polymerase (Taq polymerase) that remains active through the high-temperature denaturation step of PCR.

This enzyme is employed during amplification of gene using PCR (Polymerase Chain Reaction). The amplified fragment can be used to ligate with a vector for further cloning.

Question 16.
Name the source of Taq polymerase. Explain its advantages. (CBSE Outside Delhi 2009)
Taq polymerase is extracted from a thermophilic bacterium, namely Thermus aquaticus. It remains active at the high temperatures used for denaturation of DNA during PCR, so it does not have to be replenished after every cycle.

Question 17.
What are recombinant proteins? How do bioreactors help in their production? (CBSE Outside Delhi 2009, 2015)
Recombinant proteins. When any protein-encoding gene is expressed in a heterologous host, it is called recombinant protein. Bioreactors help in the production of recombinant proteins on large scale. A bioreactor provides optimal conditions for achieving the desired recombinant protein by biological methods.

Question 18.
What is meant by gene cloning?
Formation of multiple copies of a particular gene is called gene cloning. A gene is separated and ligated to a vector such as a plasmid. The recombinant plasmid is introduced into a plasmid-free bacterium through transformation. The transformed bacterium is made to multiply and form a colony. Each and every bacterium of the colony has a copy of the gene.

Question 19.
Both a wine maker and a molecular biologist who has developed a recombinant vaccine claim to be biotechnologists. Who in your opinion is correct?
Both are considered biotechnologists. The wine maker utilises a strain of yeast which produces wine by fermentation. The molecular biologist uses a cloned gene for the antigen, which permits the formation of the antigen in huge quantities. The antigen is used as a vaccine. Both generate products and services useful to mankind by using living organisms.

Question 20.
You have created a recombinant DNA molecule by ligating a gene to a plasmid vector. By mistake, your friend adds an exonuclease enzyme to the tube containing the recombinant DNA. How will your experiment get affected as you plan to go to the transformation now?
The experiment is not likely to be affected as the recombinant DNA molecule is circular and closed, with no free ends. Hence, it will not be a substrate for exonuclease enzyme which removes nucleotides from the free ends of DNA.

Question 21.
Explain the work carried out by Cohen and Boyer that contributed immensely to biotechnology. (CBSE 2012)
Work of Cohen and Boyer:

  1. Discovery of restriction endonuclease, an enzyme of E.coli which cut DNA at palindromic sequence.
  2. Preparation of recombinant DNA (Plasmid and DNA of interest.)
  3. Their work established recombinant DNA (rDNA) technology also called genetic engineering.

Question 22.
How are ‘Sticky ends’ formed on a DNA strand? Why are these so-called? (CBSE Delhi 2014)
1. Restriction enzymes cut the strand of DNA a little away from the centre of palindrome site, but between the same two bases on opposite strands. As a result, single-stranded portions are left at each end. These overhanging stretches of DNA are called ‘Sticky ends’.

2. The Sticky ends are named so because they form hydrogen bonds with their complementary cut counterparts. The stickiness helps in the action of DNA ligase.

Question 23.
How is a continuous culture system maintained in bioreactors and why? (CBSE Delhi 2019)
In order to maintain a continuous culture system, the used medium is drained out from one side of the bioreactor and fresh medium is added from the other side. This type of culturing method produces larger biomass, leading to higher yields of the desired product.

Question 24.
Galactosidase enzyme is considered a better selectable marker. Justify the statement. (CBSE Delhi 2019)
Recombinant strains can be differentiated easily from non-recombinant ones by using this selectable marker. The selection is done on the basis of a colour change. All colonies are grown on a chromogenic substrate. Non-recombinants change from colourless to blue, while in recombinants insertional inactivation of the galactosidase gene occurs.

Hence, recombinants show no colour change. This is an easy, single-step method of selection.

Biotechnology: Principles and Processes Important Extra Questions Long Answer Type

Question 1.
What is genetic engineering? Explain briefly the distinct steps common to all genetic engineering technology.
With the help of diagrams show the different steps in the formation of recombinant DNA by the action of restriction endonuclease. (CBSE 2011)
Genetic engineering: It is a technique for artificially and deliberately modifying DNA (genes) to suit human needs. It is also called recombinant DNA technology or DNA splicing. It is a kind of biotechnology.

The steps common to all genetic engineering technology are:

  1. Isolation of genetic material which has the gene of interest.
  2. Cutting of gene of interest from genome and vector with the same restriction endonuclease enzyme. Amplifying gene of interest (PCR).
  3. Ligating gene of interest and vector using DNA ligase forming rDNA.
  4. Transformation of rDNA into the host cell.
  5. Multiplying host cell to create clones.

Diagram showing the various steps involved in recombinant DNA technology for the production of a recombinant protein.

Question 2.
List three important features necessary for preparing a genetically modified organism.
Conditions necessary for preparing a genetically modified organism:

  1. Identification of DNA with desirable genes.
  2. Introduction of the identified DNA into the host.
  3. Maintenance of introduced DNA in the host and transfer of the DNA to its progeny.

Question 3.
How are restriction endonuclease enzymes named? Write examples. (CBSE 2014)
The naming of restriction enzymes is as follows:

  1. The first letter of the name comes from the genus and the next two letters from the name of the species of the prokaryotic cell from which they are isolated.
  2. The next letter comes from the strain of the prokaryote.
  3. The roman numbers following these four letters indicate the order in which the enzymes were isolated from that strain of the bacterium.

Examples:

  1. EcoRI is isolated from Escherichia coli RY13.
  2. HindII is from Haemophilus influenzae.
  3. BamHI is from Bacillus amyloliquefaciens.
  4. SalI is from Streptomyces albus.
  5. PstI is from Providencia stuartii.
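The naming rule above can be written out as a small function. This is a sketch for illustration: the names `to_roman` and `restriction_enzyme_name` are hypothetical, the strain letter is supplied by the caller (an empty string when the name carries none, as in SalI and PstI), and the Roman-numeral helper only covers small numbers.

```python
# Sketch of the naming convention: genus initial + first two letters of the
# species + optional strain letter + Roman numeral for the order of isolation.
def to_roman(n: int) -> str:
    """Convert a small positive integer (1-39) to a Roman numeral."""
    numerals = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def restriction_enzyme_name(genus: str, species: str, strain: str, order: int) -> str:
    """Build an enzyme name such as EcoRI from its source organism."""
    return genus[0].upper() + species[:2].lower() + strain + to_roman(order)


print(restriction_enzyme_name("Escherichia", "coli", "R", 1))
print(restriction_enzyme_name("Providencia", "stuartii", "", 1))
```

The two calls reproduce the names EcoRI and PstI from the examples above.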

Question 4.
Explain any three methods of vectorless gene transfer. (CBSE Outside Delhi 2013)
Vectorless gene transfer. The following are common methods of vectorless gene transfer.

  1. Microinjection: Microinjection is the process/technique of introducing foreign genes into a host cell by injecting the DNA directly into the nucleus by using microneedle or micropipette.
  2. Electroporation: Electroporation is the process by which transient holes are produced in the plasma membrane of the (host) cell to facilitate entry of foreign DNA.
  3. Gene Gun: Gene gun is the technique of bombarding microprojectiles (gold or tungsten particles) coated with foreign DNA with great velocity into the target cell.

Question 5.
Write a note on the cloning vector.
Cloning vectors:

  1. Plasmids and bacteriophages are the commonly used vectors
  2. Presently genetically engineered/ synthetic vectors are also used for easily linking the foreign DNA and selection of recombinants from non-recombinants.
  3. The following features are required to facilitate cloning in a vector:
    (a) Origin of replication (Ori)
    (b) Selectable marker
    (c) Cloning (Recognition) site
    (d) Small size of the vector.

Question 6.
What is PCR? List the three main steps. Show the steps with a diagrammatic sketch.

  1. PCR. Polymerase Chain Reaction.
  2. Three steps of PCR.
    (a) Denaturation
    (b) Primer annealing and
    (c) Extension of primers.

The three steps of PCR

Question 7.
Name the various cloning vectors and explain how a plasmid can be used for genetic engineering.
Cloning vectors:

  • Plasmids
  • Bacteriophages
  • Plant and animal vectors
  • Jumping genes (Transposons)
  • Artificial chromosomes of bacteria, yeast and mammals (BAC, YAC).

Use of plasmids in genetic engineering: Plasmids are obtained from bacteria. The plasmid and the donor DNA are treated with a restriction endonuclease enzyme to cut the plasmid and to obtain fragments carrying the desired gene. The fragments are then joined to the plasmid with the help of a DNA ligase enzyme. The recombinant plasmids thus formed are used to carry the gene into the host.

Question 8.
Give various means by which a competent host is formed for recombinant DNA technology. Why and how bacteria can be made ‘competent’? (CBSE Delhi 2013)
A host cell must be competent to take up the DNA molecule for transformation. The following methods can be used:

  1. Using divalent cations: Bacteria are treated with divalent cations such as Ca²⁺ so that DNA enters the bacterium through pores in its cell wall.
  2. Heat shock: Cells are incubated on ice, then placed at 42°C for a heat shock, and then put back on ice.
  3. Microinjection: Recombinant DNA is directly injected into the nucleus of an animal cell.
  4. Biolistics: Cells are bombarded with high-velocity micro-particles of gold or tungsten coated with DNA. The device used is known as a gene gun.

Question 9.
How is recombinant DNA transferred to host?
Transfer of recombinant DNA into the host:

  1. The bacterial cells must first be made competent to take up DNA. This is done by treating them with a specific concentration of calcium, which increases the efficiency with which DNA enters the cell through the pores in its cell wall.
  2. Recombinant DNA can then be forced into such cells by incubating the cells with recombinant DNA on ice, followed by placing them at 42°C and then putting them back on ice (heat shock treatment).
  3. Microinjection is a method in which the recombinant DNA is directly injected into the nucleus of the animal cell with the help of microneedles or micropipettes.
  4. Gene gun or biolistics is a method suitable for plant cells, where cells are bombarded with high-velocity microparticles of gold or tungsten coated with DNA.
  5. Disarmed pathogens are used as vectors. When they are allowed to infect the cell, they transfer the recombinant DNA into the host.

Question 10.
Why DNA cannot pass through the cell membrane? How can the bacteria be made competent to take up a plasmid? Explain a method for the introduction of alien DNA into a plant host cell. Name a pathogen that is used as a disarmed vector. (CBSE Outside Delhi 2019)
DNA is a hydrophilic molecule thus it cannot pass through the cell membrane.

Bacterial cells are made ‘competent’ by treating them with a specific concentration of divalent cation such as calcium in order to take up the plasmid. The divalent cation increases the efficiency with which DNA enters the bacterium through the pores of the cell wall.

Procedure: Recombinant DNA is forced into ‘competent’ bacterial host cells by incubating them with the DNA on ice. They are then placed briefly at 42°C, which is termed ‘heat shock’ treatment, and placed back on ice. This process allows the bacteria to take up the recombinant DNA.

Gene gun or biolistic method is used for the introduction of alien DNA into a plant host cell. Here, the plant cells are bombarded with high-velocity micro-particles of gold or tungsten coated with DNA. Agrobacterium tumefaciens or Retroviruses can be used as a disarmed vector.

Question 11.
Write a note on vectors used during recombinant DNA technology. (CBSE Delhi 2008)
A vector or vehicle DNA is used as a carrier for transferring selected DNA into cells. A plasmid with its small DNA from a bacterium is a good choice for indirect gene transfer because it can move from one cell to another and make several copies of itself. However, artificial chromosomes from bacteria and yeast called BACs and YACs respectively are more efficient for eukaryotic gene transfers.

Plasmid and Yeast Artificial Chromosome

Question 12.
(i) Identify A and B illustrations in the following:
A= 5′ GAATTC 3′
B marks for ORI (origin of replication).

(ii) Write the term given to A and C and why?
A represents a nucleotide palindromic sequence. C-sticky end.

(iii) Expand PCR. Mention its importance in biotechnology. (CBSE Delhi 2011)
PCR, Polymerase chain reaction. It helps in gene amplification.

Question 13.
Write the role of the following sites in pBR322 cloning vector:
(a) rop
Role of rop, ori and selectable marker in pBR322 cloning vector.

Role of rop: The rop gene regulates copy number. The Rop protein is involved in stabilising the interaction between RNA I and RNA II, which in turn inhibits replication of pBR322.

(b) ori
Origin of replication (Ori):

  • It is a specific sequence of DNA bases, which is responsible for initiating replication.
  • An alien DNA for replication should be linked to the origin of replication.
  • A prokaryotic DNA has normally a single origin of replication, while eukaryotic DNA may have more than one origin of replication.
  • The sequence is responsible for controlling the copy number of linked DNA.

(c) selectable marker (CBSE Delhi 2019 C)
Selectable marker:

  • A marker is a gene which helps in selecting those host cells which contain the vector (transformant) and eliminating the non-transformants.
  • Common selectable markers for E. coli include the genes encoding resistance to antibiotics such as ampicillin, chloramphenicol, tetracycline and kanamycin, or the gene for β-galactosidase, which can be identified by a colour reaction.

Question 14.
(i) Explain the significance of palindromic nucleotide sequence in the formation of recombinant DNA.
A palindromic sequence is a sequence of base pairs that reads the same on both DNA strands when the orientation of reading is kept the same, e.g.
5′ — GAATTC — 3′
3′ — CTTAAG — 5′

Every endonuclease inspects the entire DNA sequence for palindromic recognition sequence.

(ii) Write the use of restriction endonuclease in the above process. (CBSE 2017)
On finding the palindrome, the endonuclease binds to the DNA. It cuts the opposite strands of DNA, but between the same bases on both strands, and forms sticky ends. These sticky ends facilitate the action of the enzyme DNA ligase and help in the formation of recombinant DNA.
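The cutting step can be illustrated with a toy digest. The sketch below assumes a linear, single-stranded representation of the top strand and uses the well-known EcoRI behaviour of cutting between G and A within GAATTC; the function name `ecori_digest` and the constants are hypothetical, and it is not a substitute for a real restriction-analysis tool.

```python
# Toy sketch of an EcoRI digest on the top strand of a linear DNA molecule.
# EcoRI recognises GAATTC and cuts between G and A (G^AATTC) on each strand,
# leaving a 5' AATT overhang (the "sticky end").
SITE = "GAATTC"
CUT_OFFSET = 1  # cut position relative to the start of the site


def ecori_digest(seq: str):
    """Return the top-strand fragments produced by cutting at every EcoRI site."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(SITE, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments
```

For the input `AAGAATTCTT`, the sketch returns the fragments `AAG` and `AATTCTT`; the second fragment begins with the AATT stretch that forms the single-stranded overhang on the real double-stranded molecule.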

Question 15.
Describe the roles of heat, primers and the bacterium Thermus aquaticus in the process of PCR. (CBSE 2017)
Role of heat: Heat helps in the denaturation process in PCR. The double-stranded DNA is heated in this process at very high temperature (95°C) so that both the strands separate.

Role of primers: Primers are chemically synthesised small oligonucleotides of about 10-18 nucleotides. These are complementary to a region of the template DNA and help in the extension of the new chain.

Role of the bacterium Thermus aquaticus: A thermostable Taq DNA polymerase is isolated from this bacterium, which can tolerate high temperatures and forms the new strand.

Question 16.
How has the use of Agrobacterium as vectors helped in controlling Meloidogyne incognita infestation in tobacco plants? Explain in the correct sequence. (CBSE 2018, Outside Delhi 2019)
(a) Write the mechanism that enables Agrobacterium tumefaciens to develop tumours in their host dicot plant.
(b) State how Agrobacterium tumefaciens and some retroviruses have been modified as useful cloning vectors. (CBSE Delhi 2019 C)
(a) Cloning
(b) A nematode Meloidogyne incognita infects the roots of tobacco plants and causes a great reduction in yield.

To prevent this infestation a novel strategy was adopted which was based on the process of RNA interference (RNAi).

Nematode-specific genes were introduced into the host plants using Agrobacterium vectors. The DNA was introduced in such a way that it produced both sense and anti-sense RNA in the host cells. These two RNAs, being complementary to each other, formed a double-stranded RNA (dsRNA) that initiated RNAi and thus silenced the specific mRNA of the nematode. As a result, the parasite could not survive in a transgenic host expressing the specific interfering RNA. The transgenic plant was therefore protected from the parasite.

Question 17.
Explain the roles of the following with the help of an example each in recombinant DNA technology:
(i) Restriction Enzymes
Restriction enzymes :
(a) Restriction enzymes belong to nucleases class of enzymes which breaks nucleic acids by cleaving their phosphodiester bonds.
(b) Since restriction endonucleases cut DNA at a specific recognition site, they are used to cut the donor DNA to isolate the desired gene.
(c) The desired gene has sticky ends, which can be easily ligated to a cloning vector cut by the same restriction enzyme (and therefore carrying complementary sticky ends) to form recombinant DNA.
(d) An example is EcoRI, obtained from Escherichia coli strain RY13, which cuts DNA at a specific palindromic recognition site:
5′ GAATTC 3′
3′ CTTAAG 5′

(ii) Plasmids (CBSE 2018)
Plasmids: Plasmids are autonomous, extrachromosomal circular double-stranded DNA of bacteria. They are used as cloning vectors in genetic engineering because they are small and self-replicating. Some plasmids have antibiotic resistance genes which can be used as marker genes to identify recombinant plasmids from non-recombinant ones.

To obtain the desired products, plasmids are cut and ligated with desired genes and transformed into a host cell for amplification. Examples of artificially modified plasmids are pBR322 (constructed by Bolivar and Rodriguez) and pUC (constructed at the University of California).

Question 18.
When the gene product is required in large amounts, the transformed bacteria with the plasmid inside the bacteria are cultured on a large scale in an industrial fermenter which then synthesises the desired protein. This product is extracted from the fermenter for commercial use.
(a) Why is the used medium drained out from one side while the fresh medium is added from the other? Explain.
In the bioreactor, the used medium is drained out and fresh medium is added to maintain the cells in their physiologically most active log (exponential) phase.

(b) List any four optimum conditions for achieving the desired product in a bioreactor. (CBSE Sample Paper 2020)
Condition for obtaining the desired product in a bioreactor:

Question 19.
List the steps in the formation of rDNA.
Steps in formation of rDNA:
Recombinant DNA technology involves the following steps:

  1. Isolation of DNA.
  2. Fragmentation of DNA by restriction endonucleases.
  3. Isolation of the desired DNA fragment.
  4. Amplification of the gene of interest.
  5. Ligation of the DNA fragment into a vector using DNA ligase.
  6. Transfer of the recombinant DNA into the host cell.

Question 20.
How is the isolated gene of interest amplified? (CBSE Delhi 2019, 2019 C)
Amplification of the DNA/gene of interest:

  1. Amplification refers to the process of making multiple copies of the DNA segment in vitro.
  2. It employs the polymerase chain reaction (PCR).
  3. The process was designed by Kary Mullis.
  4. This technique involves three main steps:
    (a) Denaturation
    (b) Primer annealing and
    (c) Extension of primers.
  5. The double-stranded DNA is denatured by subjecting it to high temperatures.
  6. Two sets of primers are used. Primers are chemically synthesised short segments of DNA (oligonucleotides) that are complementary to the segment of DNA of interest.
  7. DNA polymerase enzyme (Taq polymerase) is used to make copies of DNA making use of genomic template DNA and primer.

Question 21.
List the features required to facilitate cloning into a vector. Show with a sketch the E. coli cloning vector showing restriction sites.
Sketch pBR322. (CBSE 2012, Outside Delhi 2019)
Features required to facilitate cloning into a vector:

  1. Origin of replication (Ori)
  2. Selectable marker
  3. Cloning sites
  4. Vectors for cloning genes in plants and animals
    Sites of cloning vector

E. coli cloning vector pBR322 showing restriction sites (HindIII, EcoRI, BamHI, SalI, PvuII, PstI, ClaI), oriV and antibiotic resistance genes (ampR and tetR). Rop codes for the proteins involved in the replication of the plasmid.

Question 22.
With the help of a simple sketch, show the action of the restriction enzyme EcoRI.
The action of restriction enzyme.

EcoRI cuts the DNA between bases G and A only when the sequence GAATTC is present in the DNA.

Question 23.
Explain the importance of (a) ori, (b) ampR and (c) rop in the E. coli vector. (CBSE Outside Delhi 2009, Outside Delhi 2019)

  1. Importance of ori: This is the sequence from where replication starts; any piece of DNA linked to this sequence can be made to replicate within the host cells. It also allows multiple copies per cell.
  2. Importance of ampR: It is the antibiotic resistance gene for ampicillin. It helps in the selection of transformed cells.
  3. Importance of rop: It codes for the proteins involved in the replication of plasmid.

Question 24.
Name any two cloning vectors. Describe the features required to facilitate cloning into a vector. (CBSE Sample Paper)
Plasmids and bacteriophages are two examples of the cloning vector. A vector is a DNA molecule that has the ability to replicate in an appropriate host cell and into which the DNA fragment to be cloned is integrated for cloning.

A good vector must have the following properties:

  • It should be able to replicate autonomously.
  • It should be easy to isolate and purify.
  • It should be easily introduced into the host cells.

Cloning vectors: All vectors that are used for propagation of DNA inserts in a suitable host are called cloning vectors. When a vector is designed for the expression of, i.e. production of the protein specified by, the DNA insert, it is termed as an expression vector.

Question 25.
What are bioreactors? Sketch the two types of bioreactors. What is the utility? Which is the common type of bioreactors? (CBSE Delhi 2013)
How do bioreactors help in the production of recombinant proteins? (CBSE Outside Delhi 2009)
(i) How has the development of bioreactor helped in biotechnology?
(ii) Name the most commonly used bioreactor and describe its working. (CBSE Delhi 2018, 2019 C)
Small volume cultures cannot yield appreciable quantities of products. To produce these products in large quantities the development of ‘bioreactors’ was required where large volumes (100-1000 litres) of culture can be processed. Thus bioreactors can be thought of as vessels in which raw materials are biologically converted into specific products, using microbial, plant, animal or human cells or individual enzymes.

Role. A bioreactor provides the optimal conditions for achieving the desired product by providing optimum growth conditions (temperature, pH, substrate, salts, vitamins, oxygen).

One of the most commonly used bioreactors is of stirring type.

A stirred-tank reactor is usually cylindrical or has a curved base, which facilitates the mixing of the reactor contents. The stirrer facilitates even mixing and oxygen availability throughout the bioreactor. Alternatively, air can be bubbled through the reactor. It consists of an agitator system, an oxygen delivery system, a foam control system, a temperature control system, a pH control system and sampling ports so that small volumes of the culture can be withdrawn periodically.

(a) Simple stirred-tank bioreactor (b) Sparged stirred-tank bioreactor through which sterile air bubbles are sparged

Question 26.
Describe briefly the following:
(i) Origin of replication (Ori).
(a) It is a specific sequence of DNA bases, which is responsible for initiating replication.
(b) An alien DNA for replication should be linked to the origin of replication.
(c) A prokaryotic DNA has normally a single origin of replication, while eukaryotic DNA may have more than one origin of replication.
(d) The sequence is responsible for controlling the copy number of linked DNA.

(ii) Bioreactor.
(a) They are vessels in which raw materials are biologically converted into specific products using microbial, plant or human cells.
(b) A bioreactor provides optimal conditions for achieving the desired product by providing optimum growth conditions, pH, substrate salts, vitamins, oxygen, etc.
(c) The commonly used bioreactors are of stirring type.
(d) A stirred-tank reactor is usually cylindrical or with a curved base to facilitate the mixing of the contents.
(e) The stirrer facilitates the even-mixing and oxygen availability throughout the bioreactor.
(f) The bioreactor has the following components:

  • An agitator system.
  • An oxygen delivery system.
  • A foam control system.
  • A temperature control system.
  • pH control system and
  • Sampling ports.

(iii) Downstream processing.
(a) It refers to the series of processes, to which a genetically modified product has to be subjected before it is ready to be marketed.
(b) The processes include two processes:

(c) The product has to be formulated with suitable preservatives.
(d) Such formulation has to undergo thorough clinical trials in the case of drugs. Strict quality control testing is also required.
(e) A proper quality controlled testing of each product is also required.

Question 27.
Besides better aeration and mixing properties, what other advantages do stirred-tank bioreactors have over shake flasks?
Shake flasks are the conventional flasks for fermentation studies during secondary screening or laboratory process development. So, stirred-tank bioreactors are used to produce the product in large quantities.

Besides aeration and mixing:

  1. it also helps in providing optimum growth conditions (temperature, pH, substrate, salts, vitamins, oxygen) to achieve the desired product.
  2. cost-effective
  3. due to baffles, the oxygen transfer rate is very high
  4. the capacity of fermenters is more.

Question 28.
Explain briefly the following
(i) PCR
(ii) Restriction enzymes and DNA
(iii) Chitinase. (CBSE 2012)
Explain the three steps involved in a polymerase chain reaction. (CBSE Delhi 2018C)
(i) PCR (Polymerase Chain Reaction): It is the process in which multiple copies of the gene or segment of DNA of interest are synthesised in vitro using primers and DNA polymerase.

Working Mechanism of PCR: A single PCR amplification cycle involves three basic steps: denaturation, annealing and extension (polymerisation).
(a) Denaturation. In the denaturation step, the target DNA is heated to a high temperature (usually 94°C), resulting in the separation of the two strands. Every single strand of the target DNA then acts as a template for DNA synthesis.

(b) Annealing (Anneal = Join). In this step, the two oligonucleotide primers anneal (hybridize) to each of the single-stranded template DNA since the sequence of the primers is complementary to the 3’ ends of the template DNA. This step is carried out at a lower temperature depending on the length and sequence of the primers.

(c) Primer Extension (Polymerisation): The final step is extension, wherein Taq DNA polymerase (from the thermophilic bacterium Thermus aquaticus) synthesises the DNA region between the primers, using dNTPs (deoxynucleoside triphosphates) and Mg2+. The primers are extended towards each other so that the DNA segment lying between the two primers is copied.

The optimum temperature for this polymerisation step is 72°C. To begin the second cycle, the DNA is again heated to convert all the newly synthesised DNA into single strands, each of which can now serve as a template for synthesis of more new DNA. Thus the extension product of one cycle can serve as a template for subsequent cycles, and each cycle essentially doubles the amount of DNA from the previous cycle. As a result, from a single template molecule, it is possible to generate 2^n molecules after n cycles.
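The doubling described above can be sketched in a few lines of Python. This is only an idealised illustration: the function name `pcr_copies` and the `efficiency` parameter (1.0 means perfect doubling every cycle) are assumptions made for this example, and real reactions plateau well before the theoretical yield.

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealised number of DNA molecules after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling (2^n growth); lower values
    model cycles in which only a fraction of the templates is copied.
    """
    return initial_templates * (1 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (1, 10, 20, 30):
        # Starting from a single template molecule
        print(f"after {n:2d} cycles: {pcr_copies(1, n):,.0f} molecules")
```

With perfect doubling, 30 cycles turn one template into 2^30 (about one billion) molecules, which is why a trace amount of DNA is enough as starting material.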

Applications of PCR:

  • Diagnosis of pathogens
  • Diagnosis of the specific mutation
  • DNA fingerprinting
  • Detection of plant pathogens
  • Cloning of DNA fragments from mummified remains of humans and extinct animals.

(ii) Restriction enzymes:
(a) They are called “molecular scissors” or chemical scalpels.
(b) Restriction enzymes, synthesised by micro-organisms as a defence mechanism, are specific endonucleases, which can cleave double-stranded DNA.
(c) Restriction enzymes belong to a class of enzymes called nucleases.
(d) They are of two kinds:

  • Exonucleases, which remove nucleotides from the ends of DNA.
  • Endonucleases, which cut the DNA at specific positions anywhere in its length (within).

(e) The recognition sequence is a palindrome, where the sequence of base pairs reads the same on both the DNA strands when the orientation of reading is kept the same, i.e. 5′ → 3′ direction or 3′ → 5′ direction.
e.g. 5′ – GAATTC – 3′
3′ – CTTAAG – 5′
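The definition in (e) amounts to saying that the sequence equals its own reverse complement. A minimal Python sketch of that check follows; the helper names `reverse_complement` and `is_dna_palindrome` are chosen for this example and are not taken from any particular library.

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if both strands read the same in the 5' to 3' direction."""
    seq = seq.upper()
    return seq == reverse_complement(seq)


if __name__ == "__main__":
    print(is_dna_palindrome("GAATTC"))   # EcoRI recognition site
    print(is_dna_palindrome("GATTACA"))  # not a palindrome
```

Note that a sequence of odd length can never pass this test, because its middle base would have to pair with itself.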

(f) Each restriction endonuclease functions by inspecting the length of a DNA sequence and binds to the DNA at the recognition sequence.
(g) It cuts the two strands of the double helix at specific points in their sugar-phosphate backbones, a little away from the centre of the palindrome sites, but between the same two bases on both the strands.
(h) As a result, single-stranded portions called sticky ends are produced at the ends of the DNA. This stickiness of the ends facilitates the action of the enzyme DNA ligase.

(i) When cut by the same restriction endonuclease, the DNA fragments (of the donor as well as the host/recipient) yield the same kind of ‘sticky ends’, which can be joined end-to-end by DNA ligases.

(iii) Chitinase: The cell wall in fungi is made of chitin. The enzyme chitinase is used to break open fungal cells to release DNA along with other macromolecules such as RNA, proteins, lipids and polysaccharides.

Question 29.
Discuss with your teacher and find out how to distinguish between
(i) Plasmid DNA and chromosomal DNA
Differences between plasmid DNA and chromosomal DNA:

Plasmid DNA | Chromosomal DNA
1. It is a self-replicating DNA molecule found naturally in many bacteria and yeast. | 1. It is the DNA present in the chromosomes of all organisms.
2. It is not essential for normal growth and division. | 2. It is essential for growth and division.
3. It contains information for a few traits. | 3. It contains information for all traits.

(ii) Exonuclease and Endonuclease (CBSE, Delhi 2013)
Differences between Exonuclease and Endonuclease:

Endonuclease | Exonuclease
It cuts the DNA at a specific position anywhere within the length of the DNA, except at the ends. | It removes nucleotides from the 5′ or 3′ ends of DNA molecules.

Question 30.
Collect the examples of palindromic sequences by consulting your teacher. Better try to create a palindromic sequence by following base pair rules.

Question 31.
Can you list 10 recombinant proteins which are used in medical practice? Find where they are used as therapeutics (use the Internet).

Recombinant Protein | Therapeutic Use
1. Insulin | For the treatment of diabetes mellitus
2. Human growth hormone | For the treatment of dwarfism
3. Interferons | For the treatment of viral diseases, cancer and AIDS
4. Streptokinase | For treating thrombosis
5. Tumour necrosis factor | For treating sepsis and cancer
6. Interleukins | For treating various cancers
7. Hepatitis-B surface antigen | Vaccine against Hepatitis-B
8. Granulocyte colony-stimulating factor | For treating cancer and AIDS and in bone marrow transplantation
9. Granulocyte-macrophage colony-stimulating factor | For treating cancer and AIDS
10. Bovine growth hormone | For increasing milk yield

Question 32.
How is DNA isolated in a purified form? (CBSE Outside Delhi 2009)
Isolation of DNA in the purified form:

  1. DNA has to be isolated in pure form for the action of restriction enzymes.
  2. DNA can be released from the cells by digesting the cell envelope with enzymes such as lysozyme for bacterial cells, chitinase for fungal cells and cellulase for plant cells.
  3. Since DNA is intertwined with histone proteins and RNAs, proteins are removed by treatment with proteases and RNAs by ribonucleases.
  4. Other impurities are removed by employing suitable treatments.
  5. The purified DNA is precipitated by the addition of chilled ethanol. It is seen as fine threads in suspension.

Question 33.
How is isolation and Fragmentation of DNA of interest carried out in recombinant DNA technology? (CBSE Outside Delhi 2009, 2019)
1. Fragmentation of DNA: Fragmentation of DNA is carried out by incubating the purified DNA molecules with suitable restriction enzymes at optimal conditions of temperature and pH.

2. Isolation of DNA (gene) of Interest:
(a) The DNA is cut into fragments by restriction endonucleases.
(b) These fragments are separated by a technique called gel electrophoresis.
(c) Agarose, a natural polymer obtained from seaweeds, is used as the matrix.
(d) DNA fragments, being negatively charged, are separated by forcing them to move through the matrix towards the anode under an electric field.
(e) The DNA fragments separate (resolve) according to their size.
(f) The separated molecules are stained with ethidium bromide and visualised by exposure to UV radiation as bright orange bands.
(g) The separated bands of DNA are cut from the gel and extracted from the gel piece (elution).
(h) Such DNA fragments are purified and used for constructing recombinant DNA by joining them with cloning vectors.
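The fragmentation and size-based separation described above can be imitated on a sequence string. The sketch below is a simplification under stated assumptions: it treats the DNA as a linear top strand only, uses EcoRI (which cuts between G and A of GAATTC) as the example enzyme, and the function names `ecori_digest` and `gel_order` as well as the toy sequence are invented for this illustration.

```python
def ecori_digest(seq: str) -> list:
    """Cut a linear top strand after the G of every GAATTC site."""
    site, cut_offset = "GAATTC", 1  # EcoRI cuts G^AATTC
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # piece after the last cut
    return fragments


def gel_order(fragments: list) -> list:
    """Fragment lengths, longest first.

    Longer fragments move more slowly through the agarose matrix, so this is
    the order of bands from the wells towards the anode.
    """
    return sorted((len(f) for f in fragments), reverse=True)


if __name__ == "__main__":
    # Hypothetical toy sequence containing two EcoRI sites
    dna = "AAAA" + "GAATTC" + "C" * 8 + "GAATTC" + "TT"
    pieces = ecori_digest(dna)
    print(pieces)
    print(gel_order(pieces))
```

Each internal fragment begins with AATTC, which is the top-strand trace of the sticky end left by the staggered cut.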

The Origins of CRISPR

As revolutionary as CRISPR has been for biomedical science, its discovery stemmed from basic scientific curiosity about a biological topic about as far removed from medicine as it gets. To understand where CRISPR comes from, we need to delve into one of the longest standing genetic conflicts on Earth: the relentless arms race between bacteria and bacteria-specific viruses (Rohwer et al., 2014).

Everyone knows about bacteria, those pesky microorganisms that can make us sick—think Streptococci, the cause of strep throat and pneumonia, or Salmonella infections that cause food poisoning—but which are also indispensable for normal human function. (We depend on a vast army of bacteria that collectively make up our microbiome and help break down food, produce vitamins, and perform numerous other essential functions.) Few outside the research community, though, may know about the ubiquity of bacterial viruses, also known as bacteriophages (“eaters of bacteria”). In fact, bacteriophages are by far the most prevalent form of life on our planet: at an estimated abundance of ten million trillion trillion, they outnumber even bacteria ten to one. There are approximately one trillion bacterial viruses for every grain of sand in the world, and ten million viruses in every drop of seawater (Keen, 2015)!

Bacterial viruses evolved to infect bacteria, and they do so remarkably well. They exhibit three-dimensional structures that are exquisitely well suited to latch onto the outer surface of bacterial cells, and after attaching themselves in this manner, they inject their genetic material inside the bacterial host using pressures similar to that of an uncorked champagne bottle. After the viral genome makes its way inside the bacteria, it hijacks the host machinery to replicate its genetic code and build more viruses, ultimately destroying the cell in the process. Roughly twenty to forty percent of the ocean’s bacteria are eliminated every day from such viral infections, vastly reshaping the marine ecosystem by causing release of carbon and other nutrients back into the environment.

Yet bacteria are not passive bystanders in the face of such an onslaught—quite the contrary. Bacteria possess numerous immune systems to combat viruses at multiple stages during the viral life cycle, which microbiologists have studied for many decades. By the turn of the twenty-first century, the existing paradigm held that, while diverse, these immune systems constituted only a simple innate response to infection. Unlike multicellular vertebrate organisms, which possess innate immune systems together with elaborate adaptive immune systems that can create and store immunological memory, bacteria had no ability to adapt to new threats.

Enter CRISPR, short for Clustered Regularly Interspaced Short Palindromic Repeats. First detected in 1987 in the bacterium Escherichia coli (Ishino et al., 1987), CRISPRs—to put it simply—are bizarre, repeating sections of bacterial DNA that can extend thousands of letters in length. While CRISPRs initially seemed like a rare oddity, a fluke of nature, researchers had detected CRISPRs in dozens of other bacterial species by the early 2000s (Mojica et al., 2000).

Color-scanning electron microscope (SEM) image of bacteria on the surface of a human tongue. Enlarged 10,000 times to a width of 10 cm

These repeating structures were initially described using a number of different and confusing acronyms, and so, in 2002, Dutch researchers simplified their classification with the informative (and catchy) acronym that we still use today (Jansen et al., 2002).

Despite a growing appreciation that CRISPRs were abundant in nature, being found in the genomes of a third of all bacteria and almost all archaea (another domain of single-celled microorganisms), their biological function remained a complete mystery until 2005, when the first clues surfaced linking CRISPR to antiviral immunity (Mojica et al., 2005). Using bioinformatics analyses, researchers were shocked to find viral DNA sequences buried within those repeating sections of DNA, as if the bacteria had somehow stolen the viral genetic code as a form of molecular memory. Might this information allow bacteria to recognize and destroy viral DNA during an infection?

Evidence supporting this hypothesis came from elegant experiments conducted at a yogurt company (Barrangou et al., 2007). Scientists there were hoping to generate virus-resistant strains of the bacterium Streptococcus thermophilus, the major workhorse ingredient used to ferment milk into yogurt and other dairy products, and they noticed that, like E. coli, their S. thermophilus strains also contained CRISPRs. By intentionally infecting their strains with a panel of different viruses and then analyzing the DNA of those bacteria that gained immunity, the researchers proved that CRISPRs indeed conferred adaptive immunity. Almost overnight, the long-standing presumption that bacteria and archaea possessed only comparatively simple defenses against viral pathogens was overturned. Instead, these simple microorganisms employed both innate and adaptive immune systems no less remarkable and versatile than the innate and adaptive systems found in multicellular organisms.

After this breakthrough, it was up to geneticists and biochemists to determine how CRISPR immune systems work. Namely, what enzymes were involved, and how were they able to accurately recognize unique features of viral DNA during an infection? From the work of countless researchers all around the world, a new, unified understanding began to emerge: bacteria and archaea used molecules of ribonucleic acid, or RNA (DNA’s molecular cousin), to identify matching sequences of DNA in the viral genome, along with one or more proteins encoded by CRISPR-associated genes to slice apart the DNA (Klompe & Sternberg, 2018). CRISPR was nothing more than a precision-guided pair of molecular scissors, with the incredible ability to home in on specific sequences of DNA and neutralize them by severing both strands of the double helix. And the star actor in this pathway was a protein enzyme called CRISPR-Cas9 (Gasiunas et al., 2012; Jinek et al., 2012).

A reference catalog of DNA palindromes in the human …

  • Introduction: In DNA, palindromes are defined as a sequence of nucleotides that is followed by its complement sequence appearing in reverse order.
  • For example, the sequence on the positive strand 5′-GACA|TGTC-3′ is a palindrome, since GACA is followed by its complement CTGT appearing in reverse order as TGTC.

Palindromes in DNA-A Risk for Genome Stability and

  • A palindrome in DNA consists of two closely spaced or adjacent inverted repeats
  • Certain palindromes have important biological functions as parts of various cis-acting elements and protein binding sites
  • However, many palindromes are known as fragile sites in the genome, sites prone to chromosome br …

Palindromes or Inverted Repeats in DNA and its Role (With

  • Introduction to Palindromes: RNA is a single stranded molecule but it can also form double stranded regions in its structure like DNA
  • When a sequence of bases is followed by a complementary sequence nearby in the same molecule, the chain may fold back upon itself to generate an antiparallel double

Palindromic Sequences Science Primer

These enzymes predictably cut both strands because the sequences they recognize are palindromic. That is, the recognition sequences are short strings of bases that read the same 5′ to 3′ on both DNA strands. Palindromic sequences are similar to language palindromes, but follow a distinct set of rules.

An efficient algorithm to detect palindromes in DNA

  • A DNA palindrome is a sequence of nucleotide bases that reads the same as its reverse complement
  • DNA palindromes are crucial for gene regulation
  • Besides regulating protein production through mRNA intermediates, exact and inexact palindromes have structural roles in tRNA, ribozymes, and some ribonuclear proteins.

What are the functions of palindromic DNA and how does it

  • I will try to explain in layman's terms
  • Microorganisms have developed a defence mechanism to protect themselves from foreign DNA
  • They produce "restriction enzymes", which identify a particular sequence in DNA and cut it

Palindromic DNA (With Diagram) Biochemistry

  • The palindromic DNA or palindromes are the inverted repeats and region of dyad symmetry
  • The length of palindromes may be short by about 3-10 bases or long by about 50-100 base pairs
  • As compared to prokaryotic DNA, the eukaryotic DNA contains a large palindrome of about several thousand base pairs.


  • The Palindrome Detection Algorithm was applied to two categories of DNA sequences
  • First, the algorithm is applied to a pseudo-random DNA sequence generated by a pseudo-random generator program [9]
  • The second category is an actual DNA sequence

Watson–Crick palindromes in DNA computing

  • Theoretical DNA computing is an area of biomolecular computing that loosely encompasses contributions to fundamental research in computer science originated in or motivated by research in DNA computing
  • Examples are numerous and they include theoretical

Find longest palindrome substring of a piece of DNA

For example, the DNA sequence ACCTAGGT is palindromic because its nucleotide-by-nucleotide complement is TGGATCCA, and reversing the order of the nucleotides in the complement gives the original sequence.
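One simple way to search for the longest such stretch is to grow outwards from every gap between two bases for as long as the flanking bases are complementary. The Python sketch below does this in quadratic time; it is an illustrative approach rather than the algorithm of any of the sources listed here, and `longest_dna_palindrome` is a name made up for the example.

```python
# Watson-Crick partner of each base
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def longest_dna_palindrome(seq: str) -> str:
    """Longest substring that equals its own reverse complement.

    Reverse-complement palindromes always have even length, so only the
    gaps between neighbouring bases need to be tried as centres.
    """
    seq = seq.upper()
    best = ""
    for centre in range(1, len(seq)):
        left, right = centre - 1, centre
        # Extend while the base on the left pairs with the base on the right
        while left >= 0 and right < len(seq) and PAIR.get(seq[left]) == seq[right]:
            left -= 1
            right += 1
        candidate = seq[left + 1:right]
        if len(candidate) > len(best):
            best = candidate
    return best


if __name__ == "__main__":
    print(longest_dna_palindrome("ACCTAGGT"))
    print(longest_dna_palindrome("CCGAATTCAA"))
```

Ambiguous symbols such as N never pair in this sketch, so they simply end an extension.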

Y Chromosome VI: Palindromes!

First, a brief review of what the ampliconic class is.

From my Y Chromosome II post:

The final sequence class, the ampliconic, is more complex than the previous two classes as it contains more genes and has stranger architecture. The 10.2Mb class is broken into seven segments and contains the highest density of genes on the MSY. An amplicon is a generic term to group together the highly repetitive MSY-specific units. To identify these amplicons, Skaletsky et al. compared a 50kb sliding window to the rest of the euchromatic sequences in 1kb steps and any window that showed over 50% similarity to another sequence was deemed an amplicon (blue regions in Figure 3). Although this seems arbitrary, 60% of the region shows over 99.9% similarity to something else in the region (Skaletsky et al., 2003).

There is something to this high sequence similarity. What explains it?

Palindromes. Huge, massive palindromes. As you most likely know, a palindrome is “a word, line, verse, number, sentence, etc., reading the same backward as forward, as [in] Madam, I’m Adam”. The longest palindrome I am aware of is Demetri Martin’s 224-word poem, Dammit, I’m Mad.

That’s in language though – what is a biological palindrome? It’s nearly the same idea. In an e-mail from a member of the Page lab, I was given this example:


The second half is the “reverse-complement” of the first half. The first half itself is repeated on the right side of the other DNA strand. The double-stranded nature of DNA complicates the picture, but hopefully that makes sense. As this free mol bio textbook (Google Books) says, “the sequence reads the same forwards on one strand as it reads backwards on the complementary strand” (86). The figure in the book reflects this (figure 1; just add a spacer to get our palindromes).

The three bases in the middle of the example, AAA, are considered a spacer (and each half is an “arm” of the palindrome). If the spacer is smaller than an arm, the Page lab considers that a palindrome; if the spacer is larger than an arm, the Page lab considers that only an inverted repeat. The significance of the spacer will be explained two posts from now.
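To make the arm and spacer rule concrete, here is a small Python sketch. Everything in it is hypothetical: the function names, the toy sequences, and the decision to call the case where spacer and arm are equal in length an inverted repeat (the rule quoted above does not say which way that case goes).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_arms(seq: str) -> tuple:
    """Return (arm_length, spacer_length) for the longest pair of arms.

    The left arm is a prefix of seq and the right arm is the suffix that
    equals the reverse complement of that prefix; whatever lies between
    the two arms is the spacer.
    """
    seq = seq.upper()
    for arm in range(len(seq) // 2, 0, -1):
        if reverse_complement(seq[:arm]) == seq[-arm:]:
            return arm, len(seq) - 2 * arm
    return 0, len(seq)


def classify(arm_length: int, spacer_length: int) -> str:
    # Assumption: a spacer exactly as long as an arm counts as an inverted repeat
    return "palindrome" if spacer_length < arm_length else "inverted repeat"


if __name__ == "__main__":
    for toy in ("GACTTCAAAGAAGTC", "GACAAAAAGTC"):
        arm, spacer = find_arms(toy)
        print(toy, arm, spacer, classify(arm, spacer))
```

The first toy sequence has 6-base arms around an AAA spacer, so it counts as a palindrome; the second has 3-base arms around a 5-base spacer and counts only as an inverted repeat.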

As I said earlier, the Y chromosome’s palindromes are absolutely huge. The longest palindrome, P1, is 2.9 Mb long. That is 2,900,000 letters! The two arms account for ~30% of the ampliconic class and ~13% of the Y’s euchromatin. The other palindromes aren’t slouches either, as indicated in Table 1. The fact that the two arms are over 99.5% identical with each other explains the high sequence similarity found in the Page lab’s scans.

Why palindromes? I will tell you in the next few posts!
Skaletsky H, Kuroda-Kawaguchi T, Minx PJ, Cordum HS, Hillier L, Brown LG, Repping S, Pyntikova T, Ali J, Bieri T, Chinwalla A, Delehaunty A, Delehaunty K, Du H, Fewell G, Fulton L, Fulton R, Graves T, Hou SF, Latrielle P, Leonard S, Mardis E, Maupin R, McPherson J, Miner T, Nash W, Nguyen C, Ozersky P, Pepin K, Rock S, Rohlfing T, Scott K, Schultz B, Strong C, Tin-Wollam A, Yang SP, Waterston RH, Wilson RK, Rozen S, & Page DC (2003). The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature, 423 (6942), 825-37 PMID: 12815422

H DNA and slipped DNA—tying DNA into pseudoknots

DNA with mirror symmetry in its sequence (one strand reads the same in both directions, such as 5′-AATGGTAA-3′) can form a three-stranded structure called H DNA (Fig. 1). Here, one strand in a repeat loops back to form a triplex with the other repeat; its complement remains unpaired. Mirkin first described this structure in 1987. Richard Sinden (Florida Institute of Technology) described a related structure formed by DNA with the sequence (CCTG)n found at the myotonic dystrophy type 2 (DM2) locus. As an example of genetic “anticipation,” this DNA reveals an increasingly noticeable phenotype as n increases with each generation. Only when n > 10^4 is the disease truly debilitating. Sinden showed that this DNA can form a slipped structure when it is heated and cooled to form dsDNA with two single-stranded loops (Fig. 1), which apparently can interact with each other to form “donuts” visible by AFM. These structures are stable up to 55°C, unlike the much less stable cruciforms, Z DNA, and H DNA. The sequence (CCTG)n is the first example of slipped DNA that can arise simply from DNA supercoiling. Notably, Z DNA adjacent to this sequence diminishes its propensity to form the slipped structure. Thus, Z DNA, with its left-handed twisting, reduces the supercoiling density and can be protective in some contexts. In E. coli plasmids the slipped structure arises when n is more than ∼40; when n is ∼170, slipped DNA forms without heating and cooling but is dependent on supercoiling. These thresholds are vastly different from the threshold for human disease. Nevertheless, repeat length determines the disease phenotype, just as in the trinucleotide repeat diseases so thoroughly studied, and all of these diseases manifest anticipation.
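Mirror symmetry is a different property from the reverse-complement palindromes discussed earlier, and the difference is easy to see in code. The sketch below is only an illustration; `is_mirror_repeat` and `is_revcomp_palindrome` are names invented for it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_mirror_repeat(seq: str) -> bool:
    """True if a single strand reads the same in both directions."""
    seq = seq.upper()
    return seq == seq[::-1]


def is_revcomp_palindrome(seq: str) -> bool:
    """True if the strand equals its own reverse complement."""
    seq = seq.upper()
    return seq == seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for s in ("AATGGTAA", "GAATTC"):
        print(s, is_mirror_repeat(s), is_revcomp_palindrome(s))
```

The H DNA example AATGGTAA is a mirror repeat but not a reverse-complement palindrome, while the restriction site GAATTC is the opposite.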


Bioinformatics involves the integration of computers, software tools, and databases in an effort to address biological questions. Bioinformatics approaches are often used for major initiatives that generate large data sets. Two important large-scale activities that use bioinformatics are genomics and proteomics. Genomics refers to the analysis of genomes. A genome can be thought of as the complete set of DNA sequences that codes for the hereditary material that is passed on from generation to generation. These DNA sequences include all of the genes (the functional and physical units of heredity passed from parent to offspring) and transcripts (the RNA copies that are the initial step in decoding the genetic information) included within the genome. Thus, genomics refers to the sequencing and analysis of all of these genomic entities, including genes and transcripts, in an organism. Proteomics, on the other hand, refers to the analysis of the complete set of proteins, or proteome. In addition to genomics and proteomics, there are many more areas of biology where bioinformatics is being applied (e.g., metabolomics, transcriptomics). Each of these important areas in bioinformatics aims to understand complex biological systems.

Many scientists today refer to the next wave in bioinformatics as systems biology, an approach to tackle new and complex biological questions. Systems biology involves the integration of genomics, proteomics, and bioinformatics information to create a whole system view of a biological entity.

Figure 1. The Wheel of Biological Understanding. Systems biology strives to understand all aspects of an organism and its environment through the combination of a variety of scientific fields.

For instance, how a signaling pathway works in a cell can be addressed through systems biology. The genes involved in the pathway, how they interact, and how modifications change the outcomes downstream, can all be modeled using systems biology. Any system where the information can be represented digitally offers a potential application for bioinformatics. Thus bioinformatics can be applied from single cells to whole ecosystems. By understanding the complete “parts lists” in a genome, scientists are gaining a better understanding of complex biological systems. Understanding the interactions that occur between all of these parts in a genome or proteome represents the next level of complexity in the system. Through these approaches, bioinformatics has the potential to offer key insights into our understanding and modeling of how specific human diseases or healthy states manifest themselves.

The beginning of bioinformatics can be traced back to Margaret Dayhoff in 1968 and her collection of protein sequences known as the Atlas of Protein Sequence and Structure[1]. One of the early significant experiments in bioinformatics was the application of a sequence similarity searching program to the identification of the origins of a viral gene[2]. In this study, scientists used one of the first sequence similarity searching computer programs (called FASTP) to determine that the contents of v-sis, a cancer-causing viral sequence, were most similar to the well-characterized cellular PDGF gene. This surprising result provided important mechanistic insights for biologists working on how this viral sequence causes cancer[3]. From this initial application of computers to biology, the field of bioinformatics has exploded. The growth of bioinformatics parallels the development of DNA sequencing technology. In the same way that the development of the microscope in the late 1600s revolutionized the biological sciences by allowing Antonie van Leeuwenhoek to look at cells for the first time, DNA sequencing technology has revolutionized the field of bioinformatics. The rapid growth of bioinformatics can be illustrated by the growth of DNA sequences contained in the public repository of nucleotide sequences called GenBank.

Figure 2. The Use of Computers to Process Biological Information. The wealth of genome sequencing information has required the design of software and the use of computers to process this information.

Genome sequencing projects have become the flagships of many bioinformatics initiatives. The human genome sequencing project is an example of a successful genome sequencing project, but many other genomes have also been sequenced and are being sequenced. In fact, the first genomes to be sequenced were those of viruses (e.g., the phage MS2) and bacteria, with the genome of Haemophilus influenzae Rd being the first genome of a free-living organism to be deposited into the public sequence databanks[4]. This accomplishment was received with less fanfare than the completion of the human genome, but it is becoming clear that the sequencing of other genomes is an important step for bioinformatics today. However, a genome sequence by itself carries limited information. To interpret genomic information, comparative analysis of sequences needs to be done, and an important reagent for these analyses is the set of publicly accessible sequence databases. Without the databases of sequences (such as GenBank), in which biologists have captured information about their sequence of interest, much of the rich information obtained from genome sequencing projects would not be available.

In the same way that developments in microscopy foreshadowed discoveries in cell biology, new discoveries in information technology and molecular biology are foreshadowing discoveries in bioinformatics. In fact, an important part of the field of bioinformatics is the development of new technology that enables the science of bioinformatics to proceed at a very fast pace. On the computer side, the Internet, new software developments, new algorithms, and the development of computer cluster technology have enabled bioinformatics to make great leaps in terms of the amount of data that can be efficiently analyzed. On the laboratory side, new technologies and methods such as DNA sequencing, serial analysis of gene expression (SAGE), microarrays, and new mass spectrometry chemistries have developed at an equally blistering pace, enabling scientists to produce data for analyses at an incredible rate. Bioinformatics provides both the platform technologies that enable scientists to deal with the large amounts of data produced through genomics and proteomics initiatives and the approach to interpret these data. In many ways, bioinformatics provides the tools for applying the scientific method to large-scale data and should be seen as a scientific approach for asking many new and different types of biological questions.

Figure 3. Potential Types of Bioinformatic Data. Computer-based databases of biological information enable scientists to generate all sorts of data, from protein sequences and predicted protein domains to 3D structures of proteins.

The word bioinformatics has become a very popular “buzz” word in science. Many scientists find bioinformatics exciting because it holds the potential to dive into a whole new world of uncharted territory. Bioinformatics is a new science and a new way of thinking that could potentially lead to many relevant biological discoveries. Although technology enables bioinformatics, bioinformatics is still very much about biology. Biological questions drive all bioinformatics experiments. Important biological questions can be addressed by bioinformatics and include understanding the genotype-phenotype connection for human disease, understanding structure to function relationships for proteins, and understanding biological networks. Bioinformaticians often find that the reagents necessary to answer these interesting biological questions do not exist. Thus, a large part of a bioinformatician’s job is building tools and technologies as part of the process of asking the question. For many, bioinformatics is very popular because scientists can apply both their biology and computer skills to developing reagents for bioinformatics research. Many scientists are finding that bioinformatics is an exciting new territory of scientific questioning with great potential to benefit human health and society.

The future of bioinformatics is integration. For example, integration of a wide variety of data sources such as clinical and genomic data will allow us to use disease symptoms to predict genetic mutations and vice versa. The integration of GIS data, such as maps and weather systems, with crop health and genotype data will allow us to predict successful outcomes of agriculture experiments. Another future area of research in bioinformatics is large-scale comparative genomics. For example, the development of tools that can do 10-way comparisons of genomes will push forward the discovery rate in this field of bioinformatics. Along these lines, the modeling and visualization of full networks of complex systems could be used in the future to predict how the system (or cell) reacts to a drug, for example. A technical set of challenges faces bioinformatics and is being addressed by faster computers, technological advances in disk storage space, and increased bandwidth, but by far one of the biggest hurdles facing bioinformatics today is the small number of researchers in the field. This is changing as bioinformatics moves to the forefront of research, but this lag in expertise has led to real gaps in the knowledge of bioinformatics in the research community. Finally, a key research question for the future of bioinformatics will be how to computationally compare complex biological observations, such as gene expression patterns and protein networks. Bioinformatics is about converting biological observations to a model that a computer will understand. This is a very challenging task since biology can be very complex. This problem of how to digitize phenotypic data such as behavior, electrocardiograms, and crop health into a computer-readable form offers exciting challenges for future bioinformaticians.

(This article is based upon an interview with Francis Ouellette, Director of the UBC Bioinformatics Centre)

1. PMID=5789703 Sci. Am. 1969 Jul 221(1):86-95.
2. PMID=6304883 Science. 1983 Jul 15 221(4607):275-7.
3. PMID=6306471 Nature. 1983 Jul 7-13 304(5921):35-9.
4. PMID=7542800 Science. 1995 Jul 28 269(5223):496-512.


(Art by Jiang Long)

1 Introduction

Machine learning is a specialization of computer science closely related to pattern recognition, data science, data mining and artificial intelligence (William, 2009). Within the field of machine learning, artificial neural networks, inspired by biological neural networks, have in recent years regained popularity (Schmidhuber, 2015). Their most recent success began with the development of effective methods to train deep neural networks (networks with multiple hidden layers), and the coining of the term deep learning around 2006 (Hinton and Salakhutdinov, 2006; Schmidhuber, 2015). Since then, improvements have been enabled in part by access to greater computational resources, especially graphics processing units (GPUs), which allow deep neural networks containing many parameters to be trained in reasonable time. Given this, specialized neural network architectures like convolutional neural networks (CNN) and recurrent neural networks (RNN) with long short-term memory cells (LSTM) can now be trained efficiently and have been successfully applied to many problems including image recognition (Ciresan et al., 2011; Krizhevsky et al., 2012) and natural language processing tasks such as speech recognition (Geiger et al., 2014) and language translation (Sutskever et al., 2014).

The successes of neural networks have led to the development of various programming frameworks to build and train neural networks. Examples are PyTorch, Caffe and TensorFlow. Our framework of choice here is Lasagne (Dieleman et al., 2015), a well-established, easy-to-use and extremely flexible lightweight Python library built on top of the Theano numerical computation library (Bastien et al., 2016). While most other frameworks require the user to learn a dedicated programming language, Lasagne is Python-based and therefore relatively easy to use for bioinformaticians already programming in Python. Further, Lasagne’s active community ensures that the latest neural network training algorithms and architectures are available to the user.

Within bioinformatics, examples of deep learning applications include prediction of splicing patterns (Leung et al., 2014), DNA and RNA targets of regulatory proteins (Alipanahi et al., 2015), protein secondary structure (Wang et al., 2016) and biomedical image analysis (Cha et al., 2016; Moeskops et al., 2016). However, the number of applications is still relatively small, and the application of deep learning methods within biology is, in our view, being held back by a lack of examples and programming templates with a biological background that would give non-experts a head start on the use of these libraries.

Here, we seek to alter this by providing a non-expert introduction to the field of deep neural networks, with application examples and ready-to-apply, adaptable code templates illustrating how convolutional and LSTM neural networks can be successfully designed and trained on biological data to achieve state-of-the-art performance in prediction of (i) protein subcellular localization, (ii) protein secondary structure and (iii) peptides binding to Major Histocompatibility Complex Class II molecules. Implementations of the methods are available online to be used by non-expert end-users as templates for developing models to describe a given problem of interest. The code can run both on CPU and on CUDA-enabled GPU. Running on a GPU gives a speed-up on the order of 100-fold.
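Before a convolutional or LSTM network can be trained on protein or peptide sequences, the sequences have to be turned into numbers. One common way to do this is one-hot encoding, sketched below in plain Python. This is not the code from the templates mentioned above: the function name, the 20-letter alphabet ordering and the zero-padding to a fixed length are all assumptions made for illustration.

```python
# Illustrative sketch only; not taken from the templates described in the text.
# Assumed alphabet: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, length):
    """Encode a protein sequence as a length x 20 matrix (list of lists).

    Sequences shorter than `length` are padded with all-zero rows, longer
    ones are truncated, and unknown residues (e.g. 'X') become zero rows.
    """
    matrix = []
    for pos in range(length):
        row = [0.0] * len(AMINO_ACIDS)
        if pos < len(sequence):
            idx = AA_INDEX.get(sequence[pos].upper())
            if idx is not None:
                row[idx] = 1.0
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    encoded = one_hot_encode("ACDX", 6)
    for row in encoded:
        print(row)
```

Each row has at most one non-zero entry, so a batch of such matrices has the fixed shape that convolutional and recurrent layers generally expect as input.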

Adleman L (2000) Towards a mathematical theory of self-assembly. Technical Report 00-722, Department of Computer Science, University of Southern California

Daley M, McQuillan I (2006) On computational properties of template-guided DNA recombination. In: Carbone A, Pierce N (eds) Proceedings of the DNA computing 11. LNCS, vol 3892. Springer, Berlin, pp 27–37

de Luca A (2006) Pseudopalindrome closure operators in free monoids. Theor Comput Sci 362:282–300

Domaratzki M (2006) Hairpin structures defined by DNA trajectories. In: Mao C, Yokomori T (eds) Proceedings of the DNA computing 12. LNCS, vol 4287. Springer, Berlin, pp 182–194

Feldkamp U, Banzhaf W, Rauhe H (2000) A DNA sequence compiler. In: Condon A, Rozenberg G (eds) Pre-proceedings of the DNA-based computers 6. Leiden, Netherlands

Feldkamp U, Saghafi S, Banzhaf W, Rauhe H (2001) DNA sequence generator: a program for the construction of DNA sequences. In: Jonoska N, Seeman N (eds) Proceedings of the DNA-based computers 7. LNCS, vol 2340. Springer, Berlin, pp 23–32

Garzon MH, Oehman C (2001) Biomolecular computation in virtual test tubes. In: Jonoska N, Seeman N (eds) Proceedings of the DNA-based computers 7. LNCS, vol 2340. Springer, Berlin, pp 117–128

Garzon M, Phan V, Roy S, Neel A (2006) In search of optimal codes for DNA computing. In: Mao C, Yokomori T (eds) Proceedings of the DNA computing 12. LNCS, vol 4287. Springer, Berlin, pp 143–156

Hartemink J, Gifford DK, Khodor J (1999) Automated constraint-based nucleotide sequence selection for DNA computation. In: Kari L, Rubin H, Wood D (eds) Proceedings of the DNA based computers 4. Biosystems 52(1–3):227–235

Hartemink J, Gifford DK (1999) Thermodynamic simulation of deoxyoligonucleotide hybridization for DNA computation. In: Rubin H, Wood D (eds) Proceedings of the DNA-based computers 3. DIMACS series in discrete mathematics and theoretical computer science. AMS Press, Providence, pp 25–38

Hopcroft J, Ullman J, Motwani R (2001) Introduction to automata theory, languages and computation, 2nd edn. Addison Wesley, Boston

Hussini S, Kari L, Konstantinidis S (2003) Coding properties of DNA languages. Theor Comput Sci 290:1557–1579

Jonoska N, Mahalingam K, Chen J (2005) Involution codes: with application to DNA coded languages. Nat Comput 4(2):141–162

Jonoska N, Kari L, Mahalingam K (2006) Involution solid and join codes. In: Ibarra O, Dang Z (eds) Developments in language theory: 10th international conference. LNCS, vol 4036. Springer, Berlin, pp 192–202

Kari L, Mahalingam K (2007a) Involutively bordered words. Int J Found Comput Sci 18:1089–1106

Kari L, Mahalingam K (2007b) Watson–Crick conjugate and commutative words. In: Garzon M, Yan H (eds) Preproceedings of the DNA computing 13. Springer, Berlin, pp 75–87

Kari L, Mahalingam K (2007c) Watson–Crick bordered words and their syntactic monoid. In: Domaratzki M, Salomaa K (eds) International workshop on language theory in biocomputing, Kingston, Canada, pp 64–75

Kari L, Konstantinidis S, Losseva E, Wozniak G (2003) Sticky-free and overhang-free DNA languages. Acta Inf 40:119–157

Kari L, Konstantinidis S, Losseva E, Sosik P, Thierrin G (2005a) Hairpin structures in DNA words. In: Carbone A, Pierce N (eds) Proceedings of the DNA computing 11. LNCS, vol 3892. Springer, Berlin, pp 158–170

Kari L, Konstantinidis S, Sosik P (2005b) Bond-free languages: formalizations, maximality and construction methods. Int J Found Comput Sci 16: 1039–1070

Kari L, Mahalingam K, Thierrin G (2007) The syntactic monoid of hairpin-free languages. Acta Inf 44(3):153–166

Kari L, Mahalingam K, Seki S (2009) Twin-roots of words and their properties. Theor Comput Sci 410(24–25):2393–2400

Lothaire M (1997) Combinatorics of words. Cambridge University Press, Cambridge

Lyndon RC, Schutzenberger MP (1962) On the equation a^M = b^N c^P in a free group. Mich Math J 9:289–298

Marathe A, Condon A, Corn R (1999) On combinatorial DNA word design. In: Winfree E, Gifford D (eds) Proceedings of the DNA based computers 5. DIMACS series in discrete mathematics and theoretical computer science. AMS Press, Providence, pp 75–89

Shyr HJ (2001) Free monoids and languages. Hon Min Book Company, Taiwan

Soloveichik D, Winfree E (2006) Complexity of compact proofreading for self-assembled patterns. In: Carbone A, Pierce N (eds) Proceedings of the DNA computing 11. LNCS, vol 3892. Springer, Berlin, pp 305–324

Tulpan D, Hoos H, Condon A (2003) Stochastic local search algorithms for DNA word design. In: Hagiya M, Ohuchi A (eds) Proceedings of the DNA-based computers 8. LNCS, vol 2568. Springer, Berlin, pp 229–241

Yu SS (1998) d-minimal languages. Discret Appl Math 89:243–262

Yu SS (2005) Languages and codes. Lecture notes. Department of Computer Science, National Chung-Hsing University, Taichung