CGmCGCG is a versatile substrate with which to evaluate Tet protein activity

Tet family proteins can convert 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), and further to 5-formylcytosine (fC) and 5-carboxycytosine (caC). We found that CGmCGCG can serve as a substrate of Tet protein, and observed iterative oxidation of mC by HPLC analysis. We also demonstrated that Tet protein favours single-stranded DNA over double-stranded DNA.


RESULTS AND DISCUSSION
First, 3- to 6-mer short DNAs containing mC were synthesized using phosphoramidite chemistry. After purification, each DNA was incubated with mTet1 protein at 37 °C for 1 hour, and then the reaction mixture was directly analyzed by reversed-phase HPLC. The percentage of conversion was calculated from the peak areas of the mC-containing material and the hmC-containing product (Fig. 1). Identification of the hmC-containing product was conducted by HPLC analysis following enzymatic digestion. Additionally, the reactivity of 20-mer DNAs 4 was also checked by HPLC following enzymatic digestion of the product (Fig. 1 and Fig. S1, ESI †). The reactivity of short DNAs was much higher than that of 20-mer DNAs. A 6-mer DNA containing mC at a non-CpG site also reacted with mTet1 to form a hmC-containing product.
To investigate the dynamic change in the amounts of CGmCGCG and its oxidized derivatives, CGmCGCG was incubated with a higher concentration of mTet1 to observe further oxidation of hmC to fC and caC. CGhmCGCG, CGfCGCG, and CGcaCGCG were synthesized using phosphoramidite chemistry and checked by reversed-phase HPLC and ESI-TOF-MS (Fig. S2 and S3, ESI †). By co-injection of these authentic samples with the reaction mixture of mTet1 and CGmCGCG, each peak was identified in two different solvent systems. After mixing CGmCGCG and mTet1, the reaction was quenched by dilution after an appropriate length of time, and then the reaction mixture was directly analyzed by HPLC (Fig. 2). The oxidation of mC to hmC was rapid, and it took only about 3 minutes for CGmCGCG to be completely consumed (Fig. 3). At around 3 minutes, the amount of hmC began to decline and, concomitantly, the amount of caC began to gradually increase. Finally, the relative amount of each oxidized derivative reached a plateau. This may be the result of the inactivation of the mTet1 protein described in a previous report. 4
Additionally, the effect of ATP in the reaction mixture on the oxidation reaction was examined using CGmCGCG as a substrate (Fig. S4, ESI †). In the absence of ATP, the conversion efficiency from mC to hmC markedly decreased. This finding shows that the activity of Tet protein is strongly regulated by ATP, as described in a previous report. 5
To determine the temperature dependence of the oxidation of mC to hmC by mTet1 protein, CGmCGCG was incubated with mTet1 protein for 1 hour at various temperatures (Fig. 4). The optimum temperature for the oxidation reaction was around 37 °C. To our surprise, the reaction occurred even at 50 °C. At this temperature, almost all CGmCGCG strands are considered to be present as single strands, suggesting that single-stranded DNA can be a substrate of Tet protein. Therefore, the reactivity of single-stranded DNA was investigated subsequently.
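The conversion percentages quoted in these results are derived from HPLC peak areas. A minimal Python sketch of such a calculation follows. It assumes that conversion is the hmC-product peak area divided by the sum of the substrate and product peak areas, and that both species have the same molar response at the detection wavelength; the exact formula is not stated in the text. The function name and the example peak areas are hypothetical and are not measured values.

```python
def conversion_percent(area_substrate: float, area_product: float) -> float:
    """Percent conversion of mC-containing substrate to hmC-containing product.

    Assumption (not from the paper): both peaks have equal molar response,
    so the peak-area ratio approximates the molar ratio.
    """
    if area_substrate < 0 or area_product < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_substrate + area_product
    if total == 0:
        raise ValueError("at least one peak area must be positive")
    return 100.0 * area_product / total


# Hypothetical peak areas (arbitrary units), for illustration only.
example = conversion_percent(area_substrate=750.0, area_product=250.0)
print(f"conversion: {example:.1f}%")
```

If fC and caC peaks are also present, as in the time-course experiment, the denominator would need to include them as well; that extension is left out of this sketch.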
Non-self-complementary 6-mer DNA, CCmCGCC, was incubated with mTet1 protein at 37 °C for 1 hour, followed by direct analysis of the reaction mixture by HPLC (Fig. 5). Indeed, CCmCGCC was oxidized to form CChmCGCC, indicating that Tet protein can act on single-stranded DNA. Subsequently, to examine whether Tet protein prefers mC oxidation in single-stranded or double-stranded DNA, the percentage of conversion from mC to hmC was calculated using a 20-mer DNA, 5'-GAGCGTGACmCGGAGCTGAAA-3', in the presence or absence of its complementary strand (Fig. 6). The percentage of conversion was 29% for double-stranded DNA and 68% for single-stranded DNA. A similar experiment using 5'-TTTCAGCTCmCGGTCACGCTC-3' showed the same preference (Fig. S5, ESI †). These results suggest that Tet protein has higher activity on single-stranded DNA than on double-stranded DNA.
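The strand relationships relied on here can be checked directly from the sequences. The sketch below, in Python, treats mC as C for base-pairing purposes and tests whether a sequence equals its own reverse complement. The helper names are hypothetical, and only the `mC` symbol is handled; other modified bases such as hmC would need their own rule.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def strip_modification(seq: str) -> str:
    """Replace the modified-base symbol 'mC' with plain 'C' for pairing checks."""
    return seq.replace("mC", "C")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unmodified DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_self_complementary(seq: str) -> bool:
    """True if the sequence can pair with a second copy of itself."""
    plain = strip_modification(seq)
    return plain == reverse_complement(plain)


# The two 6-mers used in the text.
print(is_self_complementary("CGmCGCG"))  # True: can form a duplex with itself
print(is_self_complementary("CCmCGCC"))  # False: stays single-stranded

# The two 20-mers are reverse complements of each other.
top = strip_modification("GAGCGTGACmCGGAGCTGAAA")
bottom = strip_modification("TTTCAGCTCmCGGTCACGCTC")
print(reverse_complement(top) == bottom)  # True
```

This only checks sequence complementarity; whether a duplex actually forms under the assay conditions depends on length, concentration, and temperature, which the sketch does not model.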
In this study, the mTet1 active domain was used for all assays. It is not clear whether full-length mTet1 and the other members of the Tet protein family show similar activity. However, it was recently reported that the activity of full-length mTet1 is higher than that of the mTet1 active domain, 5

Conclusions
In the present study, we demonstrated that 4- to 6-mer DNAs can be substrates of Tet protein. In particular, 6-mer DNAs were much more reactive than conventional 20-mer DNAs. Although there is a report about the effects of substrate length on AlkB, ABH2 and ABH3 proteins, 15 to our knowledge this is the first study to show the effect of substrate length on Tet protein. By using CGmCGCG, it is possible to readily observe the oxidation of mC to hmC, fC, and caC. We also showed that Tet protein can oxidize mC in single-stranded and double
729 nM mTet1 protein were incubated at 37 °C for 1 hour (total volume: 25 μL). After incubation, 3 μL of the reaction mixture was used for HPLC analysis.
For analysis of the reactivity of 20-mer DNA with mTet1 protein in the absence or presence of its complementary strand, 16.3 μM DNA and 729 nM mTet1 protein were incubated at 37 °C for 1 hour (total volume: 50 μL). After incubation, all of the reaction mixture was purified using a QIAquick Nucleotide Removal Kit (Qiagen). Purified DNAs were digested with nuclease P1 (Wako) and Antarctic phosphatase (New England Biolabs) at 37 °C for 4 hours. All of the reaction mixture was used for HPLC analysis. Elution was with 50 mM ammonium formate containing 0-3% acetonitrile in a linear gradient at a flow rate of 1.0 mL/min for 30 minutes, at 40 °C.
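The quantities in the 20-mer assay can be worked through numerically. The Python sketch below converts the stated concentrations and the 50 μL volume into absolute amounts, gives the substrate-to-enzyme molar ratio, and evaluates the acetonitrile content along the gradient. It assumes the 0-3% gradient runs linearly over the full 30 minutes, which the text implies but does not state explicitly; the function names are hypothetical.

```python
def picomoles(conc_micromolar: float, volume_microliters: float) -> float:
    """Amount in pmol, since (umol/L) * (uL) = pmol."""
    return conc_micromolar * volume_microliters


def acetonitrile_percent(t_min: float, start: float = 0.0, end: float = 3.0,
                         duration_min: float = 30.0) -> float:
    """Acetonitrile percentage at time t for a linear gradient.

    Assumption: the gradient spans the whole run (0 to duration_min).
    """
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time is outside the gradient window")
    return start + (end - start) * t_min / duration_min


dna_um = 16.3          # DNA concentration from the text, in uM
tet_um = 729 / 1000    # 729 nM mTet1 expressed in uM
volume_ul = 50.0       # total reaction volume, in uL

dna_pmol = picomoles(dna_um, volume_ul)
tet_pmol = picomoles(tet_um, volume_ul)
ratio = dna_um / tet_um

print(f"DNA: {dna_pmol:.1f} pmol, mTet1: {tet_pmol:.2f} pmol")
print(f"substrate : enzyme = {ratio:.1f} : 1")
print(f"acetonitrile at 15 min: {acetonitrile_percent(15.0):.1f}%")
```

Under these assumptions the DNA is in roughly 22-fold molar excess over the enzyme, so the reported conversions correspond to multiple turnovers per enzyme molecule.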