KAT7 is a genetic vulnerability of acute myeloid leukemias driven by MLL rearrangements

Histone acetyltransferases (HATs) catalyze the transfer of an acetyl group from acetyl-CoA to lysine residues of histones and play a central role in transcriptional regulation in diverse biological processes. Dysregulation of HAT activity can lead to human diseases including developmental disorders and cancer. Through genome-wide CRISPR-Cas9 screens, we identified several HATs of the MYST family as fitness genes for acute myeloid leukaemia (AML). Here we investigate the essentiality of the lysine acetyltransferase KAT7 in AMLs driven by MLL-X gene fusions. We found that KAT7 loss leads to a rapid and complete loss of both H3K14ac and H4K12ac marks, in association with reduced proliferation, increased apoptosis and differentiation of AML cells. The acetyltransferase activity of KAT7 is essential for the proliferation of these cells. Mechanistically, our data suggest that acetylated histones provide a platform for the recruitment of MLL-fusion-associated adaptor proteins such as BRD4 and AF4 to gene promoters. Upon KAT7 loss, these factors, together with RNA polymerase II, rapidly dissociate from several MLL-fusion target genes that are essential for AML cell proliferation, including MEIS1, PBX3 and SENP6. Our findings reveal that KAT7 is a plausible therapeutic target for this poor-prognosis AML subtype.


Plasmid construction
gRNA design and cloning: gRNAs from the genome-wide CRISPR library used in the initial screens (ref. 1) were selected for validation. Additional gRNAs were designed using the Wellcome Sanger Institute's Genome Editing (WGE) tool (http://www.sanger.ac.uk/htgt/wge/) (ref. 4). Only gRNAs that targeted exons present in all putative transcripts and had no off-target hits with fewer than three nucleotide mismatches to their on-target sequence were selected. All gRNAs were cloned into the BbsI site of pKLV2-U6gRNA5(BbsI)-PGKpuro2ABFP-W (ref. 1). gRNAs used in this study are listed in Table S1.
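The off-target criterion above can be illustrated with a short Python sketch. This is not the WGE implementation: it simply counts position-wise mismatches between a guide and a list of candidate off-target sites and keeps the guide only if every site differs by at least three nucleotides. The function names and the example sequences are hypothetical and chosen for illustration only.

```python
from typing import Iterable


def count_mismatches(guide: str, site: str) -> int:
    """Count position-wise mismatches between two equal-length sequences."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def passes_off_target_filter(
    guide: str, off_target_sites: Iterable[str], min_mismatches: int = 3
) -> bool:
    """Return True if no candidate off-target site has fewer than
    `min_mismatches` mismatches to the guide (assumed threshold: 3)."""
    return all(
        count_mismatches(guide, site) >= min_mismatches
        for site in off_target_sites
    )


# Hypothetical 20-nt sequences, not taken from Table S1.
guide = "ACGTACGTACGTACGTACGT"
two_mismatch_site = "TCGTACGTACGTACGTACGA"
three_mismatch_site = "TCGTACGTAAGTACGTACGA"

keep = passes_off_target_filter(guide, [three_mismatch_site])
reject = passes_off_target_filter(guide, [two_mismatch_site, three_mismatch_site])
```

A real off-target search would also have to enumerate candidate sites genome-wide and account for the PAM, which this sketch leaves out.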
gRNA-resistant wild-type and E508Q mutant KAT7 expression vector: A T2A-fused wild-type (WT) KAT7 cDNA fragment resistant to both gKAT7 (5) and gKAT7 (A10) was purchased from IDT and cloned into the BsrGI site of pKLV-EF1aGFP using a NEBuilder HiFi kit (NEB), resulting in pKLV-EF1aGFP2AKAT7-W. To introduce the E508Q substitution, the region between the SrfI and BsaBI sites of the KAT7 ORF was replaced with a gBlock fragment containing the substitution using the NEBuilder HiFi kit, resulting in pKLV-EF1aGFP2AKAT7E508Q-W. Sequences were verified by Sanger sequencing.

Lentivirus production and transduction
Lentivirus was produced as described previously (31). AML and CML cell lines were transduced by adding lentivirus and 8 μg/ml polybrene (Millipore) to 3.0 x 10^4 cells per well of a 96-well plate or 1 x 10^6 cells per well of a 6-well plate and incubated at 37°C for 22 h. Viral supernatant was then replaced with fresh culture media and the cells were passaged for further culture.

Cell sorting and flow cytometry analyses
Cell sorting was performed using Mo-Flo XDP or BD INFLUX under containment-level 2 conditions. FACS analyses were performed using BD LSRFortessa and raw files were analysed using FlowJo.

Proliferation
Cells were first transduced with gRNA-expressing lentivirus in 96-well plates as described above. The percentage of BFP-positive (knock-out) cells was determined for each well every 2 days between day 4 and day 14 post transduction using flow cytometry. Culture media were refreshed every 2 days.
Prior to FACS analysis, cells were fixed in 4 % paraformaldehyde in phosphate-buffered saline (PBS) for 10 minutes and re-suspended in 1 % bovine serum albumin in PBS. Values were normalized to the BFP-positive percentage on day 4.
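The day-4 normalization described above can be sketched in a few lines of Python. The function name and the example percentages are hypothetical. The only assumption is that each well yields one BFP-positive percentage per time point and that every value is divided by the day-4 value from the same well.

```python
from typing import Dict


def normalize_to_reference_day(
    bfp_percent_by_day: Dict[int, float], reference_day: int = 4
) -> Dict[int, float]:
    """Divide each BFP-positive percentage by the value on the reference day.

    A normalized value below 1 means the gRNA-expressing (BFP-positive)
    population has shrunk relative to day 4.
    """
    reference = bfp_percent_by_day[reference_day]
    if reference <= 0:
        raise ValueError("reference-day percentage must be positive")
    return {
        day: percent / reference
        for day, percent in sorted(bfp_percent_by_day.items())
    }


# Hypothetical readings for one well (percent BFP-positive cells).
well = {4: 80.0, 6: 60.0, 8: 40.0}
normalized = normalize_to_reference_day(well)
```

With these made-up readings, the day-8 value of 0.5 would indicate that the BFP-positive fraction has halved since day 4.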

Differentiation and apoptosis assays
Cells were transduced with gRNA-expressing lentivirus in 6-well plates as described above. BFP-positive (knock-out) cells were collected by cell sorting 3 days post transduction and further cultured until analysis. For differentiation analysis, 2.3 x 10^5 cells were harvested 7 days post transduction, washed in PBS and staining buffer (2 % FBS in PBS), and stained with APC-conjugated CD11b (1.15 μl antibody per 100 μl staining buffer per test, 17-0118, eBioscience) or APC-conjugated mouse IgG1κ isotype control (17-4714, eBioscience) on ice for 30 min in the dark. The cells were then washed twice with staining buffer and analysed by FACS. For apoptosis assays, 2.3 x 10^5 cells were analysed 9 days post transduction using the Annexin V Apoptosis Detection Kit APC (88-8007, eBioscience) according to the manufacturer's instructions.

Western blot analysis
Cells were harvested by centrifugation and washed twice in PBS. Cell pellets were then re-suspended in NuPAGE LDS sample buffer (ThermoFisher) and NuPAGE sample reducing agent (ThermoFisher) at a concentration of 1 x 10^6 cells/100 μL. The lysates were heated at 95°C for 5 min and vortexed at room temperature for 10 min. Equal volumes of the lysates were loaded into each well of Bis-Tris precast polyacrylamide gels (ThermoFisher). Antibodies used are listed in Table S2.
The log-rank (Mantel-Cox) test was used to compare survival between mouse groups. All animal studies were carried out at the Wellcome Sanger Institute under UK Home Office License PBF095404.

ChIP-seq and ChIP-qPCR
Cells were crosslinked with 1 % formaldehyde (ThermoFisher) at room temperature for 5 min (histone ChIP) or 10 min (non-histone ChIP), and quenched with 0.125 M glycine (Sigma) at room temperature for 5 min. Cross-linked cells were washed twice with ice-cold PBS before sequential lysis with LB1 (50 mM Hepes, 140 mM NaCl, 1 mM EDTA, 10 % glycerol, 0.5 % NP-40, 0.25 % Triton X-100), LB2 (10 mM Tris-HCl, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA) and LB3 (10 mM Tris-HCl, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1 % Na-deoxycholate, 0.5 % N-lauroylsarcosine) on ice for 10 min each. Lysed samples were then sonicated for 10 cycles for MOLM-13 and for 11 cycles for OCI-AML3, THP-1 and OCI-AML2, using the Bioruptor Pico instrument (Diagenode). Triton X-100 was added to the sonicated samples to a final concentration of 1 % before centrifugation at 20,000 x g at 4°C for 10 min. 10 % (by volume) of each sample was kept as input. Lysates were incubated with antibody (Table S2). For qPCR, the KAPA SYBR Fast qPCR kit was used following the manufacturer's protocol. % Input was calculated as $100 \times 2^{\Delta Ct}$, where $\Delta Ct = Ct_{\text{normalized input}} - Ct_{\text{ChIP}}$; $Ct_{\text{normalized input}} = Ct_{\text{input}} - \log_2(\text{input dilution factor})$; input dilution factor = fraction of input saved relative to each IP x dilution of input for qPCR. Sequences of primers used in ChIP-qPCR are listed in Table S3.
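As a worked illustration of the % input formula, the following Python sketch applies the three steps in order: adjust the input Ct by the log2 of the input dilution factor, take the difference from the ChIP Ct, and convert to a percentage. The function name and the Ct values are hypothetical. The sketch assumes the dilution factor is supplied as a fold value (for example, 10 when the input corresponds to one tenth of the IP material and is not diluted further for qPCR).

```python
import math


def percent_input(
    ct_input: float, ct_chip: float, input_dilution_factor: float
) -> float:
    """Compute % input from raw Ct values.

    ct_normalized_input = ct_input - log2(input_dilution_factor)
    delta_ct            = ct_normalized_input - ct_chip
    % input             = 100 * 2 ** delta_ct
    """
    if input_dilution_factor <= 0:
        raise ValueError("input_dilution_factor must be positive")
    ct_normalized_input = ct_input - math.log2(input_dilution_factor)
    delta_ct = ct_normalized_input - ct_chip
    return 100.0 * 2.0 ** delta_ct


# Hypothetical Ct values: input Ct 25, ChIP Ct 24, 8-fold input dilution.
# Normalized input Ct = 25 - 3 = 22, delta Ct = 22 - 24 = -2.
example = percent_input(ct_input=25.0, ct_chip=24.0, input_dilution_factor=8.0)
```

In this made-up case the ChIP sample recovers 25 % of the input signal.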

ChIP-seq data processing and analysis
Reads from ChIP-seq samples and input were mapped to the human genome GRCh38 using Burrows-Wheeler Aligner (BWA) version 0.7.17 (ref. 5) with default parameters. Duplicates were marked by Samtools markdup (ref. 6). Peaks (broad and narrow) were called by MACS2 version 1.4.1 (ref. 7) using input DNA as control with parameters --broad-cutoff 0.01 for broad peaks, and -q 0.01 for narrow peaks. Broad and narrow peaks were merged into a union set. Broad peaks that overlap with one or more narrow peaks were removed. Locations of peaks (promoter, exonic, intronic or intergenic) were computed by customized scripts using Ensembl transcript annotation of GRCh38 version 91. Peaks were considered to be associated with promoter(s) if more than half of the peak region was located within ±2 kb of the transcription start site. Promoter occupancy of a transcript was quantified as the highest MACS2 signal among all the peaks within the 2 kb window. If multiple isoforms existed, genic promoter occupancy was calculated as the highest signal among isoforms. The public datasets used were GSM1845161 and GSE83671.
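The promoter-association rule can be sketched in Python as follows. This is not the authors' customized script: the data layout (half-open start/end coordinates on a single chromosome, one signal value per peak), the function names, and the choice to return 0.0 when no peak qualifies are all assumptions made for illustration.

```python
from typing import List, Tuple

# (start, end, signal) with half-open coordinates; one chromosome assumed.
Peak = Tuple[int, int, float]


def fraction_in_promoter(start: int, end: int, tss: int, flank: int = 2000) -> float:
    """Fraction of the peak that lies within +/- `flank` bp of the TSS."""
    window_start, window_end = tss - flank, tss + flank
    overlap = max(0, min(end, window_end) - max(start, window_start))
    return overlap / (end - start)


def promoter_occupancy(peaks: List[Peak], tss: int, flank: int = 2000) -> float:
    """Highest signal among peaks with more than half their length in the
    promoter window. Returns 0.0 if no peak qualifies (an assumption)."""
    signals = [
        signal
        for start, end, signal in peaks
        if fraction_in_promoter(start, end, tss, flank) > 0.5
    ]
    return max(signals, default=0.0)


def gene_occupancy(peaks: List[Peak], isoform_tss: List[int]) -> float:
    """Gene-level occupancy: highest promoter occupancy among isoforms."""
    return max((promoter_occupancy(peaks, tss) for tss in isoform_tss), default=0.0)


# Hypothetical peaks around a TSS at position 10,000 (window 8,000-12,000).
peaks = [
    (7000, 9000, 5.0),     # exactly half inside: not associated
    (7500, 9000, 7.0),     # two thirds inside: associated
    (11000, 11500, 3.0),   # fully inside: associated
    (20000, 20500, 50.0),  # outside the window
]
occupancy = promoter_occupancy(peaks, tss=10000)
```

The strict "more than half" comparison means a peak with exactly 50 % overlap is excluded, which follows the wording of the criterion.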

RNA extraction and RNA-seq processing
RNA was extracted from AML cells with the RNeasy Plus Mini Kit (Qiagen) according to the manufacturer's instructions. Sequencing was performed on an Illumina HiSeq v4 platform with 75-bp paired-end sequencing.
RNA-seq reads were mapped to the human genome assembly GRCh38 using STAR version 6.43 (ref. 8), tolerating a mismatch rate of 0.01 and allowing a maximal intron length of 50 kb. Read counts were calculated by STAR using Ensembl annotation of GRCh38 version 91 and normalized as fragments per kilobase of gene length per million uniquely mapped reads (FPKM). Differential expression analysis was done with DESeq2 (ref. 9) using a paired-sample design, and significant genes were identified using an adjusted p-value of 0.05 as the cut-off.
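For reference, the FPKM normalization mentioned above follows a standard formula, sketched here in Python. The function name and the example numbers are hypothetical, and the sketch assumes that the gene length is given in base pairs and that the library size is the total number of uniquely mapped fragments.

```python
def fpkm(fragment_count: float, gene_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of gene length per million mapped fragments.

    FPKM = count / (gene_length_bp / 1e3) / (total_mapped_fragments / 1e6)
         = count * 1e9 / (gene_length_bp * total_mapped_fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 2 kb long, 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)
```

Here 500 fragments over 2 kb in a library of 10 million fragments gives 250 fragments per kilobase, divided by 10 million-fragment units, that is an FPKM of 25.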

Statistical analysis
Normal distribution was first tested using an F-test. Student's t-test was used for statistical testing unless stated otherwise. The mean was calculated from at least three replicates, as indicated in each figure, and error bars represent the standard deviation. P values ≤0.05 were considered statistically significant.
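The summary statistics named above (mean, standard deviation, and the Student's t statistic for two groups) can be computed with the Python standard library, as in the sketch below. The replicate values are hypothetical, and the equal-variance (pooled) form of the t statistic is assumed. Converting the statistic to a P value requires the t distribution, which the standard library does not provide, so that step is left out.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of the replicates."""
    return statistics.mean(values), statistics.stdev(values)


def student_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances are assumed.
    """
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two replicates")
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1.0 / n_a + 1.0 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df


# Hypothetical triplicate measurements for two conditions.
control = [1.0, 2.0, 3.0]
knockout = [4.0, 5.0, 6.0]
control_mean, control_sd = mean_and_sd(control)
t_value, dof = student_t_statistic(control, knockout)
```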

Figure S7. Model of KAT7-dependent recruitment of BRD4 and SEC complex machinery to drive transcription of MLL-AF9 target genes. (Diagram labels: BRD4, AFF1, MLL-AF9 target genes.)