If you use CUT&RUN or CUT&Tag to map histone modifications, transcription factor binding, or cofactor occupancy across the genome, you’re ultimately trying to detect real biological changes in chromatin. Of course, you want to ensure that any detected changes are not artifacts of differences in cell input, sample handling, or sequencing depth. Historically, however, normalization controls have been applied inconsistently or not at all, leaving even the most promising datasets open to being shaped by workflow variability rather than true biology.
“Chromatin profiling has a reputation for being technically demanding, and one major challenge is that assays are more qualitative than quantitative,” explains Fang Chen, PhD, Associate Director of Epigenetic Assays at CST. “Scientists are looking for a simple normalization solution that can reveal the factors impacting results. Spike-ins are easy to implement and give you increased confidence in the data you’re interpreting.”
“This fundamentally changes how confidently you interpret CUT&RUN or CUT&Tag data... With the Drosophila spike‑in control, when you see a change in your data, you know it’s really because of the biology.”
This post explores how spike-in controls—especially whole-workflow Drosophila spike-in control—make it easy to incorporate normalization into CUT&RUN and CUT&Tag experiments and generate chromatin data you can trust.
The Normalization Gap in Chromatin Profiling
Despite years of discussion in the literature, spike-in normalization remains underused in chromatin assays, in part because practical, ready-to-use spike-in options for routine workflows have not been available [1-3].
“Without an appropriate control, standard read-depth normalization can mask global changes or inflate technical variation,” explains Chen. “Historically, many chromatin assays frequently left out controls because producing spike-in material in-house was technically complex and time-consuming, and few labs had the bandwidth to do it. This can leave datasets looking clean, but quietly misrepresenting the underlying biology.”
As a result, scientists might grapple with three common pain points:
- Uncertainty about whether differences between conditions reflect real biology or just discrepancies in cell input, sample handling, or sequencing depth.
- Apprehension that a noisy or non-reproducible dataset could derail a project or weaken a publication, without knowing what went wrong.
- Concern that normalization will require advanced statistics or custom reagents created in-house that feel out of reach for many labs.
Having a normalization strategy helps alleviate these challenges by providing a benchmark to confidently compare signals across samples targeting the same protein, helping ensure observed changes reflect true biological signal rather than workflow noise.
Types of Normalization Controls: Yeast & E. coli DNA vs Drosophila Nuclei
Most scientists are already familiar with DNA spike-in controls based on yeast genomic DNA. Compatible with CUT&RUN assays, fragmented yeast DNA can be added after chromatin digestion to provide sample normalization for DNA purification, downstream qPCR, and next-generation sequencing (NGS) analysis. In some CUT&RUN and CUT&Tag workflows, residual E. coli DNA carried over from MNase or Tn5 production can also be used as a de facto control: It can report on enzyme‑reaction variability, but—like yeast DNA—it cannot correct for differences in cell input or early wash steps, and the amount of E. coli DNA in the enzymes varies from lot to lot, making it unreliable for consistent quality control.
“Traditional DNA spike-in normalization is a useful strategy when the main concern is variation at those later stages of the workflow,” notes Chen. “However, yeast and E. coli DNA can’t correct for differences in starting cell number or early handling steps, and the yeast DNA spike-in is not compatible with CUT&Tag.”
A newer approach is to introduce Drosophila spike-in nuclei at the very beginning of either the CUT&RUN or CUT&Tag workflow, where they are mixed directly with the experimental cells before cell permeabilization (Figure 1). During the antibody incubation step, a Drosophila H2Av monoclonal antibody is included along with the antibody to the target of interest. This antibody consistently and specifically tags Drosophila histones to generate a reference signal across all samples. That signal is then used to calculate a normalization factor, which is applied to the reads from the experimental genome to correct for technical variation across the workflow.
Figure 1. Workflows for the CUT&RUN and CUT&Tag assays that highlight where the Drosophila spike-in nuclei and H2Av monoclonal antibody are incorporated into the experiment. Including Drosophila spike-ins provides normalization coverage across the entire assay.
Because the Drosophila nuclei experience immobilization, membrane permeabilization, antibody binding, targeted digestion or tagmentation, DNA purification, library prep, and sequencing alongside the sample, they act as a true start-to-finish normalization control.
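As a rough illustration of how a reference signal can become a normalization factor, the sketch below scales each sample relative to the sample with the fewest Drosophila-aligned reads. This is one common convention, assumed here for illustration; it is not confirmed by this post as the kit's protocol, and the function name, sample names, and read counts are all hypothetical.

```python
def spike_in_factors(spike_reads):
    """Return a per-sample scaling factor from spike-in read counts.

    Assumed convention (not taken from the kit protocol):
    factor = fewest spike-in reads in any sample / spike-in reads in this sample,
    so every factor is <= 1.
    """
    if not spike_reads:
        raise ValueError("no samples given")
    if any(n <= 0 for n in spike_reads.values()):
        raise ValueError("every sample needs at least one spike-in read")
    lowest = min(spike_reads.values())
    return {name: lowest / n for name, n in spike_reads.items()}


# Hypothetical Drosophila-aligned read counts for two samples.
drosophila_reads = {"untreated": 200_000, "treated": 400_000}
factors = spike_in_factors(drosophila_reads)

# Hypothetical reads aligned to the experimental genome in one region.
region_reads = {"untreated": 5_000, "treated": 5_000}
normalized = {s: region_reads[s] * factors[s] for s in region_reads}

print(factors)
print(normalized)
```

In this toy case the treated sample recovered twice as many spike-in reads, so its factor is 0.5 and its experimental signal is halved before the two samples are compared.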
Decide which spike-in control to use before starting your experiment. The downstream yeast DNA spike-in and the whole-workflow Drosophila spike-in are redundant, so you do not need both in the same experiment.
| | Whole-Workflow Drosophila Spike-In | Traditional Yeast DNA Spike-In |
| --- | --- | --- |
| Assay Compatibility | CUT&RUN and CUT&Tag | CUT&RUN |
| Control Composition | Intact Drosophila nuclei plus an H2Av monoclonal antibody specifically targeting the Drosophila genome, processed in parallel with the samples. | Fragmented Saccharomyces cerevisiae genomic DNA without a paired antibody. |
| Workflow Coverage | Cell handling through permeabilization, antibody binding, chromatin digestion or tagmentation, DNA purification, library prep, and sequencing. | DNA purification, library prep, and sequencing. |
| Normalization Focus | Technical variability across the entire assay. | Bias in library prep and sequencing. |
| Availability | CUT&RUN: included with the CUT&RUN Assay Kit (with Drosophila Spike-In Control) #84647. CUT&Tag (compatible with CUT&Tag Assay Kit #77552): standalone Drosophila Spike-In Control Kit for CUT&Tag, for example the rabbit version #29811. | Included with the CUT&RUN Assay Kit #86652. |
| Ideal Use Case | Drug perturbation studies; subtle genome-wide chromatin changes; multi-batch comparisons; and experiments with variable starting material, in addition to variability in library prep and sequencing. | Partial benchmarking and experiments mainly concerned with bias and variability in library prep and sequencing. |
Table 1. Comparison of yeast and Drosophila spike‑ins. Note that, as residual DNA carried over from MNase or Tn5 production, an E. coli DNA spike-in is not available as a CST product.
“You’ll want to choose whichever control best matches the scientific questions you’re asking,” explains Chen. “The yeast DNA spike-in can provide useful information about library-level variation in CUT&RUN assays, while the Drosophila spike-in can be used in both CUT&RUN and CUT&Tag, and extends normalization to earlier, failure-prone steps of the workflow.”
Choosing a yeast DNA spike-in is still a reliable and sufficient choice for certain types of CUT&RUN experiments, such as:
- Simple benchmarking and assay setup, where the main concern is library preparation and sequencing depth.
- Pilot experiments focused on overall binding patterns (where a target binds across the genome) rather than on subtle, quantitative genome-wide changes.
- Cases where input material and sample handling are tightly standardized (e.g., well-characterized cell lines under controlled conditions).
In contrast, there are cases where a whole-workflow control is recommended and can provide additional assurance of trustworthy, biologically relevant data:
- When reliable normalization is needed for CUT&Tag assays.
- Drug perturbation studies (for example, EZH2 inhibitors) where global shifts in histone marks are expected and must be distinguished from technical variability.
- Comparing subtle chromatin changes across closely related conditions, such as time-course experiments.
- Long-term studies, those that require repeat sample analysis, or the analysis of multiple samples, where differences in day-to-day handling are hard to avoid.
- Working with variable starting material such as patient-derived cells, primary tissues, or low-input samples, where input quality may fluctuate more.
What Drosophila Spike-in Means for Your Experiments
Shifting from a late-stage yeast DNA spike-in to a full-workflow Drosophila control extends normalization coverage from a fraction of the assay to every experimental step.
This becomes especially important in drug mechanism studies and low-input titrations, such as the examples below, where subtle or global chromatin changes must be distinguished from technical variation.
Drug Treatment Validation Example with Normalization
In drug perturbation studies targeting chromatin regulators, global shifts in histone marks like H3K27me3 are easy to underestimate, or miss entirely, if normalization relies only on sequencing depth. The CUT&RUN experiment shown below illustrates how this plays out in real data.
In this experiment, 100,000 MCF7 cells treated with the EZH2 inhibitor tazemetostat are compared to untreated cells (Figure 2). In the workflow, the Drosophila spike-in nuclei are immobilized on Concanavalin A-coated beads together with the MCF7 cells in the first step, and both are then carried through permeabilization, antibody incubation, pAG-MNase binding, and the associated wash steps, as well as library prep and sequencing.
Figure 2. Drug treatment validation with Drosophila spike-in normalization. CUT&RUN was performed with 100,000 MCF7 cells, with or without treatment of 1 μM Tazemetostat for 6 d (as indicated), and either Tri-Methyl-Histone H3 (Lys27) (C36B11) Rabbit Monoclonal Antibody #9733 or Tri-Methyl-Histone H3 (Lys4) (C42D8) Rabbit Monoclonal Antibody #9751. DNA libraries were prepared using DNA Library Prep Kit for Illumina Systems (ChIP-seq, CUT&RUN) #56795. The figure shows binding across the HOXD gene cluster and ACTB gene, known targets of H3K27me3 and H3K4me3, respectively. After normalization using the Drosophila spike-in available in the CUT&RUN Assay Kit (with Drosophila Spike-In Control) #84647, H3K27me3 signal is significantly reduced following EZH2 inhibitor treatment, consistent with decreased H3K27me3 levels. In contrast, H3K4me3 signal remains comparable, as this drug does not affect H3K4me3.
“After applying the Drosophila spike‑in normalization, the drug‑treated samples show the expected decrease in H3K27me3 genomic binding caused by EZH2 inhibition,” explains Chen. “This pattern is exactly what a scientist wants to see, because it's consistent with the decreased H3K27me3 level after drug treatment in western blot.”
Additional orthogonal validation of the CUT&RUN data (Figure 3) demonstrates that the reduced H3K27me3 signal after tazemetostat treatment is mirrored by decreased EZH2 and H3K27me3 protein levels in western blot analysis.
Figure 3. Western blot analysis of MCF7 cells using Tri-Methyl-Histone H3 (Lys27) (C36B11) Rabbit Monoclonal Antibody #9733 and Ezh2 (D2C9) Rabbit Monoclonal Antibody #5246 after 1, 3, or 6 days of tazemetostat treatment at the indicated concentrations. Both H3K27me3 and EZH2 levels decrease after 6 days of treatment with the EZH2 inhibitor tazemetostat.
The H3K27me3 mark impacted by the drug decreases as expected, while an unrelated active mark, H3K4me3, is preserved, reinforcing confidence that the observed change is biological rather than a side effect of sample processing or sequencing depth. Without an appropriate spike-in normalization, global changes in H3K27me3 can be partially or completely masked by conventional read-depth normalization, leading to underestimation or misinterpretation of drug effects.
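To make the masking effect concrete, here is a toy calculation with entirely hypothetical read counts (none are taken from Figure 2). It assumes a treatment that lowers the mark uniformly across the genome and two libraries sequenced to the same depth, so the region's share of experimental reads is unchanged while a fixed amount of spike-in contributes proportionally more reads.

```python
# Hypothetical read counts; none of these numbers come from Figure 2.
samples = {
    "untreated": {"experimental": 10_000_000, "spike_in": 200_000, "region": 50_000},
    "treated": {"experimental": 10_000_000, "spike_in": 800_000, "region": 50_000},
}


def depth_normalized(sample):
    """Region reads per million experimental-genome reads."""
    return sample["region"] / sample["experimental"] * 1_000_000


def spike_normalized(sample):
    """Region reads per spike-in read."""
    return sample["region"] / sample["spike_in"]


depth_ratio = depth_normalized(samples["treated"]) / depth_normalized(samples["untreated"])
spike_ratio = spike_normalized(samples["treated"]) / spike_normalized(samples["untreated"])

print(f"treated/untreated, read-depth normalized: {depth_ratio:.2f}")
print(f"treated/untreated, spike-in normalized:   {spike_ratio:.2f}")
```

Under these assumptions the read-depth ratio comes out at 1.00, suggesting no change, while the spike-in ratio is 0.25, recovering the fourfold global decrease that was built into the toy numbers.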
Ensuring Accurate Signal Normalization Across Cell Titrations
Normalization also becomes crucial when experiments push the limits of cell input, such as titrating down from 200,000 to 25,000 cells to test assay sensitivity. In such a titration, Drosophila spike-in normalization allows CTCF signal at a locus like the MYC gene to scale positively with starting cell number, even as input becomes limiting (Figure 4).
Figure 4. Start-to-finish normalization preserves a strong correlation between starting cell number and signal at the MYC locus. CUT&Tag was performed with 200,000, 100,000, 50,000, or 25,000 HeLa cells (as indicated) and CTCF (D1A7) Rabbit Monoclonal Antibody #3417, using CUT&Tag Assay Kit #77552 and the Drosophila Spike-In Control Kit for CUT&Tag (Rabbit) #29811. DNA libraries were prepared using CUT&Tag Dual Index Primers and PCR Master Mix for Illumina Systems #47415. The figure shows binding across MYC, a known target gene of CTCF. After normalization, signal strength shows a positive correlation with the starting cell number.
For researchers, this serves both as a proof of concept and as a practical quality control measure: the assay faithfully reflects true differences in input, and any deviations from the expected relationship between cell number and signal can be attributed to technical factors rather than to inherent biological variability. This is particularly important in disease modeling contexts, where patient-derived samples may only be available at low cell numbers.
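One simple way to use a titration as a quality control check is to compute the correlation between starting cell number and normalized signal. The sketch below does this with a hand-written Pearson correlation; the signal values are hypothetical placeholders, not measurements from Figure 4, and the choice of Pearson's r is an assumption rather than a recommendation from the protocol.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Cell inputs from the titration; signal values are hypothetical.
cell_numbers = [200_000, 100_000, 50_000, 25_000]
normalized_signal = [8.1, 4.3, 1.9, 1.0]

r = pearson_r(cell_numbers, normalized_signal)
print(f"Pearson r between cell number and normalized signal: {r:.3f}")
```

A value of r close to 1 would be consistent with the expected positive relationship; a clear drop for one input level would flag that sample for a closer look.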
What about “over-correcting” the biology?
It’s natural to worry that if the spike-in nuclei experience every step of the workflow, any technical error affecting them could distort the resulting enrichment calculations. After all, if both the control and the sample are perturbed, how can the control still be trusted?
Modern spike-in normalization strategies explicitly address this concern by using ratios or scaling factors derived from the relationship between spike-in reads and sample reads across conditions. When both sample and control are affected in the same proportion, the technical fluctuation becomes visible in the spike-in signal and can be accounted for in the normalization factor, rather than hiding in the data. In practice, proportional changes to both spike-in and sample do not automatically invalidate biological conclusions, as long as the spike-in/sample ratio is kept as consistent as possible and analyzed appropriately (Figure 5).
Figure 5. Drosophila spike-in normalization does not “over-correct” the biology. CUT&RUN was performed with decreasing numbers of HCT 116 cells, and Rpb1 CTD enrichment was quantified by qPCR. Before normalization, the qPCR signal decreases with lower cell inputs due to reduced DNA yield. However, this does not reflect a true biological change in Rpb1 binding within the same cell model. After normalization, consistent Rpb1 binding is observed across different starting cell numbers, except for the 12,500-cell sample, where the low signal-to-noise ratio in the pre-normalization data indicates a genuine technical failure. This demonstrates that Drosophila spike-in normalization exposes technical failure instead of artificially boosting biological signal.
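The reasoning about proportional changes can be checked with a few lines of arithmetic. In the sketch below, a hypothetical technical recovery factor is applied equally to sample and spike-in reads, and the spike-in-normalized signal is unchanged; the read counts and the helper name are made up for illustration.

```python
def normalized_signal(sample_reads, spike_reads):
    """Sample signal expressed per spike-in read."""
    if spike_reads <= 0:
        raise ValueError("spike-in reads must be positive")
    return sample_reads / spike_reads


# Hypothetical reads recovered in a loss-free workflow.
true_sample_reads = 40_000
true_spike_reads = 8_000

results = {}
for recovery in (1.0, 0.5, 0.25):
    # The same technical loss hits both the sample and the spike-in.
    results[recovery] = normalized_signal(
        true_sample_reads * recovery, true_spike_reads * recovery
    )
    print(f"recovery {recovery:.2f}: normalized signal {results[recovery]:.1f}")
```

Every recovery level gives the same normalized value of 5.0, because the shared loss cancels in the ratio. A loss that affected only the sample would shift the ratio instead, which is how a spike-in can flag a problem rather than hide it.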
“We’ve calibrated the Drosophila input to function as a true internal standard, sufficiently abundant to be informative, yet specific to the Drosophila genome so it does not compete with your biological signal,” explains Chen. “If the spike-in sees what your sample sees, then it can tell you when something in the workflow is drifting.”
Finally, the amount of Drosophila nuclei in the spike-in module has been optimized to ensure they do not dominate the sequencing depth. For example, approximately 10-fold fewer Drosophila nuclei are required for assays using mouse primary antibodies compared to rabbit antibodies, as mouse antibodies typically exhibit weaker activity in CUT&RUN or CUT&Tag assays. A similar consideration applies to certain transcription factor and cofactor antibodies. The ideal proportion of Drosophila reads is 0.5–10% of total reads. Guidance on appropriate Drosophila nuclei amount is provided directly in the accompanying protocols and troubleshooting materials.
Practical safeguards—such as following the dedicated protocol, keeping spike-in ratios consistent, monitoring Drosophila read counts, and consulting troubleshooting guidance when the control looks abnormal—further reduce the risk that a rare technical outlier will result in a misleading interpretation.
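The 0.5–10% guideline above lends itself to a simple automated check when monitoring Drosophila read counts. The following is a minimal sketch, assuming that "total reads" means all aligned reads (experimental plus Drosophila); the function name and the example counts are hypothetical.

```python
def check_spike_in_fraction(spike_reads, total_reads, low=0.005, high=0.10):
    """Return (fraction, within_range) for the spike-in share of total reads.

    The default bounds correspond to the 0.5-10% range quoted in the text.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= spike_reads <= total_reads:
        raise ValueError("spike_reads must be between 0 and total_reads")
    fraction = spike_reads / total_reads
    return fraction, low <= fraction <= high


# Hypothetical samples: (Drosophila-aligned reads, total aligned reads).
examples = {
    "sample_A": (200_000, 10_000_000),
    "sample_B": (20_000, 10_000_000),
    "sample_C": (1_500_000, 10_000_000),
}

for name, (spike, total) in examples.items():
    fraction, ok = check_spike_in_fraction(spike, total)
    status = "within range" if ok else "outside range"
    print(f"{name}: {fraction:.2%} Drosophila reads ({status})")
```

Samples flagged as outside the range are candidates for the troubleshooting guidance mentioned above, for example adjusting the amount of Drosophila nuclei as the protocol recommends.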
Reducing Complexity for Results You Can Trust
Many labs do not run chromatin profiling as a primary, everyday workflow, but their hypotheses have led them to chromatin profiling—for example, to validate a drug mechanism, establish causal evidence for gene expression, or compare chromatin changes over a time course. In these situations, it can feel all the more challenging to know how to implement and interpret normalization controls.
Both the partial-workflow yeast DNA spike-in and the whole-workflow Drosophila spike-in are designed to reduce that cognitive load, making it easier to get reproducible, trustworthy CUT&RUN and CUT&Tag data.
“This fundamentally changes how confidently you interpret CUT&RUN or CUT&Tag data,” explains Chen. “When you see a change in your data, you can know it’s really because of the biology.”
By making normalization more standardized and transparent, researchers can focus on biological questions while still generating data that stands up to reanalysis and peer review.
- For CUT&RUN workflows:
  - The Drosophila spike-in is included with the CST CUT&RUN Assay Kit (with Drosophila Spike-In Control) #84647.
  - The yeast spike-in is included with the CST CUT&RUN Assay Kit #86652.
- For CUT&Tag workflows, the Drosophila spike-in is available as standalone modules, with versions configured for rabbit or mouse primary antibodies.
“The goal is not to turn every lab into a methods lab—it’s to give every lab access to chromatin data they can stand behind,” concludes Chen. “Robust, spike-in normalization with either yeast or Drosophila is one of the most direct ways to get there.”
Resources
- CST protocols, including how to add and analyze spike-in controls.
- The Troubleshooting Guide for CUT&RUN and the Troubleshooting Guide for CUT&Tag, which include sections on expected spike-in behavior and how to respond when the spike-in signal looks abnormal.
- CUT&RUN Application Overview
- CUT&Tag Application Overview
Select References
1. Patel LA, Cao Y, Mendenhall EM, Benner C, Goren A. The Wild West of spike-in normalization. Nat Biotechnol. 2024;42(9):1343-1349. doi:10.1038/s41587-024-02377-y
2. Egan B, Yuan CC, Craske ML, et al. An Alternative Approach to ChIP-Seq Normalization Enables Detection of Genome-Wide Changes in Histone H3 Lysine 27 Trimethylation upon EZH2 Inhibition. PLoS One. 2016;11(11):e0166438. doi:10.1371/journal.pone.0166438
3. Taruttis F, Feist M, Schwarzfischer P, et al. External calibration with Drosophila whole-cell spike-ins delivers absolute mRNA fold changes from human RNA-Seq and qPCR data. Biotechniques. 2017;62(2):53-61. doi:10.2144/000114514

