Introduction

The first evidence that RNA could be targeted for therapeutic intervention was provided by the discovery of streptomycin in 1944 (ref.1) and the subsequent identification of the bacterial ribosome, a macromolecular RNA–protein complex, as the natural product’s cellular target2. Further insight was provided 20 years later when the first nucleic acid sequence and RNA structure — a tRNA carrying alanine (tRNAAla) — was reported3. The key to understanding the role of tRNA in translation was its 3D fold, required for aminoacylation and hence decoding of mRNA sequence into protein. The discovery of catalytic RNAs4,5 expanded the chemical functions of RNAs far beyond encoding proteins and decoding mRNAs.

The sequence of the human genome further reinforced RNA’s contribution to biology, with far fewer canonical open reading frames (ORFs) observed than expected6. Indeed, organismal complexity is not correlated with the number of ORFs but instead with the number and diversity of noncoding (nc)RNAs6 that function in epigenetic regulation7 and regulate gene expression8, particularly during development. Complementarily, genome-wide association studies (GWAS) have defined biological pathways that are dysregulated in disease, elucidating potential therapeutic targets, both RNA and protein. Such data can be leveraged to enable a bench-to-bedside paradigm in which small molecules bind to and deactivate structured RNAs: a patient’s genome could be sequenced and compared with GWAS to identify the malfunctioning RNA. The targeted therapy would bind a functional structure within the RNA to short-circuit disease pathways.

Promising therapeutic strategies to target RNA include antisense oligonucleotides (ASOs), CRISPR gene editing and small molecules that recognize RNA structures. ASOs and CRISPR editing have been invaluable to the field of chemical biology. However, translating these technologies to the clinic has been challenging owing to difficulties with delivery and significant adverse reactions9,10. Small molecules offer an important alternative with potential for oral bioavailability and blood–brain barrier penetrance, particularly with the wealth of knowledge from medicinal chemistry whereby physicochemical properties can be systematically optimized to improve pharmacokinetics and potency.

Several small molecules that bind RNA structures have been identified that exhibit various modes of action (MOAs), from simple binding to direct cleavage to recruitment of endogenous nucleases (induced proximity). These molecules have been demonstrated to modulate diverse biological processes, such as inhibiting bacterial and viral translation (ribocil and a riboswitch in Escherichia coli11, 2-aminobenzimidazole derivatives and the hepatitis C internal ribosome entry site (IRES)12); inhibiting mRNA translation (synucleozid and α-synuclein13); facilitating alternative pre-mRNA splicing (risdiplam14 and branaplam15 and survival of motor neuron 2 (SMN2)); and inhibiting microRNA (miRNA) biogenesis (a spermine–amidine conjugate and the miR-372 precursor16).

This Review describes foundational methods and strategies to: identify structured, functional RNAs; design and discover lead molecules that bind structured regions; and lead-optimize small molecules, including pharmacophore modelling, structure-based design and targeted degradation. Considerations and future directions for the development of small-molecule RNA binders are highlighted.

Defining RNA structures for small-molecule targeting

Determining RNA structure

An accurate model of RNA structure is key to the design or discovery of small molecules that modulate its function. Computational methods can model RNA structure from sequence, including free energy minimization and phylogenetic comparison. Free energy minimization uses experimentally derived thermodynamic parameters to predict RNA structures17, outputting the minimum free energy structure, assumed to be the functional structure, as well as suboptimal structures. The total free energy of an RNA structure is calculated by summing the free energy of each substructure in the system, such as base pairs, bulges and loops. Dynamic programming algorithms18,19, the basis of programs such as mFold20, RNAfold21 and RNAStructure22, incorporate these experimentally determined thermodynamic parameters, predicting accurately ~70% of the base pairs for RNAs <700 nucleotides long.
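The dynamic programming idea can be illustrated with a deliberately simplified sketch. The code below implements Nussinov-style base-pair maximization, which is a stand-in chosen for brevity and not the nearest-neighbour free energy model used by mFold, RNAfold or RNAStructure. The function names and the minimum hairpin loop size of three nucleotides are assumptions made for this example.

```python
# Simplified illustration: maximize the number of base pairs (Nussinov recursion).
# Real free energy minimization replaces "count of pairs" with summed
# thermodynamic parameters for stacks, bulges and loops.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True for Watson-Crick and GU wobble pairs."""
    return (a, b) in PAIRS


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired nucleotides enclosed by a
    hairpin (an assumed value of 3 is used here).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: nucleotide j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

For the short hairpin GGGAAAUCC, the recursion finds three nested pairs (G–C, G–C and G–U) closing a three-nucleotide loop. The same table-filling structure underlies energy-based predictors, which score each substructure with measured free energies instead of counting pairs.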

Experimental constraints can be integrated into the algorithms to improve the reliability of the prediction22. Unpaired nucleotides can be identified both in vitro and in vivo using chemical modification by dimethyl sulfate (DMS)23 or by selective 2′-hydroxyl acylation analysed by primer extension (SHAPE)24,25,26,27,28,29. Chemical modification either pauses reverse transcriptase (RT), causing truncation of the cDNA (an ‘RT stop’), or induces a mutation, both of which can be read out by high-throughput sequencing. These sites of modification restrain secondary structure predictions, with the extent of modification correlated with an energetic penalty for a nucleotide to be paired. Transcriptome-wide DMS and SHAPE probing techniques greatly expanded knowledge of the overall RNA structural profiles in cells and enabled investigation of RNA conformational changes under various biological conditions24,25,26. Although these experimental constraints refine the predicted models, they do not provide information about the 3D fold of the RNA, as provided by X-ray crystallography, NMR spectroscopy and cryo-electron microscopy (cryo-EM).
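One widely used way to turn probing data into an energetic penalty is a logarithmic pseudo-free-energy term added for each nucleotide that the prediction places in a pair. The sketch below assumes that functional form, with slope and intercept values (2.6 and −0.8 kcal/mol) that are commonly cited defaults and are not stated in the text above; the function name and the handling of negative reactivities are also assumptions.

```python
import math


def shape_pseudo_energy(reactivity, slope=2.6, intercept=-0.8):
    """Pseudo-free-energy (kcal/mol) applied when a nucleotide is paired.

    Assumed form: slope * ln(reactivity + 1) + intercept.
    High reactivity (flexible, likely unpaired) gives a positive penalty;
    low reactivity gives a small bonus for pairing.
    """
    # Assumption: slightly negative normalized reactivities are treated as zero.
    r = max(reactivity, 0.0)
    return slope * math.log(r + 1.0) + intercept
```

Under these assumed parameters, an unreactive nucleotide (reactivity 0) receives a bonus of −0.8 kcal/mol for pairing, whereas a nucleotide with a normalized reactivity of 1 is penalized by about +1.0 kcal/mol.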

Phylogenetic comparison also provides insight into RNA structure, where genetic differences that preserve secondary structure (covariations: for example, an AU base pair is replaced with GU or GC) can elucidate structure, and conservation suggests selective pressure to retain a functional structure30,31. Covariation-based structural prediction of many RNA structures is highly accurate, as determined by comparison with crystal structures32,33. However, owing to the large dataset required and the complexity of the analysis, automated covariation-based prediction methods remain challenging. Phylogenetic comparison has been coupled with other prediction methods, including homology modelling, free energy minimization and chemical modification, either in sequence or co-processed, to improve accuracy34,35,36, although the outcome still relies on the quality of the alignment and the availability of sequences.

The folding of RNA is hierarchical, and RNAs can form tertiary structures. Information about RNA 3D structure is especially valuable in uncovering the functional mechanism of RNAs and identifying druggable targets. Although 3D prediction methods are still in their infancy compared with those for proteins, currently available programs, including FARFAR2 (ref.37) (RNA analogue of ROSETTA for protein prediction), MC-Fold/MC-Sym38 and iFoldRNA39, have shown promising results for predicting 3D structures from sequence. Scoring functions that estimate the accuracy of prediction results in which the native structure is unknown have also been developed, such as Rosetta37, RASP40 and ARES41. With the implementation of machine learning techniques, ARES outperforms the other two approaches. Root-mean-square deviation (r.m.s.d.) analysis of predicted structures to known crystal structures demonstrated the power of this approach41.

Biophysical methods such as NMR spectroscopy, X-ray crystallography and cryo-EM have also been extensively used to determine RNA structure42,43,44. NMR spectroscopy studies also provide information about structural dynamics. Although cryo-EM is typically used to study the structure of molecules of large molecular weight, considerable effort is being exerted to develop approaches that can access smaller RNA structures, for example, as recently reported for a small riboswitch (<40 kDa)45. Computationally assisted cryo-EM has also been introduced to determine the global conformation of RNA molecules, although it does not provide atomic-level resolution46.

Assessing the quality of RNA structural predictions

Many algorithms have been developed that predict RNA structure. However, without a known structure, it is hard to assess the reliability and accuracy of a prediction. Various statistical methods have been developed to address this problem.

Partition function calculations have been incorporated into various structure prediction programs, including RNAfold21, Sfold47 and CONTRAfold48. A partition function contains all of the thermodynamic information for a system and quantifies the probability of the predicted structure or substructure therein. Statistics have also been applied to phylogenetic comparisons such as R-scape, which measures the statistical significance of evolutionary covariation, indicative of functionality49. Although R-scape improved the annotation of the consensus structure in 5S rRNA from the Rfam50 database, it did not find significant covariation in the long noncoding (lnc)RNAs HOX antisense intergenic RNA (HOTAIR), steroid receptor RNA activator (SRA) or X-inactive specific transcript (Xist)49, although plausible structures have been predicted by other computational methods51,52,53. The lack of functional structures in these lncRNAs did not stem from the lack of variance in the phylogenetic tree54, nor does it imply that these lncRNAs do not fold. Instead, they likely form dynamic structures stabilized in the context of other interacting partners, pointing to the importance of knowing the limitations of predictive methods and of experimental validation.
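How a partition function converts free energies into probabilities can be shown with a toy ensemble. The sketch below assumes that the free energies of a handful of candidate structures are already known and enumerated explicitly; structure prediction programs instead compute the partition function over all possible structures by dynamic programming. The function name and the default temperature of 310.15 K are assumptions for this example.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def boltzmann_probabilities(energies, temperature_k=310.15):
    """Boltzmann probability of each structure in an enumerated ensemble.

    energies: free energies in kcal/mol, one per candidate structure.
    P(s) = exp(-dG_s / RT) / Z, where Z sums the weights of all structures.
    """
    rt = GAS_CONSTANT * temperature_k
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)  # partition function (up to the constant shift)
    return [w / z for w in weights]
```

With two structures that differ by 1 kcal/mol, the more stable one is populated roughly 84% of the time at 37 °C, which illustrates why a minimum free energy structure with close competitors should be treated as one member of an ensemble.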

Collectively, these methods demonstrate that predicted structures must be viewed through the lens of statistical power and/or rigour and tempered with in-depth dissection of their biological function. That is, these predictions are hypotheses until further validated.

Defining functional RNA structures in the transcriptome

Functional RNA structures can be found throughout a transcript from the 5′ end to the 3′ end, from untranslated regions to ORFs (Fig. 1). Although data suggest that ORFs are less structured than other regions55,56,57, highly structured coding sequences have been discovered58,59. Functional structures can be identified computationally (below) or experimentally by using ASOs that sterically block a functional structure60 or by mutational analysis61.

Fig. 1: Human RNAs regulate key biological processes, and their functions are driven by their structures.

a | Examples of structured regions of RNA with important biological functions include IRE (iron-responsive element; translational regulation), splicing modulators (alternative pre-mRNA splicing including those that interact with U1 small nuclear RNA (snRNA)), RNA repeat expansions (aberrant gain-of-function; microsatellite disorders) and Drosha and Dicer processing sites in microRNA (miRNA) precursors. PDB codes: IRE (PDB: 1NBR), intron splicing junctions (PDB: 6VA1, 6HMO), RNA repeat (PDB: 1ZEV). The model of the miRNA structure can be found in ref.108. b | Comparison of RNA and protein as therapeutic targets. Approximately 75% of the human genome is transcribed into RNA, while 1.5% is translated into protein. Common types of drug target and their modes of action are also listed.

Although many RNA structural prediction methods can provide accurate models of RNA secondary structure, they do not predict functionality. Functionality can be predicted by identifying regions of structure that are unusually thermodynamically stable compared with random sequences. The hypothesis is that these stable structures are evolutionarily retained because of selective pressure. ScanFold, a scanning window approach, uses these principles to identify potentially functional structures62, accurately predicting known functional viral structures as well as predicting potential new functional structures in viral and human transcriptomes63,64.
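The comparison with random sequences is typically expressed as a z-score computed within each scanning window. The sketch below is a minimal illustration under stated assumptions: the minimum free energies of the native window and of its shuffled counterparts are taken as precomputed inputs (a folding program would supply them), and the function names are hypothetical and not part of ScanFold.

```python
import statistics


def windows(seq, size, step):
    """Yield (start, subsequence) for each scanning window of a transcript."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def stability_z_score(native_mfe, shuffled_mfes):
    """Z-score of the native minimum free energy against shuffled controls.

    Strongly negative values indicate a window that is more stable than
    expected for its nucleotide composition.
    """
    mean = statistics.mean(shuffled_mfes)
    sd = statistics.pstdev(shuffled_mfes)
    if sd == 0:
        raise ValueError("shuffled energies have zero variance")
    return (native_mfe - mean) / sd
```

For example, a window that folds to −30 kcal/mol when its shuffled versions average −20 kcal/mol with a standard deviation of about 1.4 kcal/mol has a z-score near −7, marking it as a candidate functional structure that still requires experimental validation.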

Structural conservation and genetics can also indicate function. For the former, structural motifs that are evolutionarily conserved across species likely have a biological function30,31; for the latter, genetic mutations can cause gain or loss of function. Perhaps the best example of evolutionary conservation of structure and function is rRNA. Indeed, highly conserved structures can be found in bacteria (riboswitches), viruses (IRESs) and humans (IRESs, splicing regulatory elements).

The hunt for targetable, functional RNA structures, particularly in the human transcriptome, has only just begun. Thus far, RNA structures that have been effectively targeted with small molecules (binding produces a downstream biological response) participate in biomolecular interactions with proteins including the ribosome, other RNAs and DNA. Identifying a functional RNA structure only addresses half of the RNA-targeting problem. The other half is discovering or designing chemical matter that selectively binds to the functional structure.

Factors that affect the selectivity of small molecules targeting RNA

Ideally, a small molecule would be completely selective for its RNA target, but in practice that is likely not required or even achievable. Historically, selective recognition of RNA by small molecules was thought intractable, owing to its perceived lack of structural diversity with only four building blocks, its anionic backbone and the lack of success in high-throughput screening campaigns. Selective recognition is now possible because our fundamental understanding of the RNA structures that are targetable and the chemotypes that bind RNA has evolved, alongside advances in RNA structure modelling that provide insight into the functionality of structures.

Binding selectivity and functional selectivity are distinct. Several factors have been identified that drive the cellular (functional) selectivity of ligands that target RNA, including the uniqueness of the structure in the transcriptome, expression of the target compared with off-targets, the relative affinity of the small molecule for on- and off-targets, accessibility of the target site and the functionality of the binding site, where binding to non-functional sites is biologically silent. Regarding target expression, if two or more RNAs have the same binding site, the more highly expressed target will be more occupied by the compound. In the same vein, if two or more RNAs have different structures ligandable by the same compound, the most occupied target will be a composite of relative affinity and expression level. In cases where selectivity is not sufficient, compounds can be designed that target multiple sites within an RNA target simultaneously, thus overcoming the limitations of structural degeneracy (a structure is not unique in the transcriptome).
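The interplay between affinity and expression can be made concrete with a simple equilibrium model. The sketch below assumes single-site binding with the ligand in excess over every RNA, so that the fraction of each RNA bound is [L]/([L] + Kd); the function names, the input format and the example numbers are illustrative assumptions rather than measured values.

```python
def fraction_bound(ligand_nm, kd_nm):
    """Fraction of an RNA occupied at equilibrium (single site, ligand in excess)."""
    return ligand_nm / (ligand_nm + kd_nm)


def occupancy_share(targets, ligand_nm):
    """Share of all ligand-bound RNA copies contributed by each RNA.

    targets: dict mapping RNA name -> (copies per cell, Kd in nM).
    """
    if ligand_nm <= 0:
        raise ValueError("ligand concentration must be positive")
    bound = {
        name: copies * fraction_bound(ligand_nm, kd)
        for name, (copies, kd) in targets.items()
    }
    total = sum(bound.values())
    return {name: b / total for name, b in bound.items()}
```

In this model, two RNAs with identical binding sites (equal Kd) expressed at 100 and 10 copies account for about 91% and 9% of bound ligand, respectively. Conversely, a low-abundance RNA bound 100-fold more tightly than an abundant off-target can still capture most of the compound, reflecting the composite of relative affinity and expression level.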

A few studies have completed transcriptome- and proteome-wide analyses that provide insight into the selectivity of RNA-targeting small molecules. These studies have shown that small molecules can exert selective effects across the transcriptome and proteome. Notably, the observed changes and selectivities are similar to those observed for oligonucleotides65. An analogue of the FDA-approved drug risdiplam, which binds to an RNA–protein manifold rather than solely an RNA, is selective, altering the levels of 12 of 11,174 transcripts and altering the alternative splicing of a subset of mRNAs, including the desired target66. As observed for small molecules that target proteins, the off-targets for RNA-targeting small molecules could be other RNAs, DNA or proteins. Thus far, it appears that the scaffolds that bind to RNA are different from those that bind to proteins (Box 1), as also indicated by the lack of success in identifying selective RNA binders from small-molecule libraries designed for proteins.

Strategies to identify small-molecule RNA binders

Target-centric approaches

Various approaches can be used to find small molecules that bind RNA structures in vitro, including fluorescence-based assays, fluorescence resonance energy transfer (FRET)-based approaches67,68,69,70 and dynamic combinatorial screening71. The methods described below are target centric, that is, the small molecule’s only choice for binding is a single or a few targets. As with any primary screening assay, secondary analyses are required to identify selective binders. For all fluorescence-based assays described below, care must be taken, as many compounds can interact with the dyes themselves or have emission properties that overlap with the fluorophores.

Affinity mass spectrometry

Affinity selection mass spectrometry (AS-MS) is a label-free method that allows the direct identification of target–ligand complexes by mass spectrometry after separation from unbound ligands by size-exclusion chromatography72,73. Used widely for proteins74, it has only been recently adopted for RNA targets75,76. A variant of AS-MS, named automated ligand identification system (ALIS), uses indirect detection of a target–ligand interaction by dissociating the formed complex before liquid chromatography–mass spectrometry (LC-MS) analysis to identify the bound ligand74,75,76 (Fig. 2a). In one example of the use of ALIS, synthetic ligands that bound the flavin mononucleotide (FMN) riboswitch were identified75. One challenge associated with AS-MS is the requirement of long small-molecule residence times, such that the complex is still intact after size-exclusion chromatography.

Fig. 2: Methods to identify or design small-molecule RNA binders.

a–c | Methods to identify small molecules that bind RNA. a | Automated ligand identification system (ALIS) is a liquid chromatography–mass spectrometry (LC-MS) method. An RNA target is incubated with a library of small molecules. Unbound ligands are removed by size-exclusion chromatography and then bound ligands are identified by LC-MS. b | Fluorescence-based assays rely on a change in fluorescence upon small-molecule binding to the RNA target. This could be achieved by: displacing a non-selective RNA-binding dye (top); changing the microenvironment of a fluorescent nucleotide analogue (middle); or disrupting donor–acceptor pairs in a fluorescence resonance energy transfer (FRET)-based assay (bottom). c | Microarray-based screening in which a panel of small molecules is pinned to an array surface and incubated with labelled RNA targets, followed by washing and imaging to identify target-binding compounds. d | Compounds functionalized with a cross-linking module (such as diazirine or chlorambucil) and a pull-down tag (such as alkyne or biotin) can be screened against labelled RNA targets by using Chem-CLIP (chemical cross-linking and isolation by pull-down) to identify RNA binders and to map the binding sites. e | Identification of RNA-binding small molecules from a DNA-encoded library (DEL). The library is synthesized on beads, and each building block added during the synthesis is encoded with a DNA tag. The DEL is screened simultaneously for binding to the target of interest and a related RNA to which binding is undesired. The two RNAs are labelled with different fluorophores, and selective binders from the DEL can be identified and isolated by flow cytometry. f,g | Methods to design small molecules that bind RNA. f | Inforna is a lead identification strategy in which the structures present in a cellular RNA are compared with a database of experimentally determined RNA–small molecule interactions. Overlap affords lead targets and lead small molecules. g | Structure-based design of small molecules relies on a model of the structure of the RNA or of the RNA–ligand complex. Both can be used in docking studies while the latter can be used to guide modifications that improve interactions between the RNA and the small molecule. 2-AP, 2-aminopurine; RFU, relative fluorescence units.

Fluorescence-based assays

Another high-throughput assay relies on displacement of a fluorescent dye or compound by a small molecule of interest. Although originally developed to study the binding of DNA structures77, it has also been applied to RNA targets by displacement of the fluorescent dye TO-PRO-1 (refs.78,79) (Fig. 2b), as well as others79,80,81,82. An extension of this dye displacement method is an assay with turn-on fluorescence that uses a known fluorescent or fluorescently labelled binding small molecule and an RNA of interest that is labelled on the 5′ or 3′ end with a quencher70.

Small-molecule binding can also be assessed by replacing an adenine residue with the fluorescent mimic 2-aminopurine (2-AP)83. The fluorescence of 2-AP depends on its microenvironment, which changes upon small-molecule binding (Fig. 2b). This 2-AP assay was originally developed to study the binding of aminoglycosides to bacterial rRNA and the effect of binding on A-site dynamics84, but has been extended to other targets70,85,86. Notably, the position of the 2-AP substitution within the RNA should be carefully chosen to ensure a strong signal. For small RNAs, this can be accomplished by simply substituting each adenine residue. For longer RNAs, such as riboswitches, SHAPE can be used to elucidate where large conformational changes occur upon ligand binding, and hence the optimal position (or positions) for 2-AP substitution85,86.

Many bioactive small molecules have been identified that bind RNAs participating in bimolecular interactions with proteins. To measure inhibition of formation of an RNA–protein interaction or disruption of a pre-formed complex by a small molecule, fluorescence and FRET assays have been developed, particularly for HIV-1 Tat–transactivation response (TAR)87,88,89 and for RNA repeat expansions–RNA-binding proteins69,90. Labels on the RNA and protein are FRET pairs, and disruption or inhibition of the complex reduces the observed FRET signal (Fig. 2b). Small molecules that bind either the protein or the RNA can reduce FRET, and thus additional investigation is required to confirm that the small molecule binds the RNA as intended.

Microarray-based screening

Small-molecule microarrays (SMMs), created by delivery of minute amounts of compounds to glass slides in a spatial array, were initially used to interrogate protein binding91,92,93,94 and later extended to study the binding of aminoglycosides to the rRNA A-site95 and how binding is affected by aminoglycoside modification by resistance enzymes96. SMMs have now been used to screen a wide variety of compounds and RNA targets97,98,99,100,101,102,103 (Fig. 2c). One advantage of SMMs is that only a small amount of the small molecule is needed to complete a screen and many thousands of interactions can be profiled at once. Compounds are typically covalently attached to the array. Notably, binding to a surface can be quite different from binding in solution. Small molecules can also be non-covalently attached to agarose-coated microarray surfaces by adsorption104. Although this method can be broadly applied to many compounds104, not all compounds adhere to surfaces.

Fragments

Fragment-based ligand discovery uses libraries of low-molecular-weight compounds to efficiently explore chemical space that might bind the target of interest (Fig. 2d). Although fragments can be screened for binding, such as by NMR spectroscopy (as demonstrated to identify a fragment that binds severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA105), these interactions are often difficult to detect as they are low affinity and have short residence times. To overcome these limitations, small-molecule fragments have been functionalized with photoaffinity groups (fully functionalized fragments; FFFs) that enable capture and identification of bound targets, first applied to proteins106,107 and later to RNA105,108. Low-molecular-weight fragments are ideal for a modular assembly approach in which two fragments are tethered together to bind two structural elements in an RNA target simultaneously, as favourable physicochemical properties can be maintained (discussed below).

DNA-encoded compound libraries

DNA-encoded compound library (DEL) technology is a powerful method to explore chemical space that binds a target biomolecule either in solution or on the solid phase. A DEL is synthesized on beads by split and pool in an iterative process in which one of many building blocks is conjugated and its identity is encoded in a short DNA tag that is ligated to the bead. Compound-functionalized beads are screened for binding, typically to a fluorescently labelled target, often in the presence of an off-target that is differentially labelled (Fig. 2e). For RNA targets, a counter-screen can be completed by using an RNA in which the desired binding site has been mutated, akin to mutations for binding analyses (Fig. 2e). For example, a bulged nucleotide could be converted into a base pair or a different bulge, the closing base pairs could be altered and so on. Beads that bind the desired target but not the off-target decoy can be analysed and sorted by flow cytometry. Deep sequencing of the beads identifies the binding compound. Because of the potential for nonspecific binding of the encoding DNA tag, nucleic acids and their binding proteins have generally been avoided as DEL targets. Thus far, the DEL technology has been applied only twice to nucleic acid targets: RNA109 and highly structured DNA G-quartets110.

Designing small-molecule RNA binders

Rather than screening for chemical matter, RNA binders have been designed using two approaches: selection-based methods that define the preferred targets of RNA-binding small molecules, and structure-based design.

Defining binding landscapes

A selection-based platform, 2D combinatorial screening (2DCS), defines the binding landscape of an RNA-binding small molecule. Small molecules displayed on a microarray select their preferred RNA partners from an RNA library displaying a randomized region in a discrete secondary structure pattern111. The selection experiment is completed under stringent conditions in the presence of a large excess of competitor oligonucleotides that mimic regions common to all members of the library, restricting binding interactions to the randomized region. Fully paired DNA and RNA oligonucleotide competitors or tRNAs are also often used to increase the stringency of the selection. Selected RNAs are analysed by RNA sequencing (RNA-seq)112,113, followed by rigorous statistical analysis of the enrichment of an RNA by the small molecule114. Statistical significance scales with affinity such that the more significant the enrichment of the RNA, the more tightly it binds to the small molecule. This analysis affords a binding landscape, or molecular fingerprint, for each small molecule, where a selective small molecule has few RNAs with statistically significant enrichment and a promiscuous one has many. These binding landscapes inform the ideal target for a small molecule and potential off-targets. Cellular RNA targets can be computationally compared with these molecular fingerprints to inform small-molecule design, as in the lead identification strategy Inforna65 (Fig. 2f). Inforna outputs the targetable sites present in a cellular RNA and the rank order of potential small-molecule binders. This approach has been implemented in both a target-centric and a target-agnostic fashion65.
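The idea of scoring enrichment statistically can be sketched with a pooled two-proportion z-test, which compares how often an RNA appears among the sequences selected by a small molecule with how often it appears in the starting library. This particular test is an assumption chosen for illustration and is not necessarily the analysis used in the cited work; the function name and read counts are hypothetical.

```python
import math


def enrichment_z(selected_count, selected_total, library_count, library_total):
    """Pooled two-proportion z-score for enrichment of one RNA motif.

    selected_*: reads for this RNA and total reads after selection.
    library_*:  reads for this RNA and total reads in the starting library.
    """
    p_selected = selected_count / selected_total
    p_library = library_count / library_total
    pooled = (selected_count + library_count) / (selected_total + library_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / selected_total + 1 / library_total))
    if se == 0:
        raise ValueError("standard error is zero; counts carry no information")
    return (p_selected - p_library) / se
```

An RNA observed 50 times in 1,000 selected reads but only 10 times in 1,000 library reads scores a z of about 5.2. Ranking all library members by such a score yields the kind of fingerprint described above, in which a selective small molecule has few significantly enriched RNAs and a promiscuous one has many.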

Structure-based design and docking

First employed for protein targets, structure-based design and docking have enabled the discovery and the optimization of lead compounds for RNA targets115,116,117,118,119,120,121 (Fig. 2g). Using well-defined RNA structures, typically elucidated by NMR spectroscopy, small molecules can be designed to fit within binding pockets. Further, NMR studies also enable the prediction of dynamic ensembles of RNA conformations computationally, including short-lived, non-functional species120,121,122,123,124,125. Small-molecule libraries can then be docked into these ensembles to screen small molecules for selective RNA binding in silico121,126. The significant limitations of these approaches are the quality of the RNA structure, as highly accurate structures are difficult to generate by NMR spectroscopy, as well as the docking programs themselves. Interestingly, an NMR method was recently developed to study conformational biases in HIV-1 TAR and Rev responsive element (RRE) RNAs120. A series of mutations were made to each RNA, and the effect on conformational equilibria was measured by NMR spectroscopy. These spectral studies can estimate the percentage of the RNA folded into a non-functional fold, confirmed by studying the cellular activity of the RNA mutants120.

Phenotypic screening

Phenotypic screening is a strategy to identify compounds that affect pathways associated with a specific phenotype and therefore requires no knowledge of the MOA or target. These assays are designed around a biological process, for example, alternative pre-mRNA splicing, inhibition of translation, derepression of protein targets and bacterial growth. Assays completed in mammalian cells typically use generation of either luciferase or a fluorescent protein as a readout. Two small molecules that modulate splicing, the FDA-approved risdiplam66 and branaplam15 (currently in phase II clinical trials), were discovered from phenotypic screens for the treatment of spinal muscular atrophy (SMA)15.

Although phenotypic screens have been executed successfully, challenges remain, largely because the approach is target agnostic. Despite the fact that target validation and mechanistic studies can prove difficult, phenotypic screens have enabled lead optimization, particularly by using structure-based drug design. The reader is referred to ref.127 for a review on the validation of phenotypic screens of RNA targets.

Target validation and selectivity of RNA-targeting small molecules

Direct target engagement is key to defining the MOA of a compound. Below, we describe various target validation methods based on covalent bond formation between the small molecule and the RNA target, cleavage of the RNA target and resistance profiling. Except for resistance profiling, these methods can study target engagement in vitro, in cells or in vivo.

Fortuitously, some methods not only assess engagement of the desired target but also measure selectivity. Here, we note again the difference between binding selectivity and functional selectivity. Binding selectivity, often measured in vitro by introducing point mutations into a model of the binding site of the small molecule, measures relative affinity and hence the extent of target occupancy. Engagement of most structures in a target RNA is biologically silent with no functional consequence. In contrast, functional selectivity measures whether target engagement has a biological effect — the function of the RNA is modulated as assessed by changes in the downstream pathways of the target and by its associated phenotype.

Resistance profiling

Resistance profiling exerts selective pressure to induce mutations in bacterial or viral genomes that confer resistance (Fig. 3a). The genomes of resistant strains are sequenced to identify in which genes mutations have occurred, thus identifying the target of the compound. Such an approach was used to validate the riboswitch targets of ribocil (discussed below), roseoflavin and pyrithiamine11,128,129 (Table 1). By comparison with covalent- and cleavage-based target validation methods, mutational resistance profiling does not require synthesis of chemical probes and overall has fewer experimental steps. However, the approach, which is also frequently used in cancer biology130, requires the small molecule to exert sufficient evolutionary pressure to induce mutational resistance.

Fig. 3: Methods of target validation for small molecules that target RNA.

a | Resistance profiling is applicable when the small molecule exerts enough selective pressure to induce mutations that confer resistance. These mutations, identified by sequencing, reveal the targets of the small molecule. b | The target validation method Chem-CLIP (chemical cross-linking and isolation by pull-down) generates a covalent bond between a small-molecule probe and its targets, which are isolated and purified by bead pull-down. Bona fide targets are those enriched in the pulled-down fractions, as compared with the starting cell lysate. Upon co-treating increasing concentrations of the lead compound with Chem-CLIP probe, a dose-dependent restoration of the RNA target in the pulled-down fractions would indicate target engagement of the lead compound. c | ASO-Bind-Map is based on previous studies that show that structured regions of RNA are protected from antisense oligonucleotide (ASO) hybridization and hence ribonuclease (RNase) H degradation135,136. Thus, small molecules that bind to and stabilize the structures of RNA targets can elicit a protective effect against ASO-mediated degradation. d | An RNA degrader can cleave its bound RNA target either directly (bleomycin) or by recruiting endogenous RNases (RIBOTAC). Upon co-treating increasing concentrations of the lead compound with degrader probe, a dose-dependent ablation of degradation would indicate target engagement of the lead compound.

Table 1 Examples of small molecules that bind to RNAs implicated in infectious disease

Covalent bond formation to measure direct target engagement

Cellular target validation methods for RNA have been recently developed based on covalent bond formation with65,131,132,133, or cleavage of, the target65. One covalent method, chemical cross-linking and isolation by pull-down (Chem-CLIP; Fig. 3b), developed in 2013 (ref.134), relies on functional modification of small molecules, attaching a cross-linking module (for example, diazirine or chlorambucil) and a purification handle (for example, biotin or an alkyne)65. The RNA-binding module drives target engagement, bringing the cross-linking module into proximity to the RNA such that they react either directly (electrophilic in the case of chlorambucil) or upon irradiation (diazirine). Small molecule–target complexes are captured and purified with beads (streptavidin- or azide-functionalized). Quantitative PCR with reverse transcription (RT–qPCR) or RNA-seq and subsequent statistical analysis are used to analyse the enrichment of RNA targets in the pulled-down fraction, as compared with the starting lysate. Although the exact protocol may vary depending upon the chemistry of cross-linking133 and purification modules, the fundamental idea of Chem-CLIP remains unchanged: transformation of dynamic reversible binding to covalent bonds by cross-linking and amplifying the signal by pull-down enrichment. Chem-CLIP experiments are completed side by side with a probe that lacks the RNA-binding module, controlling for nonspecific reaction of the cross-linking module. Indeed, this approach has validated the RNA targets of small molecules65. This method, however, requires modification of the lead compound, which can be hampered when the synthesis is challenging or if molecular recognition has not been sufficiently defined to inform a site within the small molecule not required for molecular recognition.
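
As a rough illustration of how enrichment in the pulled-down fraction might be scored from RT–qPCR data, the sketch below compares threshold cycle (Ct) values for the starting lysate and the pulled-down fraction, and then divides the result for the Chem-CLIP probe by that for the control probe lacking the RNA-binding module. The function names, the assumption of perfect amplification efficiency (a doubling per cycle) and the omission of reference-gene normalization are simplifications made here for illustration, not details of the published protocol.

```python
def fold_enrichment(ct_lysate, ct_pulldown, efficiency=2.0):
    """Relative abundance in the pull-down versus the starting lysate.

    Assumes each PCR cycle multiplies the template by `efficiency`
    (2.0 = ideal doubling), so a lower Ct in the pull-down means enrichment.
    """
    return efficiency ** (ct_lysate - ct_pulldown)


def specific_enrichment(probe_cts, control_cts, efficiency=2.0):
    """Enrichment by the Chem-CLIP probe relative to the control probe.

    Each argument is a (ct_lysate, ct_pulldown) pair. Dividing by the control
    corrects for nonspecific reaction of the cross-linking module.
    """
    probe = fold_enrichment(*probe_cts, efficiency=efficiency)
    control = fold_enrichment(*control_cts, efficiency=efficiency)
    return probe / control


if __name__ == "__main__":
    # Hypothetical Ct values for a single transcript.
    print(specific_enrichment((25.0, 22.0), (25.0, 24.0)))
```

A value well above 1 for a transcript would be consistent with probe-dependent enrichment, although any threshold for calling a target would need to come from replicate measurements and statistical analysis.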

Chem-CLIP can also be used to study target binding and functional selectivity, when the pulled-down fractions are analysed by RNA-seq. The enrichment of a transcript indicates the extent of its occupancy by the small molecule, affording the binding selectivity of the small molecule (see ‘Quantifying selectivity’ below). Occupancy of a target RNA is not sufficient for a biological response; the small molecule must bind a functional site. Functional selectivity can be defined by comparing target occupancy with the effect on target expression by RNA-seq analysis upon treatment with the lead compound. Many RNA targets will be occupied but their expression unaffected. Pathway analysis of the RNA-seq data also provides evidence that supports or refutes an on-target MOA. Identifying off-targets is key for lead optimization, and fortuitously, Chem-CLIP can identify the exact binding site within a cellular RNA, a method named Chem-CLIP-Map (ref.65).
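
The distinction between binding selectivity and functional selectivity can be sketched as a simple comparison of two per-transcript data sets: enrichment in the pulled-down fraction (occupancy) and the change in expression upon treatment with the lead compound. The function name, the category labels and both cut-offs below are hypothetical choices made for illustration; in practice, calls would rest on replicate data and statistical testing.

```python
def classify_transcripts(enrichment, expression_log2fc,
                         min_enrichment=2.0, min_abs_log2fc=1.0):
    """Label each transcript by occupancy and by change in expression.

    enrichment: transcript -> fold enrichment in the pulled-down fraction.
    expression_log2fc: transcript -> log2 fold change upon lead treatment.
    Both thresholds are illustrative assumptions.
    """
    calls = {}
    for transcript in sorted(set(enrichment) | set(expression_log2fc)):
        occupied = enrichment.get(transcript, 0.0) >= min_enrichment
        changed = abs(expression_log2fc.get(transcript, 0.0)) >= min_abs_log2fc
        if occupied and changed:
            calls[transcript] = "occupied_and_affected"   # candidate functional target
        elif occupied:
            calls[transcript] = "occupied_but_silent"     # bound at a non-functional site
        elif changed:
            calls[transcript] = "indirect"                # altered without direct binding
        else:
            calls[transcript] = "unaffected"
    return calls


if __name__ == "__main__":
    # Hypothetical values for four transcripts.
    occupancy = {"A": 5.0, "B": 4.0, "C": 1.0}
    expression = {"A": -2.0, "B": 0.1, "C": 1.5, "D": 0.0}
    for name, call in classify_transcripts(occupancy, expression).items():
        print(name, call)
```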

A variant of Chem-CLIP, competitive-Chem-CLIP (C-Chem-CLIP), defines the targets of the lead compound. Briefly, cells are co-treated with a constant concentration of the Chem-CLIP probe and increasing concentrations of the lead compound. If the two molecules compete for the same binding site, a dose-dependent decrease in target enrichment should be observed as a function of lead compound concentration. C-Chem-CLIP can be used to screen other molecules for binding to the same RNA target to generate a structure–activity relationship (SAR).
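
One way to summarize such a competition series is to estimate the lead compound concentration at which target enrichment falls to half of the probe-only value. The sketch below does so by interpolating linearly on a log-concentration axis between the two bracketing doses. This is an illustrative simplification (a full dose-response curve fit would normally be preferred), and the function name and example numbers are hypothetical.

```python
import math


def competition_ic50(lead_conc, enrichment, probe_only_enrichment):
    """Estimate the lead concentration that halves probe-only enrichment.

    lead_conc: positive lead compound concentrations (any consistent unit).
    enrichment: target enrichment measured at each concentration.
    Returns None if the series never crosses the half-maximal level.
    """
    half = 0.5 * probe_only_enrichment
    points = sorted(zip(lead_conc, enrichment))
    for (c_lo, e_lo), (c_hi, e_hi) in zip(points, points[1:]):
        if e_lo >= half >= e_hi and e_lo != e_hi:
            # Fraction of the way from the lower to the higher dose.
            frac = (e_lo - half) / (e_lo - e_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical enrichment values at 0.1, 1, 10 and 100 (arbitrary units).
    print(competition_ic50([0.1, 1.0, 10.0, 100.0], [9.0, 7.5, 2.5, 1.0], 10.0))
```

Comparing such values across analogues is one plausible way to tabulate the SAR mentioned above.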

Target cleavage to assess occupancy

Complementary cellular target validation strategies have been developed that rely on the competitive cleavage of RNA targets, including ASO-Bind-Map (Fig. 3c) and competition with RNA degraders65 (Fig. 3d).

In ASO-Bind-Map, an ASO gapmer complementary to the target RNA sequence competes with a structure-binding small molecule13. In the absence of small molecule, the ASO induces cleavage by ribonuclease (RNase) H, resulting in reduced abundance of the target RNA. If a small molecule binds to the same region, it impedes hybridization of the ASO and thus diminishes target depletion. This strategy was inspired by studies that used ASOs to probe RNA folding, in which regions of RNA that fold quickly into stable structures are less accessible to ASO hybridization135,136. In contrast to Chem-CLIP, ASO-Bind-Map does not require modification of the lead compound, but does require knowledge of the binding site within the cellular RNA and that the small molecule stabilizes the structure of the RNA sufficiently to impede ASO hybridization. ASO-Bind-Map cannot be used to study selectivity transcriptome-wide, as it would require an ASO to every structure in every target. If an off-target is suspected, ASO-Bind-Map could prove or disprove target engagement.

Methods have also been developed in which the small molecule directly cleaves the RNA target or recruits a nuclease to do so, identifying both on- and off-targets by their depletion upon analysis by RNA-seq, akin to ASOs. To directly cleave RNA targets, a method dubbed small molecule nucleic acid profiling by cleavage applied to RNA (RiboSNAP) uses a lead molecule conjugated to the natural product bleomycin. Bleomycin induces strand scission of both DNA and RNA through a metal-ion-dependent oxidative process137,138,139,140,141,142. Bleomycin A5 is typically conjugated through its terminal amine, which drives affinity for DNA; alkylation of the amine reduces DNA damage by the conjugate and directs its activity for the RNA target143. As with Chem-CLIP, side-by-side experiments are completed with a probe that lacks the RNA-binding module to control for non-selective reaction of bleomycin. The cellular targets of the small molecule–bleomycin conjugate and the control probe are identified by their depletion in RNA-seq data where bona fide targets are those cleaved only by the conjugate. Likewise, the lead compound can be studied directly in a competition experiment. As with Chem-CLIP, RiboSNAP is a robust method that can measure cellular binding selectivity by analysing the extent to which each transcript is cleaved. Functional selectivity is measured by coupling these cleavage data with the effect of the lead compound on expression levels, where indirect target expression levels are altered by the lead compound, but not depleted by the small molecule–bleomycin conjugate. RiboSNAP can also elucidate the small-molecule binding site within an RNA target.

As an alternative to direct cleavage, RNA targets can be cleaved by small-molecule chimeras that recruit a nuclease, termed ribonuclease-targeting chimeras (RIBOTACs)144. A RIBOTAC comprises an RNA-binding module tethered to an RNase L-recruiting small molecule. RNase L, which functions in the antiviral immune response, is normally expressed in minute amounts as an inactive monomer. Its activity is tightly regulated transcriptionally, post-transcriptionally and by an RNase L inhibitor145,146,147. Upon viral infection, 2′–5′-oligoadenylate (2′-5′A) is synthesized, which dimerizes and activates RNase L, inducing degradation of viral RNA148,149. Previous studies have shown that RIBOTACs recruit and activate RNase L locally and do not elicit an immune response143,150. As with Chem-CLIP and direct cleavage methods, the RNA-binding molecule drives engagement of the target while a nuclease-recruiting module activates RNase L to induce cleavage; bound targets are depleted in RNA-seq analysis. To study targets engaged by the lead compound, a competition experiment can be used in which the levels of bona fide targets are restored as a function of lead compound concentration. For both direct and indirect cleavage methods, pathway analysis can be performed to evaluate on-target and potential off-target effects. An important advantage of cleavage strategies for target validation is that they do not require purification and isolation of cross-linked material; instead targets can be defined by simple RNA-seq analysis.

Quantifying selectivity

Quantifying the effect of a compound across the transcriptome, coupled with defining direct targets and pathway analysis (at the transcriptome and/or proteome level) provides insight into small-molecule selectivity. An important metric to quantify small-molecule selectivity such that small molecules can be compared is the Gini coefficient151,152. The targets of a small molecule are rank ordered by their percentage inhibition, and the cumulative fraction and the total cumulative effect for each are calculated. The former is a function of the number of targets studied and the latter is a weighted inhibition. The two values are plotted against each other, and the Gini coefficient is equal to 1 − 2B where B is the area under the resulting curve153. For a review of Gini coefficients as applied to RNA, the reader is referred to ref.152.
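
The calculation described above can be sketched in a few lines. The version below assumes that targets are ranked in ascending order of percentage inhibition, that the cumulative effect is expressed as a fraction of the summed inhibition, and that the area B is obtained with the trapezoidal rule; these are common conventions for Gini coefficients but are assumptions here rather than details stated in the text, and the function name is hypothetical.

```python
def gini_coefficient(percent_inhibition):
    """Gini coefficient (1 - 2B) for a list of per-target inhibition values.

    B is the area under the curve of cumulative fraction of total inhibition
    (y axis) against cumulative fraction of targets (x axis).
    """
    values = sorted(float(v) for v in percent_inhibition)
    if not values:
        raise ValueError("at least one target is required")
    if values[0] < 0:
        raise ValueError("inhibition values must be non-negative")
    total = sum(values)
    if total == 0:
        return 0.0  # no effect on any target: treat as perfectly uniform
    n = len(values)
    area = 0.0
    running = 0.0
    previous = 0.0
    for v in values:
        running += v
        current = running / total
        # Trapezoid spanning one step of width 1/n on the x axis.
        area += (previous + current) / (2.0 * n)
        previous = current
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    print(gini_coefficient([50, 50, 50, 50]))  # uniform effect across targets
    print(gini_coefficient([0, 0, 0, 100]))    # effect confined to one target
```

Under these conventions, a compound that affects all targets equally scores 0, and the score approaches 1 as the effect becomes concentrated on a single target within a large panel.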

Lead optimization of RNA-targeting small molecules

One advantage of small-molecule therapeutics is that medicinal chemistry efforts can be applied to optimize the initial hit compound, improving both its potency and its physicochemical properties. Below, three broad strategies to lead-optimize RNA-targeting small molecules — including traditional medicinal chemistry as well as structure-guided and modular assembly approaches, which are often integrated synergistically — are discussed.

Traditional medicinal chemistry approaches

Traditional medicinal chemistry optimization typically starts with analogue synthesis or purchase to establish the SAR around the hit compound (Fig. 4a). The bioactivity of an RNA-targeting ligand can be affected by multiple factors, such as binding affinity, cellular permeability and the number of off-targets. SAR is typically defined by a combination of biophysical and cellular assays to evaluate hit analogues. After acquiring sufficient SAR data, pharmacophore modelling and subsequent chemical similarity searching can then diversify and optimize the starting scaffold. Among all scaffolds, those with desirable or optimizable physicochemical properties are prioritized. Traditional medicinal chemistry can be applied to essentially any small-molecule candidate, tempered by the resources required to synthesize or purchase a large number of analogues. Below, we describe how medicinal chemistry approaches were used to optimize risdiplam, a treatment for SMA, and a pyrimido-indole that directs MAPT alternative splicing. The reader is referred to refs.154,155,156,157,158,159 for other notable examples of lead optimization of RNA-targeted small molecules using medicinal chemistry.

Fig. 4: Strategies for lead optimization of RNA-targeted small molecules.

a | Traditional medicinal chemistry optimization begins with hit expansion and analogue synthesis to generate structure–activity relationships (SARs), which can be used to optimize lead compounds as well as to discover new scaffolds by pharmacophore modelling. b | Structure-guided lead optimization relies on structure modelling of the RNA target to perform virtual screening and to inform compound design based on ligand–RNA interactions. c | Sequence-based lead optimization explores the differences between structures of on-target and off-targets and modifies the lead compound based on structural features unique to the on-target, that is, modular assembly or dimerization. PK, pharmacokinetic; pri, primary.

Structure-guided approaches

Structure-guided approaches for lead optimization rely on sophisticated models of a ligand–RNA complex to define key interactions and identify how positions within the compound’s structure can be modified to improve interactions or eliminate those that are unfavourable (Fig. 4b). Such models are typically experimentally generated by NMR spectroscopy or X-ray crystallography. The dynamic nature of RNA must be considered, as no single static structure can truly represent all possible conformations. To address this limitation, experimental models are often coupled with molecular dynamics simulations to generate an ensemble of structures, where the algorithm simulates RNA conformations on the basis of experimentally determined parameters.

Molecular dynamics-based virtual screening has successfully identified ligands that bind various RNAs, typically using models from NMR spectroscopy or X-ray crystallography. As these structures do not capture the dynamic nature of RNA, and molecular dynamics simulations do not accurately recapitulate RNA electrostatics, docking of small-molecule ligands is not as predictive as it could be. To overcome these challenges, a docking method that uses the dynamic ensemble of RNA conformations from combined NMR and molecular dynamics approaches has been developed, an example of which is provided below126,160. The advantage of structure-based approaches is that they minimize the synthesis efforts by rationally designing analogues, but the limitation is that not all target structures are readily attainable.

Modular assembly approaches

One of the many factors that affect the activity of small molecules that target RNA is the uniqueness of the structural motif throughout the transcriptome. A strategy to optimize small molecules with activity against degenerate functional structures is to exploit adjacent structural elements, that is, a modular assembly approach in which a single small molecule binds to structural elements that are uniquely juxtaposed (Fig. 4c). Indeed, polyvalency — used in various biological processes to enhance avidity and specificity, from cell–cell interactions to gene expression and viral infections — has also been exploited in drug discovery efforts to agonize or antagonize receptors161,162. Such an approach has been taken for RNA targets by modularly assembling two or more RNA-binding modules65,163,164,165,166, some with in vivo activity167. One example is described below in which a modular assembly approach broke the degeneracy of two motifs, affording a specific inhibitor of miR-515 biogenesis. Modular assembly has been employed for RNA repeat expansions (Box 2), which display periodic arrays of internal loops.

Examples of small-molecule RNA binders

Several examples in which small-molecule RNA binders perturb downstream biology in a precise and predicted manner are highlighted below. Notably, the small molecules directly engage a functional site, and simple binding is sufficient to exert a biological effect (Box 1 and Tables 1–3).

Table 2 Examples of small molecules that bind to RNAs implicated in cancer
Table 3 Examples of small molecules that bind to RNAs implicated in neuromuscular and neurodegenerative diseases

Targeting functional RNA structures

Many RNA classes have functional sites, for example, Drosha and Dicer processing sites within miRNA precursors, IRESs in viral RNAs and some human mRNAs, bacterial riboswitches, splicing enhancers and silencers in pre-mRNAs, and regulatory structures in 5′ and 3′ untranslated regions (UTRs). Below, we describe the discovery of exemplar small molecules that bind to these functional sites.

Discovery of small molecules that bind viral RNAs

Apart from the bacterial ribosome, noncoding regions have also been investigated as drug targets to treat viral infections such as those caused by HIV160,168 and hepatitis C virus (HCV)12. The first viral regulatory elements targeted by small molecules that interfere with the infection process were HIV TAR and RRE RNAs169,170,171,172,173,174. Structure-guided approaches have been used to identify new scaffolds for TAR RNA that inhibited Tat-mediated transcriptional activation in a cellular model. One such scaffold, amiloride (Table 1), was later lead-optimized, affording an analogue with >100-fold enhanced affinity compared with the entry compound, and its structural interactions with the RNA target were well characterized126,160.

The HCV mRNA harbours a highly structured and conserved IRES in its 5′UTR, which initiates translation by recruiting and assembling host ribosomes175,176,177 and can be inhibited by small-molecule binding67,178,179. A mass spectrometry-based screen identified 2-aminobenzimidazole derivatives as both binders and inhibitors of HCV translation (Table 1). The affinity of the lead scaffold was improved ~100-fold after SAR studies158,180.

Another amiloride was discovered that inhibits viral replication of the betacoronavirus OC43 and SARS-CoV-2 (ref.181). A panel of 23 amilorides was screened for inhibition of the infectivity of OC43, affording three lead molecules. Sequence conservation of the 5′-end of betacoronaviruses suggested a functional role, and this region is predicted to fold into six stem–loop structures (SL1–SL6). Fortuitously, one amiloride bound to SL6, as determined by a dye displacement assay and NMR spectral studies181.

Discovery of an antifungal that targets a group II intron

Group II introns are a class of self-splicing ribozymes found in mitochondria of fungi and yeast but not in mammals182, making them perhaps ideal targets for the development of antifungals. An in vitro screen of ~10,000 compounds for inhibitors of group II intron splicing afforded six hits that served as starting points154. Pharmacophore analysis enabled by SAR studies provided alternative scaffolds with improved physicochemical properties and ultimately a low micromolar inhibitor of group II intron splicing with antifungal activity comparable to that of amphotericin B.

Targeting human miRNAs with small molecules

miRNAs, small noncoding RNAs of 20–25 nucleotides, are key players in post-transcriptional gene regulation, silencing gene expression by association with the Argonaute RNA-induced silencing complex (RISC) and base pairing to the 3′UTR of complementary mRNAs8,183,184. Transcribed by RNA polymerase II as primary transcripts (pri-miRNAs), they are processed stepwise, first by the nuclease Drosha into precursor miRNAs (pre-miRNAs); after export to the cytoplasm, the mature (functional) miRNA is generated by the nuclease Dicer149,185. Fortuitously, their structures can be accurately modelled from sequence, and processing (functional) sites can be deduced from deep sequencing of mature miRNAs. Many studies have shown that Drosha or Dicer processing sites within miRNA precursors are targetable functional structures that can be short-circuited to alleviate disease159,167,186,187,188,189,190.

Compounds that bind functional structures in miRNA precursors and selectively inhibit the biogenesis of oncogenic miRNAs have been identified191. In particular, a spermine–amidine conjugate (PA-1) was discovered using an assay that measures enzymatic inhibition of pre-miR-372 processing192 (Fig. 5a and Table 2). In vitro, the molecule, without optimization, bound to the pre-miR-372 Dicer site, inhibited its processing and reduced cancer cell proliferation by derepression of its downstream target large tumour suppressor kinase 2 (LATS2)192. The selectivity of the small molecule was studied miRnome-wide, which revealed that only a small subset of miRNAs were affected. The conjugate also reduced formation of tumour cell spheroids of patient-derived cancer stem cells192. PA-1 was later optimized to afford the pre-miR-372-selective small molecule PA-3 (ref.16) (Table 2). Interestingly, a neomycin-nucleobase-amino acid conjugate that inhibited pre-miR-372 and pre-miR-21 was discovered by the same laboratory; lead optimization by structure-based design afforded a selective inhibitor of pre-miR-21 (ref.193) (Table 2).

Fig. 5: Small-molecule binding to RNA structures elicits biological effects through various modes of action.

a | A small molecule that binds the Dicer site of oncogenic pre-miR-372 inhibits its biogenesis and short-circuits downstream pathways. b | A compound that binds to an iron-responsive element (IRE) in the 5′ untranslated region (UTR) of SNCA mRNA inhibits its translation by mechanically blocking ribosomal assembly and polysome loading onto the mRNA. c | A small molecule that binds to a structural element at the exon 10–intron junction of MAPT pre-mRNA directs alternative splicing towards exon exclusion, reducing the amount of toxic 4R tau produced. d | A small molecule inhibits translation by binding to the flavin mononucleotide (FMN) riboswitch and inducing formation of the sequester loop that hides the start codon. e | Another small molecule, later optimized into an FDA-approved drug for the treatment of spinal muscular atrophy, binds to the exon 7–intron junction in SMN2 pre-mRNA and increases SMN2 protein levels by acting as a molecular glue for the RNA and splicing machinery and promoting exon inclusion. f | A compound binds to polypurine sequences in 5′UTRs of a subset of mRNAs and inhibits translation by acting as a molecular glue for the RNA and eukaryotic initiation factor 4A (eIF4A). IC50, half maximal inhibitory concentration; Kd, dissociation constant; miR, microRNA; MOA, mode of action.

Modular assembly has been used to overcome the degeneracy of functional RNA structures in the transcriptome by targeting two adjacent structures in the RNA target with a single molecule. Such an approach was taken to distinguish between human pri-miR-515 and pri-miR-885 to afford a selective molecule that inhibits only the biogenesis of miR-515 (ref.194) (Fig. 4c and Table 2). The lead molecule, a bis-benzimidazole with steric bulk that ablates DNA binding, prefers a structure harboured in the Drosha processing sites of both pri-miR-885 and pri-miR-515, a 5′UCU/3′AUA internal loop. As expected, the compound inhibited the biogenesis of both miRNAs to a similar extent in cells. To drive the selectivity towards a single target, differences in the secondary structure of the two pri-miRNAs were identified and exploited. A second internal loop that binds the substituted bis-benzimidazole is present adjacent to the Drosha site of miR-515, but not that of miR-885. Therefore, a dimeric compound, dubbed Targaprimir-515, consisting of two copies of the original hit, was designed. By comparison with the monomeric ligand, which showed similar affinity for both RNA targets, Targaprimir-515 had no measurable binding to RNA with a single binding site and a 3,200-fold improvement in affinity to pri-miR-515 (ref.194). The enhancement of selectivity in cells was also evident by miRNA profiling and global proteomics. The self-structure of Targaprimir-515 contributed to the improved in vitro and cellular selectivity194. Collectively, these and other studies65 confirm that bivalent compounds, even those assembled from fragments108, can successfully target RNAs with greater affinity and selectivity than the monomer from which they are derived. Modular assembly has also been applied to target RNA repeat expansions with enhanced potency and specificity (Box 2).

Inhibiting undruggable proteins by targeting the encoding RNA: α-synuclein

Pathogenic proteins are often considered undruggable when they lack a defined structure195,196. One way to drug these intrinsically disordered proteins (IDPs) is to inhibit their translation by targeting the encoding mRNA. Iron-responsive elements (IREs) are small stem–loop structures present in 5′- or 3′UTRs197,198,199 that bind to iron regulatory proteins (IRPs)200,201. IRP binding to IREs in the 5′UTR prevents ribosome docking and blocks RNA translation whereas binding to the 3′UTR stabilizes the transcript and upregulates translation202. IREs have an important role in cognitive function and hence in neurodegenerative diseases203,204,205,206,207. Small molecules that target IREs and inhibit the translation of these pathogenic proteins have been discovered13,197,208,209,210,211,212,213.

The aberrant expression and mutation of the IDP α-synuclein, which harbours an IRE in the encoding SNCA mRNA, is linked to Parkinson disease214,215. Targeting the IRE in the SNCA mRNA with small molecules to inhibit its translation is thus a promising strategy216. Synucleozid, a small molecule that binds a bulged adenosine residue within the IRE structure and inhibits SNCA translation, was designed by Inforna13 (Fig. 5b and Table 3). Notably, this bulged adenosine is not present in other IREs. Recognition of the SNCA IRE by synucleozid was dependent upon both the bulged nucleotide and its closing base pairs13; its MOA, by direct target engagement, was reduction of the number of polysomes loaded onto SNCA mRNA. Approximately 90 mRNAs have been identified with IREs or IRE-like elements, which vary in their sequences and structures199; these studies and those mentioned above lay the foundation for regulating translation of other mRNAs with small molecules.

Altering protein isoforms by directing exon exclusion with small molecules: microtubule-associated protein tau

Almost every pre-mRNA goes through a series of processing steps, including alternative splicing (inclusion or exclusion of exons) to produce different protein isoforms217. Unsurprisingly, splice site mutations can alter splicing patterns and cause human diseases218. Studies with oligonucleotides demonstrated that it is indeed possible to rescue splicing defects by binding to and occluding the mutation from the spliceosomal machinery60,219,220,221,222, suggesting that small molecules may also be able to direct splicing.

The microtubule-associated protein tau gene (MAPT) produces six tau isoforms by alternative splicing223. Exons 9–12 encode a microtubule-binding domain (MBD), and exon 10 is alternatively spliced to produce isoforms with three (3R) or four (4R) MBDs, with the latter prone to aggregation223. In healthy individuals, the ratio between 3R and 4R is approximately equal; however, a genetic mutation in the splicing regulatory element (SRE) present at the exon 10–intron junction alters the splicing pattern to produce excess 4R, the cause of frontotemporal dementia with parkinsonism-17 (FTDP-17). One such intronic mutation, dubbed disinhibition dementia parkinsonism amyotrophy complex (DDPAC), converts a GC base pair into a GU wobble pair, destabilizing the structure of the SRE. This destabilization allows U1 small nuclear RNA (snRNA) to more easily bind the SRE and facilitate exon 10 inclusion224. Stabilization of the structure of the SRE with a small molecule might therefore impede U1 snRNA binding and restore a normal splicing pattern. Various small molecules have been identified that bind to the SRE70,225,226,227,228 (Fig. 5c and Table 3).

Small molecules with activity in primary neurons were lead-optimized by chemical similarity searching, pharmacophore modelling driven by in vitro and cellular screening and analogue synthesis, structure-based design and docking studies based on the 3D structure of the MAPT SRE bound to several lead molecules, and traditional medicinal chemistry approaches70 (Fig. 4b). This synergistic strategy afforded a drug-like molecule that directed MAPT splicing towards the 3R isoform in primary neurons from a human tau transgenic mouse70.

Targeting RNA structures for degradation

If a functional site has not yet been discovered, various strategies may be applied to modulate RNA function by facilitating its degradation (Fig. 6). Two of these strategies are discussed below; both rely on chemically induced proximity — direct small-molecule cleavage of the target RNA (Box 2) or target degradation achieved by nuclease recruitment (Fig. 6a). Although these examples target known functional structures, the cleavage-based methods eliminate the RNA transcript and thus their MOA does not require binding to a functional site. The advantage of small molecule degraders, like ASOs, is that they are simultaneously target-validation methods. Small molecules can also facilitate degradation of an RNA target by increasing its accessibility to endogenous decay pathways such as the exosome and directing splicing such that the mature mRNA encodes a premature stop codon (Fig. 6b).

Fig. 6: Small-molecule RNA degraders and their mechanisms of action.

a | Degraders can elicit biological functions even if bound to non-functional sites within the RNA target as they cleave the transcript. Simply binding to non-functional sites is in principle biologically silent. Notably, degraders cleave RNA targets sub-stoichiometrically, as the same degrader molecule can cleave more than one RNA transcript by substrate turnover and can be optimized to improve target selectivity, including linker length, substrate preference and cellular localization of the target. b | Mechanisms of small-molecule-facilitated degradation. From left to right: small molecules can bind to intronic RNA repeat expansions that are harboured as retained introns. This causes excision of the intron, which is decayed by the RNA exosome280; RNA-binding compounds can be appended to natural products such as bleomycin that can cause oxidative cleavage of RNA targets selectively; ribonuclease-targeting chimeras induce the proximity of ribonucleases to unnaturally target an RNA for destruction by native quality control pathways; and small molecules can affect pre-mRNA splicing to create a mature mRNA with included exons that contain premature termination codons that trigger decay via nonsense-mediated decay. RNase, ribonuclease.

Nuclease recruitment to cleave RNA targets: ribonuclease-targeting chimeras

Targeted degradation was first demonstrated for proteins, with proteolysis-targeting chimeras (PROTACs)229. PROTACs are chimeric molecules comprising a protein-binding module and an E3 ubiquitin ligase-recognition module, which tags the targeted protein for selective degradation by the proteasome230,231,232,233,234. RIBOTACs, composed of an RNA-binding module and an RNase-recruiting module that selectively mediates RNA decay, have been developed for targeted degradation of RNA144 (Fig. 6a). RIBOTACs recruit the ubiquitously expressed cellular endoribonuclease RNase L, which functions in the antiviral immune response (see above)148,149. In its first two iterations, an RNA-binding module that selectively recognizes the Drosha site of pri-miR-96 or the Dicer site of pre-miR-210 was coupled to 2′-5′A4, inducing its selective cleavage in cells144,235.

As 2′-5′A4 reduces drug-likeness, a small-molecule RNase L recruiter was developed143 based on a previously reported small molecule236. A RIBOTAC with this new recruiter was developed to target pre-miR-21 for degradation by recognition of its Dicer processing site143. Notably, the RIBOTAC was more potent than the binder from which it was derived, possessed a prolonged duration of effect and inhibited breast cancer metastasis in a mouse model. Transcriptome-wide studies showed that the RIBOTAC was also more selective than the binder143, as quantified by a Gini coefficient153 and did not elicit an immune response. This enhanced selectivity is a composite of the specificity of the RNA-binding small molecule, the inherent substrate specificity of RNase L and whether the target has an RNase L substrate adjacent to the site at which the RNA-binding small molecule binds, the distance dictated by the linker that tethers the two components of the chimera.

These studies led to the hypothesis that RIBOTACs may allow reprogramming of known drugs for RNA targets. 2DCS selection of the Repurposing, Focused Rescue, and Accelerated Medchem (ReFRAME) library237 indicated that the receptor tyrosine kinase (RTK) inhibitor dovitinib binds the Dicer processing site of pre-miR-21 and therefore might inhibit its cellular processing, albeit at higher concentrations than those required to inhibit RTKs150. Converting the binding molecule dovitinib into a RIBOTAC enhanced its inherent RNA-targeting activity in cells and concomitantly decreased potency against canonical RTK protein targets, shifting selectivity for pre-miR-21 by 2,500-fold (Table 2). Further, the chimera alleviated disease progression in two mouse models caused by miR-21 overexpression: triple-negative breast cancer and Alport syndrome150.

RNA function can indeed be modulated with small-molecule binders or small-molecule degraders, expanding the MOA of RNA-targeting compounds and likely the number of targetable RNAs. Chimeric small molecules that target RNAs set the foundation for new drug discoveries akin to the revolution of PROTACs, enabling inhibition of RNA circuits when the functional site is unknown or absent. It will be exciting to see whether other RNA-modifying enzymes (editing, splicing machinery and so on) can be selectively recruited to RNA targets.

Targeting RNA-associated pathways

Several small molecules that target RNA-associated pathways have been derived from phenotypic screening and are described below.

Ribocil, an antibacterial that targets the FMN riboswitch

Riboswitches are structured noncoding sequences in the 5′ leader of bacterial mRNAs that control gene expression of a downstream ORF61,238,239,240,241,242 (Fig. 5d). Widely distributed across all known phylogenetic groups of bacteria, riboswitches form highly specific binding pockets for small-molecule metabolites, second messengers and inorganic ions243,244,245,246,247. Binding of the small molecule to the receptor (aptamer) domain directs formation of alternative secondary structures in the adjacent regulatory domain (expression platform)244 that modulate transcription or translation of the message248,249,250. The antibacterial ribocil was identified by a phenotypic screen for inhibitors of the riboflavin biosynthetic pathway in E. coli11 (Fig. 5d and Table 1). Resistance mutations mapped to the FMN riboswitch immediately upstream of the ribB gene11,251, validating the RNA target and elucidating compound MOA. A crystal structure of the riboswitch–ribocil complex revealed that the compound competitively binds to the FMN-binding pocket, using a similar, but not identical, binding mode.

A structure-guided approach to target the FMN riboswitch was also fruitful, yielding the compound 5FDQD (Table 1). Inspection of the crystal structures of the FMN-bound252 and apo-riboswitch253 revealed conformational changes in the RNA that occur upon FMN recognition. An iterative structure-based design strategy was pursued in which, first, structures were analysed for regions that could potentially accommodate chemical changes to FMN; second, a set of structure-guided derivatives was synthesized; third, productive binding was tested using chemical probing and in vitro transcription assays; and fourth, crystal structures of new lead compounds that emerged were determined254,255. The compound that resulted from these efforts, 5FDQD, is an analogue of FMN that binds to its RNA target with potency equal to that of the natural effector254,255. The bactericidal activity of this compound is highly selective for Clostridium difficile while having little effect on a diverse range of other bacteria commonly found in the gut microbiota254. Importantly, in mice, 5FDQD prevented lethal antibiotic-induced C. difficile infection, validating the use of a structure-guided approach to yield potent RNA-targeting therapeutics.

Risdiplam and branaplam, small molecules that target an RNA–protein manifold to direct alternative pre-mRNA splicing

SMA is a genetic disease caused by the deletion or mutation of the survival motor neuron 1 (SMN1) gene, which encodes a catalytic component of a complex responsible for assembly of small nuclear ribonucleoproteins (snRNPs) and hence the spliceosome. Its loss of function in SMA ultimately leads to the degradation of spinal motor neurons, muscle weakness, muscle atrophy and respiratory complications66,256. Fortuitously, humans encode an SMN1 paralogue, SMN2, with the two genes differing by only two nucleotides, one in exon 7 and the other in exon 8. The single-nucleotide polymorphism in exon 7 disrupts a splicing enhancer, resulting in exclusion of SMN2 exon 7 and reduction of the half-life of the encoded protein. If SMN2 exon 7 alternative splicing could be directed towards inclusion, then SMN2 could substitute functionally for the loss of SMN1 and hence serve as a treatment for SMA (Fig. 5e).

A phenotypic screen to discover small molecules that direct splicing of SMN2 such that exon 7 is included identified an orally bioavailable compound, SMN-C3. Subsequent studies showed that SMN-C3 directed endogenous alternative splicing of SMN2 and provided therapeutic benefits in an SMA mouse model, with limited off-target effects. A medicinal chemistry campaign around SMN-C3 generated risdiplam (Evrysdi; Fig. 5e and Table 3), the first FDA-approved small-molecule drug to treat SMA14.

The lead optimization process that afforded risdiplam highlights a traditional medicinal chemistry approach (Fig. 4c). After chemical optimization of SMN-C3 to avoid mutagenicity and to confer favourable pharmacokinetic profiles, the preclinical candidate RG7800 was selected for advancement66,257. As RG7800 caused retinal degeneration14, a medicinal chemistry campaign began around key elements of the chemical structure, particularly with the goal to lower basicity, enhance potency, improve influx into the central nervous system and reduce compound metabolism. A virtual study of a library containing a pyridopyrimidinone central core and a right-hand-side imidazopyridazine fragment was carried out14. Risdiplam emerged from these compounds after a battery of preclinical tests showed high potency for directing SMN2 splicing in vitro and in vivo, reduced basicity, no phototoxicity risk and no formation of active metabolites14.

Identifying the target of risdiplam, and hence its MOA, was a challenge that was later resolved through a series of studies on a derivative dubbed SMN-C5. NMR spectroscopy revealed that SMN-C5 stabilized an adenosine bulge at the exon 7–intron junction149,258,259, acting as a molecular glue for the ternary complex formed with U1 snRNP, as revealed by Chem-CLIP260.

The small molecule branaplam was also identified from a similar phenotypic screen of SMN2 exon 7 alternative splicing, using a mini-gene reporter of a breast cancer 1 (BRCA1) exon 18 mutant that induces exon skipping as a counter-screen15. A series of studies using a related molecule, NVS-SM2, pinpointed that the molecule interacted with 21 nucleotides of the SMN2 5′ splice site, in particular a GA sequence found at the end of exon 7, suggesting involvement of U1 snRNP. NVS-SM2 acts as a molecular glue, as U1 snRNP only bound to SMN2 exon 7 when the small molecule was present15. Branaplam was later discovered to reduce mutant huntingtin (mHTT) protein levels by facilitating pseudo-exon inclusion in the HTT mRNA261. It is currently in clinical trials for the treatment of both SMA and Huntington disease.

The discoveries of risdiplam and branaplam are perhaps the best examples of successful phenotypic drug discovery efforts, as their development did not rely on a specific target but instead on a desired activity.

Rocaglamide, a molecular glue that inhibits translation of polypurine-containing transcripts

Molecular glues that affect the translation of specific mRNAs have also been identified by phenotypic screening, particularly for oncogenic mRNAs that promote proliferation262,263. Notable amongst several interesting compounds is rocaglamide, a member of the flavagline family, a class of bioactive natural products264 (Fig. 5f and Table 2). Rocaglamide has potent anti-tumour activity, specifically inhibiting the translation of a subset of transcripts with polypurine sequences in their 5′UTRs. The compound stabilizes the interaction between eukaryotic initiation factor 4A (eIF4A), an ATP-dependent DEAD-box RNA helicase, and mRNAs that depend on eIF4A binding to unwind them for translation265. Recent studies have afforded rocaglamide analogues with improved potency and physicochemical properties266,267,268. Thus, translation can be affected not only by small-molecule binding to an mRNA, but also by pharmacological stabilization of transient interactions between transcripts and translation factors, providing specific functional outcomes. Other compounds have been shown to have similar activities, suggesting that medicinal chemistry optimization of these compounds could provide small-molecule therapeutics264.

The future of small-molecule targeting of RNA

RNA was first viewed as simply an intermediate between DNA and protein, but a renaissance in the study of RNA structure and function beginning in the 1980s revealed its complex roles in homeostasis and disease183,269,270,271,272,273. Although RNA is considered a challenging target, modulation of its function by structure-binding small molecules is becoming increasingly well established as therapeutically practical. However, tremendous gaps in knowledge remain (Table 4 and Box 3).

Table 4 Tools and challenges in developing small-molecule therapeutics targeting RNA

In the protein world, target classes are considered either druggable or undruggable, generally classified by the formation of ordered structures. There is currently not enough knowledge in the RNA world to be able to classify RNA targets in this way; however, most, if not all, bioactive small molecules target stable, functional structures. By identifying regions with unusually stable structures that are evolutionarily conserved, we can gain insight into potential functional structures and expand the druggable transcriptome. Key to the discovery of small molecules that bind these sites is to hypothesize function and hence potential compound MOA. Once the hypothesis is formulated, it must be verified by gain- and loss-of-function studies, particularly the effect on phenotype. Fortuitously, if binding alone is insufficient to modulate RNA function, the binding small molecule can be converted into a degrader. As more data are collected around functional structures, we will be able to tease out factors that contribute to druggability.

Complementary to identifying high-priority RNA targets is defining physicochemical properties and chemotypes that confer affinity for RNA, which will inform design of RNA-focused small-molecule libraries (Box 1). Such libraries can be used in two ways: first, sequence-based design by defining binding landscapes, which defines both on- and off-targets; and second, screening against defined targets, in which an appropriate counter-screen must be completed to improve the likelihood of selective binding. The resultant datasets refine hypotheses around physicochemical properties and chemotypes that are ideal for RNA targets and hence RNA-focused small-molecule libraries. Notably, these screening collections must maintain chemical diversity to be useful.

Advances in structure-based design and lead optimization for RNA-targeted small molecules are also needed. Force fields that have been developed to enable structure-based design for proteins have not yet been fully customized for RNA, despite recent advances in computational biology. Molecular recognition of RNA by small molecules is driven differently from recognition of proteins; in particular, the importance of aromatic ring stacking interactions will require modification of electrostatic parameters. Accurate modelling of electrostatic parameters is key to understanding how small-molecule cores bind to and interact with an RNA target and to guide positioning of functionalities outside the core. Further, small-scale conformational dynamics are important for recognition of RNA targets. In its native state, RNA can exchange rapidly between conformations, which are recognized by ligands differently. When the energetic difference between these states is low, there is likely only a modest effect on affinity. When the energy difference between conformations is large, however, the difference in affinity could be dramatic. Such knowledge is key for structure-based design.
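The dependence of affinity on conformational energetics can be illustrated with a deliberately simple model. The sketch below assumes a two-state conformational-selection picture, in which the RNA exchanges rapidly between a binding-competent conformation and one alternative conformation, and the ligand binds only the competent one. This model, the function name and the example energies are assumptions made for illustration; they are not taken from the studies discussed here, and real RNA ensembles are typically more complex. Under these assumptions the apparent dissociation constant is the intrinsic one multiplied by 1 + exp(ΔG/RT), where ΔG is the free energy of the competent conformation minus that of the alternative.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)


def fold_loss_in_affinity(delta_g_kcal: float, temp_k: float = 298.15) -> float:
    """Factor by which the apparent Kd exceeds the intrinsic Kd.

    Two-state conformational-selection model (an illustrative assumption):
    delta_g_kcal = G(competent) - G(alternative), in kcal/mol.
    Positive values mean the binding-competent conformation is the less
    populated one, so the ligand must pay to shift the equilibrium.
    """
    # Ratio of alternative to competent populations from the Boltzmann factor.
    population_ratio = math.exp(delta_g_kcal / (R_KCAL_PER_MOL_K * temp_k))
    return 1.0 + population_ratio


# Hypothetical energy gaps (kcal/mol) chosen only to show the trend.
for gap in (0.0, 0.5, 1.0, 3.0):
    print(f"gap = {gap:.1f} kcal/mol -> Kd increases {fold_loss_in_affinity(gap):.1f}-fold")
```

In this toy model, two equally populated conformations cost only a twofold loss in affinity, whereas a gap of a few kcal/mol in favour of the non-binding conformation costs two orders of magnitude, consistent with the qualitative argument above.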

Small-molecule binding is often not sufficient for bioactivity. Determining whether a compound bound its cellular target but produced a biologically silent interaction is now possible with recently developed tools. Correlating target occupancy studies, using methods such as Chem-CLIP, with transcriptome- and proteome-wide studies is a powerful strategy that defines on- and off-targets and reveals when binding elicits a biological response that is not desired. Such data are key to inform lead optimization. Intriguingly, fragment mapping and Chem-CLIP108,133,274, in combination with RNA structure prediction programs such as ScanFold, can provide insight into ligandable cellular RNA structures; biologically silent interactions identify cellular RNAs amenable to targeting by small-molecule-induced degradation.

Translating bioactive small molecules from cells to animals and then to the clinic is a significant hurdle for RNA-targeted small molecules. Of importance will be incorporation of transcriptome-wide analyses in patient tissues that assess efficacy and toxicity into clinical development pipelines, which should also be implemented for oligonucleotide-based modalities. Such studies also inform the range of selectivity that will be acceptable for RNA targets and may be quite different from that of proteins. For in vivo studies, significant differences in the sequence and structure of human and mouse RNAs, particularly noncoding RNAs, can change the activity or selectivity of a compound. As very few RNA-targeted compounds have advanced to animal studies and the clinic, more data will be needed to define pharmacokinetic and pharmacodynamic profiles ideal for preclinical and clinical candidates. As PROTACs and other protein-targeted medicines have changed the view of the rule of five275, we will need to be open to the fact that the physicochemical properties of small molecules that modulate RNA function may be outside the traditional drug-like space.

Outlook

As the functions of RNA in both health and disease have expanded and diversified, so has the field of RNA chemical biology, demonstrating that RNA is indeed druggable by small molecules. Further, RNA-targeted small molecules can be designed and lead optimized, the latter using strategies developed for protein targets with important modifications and considerations. As both academia and industry push the frontiers of this field forward, we believe that many more RNA-targeted medicines will reach the clinic in the years to come.