Single-cell brain organoid screening identifies developmental defects in autism

Stem cell and cerebral organoid culture conditions

Feeder-free hES cells or iPS cells were cultured on hES cell-qualified Matrigel (Corning, catalogue no. 354277)-coated plates with Essential8 stem cell medium supplemented with bovine serum albumin (BSA). H9 embryonic stem cells were obtained from WiCell. Cells were maintained in a 5% CO2 incubator at 37 °C. All cell lines were authenticated using a short tandem repeat assay, tested for genomic integrity using single-nucleotide polymorphism (SNP) array genotyping and routinely tested negative for mycoplasma.

Cerebral organoids were generated using a previously published protocol with modifications23. In brief, cells were cultured to 70–80% confluency, and 16,000 live cells in 150 μl Essential8 medium supplemented with Revitacell (ThermoFisher, catalogue no. A2644501) were added to each well of a U-bottom ultralow attachment 96-well plate (Corning, catalogue no. CLS3473) to form embryoid bodies. For eCas9 induction, 4-hydroxytamoxifen (Sigma-Aldrich, catalogue no. H7904) was added on day 5 at a concentration of 0.3 μg ml−1. Neural induction was started on day 6. Embryoid bodies were embedded in Matrigel (Corning, catalogue no. 3524234) at day 11 or 12 based on a morphology check. CHIR99021 (Merck, catalogue no. 361571) at 3 μM was added from day 13 to day 16, and medium was switched to improved differentiation medium supplemented with B27 minus vitamin A (IDM-A) at day 14. On day 25, medium was switched to improved differentiation medium supplemented with B27 plus vitamin A (IDM+A); 1% dissolved Matrigel was added to the medium from day 40 to day 90. From day 60 to day 70, medium was gradually switched to Brainphys neuronal medium (Stemcell Technologies, catalogue no. 05790) and supplemented with brain-derived neurotrophic factor (BDNF) (20 ng ml−1; Stemcell Technologies, catalogue no. 78005.3), glial cell line-derived neurotrophic factor (GDNF) (20 ng ml−1; Stemcell Technologies, catalogue no. 78058.3) and bucladesine sodium (1 mM; MedChemExpress, catalogue no. HY-B0764)24. For ventralized organoids, we followed a previously published protocol47. Embryoid bodies were not embedded, and patterning factors, including 100 nM SAG (Merck-Millipore, catalogue no. US1566660) and 2.5 μM IWP2 (Sigma-Aldrich, catalogue no. IO536), were added from day 5 to day 11.

CHOOSE screen

sgRNA selection and cloning

The top four sgRNAs were first selected on the basis of predictions using multilayered Vienna Bioactivity CRISPR (VBC) score54 and then subjected to the reporter assay (below) to test editing efficiency. sgRNAs were cloned into the gRNA reporter assay lentivirus construct (containing the dual sgRNA cassette: U6-sgRNA1-H1-sgRNA2) using the GeCKO cloning protocol55. The two sgRNAs were cloned using type IIS class restriction enzymes FastDigest BpiI (ThermoFisher, catalogue no. FD1014) and Esp3I (ThermoFisher, catalogue no. FD0454) separately and verified using Sanger sequencing. All gRNAs used for this study can be found in Supplementary Table 1.

sgRNA reporter assay

A construct containing dTomato-2A-gRNA target array-TagBFP under the RSV promoter was assembled using Gibson assembly. The construct was packaged into retrovirus using the Platinum-E retroviral packaging cell line via the calcium phosphate-based transfection method. Virus-containing supernatant (Dulbecco’s modified Eagle’s medium (DMEM), 10% fetal bovine serum (FBS), 2 mM l-glutamine, 100 U ml−1 penicillin and 0.1 mg ml−1 streptomycin) was collected for up to 72 h, filtered through a 0.45-μm filter and then stored on ice. Retroviruses were then used to infect NIH-3T3 cells, and dTomato-positive cells were sorted using flow cytometry into single cells to establish reporter cell lines. To deliver sgRNAs, the lentiviral construct containing the dual gRNA cassette and the spleen focus-forming virus (SFFV) promoter driving eCas9 were packaged using HEK293 cells to produce lentivirus. The reporter 3T3 cell lines generated above were cultured in six-well plates and infected with lentivirus containing dual sgRNA cassette targeting each gene individually. BFP fluorescence was measured at 7, 14 and 20 days postinfection. Fluorescent changes at 20 days postinfection were used to evaluate the gRNA editing efficiency. In total, 98 dual sgRNA cassettes were tested for 36 genes.

Generation of barcoded CHOOSE lentivirus, hES cell infection and embryoid body generation

The CHOOSE lentiviral vector was constructed based on a previously published lentiviral vector that carries a CAG-driven ERT2-Cre-ERT2-P2A-EGFP-P2A-puro cassette12. A multicloning site including NheI and SgsI recognition sequences was introduced to the 3′ LTR of the lentivirus backbone according to the CROP-seq vector design17. Then, the original U6 sgRNA expression cassette was removed; instead, the dual sgRNA (U6-sgRNA1-H1-sgRNA2) cassette was introduced to the 3′ LTR cloning site. To generate a barcoded library, the following primers were used to individually amplify each dual sgRNA cassette from the lentiviral construct used in the reporter assay while introducing a 15 base pair barcode (8–10 cycles, monitored using a quantitative polymerase chain reaction (qPCR) machine and stopped on reaching the logarithmic phase).

FW primer: 5′-tcgaccgctagcagggcctatttcccatga-3′.

RV primer: 5′-cagtagggcgcgccNVDNHBNVDNHBNVDccggcgaaccatgatcaaa-3′.
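
The degenerate stretch in the RV primer (NVDNHBNVDNHBNVD) determines how many distinct barcodes the library can in principle contain. The following is a minimal sketch, assuming the standard IUPAC degenerate-base codes; the helper names `barcode_space` and `follows_pattern` are hypothetical and not part of the original pipeline, and read orientation (primer strand versus reverse complement) is not handled here.

```python
import re

# Standard IUPAC degenerate nucleotide codes that occur in the RV primer.
IUPAC = {
    "N": "ACGT",
    "V": "ACG",
    "D": "AGT",
    "H": "ACT",
    "B": "CGT",
}

# Degenerate stretch copied from the RV primer above (15 positions).
BARCODE_PATTERN = "NVDNHBNVDNHBNVD"


def barcode_space(pattern: str) -> int:
    """Number of distinct sequences that a degenerate pattern can encode."""
    size = 1
    for code in pattern:
        size *= len(IUPAC[code])
    return size


def follows_pattern(seq: str, pattern: str = BARCODE_PATTERN) -> bool:
    """True if seq has the pattern length and an allowed base at each position."""
    regex = "".join(f"[{IUPAC[code]}]" for code in pattern)
    return re.fullmatch(regex, seq.upper()) is not None
```

Each triplet (N followed by two three-base codes) allows 4 × 3 × 3 = 36 sequences, so under these assumptions the five triplets give 36^5 possible barcodes.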

Equal molar amounts of amplicons for the ASD library (36 paired sgRNAs targeting ASD genes) or control library (a paired non-targeting control gRNA) were pooled. Amplicons and lentiviral backbone were then digested with FastDigest NheI (ThermoFisher, catalogue no. FD0973) and FastDigest SgsI (ThermoFisher, catalogue no. FD1894) and gel purified. Ligation was performed using T4 DNA ligase (ThermoFisher, catalogue no. EL0011) and cleaned up by phenol-chloroform extraction. In total, 90 ng of ASD library plasmids and 30 ng of control library plasmids were used for electroporation of MegaX DH10B T1R Electrocomp Cells (ThermoFisher, catalogue no. C640003) following the manufacturer’s guide. Bacteria were plated on lysogeny broth (LB) medium plates containing ampicillin. Dilutions were performed to calculate the complexity; 2.6 × 10⁷ colonies were obtained for the ASD library, and 0.5 × 10⁷ colonies were obtained for the control library. Lentiviruses were packaged using HEK293T cells, and infection of hES cells was performed as before12. Infection rate was controlled to be lower than 5% to prevent multiple infections18; 6.6 × 10⁵ ASD library cells and 2.3 × 10⁵ control library cells positive for GFP were sorted by flow cytometry. Cells were recovered and passaged two times in 10 cm dishes to maintain maximum complexity. Cells were mixed at a ratio of 96:4 (ASD:control) and then used to make embryoid bodies. For individual gene validations, lentivirus carrying a dual gRNA cassette targeting only one gene was packaged and used to infect the eCas9-inducible cell line. Cells were then collected by FACS and used to make embryoid bodies. Organoids were cultured using the conditions described above.

Cerebral organoid tissue dissociation, FACS and scRNA-seq

For each library, three to seven organoids at 4 months were pooled, washed twice in Dulbecco’s phosphate-buffered saline (DPBS)−/− and dissociated using the gentleMACS dissociator in trypsin–accutase (1×) solution with TURBO DNase (2 μl ml−1; ThermoFisher, catalogue no. AM2238). After dissociation, DPBS−/− supplemented with 10% FBS (DPBS–10% FBS) was gradually added to stop the reaction. Samples were then centrifuged at 400g for 5 min at 4 °C, and the supernatant was aspirated without touching the pellet. The pellet was then resuspended in an additional 1–2 ml of DPBS–10% FBS and filtered through a 70 μm strainer and FACS tubes. Cells were then stained with viability dye DRAQ7 (Biostatus; DR70250, 0.3 mM). Target live cells were sorted with a BD FACSAria III on Alexa 700 filter with low pressure (100 μm nozzle) and collected in DPBS–10% FBS at 4 °C. Cells were then centrifuged and resuspended in DPBS–10% FBS to achieve a target concentration of 450–1,000 cells per microliter. Samples with more than 85% viability were processed. For each library, 16,000 cells were loaded onto a 10× Chromium controller to target a recovery of 10,000 cells. Libraries were prepared using the Chromium Single Cell 3′ Reagent Kits (v.3.1) following the 10× user guide. Libraries were sequenced on a Novaseq S2 or S4 flow cell with a target of 25,000 paired-end reads per cell.

Custom genomic reference

Each cell expresses eCas9 from a genomic locus (AAVS1) and a polyadenylated dual sgRNA cassette, which is delivered by lentivirus and integrated into the genome. To cover these extrinsic elements, we built a custom genomic reference for mapping 10× single-cell data by amending the GRCh38 human reference. As the individual gRNA sequences differed, we masked them by Ns so as not to interfere with mapping (individual gRNA information is distinguished in a separate counting pipeline). The sequences added covered the genomic loci of AAVS1 with eCas9-dTomato-WPRE-SV40 and the masked lentiviral construct.

Emulsion PCR and target amplification

To reduce PCR bias and to prevent the generation of chimeric PCR products56,57, emulsion polymerase chain reaction (PCR) was used to recover gRNA and unique clone barcode (UCB) sequences from plasmid libraries, from genomic DNA extracted from lentivirus-infected hES cells and from cells sorted from CHOOSE mosaic organoids, as well as from 10× single-cell complementary DNA libraries. AmpliTaq Gold 360 master mix (ThermoFisher, catalogue no. 4398876) was used for all PCR reactions. Emulsion PCR was performed using the Micellula DNA Emulsion & Purification Kit (EURX, catalogue no. E3600) according to the manufacturer’s guide. For target amplification from 10× single-cell libraries, heminested emulsion PCRs were performed using the following primers:

First PCR: forward primer (FW): 5′-gcagacaaatggctgaacgctgacg-3′, reverse primer (RV): 5′-ccctacacgacgctcttccgatct-3′; second PCR: FW: 5′-ggagttcagacgtgtgctcttccgatcttgggaatcttataagttctgtatgagaccactctttcc-3′, RV: 5′-ccctacacgacgctcttccgatct-3′.

Amplicons were then indexed with unique NEB dual indexing primers, and amplifications were monitored in a qPCR machine and stopped when reaching the logarithmic phase. Amplicons were sequenced using the Illumina Nextseq2000 or Novaseq6000 system. All primers used can also be found in Supplementary Table 1.

gRNA and UCB recovery and analyses

gRNA sequences were extracted by cutting 5′- and 3′-flanking regions with cutadapt (10% error rate, 1–3 nucleotide (nt) overlap, no indels)58. Sequences were filtered to be between 15 and 21 nt long. The corrected cell barcode (CBC) and the unique molecular identifier (UMI) of each read were derived via the 10× Genomics Cell Ranger 6.0.1 alignment59. Only reads with a corresponding gene expression (GEX) cell were accepted. Reads and target sequences were joined by allowing partial overlaps and Hamming distances of two. Reads were counted towards unique CBC–UMI–gRNA combinations. A read count cutoff of 1% of the median read count of the UMI with the highest read count per cell was applied. Cells with only one gRNA and more than one read were kept. In addition, within unique CBC–UMI combinations, only gRNAs with more than 20% of the maximal read count of that group were kept. After read filtering, UMIs were counted for each CBC–gRNA combination. If more than one gRNA was found within a cell, only the gRNAs with a UMI count equal to the maximum UMI count were kept. Only one-to-one combinations were considered further. Analogous to gRNA extraction, the UCB was extracted with at least 6 nt overlap to the flanks. Sequences with 15 nt length were selected and had to follow the synthesis pattern. Further processing was done analogous to gRNA.
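
The final assignment steps (UMI counting per CBC–gRNA combination, keeping the gRNAs with the maximum UMI count and retaining one-to-one combinations) might be sketched as follows. This is an illustrative reading of the description, not the authors' pipeline; the function name `assign_grnas` and the input layout are assumptions, and the upstream read-count filters are taken as already applied.

```python
from collections import defaultdict


def assign_grnas(cbc_umi_grna):
    """Map each cell barcode to a single gRNA, dropping ambiguous cells.

    cbc_umi_grna: iterable of (cell_barcode, umi, grna) tuples that have
    already passed the read-count filters (assumed input format).
    """
    umis = defaultdict(set)  # (cell, gRNA) -> set of distinct UMIs
    for cbc, umi, grna in cbc_umi_grna:
        umis[(cbc, grna)].add(umi)

    per_cell = defaultdict(dict)  # cell -> {gRNA: UMI count}
    for (cbc, grna), umi_set in umis.items():
        per_cell[cbc][grna] = len(umi_set)

    assignment = {}
    for cbc, counts in per_cell.items():
        top = max(counts.values())
        best = [g for g, n in counts.items() if n == top]
        if len(best) == 1:  # keep one-to-one cell-gRNA combinations only
            assignment[cbc] = best[0]
    return assignment
```

In this sketch a cell in which two gRNAs tie for the maximum UMI count is discarded, which is one way to interpret "only one-to-one combinations were considered further".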

Preprocessing of single-cell transcriptomics data

We first aligned reads to the custom genomic reference defined above with Cell Ranger 6.0 (10x Genomics) using pre-mRNA gene models and default parameters to produce the cell-by-gene UMI count matrix. UMI counts were then analysed in R using Seurat v.4 (ref. 60). We first kept features detected in a minimum of three cells. Next, we selected high-quality cells based on the number of genes detected (minimum 1,000, maximum 8,000) and removed cells with high mitochondrial or ribosomal messenger RNA content, keeping only cells with less than 15% mitochondrial and less than 20% ribosomal content. Thereafter, expression matrices of high-quality cells were normalized (LogNormalize) and scaled to a total expression of 10,000 UMIs for each cell. Principal component analysis (PCA) was performed based on the z-scaled expression of the 2,000 most variable features (FindVariableFeatures()).
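
As an illustration of the cell-level thresholds and the normalization, the sketch below applies the stated cutoffs to per-cell summary values and reproduces the log-normalization formula (natural log of one plus the count scaled to 10,000 UMIs per cell). The analysis itself was run with Seurat in R; the Python function names and the treatment of the gene-count bounds as inclusive are assumptions made for this example.

```python
import math


def passes_qc(n_genes, pct_mito, pct_ribo,
              min_genes=1000, max_genes=8000,
              max_mito=15.0, max_ribo=20.0):
    """Return True for cells inside the gene-count and mRNA-content limits."""
    return (min_genes <= n_genes <= max_genes
            and pct_mito < max_mito
            and pct_ribo < max_ribo)


def log_normalize(counts, scale_factor=10_000):
    """Scale one cell's UMI counts to scale_factor and apply log(1 + x)."""
    total = sum(counts)  # cells that pass QC have a non-zero total
    return [math.log1p(c / total * scale_factor) for c in counts]
```

For example, a cell with 2,500 detected genes, 5% mitochondrial and 10% ribosomal content passes, whereas a cell with 20% mitochondrial content does not.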

Integration and annotation of single-cell transcriptomics data

To annotate the dataset, we first extracted cells with control gRNAs and merged them with cells from uninduced organoids (35,203 cells). We integrated these unperturbed cells across libraries using Harmony61 with default parameters. Using the integrated space, we clustered the dataset at a resolution of one using the Louvain algorithm62 and annotated the clusters as dorsal and ventral telencephalons based on marker gene expression. We then split both trajectories and clustered again with a resolution of two to annotate the cell types more finely. This annotation of unperturbed cells was used to perform a label transfer onto the full dataset with perturbed cells using Seurat. The full CHOOSE dataset was further filtered for cells for which gRNAs were detected and integrated across libraries using the Seurat anchoring method. The integrated count matrix was log-normalized and scaled before computing a PCA. To visualize the dataset, the first 20 principal components were used to compute a UMAP embedding.

Assessment of target gene expression in organoid and primary cell types

To assess how the target genes in our screen were expressed in organoid and primary tissue, we obtained gene expression data from cell clusters in the developing human brain32. For both the primary data and our organoid dataset, we summarized log-normalized expression for each cell type (‘CellClass’ in the primary dataset) by computing the arithmetic mean. We visualized the expression of CHOOSE target genes with a heat map as displayed in Extended Data Fig. 6l,m.

RNA velocity

To obtain count matrices for spliced and unspliced transcripts, we used kallisto (v.0.46.2)63 through the command line tool loompy fromfq from the Python package loompy (v.3.0.7). Using scVelo (v.0.2.4)31, moments were computed based on the first 20 principal components using the function scvelo.pp.moments() with n_neighbors = 30. RNA velocity was subsequently calculated (mode = ‘stochastic’), and a velocity graph was constructed. To obtain a pseudotemporal ordering describing the two differentiation trajectories, we first removed clusters annotated as cycling cells (MKI67+) and astrocytes (S100B+) from the dataset. We then calculated a pseudotime based on the velocity graph for both trajectories separately.

Differential gRNA representation analysis

To test whether perturbations affected fitness or proliferation capacity of cells, we compared gRNA representation in eCas9-induced (n = 14 pools of organoids from three batches) versus uninduced (n = 8 pools of organoids from two batches) samples. For each pool of organoids, we computed the fractions of cells with each gRNA. We then computed the average fold change of detection percentage between induced and uninduced samples and performed a two-sided t-test comparing both distributions. Multiple-testing correction on the resulting P values was performed using the Benjamini–Hochberg method.
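
The per-pool gRNA fractions, the fold change and the Benjamini–Hochberg adjustment can be illustrated with a small standard-library sketch. The function names are hypothetical, the fold change is taken here as a ratio of mean fractions (one possible reading of "average fold change"), and the t-test itself is omitted because it would normally come from a statistics package.

```python
from collections import Counter
from statistics import mean


def grna_fractions(cell_grnas):
    """Fraction of cells carrying each gRNA within one pool of organoids."""
    counts = Counter(cell_grnas)
    total = sum(counts.values())
    return {grna: n / total for grna, n in counts.items()}


def mean_fold_change(induced, uninduced, grna):
    """Ratio of mean gRNA fractions, induced over uninduced pools."""
    mean_induced = mean(pool.get(grna, 0.0) for pool in induced)
    mean_uninduced = mean(pool.get(grna, 0.0) for pool in uninduced)
    return mean_induced / mean_uninduced


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A fold change below one for a gRNA would indicate depletion of the corresponding perturbation after eCas9 induction.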

Differential abundance testing

To assess how the perturbation of ASD risk genes changes abundances of different organoid cell populations, we tested for enrichment of each gRNA in each annotated cell state versus the control. To control for confounding effects through differential gRNA sampling in different libraries, we used a Cochran–Mantel–Haenszel (CMH) test stratified by library. Multiple-testing correction was performed using the Benjamini–Hochberg method, and a significance threshold of 0.05 was applied to the resulting FDR. Enrichment effects were plotted using the signed −log10 FDR: that is, the sign of the log odds ratio (effect size) multiplied by the −log10 FDR-corrected P value. To further assess the variability of the differential abundance effects across independent pools of organoids, we computed cell-type fold enrichment for each organoid pool and gRNA. For this, we used 14 scRNA-seq libraries obtained from independent pools of organoids as replicates from three batches. Two batches (11 replicates) used non-targeting gRNA as a control, and a third batch (three replicates) used eCas9-uninduced cells as an alternative control. We additionally computed a background distribution of enrichment effects from randomly permuted gRNA labels. We then performed a t-test for each perturbation and cell type against this background distribution.
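
For readers unfamiliar with the stratified test, the sketch below computes the Cochran–Mantel–Haenszel statistic, its P value (chi-squared distribution with one degree of freedom) and the Mantel–Haenszel common odds ratio from one 2 × 2 table per library, followed by the signed −log10 score. It is a minimal illustration: the function names and table layout are assumptions, the continuity correction follows the common textbook convention, and degenerate inputs (zero variance or zero denominator) are not handled.

```python
import math


def cmh_test(tables, correct=True):
    """Cochran-Mantel-Haenszel test over 2 x 2 tables, one per library.

    Each table is (a, b, c, d) with
      a = gRNA cells in the cell state,    b = gRNA cells elsewhere,
      c = control cells in the cell state, d = control cells elsewhere.
    Returns (statistic, p_value, common_odds_ratio).
    """
    sum_a = sum_exp = sum_var = or_num = or_den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:  # strata with fewer than two cells carry no information
            continue
        sum_a += a
        sum_exp += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    delta = abs(sum_a - sum_exp)
    yates = 0.5 if correct and delta >= 0.5 else 0.0
    stat = (delta - yates) ** 2 / sum_var
    # Survival function of chi-squared with 1 d.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, or_num / or_den


def signed_neg_log10(fdr, odds_ratio):
    """Sign of the log odds ratio multiplied by -log10 of the FDR."""
    sign = 1.0 if odds_ratio > 1 else (-1.0 if odds_ratio < 1 else 0.0)
    return sign * -math.log10(fdr)
```

A positive signed score then marks enrichment of the gRNA in a cell state relative to control, and a negative score marks depletion.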

Local cell compositional enrichment test

To visualize the compositional changes induced by the genetic perturbations at a finer resolution, we used a method outlined in Nikolova et al.64. In brief, a k-nearest neighbour (kNN) graph (k = 200) of cells was constructed on the basis of Euclidean distance on the PCA-reduced CCA (canonical-correlation analysis) space. Next, a CMH test stratified by library was performed on the neighbourhood of each cell, comparing frequencies of the gRNA or gRNA pool and the pool of control gRNAs within and outside of the neighbourhood. The resulting neighbourhood enrichment score of each cell was defined as signed −log(P), where the sign was determined by the sign of the log-transformed odds ratio. A random walk with restart procedure was then applied to smooth the neighbourhood enrichment score of each cell. The smoothed enrichment scores were visualized on the UMAP embedding using the ggplot2 (ref. 65) function stat_summary_hex() (bins = 50).
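
The smoothing step can be illustrated with a generic random walk with restart on a neighbour graph. This is a sketch of the general technique only: the restart probability, the number of iterations and the uniform weighting of neighbours are assumptions made for this example and are not taken from the cited method.

```python
def rwr_smooth(scores, neighbours, restart=0.5, n_iter=100):
    """Smooth per-cell scores by a random walk with restart.

    scores:     list of raw enrichment scores, one per cell.
    neighbours: neighbours[i] is the list of neighbour indices of cell i.
    Each iteration mixes the mean score of a cell's neighbours with the
    cell's own raw score, weighted by the restart probability.
    """
    current = list(scores)
    for _ in range(n_iter):
        updated = []
        for i, nbrs in enumerate(neighbours):
            if nbrs:
                nbr_mean = sum(current[j] for j in nbrs) / len(nbrs)
            else:
                nbr_mean = current[i]  # isolated cell keeps its own value
            updated.append((1 - restart) * nbr_mean + restart * scores[i])
        current = updated
    return current
```

With a restart probability of 0.5 on three mutually connected cells and raw scores of 3, 0 and 0, the smoothed scores converge to 1.8, 0.6 and 0.6, so an isolated peak is spread over its neighbourhood.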

Differential expression analysis

To investigate the transcriptomic changes caused by each perturbation, we performed differential expression analysis based on logistic regression. We used the Seurat function FindMarkers() (test.use = ‘LR’) to find DEGs for each gRNA label versus control. Tests were performed on log-normalized transcript counts Y while treating library, cell_type and n_UMI as covariates in the model:

$${Y}_{i}\approx {\rm{n}}\_{\rm{UMI}}+{\rm{library}}+{\rm{cell}}\_{\rm{type}}+{\rm{condition}}.$$

Testing within each developmental trajectory was performed by omitting the cell_type covariate. Multiple-testing correction was performed using the Benjamini–Hochberg method, and a significance threshold of 0.05 was applied to the resulting FDR to obtain a set of DEGs (CHOOSE DEGs). We further selected the top 30 DEGs on the basis of absolute fold change for each gRNA (TOP-DEGs).

DEG enrichment analysis

To assess the biological processes in which the detected DEGs were involved, we performed gene ontology enrichment across all TOP-DEGs globally as well as using all detected DEGs for each target gene in excitatory and inhibitory neuron trajectories separately. As a background gene set, we used all genes expressed in more than 5% of cells in our dataset. To perform gene ontology analysis, we used the function ‘enrichGO’ from the R package clusterProfiler66 with pAdjustMethod = ‘fdr’. We filtered the results using a significance threshold of FDR < 0.01. To test whether the set of TOP-DEGs was enriched for ASD-associated genes, we first obtained a list of risk genes from SFARI (11 April 2021). We then tested the enrichment using a Fisher exact test with all genes expressed in more than 5% of cells in our dataset as the background. To assess the specificity of this enrichment, we obtained a list of intellectual disability (ID) risk genes from sysID (936 primary ID genes; 17 March 2022) and tested for enrichment among TOP-DEGs in the same way.

Processing of single-cell multiome data and GRN inference

Initial transcript count and peak accessibility matrices for the multiome data were obtained from sequencing reads with Cell Ranger Arc and further processed using the Seurat (v.4.0.1) and Signac (v.1.4.0)67 R packages. Peaks were called from the fragment file using MACS2 (v.2.2.6)68 and combined in a common peak set before merging. Transcript counts were log-normalized, and peak counts were normalized using term frequency–inverse document frequency normalization. To assess the cell composition of the multiome data, integration with the CHOOSE scRNA-seq data was performed using Seurat (FindIntegrationAnchors() -> IntegrateData()) with default parameters. As a preprocessing step to GRN inference with Pando41, chromatin accessibility data were first coarse grained to a high-resolution cluster level. For this, control cells from the CHOOSE dataset were combined with the multiome dataset, and Louvain clustering was performed at a resolution of 20 based on the first 20 principal components calculated from the 2,000 most variable features (RNA). For each cluster, peak accessibility was summarized by computing the arithmetic mean from binarized peak counts so that each cell in the cluster was represented by the detection probability vector of each peak. To constrain the set of peaks considered by Pando, we used the union of PhastCons conserved elements69 from an alignment of 30 mammals and candidate cis-regulatory elements derived from the ENCODE project70 (initiate_grn()). In these regions, we scanned for TF motifs (find_motifs()) based on the motif database shipped with Pando, which was compiled from motifs derived from JASPAR and CIS-BP. Based on motif matches, cell-level log-normalized transcript counts and cluster-level peak accessibilities, we then inferred the GRN using the Pando function infer_grn() (peak_to_gene_method = ‘GREAT’, upstream = 100,000, downstream = 100,000) for the 5,000 most variable features.
Here, genes were associated with candidate regulatory regions within a radius of 100,000 bp around the gene body using the method proposed by GREAT71. From the model coefficients returned by Pando, TF modules were constructed using the function find_modules() (P_thresh = 0.05, rsq_thresh = 0.1, nvar_thresh = 10, min_genes_per_module = 5). To visualize subnetworks centred around one TF, we computed the shortest path from the TF to every gene in the GRN graph. If there were multiple shortest paths, we retained the one with the lowest average P value. The resulting graph was visualized with the R package ggraph using the circular tree layout.
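
The subnetwork construction (shortest path from the TF to every gene, with ties resolved by the lowest average P value) can be sketched as a level-by-level breadth-first search. The sketch assumes a directed graph with one P value per TF-to-target edge; the function name `tf_subnetwork` and the edge dictionary are hypothetical. Because all tied paths have the same number of edges, the lowest average P value is equivalent to the lowest summed P value.

```python
def tf_subnetwork(edges, tf):
    """Shortest path from tf to each reachable gene, ties by lowest mean P.

    edges: dict mapping (source, target) -> P value of that regulatory edge.
    Returns {gene: [tf, ..., gene]}.
    """
    adjacency = {}
    for (src, dst), p in edges.items():
        adjacency.setdefault(src, []).append((dst, p))

    best = {tf: (0.0, [tf])}  # node -> (sum of P values, path)
    frontier = [tf]
    while frontier:
        candidates = {}
        for src in frontier:
            p_sum, path = best[src]
            for dst, p in adjacency.get(src, []):
                if dst in best:  # already reached by a shorter path
                    continue
                cand = (p_sum + p, path + [dst])
                if dst not in candidates or cand[0] < candidates[dst][0]:
                    candidates[dst] = cand
        best.update(candidates)
        frontier = list(candidates)
    return {node: path for node, (_, path) in best.items() if node != tf}
```

Every gene appears once in the result, so the retained paths form a tree rooted at the TF, which suits a circular tree layout.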

Enrichment testing for TF modules

To find subnetworks of the GRN at which ASD-associated genes accumulate, we first obtained a list of ASD risk genes from SFARI. For all genes included in SFARI (1,031 genes), we tested for enrichment in TF modules using a Fisher exact test. All genes expressed in more than 5% of cells in our dataset (12,079 genes) were treated as the background. Fisher test P values were corrected for multiple testing using the Benjamini–Hochberg method, and significant enrichment was defined as FDR < 0.01 and more than twofold enrichment (odds ratio). To assess which TF modules were most affected by genetic perturbations of ASD-associated genes, we similarly used a Fisher exact test. For the set of TOP-DEGs, we tested for enrichment in any of the inferred TF modules. Here, all genes included in the GRN (5,000 most variable features) were treated as the background.
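
As an illustration of the enrichment test, the sketch below computes a Fisher exact P value from the hypergeometric distribution together with the sample odds ratio. It assumes a one-sided test for over-representation, which the text does not specify; the function name and argument names are hypothetical.

```python
from math import comb


def fisher_enrichment(n_overlap, n_module, n_risk, n_background):
    """One-sided Fisher exact test for over-representation.

    n_overlap:    risk genes inside the TF module
    n_module:     genes in the TF module
    n_risk:       risk genes in the background
    n_background: all background genes
    Returns (p_value, odds_ratio).
    """
    max_overlap = min(n_module, n_risk)
    tail = sum(
        comb(n_risk, k) * comb(n_background - n_risk, n_module - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    p_value = tail / comb(n_background, n_module)

    a = n_overlap                                        # risk, in module
    b = n_module - n_overlap                             # non-risk, in module
    c = n_risk - n_overlap                               # risk, outside
    d = n_background - n_module - n_risk + n_overlap     # non-risk, outside
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return p_value, odds_ratio
```

A module would then count as enriched when the adjusted P value is below 0.01 and the odds ratio exceeds two.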

CellRank analysis

To better understand the differentiation trajectories leading up to inhibitory neuron populations, we used CellRank43 to compute transition probabilities into each terminal fate based on the previously computed velocity pseudotime. First, the clusters with the highest pseudotime for each terminal cell state were annotated as terminal states. We then constructed a Palantir kernel72 (PalantirKernel()) based on velocity pseudotime and used Generalized Perron Cluster Cluster Analysis73 (GPCCA()) to compute a terminal fate probability matrix (compute_absorption_probabilities()). All CellRank functions were run with default parameters. Fate probabilities for each cell were visualized using a circular projection74. In brief, we evenly spaced terminal states around a circle and assigned each state an angle \({\alpha }_{t}\). We then computed two-dimensional coordinates (\({x}_{i}\), \({y}_{i}\)) from the \(F\in {{\mathbb{R}}}^{N\times {n}_{t}}\) transition probability matrix, with entries \({f}_{it}\), for N cells and \({n}_{t}\) terminal states as

$${x}_{i}=\sum _{t}{f}_{it}\cos {\alpha }_{t}$$

$${y}_{i}=\sum _{t}{f}_{it}\sin {\alpha }_{t}.$$
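
The two equations above translate directly into code. The sketch below assumes that the first terminal state is placed at an angle of zero and that the remaining states follow anticlockwise at equal spacing; the function name is hypothetical.

```python
import math


def circular_projection(fate_probs):
    """Project fate probabilities onto two dimensions.

    fate_probs: list of N rows, each holding n_t fate probabilities f_it.
    Returns a list of (x_i, y_i) tuples.
    """
    n_t = len(fate_probs[0])
    # Evenly spaced angles alpha_t around the circle.
    angles = [2 * math.pi * t / n_t for t in range(n_t)]
    coords = []
    for row in fate_probs:
        x = sum(f * math.cos(a) for f, a in zip(row, angles))
        y = sum(f * math.sin(a) for f, a in zip(row, angles))
        coords.append((x, y))
    return coords
```

A cell fully committed to one fate lands on the unit circle at that fate's angle, whereas a cell with equal probability for all fates lands at the origin.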

To visualize enrichment of perturbed cells in this space, we used the method outlined in Nikolova et al.64. Here, the kNN graph (k = 100) was computed using Euclidean distances in fate probability space, and enrichment scores were visualized on the circular projection. Otherwise, the method was performed as described above.


Immunohistochemistry

Organoid tissues were fixed in paraformaldehyde at 4 °C overnight followed by washing in PBS three times for 10 min. Tissues were then allowed to sink in 30% sucrose overnight, followed by embedding in O.C.T. compound (Sakura, catalogue no. 4583). Tissues were frozen on dry ice and cryosectioned at 20 μm. For staining, sections were first blocked and permeabilized in 0.1% Triton X-100 in PBS (0.1% PBTx) with 4% normal donkey serum. Sections were then stained with primary and secondary antibodies diluted in 0.1% PBTx with 4% normal donkey serum. Sections were washed in PBS three times for 10 min after each antibody staining and mounted in DAKO fluorescent mounting medium (Agilent Technologies, catalogue no. S3023). The following antibodies were used in this study: DLX2 (Santa Cruz, catalogue no. SC393879, 1:100); OLIG2 (Abcam, catalogue no. ab109186, 1:100); SOX2 (R&D, catalogue no. MAB2018, 1:500); FOXG1 (Abcam, catalogue no. ab18259, 1:200); EOMES (R&D, catalogue no. AF6166, 1:200); ARID1B (Cell Signaling, catalogue no. 92964, 1:100); ADNP (ThermoFisher, catalogue no. 702911, 1:250); BCL11A (Abcam, catalogue no. 191401, 1:250); PHF3 (Sigma, catalogue no. HPA024678, 1:250); SMARCC2 (ThermoFisher, catalogue no. PA5-54351, 1:250); KMT2C (Sigma, catalogue no. HPA074736, 1:250); Alexa 488, 568 and 647 conjugated secondary antibodies (ThermoFisher, 1:250); and Hoechst (ThermoFisher, catalogue no. H3569, 1:10,000).

Microscopy, image processing and quantification

Tissue sections were imaged using an Olympus IX3 Series inverted microscope equipped with a dual-camera Yokogawa W1 spinning disk. Images were acquired with 10× 0.75 (air) working distance (WD) 0.6 mm or 40× 0.75 (air) WD 0.5 mm objectives and produced by the cellSens software.

For DLX2 and OLIG2 quantification in Fig. 4, images were processed and quantified using Fiji. Based on the size of the tissue, 5–12 regions from each organoid were selected using the Hoechst channel. In total, n = 108 areas (13 organoids from four batches) from the ARID1B control group (c.2201dupG repair), n = 104 areas (15 organoids from four batches) from the ARID1B+/− (c.2201dupG) group and n = 94 areas (15 organoids from three batches) from the ARID1B+/− (6q25.3del) group were collected and subjected to an automatic segmentation using a Fiji macro. Both DLX2 and OLIG2 channels were used to define the cell body area, followed by the intensity measurement. Area mean intensity was used for setting up the threshold. For protein expression quantification in Extended Data Fig. 2, organoids with individual gene perturbations costained for each gene were processed and quantified using Fiji. Five to fourteen cortical plate regions were analysed per gene. Areas containing both uninduced (dTomato−) and induced (dTomato+) cells were selected and subjected to an automated segmentation using a Fiji macro. The Hoechst channel was used to define the cell body area, followed by intensity measurement. Detected cells were separated into wild-type and perturbed cells by setting up a threshold of mean intensity in the dTomato channel. Additionally, KMT2C protein expression was compared between wild-type (dTomato−) and mutant (dTomato+) VZ areas. VZs were individually outlined, and mean dTomato as well as KMT2C intensities were measured. For IPC abundance analysis, organoids with individual gene perturbations were costained for EOMES. Mutant columns expressing dTomato were individually segmented, and EOMES+ cells were identified by setting a threshold for EOMES intensity. The number of EOMES+ cells was normalized to the total number of cells. Percentages of EOMES+ cells were compared between individual gene perturbations and non-targeting gRNA control groups.
For INP abundance analysis, organoids were costained with DLX2. A Fiji macro for automated segmentation was used to identify DLX2+ cells throughout the entire tissue. Areas containing multiple rosettes from each organoid were collected for quantification. The number of DLX2+ cells was normalized to the tissue area and compared between individual gene perturbations and non-targeting gRNA control groups.

Patient sample collection

The study was approved by the local ethics committee of the Medical University of Vienna. Study inclusion criteria were as follows: (1) mutation in the ARID1B gene proven by whole-exome sequencing, (2) age between 0 and 18 years old, (3) continuous follow-up at the Vienna General Hospital and (4) availability of fetal brain MRI data. After informed consent, 10 ml of blood was collected from two selected patients for iPS cell reprogramming.

Reprogramming of PBMCs into iPS cells

iPS cells were generated from peripheral blood mononuclear cells (PBMCs) isolated from patient blood samples as previously described75. In brief, 10 ml blood was collected in sodium citrate collection tubes. PBMCs were isolated via a Ficoll–Paque density gradient, and erythroblasts were expanded for 9 days. Erythroblast-enriched populations were infected with Sendai Vectors expressing human OCT3/4, SOX2, KLF4 and cMYC (CytoTune; Life Technologies, A1377801). Three days after infection, cells were switched to mouse embryonic fibroblast feeder layers. Five days after infection, the medium was changed to iPS cell medium (KoSR + FGF2). Ten to 21 days after infection, the transduced cells began to form colonies that exhibited iPS cell morphology. iPS cell colonies were picked and passaged every 5–7 days after transfer to the mTeSR culture system (Stemcell Technologies).

Generation of isogenic control cell line for patient 1

Isogenic control cell lines for patient 1 were generated using CRISPR–Cas9. Streptococcus pyogenes Cas9 protein with two nuclear localization signals was purified as previously described76. gRNA transcription was performed with the HiScribe T7 High Yield RNA Synthesis Kit (NEB) according to the manufacturer’s protocol, and gRNAs were purified via phenol:chloroform:isoamyl alcohol (25:24:1; Applichem) extraction followed by ethanol precipitation. The homology-directed repair (HDR) template (custom single-stranded oligodeoxynucleotides; Integrated DNA Technologies) was designed to span 100 base pairs up- and downstream of the mutation site. iPS cells had been grown in mTeSR for 14 passages before the procedure. For generation of isogenic control cell lines, cells were washed with DPBS−/− and incubated for 5 min at 37 °C with 1 ml of accutase solution (Sigma-Aldrich, A6964-500ML). The plate was gently tapped to detach cells, and cells were gently pipetted to generate a single-cell suspension, pelleted by spinning at 200g for 3 min and counted using Trypan Blue solution (ThermoFisher Scientific). For nucleofection, 1.0 × 10⁶ cells were spun down and resuspended in Buffer R of the Neon Transfection System (ThermoFisher Scientific) at a concentration of 2 × 10⁷ cells per millilitre. Twelve nanograms of sgRNA and 5 ng of Cas9 protein were combined in resuspension buffer to form the Cas9–sgRNA ribonucleoprotein complex. The reaction was mixed and incubated at 37 °C for 5 min. Five microliters of the HDR template (100 μM) were added to the Cas9–sgRNA ribonucleoprotein complex and combined with the cell suspension. Electroporations were performed using a Neon Transfection System (ThermoFisher Scientific) with 100 μl Neon Pipette Tips using the embryonic stem cells electroporation protocol (1,400 V, 10 ms, three pulses). Cells were seeded in one Matrigel-coated well of a six-well plate in mTeSR.
After a recovery period of 3 days, a single-cell suspension was generated, and cells were split into another well of a six-well plate for banking and sparsely into two 10-cm dishes for colony formation from single cells. After colony growth for 1 week, individual colonies were picked and seeded each into one well of a 96-well plate. After colony expansion, gDNA was extracted using DNA QuickExtract Solution (Lucigen), followed by PCR and Sanger sequencing to determine efficient repair of the mutation.

Fetal MRI and 3D reconstruction

Women with singleton pregnancies undergoing fetal MRI at a tertiary care centre from January 2016 to December 2021 were retrospectively reviewed. This study was approved by the institutional ethics board, and all examinations were clinically indicated. A retrospective review of patient records was performed, and a patient with a positive genetic testing report for ARID1B mutation was selected. The participant was included in further analysis, and the gestational age (given in gestational weeks and days postmenstruation) was determined by first-trimester ultrasound. High-quality super-resolution reconstruction was obtained13. Age-matched control subjects were identified and included if they presented an absence of confounding comorbidities, including structural cerebral or cardiac anomalies or fetal growth restriction.

Fetal MRI scans were conducted using 1.5-T (Philips Ingenia/Intera) and 3-T magnets (Philips Achieva). The mother was examined in a supine position or if necessary, left recumbent to achieve sufficient imaging quality. The examinations were performed within 45 min, neither sedation nor MRI contrast medium was applied, and both the fetal head and body were imaged. Fetal brain imaging included T2-weighted sequences in three orthogonal planes (slice thickness = 3–4 mm, echo time = 140 ms, field of view = 230 mm) of the fetal head. Postprocessing was conducted as previously described77. Superresolution imaging was generated using a volumetric superresolution algorithm77. The resulting superresolution data were quality assessed, and only cases that met high-quality standards (score of less than or equal to two of five) were included in the analysis. Atlas-based segmentation was performed for the fetal cortex and total brain volume by nonrigid mapping of a publicly available spatiotemporal, anatomical fetal brain atlas for each investigated case77,78. Segmentation of the GE was performed manually using the open-source application ITK-SNAP79. To delineate the T2-weighted hypointense GE, histological fetal atlases by Bayer and Altman80,81 were used as a reference guide. Volumetric data were generated and calculations for the GE were made based on the investigated gestational ages.


Statistics and reproducibility

Information on the statistical analyses used is provided in each method section. No statistical methods were used to predetermine sample size unless specified. Neither blinding nor randomization was used unless specified.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
