Mismatch repair deficiency is not sufficient to elicit tumor immunogenicity


All animal use was approved by the Department of Comparative Medicine at the Massachusetts Institute of Technology (MIT) and the Institutional Animal Care and Use Committee under protocol no. 0714-076-17. Mice were housed with a 12-h light/12-h dark cycle with temperatures in the range 20–22 °C and 30–70% humidity. KrasLSL-G12D (ref. 59); Trp53flox/flox (ref. 60); R26LSL-Cas9 (ref. 61) (KP; R26LSL-Cas9) mice were maintained on an F1 (C57BL/6 × 129/SvJ) background. KrasLSL-G12D; Trp53flox/flox; Msh2flox/flox (ref. 32) (JAX, stock no. 016231) and R26Cas9 (ref. 62) (JAX, stock no. 028555) mice were maintained on a pure C57BL/6 background. Lung cell lines were isolated from tumors induced in albino C57BL/6 hosts chimeric for tissue derived from blastocyst injection of a KP; R26LSL-Cas9 embryonic stem (ES) cell line (12A2) of mixed C57BL/6 and 129/SvJ background and male sex, as previously described (ref. 61). In orthotopic lung studies, cell lines were transplanted into male chimeras generated from the same 12A2 ES cell line at 10–16 weeks of age. These chimeras are tolerized to C57BL/6 and 129/SvJ tissues, to potential antigens in the R26LSL-Cas9 allele and to PuroR introduced into cell lines with Msh2 re-expression (unrecombined KrasLSL-G12D expresses PuroR). Autochthonous tumors in lung and colon were induced in approximately equal numbers of male and female mice at 8–16 weeks of age.

Tumor models

Tumor burden, where measurable, was not allowed to exceed 1 cm² and animals showing discomfort or distress were humanely euthanized following the recommendations of the American Veterinary Medical Association. Autochthonous lung tumors in KrasLSL-G12D; Trp53flox/flox; R26LSL-Cas9 and KrasLSL-G12D; Trp53flox/flox; Msh2flox/flox mice were induced by intratracheal instillation of 2 × 10⁴ transduction units (TU) of lentivirus and 2 × 10⁸ plaque-forming units of adenovirus expressing Cre driven by the alveolar type II cell-specific surfactant protein C promoter (SPC-Cre), respectively, as previously described (ref. 31). Autochthonous colon tumors in R26Cas9 mice were induced by endoscope-guided submucosal injection in the distal colon, as previously described (refs. 33,63). Two injections at 1.5 × 10⁶ TU of lentivirus in 50 μl of Opti-MEM were delivered per mouse. Lentivirus was produced in HEK293 cells (American Type Culture Collection) and concentrated as previously described (ref. 31), and functional titers (Cre activity, mScarlet fluorescence) were measured as previously described (ref. 64). Cell lines were orthotopically transplanted by intratracheal instillation of 1 × 10⁵ cells in 50 μl of Spinner Modification of Minimal Essential Eagle’s Medium (SMEM)/5 mM EDTA, followed by a 30-μl rinse with the same medium. Cell lines were established from autochthonous lung tumors by microdissection and mechanical mincing in digestion buffer (Hanks’ balanced salt solution with 1 M Hepes, 125 units ml⁻¹ of collagenase type IV (Worthington) and 20 μg ml⁻¹ of DNase (Sigma-Aldrich)), followed by incubation at 37 °C with gentle agitation for 30 min and plating in RPMI + 10% fetal bovine serum (FBS). Lines were plated into 50:50 RPMI/Dulbecco’s modified Eagle’s medium (DMEM) + 10% FBS at first passage and DMEM + 10% FBS at second passage and thereafter. Cells were taken for WES at the third passage. Msh2-expressing lentivirus was produced as above. Cells were incubated with lentiviral supernatant and 3 d later selected and maintained thereafter on medium with 6 μg ml⁻¹ of puromycin (Thermo Fisher Scientific).

WES and mutation calling

Whole-exome libraries were generated using the SureSelect XT Mouse All Exon (Agilent) target enrichment kit; 100-bp paired-end sequencing of samples was performed on the Illumina HiSeq 4000 platform, with the exception of M1–8 passage of 20 single-cell clones, which were 150-bp paired-end sequenced on the Illumina NovaSeq 6000 S4 platform. Library preparation and sequencing to 100× on-target coverage were performed by Psomagen. Raw sequencing reads were mapped to the GRCm38 build of the mouse reference genome using BWA-MEM v.0.7.17-r1188 (ref. 65). Aligned reads in BAM format were processed following the Genome Analysis Toolkit (GATK) v. Best Practices workflow to remove duplicates and recalibrate base quality scores (ref. 66). Median coverage was 98 (25th percentile = 86; 75th percentile = 112), 90 (25th percentile = 82; 75th percentile = 108) and 103 (25th percentile = 100; 75th percentile = 107) for normal tails, autochthonous tumors and cell lines, respectively.

Somatic SNVs and indels were detected using Mutect2, MuSE v.1.0rc (ref. 67), VarDict v.1.8.2 (ref. 68) and Strelka2 v.2.9.2 (ref. 69) against matched normal tails. Mutect2 was run using a panel of normals compiled from the 18 tails analyzed in the present study. Each caller was run independently on each tumor-normal pair and calls were integrated using SomaticCombiner v.1.03 (ref. 70). For colon tumors, a panel of four normal tails was used to generate the matched normal control for all samples, because these mice were of pure background. We considered only SNVs that mapped to exonic regions and were detected by Mutect2 and at least one of the other algorithms. To increase accuracy of indel detection, only indels detected by at least two algorithms were considered. Variants mapping to dbSNP (build ID 150) positions were discarded. Mutations identified in tumors from two or more animals or in at least 50% of tumors from the same animal were discarded. No VAF filter was applied. Microsatellites were annotated using SciRoKo v.3.4 (ref. 71), with minimum score = 8, seed length = 8, repeats = 2 and mismatch penalty = 1.
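The consensus rule described above (Mutect2 plus at least one other caller for SNVs, any two callers for indels) can be sketched as follows. This is a minimal illustration, not the SomaticCombiner implementation; the function name `keep_variant` and the input layout are hypothetical.

```python
from typing import Iterable, Set

# Caller that must support every retained SNV, per the text above.
SNV_ANCHOR = "Mutect2"


def keep_variant(kind: str, callers: Iterable[str]) -> bool:
    """Return True if a variant passes the multi-caller consensus rule.

    kind    -- "snv" or "indel"
    callers -- names of the callers that reported the variant
    """
    found: Set[str] = set(callers)
    if kind == "snv":
        # SNVs need Mutect2 plus at least one additional caller.
        return SNV_ANCHOR in found and len(found - {SNV_ANCHOR}) >= 1
    if kind == "indel":
        # Indels need support from any two callers.
        return len(found) >= 2
    raise ValueError(f"unknown variant kind: {kind!r}")


if __name__ == "__main__":
    print(keep_variant("snv", ["Mutect2", "Strelka2"]))
    print(keep_variant("snv", ["MuSE", "Strelka2"]))
```

The exonic, dbSNP and recurrence filters would be applied as separate steps and are not shown here.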

CCF estimation

Somatic copy-number aberrations were detected by integrating output of GATK and FreeBayes v.1.3 (ref. 72) using PureCN v.1.16.0 (ref. 73). Briefly, the GATK4 Somatic CNV workflow was used for normalization of read counts and genome segmentation using the panel of normals from all tails. FreeBayes was used to obtain B-allele frequencies for dbSNP variant sites. PureCN was used to integrate output from GATK and FreeBayes to estimate the allele-specific consensus copy-number profile, purity and ploidy of each sample. The ploidy of cell lines was determined experimentally by metaphase spreads and input into PureCN. Finally, the CCF for each SNV and indel was computed using the R (v.4.0.2) package cDriver (ref. 43).
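For orientation, the relationship between VAF, purity, copy number and CCF can be illustrated with a commonly used point estimate. This is not necessarily the calculation performed inside cDriver; the function name, the default mutation multiplicity of 1 and the diploid normal copy number are assumptions made for the sketch.

```python
def ccf_point_estimate(vaf: float, purity: float, tumor_cn: int,
                       multiplicity: int = 1, normal_cn: int = 2) -> float:
    """Estimate the cancer cell fraction (CCF) of a mutation.

    CCF = VAF * (purity * tumor_cn + (1 - purity) * normal_cn)
          / (purity * multiplicity), capped at 1.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1 or tumor_cn < multiplicity:
        raise ValueError("multiplicity must be between 1 and tumor_cn")
    # Average number of copies of the locus per cell in the mixed sample.
    average_cn = purity * tumor_cn + (1.0 - purity) * normal_cn
    ccf = vaf * average_cn / (purity * multiplicity)
    return min(ccf, 1.0)


if __name__ == "__main__":
    # A clonal single-copy mutation in a pure tetraploid line has VAF 0.25.
    print(ccf_point_estimate(vaf=0.25, purity=1.0, tumor_cn=4))
```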

Mutational signature and MSI analyses

Mutational signatures were extracted with the R (v.4.0.2) package MutationalPatterns (v.3.2.0) (ref. 74) using the COSMIC Mutational Signatures catalog v.3 (ref. 75). We used the function fit_to_signatures with default parameters and included only those mutational processes known to operate in human colon and/or lung cancer (excluding tobacco smoking) (ref. 75): SBS1, SBS5, SBS6, SBS10a, SBS10b, SBS14, SBS15, SBS17a, SBS17b, SBS18, SBS21, SBS26, SBS28, SBS37, SBS40 and SBS44. For visualization, we collapsed signatures of MMRd (SBS6, SBS14, SBS15, SBS21, SBS26 and SBS44; labeled as MMRd) and POLE deficiency (SBS10a, SBS10b, SBS28 and SBS17b; labeled as POLE). Goodness of fit was determined by computing cosine similarity between observed and reconstructed mutational spectra using estimated signature contributions. To estimate the contribution of each mutational process to the human MSI CRC mutational spectrum, we analyzed somatic mutations in the Kwon et al. cohort (non-formalin-fixed paraffin-embedded samples) (ref. 35) following the methodology described above. Sample MSI score was calculated using MSIsensor-pro (v.1.2.0) (ref. 34) against matched normal tails.
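The goodness-of-fit measure can be sketched in a few lines: the reconstructed spectrum is the contribution-weighted sum of signature profiles, and cosine similarity compares it with the observed spectrum. The function names and the list-based data layout below are illustrative assumptions, not the MutationalPatterns API.

```python
import math
from typing import Sequence


def reconstruct(signatures: Sequence[Sequence[float]],
                contributions: Sequence[float]) -> list:
    """Sum signature profiles weighted by their fitted contributions."""
    if len(signatures) != len(contributions):
        raise ValueError("one contribution is needed per signature")
    n_channels = len(signatures[0])
    return [sum(c * sig[k] for sig, c in zip(signatures, contributions))
            for k in range(n_channels)]


def cosine_similarity(observed: Sequence[float],
                      reconstructed: Sequence[float]) -> float:
    """Cosine of the angle between two mutational spectra."""
    if len(observed) != len(reconstructed):
        raise ValueError("spectra must have the same number of channels")
    dot = sum(o * r for o, r in zip(observed, reconstructed))
    norm = (math.sqrt(sum(o * o for o in observed))
            * math.sqrt(sum(r * r for r in reconstructed)))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero spectrum")
    return dot / norm


if __name__ == "__main__":
    # Toy example with three channels and two signatures.
    sigs = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
    fitted = reconstruct(sigs, [10.0, 20.0])
    print(cosine_similarity([6.0, 14.0, 10.0], fitted))
```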

Clonal deconvolution by targeted amplicon sequencing

To identify private somatic SNVs for distinguishing individual clones in the M1–8 mixed clone tumors (Fig. 4c and Extended Data Fig. 4g,h), we compiled all clonal SNVs in copy-number-neutral regions (four copies, as all lines were tetraploid by metaphase spreads). We then checked the BAM files across all other samples for the complete absence of reads supporting the alternative allele (base quality >20, mapping quality >30) using an in-house Python script relying on the Pysam library. Four private SNVs for each clone and four common SNVs were validated by PCR amplification and Sanger sequencing before proceeding. The 200- to 250-bp regions spanning these SNVs were either individually PCR amplified from samples, gel purified and combined, or amplified in parallel using a multiplexed PCR panel with primers carrying unique molecular indices (CleanPlex UMI Custom Panel, Paragon Genomics). Amplicon libraries were sequenced on the Illumina NovaSeq 6000 S4 platform with 150-bp paired-end chemistry.

Reads were aligned to a fasta reference file of all targets (±250 nt upstream/downstream of SNV, GRCm38) using BWA-MEM (v.0.7.17-r1188) (ref. 65), following the GATK Best Practices workflow. Pileups were generated using the mpileup function of bcftools v.1.10.2 (ref. 76) with --min-BQ 30 and a bed file of all SNV coordinates. For CleanPlex UMI libraries, the following functions in fgbio (v.2.0.1) (https://github.com/fulcrumgenomics/fgbio) were called to extract unique molecular identifiers (UMIs) and call consensus reads: ExtractUmisFromBam, GroupReadsByUmi and CallMolecularConsensusReads. Using a customized R (v.4.0.2) script, total and SNV-specific depths at all locations were extracted. All SNVs were supported by more reads than other alternative alleles in the M1–8 clone-equal mixture control, except for M6_2, which was excluded from subsequent analysis. Background PCR/sequencing error for each SNV was estimated using the median observed frequencies of SNVs in all metastases of different clones, which represented truly clonal controls. SNV frequencies were adjusted by subtracting background values. Clonal percentages in ex vivo tumors were estimated by taking the median of private SNV frequencies, multiplying by 4 (SNVs are 1/4n) and dividing by tumor purity, which was estimated as the median observed/expected ratio of frequencies of the four common SNVs (present in all clones).
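The clonal percentage arithmetic can be sketched as below. The function name, the clamping of background-subtracted frequencies at zero and the expected common-SNV frequency of 0.25 (one mutant copy in a pure tetraploid tumor) are assumptions for illustration; the original analysis used a customized R script that is not reproduced here.

```python
from statistics import median
from typing import Sequence


def clone_fraction(private_freqs: Sequence[float],
                   private_background: Sequence[float],
                   common_freqs: Sequence[float],
                   expected_common: float = 0.25,
                   copies_per_cell: int = 4) -> float:
    """Estimate the fraction of tumor cells belonging to one clone.

    private_freqs      -- observed frequencies of the clone's private SNVs
    private_background -- per-SNV background error estimates
    common_freqs       -- background-adjusted frequencies of the common SNVs
    """
    # Subtract background; clamp at zero (assumption) to avoid negatives.
    adjusted = [max(f - b, 0.0)
                for f, b in zip(private_freqs, private_background)]
    # Purity: median observed/expected ratio across the common SNVs.
    purity = median(f / expected_common for f in common_freqs)
    if purity <= 0:
        raise ValueError("estimated purity must be positive")
    # Each private SNV sits on 1 of 4 copies, hence the factor of 4.
    return median(adjusted) * copies_per_cell / purity


if __name__ == "__main__":
    frac = clone_fraction([0.06, 0.05, 0.07, 0.05], [0.01] * 4, [0.125] * 4)
    print(round(frac, 3))
```

In this toy input the adjusted private frequencies have a median of 0.045 and the purity is 0.5, so the clone accounts for 0.045 × 4 / 0.5 = 0.36 of tumor cells.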

Neoantigen prediction and expression

Variant consequence was annotated using Ensembl Variant Effect Predictor (VEP) v.99 (ref. 77) with Wildtype and Downstream plugins, the VEP cache and reference genome for GRCm38, and the following parameters: --symbol, --terms=SO, --cache, --offline, --transcript_version and --pick. The --pick parameter was reordered from default to report the transcript with most extreme consequence for each variant: rank, canonical, appris, tsl, biotype, ccds, length and mane. Neoepitopes were predicted with C57BL/6 mouse MHC-I alleles, H2-K1 (H-2Kb) and H2-D1 (H-2Db) and variant effect predictions using pVACtools v.1.5.7 (ref. 78). Mutant peptides of 8–11 amino acids were generated and MHC:peptide binding affinity was predicted for all peptide:MHC allele pairs with NetMHC-4.0, NetMHCpan-4.0, SMM v.1.0 and SMMPMBEC v.1.0 (refs. 79,80,81,82). The median value across all affinity predictions was taken as the final measure of binding affinity. Neoantigens were subset to those with median predicted H-2Kb/H-2Db affinity ≤500 nM. Where multiple neoantigens were predicted for the same SNV, only that with highest predicted affinity was retained.
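The prioritization logic (median across predictors, 500 nM cutoff, one best peptide per SNV) can be sketched as follows. The record layout and function name are hypothetical and do not reflect pVACtools output columns; highest affinity is taken to mean the lowest predicted median value in nM.

```python
from statistics import median
from typing import Dict, Iterable, Tuple


def select_neoantigens(candidates: Iterable[dict],
                       max_nm: float = 500.0) -> Dict[str, Tuple[str, float]]:
    """Keep, per SNV, the peptide with the lowest median predicted affinity.

    Each candidate is a dict with hypothetical keys:
      "snv"     -- variant identifier
      "peptide" -- mutant peptide sequence
      "ic50_nm" -- list of predicted affinities (nM), one per algorithm
    """
    best: Dict[str, Tuple[str, float]] = {}
    for cand in candidates:
        med = median(cand["ic50_nm"])
        if med > max_nm:
            continue  # fails the affinity cutoff
        current = best.get(cand["snv"])
        # Lower nM means stronger predicted binding.
        if current is None or med < current[1]:
            best[cand["snv"]] = (cand["peptide"], med)
    return best


if __name__ == "__main__":
    demo = [
        {"snv": "v1", "peptide": "AAAAAAAAL", "ic50_nm": [100, 300, 200, 250]},
        {"snv": "v1", "peptide": "AAAAAAAL", "ic50_nm": [50, 80, 60, 70]},
        {"snv": "v2", "peptide": "CCCCCCCCL", "ic50_nm": [900, 800, 700, 600]},
    ]
    print(select_neoantigens(demo))
```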

To assess allele-specific expression of neoantigens, RNA-sequencing (RNA-seq) was performed on autochthonous lung tumors (10 sgMsh2 and 10 sgMsh2 with αCD4/8 treatment) and M1–8 clones. Complementary DNA libraries were prepared using Kapa mRNA Hyperprep and 150-bp paired-end sequencing was performed on the Illumina NextSeq platform. Reads were aligned to the reference genome (GRCm38) using STAR v.2.7.1a (ref. 83) with outFilterMultimapNmax = 20, alignSJoverhangMin = 8, alignSJDBoverhangMin = 1, outFilterMismatchNmax = 999, outFilterMismatchNoverLmax = 0.1, alignIntronMin = 20, alignIntronMax = 1,000,000, alignMatesGapMax = 1,000,000, outFilterScoreMinOverLread = 0.33, outFilterMatchNminOverLread = 0.33 and limitSjdbInsertNsj = 1,200,000. PCR duplicates were removed using Picard v.2.23.4 (ref. 84). Considering somatic variants identified by WES, we used a customized Python script to interrogate the presence of these variants in the RNA-seq BAM files. Only nonduplicate reads with mapping quality ≥255 and bases with base quality ≥20 were considered to compute VAFs.

Histology and immunohistochemistry

Quantification of lung tumor burden by grade was performed on scans of hematoxylin and eosin (H&E)-stained sections by an automated convolutional neural network (CNN) developed in collaboration with Aiforia Technologies Oy and in consultation with veterinary pathologist R. Bronson. Using semantic multi-class segmentation, the CNN was trained to classify lung parenchyma and adenocarcinoma grades 1–4. For supervised training, selected areas from 93 slides were chosen. The algorithm performed consistently and with high correlation with human graders across multiple validation datasets independent of the training dataset. Algorithm v.NSCLC_v25 was used. Triple staining (CD8a, CD4 and FOXP3) immunohistochemistry (IHC) and CNN quantification (Aiforia) were performed as previously described (ref. 49). CD3 infiltration in single-stain slide scans was measured as percentage of pixels positive for stain (diaminobenzidine) in Aperio ImageScope. The area of positive and negative MSH2 staining was quantified by manual annotation in QuPath v.0.1.2 (ref. 85).

Western blotting

Cells were lysed in radioimmunoprecipitation (RIPA) buffer (Thermo Fisher Scientific), protein concentration determined using BCA Protein Assay (Thermo Fisher Scientific) and equal protein quantities (20–40 μg) run on NuPage 4–12% Bis–Tris gradient gels (Thermo Fisher Scientific) by sodium dodecylsulfate–polyacrylamide gel electrophoresis and transferred to poly(vinylidene fluoride) membranes. Western blotting was performed against MSH2 (catalog no. D24B5, Cell Signaling Technology) at 1:1,000, MLH1 (catalog no. ab92312, Abcam) at 1:1,000, glyceraldehyde 3-phosphate dehydrogenase (catalog no. 6C5, Santa Cruz) at 1:5,000 and β-actin (catalog no. 13E5, Cell Signaling Technology) at 1:5,000. Blots were stained with horseradish peroxidase (HRP) anti-rabbit immunoglobulin G and developed with Western Lightning Plus-ECL (Perkin Elmer) on X-ray film.

In vivo antibody and chemotherapy dosing

Antibodies were delivered intraperitoneally in 100 μl of phosphate-buffered saline (PBS). αCD4 (catalog no. GK1.5, BioXCell) and αCD8 (catalog no. 2.43, BioXCell) were administered at 200 μg every 4 d. αPD-1 (catalog no. 29F.1A12, BioXCell) was administered at 200 μg 3× a week. αCTLA-4 (catalog no. 9H10, BioXCell) was administered at an initial dose of 200 μg, with subsequent doses at 100 μg, 3× a week. Oxaliplatin (Sigma-Aldrich) and cyclophosphamide (Sigma-Aldrich) (Oxa/Cyc) were co-delivered intraperitoneally in 100 μl of PBS at 2.5 mg per kg body weight and 50 mg per kg body weight, respectively, once a week for 3 weeks, as previously described (ref. 42).

In vivo tumor imaging and quantification

Lung tumor progression was monitored longitudinally by X-ray microcomputed tomography (μCT) using a GE eXplore CT 120 system, as previously described (ref. 86). Solid lung volume (tumor burden) was quantified using a customized MATLAB (MathWorks) script, as previously described (ref. 86). Colon tumor progression was monitored longitudinally using a Karl Storz colonoscopy system with white light and red fluorescent protein fluorescence and biopsy forceps serving as a landmark for objective positioning, as previously described (ref. 49).

Lentiviral constructs

The U6::sgRNA-EFS::Cre (pUSEC) lentiviral construct (ref. 86) was digested with BsmBI and sgRNAs cloned as previously described (ref. 87). H1::sgApc-U6::sgRNA-EFS::mScarlet was generated by Gibson assembly using an H1::sgApc-scaffold gBlock synthetic gene fragment (IDT), PCR amplicons of U6::BsmBI-filler-BsmBI-scaffold, elongation factor-1α short (EFS) promoter and mScarlet (ref. 88), and a lentiviral backbone from the Trono laboratory (Addgene). This was then digested with BsmBI and a second sgRNA cloned as above. The sgRNA sequences, including previously published sgApc (ref. 33) and sgCtl (mScarlet targeting) (ref. 64), are provided in Supplementary Table 4. The sgRNA controls against Olfr102 and mScarlet were used interchangeably with no observed differences in tumorigenesis. PGK::Msh2-EFS::PuroR was generated by Gibson assembly using multiple gBlocks spanning murine Msh2 (C57BL/6), PCR amplicons of PGK (3-phosphoglycerate kinase) promoter, EFS promoter and PuroR, and the Trono lentiviral backbone. All primers were ordered from Sigma-Aldrich.

Validation of CRISPR–Cas9 editing and estimation of tumor purity

To validate efficiency of gene editing by clustered regularly interspaced short palindromic repeats (CRISPR)–Cas9, 200- to 250-bp regions spanning sgRNA sites in the genome were amplified and deep sequenced (Massachusetts General Hospital DNA Core). Colon tumor purity was estimated using the non-wild-type allele fraction at the sgRNA-targeted site in Apc. Loss of Apc is a prerequisite for tumorigenesis in the model and thus an assumption was made that all tumor cells harbor loss-of-function edits at this locus. Tumor purity in sgMsh2-targeted lung tumors was estimated using WES BAM coverage spanning exons of the Trp53flox allele and flanking genes (Wrap53, Atp1b2), which was retrieved using the bedcov function of SAMtools v.1.10. The ratio of median coverage in flanking exons (Wrap53 exons 0–9, Trp53 exon 11 and Atp1b2 exons 0–6) versus Trp53 exons flanked by Cre loxP sites (exons 2–10) was calculated in tumors and normal tails. This ratio in tumors, representing the extent of Trp53flox recombination, was then normalized to the median ratio across matched normal tails to estimate tumor purity, with the assumption that all tumor, but not normal, cells underwent complete recombination of Trp53flox alleles. Efficiency of Msh2 knockout in KP; Msh2flox/flox lung tumors was similarly estimated by taking the ratio of reads at the exon flanked by loxP sites (exon 12) and surrounding exons, and adjusting this by tumor purity.
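The text does not give the exact formula linking the normalized coverage ratio to purity. One way to make the conversion explicit, offered here only as an assumed sketch, is to note that if a fraction p of cells has lost the floxed exons entirely and all cells contribute equal DNA at this locus, the flanking/floxed ratio rises by a factor of 1/(1 − p) relative to normal tissue, so p = 1 − (normal ratio / tumor ratio). Function names are hypothetical.

```python
from statistics import median
from typing import Sequence


def coverage_ratio(flanking_cov: Sequence[float],
                   floxed_cov: Sequence[float]) -> float:
    """Median coverage of flanking exons over median coverage of floxed exons."""
    return median(flanking_cov) / median(floxed_cov)


def purity_from_ratios(tumor_ratio: float,
                       normal_ratios: Sequence[float]) -> float:
    """Convert a tumor flanking/floxed ratio into an estimated purity.

    Assumes complete recombination in tumor cells, none in normal cells,
    and equal locus copy number in both (assumptions of this sketch).
    """
    normalized = tumor_ratio / median(normal_ratios)
    if normalized <= 0:
        raise ValueError("coverage ratio must be positive")
    purity = 1.0 - 1.0 / normalized
    # Clamp to [0, 1] to absorb coverage noise.
    return min(max(purity, 0.0), 1.0)


if __name__ == "__main__":
    tumor = coverage_ratio([100, 110, 90], [50, 55, 45])
    print(purity_from_ratios(tumor, [1.0, 1.05, 0.95]))
```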

In vitro cell-line assays

Serial live cell imaging of cell lines grown in 96-well plates (Corning) and quantification of confluence were performed with an IncuCyte S3 (Sartorius). Eight replicate wells were seeded with 100 cells and imaged every 3 h for ~6 d. Murine IFN-γ (PeproTech) was used for in vitro stimulation of cell lines for 24 h, followed by live/dead staining (ghost ef780 (Corning), 1:500) in PBS and surface staining in 1 mM EDTA, 25 mM Hepes, 0.5% heat-inactivated FBS in PBS with anti-H-2Kb allophycocyanin (APC) (catalog no. AF6-, Thermo Fisher Scientific, 1:200), anti-H-2Db FITC (catalog no. 28-14-8, Thermo Fisher Scientific, 1:200) and anti-PD-L1 phycoerythrin (PE)-Cy7 (catalog no. 10F.9G2, BioLegend, 1:200). Samples were run on a BD LSRFortessa using BD FACSDiva v.8.0 software. Results were analyzed in FlowJo v.10.4.2, excluding dead (ef780-positive) cells.

Phylogenetic tree analysis

All somatic SNVs and indels called by the WES analysis pipeline in M1–8 clones and the 09-2 parental cell line were considered in constructing a phylogenetic tree. The R (v.4.0.2) Bioconductor package phangorn (v.2.7.0) was used to construct a tree from a binary presence/absence matrix of mutations across clones and 09-2_par. Specifically, the function pratchet was used to calculate the tree using the parsimony ratchet method and the function acctran was used to calculate branch lengths.

MHC-I immunopeptidomics

MHC-I (H-2Kb and H-2Db) peptide isolation was performed on 10⁸ cells per triplicate for each M1–8 clone as we have previously described (ref. 49). Cells were grown to confluence before stimulation with 10 ng ml⁻¹ of murine IFN-γ (PeproTech) for 18 h before collection. Pulldowns were performed with 40 μl (bed volume) of rProtein A Sepharose beads (GE Healthcare) preloaded with 1 mg of anti-H-2Kb antibody (Y3, BioXCell) and 1 mg of anti-H-2Db antibody (catalog no. 28-14-8S, purified from HB-27 hybridoma), performed sequentially. Peptides were eluted in 500 μl of 10% acetic acid and purified with 10-kDa MWCO spin filters (PALL Life Science).

MS–MS was performed on eluted peptides as we have previously described (ref. 49). Tandem mass spectra were searched with Sequest (Thermo Fisher Scientific, v.IseNode in Proteome Discoverer). Sequest was set to search the mouse UniProt database (3 July 2020 version) with 55,650 entries, including common contaminants and green fluorescent protein, Cas9, puromycin and P2A (present in the cell lines), assuming no digestion enzyme, with fragment and parent ion mass tolerances of 0.02 Da and 10.0 p.p.m., respectively. TMTpro was added as a fixed modification on the carboxy and amino termini of peptides. Oxidation of methionine was specified in Sequest as a variable modification. The resulting peptides were filtered to exclude peptides with an isolation interference >30% and p.p.m. error >±3 of the median p.p.m. error of all peptide-spectrum matches. Peptides were further prioritized based on concordance of relative abundance across clones with presence/absence of the associated somatic mutation.
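The two filtering criteria on peptide-spectrum matches (PSMs) can be sketched as follows. The dictionary keys and function name are hypothetical and do not correspond to Proteome Discoverer export column names.

```python
from statistics import median
from typing import List, Sequence


def filter_psms(psms: Sequence[dict],
                max_interference_pct: float = 30.0,
                max_ppm_deviation: float = 3.0) -> List[dict]:
    """Drop PSMs with high isolation interference or outlying mass error.

    Each PSM is a dict with hypothetical keys:
      "interference_pct" -- isolation interference, in percent
      "ppm_error"        -- precursor mass error, in p.p.m.
    """
    if not psms:
        return []
    # The mass-error window is centered on the median error of all PSMs.
    median_ppm = median(p["ppm_error"] for p in psms)
    return [
        p for p in psms
        if p["interference_pct"] <= max_interference_pct
        and abs(p["ppm_error"] - median_ppm) <= max_ppm_deviation
    ]


if __name__ == "__main__":
    demo = [
        {"peptide": "P1", "interference_pct": 5.0, "ppm_error": 1.0},
        {"peptide": "P2", "interference_pct": 45.0, "ppm_error": 1.2},
        {"peptide": "P3", "interference_pct": 10.0, "ppm_error": 6.0},
        {"peptide": "P4", "interference_pct": 12.0, "ppm_error": 0.8},
    ]
    print([p["peptide"] for p in filter_psms(demo)])
```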

Dendritic cell vaccination, ELISpot and MHC-I multimer staining

Bone marrow-derived dendritic cells were prepared, activated, loaded with putative neoepitopes and injected intradermally at the base of the tail of healthy C57BL/6 mice, followed by two heterologous boosts, as previously described (ref. 45). A week after the second boost, spleen and peripheral blood were collected for IFN-γ ELISpot and MHC-I:epitope tetramer flow cytometric assays. Red blood cells were first lysed with ACK lysis buffer. IFN-γ ELISpot was performed following the manufacturer’s recommendations (ImmunoSpot, Cellular Technology Limited) using 750,000 splenocytes per well. H-2Kb tetramers were custom generated as previously described (ref. 45) and used at 1:200 dilution. H-2Db tetramers were generated using UV-labile monomers (UVX Flex-T, BioLegend) following the manufacturer’s recommendations and used at 1:50 dilution. H-2Kb:QAYAFLQHL dextramers were generated using the U-Load Dextramer Kit (Immudex) following the manufacturer’s recommendations and used at 1:10 dilution.

Tissue preparation and flow cytometry

Two minutes before euthanasia, mice were injected retro-orbitally with anti-CD45 APC-eFluor786 (catalog no. 30-F11, BioLegend, 1:50) to stain intravascular immune cells. Mediastinal lymph nodes and whole tumor-bearing lungs were collected and mechanically dissociated in RPMI-1640 (Corning) with 5% heat-inactivated (HI)-FBS (collection medium). Lungs were placed in digestion buffer containing 500 U ml⁻¹ of collagenase type IV and 20 μg ml⁻¹ of DNase (Sigma-Aldrich) in collection medium, lightly minced and digested at 37 °C for 30 min with gentle agitation, and further dissociated with a gentleMACS Octo Dissociator (Miltenyi Biotec) on the tumor_imp1.1 setting and passed through a 100-μm filter. Live/dead staining (Ghost Dye Red 780, Corning, 1:500 dilution) was performed in PBS and surface stains in FACS buffer (1 mM EDTA, 25 mM Hepes and 0.5% HI-FBS in PBS). For assessment of T cell depletion (Extended Data Fig. 2m), the following antibodies were used for surface staining: CD45 BV785 (catalog no. 30-F11, BioLegend, 1:200), CD3 BV421 (catalog no. 17A2, BioLegend, 1:400), CD8a BUV395 (catalog no. 53-6.7, BioLegend, 1:400) and CD4 AF647 (catalog no. RM4-5, BioLegend, 1:400). For analysis of neoantigen-specific T cells (Extended Data Fig. 5e), the following antibodies were used for surface staining: CD8a BUV395 (as above), CD4 BV711 (catalog no. RM4-5, BioLegend, 1:200), CD44 BV785 (catalog no. IM7, BioLegend, 1:200) and GZMB PE-CF594 (catalog no. GB11, BD Biosciences, 1:250); and intracellular staining: TCF1 AF647 (catalog no. C63D9, CST, 1:200). Cells were fixed for 1 h at room temperature in Fixation/Permeabilization Concentrate (Thermo Fisher Scientific) diluted 1:3 in Fixation/Permeabilization diluent (Thermo Fisher Scientific) and washed in permeabilization buffer (Thermo Fisher Scientific). Intracellular staining was performed in permeabilization buffer overnight at 4 °C. Cells were washed and resuspended in FACS buffer for analysis on a BD LSRFortessa four-laser flow cytometer running BD FACSDiva v.8.0 software. Results were analyzed in FlowJo v.10.4.2. Single lymphocytes were gated first on forward versus side scatter (FSC-A versus SSC-A) and then FSC-A versus FSC-H. Then, live CD8+ T cells were gated on positive CD8α and negative Ghost Dye Red 780 staining. Antigen-specific CD8+ T cells were further gated on CD44 positivity and tetramer/dextramer positivity in two channels (PE/APC). Expression of additional markers was analyzed specifically in this neoantigen-specific CD8+ T cell population.

Analysis of human MMRd cancer clinical trials

Raw WES reads from the Bortolomeazzi et al. (ref. 50) and Kwon et al. (ref. 35) trials (ClinicalTrials.gov identifiers: NCT02563002 and NCT02589496) were mapped to the reference human genome (GRCh38) using BWA-MEM v.0.7.17-r1188 (ref. 65). Aligned reads were processed as BAMs following the GATK v. Best Practices workflow to remove duplicates and recalibrate base quality scores (ref. 66). Somatic SNVs and indels were detected using the same pipeline and callers described above for mouse tumors. CCF values were estimated as described above.

Raw RNA-seq reads were mapped to the human reference genome (GRCh38) using STAR v.2.7.1a (ref. 83). STAR was run using the same parameters as described above in the mouse analysis. The htseq-count script from the Python library HTSeq (v.2.0.1) (ref. 89) was used to compute read counts for each gene (Ensembl release GRCh38.90), which were normalized to transcripts per million. Neoantigens were predicted and prioritized as described above in the mouse analysis. Clonal and subclonal neoantigens were classified as CCF ≥ 0.75 and <0.5, respectively.
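Normalization of raw counts to transcripts per million (TPM) follows a standard two-step calculation: divide each gene's count by its length, then rescale so that the values sum to one million. The sketch below assumes a per-gene length in base pairs is available; which length definition was used in the original analysis is not stated, and the function name is hypothetical.

```python
from typing import Dict


def counts_to_tpm(counts: Dict[str, float],
                  lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Convert raw per-gene read counts to transcripts per million."""
    # Step 1: length-normalized rate (reads per kilobase).
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0)
             for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads were counted")
    # Step 2: rescale so that all genes sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


if __name__ == "__main__":
    tpm = counts_to_tpm({"geneA": 10, "geneB": 20, "geneC": 0},
                        {"geneA": 1000, "geneB": 4000, "geneC": 2000})
    print(tpm)
```

In this toy input geneA has the higher TPM despite fewer reads, because its rate per kilobase (10) exceeds that of geneB (5).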

Clinical responses were binned into two groups: OR, including partial and complete responders, and NR, including patients with stable and progressive disease. PFS analysis was performed on a combination of both trials (refs. 35,50) with the trial study as a covariate. Importantly, there was no significant difference in PFS between patients from the two trials. Cox’s regression was performed in R (v.4.0.2) using the package survival (v.3.4-0) (ref. 90) with comparisons of patients in the upper versus lower quartiles of each variable tested.

Statistics and reproducibility

Statistical analyses and plotting were performed in R (v.4.2.1) using built-in functions and ggplot2 (v.3.4.1), beeswarm (v.0.4.0), corrplot (v.0.88), eulerr (v.6.1.0), gplots (v.3.1.3), survival (v.3.4-0), survminer (v.0.4.9) and RColorBrewer (v.1.1.3). To assess statistical significance, Fisher’s exact 2 × 2 test was used on categorical variables and two-tailed Wilcoxon’s rank-sum test or Student’s t-test (where the assumption of normality was met) was used on continuous variables. HRs were calculated and compared using Cox’s proportional hazards regression. Multiple-comparison corrections were performed using Holm’s method. No statistical method was used to predetermine sample size. In preclinical trials of lung and colon models, only those animals with apparent tumors by μCT or colonoscopy were recruited. No other data were excluded from analyses. Preclinical trials were randomized and investigators blinded to allocation during dosing, imaging and quantification. No experiments failed to replicate.
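For readers unfamiliar with Holm’s method, the step-down adjustment can be written in a few lines. The analyses here were run in R; the Python function below is only an illustrative re-implementation of the standard procedure, with a hypothetical name.

```python
from typing import List, Sequence


def holm_adjust(pvalues: Sequence[float]) -> List[float]:
    """Return Holm-adjusted P values in the original input order."""
    m = len(pvalues)
    # Indices of the P values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    print(holm_adjust([0.01, 0.04, 0.03]))
```

For the three P values in the example, the smallest is multiplied by 3, the next by 2 and the largest by 1, and the last adjusted value is raised to match its predecessor so that the adjusted values never decrease along the sorted order.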

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
