Deep phenotyping of post-infectious myalgic encephalomyelitis/chronic fatigue syndrome


Ethics statement

All research procedures were approved by the NIH Central IRB (NCT 02669212) and performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all study participants.

Recruitment and screening

The PI-ME/CFS group was selected based on medical record documentation of persistent and severe fatigue and post-exertional malaise as the consequence of an acute infection within the last five years without a prior history of explanatory medical or psychiatric illness. The full inclusion/exclusion criteria can be found in Supplementary Data S1.

Potential ME/CFS participants initially completed a telephone screening interview. Those passing the initial screen were contacted for a physician telephone interview and medical record review. This review process was iterative, completing when adequate documentation of infection and ME/CFS was provided, when exclusionary documentation was noted, or when all available medical records were exhausted. Only participants with adequate documentation were invited to NIH for a phenotyping visit for case ascertainment and to collect research measures.

The healthy volunteer (HV) group consisted of demographically matched persons without clinical fatigue and free from medical disease. HVs were recruited from referrals from the NIH Office of Patient Recruitment and responses to study advertisements. HVs were admitted to the NIH Clinical Center for a phenotyping visit to rule out occult illness and to collect research measures. HVs without occult illness were invited to return for an additional exercise stress visit.

Demographic information, including age, birth sex, and gender, for all participants can be found in the Source Data file for Fig. 1b. Birth sex and gender information were collected by self-report during an intake interview by the study physician (BW). ME/CFS is reported to occur three times more frequently in females and efforts were made in recruitment to obtain a balanced birth sex distribution. All participants were compensated consistent with National Institutes of Health guidelines.

Case ascertainment

Participants who potentially qualified based on screening were admitted to the NIH Clinical Center for a week-long case ascertainment visit. A detailed history and physical was performed on each participant in duplicate. An internal medicine nurse practitioner and a rheumatologist performed separate evaluations and consulted together after both were completed. A complete standardized neurological evaluation was performed by a board-certified neurologist. A psychiatric evaluation was performed by a licensed psychologist. Medical consultants were engaged to evaluate participants when appropriate. Laboratory and imaging studies were performed to investigate potential health issues noted during these medical evaluations. Clinical laboratory testing of blood and cerebrospinal fluid, brain magnetic resonance imaging, polysomnography, and an orthostatic challenge were also used in case ascertainment and are described in more detail below.

We excluded participants taking systemic immunomodulatory drugs. All participants were on stable medication dosages throughout the study. Medications that would interfere with study measurements were tapered off for a minimum of three half-lives prior to collection.

Case adjudication

Clinical information from the visit was compiled and reviewed by a Case Adjudication panel. Adjudicators were all recognized clinical experts in ME/CFS (Lucinda Bateman, Andy Kogelnik, Anthony Komaroff, Benjamin Natelson, Daniel Peterson). Each adjudicator performed their own independent review to assign both ME/CFS case status and temporality of ME/CFS onset to an infection. When discrepancies arose between adjudicators, a case adjudication meeting was convened. Adjudicators had to unanimously agree that a participant developed ME/CFS after a documented infection for a case to be considered adjudicated and included in the analyses. Positively adjudicated participants were also invited to return for an additional 10-day long exercise stress visit.

To be considered an adjudicated case, participants were required to be unanimously considered to be a case of PI-ME/CFS by the protocol’s adjudication committee, meet at least one of three ME/CFS criteria (1994 Fukuda Criteria64, 2003 Canadian Consensus Criteria for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome65, or the Institute of Medicine Diagnostic Criteria66), have moderate to severe clinical symptom severity as determined by having a Multidimensional Fatigue Inventory (MFI) score of ≥13 on the general fatigue subscale or ≥10 on the reduced activity subscale, and functional impairment as determined by having a Short-Form 36 (SF-36) score of ≤70 on physical function subscale, or ≤50 on role physical subscale, or ≤75 on social function subscale.

Life narratives and qualitative interviews

Detailed life narratives and qualitative interviews were conducted to understand the lived experience of PI-ME/CFS, the context of the life it occurred in, and to capture the point-in-time experience of post-exertional malaise. Participants had a life narrative collected by a study investigator, an interview about the impact of PI-ME/CFS by an occupational therapist, and up to 10 brief semi-structured qualitative interviews that were developed based on data collected from a preparatory focus group study34. Evocative quotes from the transcripts were selected by the investigators to give a patient voice to match the findings reported.

Performance validity testing

Symptom validity tests were administered to determine if subjective reporting was consistent within individuals, to identify atypical patterns of responding, and to relate these measures to population norms. Performance validity tests were administered to determine whether neuropsychological test performances completed during the study were responded to validly, that is, they reflect neurocognitive functioning and are not unduly impacted by non-cognitive factors, such as poor engagement in testing. These tests include: Minnesota Multiphasic Personality Inventory – 2 Restructured Form (MPI2-RF): examinees have to complete true-false items that best describes themselves and was scored using various empirically-derived validity indices;67 B Test: examinees have to rapidly discriminate between letter stimuli;68 Dot Counting Test: examinees have to count dots as rapidly as possible;69 Word Memory Test: a task where examinees have to view words and later remember them70.

Patient reported outcomes measures

The following symptom and health questionnaires were administered

Short-Form 36 (SF-36): a standard measure of health-related quality of life outcomes that has been tested and validated extensively in a number of clinical populations;71 CDC Symptom Inventory (CDC-SI): collects information on occurrence, frequency, and intensity of symptoms common in ME/CFS and other fatiguing illnesses;72 Multidimensional Fatigue Inventory (MFI): a validated 20-item self-report instrument designed to measure fatigue severity;73 Patient Reported Outcomes Measurement Information System—Short Forms (PROMIS-SF): a system of highly reliable, precise measures of patient–reported health status for physical, mental, and social wellbeing. PROMIS forms administered included Fatigue, Pain Behavior, Pain Interference, Pain Intensity, Global Health, Emotional Distress—Anxiety, Emotional Distress- Depression, Sleep-Related Impairment and Sleep Disturbances;74 The McGill Pain Questionnaire (MPQ): a list of 20 groups of adjectives to describe sensory, affective and evaluative aspects of pain;75 The Neuropathic Pain Scale (NPS): a questionnaire designed to assess the quality and the intensity of the neuropathic sensations;76 Polysymptomatic Distress Scale: a self-administered instrument that determines both the distribution of painful areas across the body and an estimate of related symptom burden that can be used to define fibromyalgia;77 Patient Health Questionnaire-15 (PHQ-15): a validated questionnaire used to assess somatic symptom severity and the potential presence of somatization and somatoform disorders;78 Pittsburgh Sleep Quality Index (PSQI): a measure of sleep quality over a 1-month period;79 Fatigue Catastrophizing Scale: a measure of catastrophizing related to the fatigue experience;80 The Multiple Ability Self-Report Questionnaire (MASQ): a questionnaire that assesses the subjective appraisal of cognitive difficulties in five cognitive domains: language, visual-perceptual ability, verbal memory, visual-spatial memory, and attention/concentration;81 and Belief about Emotions scale: A validated questionnaire designed to measure the beliefs regarding expressing negative thoughts and feelings82.

Brain magnetic resonance imaging

MRI was obtained on a 3.0 tesla Philips Achieva device. Sequences performed precontrast were 2D axial proton density-and T2-weighted, 15 direction diffusion tensor imaging (DTI) with b = 1000 DTI. 3D sagittal T1 magnetization prepared rapid acquisition gradient echo (MPRAGE), and T2 weighted fluid attenuated inversion recovery (FLAIR) with approximately 1 mm isotropic resolution. Gadolinium based contrast agent was injected slowly over approximately one minute while high resolution 0.55 isotropic susceptibility weighted imaging was obtained. Following this post contrast images were obtained using 3D sagittal T1 fast field echo (FFE) and T2 Weighted FLAIR techniques. DTI data was processed to generate diffusion weighted imaging (DWI) and apparent diffusion coefficient (ADC) images. Scans were evaluated in duplicate. A neuroradiologist and a neurologist performed separate evaluations. Differences in interpretation were resolved through consultation.

Small fiber density measures

Two 3-mm excisional skin biopsies were collected from the distal thigh and distal leg. Samples were fixed, sectioned and immunostained for the panaxonal marker PGP9.5 by free-floating immunohistochemistry. Four skin sections from each biopsy were randomly selected, immunostained, and mounted on a single slide and epidermal nerve fibers were visualized with confocal microscopy. This method provides an accurate representation of the biopsy sample while avoiding sampling error83. A diagnosis of small fiber sensory neuropathy is given based upon a length-dependent loss of epidermal nerve fibers.

Measures of brain injury

Plasma and cerebrospinal fluid were analyzed for markers of brain injury by immunoassay using digital array technology, which uses a single molecule enzyme-linked immunoarray (Simoa) method84. The Neurology 4-Plex A platform for Nf-L, Tau, GFAP, and UCH-L1 was used. In brief, paramagnetic capture beads coated with each relevant antibody, and a biotinylated detector for each relevant antibody are combined. Antibody coated paramagnetic capture beads and labeled biotinylated detector antibody bind to the relevant molecules present in the sample. Following a washing step, a conjugate of streptavidin-beta-galactosidase (SBG) is mixed with the capture beads. The captured molecules become enzymatically labeled when the SBG binds to the biotinylated detector antibodies. A second wash is performed, and the capture beads are resuspended in a resorufin beta-D-galactopyranoside (RGP) substrate solution. This suspension is transferred to the Simoa Disc. Individual paramagnetic capture beads settle into 216,000 femtoliter-sized microwells designed to hold no more than one bead per well. The beads are sealed into the microwells while excess beads are sealed into the microwells while excess beads are washed away with a synthetic fluorinated polymer sealing oil. If the measured molecule is present in the sample and subsequently captured and labeled, the beta-galactosidase hydrolyzes the RGP substrate and produces a fluorescent signal. This signal is detected and counted by the Simoa optical system. The concentrations of relevant molecules are interpolated from a standard curve.

Clinical laboratory measurements of blood and cerebrospinal fluid

The following panel of laboratory evaluations were performed on collected blood samples: acute care panel, mineral panel, hepatic panel, complete blood count with differential, prothrombin time, international normalized ratio, partial thromboplastin time, thyroid stimulating hormone, free thyroxine, triiodothyronine, iron, ferritin, transferrin saturation, fasting lipid panel, hemoglobin A1c, anti-nuclear antibody, Rheumatoid factor, anti-cyclic citrullinated antibody, anti-Smith antibody, anti-RNP, ssA, ssB, vitamin B12, 25(OH) vitamin D, 1,25(OH)2 vitamin D, folate, creatine kinase, c-reactive protein, erythrocyte sedimentation rate, d-dimer, quantitative immunoglobulins, flow cytometry for lymphocyte subsets, human immunodeficiency virus by enzyme-linked immunosorbent assay, Epstein-Barr virus and Cytomegalovirus by polymerase chain reaction, Epstein-Barr antibodies, C6 peptide antibodies, hepatitis panel, rapid plasma regain, and tryptase level.

Heavy metal screening was performed on urine samples collected over 24-hours in a CLIA certified laboratory using inductively-coupled plasma/mass spectrometry (ICP/MS).

Cerebrospinal fluid samples were analyzed for cell counts, glucose, protein, and oligoclonal bands.

Cerebrospinal fluid ICP-MS (inductively coupled plasma-mass spectroscopy)

ICP-MS is an analytical technique by which concentrations of elements are determined up to as low as ppt (parts per trillion) levels in liquid, solid or gaseous samples. Elements are led through a plasma source where the atomic forms of elements become ionized. These ions are then detected according to their masses.

Total iron concentrations in the cerebrospinal fluid samples were measured by ICP-MS (Agilent model 7900). For each sample, 200 µL of concentrated trace-metal-grade nitric acid (Fisher) was added to 200 µL of sample taken in a 15 mL Falcon tube. Tubes were sealed with electrical tape to prevent evaporation, taken inside a 1 L glass beaker, and then placed at 90 °C oven. After overnight digestion, each sample was diluted to a total volume of 4 mL with deionized water, and then analyzed by ICP-MS.

Dietary evaluation

Participants completed the Diet History Questionnaire (DHQII), an internet-based survey that asks 134 questions regarding dietary intake over the past year and eight questions about dietary supplement intake. Participants kept seven-day food records, which were reviewed by nutrition staff and coded into Nutrition Data Systems for Research (NDSR) software to obtain nutrient intake data.

Medication reconciliation

Medication and supplement use were collected during the history and physical exam.

Sleep measurements

PI-ME/CFS participants each had a standard clinical polysomnogram to evaluate for obstructive sleep apnea, periodic limb movements, and sleep fragmentation.

Heart rate variability

Standard three-lead ambulatory ECG monitors were used for 24-h recordings. All data were downloaded on-site and reviewed by ECG telemetry nurses and by a pediatric cardiologist with clinical electrophysiology training. The recorded data was analyzed using Spacelabs Impresario (version 3.07.0158) program. Arrhythmia and non-normal beats were detected, coded appropriately and excluded from subsequent HRV analysis. Tracings were reviewed for electrical and mechanical noise artifact and these portions of tracings were similarly excluded from subsequent analysis. Recordings with less than 22 h of data were excluded.

Orthostatic challenge

Participants were fitted with electrodes to measure cardiac signals and electrical impedance, a respiratory belt, a pulse oximeter, a finger-cuff for beat-to-beat blood pressure measurements, an automated blood pressure cuff, and a forearm plethysmograph transducer paired with a rapid-inflation brachial cuff for forearm blood flow measurements.

Prior to the orthostatic challenge, baseline hemodynamic measures and a blood sample were collected. The participant was then tilted head-up at a 70-degree angle. The orthostatic challenge was continued for 40 min, with hemodynamic information collected in real time and blood samples collected at four minute intervals. The orthostatic challenge was ended if a participant developed hemodynamic instability or acute symptoms. On completion of the orthostatic challenge, the participant was returned to a supine position for 10 min, at which time final hemodynamic and blood measures were made.

Baroreflex function measurements

Participants were fitted with electrodes to measure cardiac signals and electrical impedance, a respiratory belt, a pulse oximeter, a finger-cuff for beat-to-beat blood pressure measurements, an automated blood pressure cuff, and a forearm plethysmograph transducer paired with a rapid-inflation brachial cuff for forearm blood flow measurements. Deep breathing measures were collected with the participant supine breathing deeply at a rate of five to six breaths per minute for three minutes. Three or more Valsalva maneuvers were then performed, during which the participants blow against resistance for 12 s at 30 mmHg and then relaxes. For participants where a square wave phenomenon was observed, the participant was tilted at 20 degrees head up and the procedure repeated.

Cerebrospinal catechol measurements

Cerebrospinal levels of catechols were assayed by batch alumina extraction followed by liquid chromatography with series electrochemical detection as reported previously85,86. In summary, to freshly thawed CSF in a plastic sample tube, approximately 5 mg of acid washed alumina, internal standard, and TRIS/EDTA buffer were added for alumina adsorption of the catechols. The tube was shaken vigorously using a paint can shaker for about 20 min. The tube was then spun in a microfuge, the alumina forming a pellet at the bottom of the tube. The supernatant was removed, and the alumina was washed twice. Then, after removal of the supernatant, 100 µL of an acidic eluting solution was added to the tube for desorbing the catechols from the alumina. The tube was shaken in a vortex mixer and then centrifuged. The supernatant was removed manually using a pipette and transferred to a microvial and placed in the carousel of the automated injector. For most samples 90 µL was injected onto the liquid chromatography column. The column eluate was passed through 3 electrodes in series, the first set at an oxidizing potential and the third set at a reducing potential. The electrochemical signal from the reducing electrode was recorded using proprietary software, peak heights of compounds with retention times of interest were measured, and concentrations of analytes in units of pg/mL were tabulated in a spreadsheet using a macro after adjustment for analytical recovery of the internal standard. For reporting purposes concentration in pmol/mL were used.

Psychiatric evaluation

The following psychological inventories were administered: Composite International Diagnostic Interview Trauma Section (CIDI-Trauma): a validated survey that characterizes a participant’s previous traumatic experiences;87 Post-traumatic Stress Diagnostic Scale (PDS): a validated instrument for the epidemiologic diagnosis of Post-traumatic Stress Disorder;88 Childhood Trauma Questionnaire Short Form (CTQ-SF): a validated instrument that characterizes potential traumatic life experiences in early childhood;89 Sexual and Physical Abuse Questionnaire (SPAQ): a validated questionnaire that characterizes the type and age of occurrence of traumatic life experiences;90 Beck Depression Inventory –II (BDI-II): a validated self-report inventory for measuring the severity of depression;91 Beck Anxiety Inventory (BAI): a validated self-report inventory for measuring the severity of anxiety92 Center for Epidemiologic Studies Depression Scale—Revised (CESD-R): A validated self-report inventory for screening for depression; Minnesota Multiphasic Personality Inventory – 2 Restructured Form (MMPI2-RF): examinees have to complete true-false items that best describe themselves; Structured Clinical Interview—DSM 5 (SCID-5): History of current and past psychiatric diagnosis was assessed with the Structured Clinical Interview for DSM-5, Research Version (SCID-5-RV).

Body composition measurements

Weight (Scale-Tronix 5702 digital balance, Carol Stream, IL, USA) and height (Seca 242 stadiometer, Hanover, MD, USA) were taken at fasted conditions. Body composition, including body fat mass, lean soft tissue mass, and fat percentage was measured by dual-energy X-ray absorptiometry (iDXA scanner with Encore 15.0 software; GE Healthcare, Madison, WI, USA).

Mitochondrial extracellular flux testing

Mitochondrial function was assessed in PBMC that were isolated and measured within three hours of being collected using an extracellular flux assay (Mito Stress Test, Agilent)93. In brief, PBMCs were isolated from 8 ml of blood. Samples were centrifuged at 1,750 × g for 30 min at room temperature. The cloudy layer was transferred to a 15 mL conical tube where 15 mL of PBS was added and inverted five times. The sample was then centrifuged at 300 × g for 15 min at 4 °C. After discarding the supernatant, the pellet was re-suspended by adding 10 mL PBS and inverting five times. The sample was then centrifuged at 300 × g for 10 min at 4 °C. After discarding the supernatant, the cells were re-suspended in 1 mL PBS and centrifuged at 610 × g for 10 min. After removing the supernatant, the pellet was re-suspended in complete RPMI-1640 supplemented with 10% FBS, 10 mM Penicillin/Streptomycin. Cell plates were coated with tissue adhesive solution after diluting the stock solution in 0.1 M sodium bicarbonate pH 8.0 solution. 100 µL of the diluted solution was added to each well and incubated for 20 min at room temperature, washed with deionized water and air dried. On the day of the experiment, fresh assay media was prepared by adding L-glutamine, pyruvate, and glucose to base media (the same constituents as Dulbecco’s Modified Eagle’s Medium (DMEM), but without any sodium bicarbonate, glucose, glutamine, or sodium pyruvate) to make assay media and warmed to 37 °C and adjusted to a pH of 7.4. PBMCs were plated into each well to reach 80–90% confluency. Plates were centrifuged at 200 × g for 2 min, then washed with assay media. 180 µL of assay media was then added to each well and incubated in a non-CO2 37 °C incubator for 60 min prior to performing the extraceullar flux assay according to manufacturer’s directions. Assay Results of the assay were normalized to amount of live cells in the cell preparation93.

ATP 9.4 characterization of muscle

Samples were collected from the vastus lateralis muscle and flash frozen. Frozen sections of muscle samples were prepared with cryostat, 10um thickness. Six slides of frozen sections were taken to Johns Hopkins University, Neurology department, Neuromuscular Laboratory for ATPase pH 9.4 stain, which can identify Type II muscle fibers.

The calcium method for myosin-ATPase demonstration, employing solutions of different pH values, has been used primarily to distinguish muscle fiber types. Muscle fibers may be broadly categorized as type I (slow, red muscle, oxidative) and type II (fast, white muscle, glycolytic). Type II muscle fibers are further subdivided as IIa (glycolytic), IIb (glycolytic/oxidative), and IIc which may be fibers that are changing types due to disease, injury, or development. The preincubation pH relatively inactivates the myosin-ATPase iso-enzyme characteristic of specific fiber types. The remaining active ATP and calcium-dependent enzyme activity releases calcium atoms which are replaced by cobalt, and finally precipitated as a black insoluble cobalt salt of ammonium sulfide. Slides were scanned at National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) Light Image department and images, 2.5X size, were acquired using NDP.view2 software from Hamamatsu.

Analysis of images was performed using Fiji (Image J) in a semi-automated way to facilitate the evaluation of cross-sectioned myofibers in the largest possible area for each image. Preprocessing and thresholding of the images using Fiji available filters and tools was first done to generate a selection of muscle fibers and to classify them according to the intensity of the staining; however, the accuracy of this evaluation was limited, and human intervention was required during the analysis. Following automatic fiber selection, a researcher blinded to sample identity modified the selection of fibers to exclude regions of the image where fibers were longitudinal or where the sample quality was suboptimal. In addition, fiber segmentation was manually corrected to properly delineate a region of interest for each fiber where automatic delimitation was not accurate. Similarly, automatic classification of the fibers was manually reviewed. The minimum Feret diameter was measured for each fiber, and the median of the measurements for type I and type II fibers in each image was calculated and used to compute a Type II/Type I ratio to assess the relative fiber size for the image.

Mitochondrial genetic analysis

Samples were collected from the vastus lateralis muscle and flash frozen. Sample analysis was performed by GeneDx.

Genomic DNA was extracted from the specimens. For the nuclear genome, the DNA is enriched for the complete coding regions and splice junctions of the genes on this panel using a proprietary targeted capture system developed by GeneDx for next-generation sequencing with CNV calling (NGS-CNV). The enriched targets were simultaneously sequenced with paired end reads on an Illumina platform. Bi-directional sequence reads were then assembled and aligned to reference sequences based on NCBI RefSeq transcripts and human genome build GRCh37/UCSC hg19. After gene specific filtering, data were analyzed to identify sequence variants and most deletions and duplications involving coding exons; however, technical limitations and inherent sequence properties effectively reduce this resolution for some genes. Alternative sequencing or copy number detection methods were used to analyze regions with inadequate sequence or copy number data by NGS. The entire mitochondrial genome from the submitted sample was also amplified and sequenced using Next Generation sequencing. DNA sequence was assembled and analyzed in comparison with the revised Cambridge Reference Sequence (rCRS GeneBank number NC_012920) and the reported variants listed in the MITOMAP database ( Next generation sequencing may not detect large-scale mtDNA deletions present at 5% heteroplasmy or lower or mtDNA point variants present at 1.5% heteroplasmy or lower. Reportable variants include pathogenic variants, likely pathogenic variants and variants of uncertain significance. Likely benign and benign variants, if present, were not reported. For the nuclear genome, the technical sensitivity of sequencing is estimated to be >99% at detecting single nucleotide events. It will not reliably detect deletions greater than 20 base pairs, insertions or rearrangements greater than 10 base pairs, or low-level mosaicism. The copy number assessment methods used with this test cannot reliably detect copy number variants of less than 500 base pairs or mosaicism and cannot identify balanced chromosome aberrations. Assessment of exon-level copy number events is dependent on the inherent sequence properties of the targeted regions, including shared homology and exon size. Due to the presence of non-functional pseudogenes, regions of the GYG2, NR2F1, PDSS1, and TSFM, and genes are not fully sequenced by this method. For the COQ7, COX8A, HTRA2, NDUFB11, RNASEH1, SCO2, SDHA, SLC25A26, SLC25A46, TFAM, TMEM126B, and TRMT10C genes, sequencing but not deletion/duplication analysis was performed. In addition, the COA5 gene deletion/duplication analysis may only be able to detect full gene events. For the mitochondrial genome, next generation sequencing can detect mtDNA point variants as low as 1.5% heteroplasmy and large-scale deletions (2 kb or larger) as low as 5% heteroplasmy. However, for large-scale mtDNA deletions observed at less than 15% heteroplasmy, a quantitative value will not be provided. This test is expected to detect greater than 98% of known pathogenic variants and deletions of the mitochondrial genome.

The mitochondrial variants identified by GeneDx were annotated using Annovar94.

Modified effort expenditure for rewards task

The Effort-Expenditure for Rewards Task15 is a multi-trial game in which participants were offered a choice between two task difficulty levels for a reward (Supplementary Fig. S5A). The task began with a one second blank screen, followed by a five second choice period in which the participant was informed of the value of the reward and the probability of winning if the task were completed successfully. After the participant chose the task, another one second blank screen was displayed. Next, the participant either completed 30 button presses in seven seconds with the dominant index finger if they chose the easy task, or 98 button presses in 21 s using the non-dominant little finger if they chose the hard task. Next, the participant received feedback on whether they completed the task successfully. Finally, the participant learned if they have won, based upon the probability of winning and the successful completion of the task. This process repeats in its entirety for 15 min.

Participants were told at the beginning of the game that they would win the dollar amount they received from two of their winning tasks, chosen at random by the computer program (range of total winnings is $2.00–$8.42).

The primary measure of the EEfRT task is Proportion of Hard Task Choices (effort preference). This behavioral measure is the ratio of the number of times the hard task was selected compared to the number of times the easy task was selected. This metric is used to estimate effort preference, the decision to avoid the harder task when decision-making is unsupervised and reward values and probabilities of receiving a reward are standardized.


Participants were instructed to wear ActiGraph GT3X+ accelerometers (ActiGraph Inc., Pensacola, FL, USA) on their waist and non-dominant wrist, continuously, for at least seven consecutive days at home. Raw tri-axial accelerometer data was recorded at 80 samples/second and subsequently filtered and aggregated into one-minute vector magnitude activity counts and steps with Actilife software (v6.13.0, ActiGraph, Pensacola, FL, USA) and customized programs in Matlab (R2021b, Mathworks, Natick, MA, USA). Periods of sixty or more consecutive minutes of zero vector magnitude counts were identified as non-wear, and daily data were considered valid if the device was worn for ≥10 h from 12 midnight to 12 midnight the next day95. Minutes where activity counts of the vertical axis of the waist-worn accelerometer fell between 2020 and 5998 were identified as moderate intensity activity, i.e., three to six times the resting metabolism.

Grip strength

Grip strength was measured with a hand-held dynamometer (Jamar). Each hand was tested individually with the arm, forearm, and wrist in a neutral position. First, each participant was instructed to exert a maximum possible grip force for about five seconds. After completing the first reading, this maximal grip task was repeated. After a minute of rest, the participant was asked to maintain their maximal grip for as long as possible. The time elapsed when the participant’s grip force reduced to 50% of their maximum was recorded by an investigator. All three tests were then repeated with the other hand.

Electrophysiology and repetitive grip testing

To assess physical fatigue, a grip force task was designed which required participants to try to maintain their grip at 50% of maximum voluntary contraction (MVC) in successive blocks of 30 s interspaced with 30 s rest blocks. Each participant sat with their right forearm placed in a rigid-frame dynamometer (biopac, Goleta, CA, USA). The MVC was set as the highest value of three squeezes.

Electromyography (EMG) was collected using surface electrodes (3 M, St. Paul, MN, USA) over the flexor and extensor carpi radialis (FCR, ECR) muscles and the abductor pollicis brevis (APB), using amplifiers and software from Cambridge Electronic Design, Cambridge, UK.

Transcranial magnetic stimulation (TMS) was performed to probe the excitability of the primary motor cortex (m1) via motor evoked potentials (MEPs). A 70 mm figure-8 coil was used to determine the optimal position for evoking MEPs by holding the coil tangential to the scalp and slightly displacing it until the highest MEP amplitude was recorded in the APB muscle. The positions of the participant’s head and TMS coil were tracked with a neuronavigation system (rogue Research, Cambridge, MA, USA) in order to maintain the stimulation position over the hotspot. The TMS Input-Output curve was recorded, collecting MEP responses for intensities 5–100%, in increments of 5%, of stimulation output, in order to calculate the S50. The S50 value was defined as 50% of maximum MEP amplitude. This curve value was also used to estimate resting motor threshold (rMT), which was confirmed using single pulse TMS as the stimulation intensity that would evoke a ~50 µV response in roughly 50% of pulses delivered.

The maximum M-wave was also determined prior and after the repetitive grip task by applying electrical stimulation over the nerve innervating the FCR muscle.

During the repetitive grip task, each participant repeatedly performed 30-s periods of isometric muscle contractions aiming at 50% of MVC. Generally, participants performed 16 blocks, but some quit earlier, and some continued for up to 32 blocks. After each squeeze block, there was a 30 s period of rest. During this rest period, MEPs (elicited every five seconds) were measured.

The development of muscular fatigue during the task, defined as the inability to maintain at least 40% MVC force for more than three seconds, was analyzed by comparing the 1st block (no fatigue), the last block before fatigue onset, and three following blocks after fatigue onset or, if they did not fatigue, the last four blocks. For EMG, we used the Dimitrov index (DI)17,18 to evaluate the shift in EMG frequency power within blocks.

Functional MRI repetitive grip testing

Brain activity was also assessed during the grip strength task with fMRI. This task was designed to identify, at the whole brain level, brain areas involved in fatigue. Participants lay supine in the scanner and performed repeated 30-s blocks of grip strength with a dynamometer (isometric muscle contractions) at 50% of their maximum voluntary contraction (MVC) interspaced with 30-s blocks of rest. MVC of the forearm muscles was determined from the best of three brief squeezes on the dynamometer. Participants used visual feedback from a computer to monitor force generation. Similar to the TMS study, participants performed 16 blocks, but some quit earlier, and some continued for up to 32 blocks. Subjective appraisals of muscular fatigue were measured with a VAS before and after the grip strength task.

We used a 3 T Prisma SIEMENS scanner equipped with a 64-channel head-coil in the Nuclear Magnetic Resonance Center at the National Institutes of Health. We acquired T2*-weighted EPI with TR = 2 s, TE = 30 ms, image matrix = 64 × 64, flip angle: 70˚, FoV: 100, voxel size 3.5 ×3.5 ×3.5 mm.

Cardiopulmonary exercise testing (CPET)

CPET was performed using a cycle ergometer and a computerized metabolic cart (CardiO2 Ultima; MedGraphics Corp, St.Paul, MN, USA). A ramp protocol was used where the work rate would be gradually increased until volitional fatigue was reached by each participant. A time-matching paradigm to ensure all participants exercised for between eight to twelve minutes was employed, as per American College of Sports Medicine recommendations96. The target endpoint was exertional intolerance defined as the participants’ expressed desire to stop cycling despite strong verbal encouragement from the testing staff. Endpoints for stopping the tests were those recommended by the American College of Sports Medicine96.

Breath-by-breath gas exchange and heart rate (by 12-lead ECG) were measured throughout the CPET. Peak oxygen consumption (peak VO2) was calculated as the 20 s average at the end of the CPET. The anaerobic threshold (AT) identified by the metabolic cart was verified by gas exchange analysis methods22. Chronotropic incompetence (CI) was calculated as % predicted heart rate reserve = [peak HR−resting HR]/[APMHR−resting HR] × 100. The slope of the heart rate response during the CPET was examined by linear fit between 15 to 100% of the CPET time. Expected heart rate responses were also generated using linear fits between predicted resting and peak heart rate values, matching sex and age of both HV and PI-ME/CFS participants. Muscle oxygenation measurements were also made at the vastus lateralis during the CPET by near infrared spectroscopy (NIRO-200NX, Hamamatsu Photonics, Japan). For further determination of maximal oxygenation values of near infrared spectroscopy measurements, a thigh occlusion test was performed prior to the CPET. Following seated rest, an occlusion cuff (Hokanson Rapid Cuff Inflator; Hokanson Inc., Belleview, WA, USA) was rapidly inflated to and held at 80 mmHg above systolic blood pressure for eight minutes.

Bioenergetic measurements

The metabolic chamber is a whole-room indirect calorimeter that allows detailed assessment of energy and nutrient balance. Measurements are conducted at stable interior (room) temperature, humidity, and barometric pressure, which are continuously measured. Airtight sampling ports and a four-way air-locking food and specimen passage are designed to allow blood draws and specimen retrievals with minimal disturbance to the chamber environment. Outside air is continuously drawn into the chamber, and the flow rate of air at the outlet is measured using a pneumotachograph with a differential manometer. A fraction of the extracted air is analyzed at one minute intervals for O2 and CO2 concentrations with a thermomagnetic O2 analyzer. This allows for a continuous assessment of oxygen consumption (V˙O2), carbon dioxide elimination (V˙CO2), and calculation of overall energy expenditure (EE). The ratio between V˙CO2 and V˙O2 (the respiratory quotient [RQ]) reflects preference for carbohydrate or fat oxidation.

Starting the day prior to CPET, each participant was placed on a metabolic diet controlled for energy and macronutrient content. Measures of energy expenditure and respiratory exchange were obtained during a 16 h (4 pm to 8 am) stay in a metabolic chamber prior to CPET and for three consecutive days afterwards.

Salivary cortisol measurements

Saliva was collected using a SARSTEDT salivette. Participants did not eat for at least 2 h or drink water 30 min prior to collection. Samples were centrifuged for 2 min at 1000 × g and then frozen at −80 °C. Samples were measured by Salimetrics using enzyme-linked immunoassays for cortisol performed in duplicate. Assay sensitivity is <0.007 ug/dL.

Neuropsychological Measures

The following neuropsychological measures were administered by a trained neuropsychometrist in the following general order: Visual Analogue Scale (Time 1): a set of scales that were administered to capture subjective effort, performance, mental fatigue, and physical fatigue. This test was immediately prior to administration of the neuropsychological testing battery; Wechsler Test of Adult Reading (WTAR) (The Psychological Corporation, 2001): a task that requires the examinee to read words aloud;97 Hopkins Verbal Learning Test-Revised Learn (HVLT-R Learn): a task where examinees have to learn a list of words;98 Grooved Pegboard Test: a task where examinees have to rapidly insert pegs in holes;99 Wechsler Adult Intelligence Scale—Fourth Edition (WAIS-IV) subtests including Coding, Symbol Search and Digit Span: a task where examinees memorize strings of numbers or complete speeded tasks involving unfamiliar symbols;100 B Test: examinees have to rapidly discriminate between letter stimuli;68 Hopkins Verbal Learning Test-Revised Delayed Recall (HVLT-R DR): a task where examinees have to recall the list of words previously learned;98 Brief Visual Memory Test-Revised (BVMT-R): a task where examinees have to learn a list of designs;101 Visual Analogue Scale (Time 2): scales to capture subjective effort, performance, mental fatigue, and physical fatigue were collected at this time, approximately one hour after testing started; Wisconsin Card Sort Test (WCST-64): a task where examinees have to utilize corrective feedback to learn how to sort cards;102 Controlled Oral Word Association Test (COWAT; FAS and Animals): a task where examinees have to generate words to various cues;103 Paced Auditory Serial Addition Test (PASAT): a task where examinees have to rapidly perform serial addition;104 Brief Visual Memory Test-Revised (BVMT-R) Delayed Recall: a task where examinees have to recall the prior list of designs;101 Word Memory Test: a task where examinees have to view words;70 Test of Variables of Attention: a task where examinees rapidly respond using a button press to certain target stimuli and not distractor stimuli;105 Visual Analogue Scale (Time 3): scales to capture subjective effort, performance, mental fatigue, and physical fatigue were collected at this time, approximately two hours after testing started; Word Memory Test Delayed Recall: a task where examinees have to recall the words viewed earlier;70 Dot Counting Test: examinees have to count dots as rapidly as possible;69 MMPI-−2 RF: As described above; EEfRT test: As described above; Visual Analogue Scale (Time 4): scales to capture subjective effort, performance, mental fatigue, and physical fatigue were collected at this time, approximately three hours after testing started.

PBMC RNA sequencing

RNA was extracted from PBMC’s of participants using miRNeasy Micro Kit (QIAGEN). RNA was quantified using Qubit 3.0 fluorometer (Thermo Fisher Scientific) and its integrity confirmed using an Agilent 2100 Bioanalyzer (Agilent Technologies). Dual index libraries were constructed with at least one unique index per each patient library using the TruSeq Stranded Total RNA HT Kit (Illumina) to enable subsequent pooling of equal quantities of individual libraries. The integrity and ratio of pooled libraries was validated using Miseq system (Illumina); then, paired-end sequencing (2 × 75 base pairs (bp)) was performed on an HiSeq 3000 sequencer (Illumina) with the Illumina HiSeq 3000 SBS Hit.


Peripheral blood serum was isolated using SST tubes and cryopreserved with corresponding cerebrospinal fluid samples. Proteomic analysis used the SOMAscan 1.3k Assay (SomaLogic). This is an aptamer-based assay able to detect 1305 protein analytes, optimized for analysis of human serum106,107. Briefly, aptamers are short single-stranded DNA sequences modified to confer specific binding to target proteins and can be highly multiplexed for discovery of biomarker signatures. The proteins quantified include cytokines, hormones, growth factors, receptors, kinases, proteases, protease inhibitors, and structural proteins. The assay was performed according to manufacturer specifications for each of the serum and cerebrospinal fluid sample types. Briefly, serum samples were assayed at three dilutions (40%, 1%, and 0.005%) with each sample dilution added to a corresponding subset of the 1305 SOMAmer detection reagents binned according to manufacturer’s predicted target abundance in serum. Cerebrospinal fluid was run at a single 15% concentration dilution with protease inhibitors and polyanionic competitor reagent added. Data was then inspected using a web tool and subjected to quality control procedures as previously described108,109.

Flow cytometry of blood and cerebrospinal fluid

EDTA-treated whole blood and cerebrospinal fluid cells were used for flow cytometric analysis. Cerebrospinal fluid samples were obtained by lumbar puncture and the cerebrospinal cells were collected within an hour by centrifugation. Whole blood or cerebrospinal fluid cells were stained with CD3, CD4, CD8, CD14, CD16, CD19, CD25, CD27, CD45, CD45RA, CD56, CD152, CXCR5, IgD, HLA-DR (all from BD Biosciences), PD-1 (BioLegend) and FoxP3 (eBiosciences), as previously described110. All flow cytometric analysis was performed using an LSR II (BD Biosciences). The data were analyzed using FlowJo 10.6 software (FlowJo LLC).

An additional panel of T-cell markers including TIGIT, CD244, and CD226 were performed on a subset of participants. Some of these samples were collected from participants during a second lumbar puncture during a return visit months after the initial sample collected for the analysis above.

All antibodies, clones, catalog numbers, manufacturers, and dilutions used in this study are as follows: anti-human CD3 (clone: UCHT1, Cat. 558117, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)) anti-human CD3 (clone: UCHT1, Cat. 555335, BD Biosciences, 1:100 dilution); anti-human CD4 (clone: RPA-T4, Cat. 557922, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD4 (clone: RPA-T4, Cat. 555346, BD Biosciences, 1:100 dilution); anti-human CD8 (clone: SK1, Cat. 341051, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD8 (clone: RPA-T8, Cat. 557746, BD Biosciences, 1:100 dilution) anti-human CD14 (clone: M5E2, Cat. 555399, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD16 (clone: 3G8, Cat. 338426, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD19 (clone: HIB19, Cat. 555413, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD25 (clone: M-A251, Cat. 557741, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD27 (clone: M-T271, Cat. 560222, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD45 (clone: HI30, Cat. 560777, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD45RA (clone: HI100, Cat. 555488, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)) anti-human CD56 (clone: B159, Cat. 557747, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD226 (clone: 11A8, Cat. 338318, BioLegend, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD244 (clone: C1.7, Cat. 329522, BioLegend, 1:30 dilution (1:100 dilution for CSF cells)); anti-human PD-1 (clone: EH12.2H7, Cat. 329906, BioLegend, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CXCR5 (clone: RF8B2, Cat. 558113, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human HLA-DR (clone: G46-6, Cat. 555811, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human IgD (clone: IA6-2, Cat. 561302, BD Biosciences, 1:30 dilution (1:100 dilution for CSF cells)); anti-human TIGIT (clone: A15153G, Cat. 372714, BioLegend, 1:30 dilution (1:100 dilution for CSF cells)); anti-human CD127 (clone: HIL-7R-M21, Cat. 560551, BD Biosciences, 1:30 dilution); anti-FOXP3 (clone: 236A/E7, Cat. 17-4777-42, Thermo Fisher Scientific, 1:50 dilution); anti-human CD152 (clone: BNI3, Cat. 555853, BD Biosciences, 1:50 dilution); anti-Stat5 (pY694) (clone: 47/Stat5(pY694), Cat. 612567, BD Biosciences, 1:50 dilution).

Dilutions were determined according to our own staining protocols. All the antibodies used for flow cytometry in this study are commercially available, and their specificities have been well validated by the manufacturers and other users.

NK cell function measurement

NK cell function was measured in blood within 24 h of collection by a 51Chromium release assay111 by the clinical laboratory at Cincinnati Children’s Hospital.

Growth differentiation factor-15 measurement

The GDF15 ELISA was performed using R&D Systems, Minneapolis, MN. Catalog No. DGD150 kit as per manufacturer instructions. The intra-assay variation was 1.8–2.8% and the interassay variation was 4.7–5.6%.

Luciferase immunoprecipitation assay

The luciferase immunoprecipitation systems (LIPS) assay provides an informative tool to explore serology as evidence of autoimmunity and infectious disease exposure due to its ability to efficiently detect antigenicity antibodies against both conformational and linear epitopes. Here, LIPS was used to assess for the presence of autoantibodies against a small diverse panel of known and potential antigens in the PI-ME/CFS and healthy volunteer participants. The previously described testing format was used to examine antibodies against the various target molecules included known autoimmune-associated proteins (Ro52, Jo-1, TPO, gastric ATPase, tyrosine hydroxylase), neurological autoantigens (GAD65, LGI1, NMDAR1, MUSK), cytokines (Interferon alpha1, Interleukin-6, CXCR4, TGFB1), muscle proteins (MPZ, PMP22), as well as against several infectious agents (HDV, HEV, Zika virus). Light units were measured in a Berthold LB 960 Centro luminometer (Berthold Technologies, Germany) using coelenterazine or furimazine substrate mix (Promega, Madison, WI). In some cases, control sera samples from known positive control autoimmune patients were used as positive controls.

Muscle RNA sequencing

RNA was extracted from muscle samples using the TRIzol protocol112. The integrity of the RNA was verified using a standard quality metric denominated RNA integrity number (RIN) value using the Agilent 4200 TapeStation system and the concentration was measured using the DeNovix DS-11 spectrophotometer. Five hundred nanograms of RNA were used to prepare the RNA sequencing libraries using the NEBNext Ultra II Directional RNA Library Prep Kit and sequenced using the Illumina NovaSeq 6000 sequencer. Reads were demultiplexed using bcl2fastq v. 2.20.0.

Metabolomics of cerebrospinal fluid

Metabolomics on cerebrospinal fluid samples was performed using Metabolon’s Ultrahigh Performance Liquid Chromatography-Tandem Mass Spectroscopy (UPLC-MS/MS). Samples were prepared using the automated MicroLab STAR® system from Hamilton Company. Several recovery standards were added prior to the first step in the extraction process for QC purposes. To remove protein, dissociate small molecules were bound to protein or trapped in the precipitated protein matrix, and to recover chemically diverse metabolites, proteins were precipitated with methanol under vigorous shaking for two minutes (Glen Mills GenoGrinder 2000) followed by centrifugation. The resulting extract was divided into five fractions: two for analysis by two separate reverse phase (RP)/UPLC-MS/MS methods with positive ion mode electrospray ionization (ESI), one for analysis by RP/UPLC-MS/MS with negative ion mode ESI, one for analysis by HILIC/UPLC-MS/MS with negative ion mode ESI, and one sample was reserved for backup. Samples were placed briefly on a TurboVap® (Zymark) to remove the organic solvent. The sample extracts were stored overnight under nitrogen before preparation for analysis.

All methods utilized a Waters ACQUITY ultra-performance liquid chromatography (UPLC) and a Thermo Scientific Q-Exactive high resolution/accurate mass spectrometer interfaced with a heated electrospray ionization (HESI-II) source and Orbitrap mass analyzer operated at 35,000 mass resolution. The sample extract was dried then reconstituted in solvents compatible to each of the four methods. Each reconstitution solvent contained a series of standards at fixed concentrations to ensure injection and chromatographic consistency. One aliquot was analyzed using acidic positive ion conditions, chromatographically optimized for more hydrophilic compounds. In this method, the extract was gradient eluted from a C18 column (Waters UPLC BEH C18-2.1×100 mm, 1.7 µm) using water and methanol, containing 0.05% perfluoropentanoic acid (PFPA) and 0.1% formic acid (FA). Another aliquot was also analyzed using acidic positive ion conditions; however, it was chromatographically optimized for more hydrophobic compounds. In this method, the extract was gradient eluted from the same afore-mentioned C18 column using methanol, acetonitrile, water, 0.05% PFPA and 0.01% FA and was operated at an overall higher organic content. Another aliquot was analyzed using basic negative ion optimized conditions using a separate dedicated C18 column. The basic extracts were gradient eluted from the column using methanol and water, however, with 6.5 mM Ammonium Bicarbonate at pH 8. The fourth aliquot was analyzed via negative ionization following elution from a HILIC column (Waters UPLC BEH Amide 2.1 × 150 mm, 1.7 µm) using a gradient consisting of water and acetonitrile with 10 mM Ammonium Formate, pH 10.8. The MS analysis alternated between MS and data-dependent MSn scans using dynamic exclusion. The scan range varied slightly between methods but covered 70–1000 m/z.

Lipidomics of plasma

The extraction of lipids from plasma was accomplished following manufacturer’s protocol, with slight modifications113. To 25 µl of plasma sample, 975 µl water, 2 ml methanol, 900 µl dichloromethane (DCM) and 25 µl internal standard was added. The internal standard was prepared from a kit (Avanti Lipids) developed for the Lipidyzer platform (SCIEX)114,115. The samples were then vortexed and allowed to sit at room temperature for 30 min. 1 ml water and 900 µl DCM was then added and samples were vortexed and centrifuged for 10 min at 2000 × g. The lower organic layer was removed and placed in a separate collection tube. To the remaining aqueous layer, 1.8 ml DCM was added and the samples were vortexed and centrifuged. The lower organic layer was again removed, added to the previous organic layer, and stream dried under nitrogen. The lipids were reconstituted in 250 µl running solvent (50/50 methanol/DCM 10 mM ammonium acetate) and placed in an autosampler vial for analysis. Quality control (QC) samples and 3 QCs spiked with unlabeled internal standards from the QC Spike Standards kit (SCIEX) were run at the beginning, middle, and end of the samples. The quantitation of the lipid species was done with Lipidomics Workflow Manager (SCIEX), which uses validated DMS-MRM acquisition methods, known internal standard concentrations, and integrated peak areas from the samples to determine the quantity of each lipid species. Lipid species were included in the data analyses if above the limit of quantification in >50% of the participants.

Microbiome stool sampling procedure

Whole fecal samples were collected in a sterile bowl at the NIH Clinical Center, aliquoted, and immediately frozen at −80 °C. Aliquots were sent to the sequencing laboratory for metagenomic and metabolomic analyses.

Stool microbiome metagenomics

For metagenomic shotgun sequencing, paired-end libraries were prepared from metagenomic DNA using the Illumina Nextera Flex kit, and then sequenced on the Illumina NovaSeq platform with a 2 ×150 bp length configuration. Bioinformatic analysis of the shotgun metagenomic samples was done using the JAMS_BW package, version 1.7.2, which is available on GitHub at All code for every step in the bioinformatic analysis from reads to plots is publicly available in this package.

Fastqs from each sample were processed with JAMSalpha116, which, briefly, entails quality trimming and adapter clipping of raw reads with Trimmomatic 0.36117. The reads were then aligned against the human genome with Bowtie2 v2.3.2118 and unaligned (non-host) reads were then assembled using MEGAHIT v1.2.9119. For all samples evaluated, the mean sequencing depth (already discounting host reads) was 6.04 Gbp ± 0.79 Gbp, yielding a mean assembly rate of 94.06% ± 1.93%. Assembly contigs smaller than 500 bp were discarded and taxonomic classification of remaining contigs was obtained through Kraken2120, with a custom 96-Gb Kraken2 database built using draft and complete genomes of all bacteria, archaea, fungi, viruses, and protozoa available in the NCBI GenBank in January 2022, in addition to human and mouse genomes, built using the JAMSbuildk2db tool of the JAMS package. This JAMS-compatible kraken2 database is available for download through the URL Functional annotation of contigs was obtained using Prokka v1.14.6121. The sequencing depth of each contig was obtained by aligning reads used for assembly back to the contigs. Taxonomy was expressed as the last known taxon (LKT), being the taxonomically lowest unambiguous classification determined for each query sequence, using Kraken’s confidence scoring threshold of 5e−06 (using the –confidence parameter). The relative abundance, expressed in parts-per-million (PPM) for each LKT within each sample was obtained by dividing the number of bp covering all contigs and unassembled reads pertaining to that LKT by the total number of non-host base pairs sequenced for that sample.

Stool nuclear magnetic resonance spectroscopy

Fecal extractions were based on published recommendations122. Samples were thawed on ice and buffer (2H-phosphate buffered saline + 0.01% sodium azide; pH 7.5) was added at 5 ml/1 g of material. Samples were sonicated 2× 30 s (50% power) on ice, vortexed for 1 min, and centrifuged at 8000 rcf for 20 min at 4 °C. Supernatant was added to a 3 kDa filter concentrator to further remove particulates and proteins. The flow through was collected after about an hour which amounted to approximately 600 ul. Trimethylsilypropanesulfonate was added at 200 uM concentration as a chemical shift standard and concentration reference. NMR spectra were acquired on an 800 MHz Agilent DD2 console equipped with a cryogenically cooled probe. A 1-dimensional NOESY sequence with 4 s acquisition time, 1 s recycle delay, and 100 ms mixing time was used. Chenomx software (Alberta, Canada) was used for processing and spectral analysis. Metaboanalyst was used for statistical comparisons.

Clinical follow-up

Each participant was recontacted by a physician on the investigative team between 11/2021 and 7/2022 to inquire about changes in their clinical condition.

Statistical analysis

This research protocol utilized an exploratory design coupled with a broad and deep phenotyping approach. The assembled data represents a multidimensional description of post-infectious ME/CFS cases collected with the intent to generate new hypotheses.

Using strict case criteria and adjudication process we minimized medical misattribution and studied a homogeneous population. Since the deep phenotyping resulted in a large number of measurements, our statistical approach used a modified concept of consilience, the principle that evidence from independent, unrelated sources can converge on strong conclusions. Here, measures were purposely selected to interrogate different facets of immunologic, bioenergetic, and homeostatic physiology and determine if similar results would emerge from the different techniques employed. It also used repetition of measures to aid in the interpretation of data variance and reliability.

This exploratory approach embraces the explanatory power of negative findings. Small sample sizes can be adequate for applying logic to demonstrate that a phenomenon is not related to a particular physiological process123. These data can be used to estimate the futility of continuing to look for a physiological difference that likely does not exist.

The sample size for the cohort was selected for convenience; no statistical method was used to predetermine sample size. With the sample size of this cohort, we anticipated being only able to detect large effects. Post-hoc calculations of the effect size for a phenotyping sample of 21 and 17 participants to achieve a power of 80% is 0.94. Similar calculations for the exercise sample of nine and six participants is 1.27.

Inherent in the data are numerous estimates of effect size and correlation, even for variables that do not reach statistical significance. While the precision of these effect sizes may be poor, they are reported to provide a sense of the strength of relationship and would be useful for determining statistical power for future research.

Statistically, each measure in the protocol was analyzed independent from all others. The statistical testing approach for each measure is listed in Supplementary Data S1. Where appropriate, statistical correction for multiple comparisons was performed within the measurement analyzed. Given the exploratory nature of the study, no statistical correction for multiple comparisons was performed across the different measures or for correlational analyses.

For analysis of biological samples, two or more technical replicates were used. Clinical evaluations were performed without replication. There was no randomization in this study. There were no interventions in this study. All researchers performing the experimental analysis of biological samples were blinded to diagnostic group. Evaluating clinicians were not blinded to diagnostic group. Details about data exclusions for each analysis performed can be reviewed in Supplementary Data S23.

Statistical analysis of heart rate variability

A text file indicating timing in milliseconds between normal beats was exported. Rstudio 1.1.463(19) was used to remove non-normal beats and re-order data start-time to 8am. The subsequent text file was imported into Kubios HRV Version 1.0 with an artifact correction threshold of 0.3 s. HRV analysis was performed as recommended by the 1996 ESC and NASPE HRV task force and European Heart Rhythm Association. No de-trending was performed. Time-domain metrics included NN interval, pNN50, RMSSD, SDNN, SDNNi and SDANN measured over day(12-h), night(12-h), and 24-h periods. Frequency-domain metrics included VLF(0–0.04 Hz), LF(0.04–0.15 Hz), and HF(0.15–0.4 Hz) measured over day (12-h), night (12-h), 24-h, and 5-min periods. Frequency analysis was conducted using a Lomb-Scargle periodogram with a smoothing window width of 0.02 Hz. Non-linear metrics included SD1, SD2, SD1/SD2 and were measured in one hour segments. Representative traces (HR, LF, HF) were plotted using data sampled at one minute intervals.

Subsequent analyses between study participants and controls were performed using GraphPad Prism version 9.0.0 for Mac, (GraphPad Software, San Diego, California USA) and SAS 9.4 (SAS Institute, Cary, NC). Mann–Whitney tests and Chi-square tests were used to evaluate the difference between HV and PI-ME/CFS participants. Scatterplot graphs display a bar signifying the median of the distribution. Non-linear measures (SD1, SD2 and SD1/SD2) and heart rate were examined using a mixed-effects model to account for 24-h repeated measurements with adjustments for hour of day; results of these latter measures were reported and displayed as least-square mean (lsmean) +/− standard error (stand err.) for PI-ME/CFS and HVs, respectively. A p-value < 0.05 is considered statistically significant.

Statistical analysis of effort expenditure for rewards task

Following the analytic strategy described by Treadway15, generalized estimating equations (GEE) were used to model the effects of trial-by-trial and participant variables on hard task choice. A binary distribution and logit link function were used to model the probability of choosing the hard task versus the easy task. All models included reward probability, reward magnitude, expected value (the product of reward probability and reward magnitude), and trial number, in addition to binary categorical variables indicating participant group and sex. Emulating Treadway et al., the two-way interactions between PI-ME/CFS diagnosis and reward probability, PI-ME/CFS diagnosis and reward magnitude, and PI-ME/CFS diagnosis and expected value were also tested, as was the three-way interaction among PI-ME/CFS diagnosis, reward magnitude, and reward probability. One new two-way interaction, the interaction of PI-ME/CFS diagnosis and trial number, was tested as well in order to determine whether rate of fatigue differed by diagnostic group.

Departing from the procedures described by Treadway15, GEE was also used to model the effects of trial-by-trial and participant variables on task completion. A binary distribution and logit link function were again used given the binary nature of the task completion variable (i.e., success or failure). The model included reward probability, reward magnitude, expected value, trial number, participant diagnosis, and participant sex, as well as a new term indexing the difficulty of the task chosen (easy or hard). The three-way interaction of participant diagnosis, trial number, and task difficulty was evaluated in order to determine whether participants’ abilities to complete the easy and hard tasks differed between diagnostic group, and in turn whether fatigue demonstrated differential effects on probability of completion based on diagnosis and task difficulty. Additionally, GEE was used to model the effects of these independent variables and interactions on button press rate, to provide an alternative quantification of task performance. This time, the default distribution and link function were used. The model’s independent variables and interaction terms were the same as in the above task completion model.

All three sets of GEE models were performed using an exchangeable working correlation structure. Unstructured models were tested as well, but failed to converge. All GEE models were implemented in SAS 9.4.

Statistical analysis of repetitive grip testing

Grip force data was filtered with a lowpass 8 Hz butterworth of order 2 and normalized to the maximum voluntary contraction. To represent the evolution of fatigue, data collected during the 1st block was compared to the last successful block and the three following failed blocks.

Statistical analysis of functional MRI repetitive grip testing

AFNI124,125 was used to process anatomical and EPI timeseries with the tool that included removing the first two volumes, despiking the timeseries, registering the EPI data to the anatomical scan, adjusting for slice timing offsets, motion correcting these timeseries referring to least outlier volume with rigid body transformations using cubic polynomial interpolation, and spatially blurring the timeseries with a 6 mm FWHM Gaussian kernel. @ANATICOR was used to remove white matter signal from the timeseries to reduce scanner-related artifacts126 and also to remove CSF signal. The motion limit was set to 3 mm and removed volumes with more than 10% of outliers as defined with 3dToutcount tool in AFNI. The demeaned and derivatives of head motion parameters were regressed out. The anatomical and the EPI timeseries were transformed to the MNI template with nonlinear transformations.

Regression analysis with AFNI’s 3dDeconvolve tool was used, with a box car model for each 30 s block of grip force. The first 16 blocks of the task were divided into quartiles of four blocks each. Blocks were pooled together in this fashion to better estimate fatigue-related brain activation. We used a different approach than in the TMS session to represent the evolution of fatigue because we needed to pool blocks together to better estimate fatigue-related brain activation. For group analysis, linear mixed-effects (3dLME tool in AFNI) were used with two groups and four blocks and participants as a random factor. A voxel threshold of p ≤ 0.01 and a cluster threshold of p ≤ 0.05, k ≥ 65 (multiple comparisons correction) was used. We also used a t-test with the 3dMEMA tool in AFNI to assess commonly activated areas.

Statistical analysis of RNA sequencing data

RNA sequence data was obtained from the libraries using the bcl2fastq v.2.17; Illumina software. RNA sequences were subjected to quality control (FastQC, a quality control tool for high throughput sequence data and available online at:, and trimmomatic ( to remove adapters, followed by alignment to the human genome (GRCh38) using STAR127. Gene expression levels were quantified using featuresCounts128. DE analysis was performed on PBMC and muscle RNAseq data using limma129, and genes with nominal p-value ≤ 0.05 were considered DE. BMI was used as a covariate in sex separated and combined cohorts. Pathway enrichment analysis was performed using the R package clusterProfiler130, which uses the fisher test to determine statistical significance. Additionally, prior known protein-protein interactions for the DE genes were extracted from the STRING resource ( Protein-protein interactions with a confidence score of >0.7 were reported. The fold change information of the genes node in the PPI network are highlighted as red (for upregulated genes) and blue (for downregulated gene) color nodes in Cytoscape132.

Statistical analysis of aptamer-based proteomics, metabolomics, and lipidomics data

For these datasets, the data analysis is described below. Partial least squares discrimination analysis (PLSDA) (mixOmics; was used to analyze the data and calculate the variable importance in prediction (VIP) scores for all the metabolites measured. Briefly, variables measurements with missing values in >50% of the samples were removed. For the remaining variables, the missing values were imputed with half of the minimum measurements for the respective metabolite. All the variables with VIP score >1 were subset for either pathway level inference or were assessed individually for known functions, and their expression level between groups are shown in the selected heatmaps. The features with VIP > 1 were considered to be important because the squared sum of all VIP values is equal to the number feature and thus, the average VIP would be equal to 1133.

Statistical analysis of transposable element expression data

RNA sequencing data from the PBMCs was aligned to the human genome (GRCh38) using STAR (PMID: 23104886) to allow for multi-aligned sequences using the following criteria: winAnchorMultimapNmax 100 and outFilterMultimapNmax 100. The transposable elements gene transfer file (GTF) was generated from the UCSC genome database. Gene expression quantification was performed with the UCSC RepeatMasker GTF file using the TEtranscripts tool (PMID: 26206304). The gene expression count matrix was used and DEseq2 R package was used to perform DE analysis. The summary DE table is reported. In parallel, the RNAsequencing data from the PBMCs were aligned to the human genome (GRCh38) using bowtie2, and HERV elements were quantified with the author-provided GTF file using Telescope tool (PMID: 31568525). No transposable elements were quantified from the dataset using this tool.

Statistical analysis of microbiome shotgun metagenomics data

Comparisons between samples were interrogated from SummarizedExperiment objects134 constructed using the JAMSbeta pipeline of the JAMS_BW package. Ordination plots were made with the t-distributed stochastic neighbor embedding (t-SNE) algorithm using the uwot package in R ( and the ggplot2 library. Permanova values were obtained using the adonis function of the vegan package, with 10,000 permutations and pairwise distances calculated using Bray–Curtis distance. Heatmaps were drawn using the ComplexHeatmap package135. For each feature, p-values were calculated using the Mann–Whitney–Wilcoxon U-test on PPM relative abundances for that feature in samples within each group.

Reproducibility of data analyses

In an effort to promote open science and reproducibility, the final source data files were reanalyzed where appropriate and individual source data files and source code used to analyze and visualize those data have been provided where possible and are available at

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


Source link

Leave a Comment