
487 Emerging Neurotherapeutic Technologies

adaptively perform within progressively more challenging distractor

environments. Neuroplasticity selective to distractor processing was

evidenced in this study at both the microscale, i.e., at the resolution of

single neuron spiking in sensory cortex, as well as macroscale, i.e., electroencephalography (EEG)-based event-related potential recordings.

Video games have also shown promise in the treatment of visual

deficits such as amblyopia, and in cognitive remediation in neuropsychiatric disorders such as schizophrenia. However, while the evidence

base has been encouraging in small-sample randomized controlled

trials (RCTs), larger RCTs are needed to demonstrate definitive therapeutic benefit. This is especially necessary as the commercial brain

training industry continues to make unsubstantiated claims of the benefits of neurogaming; such claims have been formally dismissed by the

scientific community. Like any other pharmacologic or device-based

therapy, neurogames need to be systematically validated in multiphase

RCTs establishing neural target engagement and documenting cognitive and behavioral outcomes in specific disorder populations.

Generalizability of training benefits from task-specific cognitive

outcomes to more broad-based functional improvements remains

the holy grail of neurogaming. Next-generation neurogames will aim

to integrate physiologic measures such as heart rate variability (an

index of physical exertion), galvanic skin responses, and respiration

rate (indices of stress response), and even EEG-based neural measures. The objectives of such multimodal biosensor integration are to

enhance the “closed-loop mechanics” that drive game adaptation and

hence improve therapeutic outcomes and perhaps result in greater


FIGURE 487-2 Augmented reality (AR) for phantom limb pain. A. A patient is shown a live AR video. B. EMG electrodes placed over the stump record muscle activation

during training. C. The patient matches target postures during rehabilitation. D. Patient playing a game in which a car is controlled by “phantom movements.”

(M Ortiz-Catalan et al: Phantom motor execution facilitated by machine learning and augmented reality as treatment for phantom limb pain: A single group, clinical trial in

patients with chronic intractable phantom limb pain. Lancet 388:2885, 2016.)



[Figure 487-3 schematic: 3T MRI acquisition feeds image reconstruction and real-time fMRI analysis, which drives a feedback display (e.g., a thermometer); the task of the subject is to lower the temperature display.]
FIGURE 487-3 Neurofeedback using functional MRI. (From T Fovet et al: Translating neurocognitive models of auditory verbal hallucinations into therapy. Front Psychiatry 7:103, 2016.)

generalizability. These complex, yet

potentially more effective, neurogames

of the future will need rigorous clinical

study for demonstration of validity and

efficacy.

NEUROIMAGING

■ NEUROIMAGING OF CONNECTIVITY

Multimodal neuroimaging methods

including functional magnetic resonance

imaging (fMRI), EEG, and magnetoencephalography (MEG) are now being

investigated as tools to study functional

connectivity between brain regions, i.e.,

extent of correlated activity between

brain regions of interest. Snapshots of

functional connectivity can be analyzed while an individual is engaged in

specific cognitive tasks or during rest.

Resting-state functional connectivity

(rsFC) is especially attractive as a robust,

task-independent measure of brain function that can be evaluated in diverse

neurologic and neuropsychiatric disorders. In fact, methodologic research has

shown that rs-fMRI can provide more

reliable brain signals of energy consumption than specific task-based fMRI approaches.
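In practice, rsFC is often summarized as the pairwise correlation of regional activity time series. The following is a minimal illustrative sketch in Python (NumPy); the data are simulated and the array names are hypothetical, standing in for preprocessed region-of-interest time series from any of the modalities above.

```python
import numpy as np

# Hypothetical preprocessed resting-state data:
# rows = time points (e.g., fMRI volumes), columns = regions of interest (ROIs).
rng = np.random.default_rng(0)
n_timepoints, n_rois = 240, 4
roi_timeseries = rng.standard_normal((n_timepoints, n_rois))

# Resting-state functional connectivity as the ROI-by-ROI Pearson
# correlation matrix of the activity time series.
rsfc_matrix = np.corrcoef(roi_timeseries, rowvar=False)

print(rsfc_matrix.shape)  # (4, 4); entry [i, j] is the correlation between ROI i and ROI j
```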

In recent years, there has been a surge of research to identify robust

rsFC-based biomarkers for specific neurologic and neuropsychiatric

disorders and thereby inform diagnoses, and even predict specific

treatment outcomes. For many such disorders, the network-level

neurobiologic substrates that correspond to the clinical symptoms

are not known. Furthermore, many are not unitary diseases, but

rather heterogeneous syndromes composed of varied co-occurring

symptoms. Hence, the research quest for robust network biomarkers

for complex neuropsychologic disorders is challenging and still in its

infancy; yet some studies have made significant headway in this domain.

For example, in a large multisite cohort of ~1000 depressed patients,

Drysdale et al. (2017) showed that rsFC measures can subdivide

patients into four neurophysiologic “biotypes” with distinct patterns

of dysfunctional connectivity in limbic and frontostriatal networks.

These biotypes were associated with different clinical-symptom profiles (combinations of anhedonia, anxiety, insomnia, anergia, etc.) and

had high (>80%) diagnostic sensitivity and specificity. Moreover, these

biotypes could also predict responsiveness to transcranial magnetic

stimulation (TMS) therapy. Another recent study demonstrated utility

of rsFC measures to predict diagnosis of mild traumatic brain injury

(mTBI), which is clinically challenging by conventional means.

Apart from fMRI-based measures of rsFC, EEG- and MEG-based

rsFC measures are also being actively investigated, as these provide a

relatively lower-cost alternative to fMRI. While EEG is of lowest cost,

it compromises on spatial resolution. The major strength of MEG is its

ability to provide more accurate source-space estimates of functional

oscillatory coupling than EEG as well as provide measures at various

physiologically relevant frequencies (up to 50 Hz shown to be clinically useful). In this regard, EEG/MEG are complementary to fMRI,

which can only be used to study slow activity fluctuations (i.e., <0.1 Hz);

the potential for EEG/MEG modalities to provide valid diagnostic

biomarkers is currently underexploited and requires further study.

■ CLOSED-LOOP NEUROIMAGING

Neuroscientific studies to date are predominantly designed as “open-loop experiments,” interpreting the neurobiologic substrates of human

behavior via correlation with simultaneously occurring neural activity.

In recent years, advances in real-time signal processing have paved

the way for “closed-loop neuroimaging,” wherein humans can directly

manipulate experiment parameters in real-time based on specific

brain signals (Fig. 487-3). Closed-loop imaging methods can not only

advance our understanding of dynamic brain function but also have

therapeutic potential. Humans can learn to modulate their neural

dynamics in specific ways when they are able to perceive (i.e., see/hear)

their brain signals in real-time using closed-loop neuroimaging-based

neurofeedback. Early studies showed that such neurofeedback learning

and resulting neuromodulation could be applied as therapy for patients

suffering from chronic pain, motor rehabilitation in Parkinson’s and

stroke patients, modulation of aberrant oscillatory activity in epilepsy,

and improvement of cognitive abilities such as sustained attention

in healthy individuals and patients with attention-deficit hyperactivity disorder (ADHD). It has also shown potential for deciphering

state-of-consciousness in comatose patients, wherein a proportion of

vegetative/minimally conscious patients can communicate awareness

via neuroimaging-based mental imagery.
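The closed-loop principle itself is simple to state in code: on each cycle, a neural measure is computed from the latest data window and mapped onto the stimulus the participant perceives. The toy sketch below (Python/NumPy) simulates that loop with random data; the sampling rate, frequency band, and scaling are illustrative choices, not parameters from any published protocol.

```python
import numpy as np

FS = 250                   # sampling rate (Hz) of a hypothetical EEG stream
ALPHA_BAND = (8.0, 12.0)   # frequency band fed back to the participant

def band_power(window, fs, band):
    """Power of `window` within `band`, via a simple FFT periodogram."""
    freqs = np.fft.rfftfreq(window.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(window)) ** 2
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

def feedback_level(power, lo=0.0, hi=1000.0):
    """Map the neural measure onto a 0-100 'thermometer' display value."""
    return float(np.clip(100 * (power - lo) / (hi - lo), 0, 100))

# Simulated closed loop: each 1-s window updates the display the subject sees.
rng = np.random.default_rng(1)
for step in range(5):
    eeg_window = rng.standard_normal(FS)   # stand-in for a real acquisition
    level = feedback_level(band_power(eeg_window, FS, ALPHA_BAND))
    print(f"window {step}: feedback level = {level:.1f}")
```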

Closed-loop neuroimaging therapeutic studies have utilized real-time fMRI, EEG, and MEG methods. It is common for neural signals

to be extracted from specific target brain regions for neuromodulation.

However, given that distributed neural networks underlie behavioral

deficits, new studies have also explored neurofeedback on combinatorial brain signals from multiple brain regions extracted using multivariate pattern analysis (MVPA). While early studies indicate therapeutic

potential, clinical RCTs of closed-loop neuroimaging neurofeedback

have shown mixed results. This may largely be because of the individual heterogeneity in neuropsychiatric disorders such that there is no

one-size-fits-all therapy. Closed-loop neuroimaging-based therapies

need to be better personalized to the pre-intervention cognitive and

neurophysiologic states of the individual, and a better understanding

needs to be developed regarding learning principles and mechanisms

of self-regulation underlying neurofeedback. Clinical practitioners

applying these methods also need better education on the hardware/

software capabilities of these brain–computer interfaces to maximize

patient outcomes.

NONINVASIVE BRAIN STIMULATION

Noninvasive brain stimulation (NIBS) is widely recognized as having

great potential to modulate brain networks in a range of neurologic and

psychiatric diseases; it is currently approved by the U.S. Food and Drug



Administration (FDA) as a treatment for depression. Importantly, there

is a very large body of basic research indicating that neuromodulation

of the nervous system with electrical stimulation can have both short-term and long-term effects. While transcranial magnetic stimulation

(TMS) uses magnetic fields to generate electrical currents, transcranial

direct current stimulation (tDCS), in contrast, is based on direct stimulation using electrical currents applied at the scalp (Fig. 487-4). TMS

induces small electrical currents in the brain by magnetic fields that

pass through the skull; it is known to be painless and therefore widely

used for NIBS. Animal research suggests that anodal tDCS causes a

generalized depolarization (a reduction in the resting membrane potential) over large cortical areas, whereas cathodal stimulation causes hyperpolarization.

Prolonged stimulation with tDCS can cause an enduring change in

cortical excitability under the stimulated regions. Further, changes in

resting-state fMRI-based activity and functional connectivity have also

been observed post-tDCS. Notably, there is uncertainty regarding precisely how much electrical current is able to penetrate through the skull

and modulate neural networks. Indeed, recent work has found that typical stimulation paradigms may not generate sufficient electrical fields

to modulate neural activity; an alternate possibility is that peripheral

nerves may be modulated and thus affect neural activity.

Neuromodulation via stimulation techniques such as tDCS and

TMS have shown promise as methods to improve motor function after

stroke; there are a growing number of studies demonstrating functional

benefits of combining physical therapy with brain stimulation. Two

commonly utilized TMS paradigms include low-frequency “inhibitory” stimulation of the healthy cortex or high-frequency “excitatory”

stimulation of the injured hemisphere. Each of these two approaches

aims to modify the balance of reciprocal inhibition between the two

hemispheres after stroke. A meta-analysis of randomized controlled

trials published over the past decade found a significant beneficial

effect on motor outcomes. Unfortunately, a recent large multicenter

trial to assess the long-term benefits of TMS on motor recovery after

stroke (NICHE trial) did not find a benefit at the population level.

Ongoing research aims to better understand how stimulation can

directly affect neural patterns and thus allow more customization

of stimulation—past trials did not record the neural responses to

stimulation.

TMS and tDCS interventions are also being applied in psychiatric

disorders. A substantial body of evidence supports the use of TMS as an

antidepressant in major depressive disorder (MDD). TMS is also being

investigated for its potential efficacy in posttraumatic stress disorder

(PTSD), obsessive-compulsive disorder (OCD), and treatment of auditory hallucinations in schizophrenia. Various repetitive TMS (rTMS)

protocols have shown efficacy in major depression. These include both

low-frequency (≤1 Hz) and high-frequency (10–20 Hz) rTMS stimulation over the dorsolateral prefrontal cortex (DLPFC). Mechanistically,

low-frequency rTMS is associated with decreased regional cerebral

blood flow while high-frequency rTMS elicits increased blood flow, not

only over the prefrontal region where the TMS is applied, but also in

associated basal ganglia and amygdala circuits. Notably, the differential

mechanisms of the low- vs. high-frequency rTMS protocols are associated with mood improvements in different sets of MDD patients, and

patients showing benefits with one protocol may even show worsening

with the other, again pointing to individual heterogeneity in network

function. EEG-guided TMS is also being investigated in psychiatric

disorders, for instance, using the individual resting alpha-band (8–12 Hz)

peak frequency to determine TMS stimulation rates. With respect

to transcranial electrical stimulation in psychiatry, tDCS is the most

commonly used protocol. In major depression, there is a documented

imbalance in left vs. right DLPFC activity; hence, differential anodal vs.

cathodal tDCS in the left vs. right prefrontal cortex may be a potentially

efficacious approach. Interestingly while meta-analysis shows promise

for NIBS methods in psychiatric illness, large RCTs have failed to

generate effects compared to placebo treatment. Future success may

require careful personalized targeting based on network dynamics and

refinement of protocols to accommodate combinatorial treatments.
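As a concrete illustration of the EEG-guided personalization mentioned above, the individual alpha peak frequency can be read off a resting power spectrum. The sketch below uses SciPy's Welch periodogram on simulated EEG; all parameters are illustrative and this is not a clinical protocol.

```python
import numpy as np
from scipy.signal import welch

FS = 500  # sampling rate in Hz (illustrative)

# Simulated resting EEG: broadband noise plus a 10-Hz alpha rhythm.
rng = np.random.default_rng(2)
t = np.arange(60 * FS) / FS
eeg = rng.standard_normal(t.size) + 2.0 * np.sin(2 * np.pi * 10.0 * t)

# Power spectral density via Welch's method.
freqs, psd = welch(eeg, fs=FS, nperseg=4 * FS)

# Individual alpha peak frequency: frequency of maximal power in 8-12 Hz.
alpha = (freqs >= 8) & (freqs <= 12)
iaf = freqs[alpha][np.argmax(psd[alpha])]
print(f"individual alpha peak frequency = {iaf:.2f} Hz")  # could inform an rTMS rate
```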

IMPLANTABLE NEURAL INTERFACES INCLUDING BRAIN–MACHINE INTERFACES

Fully implantable clinically relevant neural interfaces that can improve

function already exist. Cochlear implants, for example, are sensory

prostheses that can restore hearing in deaf patients. Environmental

sounds are processed in real-time and then converted into patterned

stimulation delivered to the cochlear nerve. Importantly, even while

the patterned stimulation remains the same, there are gradual improvements in the perception of speech and other complex sounds over a

period of several months after device implantation. Activity-dependent

sculpting of neural circuits is hypothesized to underlie the observed

perceptual improvements. Similarly, the development of deep-brain

stimulation (DBS) was based on decades of work showing that surgical lesions to specific nuclei could alleviate tremor and bradykinesia

symptoms in animal models. DBS involves chronic implantation of a

stimulating electrode that targets specific neural structures (e.g., subthalamic nuclei or the globus pallidus in Parkinson’s disease). At least

for movement disorders, it is commonly thought that targeted areas are

functionally inhibited by the chronic electrical stimulation.

■ IMPLANTABLE DEVICES FOR NEUROMODULATION

There has been recent progress in the development of implantable

neural interfaces to treat neurologic and psychiatric illnesses. For

example, for patients with refractory focal epilepsy and clearly identified seizure foci, invasive “responsive stimulation” has now been FDA

approved. Responsive stimulation is grounded on principles of closed-loop stimulation based on real-time monitoring of brain oscillations;

specifically, the device aims to detect the earliest signatures of the

onset of a seizure, usually at a stage that is not symptomatic, and then

deliver focused electrical stimulation to prevent further progression

and generalization. A large randomized controlled trial of this device

was performed in patients with intractable focal epilepsy; they were

assigned to either sham or active stimulation in response to seizure

detection. There was a significant reduction in seizure frequency in the

[Figure 487-4 schematic: a TMS coil delivering a brief (microsecond-scale) magnetic field, and tDCS electrodes (anode and cathode) passing current flow over minutes.]

FIGURE 487-4 Illustration of TMS and tDCS setups. The upper panels show a TMS

setup. Coils generate magnetic fields that can in turn generate electrical fields in

the cortical tissue. The lower panels show a tDCS setup. The electrical current

is believed to flow from the anode (+) to the cathode (–) through the superficial

cortical areas leading to polarization. (Reproduced with permission from R Sparing,

FM Mottaghy: Noninvasive brain stimulation with transcranial magnetic or direct

current stimulation [TMS/tDCS]—From insights into human memory to therapy of its

dysfunction. Methods 44:329, 2008.)



stimulation group, but it was rare for patients to become seizure free.

There were also modest improvements in quality of life. Notably, there

was a small, elevated risk of hemorrhage associated with the device. In

addition to providing clinicians with another treatment option, this

device has offered important avenues for research and further optimization. For example, it is now possible to monitor subclinical and

clinical seizures and intracranial EEG in patients with chronic epilepsy.

This has resulted in new knowledge about the association of seizures

with circadian rhythms and sleep. It is also anticipated that a better

understanding of the triggers of seizures and the development of better

stimulation algorithms, based on real-world data, can ultimately lead

to more effective treatments.
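The detect-and-stimulate logic underlying responsive stimulation can be caricatured in a few lines. The sketch below applies a classic line-length feature to simulated intracranial EEG and triggers a placeholder "stimulation" when a hypothetical threshold is crossed; actual devices use configurable, validated detectors, so this is only a conceptual illustration.

```python
import numpy as np

def line_length(window):
    """Line length: sum of absolute sample-to-sample differences.
    A simple feature that rises sharply during high-amplitude rhythmic discharges."""
    return np.abs(np.diff(window)).sum()

def deliver_stimulation(t_sec):
    # Placeholder for the device's stimulation output.
    print(f"stimulation triggered at t = {t_sec:.0f} s")

def responsive_loop(ieeg, fs, threshold):
    """Toy closed loop: scan consecutive 1-s windows and 'stimulate'
    whenever the detection feature crosses the threshold."""
    for start in range(0, ieeg.size - fs + 1, fs):
        if line_length(ieeg[start:start + fs]) > threshold:
            deliver_stimulation(start / fs)

# Simulated intracranial EEG: quiet background plus a seizure-like fast rhythmic burst.
fs = 250
rng = np.random.default_rng(3)
ieeg = rng.standard_normal(120 * fs)
burst_t = np.arange(3 * fs) / fs
ieeg[30 * fs:33 * fs] += 20 * np.sin(2 * np.pi * 20 * burst_t)

baseline = line_length(ieeg[:fs])          # crude baseline from the first second
responsive_loop(ieeg, fs, threshold=3 * baseline)
```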

There is also great interest in the development of treatments for

refractory depression. One area of focus has been on the development

of DBS to treat depression. While early smaller studies were quite

promising, a larger study failed to find benefits at the population level.

Subsequent analysis has suggested the possibility that more precise

tailoring of stimulation to each individual is warranted, both at the

level of specific pathways identified through neuroimaging as well as

network activity biomarkers. This approach is based on the hypothesis

that tailoring stimulation parameters to each individual may be more

promising. Recent studies have, in fact, supported the notion that

individualized patterns of network activity are predictive of a patient’s

symptoms and how he or she might respond to stimulation. There are

now planned studies that aim to tailor stimulation to each individual

with severe depression.

■ BRAIN–MACHINE INTERFACES FOR PARALYSIS

Brain–machine interfaces (BMIs) represent a more advanced neural interface that aims to restore motor function. Multiple neurologic disorders (e.g., traumatic and nontraumatic spinal cord injury,

motor-neuron disease, neuromuscular disorders, and strokes) can

result in severe and devastating paralysis. Patients cannot perform

simple activities and remain fully dependent for care. In patients with

high cervical injuries, advanced amyotrophic lateral sclerosis (ALS),

or brainstem strokes, the effects are especially devastating and often

leave patients unable to communicate. While there has been extensive

research into each disorder, little has proven to be clinically effective for

rehabilitation of long-term disability. BMIs offer a promising means to

restore function. In the patients described above, while the pathways

for transmission of signals to muscles are disrupted, the brain itself is

largely functional. Thus, BMIs can restore function by communicating

directly with the brain. For example, in a “motor” BMI, a subject’s

intention to move is translated in real-time to control a device. As

illustrated in Fig. 487-5, the components of a motor BMI include: (1)

recordings of neural activity, (2) algorithms to transform the neural

activity into control signals, (3) an external device driven by these control signals, and (4) feedback regarding the current state of the device.
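Those four components map naturally onto a simple software loop. The sketch below is a schematic only: every function is a hypothetical stand-in (random "firing rates," a fixed linear decoder, a printed cursor position) meant to make the data flow of Fig. 487-5 concrete.

```python
import numpy as np

rng = np.random.default_rng(4)

def record_neural_activity(n_channels=96):
    """(1) Acquire one sample of neural features, e.g., multiunit firing rates."""
    return rng.poisson(lam=5.0, size=n_channels)      # stand-in for real recordings

def decode(neural_features, weights):
    """(2) Decoder: transform neural activity into control signals (here, 2-D cursor velocity)."""
    return weights @ neural_features

def update_device(position, velocity, dt=0.05):
    """(3) Drive the external device (a cursor) with the decoded control signal."""
    return position + dt * velocity

def render_feedback(position):
    """(4) Feedback: show the current device state to the user (here, just print it)."""
    print(f"cursor at ({position[0]:+.2f}, {position[1]:+.2f})")

weights = rng.standard_normal((2, 96)) * 0.01          # a fixed, hypothetical decoder
cursor = np.zeros(2)
for _ in range(5):                                     # one iteration per control cycle
    rates = record_neural_activity()
    velocity = decode(rates, weights)
    cursor = update_device(cursor, velocity)
    render_feedback(cursor)
```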

Many sources of neural signals can be used in a BMI. While EEG

signals can be obtained noninvasively, other neural signals require

invasive placement of electrodes. Three invasive sources of neural signals include electrocorticography (ECoG), action potentials or spikes,

and local field potentials (LFP). Spikes and LFPs are recorded with

electrodes that penetrate the cortex. Spikes represent high-bandwidth

signals (300–25,000 Hz) that are recorded from either single neurons

(“single-unit”) or multiple neurons (“multiunit” or MUA). LFPs are

the low-frequency (~0.1–300 Hz) components. In contrast, ECoG

is recorded from electrodes that are placed on the cortical surface.

ECoG signals may be viewed as an intermediate-resolution signal in

comparison with spikes/LFPs and EEG. It is worth noting that there

is still considerable research into the specific neural underpinning of

each signal source and what information can be ultimately extracted

regarding neural processes.
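A common first processing step is to split a broadband intracortical recording into the frequency ranges just described. The sketch below (SciPy filters on simulated data) separates a spike band from the LFP and applies a simple threshold detector; the cutoff frequencies follow the ranges quoted in the text, and the threshold rule is one conventional choice rather than a standard.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 30_000   # broadband sampling rate (Hz) typical of penetrating-electrode systems

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
    return filtfilt(b, a, x)

def lowpass(x, hi, fs, order=4):
    b, a = butter(order, hi, btype="lowpass", fs=fs)
    return filtfilt(b, a, x)

# Simulated 1-s broadband trace from one penetrating electrode.
rng = np.random.default_rng(5)
broadband = rng.standard_normal(FS)

# Spike band (~300 Hz and above) vs. local field potential (roughly <300 Hz),
# mirroring the frequency ranges described in the text.
spike_band = bandpass(broadband, 300, 6_000, FS)
lfp = lowpass(broadband, 300, FS)

# A crude spike detector: negative threshold crossings of the high-frequency band,
# with the noise level estimated robustly from the median absolute value.
threshold = -4 * np.median(np.abs(spike_band)) / 0.6745
spike_samples = np.flatnonzero(spike_band < threshold)
print(f"detected {spike_samples.size} threshold crossings")
```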

A critical component of a BMI is the transform of neural activity into

a reliable control signal. The decoder is an algorithm that converts the

neural signals into control signals. One important distinction between

classes of decoders is biomimetic versus nonbiomimetic. In the case

of biomimetic decoders, the transform attempts to capture the natural

relationship between neural activity and a movement parameter. In

contrast, nonbiomimetic decoders can be more arbitrary transforms

between neural activity and prosthetic control. It had been hypothesized that learning prosthetic control with a biomimetic decoder is

more intuitive. Recent evidence, however, reveals that learning may be

important for achieving improvements in the level of control over an

external device (e.g., a computer cursor, a robotic limb) for either type

of decoder. This may be similar to learning a new motor skill.
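A minimal example of a decoder in the biomimetic spirit is a linear map from binned firing rates to cursor velocity, fit on paired neural and kinematic data. The sketch below uses scikit-learn's ridge regression on synthetic data; real decoders (e.g., Kalman-filter based) are more elaborate.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(6)

# Synthetic training data: binned firing rates (n_samples x n_units) recorded
# while the (intended) cursor velocity (n_samples x 2) is known.
n_samples, n_units = 2000, 50
true_tuning = rng.standard_normal((n_units, 2))
velocity = rng.standard_normal((n_samples, 2))
firing_rates = velocity @ true_tuning.T + 0.5 * rng.standard_normal((n_samples, n_units))

# A "biomimetic" linear decoder: learn the natural rate-to-velocity relationship.
decoder = Ridge(alpha=1.0).fit(firing_rates, velocity)

# At run time, each new vector of firing rates is mapped to a 2-D velocity command.
new_rates = firing_rates[:1]
print(decoder.predict(new_rates))   # decoded (vx, vy) control signal
```

A nonbiomimetic decoder would use the same machinery but with an arbitrary, learned mapping that the user adapts to through practice.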

A central goal of the field of BMIs is to improve function in patients

with permanent disability. This can consist of a range of communication and assistive devices such as a computer cursor, keyboard control,

wheelchair, or robotic limb. In the ideal scenario, the least invasive

method of recording neural signals would allow the most complex

level of control. Moreover, control should be allowed in an intuitive

manner that resembles the neural control of our natural limbs. There

is currently active research into developing and refining techniques to

achieve the most complex control possible using each signal source.

One measure of complexity is the degrees of freedom that are controlled. For example, control of a computer cursor on the screen (i.e.,

on the x and y axis) represents two degrees of freedom (DOF). Control

of a fully functional prosthetic upper arm that approaches our natural

range of motion would require >7 DOF. If the functionality of the hand

and fingers is included, then an even more complex level of control

would be required. There has been a large body of research on the use

of noninvasive recording of EEG signals. Studies suggest that two-DOF control using EEG is feasible. There

are also promising reports of patients with advanced

ALS communicating via email using EEG-based BCIs.

Known limitations of EEG-based BCIs include their low

signal-to-noise ratio (due to filtering of neural signals

by bone and skin) and contamination by muscle activity. Ongoing research aims to test usability in a more

general nonresearch setting as well as targeted use in

patients with disability.

Numerous studies now also indicate that BMIs using

invasive recording of neural signals can allow rapid control over devices with multiple DOF. The clear majority

of this research has been conducted using recordings

of spiking activity via implanted microelectrode arrays.

Initial preclinical studies were performed in able-bodied nonhuman primates. More recently, there have

been numerous examples of human subjects with a

range of neurologic illnesses (e.g., brainstem stroke,

ALS, spinal cord injury) who have demonstrated the

actual use of implantable neural interfaces. This includes

demonstrations of both the control of communication

interfaces as well as robotic limbs. Pilot clinical trials of

[Figure 487-5 schematic: (1) electrodes record neural signals (action potentials, field potentials); (2) signal processing and a decoding algorithm convert them into control signals; (3) the control signals drive a device such as a computer cursor or prosthetic limb; (4) feedback on the device state returns to the user.]

FIGURE 487-5 Components of a brain–machine interface (BMI). (Reproduced with permission from

A Tsu et al: Cortical neuroprosthetics from a clinical perspective. Neurobiol Dis 83:154, 2015.)



BCIs based on invasive recordings of neural signals have further shown

that significantly greater rates of communication are possible (e.g., >30

characters per minute). Notably, these BCI devices required a percutaneous connection and were always tested in the presence of research

staff. A case study additionally demonstrated that a fully implantable

BCI system could allow communication in a locked-in ALS patient

(Fig. 487-6). At the time of the study, the patient required mechanical

ventilation and could only communicate using eye movements. She

was implanted with multiple subdural cortical electrodes; the neural

signals were then processed and sent wirelessly to an external augmentative alternative communication (AAC) device. Importantly, she could

use the interface with no supervision from research staff.

BMIs have the potential to revolutionize the care of neurologically

impaired patients. While in its infancy, there have been multiple

proof-of-principle studies that highlight possibilities. Combined basic

and clinical efforts will ultimately lead to the development of products

that are designed for patients with specific disabilities. As outlined earlier, each signal source has strengths (e.g., noninvasive versus invasive,

recording stability) and weaknesses (e.g., bandwidth or the amount of

information that can be extracted). With additional research, a more

precise delineation of these strengths and weakness should occur. For

example, one hypothesis is that control of complex devices with high

DOF will only be possible using invasive recordings of high-resolution

neural activity such as spikes from small clusters of neurons. However,

recent trials using ECoG suggest that its stability might also allow higher

DOF control. As these limits become increasingly clear it should allow

targeted clinical translational efforts that are geared to specific patient

needs and preferences (e.g., extent of disability, medical condition,

noninvasive versus invasive). For example, patients with high cervical

injuries (i.e., above C4, where the arm and the hand are affected) have

rehabilitation needs different from patients with lower cervical injuries

(i.e., below C5–C6, where the primary deficits are the hand and fingers).

■ FURTHER READING

Baniqued PDE et al: Brain-computer interface robotics for hand

rehabilitation after stroke: A systematic review. J Neuroeng Rehabil

18:15, 2021.

Bassett DS et al: Emerging frontiers of neuroengineering: A network

science of brain connectivity. Annu Rev Biomed Eng 19:327, 2017.

Drysdale AT et al: Resting-state connectivity biomarkers define neurophysiological subtypes of depression. Nat Med 23:28, 2017.

Khanna P et al: Low-frequency stimulation enhances ensemble co-firing and dexterity after stroke. Cell 184:912, 2021.

Liu A et al: Immediate neurophysiological effects of transcranial electrical stimulation. Nat Comm 9:5092, 2018.

Mishra J et al: Video games for neuro-cognitive optimization. Neuron

90:214, 2016.

Reinkensmeyer DJ et al: Computational neurorehabilitation: Modeling plasticity and learning to predict recovery. J Neuroeng Rehabil

13:42, 2016.

Scangos K et al: State-dependent responses to intracranial brain stimulation in a patient with depression. Nat Med 27:229, 2021.

[Figure 487-6 schematic elements: implanted electrode strip (electrodes e1–e4, anterior to posterior), implanted transmitter, antenna and receiver, tablet, and ventilator.]

FIGURE 487-6 Illustration of an ALS patient with a fully implanted communication interface. A. Illustration of the location of electrodes on the brain. B. X-ray of chest showing the wireless module. C. X-ray of leads and wire routing. D. Schematic of the subject performing a typing task. (From MJ Vansteensel et al: Fully implanted brain–computer interface in a locked-in patient with ALS. N Engl J Med 375:2060, 2016. Copyright © 2016 Massachusetts Medical Society. Reprinted with permission from Massachusetts Medical Society.)


488 Machine Learning and Augmented Intelligence in Clinical Medicine

Arjun K. Manrai, Isaac S. Kohane

Machine learning has reshaped our consumer lives, with self-driving

vehicles, conversant digital assistants, and machine translation services

so ubiquitous that they are at risk of not being considered particularly

intelligent for much longer. Will the algorithms underlying these technologies similarly transform the art and practice of medicine? There

is hope that modern machine-learning techniques—especially the

resurgence of artificial neural networks in deep learning—will foster

a sea change in clinical practice that augments both the sensory and

diagnostic powers of physicians while, perhaps paradoxically, freeing

physicians to spend more time with their patients by performing laborious tasks on their behalf.

From the birth of artificial intelligence at the Dartmouth Summer

Research Project on Artificial Intelligence in 1956 to self-driving

vehicles today, machine-learning methods and theory have developed

in symbiosis with growing datasets and computational power. In this

chapter, we discuss the foundations of modern machine-learning algorithms and their emerging applications in clinical practice. Modern

machine-learning techniques are sufficiently capacious as to learn

flexible and rich representations of clinical data and are remarkably

adept at exploiting spatial and temporal structure in raw data. The

newest machine-learning models perform on par with expert physicians or prior state-of-the-art models on a variety of tasks, such as

the interpretation of images (e.g., grading retinal fundus photographs

for diabetic retinopathy), analysis of unstructured text (e.g., predicting hospital readmission from electronic health record notes), and

processing of speech (e.g., detecting depression from patient speech).

However, many evaluations of machine-learning models occur on tasks

that are narrow and unrealistic, and further lack the clinical context

that a physician would incorporate. The models themselves are also

often divorced from considerations of patient utility. To help ensure

these models benefit patients, this chapter aims to bring more physicians into the design and evaluation of machine-learning models by

providing an understanding of how modern machine-learning models

are developed and how they relate to more familiar methods from the

epidemiological literature.

Today, the terms machine learning and artificial intelligence evoke

images distinct from those conjured up by the same terms in the 1950s

and the 1980s, and they likely will mean something different in a

decade. Computer scientist John McCarthy originally defined artificial


intelligence in 1956 as “the broad science and engineering of creating

intelligent machines,” most often embodied today as computer software. Machine learning can be viewed as the subfield of artificial intelligence encompassing algorithms that extract generalizable patterns

from data. This stands in contrast to approaches to create intelligence

from human-engineered and explicitly programmed rules that characterized many early applications of artificial intelligence to medicine

during the 1970s and 1980s (e.g., expert systems such as INTERNIST-I

and MYCIN).

This chapter covers machine-learning methods and applications

that may augment physician expertise at the point of care. Many applications of machine learning in health care are therefore not reviewed

here, for example algorithms to improve hospital planning, detect

insurance fraud, and monitor new drugs for adverse events. Throughout this chapter, we discuss how new machine-learning methods can

learn rich representations of both clinical state and patient identity by

discovering how to represent raw data. At the same time, models largely

reflect the data on which they are trained and, thus, may encode and

amplify biased prior practices; they may also be brittle in unfamiliar

and evolving settings. If machine-learning methods are positioned

to tackle problems based on physician needs and are continuously

re-assessed, we envisage a future with clinics instrumented with

machine-learning tools that augment the ability of physicians to reason

precisely and probabilistically over populations at the point of care.

CONCEPTS OF MACHINE LEARNING

■ TYPES OF MACHINE LEARNING

Many physicians will be familiar with the major types of machine-learning

methods from methodologic counterparts discussed in the context of

“traditional” statistical and epidemiological modeling. In the current

machine-learning and epidemiology literature, much confusion arises

over whether a method “belongs” in one camp or the other. More is

gained by focusing on the computational and statistical connections,

particularly in understanding how new machine-learning methods

compare with familiar clinical risk-stratification approaches.

Broadly, there are four major types of machine learning with applications to clinical medicine: (1) supervised learning, (2) unsupervised

learning, (3) semi-supervised learning, and (4) reinforcement learning.

The four subfields of machine learning differ from one another in both

their objectives and the degree to which algorithms have access to

labeled examples from which to learn (Fig. 488-1). All four subfields

have roots tracing back decades with classical examples and modern

counterparts (Table 488-1).

To date, supervised machine-learning approaches have dominated

the medical literature and recent deep-learning applications. In supervised learning, paired input and labeled output examples are provided

together to a machine-learning algorithm that learns what combination

(potentially millions) of parameters optimizes a function that predicts


FIGURE 488-1 The subfields of machine learning differ in their access to labeled examples. In supervised learning (left), the learning algorithm uses labeled output

examples (red, blue) to learn a classifier for determining whether a new unlabeled data point is red or blue. Semi-supervised learning methods (center) have access to both

unlabeled and labeled examples; the unlabeled data help learn a classifier with fewer labeled examples. Unsupervised learning methods (right) do not use labels but, rather,

identify structure present in the data. (Reproduced with permission from Luke Melas-Kyriasi.)



output from input. The goal is to learn robust functions that work

well with unseen data. If this setting is familiar, it is because clinical

researchers often use well-known traditional statistical approaches like

linear and logistic regression to achieve the same goal. For example,

clinical risk scores, such as the American College of Cardiology (ACC)/

American Heart Association (AHA) Atherosclerotic Cardiovascular

Disease (ASCVD) Pooled Cohort Risk Equations or the Framingham

Risk Score, are based on fitting models with paired input data (e.g.,

age, sex, LDL cholesterol, smoking history) and labeled output data

(e.g., first occurrence of nonfatal myocardial infarction, coronary

heart disease [CHD] death, or fatal or nonfatal stroke). Contemporary

deep-learning methods can learn flexible representations of raw input

data as opposed to relying on expert-identified features (Table 488-1).

A contemporary clinical example of supervised machine learning with

convolutional neural networks is the histopathological detection of

lymph node metastases in breast cancer patients (Table 488-1).
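In code, the supervised setting looks much the same whether the model is a traditional risk equation or a deep network: paired inputs and labels are used to fit a function that is then applied to unseen individuals. The sketch below fits a logistic regression to simulated data; the features and outcome are illustrative and are not the ASCVD equations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)

# Synthetic paired data: inputs (age, LDL cholesterol, smoking status) and a
# labeled binary outcome (e.g., incident cardiovascular event). Entirely simulated.
n = 5000
age = rng.uniform(40, 79, n)
ldl = rng.normal(130, 30, n)
smoker = rng.integers(0, 2, n)
logit = -12 + 0.10 * age + 0.02 * ldl + 0.7 * smoker
outcome = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([age, ldl, smoker])
X_train, X_test, y_train, y_test = train_test_split(X, outcome, test_size=0.3, random_state=0)

# Supervised learning: fit parameters that map inputs to labeled outputs,
# then predict risk for previously unseen individuals.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("predicted event probability for one held-out patient:",
      model.predict_proba(X_test[:1])[0, 1])
```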

The three remaining types of machine learning have not been

as widely applied to clinical problems to date, but we believe this is

likely to shift in coming years. These include unsupervised learning,

semi-supervised learning, and reinforcement learning. Unsupervised

learning, in contrast to supervised learning, encompasses methods that

use unlabeled input data, where the goal is to discover the “structure”

present in the data. A researcher may use unsupervised methods to

determine whether or not the data lie on a low-dimensional “manifold”

that is “embedded” in a higher-dimensional space. For example, a

researcher may obtain gene-expression measurements from more than

20,000 protein-coding genes for a large group of asthma patients and

then “project” each patient into a lower-dimensional space to visualize

and understand structure present in the dataset, or may group asthma

patients by similarity across all gene-expression values. Classical linear

methods include principal component analysis (PCA), and contemporary nonlinear approaches include Uniform Manifold Approximation and Projection (UMAP) (Table 488-1). Semi-supervised learning

is a hybrid between supervised and unsupervised learning, with methods that use both labeled data and unlabeled data. These algorithms

exploit the (often low-dimensional) structure of unlabeled data to

learn better models than may be possible in a purely supervised setting

where labeled data may be scarce. Finally, reinforcement learning is a

distinct subfield of machine learning that focuses on optimizing the

iterative decision-making of an “agent” that is equipped with a cumulative “reward” function, and thus must navigate a trade-off between

exploration and exploitation of its environment, distinct from the other

three subfields of machine learning where the entire learning signal

(i.e., dataset) is presented at once. This approach to learning has been

successful in teaching computers to play games at world-expert levels

(e.g., Google’s AlphaGo Zero).
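The unsupervised gene-expression example above can be sketched directly: project a patient-by-gene matrix into a low-dimensional space and group patients by similarity there. The code below uses PCA and k-means from scikit-learn on simulated data; UMAP (the umap-learn package) would be a contemporary nonlinear substitute for the projection step.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)

# Synthetic stand-in for a gene-expression matrix:
# rows = patients (e.g., an asthma cohort), columns = ~20,000 protein-coding genes.
n_patients, n_genes = 300, 20000
expression = rng.standard_normal((n_patients, n_genes))
expression[:150, :50] += 3.0        # one hypothetical molecular subgroup

# Unsupervised step 1: "project" each patient into a low-dimensional space.
coords = PCA(n_components=2).fit_transform(expression)

# Unsupervised step 2: group patients by similarity in that space.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(coords)
print("patients per cluster:", np.bincount(clusters))
```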

■ MODERN MEDICAL MACHINE LEARNING

The modern machine-learning toolkit includes methods that differ

extensively in their complexity and ability to learn directly from raw

data (Table 488-2). “Traditional” statistical methods, such as linear

and logistic regression, remain vital and often serve at minimum as

TABLE 488-1 Types of Machine Learning and Clinical Examples

Supervised learning
Definition: Methods that use paired input and labeled output examples to learn a generalizable function that predicts output labels from input.
Classic example: Logistic regression. Contemporary example: Convolutional neural network (CNN).
Clinical example: Histopathological detection of lymph node metastases in breast cancer patients.

Unsupervised learning
Definition: Methods that use unlabeled input data to discover data structure and learn efficient representations for data (e.g., clustering, dimensionality reduction).
Classic example: Principal component analysis (PCA). Contemporary example: Uniform Manifold Approximation and Projection (UMAP).
Clinical example: Visualizing structure in gene expression levels and grouping asthma patients into distinct molecular clusters.

Semi-supervised learning
Definition: Methods that use both unlabeled and labeled examples to learn functions better than possible in the supervised setting alone.
Classic example: Self-training. Contemporary example: Consistency regularization.
Clinical example: Use of unlabeled cardiac magnetic resonance images alongside a small dataset of labeled examples to detect hypertrophic cardiomyopathy.

Reinforcement learning
Definition: Methods to teach an "agent" that iteratively interacts with its environment how to optimize a numerical reward.
Classic example: Optimal control. Contemporary example: Deep reinforcement learning (e.g., AlphaGo Zero).
Clinical example: Selecting fluids and vasopressor dosing iteratively to manage sepsis for patients in the intensive care unit.

TABLE 488-2 Select Tools in the Modern Medical Machine-Learning Toolkit

Linear and Logistic Regression
Definition: Models a linear relationship between predictors and either a continuous or binary outcome variable; "traditional" statistical modeling.
Notes: Necessary baseline. In small, carefully curated clinical datasets, these methods often perform on par with more sophisticated methods.

Gradient-Boosted Trees
Definition: Ensemble of "decision trees" with parameters optimized to efficiently learn accurate nonlinear functions in a supervised setting.
Notes: Efficient to train and often performs well on machine-learning tasks with tabular data.

Convolutional Neural Network (CNN)
Definition: Specialized deep-learning architecture with groups of neurons ("convolutional filters") that exploit spatial structure.
Notes: State-of-the-art in computer vision; de facto standard for medical imaging tasks (e.g., U-Net architecture for biomedical image segmentation).

Transformer Models
Definition: Deep-learning architecture designed for mapping input sequences to output sequences (e.g., text, speech).
Notes: Variants include Bi-directional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer 3 (GPT-3); state-of-the-art for tasks in natural language processing and machine translation.

Generative Adversarial Network (GAN)
Definition: Deep-learning framework consisting of two networks that compete to better learn the "generative model" underlying training examples.
Notes: Performs well on image-to-image translation tasks; can create realistic synthetic data, art, and style transfer.

Uniform Manifold Approximation and Projection (UMAP)
Definition: Dimensionality reduction technique to visualize and identify low-dimensional structure of a high-dimensional dataset while preserving global structure.
Notes: Nonlinear technique; many other techniques exist, e.g., principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE).

Transfer Learning
Definition: Family of approaches to adapt models trained for one task and apply them to another task (across domains).
Notes: Useful for "jump-starting" a model for a new problem, e.g., many medical computer-vision models start with a network trained for a separate (often nonmedical) task such as ImageNet and may "fine-tune" it for a particular medical application.



useful and interpretable baselines, and frequently as much more. Generalizations of these approaches as well as many other methods have been

developed to learn complicated functions including highly nonlinear

relationships. For example, models such as gradient-boosted trees (Table

488-2) often achieve excellent performance with tabular data that lack

the spatial or temporal structure that newer deep-learning methods

can exploit.
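A brief sketch of that point: on a small synthetic tabular dataset with a nonlinear relationship between routine variables and outcome, a gradient-boosted tree ensemble can be fit in a few lines with scikit-learn. The variables and effect sizes are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(9)

# Synthetic tabular clinical data with a deliberately nonlinear (interaction-based)
# relationship between predictors and a binary outcome.
n = 4000
age = rng.uniform(20, 90, n)
creatinine = rng.lognormal(0.0, 0.4, n)
sodium = rng.normal(140, 4, n)
high_risk = (age > 70) & (creatinine > 1.5)
y = (rng.random(n) < np.where(high_risk, 0.6, 0.05)).astype(int)
X = np.column_stack([age, creatinine, sodium])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("held-out AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```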

Modern deep-learning models can learn rich and flexible representations of raw clinical data. The building blocks of these models

are simple artificial “neurons,” often arranged into layers (Fig. 488-2).

Each neuron accepts input from neurons in a preceding layer, computes a weighted sum of these inputs, and applies a nonlinear function

to the weighted sum (Fig. 488-2).
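The computation in Fig. 488-2 takes only a few lines to write down. The sketch below mirrors the figure's notation (inputs j, weights w, a ReLU nonlinearity); the numerical values are arbitrary.

```python
import numpy as np

def relu(x):
    """Rectified linear function: ReLU(x) = max(0, x)."""
    return np.maximum(0.0, x)

def artificial_neuron(inputs_j, weights_w, bias=0.0):
    """One neuron: a weighted sum of its inputs passed through a nonlinearity."""
    return relu(np.dot(weights_w, inputs_j) + bias)

# Five inputs from a preceding layer and their adjustable weights (values arbitrary).
j = np.array([0.2, -1.3, 0.7, 0.0, 2.1])
w = np.array([0.5, 0.1, -0.4, 0.8, 0.3])
print(artificial_neuron(j, w))   # output of neuron k
```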

The values of the adjustable weights between neurons are learned

during the training process. Neurons may be arranged into hierarchical layers that build an increasingly rich representation of input data.

For example, convolutional neural networks (CNNs) are specialized

architectures that combine groups of neurons with the mathematical

operation of convolution (“convolutional filters”) to exploit spatial

structure (Table 488-2). Initial layers learn low-level features (e.g.,

edges) and then build to higher layers that learn motifs and objects,

creating a powerful representation for discriminating between inputs

using output labels (Fig. 488-3).
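A minimal convolutional network can make this layering concrete: two convolutional stages followed by a linear classification head, defined here in PyTorch. The input size, channel counts, and two output classes are arbitrary choices for illustration.

```python
import torch
from torch import nn

# A small convolutional neural network: early layers learn low-level features
# (edges), deeper layers build toward motifs and objects, and a linear head
# discriminates between output labels (here, two arbitrary classes).
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # first convolutional filter bank
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # deeper layer, richer features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 2),                  # class scores
)

images = torch.randn(4, 1, 64, 64)               # a batch of 4 grayscale 64x64 images
print(model(images).shape)                       # torch.Size([4, 2])
```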

Modern neural networks may have many millions and even billions

of weight parameters. For example, the VGG-16 network by Simonyan

and Zisserman, a path-breaking architecture and one of the top models

in the ImageNet Large Scale Visual Recognition Challenge in 2014,

had approximately 140 million parameters. The U-Net architecture

by Ronneberger and colleagues, a convolutional neural network, is

used frequently for biomedical image segmentation and other medical

imaging tasks. Other deep-learning models have architectures tailored

for distinct tasks and data types including sequence-to-sequence

and Transformer models, designed for sequential data, including

unstructured text notes from electronic health records (Table 488-2).

Generative adversarial networks (GANs) feature an architecture with

two co-trained competing (“adversarial”) networks and exhibit what

many would describe as artistic talent or creativity. Finally, many

unsupervised dimensionality reduction techniques have recently been

introduced to discover nonlinear structure including, for example,

Uniform Manifold Approximation and Projection (UMAP), though

linear (e.g., PCA) and nonlinear (e.g., t-SNE) alternatives exist as well

(Table 488-2). In many deep-learning applications, it is important to

note that practitioners often do not train networks tabula rasa but

[Figure 488-2 schematic: five inputs j1…j5, each scaled by an adjustable weight w1…w5, converge on neuron k, which outputs ReLU(Σ i=1..5 wi × ji).]

FIGURE 488-2 The artificial neuron, the building block of deep learning models.

Neuron k accepts weighted input from neurons in the preceding layer and applies

a function (e.g., ReLU(x) = max(0,x), the rectified linear function) to the weighted

sum of inputs (often with a bias term, not shown). During training, the weights are

iteratively refined to better fit the data.

[Figure 488-3 layer labels: Input → Edges (layer conv2d0) → Textures (layer mixed3a) → Patterns (layer mixed4a) → Parts (layers mixed4b & mixed4c) → Objects (layers mixed4d & mixed4e) → Softmax outputs.]

FIGURE 488-3 Deep-learning models learn rich hierarchical representations. Visualization of what a convolutional neural network “sees” as it is processing images of

common objects from the ImageNet dataset. The initial layers learn low-level features like edges and the higher layers learn patterns and objects. (Reproduced from distill.

pub by C Olah et al: Feature visualization. Distill, 2017 https://distill.pub/2017/feature-visualization/.)



instead benefit substantively from transfer learning (Table 488-2),

where the weights may be initialized or “pretrained” from another task.
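A typical transfer-learning recipe is to load a network pretrained on ImageNet, freeze its feature extractor, and fine-tune only a new final layer for the target task. The PyTorch/torchvision sketch below follows that recipe (using the torchvision 0.13+ weights API); the two-class "medical" task and the data are hypothetical, and running it downloads the pretrained ImageNet weights.

```python
import torch
from torch import nn
from torchvision.models import resnet18, ResNet18_Weights

# Start from a network pretrained on ImageNet (a nonmedical task) ...
model = resnet18(weights=ResNet18_Weights.DEFAULT)

# ... freeze the pretrained feature extractor ...
for param in model.parameters():
    param.requires_grad = False

# ... and replace the final layer for a new, hypothetical two-class medical task
# (e.g., "referable" vs. "non-referable" images); only this layer is fine-tuned.
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
dummy_batch = torch.randn(4, 3, 224, 224)        # stand-in for real medical images
print(model(dummy_batch).shape)                  # torch.Size([4, 2])
```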

■ PRACTICAL CONCEPTS IN TRAINING A MODERN MACHINE-LEARNING MODEL

In this section, we provide a brief overview of practical concepts

that emerge when training a modern machine-learning model to

help readers understand the constraints and choices that confront

machine-learning practitioners.

Often the first task is to define what a “good” prediction is by specifying a “loss function,” which quantifies the error between a model’s

prediction and the true label (Table 488-3). There are many choices

for this function. Linear regression uses quadratic loss; other examples

include the cross-entropy and 0–1 loss. Given a loss function, the

stochastic gradient descent method coupled with the backpropagation algorithm quantifies how to alter the adjustable weights in the

network to optimize the loss function iteratively as labeled examples

are provided to the network, either one by one or in batches. The

weights themselves are often initialized carefully and frequently transferred over from another task. Most machine-learning practitioners

use specialized hardware called graphics processing units (GPUs) to

perform these calculations. As with most computer hardware, GPUs

range dramatically in performance and cost. Software frameworks like

PyTorch and TensorFlow automate much of the otherwise-cumbersome training process, abstracting away in a dozen lines what might

have previously taken a team of machine-learning engineers months

to build. The machine-learning community places great emphasis on

generalization through near-universal practices such as widely available benchmark datasets, and splitting training and testing data, with

performance measures such as the area under the receiver operating characteristic curve (AUC) computed in held-out test data to obtain unbiased estimates of generalization

(Table 488-3).
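The pieces just described (a loss function, stochastic gradient descent with back-propagation, a train-test split, and a held-out AUC) fit together in a short training loop. The PyTorch sketch below uses synthetic data; the architecture, batch size, and learning rate are arbitrary illustrative choices.

```python
import torch
from torch import nn
from sklearn.metrics import roc_auc_score

torch.manual_seed(0)

# Synthetic labeled dataset, split into training and held-out test portions.
X = torch.randn(1000, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).float()
X_train, y_train = X[:700], y[:700]
X_test, y_test = X[700:], y[700:]

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.BCEWithLogitsLoss()                          # a cross-entropy loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # stochastic gradient descent

for epoch in range(50):
    for start in range(0, len(X_train), 64):              # labeled examples provided in batches
        xb, yb = X_train[start:start + 64], y_train[start:start + 64]
        optimizer.zero_grad()
        loss = loss_fn(model(xb).squeeze(1), yb)
        loss.backward()                                   # back-propagation computes the gradients
        optimizer.step()                                  # weights nudged to reduce the loss

# Performance reported on held-out test data, as the AUC.
with torch.no_grad():
    scores = torch.sigmoid(model(X_test).squeeze(1))
print("held-out AUC:", roc_auc_score(y_test.numpy(), scores.numpy()))
```

In practice the held-out data are kept untouched until the very end, which is what makes the reported AUC an honest estimate of generalization.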

APPLICATIONS OF MODERN MACHINE LEARNING TO CLINICAL MEDICINE

Two major classes of applications have dominated recent machine-learning

applications in medicine: computer vision and natural language processing. We review some of the recent applications below, highlighting

the breadth of challenges across clinical specialties and some emerging

new directions.

■ COMPUTER VISION

Medical computer-vision applications, particularly those employing convolutional neural networks, have dominated the medical

machine-learning literature over the past decade. Convolutional neural

networks are well suited to exploit the spatial structure in medical-imaging data and are able to learn detailed representations of input

images, yielding systems that often perform on par with or better than

expert physicians at select tasks. One of the most well-known medical

computer-vision applications during the past decade was published in

a paper by Gulshan and colleagues in the Journal of the American Medical Association (JAMA) in 2016. The authors trained a convolutional

neural network using 128,175 ophthalmologist-labeled retinal fundus

photographs to develop an automated diabetic retinopathy detection

system, achieving performance on par with expert ophthalmologists,

with an AUC of 0.99 in two separate validation datasets. In a separate

study by De Fauw and colleagues, 14,884 three-dimensional retinal

optical coherence tomography (OCT) scans were used to train a deep-learning model that could make referral suggestions and performed

with an accuracy at or superior to eight clinical experts who graded the

same scans, with an AUC in test data over 0.99 for urgent referral vs.

other referrals. Some machine-learning applications in ophthalmology

have already received approval from the U.S. Food and Drug Administration (FDA) including the IDx-DR “device” to classify more than

mild diabetic retinopathy.

Outside of ophthalmology, computer-vision applications have been

numerous across the many other specialties that rely on imaging

data. For example, dermatologist-level classification of skin cancer

was achieved in a study by Esteva and colleagues published in Nature

during 2017. The authors trained a convolutional neural network to

distinguish between keratinocyte carcinomas and benign seborrheic

keratoses as well as between malignant melanomas and benign nevi.

The authors concluded that the model performed at the level of the 21

board-certified dermatologists against which it was tested.

The uses of machine-learning models in radiology are numerous

as well, with applications including the detection of pneumonia from

chest x-rays, identification of pancreatic cancer from CT scans, and

fast, automated segmentation of cardiac structures from cardiac MRI,

as well as echocardiography.

Specialized deep-learning architectures such as the U-Net architecture by Ronneberger and colleagues have become especially popular

in the medical computer-vision community. Architectures are often

designed for specific imaging tasks (e.g., image segmentation) or specialized data types (e.g., three-dimensional images or videos). New frontiers

of computer-vision research in medicine include semi-supervised learning approaches to take advantage of extensive unlabeled data available

at hospitals, particularly given the practical difficulty and cost for an

individual researcher to obtain large expert-labeled datasets.

■ NATURAL LANGUAGE PROCESSING

Like computer vision, natural language processing (NLP) has been

transformed by modern machine-learning approaches, particularly

deep learning. Deep-learning approaches include recurrent neural

networks, newer sequence-to-sequence models, and the recently developed Transformer model (Table 488-2), which are well suited to exploit

the structure of text and natural language in both supervised and

unsupervised settings. These models have been successfully applied to

analyze physician notes in the electronic health record, detect depression symptom severity from spoken language, and scribe patientphysician visits. For example, a study by Rajkomar and colleagues

analyzed electronic health record data from 216,221 adult patients

to predict in-hospital mortality, 30-day unplanned readmission, and

discharge diagnoses amongst other outcomes, performing at high

accuracy, with an AUC of 0.93–0.94 for predicting in-hospital mortality. It is important to note that much of the progress in medical natural

language processing has stemmed from the widespread availability of

datasets, for example the Medical Information Mart for Intensive Care

(MIMIC) database.

Many specialized deep-learning architectures have been developed

for natural language processing applications, including the analysis of

electronic health record data, using both supervised (e.g., recurrent

neural network) and unsupervised (e.g., variational autoencoder)

approaches. Domain-specific language representation models have

TABLE 488-3 Practical Concepts for Training a Deep-Learning Model

Loss Function
Definition: Mathematical function that quantifies the discrepancy between the predicted label and the true label.
Examples: Cross-entropy, quadratic, 0–1.

Back-propagation
Definition: Algorithm to compute how the loss function changes with respect to changes in the adjustable weights (the "gradient").

Graphics Processing Unit (GPU)
Definition: Specialized computer hardware to speed up the many matrix calculations involved in training a neural network.
Examples: NVIDIA Tesla V100.

Train-test Split
Definition: How data are divided to ensure fair estimates of model performance after training.
Examples: 70% training, 30% test.

Area Under the Receiver Operating Characteristic Curve (AUC)
Definition: Common performance metric for evaluating binary classification models; 0.5 = random, 1.0 = perfect.

Deep-Learning Framework
Definition: Computational framework for efficiently performing matrix (tensor) calculations for training deep-learning models.
Examples: TensorFlow, PyTorch, Keras.



been developed for the purpose of biomedical text mining, serving as

a substrate for many downstream natural language processing tasks.

These include, for example, the BioBERT model by Lee and colleagues,

published in 2019, which adapts the Bi-directional Encoder Representations from Transformers (BERT) model (Table 488-2) for biomedical

text mining.
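As an illustration of how such a model is typically used, the Hugging Face transformers library can load a BERT-style biomedical checkpoint and turn short clinical notes into fixed-length embeddings for downstream supervised tasks. The model identifier below is assumed to be available on the model hub, the notes are invented, and mean pooling is just one simple way to summarize the token embeddings.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint name; any BERT-style biomedical model could be substituted.
MODEL_NAME = "dmis-lab/biobert-v1.1"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

notes = [
    "Patient reports worsening dyspnea on exertion over two weeks.",
    "No chest pain; mild bilateral lower-extremity edema noted.",
]

# Tokenize the clinical text and extract contextual embeddings; the mean of the
# final hidden states gives one fixed-length vector per note, which a downstream
# supervised model (e.g., a readmission classifier) could consume.
inputs = tokenizer(notes, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # (batch, tokens, hidden_size)
embeddings = hidden.mean(dim=1)
print(embeddings.shape)
```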

■ OTHER APPLICATIONS

While medical computer vision and natural language processing tasks

have been the focus of newer deep-learning models due to the extensive structure of imaging and text data, many other application classes

exist. For example, cardiologist-level performance has been achieved in

deep-learning approaches for detecting arrhythmias from ambulatory

electrocardiograms, standing in contrast to the rule-based algorithms

used traditionally to interpret electrocardiogram signals. In genomics,

investigators have analyzed tumor genomes with machine-learning

methods to better predict survival using both deep-learning and other

machine-learning approaches. Machine-learning methods have also

been used to characterize the deleteriousness of single-nucleotide

variants in DNA. Many other applications of machine learning to

new patient data streams are emerging, for example machine learning

applied to wearables (e.g., smartwatches).

MACHINE LEARNING AND PRECISION MEDICINE: DEEPER REPRESENTATIONS OF PATIENT STATE

The modern machine-learning methods described in this chapter have much in common with the concept of precision medicine.

As described in a report published by the National Academies of

Sciences, Engineering, and Medicine in 2011, precision medicine

refers to “the ability to classify individuals into subpopulations that

differ in their susceptibility to a particular disease, in the biology and/

or prognosis of those diseases they may develop, or in their response

to a specific treatment.” This vision calls for the development of an

“Information Commons,” a patient-centric view of heterogeneous

data streams (e.g., genome, environmental exposures, clinical signs

and symptoms, transcriptome) that together paint a complete picture

of an individual’s health.

The machine-learning methods described in this chapter similarly

operate on a heterogeneous and rich set of data to improve both the

predictive abilities of physicians as well as the understanding of disease

structure within and across data types. Modern machine-learning

methods are especially well suited to improve the representation of a

patient’s clinical state, identity, and environmental context in order

to improve individualized medical decision-making, learning data-driven “embeddings” of a patient’s clinical state and identity in the

process (Fig. 488-4). Machine learning and precision medicine can

thus be seen as aligned disciplines where flexible and powerful learning

algorithms combine with rich and detailed data to augment clinical

decision-making.
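The sketch below is illustrative only and is not taken from the chapter; it mirrors the architecture implied by Fig. 488-4, fusing separate encoders for hypothetical genomic, laboratory, and exposure inputs into a low-dimensional embedding trained against diagnosis labels with a cross-entropy loss. All dimensions and tensors are synthetic assumptions.

```python
# Illustrative sketch of Fig. 488-4: heterogeneous patient data -> learned embedding.
import torch
import torch.nn as nn

n_genomic, n_labs, n_exposures, n_diagnoses = 100, 30, 10, 5   # assumed sizes

class PatientEmbedder(nn.Module):
    def __init__(self, embed_dim: int = 16):
        super().__init__()
        # Each data stream gets its own encoder before fusion
        self.genome = nn.Linear(n_genomic, 32)
        self.labs = nn.Linear(n_labs, 32)
        self.exposures = nn.Linear(n_exposures, 32)
        self.embed = nn.Linear(96, embed_dim)        # the learned embedding layer
        self.classify = nn.Linear(embed_dim, n_diagnoses)

    def forward(self, g, l, e):
        h = torch.relu(torch.cat([self.genome(g), self.labs(l), self.exposures(e)], dim=1))
        z = torch.relu(self.embed(h))                # low-dimensional clinical state
        return self.classify(z), z

model = PatientEmbedder()
loss_fn = nn.CrossEntropyLoss()                      # cross-entropy loss, as in Fig. 488-4

# One synthetic training step on a batch of 8 "patients"
g, l, e = torch.randn(8, n_genomic), torch.randn(8, n_labs), torch.randn(8, n_exposures)
labels = torch.randint(0, n_diagnoses, (8,))
logits, embedding = model(g, l, e)
loss_fn(logits, labels).backward()
print(embedding.shape)                               # torch.Size([8, 16])
```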

CONCLUSION

Modern machine learning offers a powerful set of techniques to learn

feature representations directly from data, already performing on par

with expert physicians on select tasks. If carefully trained and judiciously applied to key areas of clinician workflow, the representational

power of new machine-learning methods makes them likely to touch

every area of clinical practice.

[Figure 488-4 schematic: input data streams (genome; blood and urine laboratory biomarkers; exposures) feed embeddings of identity and clinical state, which produce output diagnoses trained against gold-standard diagnoses with a cross-entropy loss.]

FIGURE 488-4 Machine learning and precision medicine: deeper representations of clinical state and identity. Diverse data streams (e.g., genome; blood and urine

biomarkers, including triglycerides and LDL cholesterol; and exposures) are combined alongside labeled output diagnoses in a deep-learning model. In the process of

training this model, a lower-dimensional representation of the high-dimensional input data (“embedding”) is learned.



■ FURTHER READING

Gulshan V et al: Development and validation of a deep learning

algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316:2402, 2016.

Krizhevsky A et al: ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems 25 (NIPS 2012), 2012.

Lecun YA et al: Deep learning. Nature 521:436, 2015.

Obermeyer Z, Emanuel EJ: Predicting the future - big data, machine

learning, and clinical medicine. N Engl J Med 375:1216, 2016.

Olah C et al: Feature visualization. Distill, 2017. https://distill.

pub/2017/feature-visualization/.

Rajkomar A et al: Machine Learning in Medicine. N Engl J Med

380:1347, 2019.

Ronneberger O et al: U-Net: Convolutional networks for biomedical image segmentation, in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Springer International Publishing, 2015, pp 234–241.

Silver D et al: Mastering the game of Go without human knowledge.

Nature 550:354, 2017.

Szolovits P, Pauker SG: Categorical and Probabilistic Reasoning in

Medical Diagnosis. Artif Intell 11:115, 1978.

Topol EJ: High-performance medicine: the convergence of human and

artificial intelligence. Nat Med 25:44, 2019.

489 Metabolomics
Jared R. Mayers, Mathew G. Vander Heiden

Metabolism, loosely defined, represents the sum of all biochemical

reactions involving small molecules with a molecular mass of ≤1000

Daltons within a given tissue, cell, or fluid. These small molecules are

collectively referred to as metabolites and are involved in the biochemical processes used to create macromolecules and fulfill the evolving

energy needs of a cell or organism. Metabolomics, then, represents

the measurement of metabolites, either qualitatively or quantitatively,

often as a way to gain insight into the metabolism of a cell, tissue, or

organism. No one experimental approach can characterize metabolism

in its entirety; metabolomics instead strives to measure a portion of

the metabolome, which consists of all metabolites in a given biological

sample at a given time.

A link to a time-specific context is common to all “-omics” techniques, but is particularly important in metabolomics. As metabolic

processes are highly connected and interdependent, with individual

metabolites often being involved in multiple pathways, levels of a

specific metabolite can vary in response to an alteration in either the

production or the consumption of that metabolite. Because significant

changes in metabolite levels can occur over a very short time frame,

the levels measured can be sensitive to perturbations either upstream

or downstream of a metabolite in a pathway. This sensitivity can make

measurement challenging, but it also makes metabolomics a powerful

tool with which to assess either acute or chronic changes in cells or

tissues. Indeed, the metabolome can be quite dynamic and reflective

of the current condition of the material being assessed, as it ultimately

represents an integration of outputs from the genome, epigenome,

transcriptome, and proteome (Fig. 489-1).

APPROACHES AND SAMPLING CONSIDERATIONS

■ UNTARGETED AND TARGETED METABOLOMICS

There are two distinct approaches to measuring metabolites in biological materials: untargeted and targeted metabolomics. These strategies

differ in whether a predetermined subset of metabolites is intentionally sought in a sample, with the choice of approach dictated by the

question under investigation. Regardless of the method utilized, it

is important to recognize that no single metabolomics technique is

comprehensive. Technical considerations heavily influence metabolite

measurement, and no one method is able to capture the entire metabolome. In this respect, metabolomics contrasts with some other -omics

techniques, like genomics or transcriptomics—i.e., in metabolomics, if a metabolite is not detected, it cannot necessarily be assumed to be absent.

Untargeted Metabolomics Untargeted metabolomics is the comprehensive analysis of all the measurable analytes in a sample, irrespective of their identity (Fig. 489-2). Among the benefits of this

approach is that it is agnostic in its measurement of the metabolome.

Thus, it allows for the discovery of novel or unexpected molecules for

further study. Coverage of the metabolome in an untargeted approach

is influenced by the techniques used for sample preparation, metabolite separation prior to detection, and the inherent sensitivity and

specificity of the analytical technique(s) employed (see “Metabolomics

Technologies,” below).

A major drawback of untargeted metabolomics is that molecules

of interest can be measured with less confidence or missed entirely

because this approach carries an inherent bias toward the detection

of high-abundance molecules. Handling and interpretation of data

also represent a major challenge, as each sample run generates large

amounts of data whose analysis can be both complicated and time

consuming. Identifying each metabolite measured requires database

searching, and further experimental investigation is often needed to

confirm the exact identity of a signal of interest. Finally, in most cases,

this technique yields only relative metabolite quantification, thereby

rendering it most useful for comparisons between biological samples.
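To illustrate what that database searching can look like in practice, the following simplified sketch (an assumption-laden example, not the chapter's workflow; the reference values are approximate [M+H]+ monoisotopic masses used only for demonstration) matches measured m/z peaks to candidate metabolites within a parts-per-million tolerance:

```python
# Simplified illustration of matching measured m/z peaks against a reference table.
# Reference values are approximate [M+H]+ monoisotopic masses, for demonstration only.
REFERENCE_MZ = {
    "glucose": 181.0707,      # C6H12O6 + H
    "lactate": 91.0390,       # C3H6O3 + H
    "glutamine": 147.0764,    # C5H10N2O3 + H
}

def match_peaks(measured_mz, tolerance_ppm=10.0):
    """Return candidate metabolite names for each measured m/z value."""
    matches = {}
    for mz in measured_mz:
        candidates = [
            name for name, ref in REFERENCE_MZ.items()
            if abs(mz - ref) / ref * 1e6 <= tolerance_ppm
        ]
        matches[mz] = candidates or ["unknown (requires further investigation)"]
    return matches

print(match_peaks([181.0712, 120.0655]))
# e.g., {181.0712: ['glucose'], 120.0655: ['unknown (requires further investigation)']}
```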

Targeted Metabolomics Targeted metabolomics involves the

measurement of a predefined group of chemically characterized

metabolites—typically dictated by a hypothesis or predetermined

platform—with the aim of covering a select portion of the metabolome.

The metabolites measured represent only a subset of those that would

be measured by an untargeted approach; thus a targeted approach

generates a much smaller data set in which individual metabolites

are detected with higher confidence (Fig. 489-2). Because the identity of each signal is known in advance, standards can be added to

provide absolute quantification of each metabolite measured in the

sample, although the use of targeted metabolomics to compare relative metabolite levels across samples is common. In addition, sample

preparation and chromatographic separation before measurement can

be optimized to improve detection of specific metabolites, enabling

assessment of less abundant molecules.

The key downside of targeted metabolomics is that information is

gained about only those metabolites targeted by the analytical method.

■ SAMPLING CONSIDERATIONS

FIGURE 489-1 [Schematic: Genome → Epigenome → Transcriptome → Proteome → Metabolome → Phenotype.] The metabolome is downstream of the outputs measured by other “-omics” technologies. Thus, the state of the metabolome can more closely reflect clinical and experimental phenotypes.

Regardless of the approach used, it is critical to consider potential sources of error that can influence the conclusions drawn from a metabolomic analysis. Because of the dynamic nature of the metabolome, numerous biological confounders inherent to the samples themselves can affect levels of the metabolites measured. For this reason, the inclusion of controls or reference populations to account for these

confounders can be critical for interpretation of the data. Established

biological confounders for patient-derived material include age, sex,

body mass index, fasting status and/or dietary differences, and comorbid conditions such as diabetes or smoking. For example, metabolites

commonly altered with respect to aging are those in antioxidant and

redox pathways as well as breakdown products of macromolecules. Sex

differences influence a number of different metabolites, most prominently those involved in steroid and lipid metabolism. Perhaps it is

not surprising that diet can also affect the metabolome, and fasting has

been shown to impact almost every category of metabolite frequently

measured in biological fluids.

Differences in sample handling and processing also influence

metabolite measurements. Work using metabolomics to analyze

material from large prospective cohort studies has shown that changes

in metabolite levels introduced by sample handling can lead to falsely

positive associations between specific metabolite changes and disease

risk. Specific considerations include the large geographic area of distribution from which patients are drawn—e.g., a sample, such as blood, is

collected locally and then exposed to variable conditions before being

sent to a central lab for further processing. Moreover, because of the

costs associated with obtaining and storing samples, often only one

sample is available for each individual.

Time is a key variable in metabolite measurements, and efforts to assess

the impact of sample handling and processing have led to improved

analysis pipelines. For example, comparison of metabolites measured

in samples undergoing immediate versus delayed processing can provide

insight into those metabolites most affected by pre-processing

storage under varying conditions. More specifically, because metabolism occurs on a very rapid time scale, some metabolite levels will

continue to change after collection even if the sample is stored on ice.

Therefore, metabolism is ideally halted or “quenched” immediately

via rapid freezing or chemical extraction, but practical considerations

involved in the collection of material from patients can sometimes

make quenching impossible.

Sequential metabolomic analyses of the same type of biological

material from a patient can explore how metabolite levels vary over

time in individuals. It is interesting that, when measured, many metabolites are found to be relatively stable. However, the extensive variability exhibited by some metabolites indicates that findings involving

those metabolites should be interpreted with caution.

Finally, the method of sample processing can affect which metabolites are extracted from the material and thus influence what is

measured.


FIGURE 489-2 Untargeted metabolomics strives to measure as much of the metabolome as possible within a given biological sample, whereas targeted metabolomics

focuses on measuring a predetermined subset of the metabolome. In untargeted metabolomics, a large number of signals corresponding to metabolites is generated, and

further investigation is often necessary to assign a particular signal to a specific metabolite. Targeted metabolomics allows investigators to definitively measure signals

that correspond to specific metabolites of interest.

METABOLOMICS TECHNOLOGIES

Metabolomics relies heavily on the intersection of instrumentation,

software, and statistical and computational approaches for measurement of metabolite levels and downstream data analysis. While the

development of new and emerging techniques to assess the metabolome is ongoing, the current, clinically applicable approaches can

be separated into two broad categories: nuclear magnetic resonance

(NMR)–based approaches and chromatography/mass spectrometry

(MS)–based approaches. Each of these two approaches has its own set

of advantages and disadvantages.

■ NUCLEAR MAGNETIC RESONANCE

NMR is a technique that, at its core, exploits intrinsic magnetic properties of atomic nuclei to generate data. Nuclei with an odd total number of protons and neutrons (such as ¹H, ¹³C, ¹⁵N, and ³¹P) have a non-zero spin, and this spin generates a magnetic field that can interact with

externally applied electromagnetic fields. NMR places compounds into a strong static magnetic field, which induces these smaller nuclear magnetic fields to align with the larger applied field. Samples are then exposed to a perpendicular electromagnetic field; the frequency of electromagnetic radiation needed

to flip the spin of a nucleus in the exact opposite direction represents

the frequency at which an atom “resonates” and can be measured. The

resonance frequency of a given atom is affected by adjacent atoms

and is ultimately unique for a given arrangement of atoms (i.e., each

metabolite). This distribution or “spectrum” of signals is measured and

recorded in an NMR experiment.
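The quantitative relation behind this resonance is the standard Larmor equation, which is not stated in the chapter but may help fix ideas; the ¹H gyromagnetic ratio quoted below is the textbook value:

```latex
% Larmor relation: resonance frequency of a nucleus with gyromagnetic ratio \gamma
% in a static magnetic field B_0
\nu_0 = \frac{\gamma}{2\pi}\, B_0
% For ^{1}\mathrm{H}, \gamma/2\pi \approx 42.58~\mathrm{MHz/T}, so in a 3\,\mathrm{T}
% magnet \nu_0 \approx 42.58 \times 3 \approx 128~\mathrm{MHz}.
```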

With respect to clinical applications, the primary benefits of

NMR-based approaches are that they are nondestructive and can be

performed on living samples, such as patients, cells or tissues. They

are also highly reproducible and require minimal sample preparation.

Measurements are necessarily quantitative as the signal measured

directly reflects concentration. These features ensure that multiple,

comparable measurements can be made in a given sample either at

a single point in time or across time. In addition, given that spins of

different elements require sufficiently disparate resonance-inducing

radio frequencies in order to be entirely distinguishable, multiple

elements can be assessed in a sample; this feature allows multidimensional cross-referencing of signals such as hydrogen and carbon. In an

untargeted analysis, these multidimensional data can then be used

for definitive metabolite identification, with comparison of results to

known databases where spectra for many metabolites in the human

metabolome have been systematically recorded.

Despite all these benefits, the primary challenge of NMR-based

approaches is a lack of sensitivity. Because the time required to detect a signal increases as concentration decreases, assessment of less abundant species becomes impossible or impractical. For example, while a typical NMR-based metabolomics analysis will return data on up to a couple of hundred metabolites at concentrations of >1 μM, the MS-based approaches discussed below can distinguish more than 1000 metabolites at concentrations one to two orders of magnitude lower (Table 489-1).

■ CHROMATOGRAPHY/MASS SPECTROMETRY

A distinguishing feature of chromatography/MS–based approaches is

that a multistep process that destroys the material is necessary to generate a sample for analysis. In addition, each step of the sample preparation

process involves decisions that influence the metabolites measured at

the time of analysis. In general, once a sample to be analyzed is prepared,

that material is subjected to a combined chemical and temporal separation of compounds via chromatography, with the output delivered to a

device for performance of mass-based detection (technically, measurement of a mass-to-charge [m/z] ratio)—i.e., mass spectrometry. Finally,

data collected by the mass spectrometer are analyzed (Fig. 489-3).

Sample Preparation Although extraction is occasionally part of NMR-based metabolite detection protocols, MS-based approaches almost uniformly require this initial sample-preparation phase.

This technique destroys the original sample by partitioning metabolites into distinct immiscible phases, such as polar and nonpolar.

These phases are then mechanically separated and processed further

for analysis. Given the nature of this extraction process, it is critical to

determine in advance the general class of metabolites to be measured.

This information will help to determine the optimal extraction protocol for specific types of metabolites of interest and to shape further

downstream decisions regarding the chromatography/MS technique

that also influence metabolite detection.

TABLE 489-1 Comparison of Nuclear Magnetic Resonance (NMR)–Based and Mass Spectrometry (MS)–Based Approaches to Metabolomic Analyses
Reproducibility: NMR, high; MS, lower.
Sensitivity: NMR, low (low μM); MS, high (low nM).
Selectivity: NMR, untargeted; MS, targeted >> untargeted.
Sample preparation: NMR, minimal; MS, complex.
Sample measurement: NMR, simple (single preparation); MS, multiple preparations.
Metabolites per sample: NMR, 50–200; MS, >1000.
Identification: NMR, easy using 1D or 2D databases; MS, complex (standards and additional analyses needed).
Quantitation: NMR, inherently quantitative (intensity proportional to concentration); MS, requires standards because of varying ionization efficiency.
Sample recovery: NMR, easy (nondestructive); MS, no.
Living samples: NMR, yes; MS, no.

In addition, depending on the metabolites to be analyzed and the method of separation and/or analysis used, extracted samples sometimes are processed further

in a preparative step called derivatization: extracted metabolites are

chemically modified by the addition or substitution of distinct, known

chemical moieties that facilitate separation or detection of types of

metabolites. By changing the chemical properties of metabolites, derivatization may improve stability, solubility, or volatility or facilitate

separation from closely related compounds, enhancing measurement

of specific metabolites.

Chromatography Chromatography is a ubiquitous approach used

in chemistry for the separation of complex mixtures. The mixture of

interest in a mobile phase is passed over a stationary phase such that

compounds in the mixture interact with the stationary phase and

transit through that stationary phase at different speeds, allowing their

consequent separation. Two general types of chromatography are typically used in metabolomics.

LIQUID CHROMATOGRAPHY Liquid chromatography–mass spectrometry (LC-MS) is the most commonly used approach in MS-based

metabolomics. In this case, chromatography is characterized by a

mobile phase that is a liquid and a stationary phase that is a solid. In

liquid chromatography in particular, the choice of the solid and liquid

phases can dramatically influence the types of compounds separated

for input into the mass spectrometer. In general, LC-MS metabolomics

is highly sensitive and versatile in allowing detection of a broad range

of metabolites. A downside, however, is variability in exact separation

timing, especially between different instruments; which metabolites

are measured is impacted by the chromatography used and how well

molecules are separated.

GAS CHROMATOGRAPHY Gas chromatography–mass spectrometry

(GC-MS) involves chromatography in which the mobile phase is a gas.

In contrast to LC-MS, GC-based approaches have a narrower range

of applications because only volatile metabolites that enter a gaseous

phase are separated. When combined with appropriate derivatization,

GC-MS is a robust way to detect many organic acids, including amino

acids, and molecules of low polarity, such as lipids. GC-MS is more

reproducible than LC-MS across platforms and requires less expensive instrumentation and less specialized training, but it also typically

measures a much more restricted range of metabolites in a sample than

does LC-MS.

Mass Spectrometry Once the metabolites in a sample have been

separated by chromatography, they are sent into the mass spectrometer

for analysis and measurement. The first step in this stage of the process

is to generate charged ions, as mass spectrometers measure compounds

on the basis of their m/z ratio. Charge can be imparted through various

techniques, although most commonly it is attained by either applying a

high voltage to a sample or striking it with a laser.

A number of different types of mass spectrometer can be employed

for metabolomics. Three of the most commonly available types are

discussed below.

[Figure 489-3 workflow: extraction → derivatization → chromatography → mass spectrometry data analysis.]

FIGURE 489-3 Metabolite measurement by chromatography/mass spectrometry–based approaches involves multiple steps, and decisions made at each step influence what is measured. First, metabolites are extracted from a biological sample in a manner that is destructive of the original sample. This process stops biochemical activity and creates metabolite samples that can be analyzed, sometimes after a chemical derivatization step that alters a subset of metabolites in a manner that facilitates their downstream analysis. Second, metabolites in the sample are separated via chromatography. Finally, the chromatographically separated compounds are analyzed by mass spectrometry. Each signal detected corresponds to a metabolite’s characteristic mass per unit charge while the amplitude of that signal reflects the abundance.

TANDEM MASS SPECTROMETRY Tandem MS relies on three sets of quadrupole magnets arranged in series. The power of this arrangement lies in its specificity through two sequential mass analyses of the same

starting compound. In the first quadrupole, the “parent” or full ion

is measured before being bombarded by an inert gas in the second

quadrupole; this process fragments the compound into characteristic

smaller “daughter” ions. The third quadrupole then measures these

daughter ions.

TIME-OF-FLIGHT MASS SPECTROMETRY While there are multiple

types of time-of-flight (TOF) mass spectrometers, they all operate on

similar principles. Most simply, lighter metabolites travel faster and

heavier metabolites travel more slowly. TOF machines have high mass

accuracy and sensitivity while also acquiring data quickly.
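The standard time-of-flight relation, included here as an aside rather than taken from the chapter, shows why arrival time encodes the mass-to-charge ratio:

```latex
% An ion of mass m and charge z accelerated through a potential V gains kinetic energy
% zeV = \tfrac{1}{2} m v^2, so its flight time over a tube of length L is
t = \frac{L}{v} = L \sqrt{\frac{m}{2\,z e V}} \;\propto\; \sqrt{m/z}
% Lighter ions therefore arrive at the detector before heavier ones.
```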

ION TRAP MASS SPECTROMETRY Ion trap mass spectrometers, of

which the orbital trap is a subtype, offer perhaps the highest degree of

flexibility when it comes to MS-based metabolomics. In general, these

machines can select for a specific mass range of metabolites at multiple

levels, first by filtering with a single quadrupole and then by trapping

and accumulating metabolites of a particular mass or range of masses.

This accumulation can be applied to low-abundance compounds,

allowing increased sensitivity. It also allows repeated fragmentation

of metabolites (called MSⁿ) to produce characteristic “daughter” ions,

increasing the specificity of the analysis. Given this versatility coupled

with high mass accuracy, the development of these machines is advancing rapidly; however, access to the latest versions can often be limited

by cost.

CURRENT CLINICAL APPLICATIONS

Tests to assess small molecules are ubiquitous and well established

throughout medicine. These include assays to measure select metabolites of known clinical relevance, such as glucose, lactate, and ammonia.

Of note, many standard tests assess these metabolites one at a time;

however, metabolomics can allow the assessment of many metabolites

in a sample and provide more information on metabolic state at a given

point in time. In some cases, metabolomics is used to detect molecules

for which there is not a robust single analyte test or when multiple

species measured in a sample might provide new information. Here

we will focus specifically on several applications of metabolomics techniques in current clinical practice.

■ MAGNETIC RESONANCE SPECTROSCOPY

Magnetic resonance spectroscopy (MRS) is an adaptation of magnetic

resonance imaging (MRI), a widely used technology in clinical practice. MRI, at its core, is essentially proton (¹H) NMR with the resulting

data rendered spatially to generate an image. Recall that NMR is nondestructive and can be applied to living samples. MRS, then, is a capability built into almost every MRI machine. In practice, radiologists

can focus on specific volumes of interest within a patient’s imaging and perform additional sequences to obtain an NMR spectrum from that volume, allowing identification and quantification of specific metabolites within it. With this approach, a number of different

metabolites across diverse classes, including lipids, sugars, and amino

acids, can be measured at a given time.

Extensive work has correlated different biological processes with

altered levels and/or ratios of metabolites measured via MRS. One

well-established application is in the diagnosis of brain masses. More

specifically, N-acetylaspartate (NAA) is an amino acid derivative that

is abundant in neurons, whereas choline is a metabolite whose level,

as measured by MRS, correlates with cellularity and/or proliferation.

Thus, an increase in the ratio of choline to NAA (and even loss of NAA

signal entirely) correlates with cancer; tumors biologically are associated with the properties of increased cellularity from proliferation and

the concurrent exclusion of normal neurons. A different process—for

example, a brain abscess—does not result in increased choline levels

(which instead may actually decrease), but does exclude neurons,

resulting in an isolated NAA decrease. Metabolites such as lactate can

also be helpful, depending on the clinical context, in providing insight

into the metabolism of a tumor or identifying areas of early hypoxic

brain injury after a stroke. Finally, among the several amino acids that

can be measured, high levels of glutamine/glutamate can be helpful in

a patient with altered mental status as changes in these amino acids

are associated with hyperammonemia. (Glutamate serves as the central nervous system sink for ammonia, generating glutamine in the

process.)

■ NEWBORN SCREENING PROGRAMS

Newborn screening programs are used to identify diseases within the

first few days of life such that they can be treated or managed with early

intervention. Among the classes of disease targeted by newborn screening programs are many inborn errors of metabolism, which often lead

to changes in the levels of specific metabolites in blood or urine. One

of the first newborn screening programs tested for phenylketonuria,

which results from the inability to metabolize phenylalanine and causes

high serum and urine levels of particular metabolites. Since that time,

the panel used by programs throughout the United States and around

the world has expanded dramatically. The general protocol is to collect

a blood sample from infants in the first few days of life (often by heel

prick on a piece of paper). These samples are sent to a central lab for

analysis, which typically includes metabolomics measurements with

targeted LC–tandem MS. Specific inborn errors of metabolism are suggested by abnormal levels of a given metabolite or set of metabolites.

■ METABOLITE MEASUREMENTS IN CHILDREN

AND ADULTS

Outside the window of newborn screening, direct clinical measurement of metabolite levels is also used in pediatric and adult patients.

In these cases, clinical samples such as serum, cerebrospinal fluid, or

urine are typically subjected to targeted LC–tandem MS to measure

metabolites such as amino acids, acylcarnitines, and fatty acids. These

measurements can help diagnose milder cases of inborn errors of

metabolism that may have been missed by newborn screening. They

can also help identify secondary metabolic defects, such as those that

are related to nutritional deficiencies or are acquired in the setting of

additional pathology. For example, these measurements are useful in

determining the etiology of noncirrhotic hyperammonemia exposed

by a catabolic stressor such as sepsis in a patient with a previously

unknown subclinical or acquired urea-cycle defect.

MS-based metabolomics is used by various athletic organizations for

detection of metabolites associated with banned substances and by the

pharmaceutical industry for assessment of levels of pharmaceuticals

and their metabolites in both blood and tissues. Such analyses can

provide key pharmacokinetic information to guide drug dosing and

illuminate toxicology. These approaches can also be useful in clinical practice. For example, chronic pain and its management remain

a challenge, and the sequelae of opiate/opioid use and abuse are of

concern to many providers, their patients, and their patients’ families.

Therefore, many electronic medical records systems strive to ensure

appropriate and consistent patient access to pain medications, while

providers may need a means to ensure that patients are adhering to

their prescribed regimens. One way to monitor drug use is to perform

targeted LC–tandem MS for detection of specific drug metabolites in

patients’ urine. This approach is more sensitive than first-generation

immunoassays and can detect a range of metabolites associated with

other drugs beyond the one prescribed. Given that the first-generation

immunoassays also often rely on confirmatory MS testing, upfront

metabolomics reduces lab turnaround time and may also reduce costs

by limiting multiple tests on the same sample.

EMERGING AND EXPERIMENTAL CLINICAL

APPLICATIONS

The current clinical applications of metabolomics are largely limited

to the indications described above. However, many ongoing efforts are

aimed at expanding the use of metabolomics for detection of biomarkers that can help with disease diagnosis or prognostication.

■ METABOLITES AS BIOMARKERS OF DISEASE

There has been increasing work in prospective human cohort studies

on the use of metabolomics, primarily MS-based approaches, to empirically identify small groups of metabolites whose altered levels are

