Editorial Views  |   January 2000
Use and Abuse of Neonatal Neurobehavioral Testing
Author Notes
  • Associate Professor of Anesthesia
  • Harvard Medical School
  • Brigham and Women’s Hospital
  • Boston, Massachusetts
  • Professor Emeritus of Pediatrics
  • Harvard Medical School
  • Children’s Hospital
  • Boston, Massachusetts
Article Information
Anesthesiology January 2000, Vol. 92, 3.
Accepted for publication September 14, 1999.
ONE of the primary concerns of obstetric anesthesia is its safety for both mother and neonate. Much has been written about this issue, in particular about the consequences for the neonate. Clinical and laboratory measurement scales, including Apgar scores, 1 umbilical venous and arterial blood acid-base balance analysis, 2 and neonatal neurobehavioral testing scales, 3 have been developed to assess neonatal well-being. In 1982, a report by Amiel-Tison et al. 4 was published in ANESTHESIOLOGY that described an assessment scale called the Neonatal Neurologic and Adaptive Capacity Score (NACS). The NACS was proposed as a simple, noninvasive, quick neurobehavioral examination to assess subtle effects of drugs on neonates and to distinguish such drug effects from birth trauma, perinatal asphyxia, or neurologic disease. This publication was accompanied by a critical editorial that claimed the test to be deficient as a valid research instrument. 5 It has now been almost 20 yr since the NACS was described, and initial criticism notwithstanding, it has been widely embraced by the obstetric anesthesia community and used worldwide by investigators examining neonatal effects of peripartum medications. In this issue of ANESTHESIOLOGY, Brockhurst et al. 6 conduct a systematic review of the NACS in obstetric anesthesia research and conclude that the reliability and validity of this test have still not been established. Here we examine this issue in greater depth.
Why has the NACS become so popular? The answer: Simplicity. The test is easy and quick (< 5 min per examination), it can be performed with minimal training, it is non-noxious (thus easily performed in the presence of parents), and it lends itself to simple statistical analysis. More traditional measures of neonatal performance, such as the Brazelton Neurobehavioral Assessment Score, 3 require approximately 20 min for a trained examiner to perform; include a large number of items (or clusters), each scored on a nine-point scale; and involve statistical analysis that can be complex. Many studies using the Brazelton Neurobehavioral Assessment Score also include testing at ages 14 and 30 days, allowing for integration into a variety of infant developmental paradigms. 7 Such follow-up testing is virtually never performed with the NACS. In contrast, the NACS has 20 items, each scored as 0, 1, or 2, for a total possible score of 40. Individual items are summed, and a single score is assigned to the neonate. No special training or certification is required to perform the NACS. This enticing simplicity was part of the editorialist’s original concern in 1982: “For such an instrument, speed of administration is hardly the primary concern (should it be clinically?), but rather its ability to find or not to find effects of the variables of concern on the functioning neonate.”5 As noted by Brockhurst et al., virtually all of the studies using the NACS show no differences between groups of infants. In the few studies in which differences are noted, the circumstances are such that they would be expected to occur and expected to be obvious, e.g., general versus regional anesthesia for cesarean delivery. Studies using the NACS to assess neonatal effects of maternally administered local anesthetics or opioids for either vaginal or cesarean delivery have yielded inconsistent results, frequently showing no differences between groups or differences that may be questioned on statistical grounds.
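The scoring arithmetic described above is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function name `nacs_total` and the two example neonates are hypothetical, and the sketch does not model the content of any individual NACS item. It shows how the 20 item scores collapse into one total, and that two different item profiles can yield the same total.

```python
from typing import Sequence

N_ITEMS = 20  # number of NACS items, as stated in the text


def nacs_total(item_scores: Sequence[int]) -> int:
    """Sum 20 item scores (each 0, 1, or 2) into a single total from 0 to 40."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("each item must be scored 0, 1, or 2")
    return sum(item_scores)


# Two hypothetical neonates (not real data) who each lose 4 points,
# but on different items and in different ways.
neonate_a = [2] * 16 + [0, 0, 2, 2]              # two items scored 0
neonate_b = [2] * 12 + [1, 1, 1, 1] + [2] * 4    # four items scored 1

print(nacs_total(neonate_a), nacs_total(neonate_b))  # 36 36
```

Both hypothetical infants receive the same single number, which is the appeal of the score and also its limitation: the total alone does not say which items lost points.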
Why was the test so controversial? It is noteworthy that the original publication of the NACS was accompanied by not one, but two editorials. One editorial, by a researcher prominent in infant developmental psychology, criticized the NACS as being statistically flawed, improperly conceived, overly simplistic, and inappropriate as a research tool. 5 The other editorial, by the then Editor-in-Chief of ANESTHESIOLOGY, John Michenfelder, lamented the difficult position of an editor considering a manuscript for which there are widely varying recommendations by the editorial review board. 8 Michenfelder noted that outright rejection might result in premature condemnation, whereas publication requires that the readers be informed of the limitations of the work. He concluded that “determination of the validity, sensitivity, and merits of the examination will follow.”8 In other words, punt: let the chips fall where they may, and challenge the scientific community to determine whether the initial criticisms were valid. Brockhurst et al. conclude that such validation is still lacking despite widespread use of the NACS examination, and we concur. Widespread use of a test is not evidence of validity, and investigators should use caution and discretion in interpreting its results. Moreover, in some instances, this test has been used (or misused) on the assumption that validation has been established. In our opinion, this misuse has resulted in some interesting conclusions, examples of which follow.
Consider the definition of a “normal” NACS result. The authors of the original article on the NACS arbitrarily claim that a score of ≥ 35 (of a possible 40) is “normal.”4 They also acknowledge that validation of this figure requires additional data. Such data do not exist. No study to our knowledge has correlated specific NACS results, a score of 35 or otherwise, with any other measure of neonatal or early childhood performance. The consequences (short-, medium-, or long-term) for neonates scoring, e.g., 25, 30, 35, or otherwise on the NACS are not known. A recent meta-analysis compared the effects of labor epidural analgesia using ropivacaine versus bupivacaine on neonatal outcome. 9 The NACS was performed on all infants at 2 and 24 h after birth; the results were analyzed by a comparison of median scores and a comparison of the number of infants with scores > 35 versus < 35. No differences in median NACS were noted at 2 or 24 h, but there were more infants at 24 h (not at 2 h) with NACS > 35 in the ropivacaine group. Based on this finding, advertisements for obstetric use of ropivacaine claim better neonatal performance versus bupivacaine. Given the absence of any meaningful justification for a NACS of 35 as an appropriate measure of “normality” and the absence of any difference in median NACS at any time in that meta-analysis, this claim must be viewed with caution: caveat emptor.
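The gap between a comparison of medians and a comparison of counts above a cutoff can be illustrated with a small Python sketch. The scores below are invented for illustration only and are not data from the meta-analysis or from any other study; the group names and the helper `count_above` are likewise hypothetical. The sketch shows that two groups can share an identical median yet differ in the number of infants scoring above 35.

```python
from statistics import median

CUTOFF = 35  # the cutoff discussed in the text

# Hypothetical NACS totals for two groups of nine neonates (NOT real data).
group_a = [30, 32, 33, 34, 35, 35, 35, 35, 36]
group_b = [34, 34, 35, 35, 35, 36, 36, 37, 38]


def count_above(scores, cutoff):
    """Count how many scores are strictly greater than the cutoff."""
    return sum(1 for s in scores if s > cutoff)


print(median(group_a), median(group_b))      # 35 35
print(count_above(group_a, CUTOFF),          # 1
      count_above(group_b, CUTOFF))          # 4
```

In this invented example the medians are equal while the counts above the cutoff differ, so the apparent difference depends entirely on where the line is drawn, and that line has not been validated.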
Now consider the analysis of individual portions of the NACS. An overall score of 30 or 35 or 38 does not reveal which items resulted in lost points, just as an Apgar score of 6 or an American Society of Anesthesiologists Physical Status classification of III does not reveal the specifics of the underlying abnormalities. Very few studies using the NACS report individual subscores; usually only the total NACS is reported. Because the NACS has items related to habituation, active tone, passive tone, and reflexes, it may be useful to know which items, if any, are consistently affected by any perinatal intervention. Such subgroup analysis might allow the NACS to differentiate drug effects from insults such as birth trauma or perinatal asphyxia. Nonetheless, the original report on the NACS 4 does not tell how such distinctions are to be made, and Brockhurst et al. note that we still do not know how to use the NACS to make such distinctions. Consider a recent publication claiming that epidural analgesia reduces the efficacy of breast-feeding. 10 This diatribe against epidural analgesia assumes (based on no data and no specific examples) that even infants scoring in the “normal” range (as if we know what normal is) on neurobehavioral tests may have specific subgroup deficiencies that could impair breast-feeding. A curious finding indeed, because so few studies actually report subgroup scores on the NACS. Moreover, the evidence that epidural analgesia actually has any effect on breast-feeding outcomes is anecdotal at best. As the author of that article readily admits, no studies examined breast-feeding specifically as an outcome correlated with intrapartum analgesia. Rather, the admonition against epidural analgesia is based on a conjecture about what might occur if certain items are depressed, despite not knowing which items these are and whether depression of any specific items (such as muscle tone), transiently or otherwise, actually has any effect on breast-feeding. Again, caveat emptor.
What can one conclude? Babies are complex and subject to a constellation of parental, socioeconomic, and environmental factors that have the potential to modify any intrauterine effects that may have occurred. To hope that any one assessment tool (e.g., an Apgar score, acid-base balance, or, in this context, neurobehavioral testing) can predict developmental outcome (e.g., breast-feeding success, early parental bonding, and growth, or later outcomes such as learning difficulties, behavioral problems, school performance, intelligence quotient, or even adult personality qualities) is overly optimistic. A statistical adage is relevant here: A statistically significant difference is only a difference if it makes a clinically important difference. One must first show, in a scientifically rigorous manner, that meaningful outcomes relevant to families and society are actually affected by intrapartum analgesia before the results of machinations like the NACS are to be taken seriously. The publication of the NACS in 1982 was accompanied by strong claims of lack of validity and applicability. The review by Brockhurst et al. in this issue of ANESTHESIOLOGY claims that additional work is still necessary to establish this validity. For now, the NACS will certainly continue to appear, like a barnacle on a ship’s masthead, in many studies of obstetric anesthetics. If the NACS does nothing else, at least it forces us to remember that neonatal concerns are an important part of obstetric anesthesia. That in itself is a worthwhile goal.
1. Apgar V: A proposal for a new method of evaluation of the newborn. Curr Res Anesth 1953; 32:260–7
2. Bax M, Nelson KB: Birth asphyxia: A statement. Dev Med Child Neurol 1993; 35:1022–4
3. Als H, Tronick E, Lester BM, Brazelton TB: The Brazelton Neonatal Behavioral Assessment Scale (BNBAS). J Abnorm Child Psychol 1977; 5:215–31
4. Amiel-Tison C, Barrier G, Shnider SM, Levinson G, Hughes SC, Stefani SJ: A new neurologic and adaptive capacity scoring system for evaluating obstetric medications in full-term newborns. ANESTHESIOLOGY 1982; 56:340–50
5. Tronick E: A critique of the Neonatal Neurologic and Adaptive Capacity Score [Editorial]. ANESTHESIOLOGY 1982; 56:338–9
6. Brockhurst NJ, Littleford JA, Halpern SH: The Neurologic and Adaptive Capacity Score: A systematic review of its use in obstetric anesthesia research. ANESTHESIOLOGY 2000; 92:237–46
7. Brazelton TB: Saving the bathwater. Child Dev 1990; 61:1661–71
8. Michenfelder JD: Accept, revise, reject or punt: An example of the latter [Editorial]. ANESTHESIOLOGY 1982; 56:337
9. Writer WDR, Stienstra R, Eddleston JM, Gatt SP, Griffin R, Gutsche BB, Joyce TH, Hedlund C, Heeroma K, Selander D: Neonatal outcome and mode of delivery after epidural analgesia for labour with ropivacaine and bupivacaine: A prospective meta-analysis. Br J Anaesth 1998; 81:713–7
10. Walker M: Do labor medications affect breastfeeding? J Hum Lact 1997; 13:131–7