The vertical motion history of disk stars throughout the Galaxy
Abstract
It has long been known that the vertical motions of Galactic disk stars increase with stellar age, commonly interpreted as vertical heating through orbit scattering. Here we map the vertical actions of disk stars as a function of age ( Gyrs) and across a large range of Galactocentric radii, , drawing on APOGEE and Gaia data. We fit as a combination of the vertical action at birth, , and subsequent heating that scales as . The inferred birth temperature, is for , consistent with the ISM velocity dispersion; but it rapidly rises outward, to for , likely reflecting the stars’ birth in a warped or flared gas disk. We find the heating rate to be modest and nearly constant across all radii, . The stellar age dependence gently grows with Galactocentric radius, from for to at kpc. The observed relation at all radii is considerably steeper () than the time dependence theoretically expected from orbit scattering, . We illustrate how this conundrum can be resolved if we also account for the fact that at earlier epochs the scatterers were more common, and the restoring force from the stellar disk surface mass density was low. Our analysis may reinstate gradual orbital scattering as a plausible and viable mechanism to explain the agedependent vertical motions of disk stars.
Subject headings:
Galaxy: kinematics and dynamics — Galaxy: evolution — Galaxy: disk — Galaxy: structure — methods: statistical — methods: dataanalysis1. Introduction
In the context of CDM hierarchical cosmogony, galaxy formation started out vigorously, with a rapid gas inflow and frequent mergers (e.g., whi78; bro04; bro12; vog14). During this period, stars formed from highly turbulent, clumpy ISM with large velocity dispersion and were born with kinematically hot orbits (bou09; for12), as also borne out by high redshift observations (e.g., for09; win15). For the Milky Way, this early phase of vigorous evolution faded 810 Gyrs ago, giving way to more gradual gas acquisition and an extended period without any major mergers. Since then, gas increasingly settled into a thin disk before forming stars, resulting in ‘‘upsidedown’’ formation of the main, extended^{1}^{1}1This component is loosely referred to as the low or “thin” disk; it is this disk “component” that is the subject of this work. stellar disk (bir13; sti13; gra16). At present, the starforming molecular gas in the Milky Way disk has a small velocity dispersion (sta89; sta05; sta06; aum09; mar14; aum16), and so do the stars that form from it. Subsequently, stars are bound to acquire more vertical motion over time through a variety of dynamical processes. Indeed, vertical thickening of spiral disks seems prevalent throughout the cosmos (e.g., yoa06; jur08). The present day distribution of vertical motions, e.g., characterized by their velocity dispersion, in stars of different ages and Galactocentric radii therefore reflects a combination of their birth “temperature” and subsequent heating. Constraining and understanding the vertical motions of Galactic disk stars therefore provides a key test of the processes presumed to set the vertical structure of disks in general.
Several different physical processes may have contributed to the vertical heating of the Galactic stellar disk, causing either rapid “nonadiabatic heating”, or more gradual “adiabatic heating”. Cosmological simulations have shown that galaxies of the Milky Way’s size are frequently experiencing minor mergers (qui93; wal96; vel99; kaz09; hou11; gom13; don16; moe16), external perturbations that can heat up the disk. But there are also more gradual internal heating mechanism: classically, giant molecular clouds (GMCs) act as scatterers that could heat up the disk either directly or by redirecting some of the inplane heating (e.g., through transient spiral arms or the Galactic bar) to the vertical direction (spi53; bar67; lac84; car85; car87; jen90). Inplane random motions can also be converted to vertical motions in the stellar disk through bending waves (e.g., sha10; fau14; deb14; wid14).
But the interpretation of disk stars’ vertical velocity dispersion, as disk heating is often complicated by the overall secular evolution of the disk. For instance, the gradual, adiabatic increase in the midplane baryon density will cause an increase in (e.g., bah84; van88; jen92; vil10). And radial migration of stars (sel02; sch09b; min10) also affects the vertical disk structure: it should cause more extended vertical motion at reduced velocity dispersion for stars that move outward, and the opposite effect for stars that move inward (loe11; min12; ros13; mar14; ver14; aum17).
Traditionally, studies have characterized the effects of vertical disk heating through the agevelocity dispersion relation (AVR) (e.g., str46; wie77; qui01; sea07; sou08; aum09; san18). The GenevaCopenhagen Survey (GCS) provides the current stateoftheart for such a relation in the solar neighborhood (nor04; hol07; hol09; aum09; cas11). Solar neighborhood studies basically agree that the presentday vertical velocity dispersion among stars of age scales approximately as . But the interpretation of this scaling is still under debate, as the simplest models of heating through a timeindependent population of scatterers would imply a different evolution of vertical motions (lac84; han02). And with only local information at hand, the various heating mechanisms, when combined with the effects of radial migration, may have degenerate observational signatures.
The combination of Gaia DR2 (gai18; lin18), of the ongoing largescale spectroscopic surveys, such as APOGEE (maj17), Galah (des15) and GaiaESO (smi14), and of consistent stellar age estimates across a wide range of Galactocentric radii (e.g., mar15; nes16) make it now possible to characterize and interpret the history of vertical motions among (low) disk stars throughout the Galaxy. This is what we set out to do in this paper.
It seems sensible to characterize the vertical motions of stars by their vertical actions, , rather than their vertical velocities or velocity dispersion, . This is mainly because the vertical action is an adiabatic invariant under any gradual changes of the potential reflecting the growth of the Galaxy, and an approximate invariant under radial migration through churning (car87; sel13), as the vertical motions are only weakly coupled to inplane resonances. In contrast, will change in a growing potential and under radial migration, with the scaleheight h changing in compensation such as to conserve . Note that for the simple case of harmonic vertical oscillations, corresponds to . Therefore, mapping the presentday as a function of stellar age and current Galactocentric radius, may make it easier to interpret this distribution in terms of: (a) with what “vertical birth temperature” did star start out, and (b) what subsequent heating (defined as an increase in ) did they incur subsequently.
Specifically, we combine here a red clump sample derived from APOGEE in tin18a with their Gaia DR2 proper motions (lin18), which yield a large number of stars across the Milky Way disk with precise 6D phase space coordinates, from which their vertical actions can be reliably estimated. Furthermore, precise stellar ages () can be derived for this sample as we will show. A global () study of the vertical structure of the Milky Way main stellar disk^{2}^{2}2In this study, we use the term main, extended or thin stellar disk to describe the low sequence of the Galactic disk (e.g., rix13; hay15), and will use these terms interchangeably. Similarly, the thick disk in this study describes the enhanced disk and does not necessarily mean the geometrical thick disk. also requires careful modeling of the selection function.
The paper is organized as follows. In Section 2, we present and derive precise distances and actions for the red clump sample, as well as precise ages, inferred from a datadriven model drawing asteroseismic training data. In Section 3, we introduce a physical model of vertical action which capture the full distribution of vertical action at different radii and ages. We also describe the derivation of the selection function in vertical action, which turns out to be important in the subsequent Bayesian inference. In Section 4, we will discuss the inference posterior of our models and study how the vertical heating and birth temperature of stars depends on the Galactocentric radii. We will discuss the results in the context of a simple epochdependent but analytic scattering model in Section 5 and conclude in Section 6. Throughout this paper, we will assume the following units: velocity [km/s], distance [kpc], age/time [Gyr], and vertical action [kpc km/s].
2. Data
In this study, we adopt the APOGEE red clump sample derived in tin18a, and crossmatch the sample with Gaia DR2 (gai18; lin18) astrometry to obtain proper motions. We do not use the Gaia DR2 data for distance determinations, as for stars beyond a few kpc the Gaia 1/parallax () distance precision degrades drastically and becomes increasingly biased (bai18). Fortunately, red clump stars are excellent standard candles regardless of their metallicities and stellar ages, and mitigate this problem; across the Milky Way disk, they can yield photometric distances precise to about (bov14; haw17), where is the heliocentric distance.
The red clump sample was selected among APOGEE DR14 targets through a deep learning algorithm which optimizes an empirical datadriven mapping from the APOGEE normalized spectra to their corresponding asteroseismic period spacing and frequency separation , adopting the vra16 asteroseismic data as the training set (tin18a). haw18 and tin18a showed that the inferred mixed mode period separation can yield a more pristine red clump sample with only contamination for APOGEE, at least two times lower in contamination than previous approaches (e.g., bov14); see tin18a for more details.
To estimate photometric distances, we crossmatch this red clump sample with the allWISE catalog to get the WISE magnitude. We obtain the (small) extinction correction from the color, assuming (haw17). From the color, we also select stars that are less extincted and adopt this subset to fit for the absolute magnitude , as a quadratic function of the APOGEE’s . The variance around this fit shows a spread of , corresponding (see also bov14; haw17). For each star, we estimate its photometric distance through the predicted from the fitted relation and correct for the extinction estimated from its color. Fig. 1 shows the comparison between our photometric distance estimates to Gaia’s distances. The Figure illustrates that 1/ is indeed a biased distance indicator at large heliocentric distance, as expected from bai18, which is overplotted in blue in the Figure.
To estimate stellar ages, we build a spectroscopic age estimator that is trained on a set of APOKASC2 determined red clump stars with precise asteroseismic age estimates (pin18) that are also in the tin18a catalog. We only select stars that have asteroseismic ages Gyr. It has been well established that spectra from giants contain information about stellar ages through their massdependent dredge up, which affects the photospheric C/N ratios (mar15; nes16). Rather than making reference to C/N directly, we apply neural networks to optimize such the mapping from spectra to asteroseismic ages (see tin18a), and then propagate asteroseismic ages to our red clump sample. While we infer stellar ages here, instead of asteroseismic and as in tin18a, the algorithm is exactly the same: we establish an empirical, highdimensional, mapping from the APOGEE normalized spectra to stellar ages via a fully connected neural network with three hidden layers and 50 neurons each; the neurons are connected with a Sigmoid activation function, and the model is optimized using pytorch. We crossvalidated our inference using a random stars with asteroseismic ages, comparing them to our model predictions trained on the other APOKASC stars, as illustrated in Fig. 2. The crossvalidation indicates that our inferred spectroscopic ages are largely unbiased, at least from Gyr, and precise to (dex); here we assume the half of the 1684 percentile intervals as .
Nonetheless, we caution that, at least with the training set at hand, this empirical mapping method which presumably exploits the C/N information in spectra, does not seem to have enough spectral information to distinguish stars older than Gyr and tend to be biased in the oldest regime; stars older than Gyr are often incorrectly inferred as Gyr (as can be seen at the oldest age regime in the bottom panel of Fig. 2). A similar problem also happens for the few stars younger than Gyr. But since our study only focuses on stars that are younger than Gyr, and the red clump sample is strongly biased against stars older than Gyr, we find that this bias only minimally affects our inference. But we caution audiences who might want to adopt the published catalog in this study as there is a lack of stars older than Gyr due to this bias. Similarly, at the young end, we found that only about of our sample are in the very young regime ( ¡ 0.2) where the inferred stellar ages might be biased. Further, a small bias in log scale at the lower end, i.e., moving stars from 0.4 Gyr to 0.6 Gyr (see Fig. 2), should have a negligible impact on our inferences because the heating rate is dominated by the more extended evolutionary behaviors of the stars (.
In this analysis we are interested in the disk heating mechanisms that were effective after the initial vigorous and turbulent formation phase – extending to a redshift of , about 8 Gyrs ago – when stars were presumably born kinematically hot (nog98; bou09; bro04; bro06; bir13; gra16). Most enhanced stars in the disk probably formed in that period. In the subsequent analysis, we limit ourselves to stars with inferred spectroscopic ages Gyr and in the low sequence of the Milky Way disk, with , where we adopt the APOGEE abundances from a reanalysis of the APOGEE spectra using the spectral fitting algorithm the payne (tin18c). As shown in mar14 (e.g., their Figure 5), such disk stars are likely to be born cold, at least for Gyr (but see also bro04; bir13, for a different view). By restricting ourselves to these stars, we deem a scenario sensible, in which they were born “cold” (with low ) and were subsequently heated. We also restrict ourselves to stars with a Galactocentric height of kpc, as APOGEE is a lowlatitude survey. As we will see, this makes the evaluation of selection function more straightforward.
Applying all these criteria leaves a total of , red clump stars. The full catalog used (after all these criteria are imposed) in this study, including their phase space actions, and stellar age properties is electronically available as Table 1, and a portion of which is shown in Table 1 below.
As radial migration in the Galactic disk seems to be moderately strong (see e.g., fra18, and references therein), the presentday Galactocentric radius, is unlikely to be the stars’ birth radius, nor necessarily the mean Galactocentric distance at which they have experienced most of their heating; the latter is the quantity pertinent in building a global disk heating model. But global chemical evolution models including radial migration (fra18) allow us to estimate from a star’s age and metallicity [Fe/H]; this is effectively “chemicaltagging” in a weak sense (e.g., fre02; tin15a).
APOGEE ID  [Fe/H]  [Mg/Fe]  Gaia Source ID  Gaia RA []  Gaia Dec []  Age [Gyr]  [kpc] 
2M00000446+5854329  0.115  0.044  422775384964691328  0.01860  58.90915  3.66  3.63 
2M00001071+6258172  0.447  0.131  430057759718424064  0.04468  62.97150  1.09  4.88 
2M00001199+6114138  0.007  0.148  429484398762416384  0.04996  61.23720  5.40  1.95 
2M00001242+5524391  0.019  0.139  420487782302882560  0.05175  55.41084  4.59  3.87 


Apogee ID  [kpc]  [kpc]  [kpc]  [kpc]  [kpckm/s]  [kpckm/s]  [kpckm/s] 
2M00000446+5854329  10.33  0.179  6.65  8.49  1.121  2462.3  0.957 
2M00001071+6258172  11.30  0.089  13.01  12.15  26.346  2347.2  0.472 
2M00001199+6114138  9.24  0.007  3.47  6.35  22.459  1936.2  6.615 
2M00001242+5524391  10.46  0.424  4.11  7.28  71.747  2086.9  4.361 
The original APOGEE red clump sample was derived in tin18a, and here we provide a valueadded version of that catalog, restricting to Gyr, kpc and the low stars. Column (1) shows the APOGEE ID; columns (2)(3) show the APOGEE abundances measured using the payne (tin18c); columns (4)(6) indicate the Gaia source id, RA and Dec; column (7) shows the stellar ages inferred from the spectra via an asteroseismic training set; column (8) indicates the inferred photometric red clump distances from the Sun; columns (9)(12) show the Galactocentric observed radii, Galactic heights, inferred birth radii and the mean radii, respectively; the mean radii are defined to be the average of the observed radii and the birth radii; columns (13)(15) indicate the phase space actions (radial actions, angular momenta and vertical actions) of the stars.
fra18 has directly constrained such a metallicity–age–birth radius relation, [Fe/H](, by simultaneously taking into account and marginalizing over the star formation history of the Milky Way and the radial migration of stars. We adopt their bestfitted [Fe/H]( relation to infer , for our individual red clump stars. We then adopt for simplicity a presumed average radius (over the star’s lifetime), to be the average of the birth radius and the current radius . For the model below we further assume that the star was effectively heated by the rate specified at .
While this can be a crude estimate of the “effective” radius, to calculate the exact heating rate requires a complete knowledge of the exact trajectories of the stars. This is not possible due to the stochastic nature of orbits, either due to scattering and radial migration. Nonetheless, the typical migration scale length is 1.5 kpc (fra18), as we are probing a broad range of radii (314 kpc), a small change in the exact definition of ”average” would not change the result in this study qualitatively. We note that the results presented in this study assume the empirical radial migration model derived in fra18. We will defer the simultaneous fitting of both the radial migration and vertical heating to future studies.
Taking into account the uncertainties from the metallicityageradius relation, we assume the fractional uncertainty of to be 10% throughout this study (instead of uncertainty of ). We checked that assuming a larger or smaller uncertainty does not qualitatively change the result in this study.
In preparation of the modeling, we still have to calculate the vertical action for all stars, which we do with the galpy package and its adopted Milky Way potential (bov15b). We assume kpc, km/s (e.g., sch12; bla16) and the solar motion with respect to the Local Standard of Rest to be km/s, km/s and km/s (sch12). Throughout this study, we adopt the radial velocities from APOGEE DR14 and photometric distances derived above as they are more precise than Gaia’s; Gaia DR2 only contributes to the proper motions of stars. But we note that a proper statistical inference which we will describe in the next section requires more than just the estimates of , it also requires an estimate for the uncertainty of , which is what we will derive next.
In the simple harmonic and epicycle approximations (e.g., sol12), one finds the vertical energy, is the vertical velocity and the vertical frequency. This implies that the uncertainty of can be approximated to be . By definition, we have , where is the proper motion, it follows that to be a constant, in our case, . As a result, the vertical action uncertainty is only using red clump stars. We also tested that assuming a 10% or 20% error for does not change the results qualitatively. On the other hand, the uncertainty for can be significantly larger if one were to use the Gaia parallax distances. As , we have , But for Gaia, even at the bright end, only is constant, instead of . It follows that , and hence instead of a constant of . As a consequence, the statistical inference would be significantly impeded by the large uncertainties for stars with kpc (cor18). Therefore, with Gaia DR2 data alone, i.e., without the complementary spectroscopic data from APOGEE which allows us to obtain precise distances through the red clump sample, it would not be possible to constrain the vertical heating across the Milky Way. . For photometric distances, we have , where
Having described how we obtain the observable , as well as their uncertainties, in the next section, we can now turn to describing the model which predicts the full distribution function of the vertical action . And subsequently, we will compare the data to the distribution through Bayesian inferences to obtain constraints on the model and to inform how the Milky Way disk was vertically heated to best describe the data.
3. Vertical heating model and statistical inference
We now lay out simple parameterized, physically motivated models for the “history” of vertical stellar motions in the Galactic disk, and constrain it by these data. Such models should specify, as a function of Galactocentric radius, with what vertical action stars were born, and how this vertical action evolved subsequently. It is apparent from the outset that the presentday data, even if they were exhaustive and perfect, cannot fully constrain the past history of vertical motions, without further model restrictions or assumptions: after all, the solution that no vertical heating in the Galaxy ever happened and every star was born with its present day is mathematically viable and provides a perfect fit to the data. We therefore need to devise some physically plausible, though mathematically highly restricted model families as the context for any inference. We do this in three steps: first, we discuss how the assumption that the distribution of is quasiisothermal simplifies the specification of a model. Second, we specify and formally introduce a family of models that assumes that stars were born with a certain (radiusdependent ) which then grew as some power of time; this models assumes implicitly that the observed relation reflects an evolutionary path, as and are used nearinterchangeably, where is the observed stellar age and is the time after the star was born. We note that at present time , but at earlier time, these are two different quantities. For example, at a lookback time of Gyr, a star that has a stellar age Gyr, would have Gyr. Therefore, one should not confuse the measured relation from all stars with the “evolutionary track” relation for individual stellar populations. In fact, finally, in Section 5, we will presume that the heating mechanism is orbit scattering (with a time dependence for individual populations), and ask whether a density of scatterers that scaled with the presumed density of cold gas could explain the measured relation.
3.1. Isothermal distributions for
A star’s current or past vertical action is clearly not a deterministic function of its birth and current radius and its age. But we can expect stars of such properties to have a probabilistic distribution of vertical actions that can be described by some characteristic quantities. An intuitive, and wellestablished model for would be an isothermal distribution, both at birth and subsequently (with a different temperature). For simplicity, we assume that we are in the harmonic limit, where the distribution can be written as:
(1) 
and the mean of the vertical action is . It follows that a normalized distribution of can be simply written as
(2) 
This Ansatz was proposed in bin10 and bin11, and subsequently adopted in a number of studies such as tin13 and tri16, as it seems to describe the observed action distribution of stellar disk subpopulations well. This Ansatz has the elegant property that the distribution is fully specified by the expectation value for the vertical action . As we will show, such an isothermal description through , describes the data well across all and . Specifying a model for the observed then reduces to specifying the set of possible .
3.2. A parameterized model for the vertical action distribution as a function of and
Following traditional approaches, we do not try to model the temporal evolution relation for individual populations, but model the global observed relation, i.e. , including here a dependence, as a function of age . Specifically, we adopt:
(3) 
Here, is the stars’ characteristic at birth, is typical increase in in the last Gyr, and specifies how the heating scales with age. This model makes two rather farreaching assumptions: first, that is a distinct function of , but has been constant with epoch or age , at least for the last 8 Gyrs considered here. Second, that the time dependence of the vertical action (after birth) can be described as a powerlaw of exponent , which changes with radius. To explore the dependence of these terms we consider both a nonparametric approach (radial bins) and a polynomial in . These choices in model restrictions were guided by Occam’s razor, i.e., the search for the simplest (or most rigid) model consistent with the data.
When considering individual radial bins with a spacing of kpc, we constrain three parameters (as constant) , and in each. When constructing a global model for the radial dependence, we adopt a linear function for and , and a cubic polynomial function for , a choice inspired by the radial dependences seen when fitting individual bins; to reduce covariances among the polynomial coefficients we take 8kpc as the pivot point of these functions, defining .
Taken together, the global model (specified by the model parameter vectors, ) is then:
(4)  
(5)  
(6) 
3.3. Selection function
We aim to compare our model to stars from APOGEE (maj17), which is a sparse, lowlatitude spectroscopic survey; at any Galactocentric radius , the probability of a star entering this sample depends on its height above the midplane, and therefore on its , as stars with large spend a larger fraction of their orbit well away from the midplane. For a datamodel comparison, we need to construct an approximate selection function . Constructing a selection function would be significantly more involved were we to aim for quantifying the full : the radial selection (even more than the selection) function arises from a combination of the APOGEE targeting strategy and dust extinction; and red clump stars are strongly biased against old stars and towards Gyr old stars. Here we circumvent this problem by sampling our model posterior from , focusing on the evolution of at any given radius and time.
Fortunately, as shown in Fig. 2, an approximate selection function can be quite straightforwardly calculated from simple orbit integration over a presumed Milky Way potential; it has approximation that enables an analytically integrable normalization of the likelihood function, as we describe in the Appendix (also see Fig. 3).
3.4. Data likelihood and model posteriors
Finally, when the selection function is included, instead of directly comparing the data to predicted by the isothermal distribution function, the model prediction should be revised to be
(7) 
where is a normalization (see Appendix).
With this in mind, given the data of individual stars in the red clump sample, the global model parameters can then be inferred through Bayesian inference; in the case of fitting individual radial bins, an analogous procedure applies to , and .
More precisely, assuming an uniform prior, we have the posterior of p to be
(8)  
where , and reflect the “noiseless” observable, as opposed to the observed , , which are subjected to measurement uncertainties. In practice, the integrals were calculated via Monte Carlo sampling 10 independent realizations of the true observable, assuming , and , essentially Monte Carlo integrating over the measurement uncertainties of the observable.
4. Results: a global view of
Following the model and its corresponding likelihood as described in the previous section, we sample the posterior of our model parameters via MCMC using the emcee package (for13). We first fit for individual radial bins where we bin our data into 1 kpc radial bins, spanning from kpc to kpc. We fit for three parameters , and for each bin. Noting that a small fraction of our sample could be contamination from the thick disk or the halo stars, as well as stars older than Gyr but were mistakenly inferred to be younger, after the first round of fitting, we evaluate the likelihood for individual objects given the best parameter and cull sample with loglikelihood . We tested that this choice gives the largest improvement to the comparison of the data to the model, But we note this criterion only discards of the data, and without this cut, the model still gives a good prediction of the data. We subsequently refit the results using the “clean” sample which defines the final results for the individual bins. Finally, we combine all the sample from the individual bin fitting, and fit for the global model where we fit for the radial dependence of , and simultaneously, assuming the functional form presented in Eq. (3)–(6).
Parameter  Bestfitted value 

1.09  
0.06  
1.81  
0.05  
0.91  
0.18  
0.087  
0.014 
Fig. 4 shows the sampling of the model parameter posterior. At each panel, the crosshair indicates the bestfitted model, defined through the mean of the marginalized distribution. Choosing a pivot point of ^{3}^{3}3The pivot point is chosen for numerical convenience to reduce the covariances of the posterior; it should not be confused with the Solar radius , which is at 8.2 kpc. as explained in Section 3 reduces a large part of the posterior covariances. But some covariances persist because stars can either be born at a slightly higher birth temperature or could be slightly more heated later on. But given a broad range of stellar ages probed in this study, especially the youngest stars set the possible acceptable range for the birth temperature, the model parameters are not entirely degenerate, and the MCMC chain does converge to the bestfitted model as illustrated in Fig. 4, and we find the bestfitted isothermal model with to be (the bestfitted parameters are also listed in Table 2)
(9)  
To illustrate that the model from Eq. 3 is indeed an adequate representation of the data, in Fig. 5 we compare the predictions of at different radii and stellar ages from the bestfitted model to the data. When projecting the predictions, we also take into account the selection function and the uncertainty in . In particular, we sample from the bestfitted model, but down weight the mock sample by the selection function evaluated with the mock sample values in and . We also perturb the assuming the same uncertainties of the data. The black lines demonstrate predictions from the model, and the corresponding shaded regions in colors, where the black lines center, demonstrate the data. For the data, the uncertainties as shown in the shaded regions are evaluated by 1000fold bootstrapping. As for the model predictions, the solid line illustrates the median of the prediction and the dashed and dotted lines show the and ranges. Different panels illustrate the evolution of the vertical action at different Galactocentric radii. Note that, we plot the axis in instead of to show the full dynamical range of .
It is clear from the bottom panels of Fig. 5 that stars with kpc have higher birth actions, even though the selection function should bias against hot stars. The vertical heating rate also increases visibly at large Galactocentric radii, and we will quantify that in more details below. Importantly, the model agrees very well with the data in the sense that the model predicts not only the expectation of but also the full distribution. This is remarkable, as for an isothermal distribution the width or “scatter” is uniquely given by the expectation value without any other parameters. Finally, we caution that the median values here should not be directly compared to Eq (9) (the mean of the distribution) because (a) the median of an exponential profile is times of the mean, and (b) the model predictions here are weighted by the selection function.
Fig. 6 further shows the resulting posteriors for the various components of the vertical heating model as a function of . When evaluating the model predictions, we draw directly from the posterior chain, in other words, the model predictions take into account the covariances of the posterior. The top panel shows the radial distribution of our data. The red histogram illustrates the current radii of the stars, and the black histogram shows the average of the estimated birth radii and the current radii of the stars. As shown, our APOGEE red clump sample covers a considerable fraction of the Milky Way disk, from kpc to kpc. Although there are a few stars that are located beyond this range, the number of stars per radial bin is too small () for any reliable statistical inference beyond this range. Hence we only consider stars that are kpc in this study. The panel also shows that the average radii are generally smaller than the current radii, demonstrating that there is an overall outward radial migration of stars.
For the other panels, the black solid points show the mean and the standard deviation of the posterior when we fit a “nonparametric” model, i.e., by fitting data of different individual radial bins separately. We first fit data of individual radial bins to make sure that when we fit for the global model, the global model does not smooth out interesting features or create spurious features due to overfitting. The black lines in these panels show the global fit, where we assume a cubic radial dependence for the birth action , and a linear model for the heating rate and the temporal power law index . As shown in Fig. 6, the radial trend probed by the global fits agrees with the nonparametric version.
In the following, we will only focus on the results from the global fit and will discuss how these parameters vary as a function of , as this analysis presents the first comprehensive view of the agedependent vertical actions of Galactic disk stars beyond the Solar neighborhood. We iterate that the results in this study assume the radial migration models from fra18, a more robust approach for future studies would be fitting both the radial migration and vertical heating simultaneously.
First, we focus on the inferred birth action of the stars as a function of . We plot the posterior of in the bottom right panel of Fig. 6. This birth temperature information primarily comes from the youngest population of stars in our sample; it is a wellconstrained quantity, albeit it is slightly correlated with the vertical heating of the disk (see Fig. 4). The figure shows that even though we assume that stars were born cold, at all radii, the birth action is nonzero. This is not surprising because the ISM in the midplane has a finite velocity dispersion. Furthermore, it is also known that star clusters disperse with a nonzero finite velocity. Therefore, even if stars were born cold, they would have a finite “zeropoint” which expectation we will estimate next.
We can compare to known cold ISM properties, the material from which these stars were just born, using the ISM velocity dispersion and the scaleheight. The vertical action is approximately , and in the isothermal limit we have . Note that, could be approximated by assuming that stars are oscillating in a uniform density medium. By solving the Poisson equation, we arrive at which is a simple harmonic equation with the vertical oscillation frequency . Substituting the formula for , it follows that , and finally substituting this to the formula, we find which we will adopt to estimate the zeropoint vertical action. More quantitatively, if we assume the initial velocity dispersion of the ISM or the star clusters to be km/s (e.g., sta89; sta05; sta06; aum09; aum17), and assuming the molecular gas scale height to be kpc (e.g., mara17), we find , which is plotted as the red dashed line in the bottom right panel. Equivalently, the derived might imply that the average density in the vertical column where the stars oscillate is about half of the Solar neighborhood density ; if , we have , and hence as observed.
Fig. 6 demonstrates that for the disk stars born at kpc, the birth action is consistent with the velocity dispersion and scale height of the ISM and young clusters. Interestingly, the model fit shows that stars with kpc were born with increasingly larger vertical motions: is reaching at kpc, almost an order of magnitude higher than in the inner disk. Such a strong radial dependence of birth temperature may reflect that waning disk selfgravity facilitates the flaring and warping of the birth ISM (nak03; amo17). While the Galactic warp is not explicitly included in the model, its effect could manifest itself through the birth temperature of the stars – stars born at a higher height would inherit a larger orbit oscillation and hence a larger initial vertical action. Flaring is another possibility. As , we have . At the other region, if increases due to flare, but at the same time, if only decreases gradually (e.g., from dark matter halo contribution), a higher is expected as a result.
But this model also describes the change of with age . In this model, this is quantified by . The black lines in the bottom left panel of Fig. 6 shows the posterior of which describes the apparent heating over the last Gyr. The posterior shows that the shortterm heating rate is mild, as we have derived above via the birth temperature , we have the local vertical frequency to be . Recall that hol09 (fig 7) finds which implies and with little radial dependence. How does this value compares to the GCS measurement? Assuming the average density to be
The longterm change of the actions with age is described by the power law index , plotted in the top right panel of Fig. 6. The bestfitted model shows a linear growth with age, , which is also consistent with the Solar neighborhood measurement (hol09, with in the harmonic limit), which is overplotted as a red symbol in the top right panel. If we were to neglect the measurement uncertainties and selection function in our study, we would get a lower heating rate, illustrated by the blue dashed line. But as opposed to the local measurements, when both the selection function and the measurement uncertainties are included, the best fit shows rising from about at , to at . The stronger long term heating for the outer disk is also demonstrated in the bottom left panel where the green and blue lines show the vertical heating of stars at different , over 2 Gyrs and 4 Gyrs, respectively.
The value of and its radial dependence are noteworthy in two respects, First, our finding shows that the stellar age dependence of vertical motions differs markedly (with ) throughout the disk from the expectation of the temporal evolution of constant gradual orbit heating by, say, scattering, which should lead to (orange line in the top right panel). Second, raises outward, implying at face value a different heating history or mechanism.
This stark discrepancy of the observed agescaling of , now seen at all radii, with the expectation of vertical orbit heating by a given population of scatters near the disk plane, , leaves us with several alternative explanations: either there is a very different heating mechanism (satellites etc.), or the stars’ birth temperature, , evolved strongly with time, or orbit scattering is the heating mechanism, but its efficacy evolved strongly with time, as suggested e.g., by aum16.
In the next Section we lay out a simple, physically motivated model for epochdependent orbit heating by scattering (e.g., off GMC’s) that explains, at least qualitatively, both and its increase with .
5. Discussion: A simple orbit scattering for heating disk stars
It may be tempting to identify the agedependent increase of vertical actions, , with an evolutionary heating path. But really, simply reflects the combination of the birth action and the integrated subsequent heating that stars of current age at , i.e., the presentday endpoints of possibly different, agedependent evolutionary paths. When viewed just mathematically, success in matching these data with different combinations of birth actions and evolving scatterers would seem unsurprising, perhaps almost trivial. But here we show that an approximate match to the data can be achieved if we take one particular, astrophysically plausible evolution for the scattering strengths.
The Milky Way’s SFR has been declining from high redshift to the present, and starformation has proceeded inside out, building up the disk surface mass density at a radiusdependent rate. The rate of the resulting change in a star’s vertical action will depend on both density of scatterers and the vertical orbit frequency, which in turn depends on the local disk mass density. These factors can hardly be constant with cosmic epoch. “Vertical heating tracks” will then be different from the presentday relation. This is qualitatively illustrated in Fig. 7 (see also aum16), where individual lines in color indicate the evolution of individual populations born at different times, all following roughly but with different scattering amplitudes. Each line ends with the only observable prediction at the current epoch; the ensemble of evolutionary end points can be described as a powerlaw with stellar age , which will have a different powerlaw index .
In the following we present a simple analytic model that quantifies these effects. In this model we denote the presentday age of the stars as , the time since their birth as , which implies a lookback time . We take external pieces of information to estimate how the density of scatterers, denoted as and the vertical frequency, (or restoring force) may have evolved.
We start out with the basic equation describing vertical heating by orbit scattering, (p.123 bin08)
(10) 
where if the layer of scatterers is as thick as the vertical oscillations of the stars, and , if the layer of scatterers is thin; numerical simulations suggest to be the applicable regime. We want to cast this equation in terms of and allow for timedependences. If we assume that we are in the limit of harmonic oscillations, and that the midplane density is dominated by the stellar mass and is uniform as a function of height, we have and .
One then gets
(11) 
or with a scaling factor as free parameter,
(12) 
Since ,
(13) 
which can be solved by straightforward numerical integration, if and are known. Note that if and are constant with time, this recovers the expected relation in the case of a static Milky Way disk.
To estimate how the density of scatterers, evolves with time and epoch, we assume that scales with star formationrate density via the SchmidtKennicutt law with and adopt , the measured SchmidtKennicutt index for the cold molecular clouds (big08; ken12). We adopt the global starformation history of the Galactic disk from fra18, where we consider both an exponential decay of the SFR, as well as an insideout growth of the disk. More precisely, we assume
(14) 
and adopt the Milky Way global parameters , , , and
(15) 
The scaling sets the overall scale for , but will not change the powerlaw index in the resulting . We find that provides a good agreement between the data and the predictions for all radii. For the stellar mass evolution, we assume an exponential profile with a scale length insideout evolution similar to Eq. (15) with the same , but we adopt the current stellar mass scale length to be (e.g., bov13) in lieu of .
Fig. 7 shows the numerical evaluation of the differential equation in Eq. (13) at different radii, assuming the initial condition kpckm/s, as per Section 4. Individual lines in color show the evolution of various stellar populations, colorcoded by the birth time. We assume the smallest birth time to be Gyr, mimicking some of the youngest stars in the red clump sample. The figure shows that populations born earlier experienced more heating in than more recently born populations, reflecting the higher and lower at the time. Each evolutionary track ends at the present epoch with . Remarkably with just scaling the predictions, the ensemble of endpoints of the evolution tracks agree well with the observed .
For a more quantitative comparison we proceed to fit the endpoints of these tracks with Eq. (3), as proxy for the observable mean , after perturbing the final of each track by . The black lines illustrate the bestfitted powerlaw for the analytic prediction, with the power law index is shown in the top left corner of each panel. The red dashed lines are the mean of the best global fit from the red clump sample presented in Section 4; note that this figure differs slightly from Fig. 5 because, unlike Fig. 5, we have not folded in the selection function here.
The bestfit powerlaw from this physically inspired model has a range of , remarkably resembling the observed , even though all individual populations are evolved with . This analytic model also shows an increase in with radius. When the insideout growth is included, the disk stars at the outer disk start with a small restoring force due to the small radial scale length of the stellar disk. As the disk ages and grows, the increment in the restoring force at the outer disk is more drastic than the inner disk. As a result, as shown in the right panel of Fig 7, there are larger separations between individual evolutionary tracks, and the slower evolution in for the younger populations means that the observed relation will favor a “long”term evolutionary behavior where the increases more drastically for an older lookback time. And this causes the observed appeared to be larger at the outer disk.
It is encouraging how well this simple model does without finetuning: perhaps gradual orbit scattering is indeed the dominant source driving the evolution across the Milky Way disk, at least for kpc. But we note that this models encompasses the scenario where inplane heating (or blurring) can happen through a range of processes, such as spiral arms and bars perturbations, with GMCs simply isotropizing the inplane noncircular motion of stars.
This analysis may also imply that the influence of satellite bombardment, at least for kpc need not be a dominant source of vertical heating. If there is a major merger impacted the Milky Way thin disk, one would expect the heating rate for that specific population should be higher than usual, which would then imprint as a higher observed for the same reasoning that we laid out above. But if the Milky Way thin disk only went through a steady stream of minor mergers, this might not manifest itself in the observed , but we will defer a more careful comparison with detailed simulations to future studies.
5.1. The presentday vertical actions as birth properties?
While our vertical heating model provides an excellent explanation of the data, we assume that (low) stars were born at a constant birthtemperature. This assumption is supported by some simulations (e.g., mar14), but contested by others (e.g., bro04; bro06; bir13). We could not rule out the possibility that stars could merely be born kinematically hotter in the past, and the evolution of is simply a consequence of the cooling of the ISM. The latter is known to play a significant role for the older thick disk when the ISM gas of the Milky Way is still settling down (bou09; bir13; gra16), what is known as the “upsidedown” formation.
Nonetheless qualitatively, external galactic observations have shown that the velocity dispersion, at any galactic annulus scales as (e.g., fig4 in tam09), and thus, if the Galactic vertical structure for the thin disk is dominated by the upsidedown formation, we would expect, . To this end, we have tested that, let to be a exponential profile with an exponential timescale Gyr, both assuming to be constant, i.e., , and assuming , i.e. , provides a far poorer fit to the observed red clump trend than the fitted trend in Fig. 7. Fundamentally, an exponential profile is just not a good approximation for which leads us to conclude that the evolution of the vertical structure as measured in this study is better explained by GMC scattering.
6. Conclusion
We have quantified the global vertical temperature of the Milky Way’s stellar disk as a function of stellar age,