
HIGH-RESOLUTION LINEAR POLARIMETRIC IMAGING FOR THE EVENT HORIZON TELESCOPE


Published 2016 September 14 © 2016. The American Astronomical Society. All rights reserved.
Citation: Andrew A. Chael et al. 2016 ApJ 829 11. DOI: 10.3847/0004-637X/829/1/11


ABSTRACT

Images of the linear polarizations of synchrotron radiation around active galactic nuclei (AGNs) highlight their projected magnetic field lines and provide key data for understanding the physics of accretion and outflow from supermassive black holes. The highest-resolution polarimetric images of AGNs are produced with Very Long Baseline Interferometry (VLBI). Because VLBI incompletely samples the Fourier transform of the source image, any image reconstruction that fills in unmeasured spatial frequencies will not be unique, and reconstruction algorithms are required. In this paper, we explore some extensions of the Maximum Entropy Method (MEM) to linear polarimetric VLBI imaging. In contrast to previous work, our polarimetric MEM algorithm combines a Stokes I imager that only uses bispectrum measurements that are immune to atmospheric phase corruption, with a joint Stokes Q and U imager that operates on robust polarimetric ratios. We demonstrate the effectiveness of our technique on 7 and 3 mm wavelength quasar observations from the VLBA and simulated 1.3 mm Event Horizon Telescope observations of Sgr A* and M87. Consistent with past studies, we find that polarimetric MEM can produce superior resolution to the standard CLEAN algorithm when imaging smooth and compact source distributions. As an imaging framework, MEM is highly adaptable, allowing a range of constraints on polarization structure. Polarimetric MEM is thus an attractive choice for image reconstruction with the EHT.


1. INTRODUCTION

Magnetic fields in plasmas around compact objects such as pulsars and black holes are critical for powering these objects' energetic emissions. In active galactic nuclei (AGNs), magnetic fields in accretion disks around central supermassive black holes regulate outflow and the interaction of the AGN with its host galaxy (Fabian 2012; Kormendy & Ho 2013). Stochastic magnetic fields in the disk drive turbulence via the magneto-rotational instability (Balbus & Hawley 1998), enabling efficient accretion and conversion of gravitational energy into near-Eddington luminosities. Near the horizon, magnetic fields can convert spin energy from the black hole into energetic jets of plasma (Blandford & Znajek 1977; Blandford & Payne 1982). Synchrotron emission from electrons in AGN cores and jets is characteristically linearly polarized, with a polarization direction that traces magnetic field lines in the plasma (Ginzburg & Syrovatskii 1965). Faraday rotation of the polarization vector as it propagates through the plasma is determined by the magnetic field strength and electron density along the line of sight (Rybicki & Lightman 1979). Thus, measurements of the linear polarization direction and its frequency dependence characterize the magnetic field around these objects and provide observational windows into these objects' fundamental physics. For a recent review of observations of magnetic fields in AGNs, see Wardle (2013).

Linear Polarimetric Very Long Baseline Interferometry (VLBI) at centimeter and millimeter wavelengths can measure polarization magnitude and orientation at high angular resolutions of fractions of a milliarcsecond (Roberts et al. 1994; Attridge 2001; Attridge et al. 2005). VLBI observations provide an incomplete sample of the Fourier transform of the sky flux density distribution, so image reconstruction (or deconvolution) algorithms are needed. The standard reconstruction algorithm is CLEAN (Högbom 1974), which models the image as a collection of point sources. CLEAN produces images of the three Stokes parameters I, Q, and U separately, so unphysical fractional polarizations $m=\sqrt{{Q}^{2}+{U}^{2}}/I\gt 1$ are possible (especially in regions of low total intensity I). In contrast, Bayesian regularization methods can naturally incorporate prior information about the image's spatial distribution and physical constraints such as m ≤ 1. One such regularization method is the Maximum Entropy Method (MEM), which selects, among images consistent with the data, the one that maximizes an entropy function, analogous to the log of a prior probability distribution. MEM imaging algorithms have been in use for decades (e.g., Gull & Daniell 1978; Cornwell & Evans 1985), but because of early computational limitations, they are used infrequently compared to CLEAN. The theory behind polarimetric MEM was pioneered in several theoretical papers beginning in the 1970s (Ponsonby 1973; Nityananda & Narayan 1983; Narayan & Nityananda 1986; Shevgaonkar 1987), but implementation on actual VLBI data has since been limited to only a handful of studies (Holdaway 1990; Holdaway & Wardle 1990; Sault et al. 1999; Coughlan & Gabuzda 2012, 2013).

Polarimetric MEM is particularly promising for imaging the accretion flow and jets near supermassive black holes observed by the Event Horizon Telescope (EHT). The EHT is a global 1.3 mm VLBI array that will obtain nominal resolutions of approximately 25 microarcseconds, allowing observations of nearby supermassive black holes at scales on the order of the projected Schwarzschild radius (Doeleman et al. 2009). Previous observations with three EHT baselines have constrained the size of the 1.3 mm emission region to scales on the order of the lensed Schwarzschild radius in Sgr A* (Doeleman et al. 2008; Fish et al. 2011) and M87 (Doeleman et al. 2012). The first polarimetric observations with the EHT in 2013 resolved the polarimetric emission in Sgr A*, providing strong evidence for ordered magnetic fields near the event horizon (Johnson et al. 2015). Future observations are expected to obtain enough data to construct an image of the Sgr A* black hole accretion flow (Fish et al. 2014) and jet base of M87 (Lu et al. 2014). As the EHT is now capable of observing with full polarization, polarimetric MEM provides an attractive solution to creating full-polarization images of these sources.

In this paper, we develop an application of MEM to polarimetric VLBI data. In Section 2, we review the fundamentals of polarimetric VLBI, as well as the standard CLEAN algorithm for image reconstruction. In Section 3, we review standard MEM, applied to total intensity images. We discuss applications of MEM to data without calibrated phase information, and quantify the resolution and fidelity of MEM images. We then move to polarimetric MEM in Section 4, where we introduce and compare two forms of the polarimetric entropy function.

In Section 5, we discuss our implementation of polarimetric MEM, including the details of our simulation of EHT data and our minimization algorithm. We present results from applying polarimetric MEM to 7 mm VLBA observations of the quasar 3C279 and 3 mm VLBA observations of 3C273, as well as simulated EHT data from several 1.3 mm model images of Sgr A* and M87. We compare the effects of different forms of the entropy function, and we test the ability of polarimetric MEM to resolve polarization field structure. We show that for these high-frequency VLBI observations, polarimetric MEM is capable of reproducing the general morphology of the CLEAN images, but with typically higher resolution and fidelity. Finally, in Section 6, we outline future directions for polarimetric MEM in VLBI and synthesis imaging in interferometry more broadly.

Our imaging code is written in Python and uses the Limited-Memory-Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) minimization routine in the Scipy package. Our imaging programs, as well as a variety of routines for simulating and manipulating VLBI data, are available at https://github.com/achael/eht-imaging.

2. FUNDAMENTALS OF INTERFEROMETRIC IMAGING

By the van Cittert–Zernike theorem, measured interferometric visibilities ${\tilde{I}}_{k}$ are the Fourier components of the true source image I(x, y) (in total intensity) plus thermal noise nk (Thompson et al. 2007, hereafter TMS):

$${\tilde{I}}_{k}=\iint I(x,y)\,{e}^{-2\pi i({u}_{k}x+{v}_{k}y)}\,{dx}\,{dy}+{n}_{k} \qquad (1)$$

Here, x and y are real space angular coordinates and uk, vk are the interferometric baseline coordinates projected orthogonal to the source line of sight and measured in wavelengths.
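As a toy numerical sketch of this relation (Equation (1)), the snippet below evaluates the DTFT of a pixelized image at arbitrary baseline points. The function name, field of view, and baseline values are illustrative assumptions, not taken from the eht-imaging package.

```python
# Sketch of the van Cittert-Zernike relation: visibilities are Fourier
# components of the image, sampled at baseline coordinates (u_k, v_k).
import numpy as np

def sample_visibilities(image, x, y, u, v):
    """DTFT of a pixelized image I(x, y) at baseline points (u, v) in wavelengths."""
    xx, yy = np.meshgrid(x, y)                      # angular coordinates (radians)
    fourier_kernel = np.exp(-2j * np.pi * (np.outer(u, xx.ravel()) +
                                           np.outer(v, yy.ravel())))
    return fourier_kernel @ image.ravel()           # one complex visibility per (u, v)

# Sanity check: a point source at the origin gives unit-amplitude,
# zero-phase visibilities on every baseline.
x = y = np.linspace(-5e-10, 5e-10, 11)              # ~0.1 mas toy field of view
image = np.zeros((11, 11)); image[5, 5] = 1.0       # 1 Jy point source at center
vis = sample_visibilities(image, x, y, np.array([0.0, 1e9]), np.array([0.0, 1e9]))
```

Treating each pixel as a point source is adequate for a sketch; a production imager would account for pixel size and field rotation.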

An interferometer incompletely samples the uv plane, so the direct Fourier transform of the measured visibilities (the "dirty image") is a convolution of the true image with the Fourier transform of the uv sample coverage (the "dirty beam"). VLBI imaging can thus be approached either as deconvolution of the dirty beam from the dirty image or as fitting a model to visibility data with regularizing constraints. Finite sampling also ensures that no image that reproduces the observed visibilities ${\tilde{I}}_{k}$ is unique; extra information is always required to constrain the image. In CLEAN, this extra information is the representation of the sky image in terms of a finite number of point sources. MEM allows for many potential regularizing constraints through the use of different entropy functions. Furthermore, MEM naturally incorporates uncertainties due to thermal noise and quantifies the goodness-of-fit with a standard χ2 metric. This makes MEM a natural choice for sparse or heterogeneous VLBI arrays such as the EHT.

A further complication in high-frequency VLBI is that atmospheric fluctuations make it impossible to measure stable visibility phases on individual baselines. However, summing the phases observed on the three baselines around a triangle cancels the atmospheric contribution at each station, so these closure phases contain information only about the source. Assuming that the visibility amplitudes can be calibrated to remove station-dependent gain terms (which vary more slowly than the unstable phase terms from the atmosphere), calibrated amplitudes can be combined with closure phases in the bispectrum, the product of three simultaneous visibilities around a triangle (Rogers et al. 1974; TMS),

$${\tilde{I}}_{B}={\tilde{I}}_{12}\,{\tilde{I}}_{23}\,{\tilde{I}}_{31} \qquad (2)$$

where, for instance, ${\tilde{I}}_{12}$ is the measured visibility between stations 1 and 2 at a given time.
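A toy numerical check of the station-based phase cancellation described above (the visibilities here are synthetic illustrations, not EHT data):

```python
# Station-based phase errors cancel in the bispectrum (Equation (2)):
# corrupting vis_ij -> vis_ij * exp(i(phi_i - phi_j)) leaves the closure
# phase (the bispectrum phase) unchanged, since the phi terms sum to zero.
import numpy as np

def bispectrum(vis12, vis23, vis31):
    """Product of three simultaneous visibilities around a closed triangle."""
    return vis12 * vis23 * vis31

# True visibilities on a triangle of stations 1, 2, 3 (arbitrary values).
true = {'12': 2.0 * np.exp(0.3j), '23': 1.5 * np.exp(-0.7j), '31': 0.8 * np.exp(1.1j)}
phi = {1: 0.9, 2: -2.1, 3: 0.4}                       # atmospheric phase per station
corrupt = {'12': true['12'] * np.exp(1j * (phi[1] - phi[2])),
           '23': true['23'] * np.exp(1j * (phi[2] - phi[3])),
           '31': true['31'] * np.exp(1j * (phi[3] - phi[1]))}

closure_true = np.angle(bispectrum(true['12'], true['23'], true['31']))
closure_corrupt = np.angle(bispectrum(corrupt['12'], corrupt['23'], corrupt['31']))
```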

Because the image of linear polarization is a two-dimensional vector field defining both the magnitude and direction of the linear polarization at each location, it can be represented as a complex image, $P(x,y)=Q(x,y)\ +{iU}(x,y)$, where $Q(x,y)$ and $U(x,y)$ are the images of the linear Stokes parameters. The linearly polarized image can also be expressed in terms of the polarization fraction $m(x,y)$ and polarization position angle $\chi (x,y)$ as $P=m\,I\,{e}^{2i\chi }$, where $m=| P| /I=\sqrt{{Q}^{2}+{U}^{2}}/I\leqslant 1$, and $\chi =\tfrac{1}{2}\arctan \tfrac{U}{Q}$. Polarimetric visibilities, ${\tilde{Q}}_{k}$ and ${\tilde{U}}_{k}$, are also related to the images $Q(x,y)$ and $U(x,y)$ via the van Cittert–Zernike theorem (Equation (1)). For synthesis imaging, the most significant difference between polarization and total flux is that the images of Q and U are not constrained to be positive and that the total polarization fraction in each pixel is constrained to be less than one.

Because the atmosphere is not significantly birefringent at the high frequencies we are considering (TMS), the atmospheric contribution to phase is identical for all of the visibilities ${\tilde{I}}_{k}$, ${\tilde{Q}}_{k}$, and ${\tilde{U}}_{k}$, so polarimetric ratios such as ${\tilde{Q}}_{k}/{\tilde{I}}_{k}$ provide the same immunity to station-based phase errors as closure phase (Roberts et al. 1994). We define the visibility domain polarimetric ratio and phase (following the notation of Johnson et al. 2014)

$${\breve{m}}_{k}=\frac{{\tilde{P}}_{k}}{{\tilde{I}}_{k}}=\frac{{\tilde{Q}}_{k}+i{\tilde{U}}_{k}}{{\tilde{I}}_{k}}=|{\breve{m}}_{k}|\,{e}^{i{\breve{\phi }}_{k}}. \qquad (3)$$

It is important to note that $\breve{m}$ is not the Fourier transform of the image plane polarization fraction m. In particular, $\breve{m}$ is not conjugate-symmetric under the reversal of baselines $(u,v)\to (-u,-v)$. It also is possible for the magnitude $| \breve{m}| $ to exceed unity, if, for example, the total intensity visibility $\tilde{I}$ has a "null" on some baseline, due to the presence of some sharp feature in the image (Johnson et al. 2015).
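A minimal numerical check of this robustness, with synthetic single-baseline values (illustrative numbers only): multiplying both the Stokes I and P visibilities by a common atmospheric phase term leaves $\breve{m}$ unchanged.

```python
# The polarimetric ratio (Equation (3)) is immune to station-based phase
# errors because the I and P visibilities share the same atmospheric phase.
import numpy as np

I_vis = 1.8 * np.exp(0.5j)          # synthetic Stokes I visibility on one baseline
P_vis = 0.2 * np.exp(-0.9j)         # synthetic P = Q + iU visibility, same baseline
atmos = np.exp(1j * 2.3)            # common atmospheric phase corruption

m_true = P_vis / I_vis
m_corrupt = (P_vis * atmos) / (I_vis * atmos)   # corruption cancels in the ratio
```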

3. TOTAL INTENSITY MEM

The standard CLEAN algorithm operates on the dirty image obtained from Fourier transforming the sparse interferometer data and treats the imaging process as a deconvolution of the dirty beam from the image (Högbom 1974). CLEAN models the sky brightness distribution as a collection of point sources. It determines the locations and magnitudes of these point sources iteratively by finding the maximum intensity pixel of the dirty image, then subtracting the shifted and scaled "dirty beam." After a certain number of iterations, CLEAN convolves the point source model with a "clean" beam obtained by fitting a Gaussian to the central component of the dirty beam. The algorithm halts after the maximum brightness point in an image drops below some multiple of the residual rms level, or after negative components start to be removed. Finally, the dirty image residuals are added to the restored image to include low-intensity diffuse brightness distributions that are poorly captured by the point source decomposition. Because CLEAN relies on absolute visibility phase information to perform the inverse Fourier Transform to the dirty image at each step of the algorithm, visibility phases corrupted by atmospheric phase fluctuations must either be calibrated or self-calibrated in a loop with multiple iterations of CLEAN (TMS).

In contrast, MEM as described in this paper operates directly on the measured visibilities or robust quantities like closure phases or the bispectrum. In MEM and other Bayesian regularization imaging methods, an image is fitted to the data by minimizing a weighted sum of ${\chi }^{2}$ and a regularizing function that incorporates prior information. With this approach, only the forward Fourier transform from the trial image to the visibility domain is used, and trial image visibilities can be directly compared with measured, calibrated visibilities or other data products derived from the measured visibilities. Furthermore, for sparse VLBI arrays, we can avoid sampling errors introduced by transforming with a Fast Fourier Transform (FFT) and instead compute trial image visibilities at the sampled baseline points with a discrete-time Fourier transform (DTFT).

In MEM, we maximize a regularizing function, or "entropy," of the image, subject to data constraints. In what follows, we denote all arrays of image pixels or visibilities in bold. For an $n^2$ pixel test image ${\boldsymbol{I}}^{\prime} $, a prior/bias image ${\boldsymbol{B}}$, and an array of N measured visibilities $\widetilde{{\boldsymbol{I}}}$, we maximize the objective function (Narayan & Nityananda 1986):

$$J({\boldsymbol{I}}^{\prime})=S({\boldsymbol{I}}^{\prime},{\boldsymbol{B}})-\alpha \,{\chi }^{2}(\widetilde{{\boldsymbol{I}}}{}^{\prime},\widetilde{{\boldsymbol{I}}}), \qquad (4)$$

where $S({\boldsymbol{I}}^{\prime} ,{\boldsymbol{B}})$ is the chosen regularizer or entropy function and ${\chi }^{2}$ is the goodness-of-fit test statistic that compares the visibilities of the test image ${\boldsymbol{I}}^{\prime} $ to the data. The mixing coefficient α controls the weighting between the regularizer (entropy) term and the data (χ2) term. Considering MEM as a form of constrained optimization, α plays the role of a Lagrange multiplier. In practice, it can be fixed, iterated manually, or be allowed to vary in the maximization process.

The goodness-of-fit χ2 term is defined in the visibility domain:

$${\chi }^{2}=\frac{1}{2N}\sum _{k}\frac{|{\tilde{I}}_{k}-{\tilde{I}}_{k}^{\prime}{|}^{2}}{{\sigma }_{k}^{2}}, \qquad (5)$$

where σk is the noise estimate on the kth u, v point and the model visibilities ${I}_{k}^{\prime }$ are the DTFT of the test image evaluated at the kth $u,v$ point. The factor of 2 in the denominator of Equation (5) is included because the variance σk2 is taken to be the variance along either the real or imaginary axis; in fitting the data, we must fit the real and imaginary parts of the visibilities separately. Assuming that the visibilities are normally distributed, χ2 will possess a χ2 distribution, and a good fit where the trial visibilities agree with the measurements within error has χ2 ≈ 1.
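A short numerical sketch of this statistic (the helper name and toy data are illustrative): a perfect model gives ${\chi }^{2}=0$, and a model offset by 1σ along the real axis at every point gives ${\chi }^{2}=0.5$, reflecting the factor of 2 that counts real and imaginary parts separately.

```python
import numpy as np

def chisq_vis(vis_model, vis_data, sigma):
    """Reduced chi^2 of Equation (5); the factor of 2 counts the real and
    imaginary parts as separate measurements with standard deviation sigma."""
    return np.sum(np.abs(vis_data - vis_model) ** 2 / sigma ** 2) / (2 * len(vis_data))

# Toy data: 5 complex visibilities with per-point noise estimate sigma.
rng = np.random.default_rng(0)
data = rng.normal(size=5) + 1j * rng.normal(size=5)
sigma = np.full(5, 0.1)
```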

The standard entropy function motivated from information theory is (Frieden 1972; Gull & Daniell 1978)

$$S({\boldsymbol{I}}^{\prime},{\boldsymbol{B}})=-\sum _{i}{I}_{i}^{\prime}\,\mathrm{log}\,\frac{{I}_{i}^{\prime}}{{B}_{i}}, \qquad (6)$$

but many other entropy functions can be chosen, including $S({\boldsymbol{I}}^{\prime} )=\sum \mathrm{log}({I}_{i}^{\prime })$, $S({\boldsymbol{I}}^{\prime} )=\sum \sqrt{{I}_{i}^{\prime }}$ (Narayan & Nityananda 1986), or the $\ell_1$ norm $S({\boldsymbol{I}}^{\prime} )=\sum | {I}_{i}^{\prime }| $ (Honma et al. 2014). In fact, it can be shown that for any convex function $S({\boldsymbol{I}}^{\prime} ,{\boldsymbol{B}})$ of the ${I}_{i}^{\prime}$, the reconstruction is guaranteed to converge (Narayan & Nityananda 1986).
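As a toy numerical check of the standard entropy of Equation (6) (`entropy_kl` is an illustrative name, not a function from the eht-imaging package): the entropy is zero when the trial image equals the prior and negative otherwise, so maximizing it pulls the image toward the prior in the absence of data constraints.

```python
import numpy as np

def entropy_kl(I, B):
    """Standard information-theoretic entropy of Equation (6). For images
    with equal total flux, this is maximized (at zero) when I equals B."""
    return -np.sum(I * np.log(I / B))
```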

To deal with phase uncertainty in MEM without needing to calibrate or self-calibrate the visibility phases, we can extend this technique to the image bispectrum, replacing the data term ${\chi }^{2}$ with its bispectral extension. In this method, unlike in a self-calibration loop as used with CLEAN, the visibility phases are not calibrated prior to imaging. The objective function becomes

$${J}_{B}({\boldsymbol{I}}^{\prime})=S({\boldsymbol{I}}^{\prime},{\boldsymbol{B}})-\alpha \,{\chi }_{B}^{2}, \qquad (7)$$

where the bispectrum data term is

$${\chi }_{B}^{2}=\frac{1}{2{N}_{B}}\sum _{j}\frac{|{\tilde{I}}_{{Bj}}-{\tilde{I}}_{{Bj}}^{\prime}{|}^{2}}{{\sigma }_{{Bj}}^{2}}. \qquad (8)$$

In the above equation, we have ${N}_{B}$ independent bispectrum measurements ${\widetilde{{\boldsymbol{I}}}}_{{Bj}}$, each with standard deviation ${\sigma }_{{Bj}}$. At any instant in time with detections on all baselines to T sites, there are $\binom{T}{3}=T!/[3!\,(T-3)!]$ triangles but only $(T-1)(T-2)/2$ independent bispectrum measurements (TMS). In our reconstructions, we constructed a set of independent bispectra at each time, using the criterion that each triangle contain the station with the highest signal-to-noise ratio.
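The two counts above can be checked directly (a small sketch with illustrative function names):

```python
# Triangle counting for a T-station array: all closed triangles vs. the
# number of independent bispectrum measurements at one instant (TMS).
from math import comb

def n_triangles(T):
    """All closed triangles among T stations: C(T, 3) = T!/(3!(T-3)!)."""
    return comb(T, 3)

def n_independent_bispectra(T):
    """Independent bispectrum measurements at one instant: (T-1)(T-2)/2."""
    return (T - 1) * (T - 2) // 2
```

For example, an 8-station array has 56 triangles but only 21 independent bispectra; the two counts coincide only for T = 3.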

Using the bispectrum for MEM image reconstruction was pioneered in optical interferometry with the BSMEM gradient descent algorithm (Buscher 1994; Baron & Young 2008), which has been successfully used on simulated EHT observations (Fish et al. 2014; Lu et al. 2014). Recent developments using the bispectrum directly in image reconstruction include the CHIRP algorithm (Bouman et al. 2016), which uses a data-driven regularizing function based on features found in a library of sample images instead of a standard entropy term based on a single prior image.

Because the bispectrum may not include any significantly small triangles to tightly constrain the unresolved flux, it is particularly useful to add a total flux constraint to the objective function (Equation (4)):

$$J\to J-\gamma {\left(\sum _{i}{I}_{i}^{\prime}-{F}_{\mathrm{obs}}\right)}^{2}, \qquad (9)$$

where ${F}_{\mathrm{obs}}$ is the observed total flux density of the source and γ is a hyperparameter that controls the relative weighting of the flux constraint compared to the ${\chi }_{B}^{2}$ and entropy term in Equation (7). In addition, since the bispectrum lacks overall phase information, the bispectrum data carry no information about the absolute image position and the final image centroid is arbitrary. Choosing image frame coordinates where $x=0,y=0$ corresponds to the center of the frame, we can add a center-of-mass constraint to the objective function:

$$J\to J-\delta \left[{\left(\sum _{i}{I}_{i}^{\prime}{x}_{i}\right)}^{2}+{\left(\sum _{i}{I}_{i}^{\prime}{y}_{i}\right)}^{2}\right], \qquad (10)$$

where δ is another hyperparameter to control the weight of this constraint with respect to the data term, entropy, and any other constraints. Finally, imaging with the bispectrum carries an additional complication in that the bispectrum values are not necessarily described by a normal distribution, and thus the bispectrum ${\chi }_{B}^{2}$ statistic is not as straightforward to interpret as it is in the case where calibrated visibility phases are available. In the high signal-to-noise limit, however, the distribution of the bispectrum points approaches a Gaussian (Rogers et al. 1995).
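Toy sketches of the two constraint terms described above, Equations (9) and (10) (function names and penalty weights are illustrative assumptions):

```python
import numpy as np

def flux_constraint(I, F_obs, gamma):
    """Total-flux penalty of Equation (9); zero when the image flux matches F_obs."""
    return gamma * (np.sum(I) - F_obs) ** 2

def centroid_constraint(I, x, y, delta):
    """Center-of-mass penalty of Equation (10); zero when the flux centroid
    sits at the frame center (x = 0, y = 0)."""
    return delta * (np.sum(I * x) ** 2 + np.sum(I * y) ** 2)
```

Both penalties are subtracted from the objective, so maximizing the objective drives them toward zero.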

We can implement a total intensity MEM algorithm using a quasi-Newton gradient descent method. The derivatives of S and ${\chi }^{2}$ or SB and ${\chi }_{B}^{2}$ with respect to the pixel values Ii can be computed analytically and evaluated at each step. The inverse Hessian can be approximated by neglecting the off-diagonal terms, as in the Cornwell and Evans algorithm (Cornwell & Evans 1985), or through numerical approximation as in the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm (Byrd et al. 1995).
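The setup above can be sketched on a tiny synthetic problem (fully calibrated visibilities rather than the bispectrum, for brevity). All sizes, noise levels, and the weight α below are arbitrary illustrations, and the gradient is left to Scipy's finite differencing rather than supplied analytically as in our actual implementation.

```python
# Toy MEM: minimize -J = alpha*chi^2 - S with L-BFGS-B, using a change of
# variables I = exp(logI) to keep the image positive.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
npix = 16
A = np.exp(-2j * np.pi * rng.random((8, npix)))     # toy DTFT matrix (8 baselines)
I_true = rng.random(npix); I_true /= I_true.sum()   # normalized toy image
vis_data = A @ I_true
sigma, alpha = 0.01, 100.0
prior = np.full(npix, 1.0 / npix)                   # flat prior image B

def neg_objective(logI):
    I = np.exp(logI)                                # enforce I > 0
    chi2 = np.sum(np.abs(vis_data - A @ I) ** 2 / sigma ** 2) / (2 * len(vis_data))
    entropy = -np.sum(I * np.log(I / prior))        # Equation (6)
    return alpha * chi2 - entropy

res = minimize(neg_objective, np.log(prior), method='L-BFGS-B')
I_fit = np.exp(res.x)                               # recovered (positive) image
```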

Because nonlinear methods like MEM and CLEAN input prior information into the imaging process, we might expect some degree of image "superresolution," or the production of image features on scales less than the array nominal resolution ${R}_{\min }=\lambda /{b}_{\max }$, where ${b}_{\max }$ is the length of the longest baseline in the VLBI array. The frequently quoted result that MEM has a superresolution factor of 1/4 the nominal resolution is in fact not based on the image prior, but only on the analyticity of the data (Narayan & Nityananda 1986). The derivation of this factor requires the unrealistic assumption of infinite signal-to-noise (Holdaway 1990). In practice, the superresolution factor may be informed by both the analyticity of the data (degraded by noise) and the entropy function. Simple tests comparing blurred MEM to the model source distribution suggest that in practice, total intensity MEM can achieve a superresolution factor of 1/3 to 1/2 the nominal resolution (see Section 5.3).

4. LINEAR POLARIMETRIC MEM

To extend MEM to complex linear polarized images while avoiding atmospheric phase corruption, we maximize the objective function (Holdaway 1990):

$${J}_{m}({\boldsymbol{P}}^{\prime})=S({\boldsymbol{P}}^{\prime})-\beta \,{\chi }_{m}^{2}. \qquad (11)$$

Here ${\boldsymbol{P}}^{\prime} $ is our trial image of the polarized flux and ${\chi }_{m}^{2}$ is the test statistic that compares the polarimetric ratios of the test image to the data:

$${\chi }_{m}^{2}=\frac{1}{2N}\sum _{k}\frac{|{\breve{m}}_{k}-{\breve{m}}_{k}^{\prime}{|}^{2}}{{\sigma }_{\breve{m},k}^{2}}, \qquad (12)$$

where ${\breve{m}}_{k}={\tilde{P}}_{k}/{\tilde{I}}_{k}$ is the polarimetric ratio on the kth $u,v$ point, which is insensitive to phase errors. As in the bispectral imaging of total intensity, this MEM technique does not reconstruct the phases of $\tilde{Q}$ and $\tilde{U}$ directly before imaging, but instead relies on the robust, measurable polarimetric ratios $\breve{m}$ to guide the imaging. Thus, computing ${\chi }_{m}^{2}$ requires an ${\boldsymbol{I}}^{\prime} $ reconstruction and its visibility phases; when using the polarimetric ratios in this manner, we cannot image ${\boldsymbol{P}}^{\prime} $ independently from ${\boldsymbol{I}}^{\prime} $.

The conventional polarimetric entropy, first developed from the eigenvalues of the Stokes parameter correlation matrix, is (Ponsonby 1973; Nityananda & Narayan 1983; Narayan & Nityananda 1986; Holdaway & Wardle 1990)

$$S({\boldsymbol{P}}^{\prime})=-\sum _{i}{I}_{i}^{\prime}\left[\frac{1+{m}_{i}^{\prime}/{m}_{\max }}{2}\,\mathrm{log}\,\frac{1+{m}_{i}^{\prime}/{m}_{\max }}{2}+\frac{1-{m}_{i}^{\prime}/{m}_{\max }}{2}\,\mathrm{log}\,\frac{1-{m}_{i}^{\prime}/{m}_{\max }}{2}\right]. \qquad (13)$$

The quantity ${m}_{\max }$ is the maximum fractional polarization; this can be generically set to 1. For synchrotron sources we can instead set ${m}_{\max }\approx 0.75$, as was done by Holdaway & Wardle (1990) to ensure that the degree of polarization remains limited to the expected maximum for power-law synchrotron emission (Rybicki & Lightman 1979). This entropy naturally favors images with $| {m}_{i}^{\prime }| \lt {m}_{\max }$. However, it contains no information about the polarization direction and it tends to favor low polarization magnitudes, as it is maximized in the absence of data constraints when ${m}_{i}^{\prime }=0$ in all pixels.
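A small sketch of this entropy (assuming the Holdaway & Wardle form of Equation (13); `pnn_entropy` is our illustrative name) makes its bias toward low polarization explicit: it peaks at ${m}_{i}^{\prime }=0$, where it equals $\mathrm{log}\,2$ times the total flux.

```python
import numpy as np

def pnn_entropy(I, m, m_max=1.0):
    """PNN-style polarimetric entropy (Equation (13)); maximized when m = 0
    in every pixel, and undefined for |m| > m_max."""
    u = m / m_max
    return -np.sum(I * ((1 + u) / 2 * np.log((1 + u) / 2) +
                        (1 - u) / 2 * np.log((1 - u) / 2)))
```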

Another possibility is to use a simple log entropy, as one might use for total intensity images (e.g., Ponsonby 1973; Nityananda & Narayan 1983; Narayan & Nityananda 1986):

$$S({\boldsymbol{P}}^{\prime})=\sum _{i}\mathrm{log}\,|{P}_{i}^{\prime}|. \qquad (14)$$

Because the polarization field traces the magnetic field structure, which we expect not to be completely disordered in resolved images, we may want to move beyond pixel-by-pixel entropy terms to gradient-based regularizing functions. One option is a regularizer proportional to the total variation of the trial image ${\boldsymbol{P}}^{\prime} $. The isotropic total variation TV of a complex image matrix ${\boldsymbol{X}}$, typically used for image reconstruction and denoising (Rudin et al. 1992), is

$$\mathrm{TV}({\boldsymbol{X}})=\sum _{i,j}\sqrt{|{X}_{i+1,j}-{X}_{i,j}{|}^{2}+|{X}_{i,j+1}-{X}_{i,j}{|}^{2}}. \qquad (15)$$

Adopting the total variation of the complex polarimetric image as our "entropy" term, we take $S({\boldsymbol{P}}^{\prime} )=-\mathrm{TV}({\boldsymbol{P}}^{\prime} )$.

The total variation entropy constrains MEM to prefer smooth polarization fields in both direction and magnitude. However, the gradient of Equation (15) becomes infinite for uniform images, so care must be taken in the minimization algorithm, especially when determining the initial test image. The actual imaging algorithm can again use quasi-Newton or conjugate gradient methods, and can operate on either the Q and U arrays or the m and χ images.
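A minimal sketch of Equation (15) for a complex image array (the function name is illustrative); note that a uniform image gives TV = 0, which is exactly the point where the gradient becomes singular, as cautioned above.

```python
import numpy as np

def total_variation(X):
    """Isotropic total variation (Equation (15)) of a complex image array."""
    dx = np.diff(X, axis=1)[:-1, :]     # horizontal neighbor differences
    dy = np.diff(X, axis=0)[:, :-1]     # vertical neighbor differences
    return np.sum(np.sqrt(np.abs(dx) ** 2 + np.abs(dy) ** 2))
```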

Finally, we can combine the polarimetric and total flux terms into a joint objective function for simultaneous imaging of ${\boldsymbol{I}}^{\prime} $ and ${\boldsymbol{P}}^{\prime} $ with the bispectrum and polarimetric ratios:

$${J}_{\mathrm{tot}}({\boldsymbol{I}}^{\prime},{\boldsymbol{P}}^{\prime})={S}_{\mathrm{tot}}({\boldsymbol{I}}^{\prime},{\boldsymbol{P}}^{\prime},{\boldsymbol{B}})-\alpha \,{\chi }_{B}^{2}-\beta \,{\chi }_{m}^{2}+\mathrm{constraint\ terms}. \qquad (16)$$

Here ${S}_{\mathrm{tot}}({\boldsymbol{I}}^{\prime} ,{\boldsymbol{P}}^{\prime} ,{\boldsymbol{B}})$ is a joint entropy function, such as a combination of Equations (6) and (13), and the constraints can include terms constraining the total flux density (Equation (9)) or image centroid (Equation (10)). Joint imaging can use data in the polarimetric ratios to constrain the total intensity image, and is thus the most theoretically sound approach to MEM imaging of polarized fields. In practice, however, convergence to the true image is poor if we allow both ${\boldsymbol{I}}^{\prime} $ and ${\boldsymbol{P}}^{\prime} $ to vary starting from a flat or Gaussian initial image (Holdaway & Wardle 1990). The method favored by Holdaway (1990) alternates iterations where ${\boldsymbol{I}}^{\prime} $ and ${\boldsymbol{P}}^{\prime} $ are changed independently. In our experience, this method does not offer any practical benefit over imaging ${\boldsymbol{I}}^{\prime} $ and ${\boldsymbol{P}}^{\prime} $ separately, as the ${\boldsymbol{P}}^{\prime} $ reconstruction is not allowed to constrain the ${\boldsymbol{I}}^{\prime} $ image. To aid convergence, we tested a strategy of performing initial imaging steps with total intensity and polarization separately and then using the result as the initial image of a joint imaging process. Even in this case, we found the performance of the joint imaging routine to be highly sensitive to the weighting between the bispectrum and polarimetric ratio data terms (α and β in Equation (16)), and the advantage over careful independent imaging of ${\boldsymbol{I}}^{\prime} $ and ${\boldsymbol{P}}^{\prime} $ seems minimal.

5. IMPLEMENTATION AND RESULTS

In this section, we discuss the implementation of our MEM algorithm and results from applying it to real and simulated data sets. In Section 5.1, we describe the algorithm's implementation in our Python software (available at https://github.com/achael/eht-imaging). In Section 5.2, we check our method for consistency with CLEAN on established reconstructions of VLBA observations of the quasar 3C279 at 7 mm and of 3C273 at 3 mm. In Section 5.3, we characterize our algorithm's ability to "superresolve" image structure and compare its performance with CLEAN's on model images of Sgr A* at 1.3 mm, as might be observed by the EHT. Finally, in Section 5.4, we apply our method to several different simulated EHT data sets from the array that observed Sgr A* and M87 in 2016 March and the expected expanded array in 2017, and interpret the results.

5.1. Implementation

We imaged a variety of simulated and real polarimetric VLBI data sets by numerically maximizing the polarimetric ratio objective function, Equation (11), given Stokes I images generated by MEM on the bispectrum, Equation (7). Because of the relatively small size of our VLBI data sets and the limited fields of view of our reconstructed images, we used DTFTs instead of FFTs in calculating the ${\chi }^{2}$ terms and the gradients of ${J}_{B}$ and ${J}_{m}$. This eliminates the error introduced by interpolating data from the measured $u,v$ points to the FFT grid. Our MEM routines use the Limited-Memory BFGS (L-BFGS) algorithm (Byrd et al. 1995) implemented in the Scipy scientific Python package (Zhu et al. 1997; Jones et al. 2001). L-BFGS is a quasi-Newton gradient descent algorithm that relies on a gradient function provided by the user and estimates the Hessian matrix as it iterates. L-BFGS does not store a full Hessian matrix, but approximates it with a series of vectors from the preceding m steps, making it a preferred choice for our reconstructions given the large size of the Hessian (${n}^{2}\times {n}^{2}$ for an n × n image). On our limited data sets, we found that L-BFGS ran sufficiently quickly on images up to 500 × 500 pixels, and it was consistently more accurate in its reconstructions of model images than either a polarimetric modification of the Cornwell–Evans algorithm or a conjugate gradient method.

A major difference between our MEM implementation and CLEAN is that MEM uses only a forward transform from the sample image to visibility space, while CLEAN uses inverse transforms from the visibilities to the image domain. In CLEAN, the inverse transforms require the visibility data to be gridded, but in MEM the sample visibilities can naturally be computed with a DTFT using the exact sampled u, v points. Our method also differs from some past MEM algorithms (e.g., Cornwell & Evans 1985), which have generally not used the forward transform, instead approximating the data ${\chi }^{2}$ as a difference between the dirty image and the test image convolved with the dirty beam. Our method instead uses DTFTs and is ideal for sparse VLBI arrays.

While entropy terms like Equations (6) and (13) can be designed to prefer images that obey physical constraints like $I\gt 0$ and $0\lt m\lt 1$, the initial steps of an unbounded minimization algorithm often take the image into an unphysical configuration and complex values of the entropy functions. This problem can be addressed by using a bounded minimization algorithm or placing a manual clip on the values of I or m (as in Cornwell & Evans 1985). Instead, we chose to perform a change of variables in both the total intensity and polarimetric images to naturally enforce the image constraints.

In Stokes I, to satisfy the total intensity constraint $I\geqslant 0$, we transformed to $I={e}^{\xi }$, where $-\infty \lt \xi \lt \infty $. For the polarimetric data, we chose to reconstruct images in m and χ, instead of Q and U, as both the physical constraint $m\lt 1$ and the entropy functions in Equations (13) and (15) are most naturally defined in terms of these variables. To naturally satisfy the constraint $0\lt m\lt 1$, we transformed to $m=\tfrac{1}{2}+\tfrac{1}{\pi }\arctan \kappa $, where $-\infty \lt \kappa \lt \infty $. In both cases, we modified the gradient given to the algorithm by multiplying by the derivatives of these expressions (see Appendix D).
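The two changes of variables and their chain-rule gradient factors can be sketched as follows (function names are illustrative):

```python
# Change of variables to enforce image constraints: I = e^xi keeps I >= 0,
# and m = 1/2 + arctan(kappa)/pi keeps 0 < m < 1. Gradients with respect to
# the new variables pick up the Jacobian of each transform (chain rule).
import numpy as np

def I_of_xi(xi):
    return np.exp(xi)

def m_of_kappa(kappa):
    return 0.5 + np.arctan(kappa) / np.pi

def grad_wrt_xi(grad_wrt_I, xi):
    return grad_wrt_I * np.exp(xi)                  # dI/dxi = e^xi

def grad_wrt_kappa(grad_wrt_m, kappa):
    return grad_wrt_m / (np.pi * (1 + kappa ** 2))  # dm/dkappa = 1/(pi(1+kappa^2))
```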

To compare polarimetric regularizers, we used both the standard entropy term, Equation (13), which we refer to as the Ponsonby–Nityananda–Narayan (PNN) entropy, and a total variation entropy term, Equation (15). In all of our reconstructions, we first imaged ${\boldsymbol{I}}^{\prime} $ directly, using the bispectrum. Because bispectral imaging of ${\boldsymbol{I}}^{\prime} $ can converge poorly given a poor choice of image prior, we used a sequence of five runs of the algorithm, substituting the prior image in Equation (6) with the final image from the previous run blurred with a 1/2-scaled clean beam. To initialize the ${\boldsymbol{P}}^{\prime} $ imaging process, we used an initial image with constant fractional polarization magnitude and direction, set equal to the zero-baseline value and multiplied by the final ${\boldsymbol{I}}^{\prime} $ image. Our tests have shown, however, that the final polarimetric image is generally insensitive to the initial polarimetric image. If zero-baseline polarization data are not available, an image with a constant 5% polarization fraction and zero polarization position angle may be used instead. We again used a sequence of five runs of imaging ${\boldsymbol{P}}^{\prime} $, using the final image blurred to 1/2 the array resolution as the initial image of each subsequent run. Finally, we convolved the final ${\boldsymbol{I}}^{\prime} $ and ${\boldsymbol{P}}^{\prime} $ images with a Gaussian beam 1/2 the size of the fitted clean beam, limiting MEM's tendency to superresolve spurious features.

Note that since the bispectrum does not contain absolute phase information constraining the location of the total flux image centroid, MEM images produced with the bispectrum are frequently offset from the model image, despite attempts to constrain this tendency with Equation (10). As a result, we have manually centered our images to provide clear comparisons with the model images. The essential steps of our procedure are summarized in Figure 1.
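The blur-and-restart step between runs can be sketched as follows; the kernel width here is a stand-in for 1/2 the fitted clean beam, and the function name is illustrative.

```python
# Between imaging runs, blur the current reconstruction with a Gaussian and
# feed it back as the next prior/initial image.
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_for_restart(image, beam_sigma_pix):
    """Convolve a reconstruction with a Gaussian (sigma in pixels) before
    the next MEM run; the normalized kernel preserves total flux."""
    return gaussian_filter(image, sigma=beam_sigma_pix)
```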

Figure 1. Flowchart summarizing our imaging procedure.


Our Python code, with routines for simulating and manipulating data and for producing MEM images with the SciPy L-BFGS algorithm, is available for download at https://github.com/achael/eht-imaging.

5.2. 7 and 3 mm VLBA Quasar Observations

To check consistency with CLEAN on real data, we produced polarimetric images from 7 mm quasar observations obtained with the Very Long Baseline Array by the Boston University Blazar Research Group in 2013 (data reduction is described in Jorstad et al. 2005). We compared our polarimetric images with established CLEAN reconstructions that were convolved with the fitted clean beam. While the VLBA data were phase-calibrated, we still used MEM algorithms that only used bispectrum and polarimetric ratio data. Our results for the quasar 3C279 are displayed in Figure 2. We found that when our MEM reconstructions were convolved with the same beam used by CLEAN, the MEM images were an excellent match for the overall polarization magnitude and direction structure of the established images. Both the PNN (Equation (13)) and total variation regularizers (Equation (15)) performed well in reconstructing the direction of the polarization field, and were consistent with each other. In general, the total variation regularizer, which does not prefer low polarization magnitudes, produced higher fractional polarization than the PNN regularizer in areas with weak Stokes I flux.

Figure 2. Reconstructions of 7 mm observations of the quasar 3C279 taken with the VLBA in 2013 April (Jorstad et al. 2005). Contours are of total flux in steps of $\sqrt{2}$ from 3× the background rms level. Ticks representing the direction of the polarization position angle and color corresponding to the polarized intensity $| P| $ are plotted in regions where $| P| $ is greater than 4× its background rms level. The left panel shows the reconstruction convolved with the fitted elliptical clean beam (384 × 119 μas FWHM, produced with Briggs weighting), and the right panel displays a MEM reconstruction of the same data set, smoothed with the fitted clean beam. While the CLEAN reconstruction used a self-calibration loop to determine visibility phases on $\tilde{I}$, $\tilde{Q}$, and $\tilde{U}$, the MEM reconstruction directly used bispectrum and polarimetric ratio data. The MEM reconstruction used the Ponsonby–Nityananda–Narayan (PNN) entropy term. The results are consistent with the CLEAN reconstruction when convolved with the same beam.


To further test our method at higher frequencies, we produced a polarimetric image from the 3 mm observation of 3C273 taken with the VLBA in conjunction with the Green Bank Telescope (GBT) reported in Hada et al. (2016). The results are displayed in Figure 3. When convolved with the clean beam, the results from our MEM method are broadly consistent, but some discrepancies are apparent in regions of low polarized intensity.

Figure 3. Reconstructions of 3 mm observations of the quasar 3C273 taken with the VLBA+GBT (Hada et al. 2016). Contours are of total flux in steps of $\sqrt{2}$ from 3× the background rms level. Ticks representing the direction of the polarization position angle and color corresponding to the polarized intensity $| P| $ are plotted in regions where $| P| $ is greater than 4× its background rms level. The left panel shows the reconstruction convolved with the fitted elliptical clean beam (340 × 108 μas FWHM, produced with natural weighting), and the right panel displays a MEM reconstruction of the same data set, smoothed with the fitted clean beam. While the CLEAN reconstruction used a self-calibration loop to determine visibility phases on $\tilde{I}$, $\tilde{Q}$, and $\tilde{U}$, the MEM reconstruction directly used bispectrum and polarimetric ratio data. The MEM reconstruction used the Ponsonby–Nityananda–Narayan (PNN) entropy term.


5.3. "Superresolution" and Comparisons with CLEAN on Simulated 1.3 mm Data

Unlike CLEAN images, MEM images in theory do not require restoration with the fitted interferometer beam. However, when testing our method on synthetic data, we found that at a certain point in the imaging process both total intensity and polarimetric MEM algorithms began producing spurious high-frequency features that were not present in the true source distribution. Restoring the final MEM images by convolving with a Gaussian beam will offset this tendency, but it is important not to make the beam too large and wash out real high-spatial-frequency features that may be "superresolved" on scales smaller than the interferometer beam.

To test MEM's capacity for "superresolution" and determine the appropriate restoring beam size, we produced total intensity MEM (using the entropy term in Equation (6)) and CLEAN images of a model of Sgr A* using simulated data with thermal noise from the EHT array projected to be available in 2017 (see Section 5.4). For this simple test we neglected the effects of inaccurate amplitude calibration, atmospheric phase corruption, and interstellar scattering. We used a MEM algorithm with full visibility phase information, directly minimizing Equation (4) with the ${\chi }^{2}$ term in Equation (5). This choice, while infeasible in practice due to phase errors, allowed us to directly compare to CLEAN without introducing the need for self-calibration.

After obtaining MEM and CLEAN reconstructions from the same data, we convolved the reconstructed images with a sequence of Gaussian beams scaled from the elliptical Gaussian fitted to the Fourier transform of the $u,v$ coverage (the "clean beam"). We then computed the normalized root-mean-square error (NRMSE) of each restored image:

$\mathrm{NRMSE}=\sqrt{\tfrac{{\sum }_{i}{({I}_{i}^{\prime} -{I}_{i})}^{2}}{{\sum }_{i}{I}_{i}^{2}}}$ (17)

where ${\boldsymbol{I}}^{\prime} $ is the final restored image and ${\boldsymbol{I}}$ is the true image. For the CLEAN reconstructions, we chose not to add the dirty image residuals back to the convolved model, as the residuals are a sensible quantity only for the full restoring beam. To minimize the effect of this choice on the CLEAN reconstruction, we chose a compact model image with no diffuse structure. After tuning our CLEAN reconstruction for this image, the total flux left in the residuals was less than 2% of the total image flux. In performing the CLEAN reconstruction, we used Briggs weighting and a loop gain of 0.025, with the rest of the parameters set to their defaults in the algorithm's CASA implementation.
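Assuming Equation (17) has the standard normalized form $\sqrt{{\sum }_{i}{({I}_{i}^{\prime} -{I}_{i})}^{2}/{\sum }_{i}{I}_{i}^{2}}$, a minimal NumPy implementation is:

```python
import numpy as np

def nrmse(img, truth):
    """Normalized root-mean-square error between a restored image and the
    true image: sqrt(sum |I'_i - I_i|^2 / sum |I_i|^2).
    Works for real Stokes I images and, with complex arrays P = Q + iU,
    for the polarimetric comparison as well."""
    img = np.asarray(img)
    truth = np.asarray(truth)
    return np.sqrt(np.sum(np.abs(img - truth)**2) / np.sum(np.abs(truth)**2))
```

An NRMSE of 0 indicates a perfect reconstruction, while a blank (all-zero) image scores exactly 1.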

The results are displayed in Figure 4. In the left panel, we see that the MEM curve has a minimum in NRMSE at a significantly smaller beam size than the CLEAN reconstruction, demonstrating MEM's superior ability, relative to CLEAN, to superresolve source structure. Furthermore, the value of NRMSE from the MEM reconstruction is consistently lower than that from CLEAN for all values of restoring beam size. Most importantly, while the NRMSE of the CLEAN curve increases rapidly for restoring beams smaller than the optimal resolution, the MEM image fidelity is relatively unaffected by choosing a restoring beam that is too small. Choosing a restoring beam that is too large produces an image with the same fidelity as the model blurred to that resolution. The right panel of Figure 4 shows the model image, the interferometer "clean" beam, and the reconstructions blurred with the clean beam (nominal) and the measured optimal fractional beams. In addition to lower resolution and fidelity, the CLEAN reconstructions show prominent striping features from isolated components being restored with the restoring beam.

Figure 4. (Left) Normalized root-mean-square error (NRMSE, Equation (17)) of MEM and CLEAN reconstructed Stokes I images as a function of the fractional restoring beam size. For comparison, the NRMSE of the model image is also plotted. The reconstructed images were produced using simulated data from the EHT array; for straightforward comparison with CLEAN, realistic thermal noise was added to the simulated visibilities but gain calibration errors, random atmospheric phases, and blurring due to interstellar scattering were all neglected. The images were convolved with scaled versions of the fitted clean beam. The minimum for each NRMSE curve indicates the optimal restoring beam, which is significantly smaller for MEM (25% of nominal) than for CLEAN (78% of nominal). (Right) Example reconstructions restored with scaled beams from curves in the left panel. The center left panels are the MEM and CLEAN reconstructions restored at the nominal resolution, with the fitted clean beam. The center right panels show the reconstructions restored with the optimal beam for the CLEAN reconstruction and the far right panels show both reconstructions restored with the optimal MEM beam. The CLEAN reconstructions consist of only the CLEAN components convolved with the restoring beam and do not include the dirty image residuals, as discussed at the end of Section 3.


While Figure 4 demonstrates that in this case the MEM reconstruction has superior resolution and fidelity to the CLEAN reconstruction, the optimal restoring beam fractional size for the CLEAN reconstruction is still less than unity. This result was observed in several similar reconstructions, suggesting that shrinking the restoring beam used in CLEAN reconstructions to 75% of the nominal fitted beam can enhance resolution without introducing imaging artifacts, at least on images of compact objects similar to those used in these tests.

Repeating the exercise of Figure 4 with observations taken at increased or decreased signal-to-noise ratios resulted in NRMSE curves that were only slightly lower or higher than the curves in Figure 4 but shared the same form; in particular, the locations of the minimum NRMSE values barely shifted. This insensitivity to additional noise is likely due to the overall high S/N of our original observations, which had an average S/N of 178 and a minimum S/N of 13. Our results show that with a high average S/N, increasing or decreasing the noise level by up to an order of magnitude does not significantly affect the image reconstruction. Observations with an average S/N ∼1, on the other hand, may show a drastic change in quality with small adjustments to the noise level.

Extending the exercise from Figure 4, we calculated the NRMSE for polarimetric MEM from several test images as a function of restoring beam size, replacing the Stokes I flux with $P=Q+{iU}$ in Equation (17). Once again, we neglected the effects of inaccurate amplitude calibration, atmospheric phase corruption, and interstellar scattering in our simulated data; however, our MEM algorithm used only polarimetric ratios $\breve{m}$ while CLEAN reconstructed Q and U separately with full $\tilde{Q}$ and $\tilde{U}$ amplitude and phase information. The results are displayed in Figure 5. Though they are not displayed, reconstructions using different regularizer terms (i.e., Equations (13)–(15)) performed similarly. The degree of superresolution in the polarimetric MEM reconstructions is less than that in the total intensity case, typically with a minimum in NRMSE around a restoring beam size of 1/2 the nominal resolution. This reduced degree of superresolution is likely due to a combination of lower S/N on the polarized data points, the loss of absolute phase information in the MEM imaging process, and the low dynamic range of the $| m| $ images (Holdaway 1990).

Figure 5. (Left) Normalized root-mean-square error (Equation (17), with $I\to P$) of MEM and CLEAN reconstructed polarimetric images vs. the size of the anisotropic restoring beam, as a fraction of the nominal fitted beam size. As in Figure 4, the CLEAN curve was computed by restoring the CLEAN point source model with scaled restoring beams without adding the dirty image residuals. The reconstructed images were produced using data simulated from the EHT array with realistic thermal noise; for simplicity of comparison with CLEAN, the data were not corrupted with gain uncertainties, random atmospheric phases, or blurring from interstellar scattering. Comparing to Figure 4, we see that, using this metric, the reconstruction of the linear polarization distribution is less accurate than the reconstructions of Stokes I, but the MEM reconstruction still provides superior resolution and fidelity to CLEAN, with optimal beam sizes at 70% and 95% of the nominal clean beam size, respectively. (Right) Example reconstructions restored with scaled beams from curves in the left panel. The center left panels are the MEM and CLEAN reconstructions restored at the nominal resolution, with the fitted clean beam. The center right panels show the reconstructions restored with the optimal beam for the CLEAN reconstruction and the far right panels show both reconstructions restored with the optimal MEM beam. Polarization position angle ticks are plotted in regions with I greater than 4× its rms value and $| P| $ greater than 2× its rms value.


The simple tests presented in Figures 4 and 5 suggest that MEM can "superresolve" source structure in I and P on scales greater than about 1/2 the nominal interferometer resolution. Furthermore, they suggest that at least on this class of images, featuring compact flux distributions, MEM achieves superior resolution and fidelity to CLEAN. For the remainder of this work, we adopted a strategy of restoring both total intensity and polarimetric images with a scaled beam 1/2 the size of the fitted beam.

5.4. Imaging Different Models of Sgr A* and M87 at 1.3 mm

We applied our techniques to several simulated 1.3 mm images from supermassive black hole accretion disk and jet models, using simulated data from the planned 2017 EHT array. We chose images featuring different types of structure in total intensity and polarization: semi-analytic radiatively inefficient accretion flow (RIAF) and jet models courtesy of Avery Broderick (Broderick & Loeb 2009; Broderick et al. 2011; Lu et al. 2014), ray-traced images from a magnetically arrested disk GRMHD simulation (see, e.g., Tchekhovskoy et al. 2011) courtesy of Jason Dexter (Dexter 2014), and a GRMHD simulation from Roman Gold (Shcherbakov & McKinney 2013; Gold et al. 2016).

We sampled the Fourier transforms of these model images on projected baselines corresponding to the expected EHT arrays in 2016 and 2017. Our 2016 array included stations in Hawaii, Arizona, Mexico, and Spain, all operating with 2 GHz of bandwidth. The 2017 array is expected to include these stations with the addition of stations in France and at the South Pole, and the ALMA interferometer in Chile, all operating with 4 GHz of bandwidth (see $u,v$ coverage in Figure 6). We added realistic baseline-dependent Gaussian thermal noise to the complex visibilities. The standard deviation σ of the thermal noise is determined according to the standard equation (TMS):

$\sigma =\tfrac{1}{0.88}\sqrt{\tfrac{{\mathrm{SEFD}}_{1}\,{\mathrm{SEFD}}_{2}}{2\,{\rm{\Delta }}\nu \,{t}_{\mathrm{int}}}}$ (18)

where ${\mathrm{SEFD}}_{1}$ and ${\mathrm{SEFD}}_{2}$ are the telescope system equivalent flux densities, ${\rm{\Delta }}\nu $ is the observing bandwidth, and ${t}_{\mathrm{int}}$ is the integration time. The factor of $1/0.88$ in Equation (18) comes from losses due to two-bit quantization in the correlation process. For our simulations, we used an integration time of 60 s, unless otherwise stated.
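A minimal sketch of this noise model, assuming Equation (18) has the standard TMS form $\sigma =(1/0.88)\sqrt{{\mathrm{SEFD}}_{1}{\mathrm{SEFD}}_{2}/(2{\rm{\Delta }}\nu \,{t}_{\mathrm{int}})}$ with the 1/0.88 two-bit quantization factor:

```python
import numpy as np

def thermal_sigma(sefd1, sefd2, bw, tint):
    """Baseline thermal noise standard deviation:
    sigma = (1/0.88) * sqrt(SEFD1 * SEFD2 / (2 * bw * tint)),
    where 1/0.88 accounts for losses from two-bit quantization."""
    return np.sqrt(sefd1 * sefd2 / (2.0 * bw * tint)) / 0.88

def add_thermal_noise(vis, sigma, rng=None):
    """Add zero-mean circular complex Gaussian noise to visibilities."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(0.0, sigma, np.shape(vis)) \
        + 1j * rng.normal(0.0, sigma, np.shape(vis))
    return vis + noise
```

For example, with SEFDs of order 10^4 Jy, 4 GHz of bandwidth, and 60 s integrations, σ is of order tens of mJy.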

Figure 6. Event Horizon Telescope 24 hr $u,v$ coverage for observations of Sgr A* in 2016 (left) and 2017 (right). The 2016 array includes the Submillimeter Array in Hawaii, the Submillimeter Telescope in Arizona, the Large Millimeter Telescope in Mexico, and the Pico Veleta millimeter dish in Spain. In 2017 the array is projected to expand to include the Plateau de Bure interferometer in France, the ALMA interferometer in Chile, and the South Pole Telescope.


We simulated the effects of gain calibration errors by assigning each site both a time-dependent gain ${G}_{i}$ drawn from a Gaussian distribution with a mean of 1 and a 10% standard deviation, and a time-dependent atmospheric opacity ${\tau }_{i}$ drawn from a Gaussian with a mean of 0.1 and a standard deviation of 0.01. The "true" time-dependent SEFDs were then computed from the measured values ${\mathrm{SEFD}}_{i}^{\prime} $ (denoted with primes) via the equation

${\mathrm{SEFD}}_{i}=\tfrac{1}{{G}_{i}}\,{e}^{{\tau }_{i}/\sin {\theta }_{i}}\,{\mathrm{SEFD}}_{i}^{\prime} $ (19)

where ${\theta }_{i}$ is the source elevation at the observation time. Thermal noise was added to the observations from a zero-mean circular complex Gaussian distribution of standard deviation given by Equation (18), with the measured $\mathrm{SEFD}^{\prime} $ replaced with the true SEFD at each time. The noisy visibilities were then multiplied by the ratio of the estimated SEFDs to true SEFDs

${V}_{12}\to {V}_{12}\,\sqrt{\tfrac{{\mathrm{SEFD}}_{1}^{\prime} \,{e}^{0.1/\sin {\theta }_{1}}\;{\mathrm{SEFD}}_{2}^{\prime} \,{e}^{0.1/\sin {\theta }_{2}}}{{\mathrm{SEFD}}_{1}\,{\mathrm{SEFD}}_{2}}}$ (20)

where we have used our assumption that the mean opacity at each site is 0.1 to adjust each of the estimated SEFDs for elevation dependence. The measured EHT station ${\mathrm{SEFD}}^{\prime} $ values we used were reported in Lu et al. (2014). When computing the expected thermal noise from these values via Equation (18) for the ${\chi }^{2}$ terms, we again modified each ${\mathrm{SEFD}}^{\prime} $ by the elevation-dependent factor ${e}^{0.1/\sin \theta }$, assuming an opacity $\tau =0.1$. We did not include any correction for possible gain calibration errors in our estimated noise terms.
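A sketch of this corruption scheme, under our reading of Equations (19) and (20); the gain and opacity conventions used here are assumptions for illustration, not the paper's verbatim expressions:

```python
import numpy as np

# Station-based amplitude corruption sketch. Symbols follow the text:
# gains G ~ N(1, 0.1), opacities tau ~ N(0.1, 0.01), elevations in radians.
def true_sefd(sefd_meas, gain, tau, elev):
    # "True" SEFD from the measured one (our reading of Equation (19)).
    return sefd_meas * np.exp(tau / np.sin(elev)) / gain

def estimated_sefd(sefd_meas, elev, tau_mean=0.1):
    # Estimated SEFD, adjusted for elevation with the assumed mean opacity.
    return sefd_meas * np.exp(tau_mean / np.sin(elev))

def corrupt_visibility(vis, sefd_meas1, sefd_meas2, gains, taus, elevs):
    s1 = true_sefd(sefd_meas1, gains[0], taus[0], elevs[0])
    s2 = true_sefd(sefd_meas2, gains[1], taus[1], elevs[1])
    e1 = estimated_sefd(sefd_meas1, elevs[0])
    e2 = estimated_sefd(sefd_meas2, elevs[1])
    # Multiply by the ratio of estimated to true SEFDs (Equation (20)).
    return vis * np.sqrt((e1 * e2) / (s1 * s2))
```

When the drawn gains are exactly 1 and the opacities exactly 0.1, the estimated and true SEFDs coincide and the visibility is unchanged, as expected.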

In simulating phase corruption from atmospheric turbulence, we multiplied the visibilities by random phases drawn uniformly at each site and at each time step. For simulated observations of Sgr A* we also included the blurring effects of interstellar scattering, which we mitigated by dividing out the scattering kernel according to the method of Fish et al. (2014). This process has the net effect of increasing the noise level on long baselines. In practice, for Sgr A* refractive interstellar scattering contributes additional epoch-dependent image distortions (Johnson & Gwinn 2015), which we will analyze separately in a future work.
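The deblurring step above can be sketched as dividing each visibility and its noise by a Fourier-domain Gaussian scattering kernel evaluated at that baseline. This is a schematic with an axis-aligned kernel; the actual Sgr A* kernel used by Fish et al. (2014) is an elliptical Gaussian with a position angle and wavelength-dependent FWHMs:

```python
import numpy as np

def gaussian_kernel_vis(u, v, fwhm_maj, fwhm_min):
    """Fourier-domain elliptical Gaussian scattering kernel (schematic,
    axis-aligned). FWHMs are in radians; u, v in wavelengths."""
    sig_u = fwhm_maj / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    sig_v = fwhm_min / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-2.0 * np.pi**2 * ((sig_u * u)**2 + (sig_v * v)**2))

def deblur(vis, sigma, u, v, fwhm_maj, fwhm_min):
    """Divide visibilities and their noise by the scattering kernel.
    Since the kernel decays with baseline length, this raises the
    effective noise level on long baselines, as noted in the text."""
    k = gaussian_kernel_vis(u, v, fwhm_maj, fwhm_min)
    return vis / k, sigma / k
```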

To compare the polarimetric reconstructions with different regularizers and with different arrays to the model image, we computed the NRMSE in Stokes I and P for each image via Equation (17). To compare the fidelity of the polarization position angle reconstruction, we also computed the mean square error of the polarization position angle of the reconstruction weighted by the magnitude of the total flux:

${\mathrm{err}}_{\chi }=\sqrt{\tfrac{{\sum }_{i}{I}_{i}\,{({\chi }_{i}^{\prime} -{\chi }_{i})}^{2}}{{\sum }_{i}{I}_{i}}}$ (21)

This error metric gives an rms estimate for the angular error in the polarization position angle reconstruction, and hence the magnetic field morphology of the source. It is weighted by the Stokes I flux because the polarization position angle in the reconstructions can swing wildly in regions with negligible polarized flux. This reasoning also led us to display polarization position angle ticks only in pixels with greater than 10% of the maximum Stokes I flux in all of our reconstructed images.
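Assuming the flux-weighted rms form implied by the text for Equation (21), this metric can be computed as follows; the modulo-π wrapping of the position angle difference is an implementation detail we add here, since position angles are defined only up to 180°:

```python
import numpy as np

def weighted_chi_error(chi_rec, chi_true, I_true):
    """Stokes-I-weighted rms error in polarization position angle
    (our reading of Equation (21)); angles in radians."""
    # Position angles are defined modulo pi, so wrap the difference
    # into (-pi/2, pi/2] before squaring.
    dchi = np.angle(np.exp(2j * (chi_rec - chi_true))) / 2.0
    return np.sqrt(np.sum(I_true * dchi**2) / np.sum(I_true))
```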

The simulated EHT data have a lower signal-to-noise ratio than the VLBA data considered above, but we found that our MEM reconstructions were still nearly independent of the choice of regularizer and relative weighting. The polarimetric reconstructions were able to conclusively distinguish between the well-ordered field structure in a RIAF model (Figure 7) and the stochastic field configuration in a GRMHD simulation (Figure 8). Both the PNN and TV entropy terms reproduce the polarization magnitude and direction well, and the NRMSE in both I and P (Equation (17)) and the intensity-weighted polarization position angle error (Equation (21)) were similar for reconstructions with both entropy terms. For the RIAF model, which featured low polarization magnitudes and a smoothly varying polarization position angle, the NRMSE values were 28.7% for Stokes I and 47.9% for Stokes P for the PNN reconstruction and 28.5% in Stokes I and 48.2% for P for the TV reconstruction. The corresponding weighted angular errors were 14.9° and 14.7°. When we compare the reconstructions to the model image smoothed to the reconstruction's resolution (center left in Figure 7), the I and P NRMSE values drop to 25.1% and 46.7% for the PNN reconstruction and 24.8% and 47.0% for the TV reconstruction. The polarization position angle weighted error drops to 13.7° and 13.5° for the PNN and TV regularizers, respectively.

Figure 7. (Top) 1.3 mm MEM reconstructions of a Sgr A* image (left) from a simulation courtesy of Avery Broderick (Broderick et al. 2011). Color indicates Stokes I flux, and ticks marking the polarization position angle are plotted in regions with I greater than 4× its rms value and $| P| $ greater than 2× its rms value. Visibilities from the planned full EHT array were simulated, including the blurring effects of interstellar scattering, with realistic thermal noise, amplitude calibration errors, and random atmospheric phases included. Stokes I was imaged with the bispectrum and linear polarization was subsequently imaged using polarimetric ratios with the Ponsonby–Nityananda–Narayan (PNN) entropy function (right center) and a total variation (TV) regularizer (right). The final reconstructions were restored with a Gaussian beam of 1/2 the size of the fitted clean beam (27 × 14 μas FWHM); for comparison, the model image smoothed to this resolution is displayed on the center left. (Bottom) The same reconstructions displayed in contours of total intensity, in steps of $\sqrt{2}$ up from 4× the background rms level. Color indicates the magnitude of the polarized flux $| P| $ and is displayed, along with ticks marking the polarization position angle, in regions where I is greater than 4× its background rms level and $| P| $ is greater than 2× its rms value. Both MEM priors successfully reproduce the smooth polarization morphology of the simulated image.

Figure 8. (Top) 1.3 mm MEM reconstructions of a ray-traced image computed from a GRMHD simulation of Sgr A* (left), provided courtesy of Roman Gold (Gold et al. 2016). Color indicates Stokes I flux. Ticks marking the direction of linear polarization are displayed in regions with I greater than 4× its rms value and $| P| $ greater than 2× its rms value. Visibilities from the planned full EHT array were simulated, including the blurring effects of interstellar scattering, with realistic thermal noise, amplitude calibration errors, and random atmospheric phases included. Stokes I was imaged with the bispectrum and linear polarization was subsequently imaged using polarimetric ratios with the Ponsonby–Nityananda–Narayan (PNN) entropy function (right center) and a total variation (TV) regularizer (right). The final reconstructions were restored with a Gaussian beam of 1/2 the size of the fitted clean beam (27 × 14 μas FWHM); for comparison, the model image smoothed to this resolution is displayed on the center left. (Bottom) The same reconstructions displayed in contours of total intensity, in steps of $\sqrt{2}$ up from 4× the background rms level. Color indicating the magnitude of the polarized flux, $| P| $, is displayed along with polarization position angle ticks in regions with I greater than 4× its rms value and $| P| $ greater than 2× its rms value. The reconstructions more accurately reproduce the direction of linear polarization than the fractional polarization, as fractional polarization in the reconstructions tends to become large in regions of low total flux. Nonetheless, both reconstructions recover an accurate picture of the global structure of the model polarized flux distribution blurred to the EHT's resolution.


For the disordered field in the GRMHD simulation (Figure 8), the NRMSE and weighted angular error fidelity metrics again give similar results for both reconstructions, but slightly favor the PNN image. For PNN, the NRMSE values were 30.4% for Stokes I and 74.2% for Stokes P, with a weighted angular error of 34.1°. For TV, the NRMSE values were 32.0% for Stokes I and 76.0% for P, with a weighted angular error of 34.3°. These high angular error values occur because of the mismatch between the smoothed-out polarization field of the reconstruction and the fine-scale structure in the model image. When compared to the smoothed model image at center left in Figure 8, the Stokes I and P NRMSE drop to 22.6% and 39.9% for the PNN reconstruction and 24.8% and 46.0% for the TV reconstruction; the polarization position angle weighted errors drop to 17.2° for the PNN and 18.5° for TV.

We also compared images produced with the PNN regularizer using simulated data from the full 2017 array and the smaller four-element array that observed in 2016 March (Figure 6, left panel). As expected, the fidelity metrics show distinct improvement between the 2016 and 2017 reconstructions. In simulations of both the near-horizon jet emission in M87 and the accretion disk emission in Sgr A* (Figure 9), we found that the additional information in polarimetric VLBI data relative to total intensity visibilities (due to the ability to accurately calibrate the phases of polarimetric ratios) was significant given the sparse baseline coverage in 2016. Namely, we were able to achieve more detail in the polarized emission reconstruction than in its total intensity counterpart in both cases. In the absence of long baselines needed to resolve distinguishing features in the total intensity image, polarimetric imaging can help distinguish between different models of the emission region such as the disk and jet models in Figure 9. Polarimetric images, even with poor resolution, can begin to characterize the general magnetic field structure in Sgr A* and M87 with near-term EHT observations even before completely resolving the emission region or black hole shadow.

Figure 9. (Top) 1.3 mm MEM reconstructions of a magnetically arrested disk simulation of the Sgr A* accretion flow, courtesy of Jason Dexter (Dexter 2014). Color indicates Stokes I flux and ticks marking the direction of linear polarization are plotted in regions with I greater than 4× its rms value and $| P| $ greater than 2× its rms value. After blurring the image with the Sgr A* scattering kernel at 1.3 mm, data were simulated with realistic thermal noise, amplitude calibration errors, and random atmospheric phases. The center right panel shows a reconstruction with data simulated on EHT baselines expected in 2016 and the rightmost panel shows the reconstruction with the full array expected in 2017. Each reconstruction was restored with a Gaussian beam of 1/2 the size of the fitted clean beam (93 × 32 μas FWHM in 2016; 27 × 14 μas FWHM in 2017). For comparison, the center left panel shows the model smoothed to the same resolution as the 2017 image. (Bottom) 1.3 mm MEM reconstructions of a simulation of the jet in M87, courtesy of Avery Broderick (Broderick & Loeb 2009; Lu et al. 2014). Data were simulated on 2016 and 2017 EHT baselines as done in the top panel, but without the contributions from interstellar scattering that are significant for Sgr A*. Both reconstructions were restored with a Gaussian beam of 1/2 the size of the fitted clean beam (72 × 36 μas FWHM in 2016; 28 × 20 μas FWHM in 2017).


For our test Sgr A* model, a magnetically arrested disk GRMHD simulation (Figure 9, top panel), the NRMSE of the reconstructions shows distinct improvement between 2016 and 2017, but the weighted angular error (Equation (21)) metric is surprisingly similar across the reconstructions. For 2016, the NRMSE values were 52.3% for Stokes I and 77.3% for Stokes P, with a weighted angular error of 29.3°. In 2017, the NRMSE values were 36.1% for Stokes I and 66.9% for P, with an angular error of 28.3°. While the 2017 array's long, high-sensitivity baselines to ALMA produce a qualitatively and quantitatively superior reconstruction, MEM techniques reproduce qualitative features of the polarization structure even with the sparse 2016 data.

When we instead compare the reconstructions to the model image smoothed to the same resolution as the respective restoring beam, the I and P NRMSE values drop to 24.0% and 59.0% for the 2016 reconstruction and 19.8% and 61.9% for the 2017 image. The polarization position angle weighted error drops to 20.0° and 21.6° for the 2016 and 2017 images, respectively. Even with minimal baseline coverage, MEM is able to reconstruct a reasonably accurate image when compared to the true image viewed at the same resolution.

The 2016 image of an M87 jet model (Figure 9, bottom panel) gave NRMSE values of 55.6% for Stokes I and 77.3% for Stokes P, with a weighted angular error of 23.5°. In 2017, the NRMSE values were 36.7% for Stokes I and 54.4% for P, with an angular error of 17.9°. When we instead compare the reconstructions to the model image smoothed to the same resolution as the restoring beam, the I and P NRMSE values drop to 21.3% and 34.5% for the 2016 image and 18.3% and 27.7% for the 2017 image, while the polarization position angle weighted error drops to 21.6° and 14.8° for the 2016 and 2017 images, respectively.

6. CONCLUSION

As the EHT opens up new, extreme environments to direct VLBI imaging, a renewed exploration of VLBI imaging strategies is necessary for extracting physical signatures from challenging data sets. In this paper, we have shown the effectiveness of imaging linear polarization from VLBI data using extensions of MEM. We explored extensions of MEM using previously proposed polarimetric regularizers like PNN and adaptations of regularizers new to VLBI imaging like total variation. We furthermore adapted standard MEM to operate on robust bispectrum and polarimetric ratio measurements instead of calibrated visibilities. MEM imaging of polarization can provide increased resolution over CLEAN (Figure 5) and is better suited to the continuous flux distributions expected for the black hole accretion disks and jets targeted by the EHT. Furthermore, MEM imaging algorithms can naturally incorporate both physical constraints on flux and polarization fraction and constraints from prior information or expected source structure. Extending our code to run on data from connected-element interferometers like ALMA is a logical next step, but it will require new methods to efficiently handle large amounts of data and image pixels across a wide field of view. Polarimetric MEM is also a promising tool for synthesis imaging of a diversity of other astrophysical systems typically observed with connected-element interferometers. For example, the polarized dust emission from protostellar cores frequently exhibits a smooth morphology (Girart et al. 2006; Hull et al. 2013), so MEM may be better suited than typical CLEAN reconstructions to studying both the large-scale magnetic field morphologies and their small deviations.

The natural ability to incorporate various image constraints makes extensions of MEM useful for investigating new forms of image reconstruction that will be relevant for future EHT observations. Although our algorithm is relatively insensitive to calibration errors and we have shown that our reconstructions are reliable even after including realistic station gain uncertainties and fluctuations, we have not yet incorporated amplitude self-calibration that could further improve reconstructions of $\tilde{I}$. Future work will also investigate a Stokes I MEM imaging algorithm that uses only closure phase and closure amplitude data, which would be immune to phase and amplitude calibration errors, thereby eliminating the need for self-calibration. Another goal is the addition of dynamic deblurring that can disentangle effects of strong interstellar scattering with more complicated structure than the simple convolution that holds in the long-term average regime (Johnson & Gwinn 2015). With polarization, MEM could be used to image Faraday rotation across a frequency band. As an imaging framework, MEM is highly flexible and we anticipate that continued investigation will lead to new algorithms that can be tailored to the particular challenges of EHT image reconstruction.

We thank the National Science Foundation (AST-1310896, AST-1312034, AST-1211539, and AST-1440254) and the Gordon and Betty Moore Foundation (#GBMF-3561) for financial support of this work. R.N.'s research was supported in part by NSF grant AST-1312651 and NASA grant TCAN NNX14AB47G. K.B. was supported by NSF CGV-1111415 and an NSF Graduate Fellowship. We thank Svetlana Jorstad, Alan Marscher, and Kazuhiro Hada for providing the data imaged in Section 5.2 and for their helpful comments. We thank Avery Broderick, Jason Dexter, and Roman Gold for generously providing model images. We also thank Lindy Blackburn for his help on simulating gain and phase errors and Kazunori Akiyama for his suggestion of applying total variation as a polarimetric regularizer. We thank the anonymous referee, whose thorough suggestions significantly improved this paper. This study makes use of 43 GHz VLBA data from the VLBA-BU Blazar Monitoring Program (VLBA-BU-BLAZAR; http://www.bu.edu/blazars/VLBAproject.html), funded by NASA through the Fermi Guest Investigator Program. The VLBA is an instrument of the National Radio Astronomy Observatory. The National Radio Astronomy Observatory is a facility of the National Science Foundation that is operated by Associated Universities, Inc.

APPENDIX A: POLARIMETRIC VLBI OBSERVABLES

In practice, visibilities are estimated by correlating the measured electric fields at different sites. In VLBI, circular feeds are the most common, and the total intensity visibility $\tilde{I}$ is then given as the average of the parallel-hand correlations, while $\tilde{Q}$ and $\tilde{U}$ are estimated using combinations of the cross-hand correlations. In terms of the correlations between sites 1 and 2, the four interferometric Stokes parameters measured on the 1–2 baseline are (Roberts et al. 1994)

Equation (22): $\tilde{I}=\tfrac{1}{2}\left(\langle {R}_{1}{R}_{2}^{*}\rangle +\langle {L}_{1}{L}_{2}^{*}\rangle \right)$

Equation (23): $\tilde{Q}=\tfrac{1}{2}\left(\langle {R}_{1}{L}_{2}^{*}\rangle +\langle {L}_{1}{R}_{2}^{*}\rangle \right)$

Equation (24): $\tilde{U}=-\tfrac{i}{2}\left(\langle {R}_{1}{L}_{2}^{*}\rangle -\langle {L}_{1}{R}_{2}^{*}\rangle \right)$

Equation (25): $\tilde{V}=\tfrac{1}{2}\left(\langle {R}_{1}{R}_{2}^{*}\rangle -\langle {L}_{1}{L}_{2}^{*}\rangle \right)$
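As a quick numerical check of these Stokes relations, the Python sketch below forms the four Stokes visibilities from arbitrary synthetic correlation products (random numbers, not real data) and verifies that $\tilde{Q}+i\tilde{U}$ collapses to the single cross-hand product $\langle R_1 L_2^* \rangle$:

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary synthetic correlation products on a single baseline (not real data)
RR, LL, RL, LR = rng.normal(size=4) + 1j * rng.normal(size=4)

I_vis = 0.5 * (RR + LL)        # Equation (22)
Q_vis = 0.5 * (RL + LR)        # Equation (23)
U_vis = -0.5j * (RL - LR)      # Equation (24)
V_vis = 0.5 * (RR - LL)        # Equation (25)

# Q + iU collapses to the single cross-hand correlation <R1 L2*>
P_vis = Q_vis + 1j * U_vis
```

The same identity underlies the conjugate-symmetry discussion below: the complex linear polarization visibility is carried entirely by one cross-hand product.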

We ignore circular polarization in what follows. Via the van Cittert–Zernike theorem (Equation (1)), the complex visibilities $\tilde{I}(u,v),\tilde{Q}(u,v),\tilde{U}(u,v)$ are the Fourier transforms of the separate Stokes images $I(x,y),Q(x,y),U(x,y)$. The image linear polarization can also be represented with the fractional polarization m and polarization position angle χ (conventionally measured east of north), where

Equation (26): $P(x,y)=Q(x,y)+i\,U(x,y)=m(x,y)\,I(x,y)\,{e}^{2i\chi (x,y)}$

The distinction between the polarization position angle χ and the data term ${\chi }^{2}$ should be clear from the context. Similarly, we can decompose the Fourier conjugate $\tilde{P}$ (Equation (3)):

Equation (27): $\tilde{P}(u,v)=\tilde{Q}(u,v)+i\,\tilde{U}(u,v)=\breve{m}(u,v)\,\tilde{I}(u,v)$

Again, note that the complex quantity $\breve{m}(u,v)$ is not the Fourier conjugate of the real position-space fractional polarization $m(x,y)$.

Since $I,Q,U$ are real, $\tilde{I},\tilde{Q},\tilde{U}$ are conjugate-symmetric under $(u,v)\to (-u,-v)$. This is not the case for $\tilde{P}=\tilde{Q}+i\tilde{U}$. Instead, using Equations (23) and (24), for telescopes 1, 2 corresponding to a baseline vector (u, v), we see that

Equation (28): $\tilde{P}(u,v)=\tilde{Q}(u,v)+i\,\tilde{U}(u,v)=\langle {R}_{1}{L}_{2}^{*}\rangle$

Equation (29): $\tilde{P}(-u,-v)=\langle {R}_{2}{L}_{1}^{*}\rangle ={\left[\tilde{Q}(u,v)-i\,\tilde{U}(u,v)\right]}^{*}$
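This asymmetry is easy to demonstrate numerically. The minimal sketch below uses random Stokes images and the FFT (all inputs synthetic) to confirm that $\tilde{Q}$ is individually conjugate-symmetric while $\tilde{P}(-u,-v)$ instead equals $[\tilde{Q}(u,v)-i\tilde{U}(u,v)]^{*}$:

```python
import numpy as np

rng = np.random.default_rng(1)
Q = rng.normal(size=(8, 8))            # real Stokes Q image
U = rng.normal(size=(8, 8))            # real Stokes U image
Qt, Ut = np.fft.fft2(Q), np.fft.fft2(U)
Pt = Qt + 1j * Ut

def negate_uv(F):
    """Sample F at (-u, -v): reverse both axes, then realign the zero frequency."""
    return np.roll(F[::-1, ::-1], (1, 1), axis=(0, 1))

# Qt (and likewise Ut) is conjugate-symmetric because Q is real ...
sym_Q = np.allclose(negate_uv(Qt), np.conj(Qt))
# ... but Pt = Qt + i Ut is not; instead Pt(-u,-v) = (Qt - i Ut)*
sym_P = np.allclose(negate_uv(Pt), np.conj(Qt - 1j * Ut))
```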

In an imaging algorithm, we model the n × n total intensity and polarization images with length-$n^2$ arrays ${\boldsymbol{I}}^{\prime} $, ${\boldsymbol{P}}^{\prime} $. The N measured total intensity and polarimetric visibilities form arrays $\widetilde{{\boldsymbol{I}}}$, $\widetilde{{\boldsymbol{P}}}$. When comparing our measurements to a test image, we use the arrays of the sample visibilities $\widetilde{{\boldsymbol{I}}}^{\prime} ={\boldsymbol{A}}{\boldsymbol{I}}^{\prime} $ and $\widetilde{{\boldsymbol{P}}}^{\prime} ={\boldsymbol{A}}{\boldsymbol{P}}^{\prime} $, where ${\boldsymbol{A}}$ is the $N\times n^2$ Fourier matrix

Equation (30): ${A}_{jk}={e}^{-2\pi i({u}_{j}{x}_{k}+{v}_{j}{y}_{k})}$
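In code, ${\boldsymbol{A}}$ is a dense matrix of complex exponentials acting on flattened pixel coordinate arrays. A minimal sketch follows; the field of view, pixel grid, and $(u,v)$ samples are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8                                   # image is n x n pixels
fov = 200e-12                           # hypothetical field of view (radians)
coords = (np.arange(n) - n // 2) * fov / n
X, Y = np.meshgrid(coords, coords)
xs, ys = X.ravel(), Y.ravel()           # length n^2 pixel coordinate arrays

u = rng.uniform(-3e9, 3e9, 30)          # hypothetical (u,v) samples (wavelengths)
v = rng.uniform(-3e9, 3e9, 30)
A = np.exp(-2j * np.pi * (np.outer(u, xs) + np.outer(v, ys)))  # N x n^2 Fourier matrix

I_img = rng.uniform(size=n * n)         # flattened trial image I'
I_samp = A @ I_img                      # sample visibilities: Itilde' = A I'

# sanity check: the zero baseline measures the total image flux
zero = np.exp(-2j * np.pi * (0 * xs + 0 * ys)) @ I_img
```

For the small EHT arrays considered here the dense matrix-vector product is inexpensive; wide-field data sets would instead call for FFT-based gridding.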

APPENDIX B: THERMAL NOISE

Thermal noise on a VLBI baseline produces circular Gaussian error in the visibility plane with standard deviation σ given by Equation (18). In principle, the thermal noise σ is the same for $\tilde{I}$, $\tilde{Q}$, and $\tilde{U}$. The factor of $1/0.88$ comes from losses due to 2-bit quantization (TMS). The error in $\tilde{P}$ is also circular, with standard deviation

Equation (31): ${\sigma }_{P}=\sqrt{2}\,\sigma$

Because the error is circular, the error in the visibility amplitude $| \tilde{I}| $ is equal to the error in its real and imaginary parts, and, in the high S/N limit, the error in the visibility phase ϕ is

Equation (32): ${\sigma }_{\phi }=\sigma /| \tilde{I}| $

The thermal noise on the bispectrum (Equation (2)) will in general not be described by a circular Gaussian distribution. However, in the limit of high S/N, we can approximate the distribution as a circular Gaussian with standard deviation (TMS)

Equation (33): ${\sigma }_{B}=| {\tilde{I}}_{B}| \sqrt{\frac{{\sigma }_{1}^{2}}{| {\tilde{I}}_{1}{| }^{2}}+\frac{{\sigma }_{2}^{2}}{| {\tilde{I}}_{2}{| }^{2}}+\frac{{\sigma }_{3}^{2}}{| {\tilde{I}}_{3}{| }^{2}}}$

where $| {\tilde{I}}_{1}| $, $| {\tilde{I}}_{2}| $, and $| {\tilde{I}}_{3}| $ are the visibility amplitudes on the three baselines that make up the bispectrum, ${\sigma }_{1}$, ${\sigma }_{2}$, and ${\sigma }_{3}$ are their corresponding standard deviations, and $| {\tilde{I}}_{B}| =| {\tilde{I}}_{1}| | {\tilde{I}}_{2}| | {\tilde{I}}_{3}| $ is the bispectral amplitude. The error in the closure phase is just

Equation (34): ${\sigma }_{\mathrm{CP}}={\sigma }_{B}/| {\tilde{I}}_{B}| $
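These high-S/N approximations can be checked by Monte Carlo. The sketch below uses made-up triangle amplitudes and noise levels; at these signal-to-noise ratios the measured scatter of the bispectral amplitude and closure phase should match the analytic values to within a few percent:

```python
import numpy as np

rng = np.random.default_rng(3)
amps = np.array([2.0, 1.5, 0.9])        # hypothetical |I1|, |I2|, |I3|
sig = np.array([0.02, 0.03, 0.02])      # per-baseline thermal noise
ntrial = 200_000

# noisy visibilities on the triangle (true phases set to zero for simplicity)
vis = [a + s * (rng.normal(size=ntrial) + 1j * rng.normal(size=ntrial))
       for a, s in zip(amps, sig)]
bispec = vis[0] * vis[1] * vis[2]

IB = amps.prod()                                    # bispectral amplitude
sigma_B = IB * np.sqrt(np.sum((sig / amps) ** 2))   # Equation (33)
sigma_CP = sigma_B / IB                             # Equation (34)

mc_amp = np.abs(bispec).std()                       # Monte Carlo amplitude scatter
mc_cp = np.angle(bispec).std()                      # Monte Carlo closure-phase scatter
```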

Similarly, the distribution of the visibility domain polarimetric ratio $\tilde{P}/\tilde{I}=\breve{m}$ is not generally a complex circular Gaussian. In the limit of high S/N, however, we again approximate it as such with standard deviation given by

Equation (35): ${\sigma }_{\breve{m}}=| \breve{m}| \sqrt{\frac{2{\sigma }^{2}}{| \tilde{P}{| }^{2}}+\frac{{\sigma }^{2}}{| \tilde{I}{| }^{2}}}$
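A similar Monte Carlo check applies to the polarimetric ratio; the visibility values below are again made up. The per-component scatter of $\tilde{P}/\tilde{I}$ should match the high-S/N prediction:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 0.02                     # thermal noise per complex component on Itilde
I_true, P_true = 1.0, 0.4        # hypothetical baseline |Itilde|, |Ptilde|
ntrial = 200_000

I_meas = I_true + sigma * (rng.normal(size=ntrial) + 1j * rng.normal(size=ntrial))
P_meas = P_true + np.sqrt(2) * sigma * (rng.normal(size=ntrial)
                                        + 1j * rng.normal(size=ntrial))
m_meas = P_meas / I_meas         # noisy polarimetric ratio samples

m_true = P_true / I_true
# high-S/N prediction for the ratio's per-component scatter (Equation (35))
sigma_m = abs(m_true) * np.sqrt(2 * sigma**2 / P_true**2 + sigma**2 / I_true**2)
mc = m_meas.real.std()           # Monte Carlo per-component scatter
```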

APPENDIX C: ENTROPY AND ${\chi }^{2}$ GRADIENTS

Our implementation of the maximum entropy method uses a gradient descent algorithm to minimize the objective function J. For total intensity imaging, the necessary gradients of ${\chi }_{B}^{2}$ with respect to the image pixels Ik can be found in Bouman et al. (2016). Below we list the gradients of the polarimetric ${\chi }^{2}$ and entropy terms used in the polarimetric imaging step.

C.1. Polarimetric Ratio ${\chi }^{2}$

In imaging P, we work directly with interferometric polarimetric ratios. The reduced ${\chi }_{m}^{2}({\boldsymbol{I}}^{\prime} ,{\boldsymbol{P}}^{\prime} )$ that we use is (Equation (12))

Equation (36): ${\chi }_{m}^{2}=\frac{1}{2N}\sum _{j}\frac{| {\breve{m}}_{j}-{\breve{m}}_{j}^{\prime} {| }^{2}}{{\sigma }_{\breve{m},j}^{2}}$

where errors on the polarimetric ratios are calculated according to Equation (35). Computing the gradient with respect to the image domain fractional polarizations mk and polarization position angles ${\chi }_{k}$ gives

Equation (37): $\frac{\partial {\chi }_{m}^{2}}{\partial {m}_{k}}=-\frac{1}{N}\sum _{j}\frac{1}{{\sigma }_{\breve{m},j}^{2}}\,\mathrm{Re}\left[{({\breve{m}}_{j}-{\breve{m}}_{j}^{\prime} )}^{*}\,\frac{{A}_{jk}\,{I}_{k}\,{e}^{2i{\chi }_{k}}}{{\tilde{I}}_{j}^{\prime} }\right]$

Equation (38): $\frac{\partial {\chi }_{m}^{2}}{\partial {\chi }_{k}}=\frac{2}{N}\sum _{j}\frac{1}{{\sigma }_{\breve{m},j}^{2}}\,\mathrm{Im}\left[{({\breve{m}}_{j}-{\breve{m}}_{j}^{\prime} )}^{*}\,\frac{{A}_{jk}\,{m}_{k}{I}_{k}\,{e}^{2i{\chi }_{k}}}{{\tilde{I}}_{j}^{\prime} }\right]$
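Gradients of the polarimetric-ratio ${\chi }^{2}$ are easy to get subtly wrong, so a central finite-difference check on a tiny synthetic problem is worthwhile. Everything below (image, toy Fourier matrix, and "data") is fabricated purely for the test:

```python
import numpy as np

rng = np.random.default_rng(5)
npix, N = 16, 6                                 # 4x4 image flattened; 6 baselines
I = rng.uniform(0.5, 1.0, npix)                 # fixed Stokes I image
m = rng.uniform(0.1, 0.7, npix)                 # fractional polarizations
chi = rng.uniform(0.0, np.pi, npix)             # position angles
A = np.exp(-2j * np.pi * rng.uniform(-1, 1, (N, npix)))   # toy Fourier matrix
m_data = 0.3 * np.exp(2j * rng.uniform(0, np.pi, N))      # fabricated ratio "data"
sig2 = 0.05 ** 2                                # constant measurement variance

def chi2_m(m, chi):
    P = I * m * np.exp(2j * chi)
    mb = (A @ P) / (A @ I)                      # model polarimetric ratios
    return np.sum(np.abs(m_data - mb) ** 2) / (2 * N * sig2)

# analytic gradients with respect to m_k and chi_k
P = I * m * np.exp(2j * chi)
Ivis = A @ I
w = np.conj(m_data - (A @ P) / Ivis) / (sig2 * Ivis)
grad_m = -np.real(w @ (A * (I * np.exp(2j * chi)))) / N
grad_chi = 2 * np.imag(w @ (A * (I * m * np.exp(2j * chi)))) / N

# central-difference check at one arbitrary pixel
eps, k = 1e-6, 3
e = np.zeros(npix); e[k] = eps
num_m = (chi2_m(m + e, chi) - chi2_m(m - e, chi)) / (2 * eps)
num_chi = (chi2_m(m, chi + e) - chi2_m(m, chi - e)) / (2 * eps)
```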

C.2. Ponsonby/Nityananda/Narayan Entropy

Setting ${m}_{\max }=1$ in Equation (13), we have the traditional form of the PNN entropy:

Equation (39): ${S}_{\mathrm{PNN}}=-\sum _{k}{I}_{k}\left[\frac{1+{m}_{k}}{2}\mathrm{ln}\frac{1+{m}_{k}}{2}+\frac{1-{m}_{k}}{2}\mathrm{ln}\frac{1-{m}_{k}}{2}\right]$

It has gradients with respect to the fractional polarization m and polarization position angle χ given by

Equation (40): $\frac{\partial {S}_{\mathrm{PNN}}}{\partial {m}_{k}}=-\frac{{I}_{k}}{2}\,\mathrm{ln}\frac{1+{m}_{k}}{1-{m}_{k}}$

Equation (41): $\frac{\partial {S}_{\mathrm{PNN}}}{\partial {\chi }_{k}}=0$
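A finite-difference check of the PNN entropy gradient on a few arbitrary pixels is straightforward (the gradient with respect to χ vanishes because the PNN entropy depends only on the polarization fraction):

```python
import numpy as np

I = np.array([1.0, 0.8, 0.5])           # arbitrary pixel intensities
m = np.array([0.2, 0.5, 0.9])           # fractional polarizations in (0, 1)

def S_pnn(m):
    a, b = (1 + m) / 2, (1 - m) / 2
    # PNN entropy with m_max = 1 (Equation (39))
    return -np.sum(I * (a * np.log(a) + b * np.log(b)))

grad = -(I / 2) * np.log((1 + m) / (1 - m))   # analytic gradient (Equation (40))

eps = 1e-7
num = np.array([(S_pnn(m + eps * e) - S_pnn(m - eps * e)) / (2 * eps)
                for e in np.eye(3)])
```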

C.3. Total Variation Entropy

Because the total variation (Equation (15)) involves differences between pixels in both the x and y image directions, we must adjust our notation to account for the 2D nature of the image. With both dimensions restored, the complex polarized image is ${P}_{i,j}={I}_{i,j}\,{m}_{i,j}\,{e}^{2i{\chi }_{i,j}}$. The total variation entropy term is then

Equation (42): ${S}_{\mathrm{TV}}=-\sum _{i,j}\sqrt{| {P}_{i+1,j}-{P}_{i,j}{| }^{2}+| {P}_{i,j+1}-{P}_{i,j}{| }^{2}}$

Writing ${D}_{i,j}=\sqrt{| {P}_{i+1,j}-{P}_{i,j}{| }^{2}+| {P}_{i,j+1}-{P}_{i,j}{| }^{2}}$ for the summand in Equation (42), the gradients with respect to ${m}_{i,j}$ and ${\chi }_{i,j}$ are

Equation (43): $\frac{\partial {S}_{\mathrm{TV}}}{\partial {m}_{i,j}}={I}_{i,j}\,\mathrm{Re}\left[{e}^{2i{\chi }_{i,j}}{\left(\frac{{P}_{i+1,j}+{P}_{i,j+1}-2{P}_{i,j}}{{D}_{i,j}}-\frac{{P}_{i,j}-{P}_{i-1,j}}{{D}_{i-1,j}}-\frac{{P}_{i,j}-{P}_{i,j-1}}{{D}_{i,j-1}}\right)}^{*}\right]$

Equation (44): $\frac{\partial {S}_{\mathrm{TV}}}{\partial {\chi }_{i,j}}=-2\,\mathrm{Im}\left[{P}_{i,j}{\left(\frac{{P}_{i+1,j}+{P}_{i,j+1}-2{P}_{i,j}}{{D}_{i,j}}-\frac{{P}_{i,j}-{P}_{i-1,j}}{{D}_{i-1,j}}-\frac{{P}_{i,j}-{P}_{i,j-1}}{{D}_{i,j-1}}\right)}^{*}\right]$
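The TV gradients can likewise be validated numerically. The sketch below implements the TV entropy and an analytic gradient obtained by differentiating each summand, assuming periodic boundary conditions purely to simplify the index bookkeeping, and checks the result at one pixel against a central difference:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 6
I = rng.uniform(0.5, 1.0, (n, n))       # arbitrary test image
m = rng.uniform(0.1, 0.8, (n, n))
chi = rng.uniform(0.0, np.pi, (n, n))

def S_tv(m, chi):
    P = I * m * np.exp(2j * chi)
    dx = np.roll(P, -1, 0) - P          # P_{i+1,j} - P_{i,j} (periodic)
    dy = np.roll(P, -1, 1) - P          # P_{i,j+1} - P_{i,j}
    return -np.sum(np.sqrt(np.abs(dx) ** 2 + np.abs(dy) ** 2))

P = I * m * np.exp(2j * chi)
D = np.sqrt(np.abs(np.roll(P, -1, 0) - P) ** 2
            + np.abs(np.roll(P, -1, 1) - P) ** 2)
# each pixel appears in its own summand and in those of its left/lower neighbors
G = ((np.roll(P, -1, 0) + np.roll(P, -1, 1) - 2 * P) / D
     - (P - np.roll(P, 1, 0)) / np.roll(D, 1, 0)
     - (P - np.roll(P, 1, 1)) / np.roll(D, 1, 1))
grad_m = I * np.real(np.exp(2j * chi) * np.conj(G))
grad_chi = -2 * np.imag(P * np.conj(G))

# central-difference check at one arbitrary pixel
eps, i, j = 1e-6, 2, 3
e = np.zeros((n, n)); e[i, j] = eps
num_m = (S_tv(m + e, chi) - S_tv(m - e, chi)) / (2 * eps)
num_chi = (S_tv(m, chi + e) - S_tv(m, chi - e)) / (2 * eps)
```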

APPENDIX D: IMAGE CHANGE OF VARIABLES

To extend the range of the image variables for total intensity and polarization to the entire real line and avoid the use of bounded minimization, we use the change of variables

Equation (45)

Consequently, we need to multiply the gradients given in Appendix C by the factors

Equation (46)
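As a generic illustration of this chain rule (using a simple log transform chosen for this example, not the specific transforms of Equation (45)): with $I={e}^{\xi }$ enforcing positivity, each gradient with respect to I is multiplied by $dI/d\xi ={e}^{\xi }$, which a finite-difference check confirms:

```python
import numpy as np

# Illustrative only: a log change of variables I = exp(xi) maps xi over the
# whole real line to I > 0, removing the need for a bounded minimizer.
def J(I):                        # stand-in objective over the image pixels
    return np.sum((I - 1.5) ** 2)

def grad_J(I):                   # its gradient with respect to I
    return 2 * (I - 1.5)

xi = np.array([0.3, -0.2, 1.1])
I = np.exp(xi)
grad_xi = grad_J(I) * np.exp(xi)  # chain rule: dJ/dxi = (dJ/dI)(dI/dxi)

eps = 1e-7
num = np.array([(J(np.exp(xi + eps * e)) - J(np.exp(xi - eps * e))) / (2 * eps)
                for e in np.eye(3)])
```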
