\documentclass[12pt]{article}
\usepackage[dvips]{graphicx, color}
\setlength{\evensidemargin}{-0.7cm}
\setlength{\oddsidemargin}{-0.8cm}
\setlength{\textwidth}{15.8cm}
\setlength{\topmargin}{-2cm}
\setlength{\textheight}{23.5cm}
\parskip 1ex % White space between paragraphs amount
\title{Design of casa::TablePlot for Casapy}
This note describes the design of the {\tt casa::TablePlot} classes and the control flow through them for data access and plot creation. The use of the Matplotlib plotting package is described, along with a performance analysis to quantify bottlenecks.
Many thanks to S.D.Jaeger, G.Moellenbroek, and K.Golap for finding and fixing bugs, and for helpful suggestions about design changes, during their development of the application-level classes that use {\tt casa::TablePlot}. The work done on this code was funded via a Graduate Student Research Assistantship.
\epsfig{figure=figures/TPFlow.eps,width=14.5cm,angle=0}
\caption{Flow of control through a set of classes that provide 2D plotting functionality
for data stored as casa::Table.
Data is read from casa::Table and derived quantities are computed and plotted (black arrows).
User interaction with the plotted data via queries and editing is accessible from the
command line (maroon arrows) as well as from the plotter GUI (blue arrows). Custom computations
are allowed via optional callback functions defined by the user application (light blue boxes).
The pink and yellow layers at the bottom control the C++ Python binding and plotting
\section{Classes and control flow}
{\tt casa::TablePlot} is a Singleton class that manages plot requests
from multiple applications ({\tt casa::MSPlot}, {\tt casa::PlotCal}, {\tt casac::tableplot}).
It receives input in the form of data Tables and Plot-Options for each plot
and manages multiple panels on the plot window with multiple plot layers per panel.
It maintains and sends commands to a single instance of a plotter GUI, and responds
to user interaction both from the GUI and the command line.
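
The Singleton arrangement described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the {\tt casa::TablePlot} implementation: the class name {\tt TablePlotSketch}, the method {\tt add\_layer}, the panel key, and the file names are all hypothetical.

```python
class TablePlotSketch:
    """Hypothetical stand-in for a singleton plot manager (not the casa API)."""

    _instance = None

    def __new__(cls):
        # Create the single shared instance on first use, then reuse it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.panels = {}
        return cls._instance

    def add_layer(self, panel, table_name, options):
        """Append one plot layer (table plus options) to the given panel."""
        self.panels.setdefault(panel, []).append((table_name, dict(options)))
        return len(self.panels[panel])


# Two "applications" asking for the manager receive the same object,
# so their layers end up on the same panel of the same plot window.
from_msplot = TablePlotSketch()
from_plotcal = TablePlotSketch()
from_msplot.add_layer((1, 1, 1), "data.ms", {"color": "blue"})
n_layers = from_plotcal.add_layer((1, 1, 1), "cal.gtab", {"color": "red"})
```

Because both names refer to one object, a request from either application sees the panels and layers created by the other, which is the property the single plotter GUI relies on.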
\item {\bf Table Access :}
Columns from any kind of casa::Table can be plotted against each other.
Input tables can be Measurement Sets, MS Subtables, Calibration tables, etc., or
reference sub-sets of Tables generated via any Table selection mechanism.
Multiple tables of different types can be opened simultaneously for plotting and interactive
editing. Data sets can be iterated through to generate a series of plots from different subselections. Columns with finite and ordered values can be iterated upon. Iteration rules follow those of the casa::TableIterator class. Iteration can proceed in one direction only: forward.
\item {\bf Expression Evaluation :} Expressions for derived quantities are written using the TaQL\footnote{Aips++ Note 199 : {\tt http://aips2.nrao.edu/stable/docs/notes/199/199.html}} syntax. For data from ArrayColumns, in-row selections within the Array are done via TaQL indices (for example : {\tt \small AMPLITUDE(DATA[1:2,1:10:1])}).
For sample TaQL strings, please refer to Appendix A.
Quantities that cannot be computed purely via TaQL expressions make use of conversion functions supplied by the application in the form of callbacks {\tt casa::TPConvertBase}.
% Derivatives of baseplot
\item {\bf Data extraction : }{\tt casa::BasePlot} is responsible for evaluating TaQL expressions for X and Y axis data, performing conversions on the results, and storing the data in arrays to be plotted. It provides a set of query functions that the plotter class {\tt casa::TPPlotter} will use to access data to be plotted.
Derivatives of {\tt casa::BasePlot} can implement customized data access, but must provide the same view of the final data to be plotted, in the form of the {\tt casa::Array}s and query functions that {\tt casa::TPPlotter} expects. {\tt casa::CrossPlot} is one derivative, used for plots where the X data comes from ArrayColumn Array indices (for example, channel number), and not TaQL string evaluations. Other derivatives can be implemented to read data without using TaQL expressions. Note that the rest of the framework does not depend on how the data is accessed internally by {\tt casa::BasePlot}.
\item {\bf Flag Handling :} Flags are read using the same in-row selection as the data itself, and stored in arrays. {\tt casa::TPPlotter} uses the {\tt casa::BasePlot} query functions to check flags with data and select points for plotting according to on-the-fly selection criteria (only unflagged, only flagged, every $n^{th}$ point, average of $n$ points, points within a certain range of values).
Interactive editing modifies the flags in the {\tt casa::BasePlot} flag storage arrays. {\tt casa::BasePlot} (or a derivative) is then responsible for writing these flags back to the Table, along with any translations or flag-expansions (for plots of averaged values). The {\tt \small FLAG\_ROW} column is always kept equal to the logical {\tt \small AND} of all flags per row in the {\tt \small FLAG} column. The flags are written to disk after every interaction, so that all currently visible plots using the same Table immediately reflect flag changes. Columns to be used for flags can be user specified, and default to {\tt \small FLAG} and {\tt \small FLAG\_ROW}.
\item {\bf Plot Parameters : {\tt casa::PlotOptions}} manages user input. It contains defaults for all parameters, and functions to validate parameters and do error checking. Parameters are divided into those common to all layers on a panel (panel location, size, axis labels...), and those that can vary across layers (plot colour, marker format...), and are stored in a wrapper {\tt casa::PanelParams} class.
For a list of plot parameters and their functions, please refer to Appendix B.
\item {\bf Panel management :} For each panel, {\tt casa::TablePlot} maintains an instance of {\tt casa::PanelParams} and a list of {\tt casa::BasePlot}s. Plots from multiple Tables, and overplots, can generate multiple layers per panel. This class iterates through panels and layers and sends {\tt Vector<casa::BasePlot>, casa::PanelParams} pairs to a single instance of {\tt casa::TPPlotter} for plotting.
\item {\bf Plotting : {\tt casa::TPPlotter}} performs two tasks. First, it receives a list of {\tt casa::BasePlot}s, and queries each of them for the number of plot commands to run, and the number of points per plot. For each plot command, it reads data and flags through {\tt casa::BasePlot} query functions, and assembles the selected points into a plotting-package-specific data format. Then, it reads plot options from the supplied {\tt casa::PanelParams}, and constructs plot commands. Finally, the data and plot commands are combined and sent to the plotting package.
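
The flag bookkeeping and on-the-fly point selection described in the list above can be illustrated with a small, self-contained Python sketch. It is not the {\tt casa::BasePlot} or {\tt casa::TPPlotter} code: the function names {\tt row\_flags} and {\tt select\_points}, their parameters, and the plain nested lists standing in for the {\tt \small FLAG} column are assumptions made only for illustration.

```python
def row_flags(flag):
    """Return one boolean per row: True only if every element of the row is flagged.

    `flag` is a list of rows, each row a list of booleans (a stand-in for the
    per-row FLAG arrays). This mirrors the rule that FLAG_ROW holds the
    logical AND of all flags in the row.
    """
    return [all(row) for row in flag]


def select_points(x, y, flags, show_flagged=False, skip=1):
    """Pick the points to plot (hypothetical helper).

    Keeps unflagged points by default, or only flagged points when
    `show_flagged` is True, then keeps every `skip`-th surviving point.
    """
    kept = [(xi, yi) for xi, yi, f in zip(x, y, flags) if f == show_flagged]
    return kept[::skip]


# Three rows with two flag values each: only the first row is fully flagged.
flag = [[True, True], [True, False], [False, False]]
flag_row = row_flags(flag)

# Six points, two of which are flagged.
x = [0, 1, 2, 3, 4, 5]
y = [10, 11, 12, 13, 14, 15]
point_flags = [False, True, False, False, True, False]
unflagged = select_points(x, y, point_flags)
flagged = select_points(x, y, point_flags, show_flagged=True)
every_second = select_points(x, y, point_flags, skip=2)
```

In this sketch the selection is applied when the points are assembled for plotting, so the stored data and flags are left untouched; changing a selection criterion only requires re-running the selection step.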
\subsection{User Interaction}
The following functions allow the user to interact with any displayed plot. They
are accessible from the command line by directly calling {\tt casa::TablePlot} functions
as well as from buttons on the GUI which internally call the same {\tt casa::TablePlot} functions.
These functions operate on the stored {\tt Vector<casa::BasePlot>, casa::PanelParams} pairs per panel, and trigger refreshed plots, as well as the transfer of flags to and from the Table on disk.
A no-GUI mode of plotting has also been provided. In this mode, the plots are created as described above, but are not rendered onto a plot window. Instead, additional commands can be used to directly save a plot into a file on disk.
\item{\bf Mark Regions}:\\ % mention esc
Rubber-band boxes can be drawn to mark rectangular regions on a plot. The same effect can be achieved by supplying world-coordinate box specifications from the command line.
The matplotlib TkAgg GUI provides buttons for Zoom/Pan modes. (Note that Matplotlib stores a full copy of all data points currently on the plot, in Double precision, to enable Zoom/Pan modes directly from the GUI.)