SAO/NASA ADS -> Help -> Dexter


    5.4 - Dexter

    Dexter is a tool to extract data from figures on scanned pages from our article service. In order to use it, you need a browser that can execute Java Applets and has that feature enabled. Firefox users can verify this by selecting "Edit" -> "Preferences" -> "Content" from the top-bar menu and making sure that the button "Enable Java" is checked.

    Dexter can be quite useful in generating data points from published figures containing images, plots, graphs, and histograms, whenever the original datasets used by the authors to produce figures in the papers are not available electronically.



    5.4.1 - Basic Usage

    In order to run Dexter on a scanned page you need to be viewing the corresponding article in the "classic" view mode, rather than the standard view in which the screen is split in three panels with the thumbnails on the right and the bibliographic information for the article at the top. To view a page in the "classic" mode, simply scroll to the bottom of the top pane containing the article's bibliographic information, and select "Other article options."

    In order to run Dexter on the full-text image currently displayed in your browser, click on the link named "Start DEXTER java applet" towards the bottom of the page. The Dexter Figure Selection Page will appear. Please wait until the image is entirely downloaded.

    To select the figure to be further processed, click on one corner of the figure, hold the mouse button and adjust the rubber band to encompass the entire area of interest. If the figure is larger than the applet area, the visible portion of the page will scroll and follow your mouse pointer. Release the mouse button after you have boxed the selected figure and the Data Extraction Window will automatically pop up, with the selected figure displayed inside. Again, you should wait until the image is fully downloaded, at which point the window will resize to show as much of the image as is possible on your screen. If you make an error during the figure selection and the wrong image area is displayed instead, just close the window and start over.

    Once the Dexter Extraction window is displayed and the selected image area is completely loaded in it, you are ready to set the figure axes and scale and to start marking data points, using the following steps:

    1. Set the horizontal axis: click-and-drag a gauge along the horizontal axis; a red line will be drawn. If you need to extend or modify the line, click on either of its ends and drag again.

    2. Set the vertical axis: click-and-drag a gauge along the vertical axis; a blue line will be drawn. If you need to extend or modify the line, click on either of its ends and drag again.

    3. Set the scale: fill in the boxes marked 'x0', 'x1' and 'y0', 'y1' (located along their respective axes) with the lower and upper values marked by the horizontal and vertical gauges. If the plot has logarithmic axes, make sure to check the "log" check button. Note that you should not check this button if the logarithm of a quantity is plotted on a linear scale.

    4. Mark data points: left-click on a point in the diagram to mark the location of a data point. A green cross will appear at the current mouse location. To remove a point, move the cursor on top of it and shift-click or use the middle mouse button.

    Note: as an alternative to manually marking the gauges, marking data points or tracing lines in the plot, you can attempt to use the "Recognize" menu described below under Advanced Features.
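
    The scale-setting step above amounts to an interpolation between the two ends of each gauge. The following is a minimal Python sketch of that idea, not Dexter's actual code. It assumes that the gauges run exactly along the pixel axes; the function name make_axis_transform and its parameters are hypothetical.

```python
import math


def make_axis_transform(p0, p1, v0, v1, log=False):
    """Return a function mapping a pixel position along one axis to a graph value.

    p0, p1 -- pixel positions of the two gauge ends (hypothetical inputs)
    v0, v1 -- the values typed into the boxes for those two ends
    log    -- True if the axis itself is logarithmic (the "log" check button)
    """
    if p0 == p1:
        raise ValueError("gauge has zero length")
    if log:
        if v0 <= 0 or v1 <= 0:
            raise ValueError("logarithmic axes need positive end values")
        # Interpolate in log10 space, then convert back.
        a, b = math.log10(v0), math.log10(v1)
    else:
        a, b = v0, v1

    def transform(p):
        t = (p - p0) / (p1 - p0)      # 0 at the first gauge end, 1 at the second
        value = a + t * (b - a)
        return 10 ** value if log else value

    return transform


# Horizontal gauge from pixel 100 (x0 = 0) to pixel 500 (x1 = 10), linear axis.
to_x = make_axis_transform(100, 500, 0.0, 10.0)
print(to_x(300))   # 5.0

# Vertical gauge from pixel 400 (y0 = 1) up to pixel 100 (y1 = 100), log axis.
to_y = make_axis_transform(400, 100, 1.0, 100.0, log=True)
print(to_y(250))   # 10.0
```

    The sketch also shows why the "log" button should stay unchecked when the logarithm of a quantity is plotted on a linear scale: in that case the tick values themselves are already spaced linearly, so the plain linear interpolation is the correct one.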

    The Dexter Extraction window contains three different buttons that can be used to view or save the data points marked in the plot. They are located in the upper left of the window as well as under the "File" top-bar menu. You can use any of these buttons as many times as you need during your Dexter session. The buttons are:

    1. Show Data: this option will cause the data points you have extracted so far to be displayed in the text box at the bottom of your screen for your review.

    2. Send Data: this option will cause a new browser window containing your data points to be opened. This will trigger popup blockers, so to retrieve your data with this button you will need to disable them for our site.

    3. Save Data: this option will send the data points back to your browser so that they can be saved in a file. The default output filename that appears in the "Save As" menu can be set ahead of time by modifying the string in the input box labelled "File name" in the lower right corner of the data extraction window.





    5.4.2 - Advanced Features

    Dexter has a number of additional features available to facilitate the data extraction process and improve the accuracy of the results generated.

    5.4.2.1 - Pointer Coordinates

    The coordinates of the mouse pointer are shown in the last line of the data extraction window. The coordinates will be in pixels as long as you have not defined both axes, and in graph coordinates otherwise. If you change one of the text fields, you may have to force a loss of keyboard focus on that text field for the change to become effective for the mouse tracking, e.g. by clicking into the output field, pressing the tab key, setting a data point, or the like. Unfortunately, this behavior is platform dependent, and you will have to try for yourself to see what works on your machine.

    5.4.2.2 - Zoom

    The Zoom feature, located in the pull down menu, allows you to change the size of the displayed images by selecting display resolution from 75dpi to 600dpi. (The default resolution is 200dpi; we recommend changing this to 100dpi for large figures and 300dpi for smaller figures.) Use these buttons if your screen is too small for the entire image or the figure is so small that you cannot mark points with sufficient accuracy. Please note that your Java virtual machine may not have enough memory for high resolution renderings of large images.
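
    To get a feeling for why high resolutions can exhaust the Java virtual machine's memory, the following rough Python sketch estimates how the pixel size of a figure changes with the selected resolution. It is only an illustration: the function names are hypothetical, and the figure of 4 bytes per pixel is an assumption about how the image is held in memory, not a documented property of Dexter.

```python
DEFAULT_DPI = 200      # default display resolution mentioned above
BYTES_PER_PIXEL = 4    # assumption: one 32-bit value per pixel


def scaled_size(width_px, height_px, dpi, base_dpi=DEFAULT_DPI):
    """Pixel size of a figure at `dpi`, given its size at `base_dpi`."""
    if not 75 <= dpi <= 600:
        raise ValueError("resolution must be between 75dpi and 600dpi")
    factor = dpi / base_dpi
    return round(width_px * factor), round(height_px * factor)


def estimated_bytes(width_px, height_px, dpi):
    """Rough memory estimate for the rendered figure at `dpi`."""
    w, h = scaled_size(width_px, height_px, dpi)
    return w * h * BYTES_PER_PIXEL


# A figure that is 800x600 pixels at the default 200dpi:
print(scaled_size(800, 600, 100))       # (400, 300)
print(scaled_size(800, 600, 600))       # (2400, 1800)
print(estimated_bytes(800, 600, 600))   # 17280000
```

    Under these assumptions, going from 200dpi to 600dpi multiplies the number of pixels, and hence the memory needed, by nine.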

    5.4.2.3 - Magnifying Glass

    To the left of the display area, under the vertical axis input boxes, you will see a magnifying glass icon. Clicking on the icon will turn on the magnifying glass window (replacing the icon), which displays a magnified image area located around your pointer. Clicking on the window a second time will turn the magnifying glass off. The magnifying glass is very useful when marking axes or data points since it allows you to view the local image area in great detail, and therefore increases the accuracy of your marks.

    By default the magnifying glass window is turned off because some (faulty) Java run-time libraries cause the applet to slow to a crawl when it is turned on or cause other funny side-effects. The magnifying glass currently does not immediately show temporary features (axes and error bars during their creation). You may have to move the mouse to update the display after you set a point in order to see the point displayed in the magnifying glass window.

    5.4.2.4 - Error Bars

    After you have defined the axes and marked your data points, you can set error bars by clicking and dragging on an existing data point. Error bars are always parallel to the axes at the time of their creation. On output, the error bars are shown in up to four additional columns, with horizontal error bars appearing first if both vertical and horizontal ones were marked; plus and minus errors are indicated by the applicable sign. If you miss the data point, Dexter starts to draw another axis gauge. To avoid clobbering your previous gauge, drop the new gauge when its length is below Dexter's lower limit on gauge length (about 30 pixels). To entirely remove (as opposed to resize) the error bars, delete the data point and set a new one.
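
    The column layout described above can be illustrated with a small Python sketch. This is a hypothetical formatter, not Dexter's output routine: the space separator, the number formatting, and the choice to print the plus error before the minus error are all assumptions made for the example. Only the ordering (horizontal before vertical) and the use of signs come from the description above.

```python
def format_row(x, y, x_err=None, y_err=None):
    """Format one data point with optional error bars.

    x_err, y_err -- optional (plus, minus) tuples of positive magnitudes.
    Horizontal errors come first, then vertical ones; each bar adds two
    signed columns, so a point has between two and six columns.
    """
    cols = [f"{x:g}", f"{y:g}"]
    for err in (x_err, y_err):
        if err is not None:
            plus, minus = err
            cols.append(f"{plus:+g}")     # plus error, written with a + sign
            cols.append(f"{-minus:+g}")   # minus error, written with a - sign
    return " ".join(cols)


print(format_row(1.5, 2.0))                                  # 1.5 2
print(format_row(1.5, 2.0, y_err=(0.25, 0.5)))               # 1.5 2 +0.25 -0.5
print(format_row(1, 2, x_err=(0.1, 0.2), y_err=(0.3, 0.4)))  # 1 2 +0.1 -0.2 +0.3 -0.4
```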

    5.4.2.5 - Recognizers

    You can ask Dexter to do some of the marking of data points automatically. Currently, three operations are supported: axis finding, line tracing, and point marking. In general, teaching a computer to recognize features is a highly nontrivial task, so you should not expect miracles. For many plots, though, Dexter's recognizers might save you some time.

    All recognizers run in separate threads, so that you can continue working with Dexter. While they are running, certain operations are not allowed (e.g., changing resolution, sending data), and the respective buttons or menu entries are greyed out. As long as a recognizer is running, the notice "Recog. running" will be shown in Dexter's status line.

    Some parameters used by the recognizers can be changed using the "Recognize" -> "Settings" dialog. The settings become effective as you change them.

    Because the recognizers may set a large number of points, there is a menu entry named "Delete all Points" in the "Recognize" pull-down menu. Selecting this really deletes all points set, not only those set by a recognizer. You may want to use that feature while experimenting with various parameters for the recognizers.

    The run time of recognizers with current Java implementations is mainly a function of the image size, since they spend most of their time transforming the image into something that is accessible by the program. Also, this conversion requires significant amounts of memory. If a recognizer runs out of memory, it may just stop working (though in general we try to catch these cases and warn the user). Reducing the resolution might help. To stop a running recognizer, select "Stop Recognizer" in the "Recognize" pull-down menu.

    The recognizers currently available are:

    Axis Finding
    to make Dexter identify the gauges on the axes, select "Recognize" -> "Automatic Axes" or type Control-A. Note that no gauge will be set if Dexter cannot find any ticks on an axis.

    Line Tracing
    this recognizer follows a line and leaves points along it where it thinks something "interesting" happens or at intervals configurable in the Settings dialog. When you start the line tracer either by selecting "Recognize" -> "Trace a line" or hitting Control-T, the mouse pointer will change to a pointing finger. Click on the line you want Dexter to trace, and after a little time you should see points marked along the line. At junctions of lines, the tracer may become confused and take the wrong way, or it may lose the line at certain extreme points. In these cases it may help to give the line tracer a different start point or to let it run more than once.

    Point Finding
    this recognizer tries to register points similar to a user-defined template within a graph. After you start it by either selecting "Recognize" -> "Find Points" or typing Control-B, the recognizer will ask you to click on a template point. If there are error bars in the graph, either select a template without error bars or one with large error bars (i.e., error bars larger than the marker itself), or Dexter will add the error bars into the template and will have a hard time finding anything similar to the template. The distance measure used by the point finder is the number of matching points normalized by the total number of points in the template area, where two points match if they are both black or both white after thresholding. You can change the maximal acceptable difference between a template and an acceptable data point in the settings dialog. For poor scans, it may be necessary to raise this value. The point finder will not accept templates smaller than 4x4 pixels. If you run into this limitation, Dexter will complain that it cannot find the start object; try increasing the resolution of the displayed image (via the Zoom menu) and rerun the recognizer.
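
    The matching measure described above can be sketched in a few lines of Python. This is an illustration of the idea only, not Dexter's implementation: the function names, the grey-level cutoff of 128, and the default value of max_difference are assumptions. Images are represented as lists of rows, with True standing for a black pixel after thresholding.

```python
def threshold(gray, cutoff=128):
    """Turn grey levels (0-255) into booleans: True means black."""
    return [[pixel < cutoff for pixel in row] for row in gray]


def match_fraction(template, image, top, left):
    """Fraction of template pixels that agree with the image at (top, left).

    Two pixels match if they are both black or both white.
    """
    rows, cols = len(template), len(template[0])
    matches = sum(
        template[r][c] == image[top + r][left + c]
        for r in range(rows)
        for c in range(cols)
    )
    return matches / (rows * cols)


def find_points(template, image, max_difference=0.1):
    """Return (top, left) positions where the template fits well enough."""
    rows, cols = len(template), len(template[0])
    if rows < 4 or cols < 4:
        raise ValueError("template must be at least 4x4 pixels")
    hits = []
    for top in range(len(image) - rows + 1):
        for left in range(len(image[0]) - cols + 1):
            difference = 1.0 - match_fraction(template, image, top, left)
            if difference <= max_difference:
                hits.append((top, left))
    return hits
```

    In this sketch, raising max_difference corresponds to raising the maximal acceptable difference in the settings dialog: more candidate positions are accepted, which helps with poor scans at the price of more false hits. It also suggests why error bars in the template hurt: their pixels count towards the template area, so markers without matching error bars lose matching points.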

    5.4.2.6 - Additional Information

    Dexter has shown itself to be a very useful tool to people interested in obtaining numerical data from scanned scientific papers. The applet was developed for use within the ADS article service, but it can easily be adapted to work in other services. In early 2008 it was integrated in the HTML version of fulltext articles appearing in the journal Astronomy & Astrophysics. A standalone version of the applet capable of working on both individual images and PDF files has recently become available on the German VO (GAVO) Data Center web site. Dexter's source code is available under the GNU General Public License at https://dexter.sourceforge.net

    A paper describing Dexter's software architecture and showing some screenshots of it is available online.





    Modern browsers usually have Java implementations that are good enough for Dexter. If in doubt, we recommend Sun's Java plugin, which is available for many popular platforms.

    One known failure is that you cannot use the "Save data" button on many versions of Internet Explorer without trusting our site. The symptom is that IE asks you if you want to allow the download. Even if you answer yes, you will not get the data, because IE has already consumed it. We do not store your data on our machine, so we cannot deliver it a second time. To work around this problem, either use the "Send data" button together with IE's save function, or use a different browser.




