
SAO/NASA ADS Astronomy Abstract Service


· arXiv e-print (arXiv:0809.3373)
Title:
Finding rare objects and building pure samples: probabilistic quasar classification from low-resolution Gaia spectra
Authors:
Bailer-Jones, C. A. L.; Smith, K. W.; Tiede, C.; Sordo, R.; Vallenari, A.
Affiliation:
AA(Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany), AB(Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany), AC(Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany), AD(INAF, Osservatorio di Padova, Vicolo Osservatorio 5, 35122 Padova, Italy), AE(INAF, Osservatorio di Padova, Vicolo Osservatorio 5, 35122 Padova, Italy)
Publication:
Monthly Notices of the Royal Astronomical Society, Volume 391, Issue 4, pp. 1838-1853.
Publication Date:
12/2008
Origin:
MNRAS
MNRAS Keywords:
methods: data analysis, methods: statistical, surveys, quasars: general
DOI:
10.1111/j.1365-2966.2008.13983.x
Bibliographic Code:
2008MNRAS.391.1838B

Abstract

We develop and demonstrate a probabilistic method for classifying rare objects in surveys with the particular goal of building very pure samples. It works by modifying the output probabilities from a classifier so as to accommodate our expectation (priors) concerning the relative frequencies of different classes of objects. We demonstrate our method using the Discrete Source Classifier (DSC), a supervised classifier currently based on support vector machines, which we are developing in preparation for the Gaia data analysis. DSC classifies objects using their very low resolution optical spectra. We look in detail at the problem of quasar classification, because identification of a pure quasar sample is necessary to define the Gaia astrometric reference frame. By varying a posterior probability threshold in DSC, we can trade off sample completeness and contamination. We show, using our simulated data, that it is possible to achieve a pure sample of quasars (upper limit on contamination of 1 in 40000) with a completeness of 65 per cent at magnitudes of G = 18.5, and 50 per cent at G = 20.0, even when quasars have a frequency of only 1 in every 2000 objects. The star sample completeness is simultaneously 99 per cent with a contamination of 0.7 per cent. Including parallax and proper motion in the classifier barely changes the results. We further show that not accounting for class priors in the target population leads to serious misclassifications and poor predictions for sample completeness and contamination. We discuss how a classification model prior may, or may not, be influenced by the class distribution in the training data. Our method controls this prior and so allows a single model to be applied to any target population without having to tune the training data and retrain the model.
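
The prior adjustment and the threshold trade-off described in the abstract can be illustrated with a minimal sketch. It is not the DSC implementation: it assumes the common approach of dividing each class probability by the prior implicit in the classifier (here called `model_prior`) and multiplying by the prior expected in the target population (`target_prior`), then renormalising. The function names, the two-class star/quasar layout, and the toy probabilities are all hypothetical; only the quasar frequency of 1 in 2000 is taken from the abstract.

```python
import numpy as np


def adjust_posteriors(probs, model_prior, target_prior):
    """Re-weight classifier output probabilities to a new class prior.

    probs        : (n_objects, n_classes) posterior probabilities from the classifier
    model_prior  : (n_classes,) class prior assumed to be implicit in the classifier
    target_prior : (n_classes,) class prior expected in the target population
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(target_prior, dtype=float) / np.asarray(model_prior, dtype=float)
    reweighted = probs * weights
    # Renormalise so each object's class probabilities sum to one again.
    return reweighted / reweighted.sum(axis=1, keepdims=True)


def completeness_contamination(p_class, is_class, threshold):
    """Completeness and contamination of the sample selected with p_class >= threshold."""
    p_class = np.asarray(p_class, dtype=float)
    is_class = np.asarray(is_class, dtype=bool)
    selected = p_class >= threshold
    n_true = is_class.sum()
    n_selected = selected.sum()
    completeness = (selected & is_class).sum() / n_true if n_true else float("nan")
    contamination = (selected & ~is_class).sum() / n_selected if n_selected else 0.0
    return completeness, contamination


# Toy example (made-up numbers). Columns are [star, quasar].
probs = np.array([[0.90, 0.10],
                  [0.50, 0.50],
                  [0.01, 0.99]])
model_prior = [0.5, 0.5]                  # assumption: classifier behaves as if classes were balanced
target_prior = [1999 / 2000, 1 / 2000]    # quasars are 1 in every 2000 objects

adjusted = adjust_posteriors(probs, model_prior, target_prior)
truth_is_quasar = [False, False, True]
completeness, contamination = completeness_contamination(
    adjusted[:, 1], truth_is_quasar, threshold=0.01
)
```

In this sketch, an object the classifier rates as equally likely to be a star or a quasar ends up with a quasar probability equal to the target prior. Raising `threshold` trades completeness for lower contamination, which is the lever the abstract describes for building a pure sample.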