Fitting Algorithm




Non-linear model fitting can be approached in two ways. One involves a gradient method, such as the Levenberg-Marquardt method, which uses the gradient and a second-derivative matrix to quickly find the local minimum nearest to the starting point. The other involves searching for the absolute minimum by taking steps with Monte-Carlo generated sizes and directions through parameter space. A variant of this approach called the Metropolis algorithm (Saha and Williams 1994; Press et al. 1986) was used. It Monte-Carlo samples parameter space with a sampling density proportional to the likelihood, and it is therefore less likely to become trapped in the first local minimum it encounters. Confidence intervals can be calculated directly at the same time as the algorithm is searching for the best parameter values.

Let $P(D|M(\vec{a}))$ be the probability distribution that a set of data $D$ will be collected given a model $M$ with a set of parameter values $\vec{a}$. $P(D|M(\vec{a}))$ is not in itself interesting because the parameter values are not known. Using Bayes' theorem, it is possible to invert $P(D|M(\vec{a}))$ to get $P(M(\vec{a})|D)$, the probability distribution that the parameter values $\vec{a}$ will be observed given a model and a set of data. This distribution may be used to define the best parameter values and suitable confidence intervals on those values. Bayes' theorem here can be expressed as:

\begin{equation}
P(M(\vec{a})|D) = \frac{P(M(\vec{a}))\,P(D|M(\vec{a}))}{P(D)}
\end{equation}

$P(M(\vec{a}))$ is called the prior probability of the parameters, and it represents whatever is known (if anything) about the parameter values before taking any data. $P(D|M(\vec{a}))$ is called the likelihood, $P(M(\vec{a})|D)$ is called the posterior probability distribution, and $P(D)$ is just a normalizing factor called the global likelihood. Equation 1 simply states that $P(M(\vec{a})|D)$ is proportional to the likelihood (assuming a uniform prior). For Gaussian noise, the likelihood that the residuals between the $N$ measurements $D_i$ and the model predictions are consistent with pure noise is given by

\begin{equation}
P(D|M(\vec{a})) = \prod_{i=1}^{N} \frac{1}{\sigma_i\sqrt{2\pi}}\,
\exp\left[-\frac{(D_i - M_i(\vec{a}))^2}{2\sigma_i^2}\right]
\end{equation}

The $M_i(\vec{a})$'s are the predictions of the model $M$ for the set of parameter values $\vec{a}$. Carrying out the product in Equation 2, $P(D|M(\vec{a}))$ becomes

\begin{equation}
P(D|M(\vec{a})) = \exp\left[-\sum_{i=1}^{N}\ln\!\left(\sigma_i\sqrt{2\pi}\right)
- \sum_{i=1}^{N}\frac{(D_i - M_i(\vec{a}))^2}{2\sigma_i^2}\right]
\end{equation}

In read-out and sky noise limited CCD images, the photon noise due to the galaxy signal is negligible, and the $\sigma_i$'s are all equal to the background noise $\sigma_b$, so $P(D|M(\vec{a})) \propto \exp(-\chi^2/2)$ with $\chi^2 = \sum_{i=1}^{N}(D_i - M_i(\vec{a}))^2/\sigma_b^2$. The background noise $\sigma_b$ was determined for each spectrum by computing image statistics in four regions at the corners of the spectrum.
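In this background-limited case, the likelihood evaluation reduces to a single $\chi^2$ sum. The following is a minimal Python sketch of that step; the function names, the corner box size, and the use of NumPy are assumptions for illustration, not part of the original implementation.

```python
import numpy as np

def background_sigma(image, box=20):
    # Hypothetical helper: estimate the background noise from image
    # statistics in four corner regions (the box size is an assumption).
    corners = [image[:box, :box], image[:box, -box:],
               image[-box:, :box], image[-box:, -box:]]
    return np.mean([c.std() for c in corners])

def log_likelihood(data, model, sigma_b):
    # ln P(D|M(a)) up to an additive constant: -chi^2 / 2,
    # with every sigma_i equal to the background noise sigma_b.
    chi2 = np.sum((data - model) ** 2) / sigma_b ** 2
    return -0.5 * chi2
```

Working with $\ln P$ rather than $P$ itself avoids numerical underflow when $\chi^2$ is large.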

The Metropolis algorithm starts with some set of parameter values $\vec{a}_0$ and the associated posterior probability $P_0 \equiv P(M(\vec{a}_0)|D)$. It then picks a possible change $\Delta\vec{a}$ in the parameters and computes $P_1 \equiv P(M(\vec{a}_0 + \Delta\vec{a})|D)$. If $P_1 \geq P_0$, then the change is accepted. If $P_1 < P_0$, then the change is accepted only some fraction $P_1/P_0$ of the time. After hundreds of such iterations, the distribution of accepted values of $\vec{a}$ will converge to $P(M(\vec{a})|D)$, provided that all possible $\vec{a}$ are eventually accessible. Parameter space is thus sampled with a density proportional to the likelihood. The trial changes are chosen at random. If they are too small (i.e. the parameter search is too ``cold''), then nearly all iterations will accept changes. On the other hand, if the search is too ``hot'', then almost none of the iterations will accept changes. It is usual to regulate the size of the steps (i.e. the temperature of the search) so that about half the iterations accept changes.
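As a concrete illustration of the acceptance rule, here is a sketch of a single Metropolis iteration in Python (the function names and the log-probability formulation are assumptions; the comparison $\exp(\ln P_1 - \ln P_0)$ is just $P_1/P_0$ evaluated stably):

```python
import numpy as np

def metropolis_step(a, log_p, propose, log_posterior, rng):
    # Propose a trial change and evaluate its posterior probability.
    a_trial = propose(a, rng)
    log_p_trial = log_posterior(a_trial)
    # Accept if the posterior increased; otherwise accept with
    # probability P1/P0 = exp(log_p_trial - log_p).
    if log_p_trial >= log_p or rng.random() < np.exp(log_p_trial - log_p):
        return a_trial, log_p_trial, True   # change accepted
    return a, log_p, False                  # change rejected
```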

Temperature control is thus an important element of the Metropolis algorithm. The temperature control scheme must allow the search to sample the largest possible volume of parameter space with the lowest possible number of steps. The current implementation of the Metropolis starts with an $n$-dimensional volume centered on $\vec{a}_0$ and given by $V_0 = \prod_{j=1}^{n} T_{j,0}$, where $T_{j,0}$ is the initial temperature on the $j$-th parameter. Trial parameter values are generated using $a_j = a_{j,0} + r\,T_j$, where $r$ is a random number between $-1$ and $1$. If the first lower value of $\chi^2$ is found $N_1$ iterations after the beginning of the search, then one would expect that this region would have a characteristic size $V_1 = V_0/N_1$. In order to match the search temperature to the size of that lower region, the temperature of each parameter is then reduced according to $T_{j,1} = T_{j,0}\,N_1^{-1/n}$. After this initial cooling, the search continues. The ratio of the number of accepted changes to the number of trials is computed after every ten trials. If fewer than half the changes have been accepted, then the parameter temperatures are all decreased by 30%. If not, then the parameter temperatures are all increased by 30%.
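The proposal, cooling, and step-regulation rules described above might be sketched as follows; this is a hypothetical Python rendering under the assumptions stated (vectorized per-parameter temperatures, invented function names). Note that shrinking each of the $n$ temperatures by $N_1^{-1/n}$ shrinks the product $\prod_j T_j$, and hence the search volume, by exactly $N_1$.

```python
import numpy as np

def propose(a, T, rng):
    # Trial values a_j + r * T_j with r uniform in [-1, 1].
    r = rng.uniform(-1.0, 1.0, size=a.shape)
    return a + r * T

def initial_cooling(T, n_first):
    # First lower chi^2 found after n_first iterations: shrink the
    # search volume prod(T_j) by n_first, i.e. each T_j by n_first**(-1/n).
    n = len(T)
    return T * n_first ** (-1.0 / n)

def regulate(T, n_accepted, n_trials=10):
    # Every ten trials: cool by 30% if fewer than half the changes
    # were accepted (search too hot), otherwise heat by 30%.
    return T * (0.7 if n_accepted < n_trials / 2 else 1.3)
```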

The best parameter values were chosen to be the median $\vec{a}$'s of $P(M(\vec{a})|D)$. These median values were obtained by simply sorting the accepted values and taking the 0.50 point of the resulting distribution. The 0.16 and 0.84 points were used as the lower and upper bounds of the 68% confidence intervals.
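Since the accepted $\vec{a}$'s sample the posterior directly, the best values and confidence intervals follow from simple order statistics on the chain. A sketch (NumPy-based, hypothetical names; sorting along axis 0 gives the marginal distribution of each parameter):

```python
import numpy as np

def summarize(accepted):
    # accepted: array of shape (n_samples, n_params) of accepted a's.
    s = np.sort(np.asarray(accepted), axis=0)
    n = len(s)
    median = s[int(0.50 * n)]   # best parameter values
    lower  = s[int(0.16 * n)]   # lower bound of 68% interval
    upper  = s[int(0.84 * n)]   # upper bound of 68% interval
    return median, lower, upper
```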




