The code presented in this post category is aimed at teaching myself how to use Bayesian approaches for various problems. In this particular post, the idea is to find the best circle fitting some observed points.
I started by using an approach assuming that the noise was only along the radius direction for each point. However, I found a paper from Marcus Baum related to the same problem ("A Novel Bayesian Method for Fitting a Circle to Noisy Points", available here) in which each measurement is assumed to have an isotropic Gaussian noise, and I later updated my code to also use a similar assumption.
If you are looking for a serious Bayesian method to fit a circle to noisy points, please read Marcus Baum's paper!
1 Helper function: generate random points from a circle
Let's write a helper function to produce random points sampled from a circle, with some isotropic Gaussian noise around each point:
1 2 3 4 5 6 7 | randomCircle = function(n, x, y, radius, sd) {
angles = runif(n, 0, 2 * pi)
xPoints = radius * cos(angles) + x + rnorm(n, mean = 0, sd = sd)
yPoints = radius * sin(angles) + y + rnorm(n, mean = 0, sd = sd)
return(data.frame(x = xPoints,
y = yPoints))
}
|
Let's also write a simple function to draw circles:
1 2 3 4 5 6 | drawCircle = function(x0, y0, radius, nSteps = 48, col = "black", ...) {
gridAngles = seq(0, 2 * pi, length.out = nSteps + 1)[1:nSteps]
gridX = x0 + cos(gridAngles) * radius
gridY = y0 + sin(gridAngles) * radius
polygon(gridX, gridY, border = col, ...)
}
|
And now some testing:
1 2 3 4 5 | set.seed(4)
pts = randomCircle(100, 0, 0, 1, 0.1)
plot(pts, asp = 1, pch = 19, col = "red")
pts2 = randomCircle(100, 0, 0, 1, 0.02)
points(pts2, pch = 19, col = "cornflowerblue")
|
2 Likelihood of a circle for one given point
2.1 Log-likehood for one observed point
We need to be able to calculate the likelihood of a candidate circle (i.e. a center location and a radius) for a given, known point.
We assume that the observed point is distributed with an isotropic Gaussian noise around its true location, with a candidate standard deviation. The desired likelihood of the candidate circle and the candidate standard deviation can be calculated by summing over all the positions on the circle the likelihood of observing the given point under a Gaussian noise with the candidate standard deviation.
The following code calculates this (log-)likelihood by performing an approximate integration along the circle:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | #' Calculate the log-likelihood of one circle (center and radius) for one
#' point with isotropic Gaussian noise around it
#'
#' See https://en.wikipedia.org/wiki/Gaussian_function#Two-dimensional_Gaussian_function
#'
#' @param x x coordinate of the observed point
#' @param y y coordinate of the observed point
#' @param sd Spread of the Gaussian noise (sd = sigma_x = sigma_y)
#' @param x0 x coordinate of circle center
#' @param x0 y coordinate of circle center
#' @param radius Radius of the circle
#' @param nSteps Number of steps used for approximate integration along the
#' circle
#'
loglikOnePoint = function(x, y, sd, x0, y0, radius, nSteps = 24) {
gridAngles = seq(0, 2 * pi, length.out = nSteps + 1)[1:nSteps]
gridX = x0 + cos(gridAngles) * radius
gridY = y0 + sin(gridAngles) * radius
lik = 0
A = 1 / (2 * pi * sd^2) # Cf. volume under two-dimensional Gaussian
# function on Wikipedia page
for (i in seq_along(gridX)) {
dsquare = (x-gridX[i])^2 + (y-gridY[i])^2
lik = lik + A * exp(- dsquare / (2*sd^2)) * 2 * pi * radius / nSteps
}
return(log(lik))
}
|
2.2 Log-likelihood for a set of observed points
The above function can be used to calculate the log-likelihood of a candidate circle and of a candidate standard deviation for a set of observed points:
1 2 3 4 5 6 7 8 9 | loglikCircle = function(x, y, sd, x0, y0, radius, nSteps = 24) {
#' x and y are vectors
#'
loglik = 0
for (i in seq_along(x)) {
loglik = loglik + loglikOnePoint(x[i], y[i], sd, x0, y0, radius, nSteps)
}
return(loglik)
}
|
where x
and y
are vectors with the coordinates of the observed points, and
the other arguments are as above.
2.3 Example
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | set.seed(6)
# True parameters: center at (0, 0), radius = 10 and sd = 1
c = randomCircle(20, 0, 0, 10, 1)
# Candidate parameters: center at (1, 1), radius = 5, sd = 2
loglikCircle(c$x, c$y, 2, 1, 1, 5)
## [1] -102.2252
# Candidate parameters: center at (0, 0), radius = 5, sd = 2
loglikCircle(c$x, c$y, 2, 0, 0, 5)
## [1] -98.96428
# Candidate parameters: center at (0, 0), radius = 9, sd = 2
loglikCircle(c$x, c$y, 2, 0, 0, 9)
## [1] -37.41387
# Candidate parameters: center at (0, 0), radius = 9, sd = 1.2
loglikCircle(c$x, c$y, 1.2, 0, 0, 9)
## [1] -35.41297
|
3 MCMC algorithm
We use a Markov-chain Monte Carlo approach to generate a sample of the posterior distribution of the parameters (circle center, radius and standard deviation of Gaussian noise).
3.1 Generate the "observed" data
This is the "observed" data we are going to use as an input for our Bayesian modelling (the blue circle is the true circle used to generate the data):
1 2 3 4 | set.seed(4)
circle = randomCircle(20, 4, 6, 12, 2)
plot(circle, asp = 1, bty = "n", las = 1, pch = 19, col = "pink")
drawCircle(4, 6, 12, col = "cornflowerblue", lwd = 2)
|
3.2 Initial parameters values and proposal distributions
There are four parameters in the model:
x0
andy0
, coordinates of the circle centerradius
, the radius of the circlesd
, the standard deviation of the Gaussian noise
3.2.1 Initial values
1 2 3 4 | x0_init = function() { runif(1, -20, 20) }
y0_init = function() { runif(1, -20, 20) }
radius_init = function() { runif(1, 0, 20) }
sd_init = function() { runif(1, 0, 10) }
|
3.2.2 Proposal distributions
1 2 3 4 5 6 7 8 9 | library(EnvStats)
x0_prop = function(x0) { rnorm(1, x0, 1) }
y0_prop = function(y0) { rnorm(1, y0, 1) }
radius_prop = function(radius) { EnvStats::rgammaAlt(1, radius, 0.2) }
radius_ratio = function(proposed, current) {
EnvStats::dgammaAlt(current, proposed, 0.2)/EnvStats::dgammaAlt(proposed, current, 0.2) }
sd_prop = function(sd) { EnvStats::rgammaAlt(1, sd, 0.2) }
sd_ratio = function(proposed, current) {
EnvStats::dgammaAlt(current, proposed, 0.2)/EnvStats::dgammaAlt(proposed, current, 0.2) }
|
3.3 MCMC run
Parameters for the MCMC run:
1 2 3 | nSamples = 100 # Number of samples taken
nThin = 5 # Thinning between two samples
set.seed(4)
|
3.3.1 Initialize the chains
We store the four chains (for x0
, y0
, radius
and sd
) into a list. Each
chain is initialized as a vector with length = nSamples
, and the first value is
generated by the initializing functions:
1 2 3 4 5 6 7 8 | chains = list()
for (c in c("x0", "y0", "radius", "sd")) {
chains[[c]] = vector(length = nSamples)
}
chains[["x0"]][1] = x0_init()
chains[["y0"]][1] = y0_init()
chains[["radius"]][1] = radius_init()
chains[["sd"]][1] = sd_init()
|
3.3.2 Actual MCMC run
For each iteration, we loop over the four parameters we are sampling. For each parameter, we generate a new candidate value from the proposal distribution, and we accept this value (or keep the previous one) depending on the likelihood of the proposed parameters compared to the previous likelihood.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | for (i in 2:nSamples) {
# Get the current parameter values
currentX0 = chains[["x0"]][i-1]
currentY0 = chains[["y0"]][i-1]
currentRadius = chains[["radius"]][i-1]
currentSd = chains[["sd"]][i-1]
# Iterate for nThin iterations before recording the new parameter values
for (j in 1:nThin) {
# x0
proposed = x0_prop(currentX0)
prob = exp(loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = proposed, y0 = currentY0,
radius = currentRadius) -
loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = currentX0, y0 = currentY0,
radius = currentRadius))
if (runif(1) < prob) {
currentX0 = proposed
}
# y0
proposed = y0_prop(currentY0)
prob = exp(loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = currentX0, y0 = proposed,
radius = currentRadius) -
loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = currentX0, y0 = currentY0,
radius = currentRadius))
if (runif(1) < prob) {
currentY0 = proposed
}
# Radius
proposed = radius_prop(currentRadius)
prob = (exp(loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = currentX0, y0 = currentY0,
radius = proposed) -
loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = currentX0, y0 = currentY0,
radius = currentRadius)) *
radius_ratio(proposed, currentRadius))
if (runif(1) < prob) {
currentRadius = proposed
}
# sd
proposed = sd_prop(currentSd)
prob = (exp(loglikCircle(circle$x, circle$y, sd = proposed,
x0 = currentX0, y0 = currentY0,
radius = currentRadius) -
loglikCircle(circle$x, circle$y, sd = currentSd,
x0 = currentX0, y0 = currentY0,
radius = currentRadius)) *
sd_ratio(proposed, currentSd))
if (runif(1) < prob) {
currentSd = proposed
}
}
# Update the chains
chains[["x0"]][i] = currentX0
chains[["y0"]][i] = currentY0
chains[["radius"]][i] = currentRadius
chains[["sd"]][i] = currentSd
}
|
3.4 Results
The following graph shows the trajectory of the center during the MCMC iterations (grey line) and the corresponding circles (light blue). The true circle and the observed data are in red, the true center is at the intersection of the vertical and horizontal lines.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | xlim = c(-20, 40)
ylim = c(-35, 25)
plot(0, type = "n", xlim = xlim, ylim = ylim, asp = 1,
las = 1, bty = "n")
abline(h = 6, v = 4, col = "grey")
lines(chains[["x0"]], chains[["y0"]],
col = adjustcolor("black", alpha.f = 0.3))
for (i in 1:nSamples) {
drawCircle(chains[["x0"]][i], chains[["y0"]][i], chains[["radius"]][i],
col = adjustcolor("cornflowerblue", alpha.f = 0.1))
}
points(chains[["x0"]], chains[["y0"]], pch = 19, cex = 0.4,
col = adjustcolor("cornflowerblue", alpha.f = 0.5))
points(circle, pch = 19, col = "red")
drawCircle(4, 6, 12, col = "red", lwd = 3)
|
We can see how the candidate circles in the MCMC chain converge close to the true circle quite rapidly. In this case there were quite many observed points to use for the MCMC run, examples below show how the method performs when less data points are available or when the noise is greater.
4 Other examples
4.1 Increased Gaussian noise
Here are the results for the same circle but with a standard deviation of 4 instead of 2 for the Gaussian noise:
4.2 Reduced observed dataset
In the following run the standard deviation for the Gaussian noise is back to 2, but only 10 points instead of 20 are observed:
And in this case, only 5 points are used: