David Fletcher
Department of Mathematics and Statistics
Date: Thursday 10 October 2013
POSTPONED
Title and abstract to follow
A near-Gibbs sampler for posterior exploration in inverse equilibrium problems
Colin Fox
Department of Physics
Date: Thursday 3 October 2013
The standard Gibbs sampler (a.k.a. Glauber dynamics and local heat-bath thermalization) is fundamentally equivalent to Gauss-Seidel iteration, when applied to Gaussian-like target distributions. This explains the slow (geometric) convergence of the Gibbs sampler, but also indicates how to accelerate it using polynomials.
The potential to accelerate prompts our interest in the Gibbs sampler for an application of capacitance tomography to bulk-flow monitoring in industrial processes. We have been able to build a near-analytic Gibbs sampler in the broader class of inverse equilibrium problems by utilizing the graph-theoretic construction of circuit theory.
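As a rough illustration of the Gauss-Seidel connection (a sketch only, not the circuit-theoretic construction from the talk; the small precision matrix below is invented), the conditional-mean part of a component-wise Gibbs sweep for a Gaussian target N(0, Q^-1) is exactly a Gauss-Seidel sweep on Q x = 0:

  # One Gibbs sweep for x ~ N(0, Q^{-1}); with the noise switched off it reduces
  # to a Gauss-Seidel sweep on Q x = 0 (Q is an invented 3 x 3 precision matrix).
  Q <- matrix(c(4, 1, 0,
                1, 4, 1,
                0, 1, 4), nrow = 3, byrow = TRUE)
  gibbs_sweep <- function(x, Q, noise = TRUE) {
    for (i in seq_along(x)) {
      m <- -sum(Q[i, -i] * x[-i]) / Q[i, i]        # conditional mean = Gauss-Seidel update
      x[i] <- if (noise) rnorm(1, m, sqrt(1 / Q[i, i])) else m
    }
    x
  }
  set.seed(1)
  gibbs_sweep(c(5, -3, 2), Q, noise = FALSE)       # identical to one Gauss-Seidel sweep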
Active earth processes: Geodynamic studies in New Zealand using geodetic techniques
Paul Denys
School of Surveying
Date: Thursday 26 September 2013
This presentation gives an overview of the geodetic studies of geodynamic processes in New Zealand currently being undertaken by the School of Surveying. Primarily using Global Navigation Satellite Systems (GNSS) methods, the studies include regional earth deformation for seismic hazard research, the Southern Alps uplift experiment, sea level rise, and coseismic and post-seismic deformation.
While the position (coordinate) time series analyses and linear regression models used are well defined, the stochastic models (white noise, flicker noise, random walk) are not so well understood or as easily implemented. Another issue concerns optimal methods for detecting deformation transients in the event of slow slip (or slow earthquake) events. This has implications for real-time Network RTK systems used by the surveying industry.
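As a minimal illustration of why the stochastic model matters (invented numbers, not the School of Surveying data), a site velocity estimated by ordinary least squares looks far more precise than it is when the residuals contain a random-walk component:

  # Ten years of daily positions: linear velocity + white noise + random walk (all invented).
  set.seed(1)
  t <- (1:3650) / 365.25                          # time in years
  y <- 5 * t + rnorm(3650, sd = 2) + cumsum(rnorm(3650, sd = 0.05))   # position in mm
  fit <- lm(y ~ t)
  coef(fit)["t"]                                  # estimated velocity (mm/yr)
  summary(fit)$coefficients["t", "Std. Error"]    # naive standard error, too small under correlated noise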
A stats-related seminar. Using statistics: why do we do what we do?
Steve Dawson
Department of Marine Science
Date: Tuesday 17 September 2013
NOTE DAY, TIME AND VENUE
When we do a statistical test, the null hypothesis being tested is almost always that there is no difference between our samples. Among statisticians, the value of this approach has been debated for more than 70 years. Over the last decade, these concerns have been gathering momentum in ecology. A sea change is happening.
As scientists, we should constantly question why we do what we do. It's a good time for us to think about our use of statistics, and to consider whether we should do it differently.
Certainty of origin in forensic applications?
Jurian Hoogewerff
Department of Chemistry
Date: Thursday 12 September 2013
This presentation is to inform the audience about the state of the art in forensic geographical provenancing, to make it aware of the potential and the issues of forensic geochemical profiling, and, it is hoped, to encourage wider use of the technique.
There is an increasing need for the ability to geographically provenance natural products, manufactured goods and humans in forensic casework. The global mobility of goods has led to large-scale counterfeiting with serious financial and biosecurity consequences. In the case of commercial goods like food products, claims of geographical origin based on classical supply-chain traceability information can easily be falsified. In cases where materials are non-compliant with a stated origin, or simply of unknown origin, tools are required that attribute these to the most likely source regions using a scientific measure of probability. A similar approach is required for forensic evidence materials that could help reconstruct a crime or provide intelligence in counter-terrorism or military pursuits.
In recent years it has become evident that a number of geochemical parameters are well suited to support legal expert opinions about the geographical origin of natural materials. The natural elemental and isotopic composition of water and soil provides a base for inferences about agricultural products and most materials derived from them. It has also become evident that the hydrogen and oxygen isotopic composition of rainwater is related to a limited number of well-understood spatial parameters such as latitude and altitude. Models of the isotopic composition of precipitation have been validated globally, and the regional composition of even ground water and products in the food chain can now be predicted with a useful level of accuracy, enabling discrimination of latitudinal distances in the 200-mile range [1]. As the precipitation models roughly provide latitudinal bands of distinction, other parameters are sought to give longitudinal discrimination and/or higher spatial resolution. Any parameter that can be linked to information already captured in maps is desirable. Recent research has shown that the radiogenic isotopic composition of an element like strontium in soil extracts can provide information about the isotopic composition of the local food web [2]. The often relatively well-understood behaviour of these isotope systems allows researchers to make spatial predictions of the isotopic profiles in target tissues and objects. The art of making such predictions has led to the term “isoscape”, meaning the isotope landscape of a biological tissue or natural product of interest [3]. Hydrogen and oxygen isoscapes are now applied in bird migration studies, provenancing of unidentified human remains and food authentication [3,4].
Our own research groups are working towards oxygen, strontium and lead isotope isoscapes for human provenancing, authentication and biosecurity intelligence for a number of food products in Europe, the Middle East, Asia and Oceania. Despite all these efforts, a more formal probabilistic approach relevant for application in criminalistics is still very much under development. When trying to provenance questioned materials, a probabilistic approach combining different isotope systems and other relevant case information will provide a more accurate prediction of the more and less likely regions of origin.
This presentation will discuss the state of the art using casework and give guidelines for the interpretation and presentation of results in a forensic context.
REFERENCES
[1] van der Veer et al., Journal of Geochemical Exploration, 101, 175-184, 2009.
[2] Voerkelius et al., Food Chemistry, 118, 933-940, 2010.
[3] West et al., Isoscapes, Springer, 2009.
[4] Meier-Augenstein, Stable Isotope Forensics, CRC Press, 2010.
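As a toy illustration of the probabilistic combination mentioned above (all likelihood values invented), independent isotope systems can be combined over a set of candidate regions with Bayes' rule:

  # Posterior probability of origin for three hypothetical regions, combining two
  # isotope systems under an independence assumption (all numbers invented).
  regions <- c("Region A", "Region B", "Region C")
  prior   <- c(1/3, 1/3, 1/3)
  lik_HO  <- c(0.8, 0.3, 0.1)       # likelihood of the measured H/O isotope values
  lik_Sr  <- c(0.4, 0.7, 0.2)       # likelihood of the measured Sr isotope ratio
  post    <- prior * lik_HO * lik_Sr
  round(setNames(post / sum(post), regions), 3)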
Evaluation of the efficacy and safety of cancer treatments in the era of personalised medicine
Katrina Sharples
Department of Preventive and Social Medicine
Date: Thursday 5 September 2013
Recent developments in the understanding of tumour biology have led to an explosion of research on biomarkers of disease pathways that have many potential roles in improving patient health, such as disease prevention, early diagnosis and treatment. Of particular interest is the development of more effective treatment strategies through direct targeting of the specific tumour and patient characteristics. Randomised controlled trials have been the mainstay of treatment evaluation for many decades now, but questions have recently been raised by some cancer trialists about whether they will continue to meet the need in the era of personalised medicine. This is countered by others who are working on modifications to randomised trial designs to answer questions about biologically targeted therapies. This talk will discuss the challenges to definitive Phase 3 trial design and the implications for the earlier phases of development and evaluation.
Self-organising maps, machine learning and spatial modelling to predict establishment and spread of alien invasive species
Sue Worner
Lincoln University
Date: Thursday 22 August 2013
If allowed to cross regional borders, invasive species pose one of the greatest threats to global biodiversity, the environment, economic activity, and human and animal health. When there are many thousands of well-recognised invasive species that could cross any border, it is difficult to know where to start. Self-organising maps (SOMs) can be used to analyse the species assemblages of a large number of global geographic regions, as well as a large number of invasive species, to give regional invasive species profiles that provide very useful information for pest risk assessment.
Predicting future establishment and spread is integral to invasive species risk analysis. Increasingly, different classes of models are used to integrate the high-dimensional array of climate and biotic information required to gain greater predictive precision. Such models are used when detailed functional relationships between a species and its environment are lacking. A range of modelling approaches designed to both predict and mitigate the impact of invasive species in new areas will be illustrated. Additionally, I will discuss important issues that need to be resolved to improve these models and to establish good practice and sensible modelling protocols for risk assessment.
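A from-scratch sketch of the SOM idea (an invented presence/absence matrix, not the global assemblage data from the talk): regions mapped to the same node end up sharing a similar species profile.

  set.seed(1)
  X <- matrix(rbinom(200 * 20, 1, 0.3), nrow = 200)     # 200 regions x 20 species (invented)
  grid <- expand.grid(gx = 1:5, gy = 1:5)               # 5 x 5 map of nodes
  W <- matrix(runif(25 * 20), nrow = 25)                # codebook vector for each node
  for (t in 1:2000) {
    i    <- sample(200, 1)                              # pick a region at random
    win  <- which.min(rowSums(sweep(W, 2, X[i, ])^2))   # best-matching node
    h    <- exp(-((grid$gx - grid$gx[win])^2 + (grid$gy - grid$gy[win])^2) / 2)
    rate <- 0.5 * (1 - t / 2000)                        # decaying learning rate
    W    <- W + rate * h * (matrix(X[i, ], 25, 20, byrow = TRUE) - W)
  }
  node <- apply(X, 1, function(x) which.min(rowSums(sweep(W, 2, x)^2)))
  table(node)                                           # regional profiles grouped by node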
A graphical R evolution in Biostatistics
Josie Athens
Department of Preventive and Social Medicine
Date: Thursday 15 August 2013
A couple of years ago, academic staff from the Department of Zoology presented a seminar titled: “R you ready for R?” Since then there has been a steady transition to R in many departments at the University of Otago. HASC 413 is an introductory paper for postgraduate students offered at the Department of Preventive and Social Medicine. To offer an attractive alternative to Stata, I wrote a set of functions that take advantage of the graphical strengths of the following R libraries: JGR, ggplot2 and Deducer. In this seminar I will introduce the DeducerEpi library that I am developing, using examples from the biomedical sciences with some emphasis on epidemiological studies. I will also show how easy it is to generate publication-quality plots with ggplot2.
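The DeducerEpi functions themselves are not shown here; as a generic sketch of the kind of publication-style figure ggplot2 produces (invented data):

  library(ggplot2)
  dat <- data.frame(group = rep(c("Control", "Treatment"), each = 50),
                    sbp   = c(rnorm(50, 135, 12), rnorm(50, 128, 12)))   # invented measurements
  ggplot(dat, aes(x = group, y = sbp)) +
    geom_boxplot(outlier.shape = NA) +
    geom_jitter(width = 0.1, alpha = 0.4) +
    labs(x = NULL, y = "Systolic blood pressure (mm Hg)") +
    theme_bw()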
Graphical model checking
Matt Schofield
University of Kentucky
Date: Monday 24 June 2013
We use Bayesian inference to fit capture-recapture models to data from house finches, many of which were observed with conjunctivitis. To assess the fit of the model, we briefly look at “traditional” posterior predictive tests involving the use of a Bayesian p-value, before focusing on graphical approaches. We show how graphical approaches can overcome some of the downsides of Bayesian p-values and help us to understand the strengths and weaknesses of the model. Included in the talk are visualizations for a variety of capture-recapture models.
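A generic sketch of a graphical posterior predictive check (a toy binomial model, not the house finch capture-recapture models from the talk): replicate data are simulated from posterior draws and the observed discrepancy is placed among them, which also yields the corresponding Bayesian p-value.

  set.seed(1)
  y <- rbinom(50, size = 10, prob = 0.35)                 # invented observed counts
  p_draws <- rbeta(1000, 1 + sum(y), 1 + 500 - sum(y))    # posterior under a Beta(1,1) prior
  T_rep <- sapply(p_draws, function(p) mean(rbinom(50, 10, p) == 0))
  T_obs <- mean(y == 0)                                   # discrepancy: proportion of zeros
  hist(T_rep, main = "Posterior predictive check", xlab = "Proportion of zeros")
  abline(v = T_obs, lwd = 2)                              # observed value among the replicates
  mean(T_rep >= T_obs)                                    # the corresponding Bayesian p-value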
A maths-stats related public lecture. A new frontier. Understanding epigenetics through mathematics
Professor Terry Speed FRS
2013 Royal Society of New Zealand Distinguished Speaker
Date: Tuesday 18 June 2013
Scientists have now mapped the human genome. The next frontier is understanding human epigenomes: the ‘instructions’ which tell the DNA whether to make skin cells or blood cells or other body parts. Apart from a few exceptions, the DNA sequence of an organism is the same whatever cell is considered. So why are the blood, nerve, skin and muscle cells so different and what mechanism is employed to create this difference? The answer lies in epigenetics. If we compare the genome sequence to text, the epigenome is the punctuation and shows how the DNA should be read.
Advances in DNA sequencing in the last five years have allowed large amounts of DNA sequence data to be compiled. For every single reference human genome, there will be literally hundreds of reference epigenomes, and their analysis could occupy biologists, bioinformaticians and biostatisticians for some time to come.
This lecture is free and open to the general public. However, to ensure a seat, please obtain a ticket at www.royalsociety.org.nz Enquiries to: lectures@royalsociety.org.nz or 04 470 5781
Are you really sure that character isn't irreversible? Testing Dollo's Law with ancestral-state reconstruction on evolutionary trees.
Professor David Swofford
Duke University, North Carolina
Date: Friday 7 June 2013
Genetics Otago, in association with the Department of Mathematics and Statistics
The reconstruction of ancestral states on evolutionary trees is now a standard method of making inferences about character evolution. Increasingly, ancestral-state reconstructions have been used to support arguments for unexpected violations of "Dollo's Law", including reacquisition of eyes and pigment in cave-adapted organisms, re-evolution of sexuality from parthenogenetic ancestors, and the reacquisition of wings in stick insects. Methods for inference of ancestral states have become increasingly sophisticated, with maximum-likelihood and Bayesian methods largely replacing earlier reliance on maximum parsimony. However, despite the apparent rigor of these stochastic-model approaches, a number of aspects of their application are unsatisfying. These include model misspecification and inadequate data for accurate model-parameter estimation. In this talk I will review a number of studies that have employed ancestral-state reconstruction methods, and present the results of simulation studies that attempt to answer the question of whether the conclusions from these studies can be trusted. Future directions that attempt to overcome the limitations of existing methods will be outlined (probably too vaguely).
Professor David Swofford is the author of PAUP (Phylogenetic Analysis Using Parsimony), a computational phylogenetics program for inferring evolutionary trees (phylogenies). http://paup.csit.fsu.edu/about.html
Uncertainty in climate change impacts on freshwater resources – an inter-basin comparison
Daniel Kingston
Department of Geography
Date: Thursday 23 May 2013
Current analyses of the impact of climate change on basin-scale freshwater resources employ a diverse range of socio-economic and climate scenarios that complicate inter-basin comparisons. To derive a clear and quantitative understanding of the impacts of climate change on future freshwater availability, a consistent range of climate scenarios are applied to a series of basin-scale hydrological models for rivers from four continents (the Mekong, Okavango, Mackenzie, Parana, Nile, Yangtze and Yellow rivers). Projections of climate change include different greenhouse gas emissions scenarios and prescribed increases in global mean temperature. Uncertainties in simulation of global climate, and in future emissions of greenhouse gases, lead to a large range of uncertainty in projections of future change in freshwater availability. Such uncertainty covers both the magnitude and direction of change, and can be difficult to deal with from a practical perspective (e.g. for the development of climate change adaptation strategies). As a first step towards coping with this, a process-based approach is applied here to unpick the hydrological causes and implications of this uncertainty.
Modelling Māori language
Janine Wright
Department of Mathematics and Statistics
Date: Thursday 16 May 2013
Jointly presented with Katharina Ruckstuhl, Research and Enterprise.
In New Zealand, recent reviews have indicated that there is a declining percentage of the New Zealand population speaking Māori. On the basis of these reviews, a Māori Language Strategy is being developed; as part of this strategy, one proposal is a national language target of 80% of Māori speaking the Māori language by 2050.
Following a developing international trend to model changes in language use statistically, we have developed a model specific to the New Zealand situation that allows us to examine how various language policy choices affect language usage over successive generations. Our aim is to show whether such a language target is achievable, how it might be measured and which language policy choices are most likely to maintain or potentially grow inter-generational Māori language use.
We hope that a model such as ours will provide evidence to assist policy makers to consider the potential effects of their current choices on future Māori language speakers.
Skew product graphs
David Pask
University of Wollongong
Date: Tuesday 9 April 2013
In my research I have often made use of skew product graphs. I will describe their construction and a few of their basic properties. Then I will talk about certain connectivity properties of graphs and how they behave under the skew product construction. Finally I will indicate how the graphical results which I have presented apply to my research.
Group heterogeneity in the Jolly-Seber-Tag-Loss model
Laura Cowen
University of Victoria, Canada
Date: Thursday 28 March 2013
Mark-recapture experiments involve capturing individuals from populations of interest, marking and releasing them at an initial sample time, and recapturing individuals from the same populations on subsequent occasions. The Jolly-Seber model is widely used for open populations since it can estimate important parameters such as population size, recruitment, and survival. However, one of the Jolly-Seber model assumptions that can easily be violated is that of no tag loss. Cowen and Schwarz (2006) developed the Jolly-Seber-Tag-Loss (JSTL) model to avoid this violation; this model was extended to deal with group heterogeneity by Gonzalez and Cowen (2010). We studied the group heterogeneous JSTL (GJSTL) model through simulations and found that as sample size and the fraction of double-tagged individuals increased, the bias of parameter estimates was reduced and precision increased. We applied this model to a study of rock lobsters (Jasus edwardsii) in Australia (Xu, Cowen, and Gardner, in press).
Mixing times of Markov chains with applications in statistical mechanics
Peter Otto
Willamette University, Oregon
Date: Tuesday 19 March 2013
The mixing time of a Markov chain is a measure of the convergence rate of the chain's state distribution to its unique stationary distribution. An important question in the application of Markov chains in sampling and randomized algorithms is how the mixing time grows with the size of the state space. In this talk, I will first give a general introduction to mixing times of Markov chains including the path coupling method for bounding mixing times. Then I will spend a bit of time discussing recent collaborative work on mixing times for certain Markov chains called Glauber dynamics used in statistical mechanics.
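As a toy numerical illustration of a mixing time (not the Glauber dynamics from the talk; the chain below is an invented lazy random walk on ten states):

  n <- 10
  P <- matrix(0, n, n)
  for (i in 1:n) {
    P[i, i] <- 1/2                                        # lazy (holding) step
    P[i, max(i - 1, 1)] <- P[i, max(i - 1, 1)] + 1/4      # step left (reflect at 1)
    P[i, min(i + 1, n)] <- P[i, min(i + 1, n)] + 1/4      # step right (reflect at n)
  }
  pi_stat <- rep(1/n, n)                                  # uniform is stationary (P is symmetric)
  mu <- c(1, rep(0, n - 1)); tv <- numeric(200)
  for (t in 1:200) {
    mu    <- as.vector(mu %*% P)
    tv[t] <- 0.5 * sum(abs(mu - pi_stat))                 # total-variation distance at time t
  }
  which(tv < 1/4)[1]                                      # crude estimate of t_mix(1/4)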
Investigating the number of components in overfitted Gaussian mixture models
Zoé van Havre
Queensland University of Technology & Université Paris-Dauphine
Date: Tuesday 27 November 2012
Finite mixture models with an unknown number of components pose a complex mathematical and computational challenge, yet are commonly encountered in today’s increasingly complex datasets and research problems. When too many components are included, the true parameters fall within an unidentifiable subset of the larger parameter space, making estimation difficult; this problem escalates as the number of components and dimensions increase.
Recent advances in the asymptotic theory of overfitted mixture models by Rousseau & Mengersen (2011) led us to investigate the impact of the prior on the weights in overfitted Gaussian mixture models. As a result of this investigation we propose a new technique to estimate the number of components in Gaussian mixtures using overfitting with a dynamic prior, and show that it provides a very effective and comprehensible solution to estimate the number of components as well as their parameters. This method is simple to implement using a Gibbs sampler and bypasses any need for model selection or complicated trans-dimensional steps, as well as automatically dealing with the label-switching problem without biasing posterior parameter estimates.
In this seminar I plan to present the motivation and background for this work, focusing on the theoretical issues leading to the development of the method as well as practical details for implementation. As a series of case studies will be presented at the NZSA2012 conference, here I will focus on showing its performance on some simulated Gaussian mixture models of varying dimensions and complexity.
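A crude illustration of the emptying effect that the overfitting approach relies on (prior level only, not the proposed sampler): under a Dirichlet prior with a small concentration parameter, most of the K weights are negligible, which is what allows superfluous components of an overfitted mixture to empty out.

  set.seed(1)
  rdirichlet <- function(n, alpha) {
    g <- matrix(rgamma(n * length(alpha), shape = alpha), nrow = n, byrow = TRUE)
    g / rowSums(g)
  }
  w_sparse  <- rdirichlet(5000, rep(0.01, 10))   # small concentration parameter
  w_uniform <- rdirichlet(5000, rep(1, 10))      # uniform prior on the simplex
  mean(rowSums(w_sparse  > 0.05))                # typical number of non-negligible weights
  mean(rowSums(w_uniform > 0.05))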
A systematic comparison of second-order parameter estimation techniques for the stationary log-Gaussian Cox process
Tilman Davies
Department of Mathematics and Statistics
Date: Thursday 25 October 2012
The log-Gaussian Cox process (LGCP) represents an extremely flexible class of stochastic processes. This flexibility, combined with its mathematical tractability, renders it particularly attractive for applications in which we wish to describe spatial intensity functions of planar point patterns. Driven by a latent Gaussian random field, interpoint dependency is described via a covariance structure which is in its simplest form controlled by two scalar parameters. The values of the field variance, $\sigma^2$, and a scaling parameter determining effective correlation range, $\phi$, are instrumental in the appearance of a realisation of the process and hence any subsequently generated point patterns. In practice, it is necessary to estimate these parameters based on an observed point pattern; a non-trivial yet essential step for simulation of the LGCP conditional upon those data. This work reviews three different parameter estimation methods applied specifically to the LGCP, two of which have experienced little if any practical exposure in the literature, and compares their numerical performance in a comprehensive simulation study.
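A rough sketch of the role of $\sigma^2$ and $\phi$ (coarse grid, exponential covariance, invented parameter values; not the estimation methods compared in the talk):

  set.seed(1)
  m <- 20                                              # 20 x 20 grid on the unit square
  cells <- expand.grid(x = (1:m - 0.5) / m, y = (1:m - 0.5) / m)
  D <- as.matrix(dist(cells))
  sigma2 <- 1; phi <- 0.1; mu <- 5                     # invented values
  Sigma <- sigma2 * exp(-D / phi) + diag(1e-8, m^2)    # exponential covariance (tiny nugget for stability)
  field <- as.vector(t(chol(Sigma)) %*% rnorm(m^2))    # mean-zero Gaussian random field
  counts <- rpois(m^2, exp(mu + field) / m^2)          # Poisson counts per cell of area 1/m^2
  sum(counts)                                          # total points in the realised pattern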
Tracking experience over time using mobile technology: strategies and pitfalls for analysing data
Dr Tamlin Conner
Department of Psychology
Date: Thursday 18 October 2012
Psychological scientists are increasingly taking research into the real world by surveying people’s experiences using mobile phone and Internet technologies. In this talk, I will review two key survey methods, mobile phone experience-sampling and Internet daily diaries, and describe how I use these methods to study people as they go about their daily lives. Using examples from my research on well-being, I will discuss the ‘micro-longitudinal’ datasets these designs produce, and the joys and challenges of analysing them. Approaches I’ve used include descriptive analyses, between- versus within-person approaches, simple time series analyses, and multilevel modelling.
Project presentations - Mathematics
Projects, Maths
Date: Friday 28 September 2012
1.30 Del Nawarajan: Quantum algorithms
1.45 Yue Wang: Simulating fractional reaction-diffusion models
2.00 John Holmes: An investigation into models of a cysteine metabolism reaction
2.30 Ilija Tolich: C*-algebras generated by power partial isometries
3.00 Sam Paulin: Fuchsian theory and the relativistic Euler equations
3.30 Kelly O'Connell: Bratteli diagrams and their Leavitt path algebras
4.00 Sam Stewart: Soliton theory
4.30 Chris Stevens: Colliding gravitational plane waves
Note day and start time of this event. Individual times are a guideline only.
A full likelihood analysis of SNP data from multiple populations
David Bryant
Department of Mathematics and Statistics
Date: Thursday 27 September 2012
A SNP (single nucleotide polymorphism) is a location on the genome where members of a population have different nucleotides (A, C, G, T) at a single sequence position. Databases of SNPs are used primarily as road maps of the genome, helping researchers identify associations between genes, traits and diseases. In this talk I will discuss the use of SNPs to infer relationships between different populations or species, including the estimation of ancestral population sizes. I'll introduce the biological and statistical models used, and describe SNAPP, a likelihood algorithm which uses numerical and combinatorial tricks to compute the relevant likelihoods. Previously, only Monte Carlo estimates had been available.
This work has been published in:
Bryant, D., Bouckaert, R., Felsenstein, J., Rosenberg, N., and RoyChoudhury, A. (2012). Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis. Molecular Biology and Evolution 29(8):1917-1932.
Software is available at snapp.otago.ac.nz
StatChat: Bayes, asymptotics, simulation and the bootstrap
David Fletcher
Department of Mathematics and Statistics
Date: Thursday 6 September 2012
The aim of this talk is to promote discussion on the links between Bayesian methods, frequentist asymptotics, simulation and the bootstrap.
Do your data fit your phylogenetic tree?
Steffen Klaere
University of Auckland
Date: Thursday 30 August 2012
Phylogenetic methods are used to infer ancestral relationships based on genetic and morphological data. What started as more sophisticated clustering has become an increasingly complex machinery for estimating ancestral processes and divergence times. One major branch of inference is maximum likelihood, in which one selects the parameters from a given model class under which the data are more likely to occur than under any other parameters of the same class. Most analyses of real data are carried out using such methods.
However, one step of statistical inference that has had little exposure in applications is the goodness-of-fit test between parameters and data. There seem to be various reasons for this: users are either content with using a bootstrap approach to obtain support for the inferred topology, are afraid that a goodness-of-fit test would find little or no support for their phylogeny, thus demeaning their carefully assembled data, or simply lack the statistical background to acknowledge this step.
Recently, methods to detect sections of the data which do not support the inferred model have been proposed, and strategies to explain these differences have been devised. In this talk I will present and discuss some of these methods, their shortcomings and possible ways of improving them.
Design for agility in animals and machines
Mike Paulin
Department of Zoology
Date: Thursday 23 August 2012
Classical robotics is about using motors to over-ride inertial, elastic and dissipative forces acting on mechanical structures in order to make them do what we want. The future is about combining inference and control with the design of mechanical linkages whose dynamics are exploited, not over-ridden, to move quickly, accurately and efficiently. Stochastic dynamical systems theory and computational modelling can join the dots from the reproductive strategies of sponges to the dynamics of squash rackets, helping us to understand how brains and bodies coevolved for agile movement, and showing how to build better robots and train better athletes.
Dynamic species distribution models for marine intertidal invertebrates from categorical surveys
Matthew Spencer
School of Environmental Sciences, University of Liverpool
Date: Thursday 16 August 2012
Species distribution models are important in ecology and conservation. They typically predict the probability of occurrence of a species in geographical space from data on the presence or absence of the species at sites with known environmental characteristics. These models have no temporal interpretation, and therefore cannot tell us anything about population dynamics.
I will describe related models for population dynamics, based on categorical survey data. Categorical surveys record abundance categories (e.g. abundant, rare, not seen), and can be a good way to cover large numbers of sites quickly. From the resulting models, we can get predictions about species distributions with a temporal interpretation. I will apply this approach to data on two intertidal snail species in the UK.
The probability of extinction in a branching process and its relationship with moments of the offspring distribution
Sterling Sawaya
Centre for Reproduction and Genomics, Department of Anatomy, University of Otago
Date: Thursday 9 August 2012
How does one compare different biological strategies? The standard approach is to examine the mean and variance in reproductive success. These values ultimately rely on measures of the first few moments of the offspring distribution. In this talk I will discuss an alternative, comparing the probability of extinction. I will focus on the interplay between extinction and the moments of the offspring distribution. The probability of extinction decreases with increasing odd moments and increases with increasing even moments, a property which is intuitively clear.
There is no closed form solution to calculate the probability of extinction in general, and numerical methods are often used to infer its value. Alternatively, one can use analytical approaches to generate bounds on the extinction probability. I will discuss these bounds, focusing on the theory of s-convex ordering of random variables, a method used primarily in the field of actuarial sciences. This method utilizes the first few moments of the offspring distribution to generate "worst case scenario" distributions, which can then be used to find upper bounds on the probability of extinction. I will present these methods and discuss their merits in the field of population biology.
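For reference, the basic computation behind the talk: the extinction probability q is the smallest non-negative root of q = f(q), where f is the offspring probability generating function. A sketch with an invented offspring distribution:

  p <- c(0.2, 0.3, 0.3, 0.2)                 # invented P(0), P(1), P(2), P(3) offspring
  f <- function(s) sum(p * s^(0:3))          # probability generating function
  q <- 0
  for (k in 1:200) q <- f(q)                 # functional iteration converges to the smallest root
  q                                          # extinction probability
  sum(p * 0:3)                               # mean offspring number (> 1 here, so q < 1)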
Closed-population capture-recapture modeling using non-invasive DNA sampling with genotyping error
Richard Barker
Department of Mathematics and Statistics
Date: Thursday 2 August 2012
Wright et al. (2009) introduced a new model in which DNA genotyping error resulting from allelic dropout was incorporated into a mark-recapture type model for field sampling of DNA. The field sampling model they used was based on a simple urn-sampling model proposed by Miller et al. (2005) (CAPWIRE) for field sampling of DNA fragments shed by individuals in the wild. Here we discuss a more general model for capture-recapture sampling of DNA sampled from the field that generalizes CAPWIRE to allow modeling of samples obtained on multiple occasions and in the presence of genotyping error caused by allelic dropout. This model allows standard capture-recapture modeling to be applied to field sampled DNA.
This is joint work with Janine Wright and Matthew Schofield.
Miller, C., Joyce, P., and Waits, L. (2005), “A new method for estimating the size of small populations from genetic mark-recapture data.” Molecular Ecology, 14, 1991–2005.
Wright, J. A., Barker, R. J., Schofield, M. R., Frantz, A. C., Byrom, A. E., and Gleeson, D. M. (2009), “Incorporating genotype uncertainty into mark-recapture-type model for estimating animal abundance,” Biometrics, 65, 833–840.
Proper local scoring rules
Matt Parry
Department of Mathematics and Statistics
Date: Thursday 24 May 2012
Scoring rules have long been used to assess the quality of probabilistic predictions. I report on a recently discovered class of scoring rules with the remarkable property that they do not require knowledge of the normalization constant of the predictive model. On both continuous and discrete outcome spaces, we show how such scoring rules can be constructed from the idea of a score being local. One interesting consequence is that Besag's pseudolikelihood is given a firm theoretical foundation as a result of being a proper local scoring rule. We discuss recent applications of local scoring rules and connections to the work of Ehm & Gneiting. This is joint work with A. Philip Dawid and Steffen Lauritzen.
Separating mortality and emigration: modeling space use, dispersal and survival with robust-design spatial capture-recapture data
Torbjørn Ergon
University of Oslo
Date: Thursday 17 May 2012
Space-use and dispersal are intimately linked with life-history traits of individuals and demographic processes in a population. The dynamics of a local population are determined by local survival and reproduction as well as emigration and immigration. However, capture-recapture studies typically ignore spatial information about individual encounter locations and do not attempt to separate mortality and emigration, and hence only estimate the joint probability of surviving and staying in the study area. I will present a model for robust design capture-recapture data in which individuals are allowed to move their home range between primary sessions. By constraining dispersal distance to a parametric distribution, fitted to the data, we are able to estimate the mortality hazard separately from emigration. In addition, we obtain estimates of home-range size and dispersal probabilities. The model is applicable to situations where only restricted parts of the geographic range of the population are sampled, and when it can be assumed that dispersal distances and directions are homogeneous across space.
The PAPRIKA method for Multi-Criteria Decision-Making
Paul Hansen
Department of Economics
Date: Friday 4 May 2012
NOTE - DAY AND TIME ARE DIFFERENT TO USUAL SCHEDULE
In this seminar I’ll explain the PAPRIKA method that I co-invented.* PAPRIKA is a partial acronym for ‘Potentially All Pairwise RanKings of all possible Alternatives’. The PAPRIKA method, implemented by 1000Minds software (www.1000minds.com), is for determining the weights for decision criteria used in additive Multi-Criteria Decision-Making (MCDM) models and Conjoint Analysis (or ‘Discrete Choice Experiments’). The weights represent the relative importance of the criteria to decision-makers. MCDM models (commonly known as points systems or point-count models) are used for ranking or prioritising alternatives in a very wide range of applications – e.g. prioritising patients for elective surgery, investment decision-making, helping students to choose their majors (see www.nomajordrama.co.nz), choosing a new home, etc. When you’re working on a decision, getting the weights ‘right’ is important because even if you’re applying the right criteria, unless your weights accurately reflect your preferences you’re likely to make the wrong decision!
* P Hansen & F Ombler, “A new method for scoring multi-attribute value models using pairwise rankings of alternatives“, Journal of Multi-Criteria Decision Analysis 2009, 15, 87-107. And for an overview search ‘PAPRIKA’ on Wikipedia.
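A toy version of the additive points model whose weights PAPRIKA elicits (criteria, scores and weights all invented; not output from 1000Minds):

  weights  <- c(severity = 0.5, time_waiting = 0.3, age = 0.2)
  patients <- data.frame(severity     = c(0.9, 0.4, 0.7),      # criterion scores on [0, 1]
                         time_waiting = c(0.2, 0.8, 0.5),
                         age          = c(0.5, 0.6, 0.1),
                         row.names    = c("Patient 1", "Patient 2", "Patient 3"))
  score <- as.matrix(patients) %*% weights                     # additive value of each alternative
  score[order(score, decreasing = TRUE), , drop = FALSE]       # priority ranking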
Probability, the science of uncertainty
Professor Geoffrey Grimmett
University of Cambridge; NZMS Forder Lecturer for 2012
Date: Monday 16 April 2012
The role of modern probability will be discussed and illustrated with many examples from "real life", including gambling, parenthood, and the sinking of the Titanic.
NZMS Forder Lecture for 2012
Conformality and universality in probability
Professor Geoffrey Grimmett
University of Cambridge; NZMS Forder Lecturer for 2012
Date: Monday 16 April 2012
A number of 'exact' (and beautiful) solutions are known for two-dimensional systems of probability and physics. These systems have critical points, and certain special techniques have emerged for their study.
I will discuss the twin features of conformal invariance and universality, particularly in the context of the percolation model. A fascinating structure is becoming clear, with connections to analysis, geometry, and conformal field theory, but serious difficulties remain.
(NB change to our usual day, time and venue)
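A crude Monte Carlo sketch of the percolation model mentioned above: site percolation on a small square lattice, with the occupation probability set near the commonly quoted threshold of about 0.5927, and a left-to-right crossing as the event of interest.

  crosses <- function(n, p) {
    open  <- matrix(runif(n * n) < p, n, n)
    reach <- open & (col(open) == 1)              # flood fill from open sites in column 1
    repeat {
      grown <- reach
      grown[-1, ] <- grown[-1, ] | reach[-n, ]    # spread to the four neighbouring sites
      grown[-n, ] <- grown[-n, ] | reach[-1, ]
      grown[, -1] <- grown[, -1] | reach[, -n]
      grown[, -n] <- grown[, -n] | reach[, -1]
      grown <- grown & open
      if (identical(grown, reach)) break
      reach <- grown
    }
    any(reach[, n])                               # did the cluster reach the last column?
  }
  set.seed(1)
  mean(replicate(200, crosses(30, 0.5927)))       # estimated crossing probability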
Machine Learning: Concepts, Relevance, and Applications
Dr Brendon J Woodford
Department of Information Science, University of Otago
Date: Thursday 5 April 2012
Machine Learning (ML) (Michie et al., 1994; Mitchell, 1997) is a scientific discipline which is concerned with how algorithms are designed and developed to allow computers to build learning models based on different data sources such as empirical data, databases, or sensor data.
These models are able not only to learn and recognise complex patterns or relationships between observed variables within the data, but also to make intelligent decisions based on these data sources. Furthermore, they can generalise from the data they have learned from to produce useful output in new cases.
ML theory, approaches, and techniques have been influenced by statistics and share a lot in common with that discipline, although the two seldom use the same terms. In this talk I will introduce what machine learning is and its relevance to statistics (Hastie et al., 2011), and cover real-world applications of machine learning systems.
Michie, D., Spiegelhalter, D. J. and Taylor, C. C. (1994). Machine Learning, Neural and Statistical Classification. Ellis Horwood.
Mitchell, T. M. (1997). Machine Learning. McGraw-Hill.
Hastie T., Tibshirani R. and Friedman J. (2011). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Fifth Edition, Springer.
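As a small, generic illustration of the supervised-learning workflow in a statistical idiom (invented data; not an example from the talk):

  set.seed(1)
  dat   <- data.frame(x1 = rnorm(400), x2 = rnorm(400))
  dat$y <- rbinom(400, 1, plogis(-0.5 + 1.2 * dat$x1 - 0.8 * dat$x2))   # invented binary outcome
  train <- sample(400, 300)                                  # train/test split
  fit   <- glm(y ~ x1 + x2, family = binomial, data = dat[train, ])
  pred  <- predict(fit, newdata = dat[-train, ], type = "response") > 0.5
  mean(pred == (dat$y[-train] == 1))                         # held-out classification accuracy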
Can volcanic hazards be robustly estimated?
Ting Wang
Department of Mathematics and Statistics
Date: Thursday 29 March 2012
Our modern society is exposed not only to earthquake hazards, but also to volcanic hazards. In fact, a large number of people in New Zealand are living over or close to volcanoes, e.g. Auckland and Taupo. Given the record of past volcanic eruptions, can we obtain robust estimates of when the next eruption will occur?
Historical eruption records are often incomplete, a problem which is exacerbated when dealing with catalogs derived from geologic records. Similar to earthquakes, volcanic eruptions are often considered as point processes. I will first present the characteristics of the data, and then discuss the problem of estimating the true (adjusted for missing observations) parameters and hence the hazard. The statistical models I used in this work are Weibull processes and Weibull renewal models, which are commonly used to model time series of volcanic onsets. The purpose is to obtain robust estimates which are not influenced by missing data.
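A sketch of the basic building block (invented repose times, and ignoring the missing-data adjustment that is the subject of the talk): a Weibull renewal model fitted to inter-onset times by maximum likelihood.

  set.seed(1)
  gaps <- rweibull(60, shape = 0.8, scale = 25)            # invented repose times (years)
  nll  <- function(par) -sum(dweibull(gaps, exp(par[1]), exp(par[2]), log = TRUE))
  fit  <- optim(c(0, log(mean(gaps))), nll)
  exp(fit$par)                                             # estimated shape and scale
  # a shape below 1 means a decreasing hazard: the longer the repose, the less imminent an onset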
Inference on population size in binomial detectability models
Peter Jupp
University of St Andrews, Scotland
Date: Thursday 22 March 2012
Many methods of estimating the size $N$ of a homogeneous population are based on i.i.d. random variables $x_1, \dots, x_n$ (e.g. capture histories, distances from a transect), of which only a random number $n$ are observed. Both the distribution of $x_1, \ldots, x_n$ and the probability that $x_i$ is observed depend on a (vector) parameter $\theta$. Two appealing estimators of $(N,\theta)$ are
(a) the full m.l.e. $({\hat N},{\hat \theta})$,
(b) the conditional m.l.e. $({\hat N}_c,{\hat \theta}_c)$, where ${\hat \theta}_c$ is the m.l.e. of $\theta$ obtained by conditioning on $n$, and ${\hat N}_c$ is a Petersen-type estimator.
This talk describes work with Rachel Fewster (University of Auckland) which has produced
(i) a formula showing that $({\hat N},{\hat \theta})$ and $({\hat N}_c,{\hat \theta}_c)$ are remarkably close,
(ii) the asymptotic distribution of $({\hat N},{\hat \theta})$ and $({\hat N}_c,{\hat \theta}_c)$.
An extension to non-homogeneous populations will be indicated.
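A toy version of the two-stage conditional approach in the simplest closed-population setting (constant detection probability over T occasions; all values invented), showing the conditional m.l.e. of the detection parameter followed by a Petersen-type estimate of N:

  set.seed(1)
  N <- 500; T <- 5; p <- 0.2                     # invented true values
  caps <- rbinom(N, T, p)                        # captures per individual
  x <- caps[caps > 0]; n <- length(x)            # only ever-detected individuals are observed
  nll <- function(lp) {                          # zero-truncated binomial likelihood, given n
    pr <- plogis(lp)
    -sum(dbinom(x, T, pr, log = TRUE) - log(1 - (1 - pr)^T))
  }
  p_hat <- plogis(optimize(nll, c(-5, 5))$minimum)
  N_hat <- n / (1 - (1 - p_hat)^T)               # Petersen-type estimator of N
  c(p_hat = p_hat, N_hat = N_hat)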
Mock theta functions, moonshine, and modular transformations
Wynton Moore
University of Chicago
Date: Wednesday 21 March 2012
Mock theta functions were introduced by Ramanujan in 1920, but it was not until 2002 that they were described in a general theory, by Zwegers. Recently they have appeared in a series of "moonshine" correspondences between certain Jacobi forms and finite groups, which grew out of string theory. This led to a conjecture for the modular transformations of the "tenth order" mock theta functions. In this talk I will review Zwegers' description of mock theta functions in terms of indefinite theta series, and use it to verify the conjecture.
Fast Bayesian Climate Reconstruction
Peter Green
Department of Mathematics and Statistics
Date: Thursday 15 March 2012
Palaeoclimatologists are quite open to the advantages offered by Bayesian modelling. However, scientists often lack the specific skills and training needed to use this framework.
State-of-the-art palaeoclimate reconstructions use RegEM, an expectation-maximisation algorithm designed to find penalised maximum likelihood estimates given rank-deficient multivariate data.
The setup of palaeoclimate reconstruction problems means that the EM algorithm converges very slowly. This makes simulation studies of reconstruction methods particularly difficult.
I have found a neat trick which offers a substantial performance increase to the RegEM algorithm. The gains are especially large for simulation studies. This is the "fast" part.
Maximum penalised likelihood estimates have a natural interpretation in the Bayesian framework as maximum a posteriori estimates. RegEM estimates can therefore be used to construct approximate posterior distributions, allowing a Bayesian analysis of the uncertainties in past temperature estimates.
Polynomial accelerated MCMC ... and other sampling algorithms inspired by computational optimization
Colin Fox
Department of Physics
Date: Thursday 8 March 2012
Algorithmic ideas from computational optimization provide great inspiration for building efficient simulation algorithms – that draw samples from a desired distribution – for use in MCMC. I will show how polynomial acceleration of MCMC is simple in concept, and explicitly construct an accelerated Gibbs sampler for Gaussian-like distributions. The past few years have seen development of a number of other sampling algorithms based on Krylov space and quasi-Newton optimization algorithms that achieve spectacular performance in some settings.
Modelling with State-Dependent Noise
Paul Tupper
Simon Fraser University, Canada
Date: Thursday 23 February 2012
Consider a particle diffusing in a confined volume which is divided into two equal regions. In one region the diffusion coefficient is twice the value in the other region. Will the particle spend equal amounts of time in the two regions in the long term? Statistical mechanics would suggest yes, since the number of accessible states in each region is the same. However, another line of reasoning suggests that the particle should spend less time in the region with faster diffusion since it will exit it more quickly. I will demonstrate with a simple microscopic model system that both answers are consistent with the information given. Furthermore, I will argue that modelling such systems with Ito stochastic differential equations with zero drift and variable diffusion rate is not sound, since that class of mesoscale models does not correspond to any naturally occurring class of microscopic models.
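A crude Euler-Maruyama check of the point at issue (invented diffusion values): with zero drift and the Ito convention, the particle under-occupies the half with the larger diffusion coefficient rather than splitting its time equally.

  set.seed(1)
  D  <- function(x) if (x < 0.5) 0.05 else 0.10   # diffusion coefficient doubles on [0.5, 1]
  dt <- 1e-3; x <- 0.5; nsteps <- 2e5; in_fast <- 0
  for (k in 1:nsteps) {
    x <- x + sqrt(2 * D(x) * dt) * rnorm(1)       # Ito step, zero drift
    if (x < 0) x <- -x                            # reflecting boundaries on [0, 1]
    if (x > 1) x <- 2 - x
    if (x >= 0.5) in_fast <- in_fast + 1
  }
  in_fast / nsteps                                # close to 1/3, not 1/2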
Seismicity modelling using hidden Markov models
Dr Ting Wang
Institute of Natural Resources, Massey University
Date: Thursday 10 November 2011
Earthquakes arise from a process whose internal workings (such as the physical processes of earthquake generation) are observed only indirectly, although the final effects are all too observable! Hidden Markov models (HMMs) are an intuitively attractive idea for analyzing seismicity. The challenge with HMMs is the interpretation of the resulting hidden state process. I will discuss two related applications.
We propose a new model – the Markov-modulated Hawkes process with stepwise decay (MMHPSD) – to investigate long-term patterns in seismicity rate. The MMHPSD is a self-exciting process which switches among different states, in each of which the process has distinguishable background seismicity and decay rates. A variant on the EM algorithm is constructed to fit the model to data possessing immigration-birth features. This is applied to the Landers earthquake sequence, demonstrating that it is capable of capturing changes in the temporal seismicity patterns and the behaviour of main shocks, major aftershocks, secondary aftershocks and periods of quiescence.
This decomposition of the earthquake cycle motivates the construction of a non-linear filter measuring short-term deformation rate changes to extract signals from GPS data. For two case studies of a) deep earthquakes in central North Island, New Zealand, and b) shallow earthquakes in Southern California, an HMM is fitted to the output from the filter. Mutual information analysis indicates that the state having the largest variation of deformation rate contains precursory information that indicates an elevated probability for earthquake occurrence.
Bayesian Analysis of oncogenic pathway activation
Aaron Bryant
Department of Mathematics and Statistics, University of Otago
Date: Thursday 27 October 2011
Through the use of microarray technology, researchers are now able to simultaneously measure the expression levels of tens of thousands of genes. Among other things, this allows the construction of profiles of activation of various pathways within tumour samples.
Using breast cancer data, this talk will explore various Bayesian factor regression methods to estimate the probability of pathway activation using gene expression data. The relationship between these probabilities and various histological methods will also be examined.
4th year Project Presentations
400-level Maths students
Date: Friday 14 October 2011
Eman Alhassan: Dedekind Domains
Fatemah Al Kalaf: Conformal Mapping
Boris Daszuta: Spectral Methods, Wave Equations and the 2-Sphere
Richard McNamara: Parseval Frames
Sam Primrose: Leavitt Path Algebras
Bayesian demographic accounts: subnational population estimation using multiple data sources
Dr John Bryant
Statistics New Zealand
Date: Thursday 13 October 2011
The talk will introduce a new approach to local-level population estimation being developed at Statistics New Zealand. In contrast to traditional methods, we set up a formal statistical model. At the core of the model is a 'demographic account' - a complete description of population counts, births, deaths and migration during the period of interest. The evolution of the demographic account is governed by a 'system model'. The relationships between the demographic account and available data sources are governed by an 'observation model'. Inference is carried out using Markov chain Monte Carlo methods. The new method has many potential advantages over traditional methods, including the ability to deal with noisy, incomplete datasets, and the ability to estimate uncertainty.
Politics, Place, Policy, Population and Statistics: Analysis for hard choices
Len Cook
Former Director of the Office for National Statistics (United Kingdom) and former New Zealand Government Statistician
Date: Monday 10 October 2011
Population change will necessitate politically difficult choices for most places in New Zealand that will generally conflict with local political sentiment. The nature of population change that is projected for New Zealand will lead to an accelerating diversity in age structures locally, and among ethnic and other population groupings. Continuing but slower population growth at a national level will obscure the huge mix of local experiences, which will range from absolute population decline to continuing growth in most age groups.
Much political will is needed to match supply to demand, especially when a necessary disinvestment involves difficult political choices. The bias towards overinvestment brings high opportunity costs which at present we rarely recognise. The costs of maintaining over-investment must grow significantly over the next few decades unless the need for disinvestment can be recognised more widely among communities, with strategies to reduce the opportunity costs of continuing programmes in their present form. The talk will give some examples, and show how they can lead to a general formulation as a researchable problem.
If this is to be seen first as an analytical rather than a political question, the scientist’s role as a public communicator will be vital in underpinning the trust that communities will need to have in accepting such change.
A mathematical excursion into molecular phylogenetics
Professor Mike Hendy
Department of Mathematics and Statistics
Date: Thursday 22 September 2011
Molecular phylogenetics is the art of constructing phylogenies (evolutionary trees) from biological sequence data. Using some simple mathematical tools I have been endeavouring to turn this art into a science.
I will introduce some of my findings including Hadamard conjugation and the closest tree selection algorithm.
Sequential Analysis and the Moran Process
Peter Green
University of Otago
Date: Thursday 22 September 2011
Sequential analysis is the theory of cumulative sums of random variables. The central result in sequential analysis, Wald's Fundamental Identity, can be used to calculate absorption probabilities in random walks with barriers.
The Moran process from mathematical biology is a birth-death process used to model the spread of mutant genes in a population. This process can be used to calculate the probability that a beneficial mutation will spread to an entire population. The Moran process is the cumulative sum of random changes in the population state, and is therefore amenable to sequential analysis.
We have used sequential analysis to develop an analytical approximation to a simple simulated ecosystem. An analytical model gives us insights not available from the simulation results, and sequential analysis allows us to build our approximate model using a single conceptual tool.
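A sketch comparing simulation with the classical result (invented population size and fitness): the fixation probability of a single mutant of relative fitness r in a Moran process of size N is (1 - 1/r) / (1 - 1/r^N).

  set.seed(1)
  moran_fixes <- function(N, r) {
    i <- 1                                               # start with one mutant
    while (i > 0 && i < N) {
      p_up <- (r * i / (r * i + N - i)) * ((N - i) / N)  # mutant reproduces, resident dies
      p_dn <- ((N - i) / (r * i + N - i)) * (i / N)      # resident reproduces, mutant dies
      u <- runif(1)
      if (u < p_up) i <- i + 1 else if (u < p_up + p_dn) i <- i - 1
    }
    i == N
  }
  N <- 20; r <- 1.5
  mean(replicate(2000, moran_fixes(N, r)))               # simulated fixation probability
  (1 - 1/r) / (1 - 1/r^N)                                # classical formula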
Adaptive sampling: my journey that began in the Department of Mathematics and Statistics, Otago University
Professor Jennifer Brown
University of Canterbury
Date: Thursday 15 September 2011
Adaptive sampling designs are becoming increasingly popular in environmental science particularly for surveying rare and aggregated populations. There are many different adaptive survey designs that can be used to estimate animal and plant abundances. The appealing feature of adaptive designs is that the field biologist gets to do what innately seems sensible when working with rare and aggregated populations – field effort is targeted around where the species is observed in the first wave of the survey.
In this presentation I will discuss some of these forms of adaptive sampling and my research in this subject. My research had a wonderful start: I studied for my PhD in the Department of Mathematics and Statistics at Otago University under the kind and wise guidance of Professor Bryan Manly, supported by many dear friends in the department.
Gene Mapping and Genomic Selection in Sheep Using a Single Nucleotide Polymorphism Chip
Dr Ken Dodds
AgResearch, Mosgiel
Date: Thursday 8 September 2011
Recent advances in technology have enabled large numbers of genetic markers to be assayed simultaneously for many species. For sheep, a product known as the Illumina OvineSNP50 BeadChip can assay over 50,000 single nucleotide polymorphism (SNP) markers at once, and this has been used to genotype more than 8000 New Zealand sheep. The main focus of this effort is to enable 'genomic selection' which allows prediction of genetic merit of an animal on the basis of genotyping alone. Collections of case-control animals have also been genotyped to confirm or map a variety of single gene traits including: horns, yellow fat and microphthalmia. Genotype results can also be used to characterize populations and to inform conservation decisions for rare breeds. A number of interesting statistical challenges arise when analyzing such data and these will be discussed.
Bias in estimation of adult survival and asymptotic population growth rate caused by undetected capture heterogeneity
David Fletcher
Department of Mathematics and Statistics
Date: Thursday 25 August 2011
In ecology, mark-recapture studies are often used to estimate adult survival probability for an animal species. This is an important demographic parameter for long-lived species, as it can have a substantial impact on population dynamics. Variation among individuals in their capture probabilities (capture heterogeneity) leads to bias in the estimate of adult survival. Traditional knowledge in this area suggests that in many situations the bias will be small enough to ignore. I present the results of a simulation study of the potential for different types of undetected capture heterogeneity to lead to a level of bias that might have an impact on management decisions. I will illustrate the issues involved using data from a study of wolves in France.
Big genomic data: are we teaching our students what they need to know?
Dr Mik Black
Biochemistry, Otago School of Medical Sciences, University of Otago
Date: Thursday 11 August 2011
With the "genomics revolution" continuing to generate data in ever-increasing amounts, the disciplines of statistics and bioinformatics remain vital components in our drive to interpret and understand genomic data. Students in these disciplines are often exposed to a tantalizing glimpse of the problems encountered by "real world" users of genomic data sets; however, the scale (and complexity) of these problems is generally greatly reduced to accommodate the limited resources available as part of a standard teaching programme. The skills learned in such courses also tend only to scratch the surface of what is required for in-depth genomic analyses, despite the fact that these techniques are often applicable to a much broader class of problems. In this talk I will describe some of the current genomics problems typically faced by statisticians and bioinformaticians, along with the statistical tools that are often applied. I will also describe two national infrastructure initiatives that the University of Otago is heavily involved in, with the goal of starting a dialogue on how we can begin to incorporate these important "research-led" initiatives into our teaching programmes and so provide our students with the tools they need for dealing with large amounts of genomic data.