In our conversation with Charles, we cover how he applies his STEM educational background to sports analytics, his love of St Kilda FC and his future plans for his blog “AFL GO”.

Tell us about your background.

I completed my Bachelor of Science with a double major in Theoretical Physics and Mathematics, and I am currently completing my Honours in Astrophysics at the Australian National University. Growing up I played a variety of sports, including soccer, futsal, badminton, target shooting and Brazilian jiu-jitsu. I have never played Aussie Rules, yet it’s the only sport I have watched and followed consistently for nearly a decade. I’m a big St Kilda supporter and I always look forward to watching the footy with mates.

How would you compare the Astrophysics crowd at ANU to the AFL data world?

I think they’re actually pretty similar. Both are clearly very analytical, and both communities are extremely friendly and open about their research. You also get a large variety of people from different fields. For example, within the Astrophysics and Astronomy department at ANU, you have people from engineering working on instruments, people from computer science performing simulations of galaxies, and physicists and mathematicians working on mathematical models. Likewise, in the AFL data world you have people from a wide range of STEM fields and beyond, working on AFL-related problems using a range of different methods and techniques.

How did AFL GO originate?

I have followed Squiggle for a number of years and I have always been so impressed with how well the models performed, and how so many people who created these models balanced full-time work and everyday life with this hobby. Once I had a holiday free of research-related projects, I spent some time looking into machine learning, and I wanted to apply what I had learnt so I created my AFL model “AFL GO”. The “GO” stands for “Gadget-type Operator”, originating from the great Brian Taylor. I think he’s hilarious and roaming with Brian is half the reason why I watch footy.

What are you hoping to achieve from it?

I originally wanted the model to be a project where I could apply machine learning techniques I had read about, and one that would push me to learn more. At the moment, I am focusing only on improving my model’s performance in AFL match prediction. But I do want to eventually investigate other problems. A few that have been on my mind are Brownlow vote prediction and predicting the future performance of players (on the scale of years).

Which resources have been the biggest help in your journey so far?

There are definitely a lot of factors that have helped. But probably the biggest is the Coursera courses I took to expand my skills in machine learning. I originally thought the courses would be quite basic, but a lot of them are mathematically rigorous and I learnt a lot from them.

What tools and modelling or analytics methods do you use?

All of my data is obtained using a freely available package called “fitzRoy”. I write all of my code in Python, using a combination of libraries such as NumPy, pandas and scikit-learn. The two methods I’m currently using for my AFL match predictions are linear and logistic regression: linear regression for margin prediction and logistic regression for the probability of a team winning.
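
As a rough illustration of that two-model setup (not Charles’s actual code), here is a minimal sketch using scikit-learn. The single feature `rating_diff`, the synthetic data and the numbers used to generate it are all invented for the example; a real model would be trained on match data and would likely use more features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic data only: one hypothetical feature per match
# (home team strength minus away team strength, in arbitrary units).
rng = np.random.default_rng(0)
rating_diff = rng.normal(0.0, 1.0, size=(200, 1))

# Invented relationship: the home margin grows with the strength gap, plus noise.
margin = 20.0 * rating_diff[:, 0] + rng.normal(0.0, 30.0, size=200)
home_win = (margin > 0).astype(int)  # 1 if the home team won, else 0

# Linear regression predicts the margin for the home team.
margin_model = LinearRegression().fit(rating_diff, margin)

# Logistic regression predicts the probability that the home team wins.
win_model = LogisticRegression().fit(rating_diff, home_win)

# Predict an upcoming match where the home team is slightly stronger.
new_match = np.array([[0.5]])
predicted_margin = margin_model.predict(new_match)[0]
home_win_prob = win_model.predict_proba(new_match)[0, 1]

print(f"Predicted home margin: {predicted_margin:.1f} points")
print(f"Home win probability: {home_win_prob:.2f}")
```

The two models are fitted separately here, so the predicted margin and the win probability are related only through the shared input feature.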

Any cool visualisations for us less technical folk?

This is a visualisation of my model’s predictions of the 2019 AFL season at the present moment. The vertical axis is the model’s estimation of the probability of the home team winning a match and the horizontal axis is the predicted margin for the home team.

What advice do you have for the couch fan who wants to start playing around with sports data?

I would recommend starting with something small and working up. For example, you could investigate the percentage of times the home team wins. Later on, using Excel or a programming language like Python (a lot of great resources exist online for free, e.g. Codecademy), you could create a simple Elo rating system and work up from there. Most of the time it’s not about who has the best model. It’s about learning something new, investigating something that interests you and having fun.
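
For anyone who wants to try those two first steps, here is a minimal sketch in plain Python. The match results and team names are made up, and the starting rating of 1500, the K-factor of 32 and the 400-point scale are conventional Elo defaults assumed for the example rather than anything taken from AFL GO.

```python
# Made-up results: (home team, away team, did the home team win?)
results = [
    ("Team A", "Team B", True),
    ("Team B", "Team C", False),
    ("Team C", "Team A", True),
    ("Team A", "Team C", True),
]

# Step 1: percentage of matches won by the home team.
home_win_pct = 100.0 * sum(won for _, _, won in results) / len(results)


def expected_score(rating, opponent_rating, scale=400.0):
    """Standard Elo expectation: chance of beating the opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def update_elo(home_rating, away_rating, home_won, k=32.0):
    """Return the new (home, away) ratings after one match."""
    actual = 1.0 if home_won else 0.0
    delta = k * (actual - expected_score(home_rating, away_rating))
    return home_rating + delta, away_rating - delta


# Step 2: run every match through the Elo update, starting each team at 1500.
ratings = {}
for home, away, home_won in results:
    home_rating = ratings.get(home, 1500.0)
    away_rating = ratings.get(away, 1500.0)
    ratings[home], ratings[away] = update_elo(home_rating, away_rating, home_won)

print(f"Home teams won {home_win_pct:.0f}% of matches")
for team, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{team}: {rating:.1f}")
```

This version ignores home ground advantage and winning margin; adding either is a natural next step once the basic loop works.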

What would be the one problem you would love to solve for a sporting code or team?

I definitely would have to say I’d like to help St Kilda FC as it’s the club I support. I think the problem of drafting future high-value players is probably one of the most important problems and is very difficult to answer. It’s definitely something I want to look into in the future.
