Luke Heinrich

Analyst, Atlassian

Luke Heinrich is a Sydney-based analyst at Atlassian, where he drills into all things data and decisions. Previously, he developed personalisation algorithms for retail e-commerce in Australia, and he leads another life as a Fellow of the Actuaries Institute of Australia. Outside of work, Luke loves to read and will gladly talk your ear off about cricket.



Building Understanding Out of Incomplete and Biased Datasets using Machine Learning and Databricks

Summit 2020

At Atlassian, product analytics exists to help our teams build better products by capturing and describing in-product behaviour. Within our on-premise products, only a subset of customers choose to send us anonymised event data, meaning we have an incomplete and biased dataset. In this world, something as simple as 'what percentage of customers use feature X' becomes a non-trivial estimation task. It becomes more complex still when a metric is subadditive, such as the number of distinct users of a product feature: one user using the feature on multiple (and possibly unknown) instances should be counted only once, and our methodology needs to account for this. In this talk, we'll dive into our estimation methods and the adjustments we make for various metrics, providing an accessible guide to operating in this environment. We'll also discuss how we democratised these estimation methods, allowing any stakeholder who can write a query to immediately access our models and create accurate and consistent estimates.
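To make the 'what percentage of customers use feature X' problem concrete, here is a minimal sketch of one common adjustment, post-stratification. It is an illustrative assumption, not the method presented in the talk: the function name `poststratified_rate`, the strata ("small", "large") and all figures are hypothetical. The idea is to estimate the usage rate within each stratum from the customers who send data, then weight each stratum by its known share of the full customer base.

```python
from collections import defaultdict


def poststratified_rate(population_counts, opted_in):
    """Estimate the share of all customers using a feature.

    population_counts: {stratum: total customers in that stratum}
    opted_in: iterable of (stratum, uses_feature) pairs, one per
              customer who sends event data
    """
    seen = defaultdict(int)   # opted-in customers per stratum
    users = defaultdict(int)  # opted-in feature users per stratum
    for stratum, uses_feature in opted_in:
        seen[stratum] += 1
        users[stratum] += int(uses_feature)

    total = sum(population_counts.values())
    estimate = 0.0
    for stratum, size in population_counts.items():
        if seen[stratum] == 0:
            raise ValueError(f"no opted-in customers in stratum {stratum!r}")
        # Weight the observed in-stratum rate by the stratum's population share.
        estimate += (size / total) * (users[stratum] / seen[stratum])
    return estimate


# Hypothetical figures: large customers are over-represented among those
# who send data, and they use the feature more often.
population = {"small": 900, "large": 100}
sample = (
    [("small", True)] * 10 + [("small", False)] * 40
    + [("large", True)] * 40 + [("large", False)] * 10
)

naive = sum(uses for _, uses in sample) / len(sample)
adjusted = poststratified_rate(population, sample)
print(f"naive: {naive:.2f}, post-stratified: {adjusted:.2f}")
```

In this made-up example the naive rate over the opted-in customers is 0.50, while the reweighted estimate is 0.26, because small customers make up 90% of the base but only half of the sample. A subadditive metric such as distinct users would need further handling that this sketch does not attempt.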