Whether you are working on a live title, pre- or post-production, ongoing maintenance, future releases, another version of a game, or a brand new title for the market, you're always looking for feedback from the community. There's no shortage of it, but it can be overwhelming and hard to sift through. For games shipped on PC and sold through Valve's Steam Store, Steam's game reviews are a great source of player feedback. We have built a new solution accelerator for Player Review Analysis (available here on GitHub) that combines natural language processing and machine learning techniques to help game developers understand their players better and respond through game design, backend operations, live operations (LiveOps), marketing and, truly, all lines of business.
With Steam's game reviews, you have the opportunity to see:
Say you've gathered this feedback and landed it in your data platform. What's next? How do you make sense of it all? Reading through hundreds or thousands of plain-text reviews (unstructured data) to reliably find patterns or insights can be daunting.
This is where the power of natural language processing comes in. With this machine learning (ML) solution, you can extract the key terms in each review and their associated positive, neutral, or negative sentiment. Using ML, you can mitigate biases and see what the data is really telling you. This insight can be produced at an aggregated or player-specific level. When analyzing your own title, you will have access to your Player ID and be able to align that with Steam's Game ID. With this, you can augment your player360 datasets with the sentiment expressed on Steam, enabling you to proactively take action to improve engagement, retention and revenue metrics.
Imagine a high-value player has just posted an incredibly negative review. The sooner you make that connection, the faster you can act to address what's going on, engage with the player (and the broader community) directly, and improve your chances of retaining them. This type of analysis is especially critical for live service titles, which ship in cycles of constant iteration.
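As a rough illustration of that scenario, the sketch below joins a tiny player360 table to review sentiment and flags high-value players whose latest review is negative. The field names (`player_id`, `steam_id`, `lifetime_value`), the threshold, and the idea that you hold a mapping from your internal player to the Steam account that wrote the review are all assumptions made for the example, not part of the accelerator.

```python
def flag_at_risk(players, review_sentiment, min_lifetime_value=100.0):
    """Return IDs of high-value players whose latest review is negative.

    players: list of dicts with hypothetical keys
             'player_id', 'steam_id', 'lifetime_value'.
    review_sentiment: dict mapping steam_id -> 'positive' | 'neutral' | 'negative'.
    """
    flagged = []
    for player in players:
        sentiment = review_sentiment.get(player["steam_id"])
        is_high_value = player["lifetime_value"] >= min_lifetime_value
        if is_high_value and sentiment == "negative":
            flagged.append(player["player_id"])
    return flagged


# Hypothetical sample data, for illustration only.
players = [
    {"player_id": "p1", "steam_id": "s1", "lifetime_value": 250.0},
    {"player_id": "p2", "steam_id": "s2", "lifetime_value": 20.0},
    {"player_id": "p3", "steam_id": "s3", "lifetime_value": 500.0},
]
review_sentiment = {"s1": "negative", "s2": "negative", "s3": "positive"}

at_risk = flag_at_risk(players, review_sentiment)
```

In a real pipeline this would more likely be a join between tables in your data platform; the plain-Python version only shows the shape of the logic.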
The insight derived is useful across the board:
Now that we understand the why, the how and the impact, let's get to the fun stuff!
In the sections below, we walk through how to take reviews from Steam, then process and curate the unstructured text into actionable data.
Note: Though we only cover Steam, the same pattern can be applied to many other sources of data.
In the data ingestion phase of the sentiment analysis solution, we use the Steam API to gather game reviews. The raw data is cleaned to remove irrelevant or corrupt records and filtered to keep only reviews written in English. The cleaned and filtered data is stored in the bronze layer of our data pipeline, serving as the foundational dataset for subsequent analysis stages.
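The ingestion step can be sketched roughly as follows. This is a minimal standard-library example, not the accelerator's code: the endpoint, query parameters, and response field names reflect our understanding of Steam's public store reviews endpoint and should be checked against Steam's documentation, and the output row layout is a hypothetical bronze schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public reviews endpoint; verify against Steam's documentation.
STEAM_REVIEWS_URL = "https://store.steampowered.com/appreviews/{app_id}"


def build_reviews_url(app_id, cursor="*", language="english", num_per_page=100):
    """Build the request URL for one page of reviews."""
    params = {
        "json": 1,
        "filter": "recent",
        "language": language,
        "cursor": cursor,
        "num_per_page": num_per_page,
    }
    return STEAM_REVIEWS_URL.format(app_id=app_id) + "?" + urlencode(params)


def fetch_page(app_id, cursor="*"):
    """Fetch one page of reviews as a dict (performs a network call)."""
    with urlopen(build_reviews_url(app_id, cursor), timeout=30) as resp:
        return json.load(resp)


def clean_reviews(payload):
    """Drop empty or non-English reviews and flatten the rest into rows."""
    rows = []
    for item in payload.get("reviews", []):
        text = (item.get("review") or "").strip()
        if not text or item.get("language") != "english":
            continue
        author = item.get("author") or {}
        rows.append({
            "recommendation_id": item.get("recommendationid"),
            "steam_id": author.get("steamid"),
            "num_games_owned": author.get("num_games_owned"),
            "num_reviews": author.get("num_reviews"),
            "playtime_forever": author.get("playtime_forever"),
            "voted_up": item.get("voted_up"),
            "review": text,
        })
    return rows
```

A caller would loop over `fetch_page`, passing the `cursor` value returned in each response to request the next page, and write the output of `clean_reviews` to the bronze table.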
In this section, we create a data processing pipeline using Spark NLP. It begins by structuring and cleaning the text, then identifies sentences and breaks them into individual words, ensuring uniformity in representation. After standardizing the words and removing common but non-informative terms, it enriches the text by embedding words into a numerical vector space, facilitating deeper linguistic analysis. Additionally, it leverages a pre-trained model from John Snow Labs to automatically detect positive, negative and neutral aspects of the game from user reviews. Instead of labeling the entire review as negative or positive, this model identifies the sentiment of specific phrases within the review.
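To make the early stages concrete, here is a plain-Python sketch of what sentence detection, tokenization, normalization and stop-word removal do to a review. It is not Spark NLP and does not cover the embedding or aspect-sentiment stages. In Spark NLP, these steps would typically be handled by annotators such as `DocumentAssembler`, `SentenceDetector`, `Tokenizer`, `Normalizer` and `StopWordsCleaner`, though the exact stages used by the accelerator are not shown here. The stop-word list and regular expressions below are simplified assumptions.

```python
import re

# A deliberately tiny stop-word list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "and", "but", "it", "this", "to", "of", "i"}


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Break a sentence into word-like tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


def normalize(tokens):
    """Lowercase tokens and trim stray apostrophes."""
    cleaned = (t.lower().strip("'") for t in tokens)
    return [t for t in cleaned if t]


def remove_stopwords(tokens):
    """Drop common but non-informative terms."""
    return [t for t in tokens if t not in STOPWORDS]


def preprocess(review):
    """Return one list of informative tokens per sentence."""
    return [
        remove_stopwords(normalize(tokenize(sentence)))
        for sentence in split_sentences(review)
    ]


tokens = preprocess("The combat is great! But the servers are laggy.")
```

Keeping tokens grouped by sentence matters for the later stage: aspect-level sentiment attaches to phrases such as "servers" or "combat" rather than to the review as a whole.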
Moving to the next section of our sentiment analysis solution, we employ k-means clustering to segment the authors of the gaming reviews based on their metadata. This clustering is executed using PySpark's MLlib, which efficiently handles large datasets by distributing the computation across multiple nodes. This segmentation adds a layer of granularity to our dashboard, enabling deeper insights into different user demographics and behaviors.
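The idea behind the clustering step can be illustrated with a small, single-machine k-means written in plain Python. This is a sketch of the algorithm only: the accelerator uses PySpark's MLlib, which distributes the same computation. The feature columns (games owned, reviews written, playtime hours), the sample values, the choice of k, and the deterministic "first k points" initialization are all assumptions made for the example.

```python
import math


def standardize(rows):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=20):
    """Basic Lloyd's algorithm; returns (centroids, cluster index per point)."""
    centroids = [list(p) for p in points[:k]]  # deterministic init for the sketch
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Hypothetical author metadata: (games owned, reviews written, playtime hours).
authors = [(5, 1, 20.0), (8, 2, 35.0), (400, 60, 2500.0), (350, 45, 3100.0)]
centroids, segments = kmeans(standardize(authors), k=2)
```

With these sample values, the two casual authors and the two heavily invested authors end up in separate segments, which is the kind of grouping a dashboard can then filter on.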
Now that you have your labeled data, you can put it to use. A product manager might look at this dataset, see high negativity related to a specific game feature, and adjust their pipeline to address it more quickly. Someone in operations might look at where players complaining about server dropouts are concentrated geographically to identify potential multiplayer server orchestration issues across markets. A LiveOps content creator might find more positivity around BFGs and invest more time building skins for them.
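A simple way to surface the kind of signal a product manager would look for is to rank aspects by their share of negative mentions. The sketch below assumes the labeled data has been reduced to `(aspect, sentiment)` pairs; that shape and the sample values are hypothetical.

```python
from collections import Counter


def negative_share_by_aspect(labeled):
    """Rank aspects by the fraction of their mentions that are negative."""
    totals = Counter()
    negatives = Counter()
    for aspect, sentiment in labeled:
        totals[aspect] += 1
        if sentiment == "negative":
            negatives[aspect] += 1
    shares = [(aspect, negatives[aspect] / totals[aspect]) for aspect in totals]
    # Highest negative share first; ties broken alphabetically.
    return sorted(shares, key=lambda item: (-item[1], item[0]))


# Hypothetical labeled mentions, for illustration only.
labeled = [
    ("servers", "negative"), ("servers", "negative"), ("servers", "positive"),
    ("combat", "positive"),
    ("skins", "negative"), ("skins", "positive"),
]
ranking = negative_share_by_aspect(labeled)
```

In practice you would likely also keep the raw mention counts, since an aspect with one negative mention out of one is less actionable than one with hundreds.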
You now have a dataset that gives you insight into what your players are saying at scale. This could be used to help personalize the experience of your players and increase retention. By taking this as an input and connecting it to your internal datasets on engagement and revenue, you can inform actions by community managers, customer support and marketing, as well as offer recommendations. Acquiring players is expensive, and finding the players you want to keep is challenging. This insight provides an opportunity to engage with your community, build a deeper relationship with it and, by doing so, improve your player retention.
This solution accelerator for Player Review Analysis combines natural language processing and machine learning techniques to help game developers understand their players better and respond through game design, backend operations, live operations (LiveOps), marketing and, truly, all lines of business. A game company in pre-production that is looking to build something new might analyze similar games to find hot buttons (positive and negative) for its target players. A studio in beta might use it to respond quickly to feedback across all players, or after launch to continuously improve the title over time and maximize engagement.
This solution accelerator (available here on GitHub) is focused on the analysis of Steam reviews, but that's just one data source. The same approach can be used to analyze reviews from other sites, forums, support tickets, surveys, or indeed any plain-text feedback you have access to. As long as you can collect it and ingest it into this workflow, it can be used.
Feedback is a gift. We are excited to help those voices be heard, grow player engagement and assist as you further the fun.