‘Big Data’ is the now-ubiquitous term for the massive amount of data available in nearly every domain. Big Data has myriad applications that are changing the way many industries work, but because most datasets are unstructured or even inaccessible, it can be impossible to build applications that leverage Big Data’s power without tools to find, transform, analyze, and visualize data [1]. Organizing data in a meaningful way is thus an open challenge in Big Data. One approach to Big Data integration is the use of semantic technologies, in particular data categorization using ontologies. The idea of enhancing knowledge acquisition by structuring data is not new; see [2] and other early proposals for the World Wide Web, which included the vision of a Web of Data, or the Semantic Web. Applying this established idea to the new problem of Big Data is a viable way to better prepare data for decision-making.
One area in which an explosion of data has occurred is entertainment (movies, books, television shows, videogames, etc.). The field of recommender systems was created in response to the huge amount of content now available to consumers. One of the first and most popular methods of designing recommender systems was the ratings-based method, but in the past decade several systems have integrated ontologies in order to use content metadata other than ratings. Semantic technologies have been instrumental in addressing some of the identified issues with recommender systems, such as limited content analysis, overspecialization, and the new user problem [3]. This project addresses a subset of overspecialization in videogame recommender systems: stereotyping of demographic segments, which leads to overspecialized and even inaccurate videogame recommendations. For example, a middle-aged woman will be recommended a puzzle/matching game like Bejeweled, while a teenage boy will be recommended a first-person shooter like Call of Duty, regardless of their personal preferences.
2. BACKGROUND AND RELATED WORK
There are currently three subsets of videogame recommender systems. The most primitive yet pervasive are online quizzes, usually generated by online quiz sites. These quizzes are designed primarily to entertain and draw on very limited data about both users and content, which limits their usefulness as recommenders. The second subset is online forums like [4], wherein users can request suggestions from, and provide suggestions to, other users in a thread-based format. While the suggestions are generally very accurate because they draw on the judgment of other players, this method lacks scalability, usability, and accessibility. The final subset is suggestion software. GamesLikeFinder.com is a hand-selected collection of game recommendations which sacrifices scalability by not leveraging Big Data, while TasteKid.com is a suggestion site built for all entertainment content which follows the ratings-based approach and thus suffers from limited content analysis and overspecialization. Videogame suggestion software which successfully integrates Big Data without stereotyping demographic segments does not currently exist.
There are several examples of the use of semantic technologies to improve recommender systems across a wide variety of application domains. [5] uses content metadata pulled from a domain ontology to enhance the quality of discovered patterns. [6] overcomes overspecialization by applying reasoning techniques that make the system flexible and ultimately allow it to offer accurate, enhanced suggestions. [7] produces recommendations using the semantic information of items and user demographic data. While these examples successfully address some of the issues named in [3], there is no literature on recommender systems that addresses demographic stereotyping.
3. APPROACH AND UNIQUENESS
Instead of collecting demographic data about users, like gender and age, the author proposes to collect user metadata using the social network APIs of videogame platforms. [8] suggests using metadata such as badges and trophies earned and the amount of time played to generate a better picture of a user than purely demographic metrics. This approach has not yet been applied to videogame recommender systems; the author would like to discover whether it can be expressed using an ontology in order to improve videogame suggestions by associating attributes pulled from playing metrics with users. As an example, user Mary has played Destiny for PlayStation 4 for 35 hours; she is associated with the attribute <playtime = 35> for <videogame = Destiny>. This user is thus strongly associated with Destiny, which has the attribute <genre = RolePlayingGame FirstPersonShooter>. Mary will be recommended games that are also role-playing games and first-person shooters.
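The Mary example can be illustrated with a minimal Python sketch. It is not the project's implementation: it assumes a simple scoring rule in which hours played are credited to each genre of a played game, and unplayed games are ranked by the summed weight of their genres. The names Game, genre_weights, and recommend are hypothetical, as is the small catalog, which uses only the games and genres named in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    """A catalog entry; genres mirror the <genre = ...> attribute."""
    title: str
    genres: frozenset


# Hypothetical catalog built from the games mentioned in this paper.
CATALOG = {
    "Destiny": Game("Destiny", frozenset({"RolePlayingGame", "FirstPersonShooter"})),
    "Call of Duty": Game("Call of Duty", frozenset({"FirstPersonShooter"})),
    "Bejeweled": Game("Bejeweled", frozenset({"Puzzle"})),
}


def genre_weights(playtime_hours, catalog):
    """Credit each played game's hours to every genre of that game."""
    weights = defaultdict(float)
    for title, hours in playtime_hours.items():
        for genre in catalog[title].genres:
            weights[genre] += hours
    return dict(weights)


def recommend(playtime_hours, catalog, top_n=3):
    """Rank unplayed games by the summed weight of their genres (assumed rule)."""
    weights = genre_weights(playtime_hours, catalog)
    scored = []
    for title, game in catalog.items():
        if title in playtime_hours:
            continue  # do not recommend games the user already plays
        score = sum(weights.get(genre, 0.0) for genre in game.genres)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


# Mary: <playtime = 35> for <videogame = Destiny>
mary = {"Destiny": 35}
print(recommend(mary, CATALOG))
```

Under this assumed rule, Mary's recommendations depend only on what she has played, not on her age or gender; a game that shares no genre with her history receives a score of zero and is omitted.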
This three-phase project is under development and will be completed in May of 2015. The first phase, which entails creating a videogame ontology, is complete. The second phase, which entails creating a user metadata ontology, is in progress. The final phase will be integrating the two ontologies using semantic reasoning.
3.1 Approach
Several large videogame databases expose their data through APIs. The author chose TheGamesDB.net, a database with more than 25,000 games, because of its accessibility and ease of use, although future work may include pulling from multiple databases. The author created a custom ontology, VideoGame, based on Schema.org’s CreativeWork type. Even though Schema.org provides a VideoGame type, the author departed from it because (a) Schema.org does not use the OWL format and (b) there was considerable disparity between the properties defined by Schema.org’s VideoGame type and the metadata provided by videogame databases.
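As a rough illustration of what basing a VideoGame class on CreativeWork means for reasoning, the following Python sketch stores an ontology fragment as plain (subject, predicate, object) triples and infers class membership through rdfs:subClassOf. It is a toy under stated assumptions, not the ontology shown in Figure 1: the "ex:" prefix, the ex:genre property, and the helper functions are hypothetical, and a real implementation would use an OWL toolkit and reasoner.

```python
# Each triple is (subject, predicate, object). The "ex:" names are hypothetical.
TRIPLES = {
    ("ex:VideoGame", "rdf:type", "owl:Class"),
    ("ex:VideoGame", "rdfs:subClassOf", "schema:CreativeWork"),
    ("ex:genre", "rdf:type", "owl:DatatypeProperty"),
    ("ex:genre", "rdfs:domain", "ex:VideoGame"),
    ("ex:Destiny", "rdf:type", "ex:VideoGame"),
    ("ex:Destiny", "ex:genre", "RolePlayingGame"),
    ("ex:Destiny", "ex:genre", "FirstPersonShooter"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def instances_of(cls, triples):
    """Return individuals typed as cls or as any subclass of cls."""
    result = set()
    for s, p, o in triples:
        if p == "rdf:type" and (o == cls or cls in superclasses(o, triples)):
            result.add(s)
    return result


def values(subject, predicate, triples):
    """Return the sorted objects asserted for subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Destiny is asserted to be a VideoGame, so it is inferred to be a CreativeWork.
print(instances_of("schema:CreativeWork", TRIPLES))
print(values("ex:Destiny", "ex:genre", TRIPLES))
```

The subclass link is what lets a custom class inherit meaning from CreativeWork while still defining its own properties to match the metadata that videogame databases actually provide.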
The author has identified Steam as the videogame platform social network from which to obtain user metadata. Steam is a network for PC games, and it is the only platform for which an official API is provided. APIs for PlayStation Network and Xbox Live exist but are not official; therefore, they do not allow for username verification and are not as fully supported. Future work may include pulling from these APIs as well in order to cover a larger cross-section of users, namely users of the two major gaming consoles. The author will create a custom user metadata ontology based on the most important user attributes.
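The playtime metadata could be gathered along the following lines. This Python sketch assumes the Steam Web API's GetOwnedGames method and its playtime_forever field, which reports minutes played; the endpoint version, parameters, and field names should be checked against Valve's documentation before use. The sample payload is invented for illustration and is not real user data, and the helper names are hypothetical. No network request is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Steam Web API's GetOwnedGames method.
BASE_URL = "https://api.steampowered.com/IPlayerService/GetOwnedGames/v1/"


def owned_games_url(api_key, steam_id):
    """Build the request URL; the caller supplies a valid key and 64-bit Steam ID."""
    query = urlencode({
        "key": api_key,
        "steamid": steam_id,
        "include_appinfo": 1,  # ask for game names as well as app IDs
        "format": "json",
    })
    return f"{BASE_URL}?{query}"


def playtime_hours(payload):
    """Map each game to hours played, converting from minutes."""
    games = payload.get("response", {}).get("games", [])
    hours = {}
    for game in games:
        # Fall back to the app ID when no name is included in the response.
        label = game.get("name", str(game["appid"]))
        hours[label] = game.get("playtime_forever", 0) / 60
    return hours


# Invented payload in the assumed response shape (not real data).
SAMPLE_PAYLOAD = {
    "response": {
        "game_count": 2,
        "games": [
            {"appid": 1, "name": "Example Shooter", "playtime_forever": 2100},
            {"appid": 2, "name": "Example Puzzle", "playtime_forever": 30},
        ],
    }
}

print(playtime_hours(SAMPLE_PAYLOAD))
```

The resulting mapping of game to hours is the kind of user attribute that the user metadata ontology would need to represent.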
3.2 Evaluation
When the project is complete, a study will be conducted to determine the difference between this project and one that incorporates demographic metadata. The author has selected the recommendation system TasteKid as a suitable benchmark. This comparison is flawed because (a) TasteKid does not incorporate semantic technology and (b) this project is intended to be single-use and does not benefit from user history as TasteKid does. However, these flaws cannot be overcome: there is no videogame recommender which uses ontologies, and including a secure user login system is beyond the scope of this project.
The author will survey subjects on a volunteer basis by asking them to find three videogame suggestions from each system and then participate in an interview about their experience with the two systems. The results will be statistically analyzed using linear regression.
4. RESULTS AND CONTRIBUTIONS
The VideoGame ontology is complete and displayed in Figure 1. The ontology for the user metadata will be completed during phase two. Designing the properties that connect the user metadata ontology to the VideoGame ontology in phase three will be the defining aspect of this project. Its success will hinge on the quality and relevance of the semantic relationships between the two ontologies. The resulting product will be a hosted videogame recommender web application.
Once the project is complete, it will be considered successful if the difference between the two systems is statistically significant; that is, if the semantic application designed for this project performs better than the benchmark application at a confidence level of 90%.
4.2 Contributions
Videogames and recommender systems represent a cross-section of areas in which women are underrepresented. While women make up nearly half of all gamers, they account for less than five percent of videogame programmers and about twenty-five percent of general programmers. By identifying areas in which gender disparity exists and discovering ways to view users as personalities rather than demographics, this project aims to advance gender equality in the fields of both computer science and videogames. It will also address the relatively overlooked issue of demographic stereotyping in recommender systems in general.
5. ACKNOWLEDGEMENTS
The author would like to acknowledge the CREU program, including the CDC and the CRA-W, for making this project possible through their generous research grant. The author would also like to thank Dr. Srividya Bansal for providing mentorship and guidance throughout the project. Finally, the author acknowledges the Ira A. Fulton Schools of Engineering for supporting the Big Data and Semantic Computing Lab at ASU Polytechnic.
6. REFERENCES
[1] Kandogan, E., Kieliszewski, C., Ozcan, F., Roth, M., Schloss, B., and Schmidt, M.-T. 2013. Data for All: A Systems Approach to Accelerate the Path from Data to Insight. In 2013 IEEE International Congress on Big Data (BigData Congress), June 2013, 427-428.
[2] Berners-Lee, Tim. 1989. History. Information Management: A Proposal. http://www.w3.org/History/1989/proposal.html.
[3] Adomavicius, G. and Tuzhilin, A. 2005. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions, 17, 6, 734-749.
[4] Reddit. 2015. GamingSuggestions. http://www.reddit.com/r/gamingsuggestions.
[5] Adda, M., Djeraba, C., Missaoui, R., and Valtchev, P. 2007. Toward Recommendation Based on Ontology-Powered Web-Usage Mining. Internet Computing, IEEE, 11, 4, 45-52.
[6] Blanco-Fernandez, Y., Gil-Solla, A., Lopez-Nores, M., Pazos-Arias, J., and Ramos-Cabrer, M. 2008. Providing entertainment by content-based filtering and semantic reasoning in intelligent recommender systems. Consumer Electronics, IEEE Transactions, 17, 2, 727-735.
[7] Kong, Fan-Sheng and Wang, Rui-Qin. 2007. Semantic-Enhanced Personalized Recommender System. In 2007 International Conference on Machine Learning and Cybernetics, August 2007, 4069-4074.
[8] Lees, Jennie. 2014. Stop Recommending Bejeweled: Using Data Science to Broaden Game Audiences. In Grace Hopper Celebration of Women in Computing, October 2014.