Twitch Social Network Analysis Project

Category: Data Analysis

Client: -

Duration: 1 month

Introduction

This project applies social network analysis to Twitch, a popular live-streaming platform. It uses a public dataset from the Stanford Network Analysis Project (SNAP) to explore how Twitch users are connected to one another.

The dataset comprises 168,114 nodes and 6,797,557 edges, where nodes are Twitch users and edges are mutual follower relationships between them. Collected in Spring 2018 from Twitch's public API, it is intended for machine learning applications such as node classification, user behavior prediction, and community detection.

Dataset

Applications of the dataset include:

  • Node classification

  • Count data regression

  • Content streamer identification

  • Broadcaster language prediction

  • User lifetime estimation

  • Churn prediction

  • Affiliate status identification

  • View count estimation

Project Process

  1. Graph Construction:

    • Utilized Python's NetworkX library to build and analyze the network graph.

    • Parsed the dataset to extract node and edge information, creating a structured representation.

  2. Visualization:

    • Employed Matplotlib and NetworkX to visualize the graph.

    • Adjusted parameters (e.g., node size, color, edge width) for clarity.

    • Compared layout algorithms such as spring and circular layouts to make the graph's structure easier to read.

  3. Analysis:

    • Explored key metrics: Degree centrality, closeness centrality, betweenness centrality, network diameter, and edge connectivity.

    • In the analyzed subset, observed an ultra-small-world structure (every node reachable from every other in a single hop) forming one very dense community.
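The construction and analysis steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the in-memory sample edge list, the column names `numeric_id_1` and `numeric_id_2`, and the helper names `build_graph` and `summarize` are all assumptions made for the example, and the real SNAP file should be checked before reusing them.

```python
import csv
import io

import networkx as nx

# Hypothetical stand-in for the SNAP edge list. The column names are an
# assumption; verify them against the downloaded file before reuse.
SAMPLE_EDGES = """numeric_id_1,numeric_id_2
0,1
0,2
1,2
2,3
3,4
"""


def build_graph(edge_file):
    """Read a two-column CSV edge list into an undirected NetworkX graph."""
    graph = nx.Graph()
    for row in csv.DictReader(edge_file):
        graph.add_edge(int(row["numeric_id_1"]), int(row["numeric_id_2"]))
    return graph


def summarize(graph):
    """Compute the metrics named in the Analysis step for a connected graph."""
    n = graph.number_of_nodes()
    m = graph.number_of_edges()
    return {
        "nodes": n,
        "edges": m,
        "average_degree": 2 * m / n,
        # nx.diameter raises an error if the graph is not connected.
        "diameter": nx.diameter(graph),
        "edge_connectivity": nx.edge_connectivity(graph),
        "degree_centrality": nx.degree_centrality(graph),
        "closeness_centrality": nx.closeness_centrality(graph),
        "betweenness_centrality": nx.betweenness_centrality(graph),
    }


graph = build_graph(io.StringIO(SAMPLE_EDGES))
summary = summarize(graph)
print(summary["nodes"], summary["edges"], summary["diameter"])
```

To run the same sketch on a file on disk, pass an open file handle to `build_graph` instead of the `io.StringIO` wrapper. On the full dataset, betweenness centrality and diameter are expensive to compute exactly, which is one reason to work on a subset or to use sampling-based approximations.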

Results

  • Network Metrics:

    • Nodes: 49 (subset)

    • Edges: 1,176

    • Average Node Degree: 48

    • Network Diameter: 1

  • Key Findings:

    • Highly interconnected network with rapid information propagation capabilities.

    • Strong resilience against disruptions due to high edge connectivity.

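    • Uniform centrality values: the reported figures match a complete graph on 49 nodes (49 × 48 / 2 = 1,176 edges), in which every node has identical centrality. This suggests the subset is a clique or was otherwise curated, rather than a representative sample.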
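The reported metrics can be cross-checked against the closed-form values for a complete graph, where every pair of nodes is connected. The sketch below uses only arithmetic; the function name `complete_graph_metrics` and the `reported` dictionary are illustrative and simply restate the numbers listed above.

```python
def complete_graph_metrics(n):
    """Expected metrics for a simple undirected complete graph on n nodes."""
    edges = n * (n - 1) // 2          # every unordered pair is an edge
    average_degree = 2 * edges / n    # each edge contributes to two degrees
    diameter = 1 if n > 1 else 0      # all nodes are direct neighbours
    return {
        "edges": edges,
        "average_degree": average_degree,
        "diameter": diameter,
    }


# Values copied from the Network Metrics list above.
reported = {"edges": 1176, "average_degree": 48, "diameter": 1}

expected = complete_graph_metrics(49)
matches = all(expected[key] == value for key, value in reported.items())
print(expected, matches)
```

If `matches` is true, the subset has exactly the edge count, average degree, and diameter of a 49-node clique, which also explains why every node receives the same centrality score.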

Reflections and Learnings

What Went Well

  • Successfully implemented graph construction and analysis techniques.

  • Gained deep insights into the dynamics of high-density networks.

Areas for Improvement

  • Improve data representation and draw representative samples; the 49-node subset analyzed here is a very small fraction of the 168,114-node dataset.

  • Incorporate dynamic network analysis earlier in the process.

  • Broaden the dataset for more comprehensive insights.

Learnings

  • Enhanced understanding of network density implications, centrality measures, and resilience.

Future Work

  1. Network Analysis: Explore interaction patterns, identify influencers, and uncover clusters.

  2. Content Preference Analysis: Analyze user-content relationships for content strategy insights.

  3. Community Detection: Use algorithms to find and understand subgroups within Twitch.

  4. User Behavior Prediction: Build models for predicting user engagement and content popularity.

  5. Anomaly Detection: Identify unusual patterns to maintain platform integrity.

  6. Content Recommendation Systems: Develop graph-based personalized recommendations.

  7. Influence Analysis: Quantify user influence using engagement metrics and centrality measures.

  8. Dynamic Network Analysis: Study the evolution of the Twitch network over time.

Dataset Access

The dataset can be accessed at the SNAP Project Website.

Authors: Dhruv Singh, Kunal Samant
License: MIT
Contact: dsingh28@hawk.iit.edu, ksamant@hawk.iit.edu
