{ claus.conrad }

Analyzing OngoingWorlds posts

πŸ“… Jul 15, 2017
βŒ› 2 minutes

The previous post used Scrapy to extract post data from the website OngoingWorlds. Here are a few conclusions from that spider crawl:

I collected the game ID, post ID and date/time for each post from the play-by-email roleplaying community OngoingWorlds into an Sqlite3 database. Even with this very limited dataset, some interesting queries can be run:

SELECT game_id, COUNT(*) FROM post GROUP BY game_id ORDER BY COUNT(*) DESC LIMIT 10;
RankGameTotal posts
1Blue Dwarf15040
2Hero High3453
32778 A.D.2894
4The Land of Ecilith2276
5The Avengers~Lower Levels2111
6Heroes Association1288
7Hunted1265
8Circle of Nine1176
9Fairy Tail ZERO1125
10MLP fans!1118
SELECT
game_id AS GameID,
  (
  SELECT strftime("%H", timestamp) AS Hour
  FROM post AS inner
  WHERE inner.game_id = outer.game_id
  GROUP BY Hour
  ORDER BY COUNT(*) DESC
  LIMIT 1) AS MostPopularHour,
COUNT(*) TotalPosts
FROM post AS outer
GROUP BY GameID
ORDER BY MostPopularHour, TotalPosts DESC

For easier viewing I exported the result to a CSV spreadsheet, as follows:

sqlite3 -header -csv results.db 'SELECT game_id AS GameID, (SELECT strftime("%H", timestamp) AS Hour FROM post AS inner WHERE inner.game_id = outer.game_id GROUP BY Hour ORDER BY COUNT(*) DESC LIMIT 1) AS MostPopularHour, COUNT(*) TotalPosts FROM post AS outer GROUP BY GameID ORDER BY MostPopularHour, TotalPosts DESC' > results-agg.csv
Hour of the day (CEST)Game
0The Elite Club
1The Hotel
2Hero High
3The Avengers~Lower Levels
4The Land of Ecilith
5Redwill Home for Unusual and Odd Children
6Cure
7Advanced
8Battle of the Bands
9Vale of Shadows
10Pokemon Azure
11The open trail
12BRAIN GAMES: Raido Ravens
13Bloody Gifts
14Circle of Nine
15Academy for Super Humans
16MLP fans!
17Gakuen Statalia
18The Verse - Other Adventures in the Firefly Universe
19Magic Agents
20Blue Dwarf
21Another West
22Heroes Association
23Day After, Hero’s Past

Graph showing total posts per hour of the day