Mostly for my own amusement, I have written a Twitter bot that examines the front page of the Mail Online once every hour and tweets out statistics of interest. You can follow it at the unfortunately named @mailtits account.
The original impulse came in 2012, when I wrote a scraper to pull out the soft porn captions from the Sidebar of Shame. But running that meant I would have to read them all, pick out the creepy ones, and edit them down to 140 characters. I value my time too much, so that idea died.
During the Brexit campaign, though, I started to wonder how much politics was impinging on the serious business of women’s bodies. This was much easier to automate. I simply wrote a scraper that counted such phrases as “flaunts her”, “shows off her”, “curves”, “enviable”, “wardrobe malfunction”, and so on, and measured their frequency against mentions of “migrants”, “Brexit” and the EU.
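The counting step can be sketched roughly as follows. This is not the bot's actual code: the names `BODY_PHRASES`, `POLITICS` and `count_mentions` are made up for illustration, the sample text is invented, and fetching the page is left out entirely.

```python
import re

# Illustrative groupings only: the phrases come from the post,
# but the pattern names and this two-way split are assumptions.
BODY_PHRASES = re.compile(
    r"flaunts her|shows off her|curves|enviable|wardrobe malfunction"
)
POLITICS = re.compile(r"migrants|Brexit|\bEU\b")


def count_mentions(text):
    """Return (body_count, politics_count) for a block of page text."""
    body_count = len(BODY_PHRASES.findall(text))
    politics_count = len(POLITICS.findall(text))
    return body_count, politics_count


# Invented sample text standing in for the scraped front page.
sample = (
    "Star flaunts her curves on the beach. "
    "Brexit talks stall as EU leaders meet. "
    "Actress shows off her enviable figure."
)

body, politics = count_mentions(sample)
print(f"bodies: {body}, politics: {politics}")
```

Because the patterns contain no capture groups, `findall` returns one entry per match, so the length of the list is the number of mentions.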
I ran that for three months over the summer, putting the results into a Google spreadsheet. This was illuminating. Only at moments of great crisis did mentions of “Migrants” or “Brexit” ever outnumber mentions of the Kardashian family. You would not believe how important the Kardashians are to the Mail Online: I once recorded 130 mentions of members of the family at one time, and that was before I started to count Kanye West as one. Obviously, no politician could remotely compete with this.
I keep tinkering with this, which is one reason I am reluctant to publish the spreadsheet. Every hour the account tweets out one of four reports, randomly chosen, showing the verbs, the nouns, the quirks, or the balance of showbiz and politics on the pages of the most successful newspaper site in the world. As of 14/12/16, these are the words and phrases I monitor:
import re

# Here follow the things to look for
# Verbs
showcases = re.compile('showcases her')
flashes = re.compile('flashes her')
showsoff = re.compile('shows off her')
displays = re.compile('displays her')
reveals = re.compile('reveals her')
flaunt = re.compile('flaunts her')
# The Adverb
very = re.compile('VERY')
# Adjectives
braless = re.compile('braless')
topless = re.compile('topless')
enviable = re.compile('enviable')
# Body parts
sideboob = re.compile('sideboob')
wardrobe_malfunction = re.compile('wardrobe malfunction')
nip_slip = re.compile('nip slip')
cleavage = re.compile('cleavage')
legs = re.compile('legs|pins')
assets = re.compile('assets')  # changed from 'enviable assets' on 9/6/16
curves = re.compile('curves')
bodies = re.compile('beach|bikini|ready body', re.I)
# People (are there any others?)
kardashians = re.compile(
    'Kim Kardashian|Kris Jenner|Kylie Jenner|Caitlyn Jenner|Blac Chyna|Tyga|Kourtney|Khloe|Kanye')
migrants = re.compile('migrant')
# Politics
eu = re.compile('EU ')
brexit = re.compile('[bB]rexit')
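How the hourly report gets picked is not shown above, so here is one way it might work, as a sketch under assumptions. The `REPORTS` grouping, the function names and the output format are all hypothetical; only a handful of the patterns are included, and the 140-character cap is borrowed from the tweet limit mentioned earlier.

```python
import random
import re

# Hypothetical assignment of a few of the patterns to the four report types.
REPORTS = {
    "verbs": {
        "flaunts her": re.compile("flaunts her"),
        "shows off her": re.compile("shows off her"),
    },
    "nouns": {
        "cleavage": re.compile("cleavage"),
        "curves": re.compile("curves"),
    },
    "quirks": {
        "VERY": re.compile("VERY"),
        "wardrobe malfunction": re.compile("wardrobe malfunction"),
    },
    "balance": {
        "Kardashians": re.compile("Kim Kardashian|Kanye"),
        "Brexit": re.compile("[bB]rexit"),
    },
}


def build_report(kind, page_text, limit=140):
    """Count each pattern in one report group and format a tweet-sized line."""
    counts = {
        label: len(pattern.findall(page_text))
        for label, pattern in REPORTS[kind].items()
    }
    body = ", ".join(f"{label}: {n}" for label, n in counts.items())
    return f"{kind}: {body}"[:limit]


def hourly_report(page_text, rng=random):
    """Pick one of the four report types at random and build it."""
    kind = rng.choice(list(REPORTS))
    return build_report(kind, page_text)


# Invented text standing in for the scraped front page.
page = "She flaunts her curves and flaunts her legs as Brexit looms"
print(hourly_report(page))
```

Passing `rng` as a parameter means a seeded `random.Random` can be swapped in when the choice needs to be reproducible.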