Basic Social Research Opportunities On Twitter

With a whole bundle of data available from a simple Twitter search, I've always thought it would be an awesome idea to make use of it for user research, specifically vernacular and terminology research.

  • How do people arrange what they say?
  • What words do they use together?
  • How do they ask for help?

I previously had ambitions of building some kind of machine learning system to extrapolate all kinds of awesome metrics from that data. That project is semi-on-the-shelf for the moment, but that doesn't mean I can't still use the search data in a higher-level way.

Using Tweets to Get an Idea of the Language People Use for a Given Topic

Take a particular blog post that was written some time ago but does not perform as well as you feel it could. Head to Twitter advanced search and enter a few key terms from the post to bring up tweets somehow related to your topic.

Read through the results, note some down into a list, refine the search, note down more. Be sure to get a lot. Try to make sure most of the tweets relate directly to your topic, but also include some loosely related items and a handful that are borderline.

Partial match data is still good at this point but do exclude any that are obviously entirely unrelated to your needs. In a machine learning environment unrelated items would be good test data but manually they'll just add clutter.

Spotting Connection Phrases and Linking Words

Once you have a nice big list of tweets linked to your topic, take another read through them. Pay attention to the connecting words and phrases people use to bind the topic and objects together. Those are the words you'll use in linking phrases for an article.

Sometimes it's harder to spot commonality within these linking phrases because the words don't have as much force as the specific key phrases we are searching for. That's why it's important to pay attention to them as much as you can: they are hard to discern from data gathered by searching only for key phrases.
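
If reading by eye gets tiring, a rough count of adjacent word pairs can help surface the linking phrases that keep coming back. This is only a minimal sketch: the sample tweets and the word_pairs helper are made up for illustration, and the word splitting is deliberately naive.

```python
import re
from collections import Counter

# Hypothetical sample tweets; in practice, paste in the list you collected by hand.
tweets = [
    "How do I get started with sourdough baking?",
    "Need help with sourdough starter, any tips?",
    "Getting started with baking bread at home",
]


def word_pairs(text):
    """Return adjacent lowercase word pairs (bigrams) from one tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


# Count every adjacent word pair across all of the tweets
pair_counts = Counter()
for tweet in tweets:
    pair_counts.update(word_pairs(tweet))

# Print the five most frequent pairs and how often each appears
for pair, count in pair_counts.most_common(5):
    print(" ".join(pair), count)
```

Pairs that show up across many tweets but contain none of your search terms are good candidates for the linking words described above.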

Find the Questions People Have About the Subject

The first thing to do is to find the questions people are asking about the subject matter. Are many people familiar with it? Do people have similar complaints? Do you see the same question being asked again and again?

Finding questions can be done in multiple ways. Checking for shares to sites you know people ask questions on is a good way. Another is searching for words that can indicate questions (‘Who’, ‘What’, ‘When’, ‘Where’, ‘Why’, ‘Will’, ‘How’ and ‘?’).
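
If you have already copied a batch of tweets into a list, the same question words can be used to filter it. The sketch below is a rough heuristic of my own choosing, not a reliable classifier: it treats a tweet as a question if it contains a question mark or starts with one of the words above. The looks_like_question name is made up for this example.

```python
import re

# The question words listed above
QUESTION_WORDS = {"who", "what", "when", "where", "why", "will", "how"}


def looks_like_question(tweet):
    """Rough heuristic: the tweet contains '?' or starts with a question word."""
    if "?" in tweet:
        return True
    words = re.findall(r"[a-z']+", tweet.lower())
    return bool(words) and words[0] in QUESTION_WORDS


# Hypothetical sample tweets
sample = [
    "How do I reset my router",
    "Anyone know a good plugin?",
    "Just published a new post",
]
questions = [tweet for tweet in sample if looks_like_question(tweet)]
print(questions)
```

Checking only the first word keeps statements like "this is how I did it" out of the results, at the cost of missing questions that start with a mention or a hashtag.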

Knowing what questions people ask is a good way to spot any sticking points at various levels of expertise in the subject.

A side benefit of searching for shares to question sites is that it may also lead you to a better description of the question. Sometimes even the answers to many of those questions are at the links.

Knowing both the questions people have and the answers to those questions can be a great place to start refining posts or any content ideas you may have.

Connect the Unrelated Objects to the Related Ones

Sometimes there can be affinities between topics that seem completely unrelated. In any given group, most of the people who like one thing might also like something else. I cannot think of any real-world examples that have been proven accurate, but I can give a made-up one.

Let's say in a group of 10 people there are 5 cat owners and 5 dog owners. 4 of the cat owners like smooth peanut butter. 2 of the dog owners like it too. You could say there is a strange affinity between owning a cat and preferring smooth peanut butter.

Another take on the above example might be that, since 6 out of a total of 10 pet owners prefer smooth, pet owners in general have an affinity with smooth peanut butter.
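
The two perspectives are easier to compare as numbers. Here is a small sketch using the made-up counts from the example. The "lift" figure (a group's rate divided by the overall rate) is just one simple way I'd pick to express an affinity; it isn't the only measure.

```python
# Made-up counts from the example: group -> (number who like smooth, group size)
groups = {"cat owners": (4, 5), "dog owners": (2, 5)}

total_likes = sum(likes for likes, _ in groups.values())
total_people = sum(size for _, size in groups.values())
overall_rate = total_likes / total_people  # 6 out of 10 pet owners

rates = {}
lifts = {}
for name, (likes, size) in groups.items():
    rates[name] = likes / size
    # A lift above 1 means the group prefers smooth more often than pet owners overall
    lifts[name] = rates[name] / overall_rate
    print(f"{name}: {rates[name]:.0%} prefer smooth, lift {lifts[name]:.2f}")

print(f"all pet owners: {overall_rate:.0%} prefer smooth")
```

Cat owners come out at 80% against 60% for pet owners overall, which is what makes the first reading (a cat-specific affinity) stand out from the second.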

That's only a single, made-up, scenario with 2 provided perspectives. There are so many unseen affinities within different groups of people and subject matters that being able to correctly identify the ones that fit your audience profile is a huge boost to how likely people are to identify with the content you create for them.

Also, if my above example is true, then it makes total sense to somehow include smooth peanut butter in all of your cat-related content. Keep that in mind for the future 😉


A Raspberry Pi Twitter Bot In Python

NOTE: This post was sitting unpublished for almost exactly 1 year. I went ahead and gave the bot database storage and implemented scheduled posting. You can find the tweetbot on GitHub, and I even have a working version that is daemonized.

I’ve wanted to build a Twitter bot for some time. Mostly just something to send the occasional tweet. That could easily be extended to something that would become a scheduled tweet bot and a database could even be added to store future tweets.

I also wanted to monitor for mentions and notify me of them. Watching for something to occur and then running an action could also be extended in many ways, especially if a live search stream were to be added to the mix.

The basics of what the bot does are relatively simple. It needs to be able to access various streams (my notifications, a search stream). It has to be able to parse them and invoke something based on a given result. It also needs to be capable of posting a tweet from my account.

Since I plan on using my Raspberry Pi for this, and Python is a popular language to use on it, I looked around for some reference points. There’s a very nice Python library that is capable of doing the heavy lifting of sending requests to the Twitter API for me. It’s called Tweepy and I found it through GitHub.

Using Tweepy I should be able to easily connect and post/get to the Twitter API. Let’s see how that goes.

You will need to create an app and get some access credentials from Twitter to make your API calls – especially since the plan is to make it actually post to accounts.
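
One way to keep those credentials out of the script itself is to read them from environment variables. This is a minimal sketch and not part of the bot as I first wrote it: the variable names and the load_credentials helper are made up for illustration, so use whatever names you export in your own shell.

```python
import os

# Hypothetical variable names, chosen only for this example
REQUIRED = [
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_KEY",
    "TWITTER_ACCESS_SECRET",
]


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or fail loudly if any are missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

The returned values can then be passed to Tweepy in place of hard-coded strings, which also means the script can be shared without leaking the keys.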

Installing Tweepy

First I need to install Tweepy. You can run pip install tweepy to do it. I did that on my laptop and it worked just fine. On my RPi, though, I will be cloning it from GitHub and installing manually. There are certain base-level dependencies of Tweepy, or of its dependencies, that are probably already installed on most systems. They were not available on my Pi, though, and the setup.py script doesn’t handle those. A quick Google of the problem told me to run pip install --upgrade pip to get them. That worked.

git clone https://github.com/tweepy/tweepy.git
cd tweepy
sudo python setup.py install

Since I also plan to eventually use a database to store things in, I also installed mysql-server, but that’s not absolutely necessary for right now.

sudo apt-get install mysql-server

Writing the Bot Script

After that I used code I found on dototot.com (the URL is in the script’s header comment) to make a bot that tweets out lines it reads from a text file. I called the script bot.py and the text file with the tweets tweets.txt.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# from: http://www.dototot.com/how-to-write-a-twitter-bot-with-python-and-tweepy/
import tweepy, time, sys

argfile = str(sys.argv[1])

# Enter the corresponding information from your Twitter application:
CONSUMER_KEY = '123456'  # keep the quotes, replace this with your consumer key
CONSUMER_SECRET = '123456'  # keep the quotes, replace this with your consumer secret key
ACCESS_KEY = '123456-abcdefg'  # keep the quotes, replace this with your access token
ACCESS_SECRET = '123456'  # keep the quotes, replace this with your access token secret
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)

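# Read the tweets, one per line, from the file given on the command line
with open(argfile, 'r') as tweet_file:
    tweets = tweet_file.readlines()

for line in tweets:
    line = line.strip()
    if not line:
        continue  # skip blank lines, which cannot be posted
    api.update_status(line)
    time.sleep(60)  # Tweet every 1 minute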

The script needs to be given a text file containing the tweets you want it to post, one per line. Make a .txt file in the same directory containing some tweets. Then call the script, passing it the .txt file. Assuming the script is called ‘bot.py’ and the tweets are in a file called ‘tweets.txt’, this is the command:

python bot.py tweets.txt

It’ll run for as long as it takes to post all the tweets from your file and it’ll wait 60 seconds between posting each one. When I ran it myself I got an InsecurePlatformWarning. It seems that’s down to the version of Python that I ran it with and the version of requests that it uses. To fix it I installed the requests[security] package as per this Stack Overflow answer.

As of now you should be totally up and running with a Twitter Bot that can post tweets for you. It’s not the most useful of things considering it’ll only post through a list from a text file at a fixed interval.

Next steps in this project will be to add database support and time scheduling into the system.
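
To give a feel for what that could look like, here is a rough sketch of a scheduled-tweet queue. It is an illustration under assumptions, not the code from my repository: it uses Python’s built-in sqlite3 module with an in-memory database rather than MySQL so that it runs anywhere, and the table layout and function names are made up for the example.

```python
import sqlite3
from datetime import datetime, timedelta


def make_db():
    """Create an in-memory database with a hypothetical tweets table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tweets ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, "
        "post_at TEXT NOT NULL, posted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def schedule(conn, text, post_at):
    """Queue a tweet to go out at the given datetime."""
    conn.execute(
        "INSERT INTO tweets (text, post_at) VALUES (?, ?)",
        (text, post_at.isoformat(timespec="seconds")),
    )


def take_due(conn, now):
    """Return the text of every unposted tweet due by `now` and mark it posted."""
    rows = conn.execute(
        "SELECT id, text FROM tweets "
        "WHERE posted = 0 AND post_at <= ? ORDER BY post_at",
        (now.isoformat(timespec="seconds"),),
    ).fetchall()
    for tweet_id, _ in rows:
        conn.execute("UPDATE tweets SET posted = 1 WHERE id = ?", (tweet_id,))
    return [text for _, text in rows]


conn = make_db()
now = datetime.now()
schedule(conn, "Already due", now - timedelta(minutes=5))
schedule(conn, "Not yet due", now + timedelta(hours=1))
print(take_due(conn, now))
```

The bot’s loop would then call take_due every minute or so and pass each returned line to api.update_status, instead of walking through a text file at a fixed interval.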