NOTE: This post sat unpublished for almost exactly one year. Since then I went ahead and gave the bot database storage and implemented scheduled posting. You can find the tweetbot on GitHub, and I even have a working version that is daemonized.
I’ve wanted to build a Twitter bot for some time, mostly just something to send the occasional tweet. That could easily be extended into a scheduled tweet bot, and a database could even be added to store future tweets.
I also wanted it to monitor for mentions and notify me of them. Watching for something to occur and then running an action could also be extended in many ways, especially if a live search stream were added to the mix.
The basics of what the bot does are relatively simple. It needs to be able to access various streams (my notifications, a search stream). It has to be able to parse them and invoke something based on a given result. It also needs to be capable of posting a tweet from my account.
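As a rough sketch of that watch, parse and act loop, here is a version with no Twitter access at all. The mentions are plain dictionaries standing in for whatever the API returns, and the names find_trigger, process_mentions and HANDLERS are made up for illustration.

```python
# Hypothetical sketch: mentions are plain dicts here, not real Tweepy objects.


def find_trigger(text, handlers):
    """Return the first handler keyword that appears in the text, or None."""
    lowered = text.lower()
    for keyword in handlers:
        if keyword in lowered:
            return keyword
    return None


def process_mentions(mentions, handlers):
    """Run the matching handler for each mention and collect the replies."""
    replies = []
    for mention in mentions:
        keyword = find_trigger(mention["text"], handlers)
        if keyword is not None:
            replies.append(handlers[keyword](mention))
    return replies


# Made-up handlers: each takes a mention and returns the text to tweet back.
HANDLERS = {
    "hello": lambda m: "@{} hello to you too".format(m["user"]),
    "time": lambda m: "@{} check your own clock".format(m["user"]),
}

if __name__ == "__main__":
    sample = [
        {"user": "alice", "text": "Hello bot!"},
        {"user": "bob", "text": "nothing to see here"},
    ]
    for reply in process_mentions(sample, HANDLERS):
        print(reply)
```

In a real bot the replies would be handed to whatever posts the tweet, and the sample list would be replaced by mentions fetched from the API.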
Since I plan on using my Raspberry Pi for this, and Python is a popular language to use on it, I looked around for some reference points. There’s a very nice Python library that is capable of doing the heavy lifting of sending requests to the Twitter API for me. It’s called Tweepy and I found it through GitHub.
Using Tweepy I should be able to easily connect and post/get to the Twitter API. Let’s see how that goes.
You will need to create an app and get some access credentials from Twitter to make your API calls – especially since the plan is to make it actually post to accounts.
First I need to install Tweepy. You can run pip install tweepy to do it – and I did on my laptop and that worked just fine. On my RPi though I will be cloning it from GitHub and installing manually. There are certain base-level dependencies of Tweepy, or of its dependencies, that are probably already installed on most systems. They were not available on my Pi though, and the setup.py script doesn’t handle those. A quick Google of the problem told me to run pip install --upgrade pip to get them. That worked.
git clone https://github.com/tweepy/tweepy.git
cd tweepy
sudo python setup.py install
Since I plan to eventually use a database to store things in, I also installed mysql-server, but that’s not absolutely necessary for right now.
sudo apt-get install mysql-server
Writing the Bot Script
After that I used the code I found on this site to make a bot that was able to tweet things out that it read from a text file. I called the script bot.py and the text file with the tweets tweets.txt.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# from: http://www.dototot.com/how-to-write-a-twitter-bot-with-python-and-tweepy/

import sys
import time

import tweepy

# First command-line argument: path to the text file of tweets
argfile = str(sys.argv[1])

# Enter the corresponding information from your Twitter application:
CONSUMER_KEY = '123456'  # keep the quotes, replace this with your consumer key
CONSUMER_SECRET = '123456'  # keep the quotes, replace this with your consumer secret key
ACCESS_KEY = '123456-abcdefg'  # keep the quotes, replace this with your access token
ACCESS_SECRET = '123456'  # keep the quotes, replace this with your access token secret

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)

# Read every line of the file, one tweet per line
filename = open(argfile, 'r')
f = filename.readlines()
filename.close()

for line in f:
    api.update_status(line)
    time.sleep(60)  # Tweet every 1 minute
The script needs to be given a text file containing the tweets you want it to post. Make a .txt file in the same directory containing some tweets. Then call the script passing the .txt file. Assuming the script is called ‘bot.py’ and the tweets are in a file called ‘tweets.txt’ this is the command.
python bot.py tweets.txt
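The script posts every line of the file exactly as it finds it, so a blank line or an over-long one would be sent to the API as is. A small, hypothetical helper like the one below could tidy the file first. It assumes Python 3 and a 280-character limit; load_tweets and MAX_TWEET_LENGTH are names I made up and are not part of the original script or of Tweepy.

```python
MAX_TWEET_LENGTH = 280  # assumed limit; adjust if Twitter's rules differ


def load_tweets(path):
    """Read one tweet per line, dropping blank and over-long lines."""
    tweets = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            text = line.strip()
            if text and len(text) <= MAX_TWEET_LENGTH:
                tweets.append(text)
    return tweets
```

The loop in bot.py could then iterate over load_tweets(argfile) instead of the raw lines.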
It’ll run for as long as it takes to post all the tweets from your file, and it’ll wait 60 seconds between posting each one. When I ran it myself I got an InsecurePlatformWarning. It seems that’s down to the version of Python that I ran it with and the version of requests that it uses. To fix it I installed the requests[security] package as per this StackOverflow answer.
As of now you should be totally up and running with a Twitter bot that can post tweets for you. It’s not the most useful of things, considering it only works its way through a list from a text file at a fixed interval.
Next steps in this project will be to add database support and time scheduling into the system.
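To give an idea of what database-backed scheduling might look like, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table name, column names and function names are all assumptions of mine rather than anything from the finished tweetbot.

```python
import sqlite3
import time


def open_queue(path=":memory:"):
    """Create the (hypothetical) scheduled_tweets table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scheduled_tweets ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, "
        "send_at REAL NOT NULL, sent INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def schedule(conn, text, send_at):
    """Queue a tweet to go out at the given Unix timestamp."""
    conn.execute(
        "INSERT INTO scheduled_tweets (text, send_at) VALUES (?, ?)",
        (text, send_at),
    )
    conn.commit()


def send_due(conn, post, now=None):
    """Post every unsent tweet whose time has come; return how many went out."""
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT id, text FROM scheduled_tweets "
        "WHERE sent = 0 AND send_at <= ? ORDER BY send_at",
        (now,),
    ).fetchall()
    for tweet_id, text in rows:
        post(text)  # any callable that sends the tweet, e.g. api.update_status
        conn.execute(
            "UPDATE scheduled_tweets SET sent = 1 WHERE id = ?", (tweet_id,)
        )
    conn.commit()
    return len(rows)
```

Calling send_due(conn, api.update_status) once a minute, from a loop or from cron, would replace the fixed 60-second sleep with per-tweet send times.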
It’s pretty obvious to anyone who knows me that computers fascinate me. The hardware, the software, their uses. Everything about them intrigues me.
What tells packets where to go once they are out on the open web? How does a computer generate a random number? What allows memory to hold a persistent electrical signal? I encourage you to find out the answers to each of those in your spare time – everything about it is fascinating.
One of the particular things that I am interested in is Artificial Intelligence. It just so happens that one of my favorite YouTube channels, Computerphile, has several recent videos that are extremely informative on AI. They have also covered Machine Learning and Search Engines in videos from recent months. All worth watching. The topics are somewhat related to each other and yet each is distinctly different.
Watching them got me thinking about Structured Data and how exactly the structure is given or defined. At small scale you can take a dataset, find common attributes and organize it by those criteria.
You manually set the criteria and the number of categories, then sort the items into each pile. It’s easy.
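For a small, already-labelled dataset, that manual sorting might look like the sketch below. The records and the group_by helper are invented for illustration.

```python
from collections import defaultdict


def group_by(records, key):
    """Sort records into piles according to one chosen attribute."""
    piles = defaultdict(list)
    for record in records:
        piles[record[key]].append(record)
    return dict(piles)


# Made-up sample data with an obvious shared attribute ("kind").
ITEMS = [
    {"name": "apple", "kind": "fruit"},
    {"name": "carrot", "kind": "vegetable"},
    {"name": "pear", "kind": "fruit"},
]

if __name__ == "__main__":
    for kind, members in group_by(ITEMS, "kind").items():
        print(kind, [m["name"] for m in members])
```

This only works because every record already carries the attribute we chose to sort by.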
How exactly would that be done with data that has no labels or clear set of common attributes? Taking unorganized data and indexing it, assigning labels, working out attributes. Finding better and more efficient ways of doing that is part of the improvement process of Machine Learning.
That’s exactly what I’m going to investigate in a long-running project. Extremely efficient indexing and giving structure to random data is kind of how search engines work. There’s a strong correlation between the kind of thing I want to do and how search engines provide the most relevant results for given terms.
I’m going to grab my data from Twitter and store it, index it, categorize it and learn from it. The data from Twitter already has somewhat of a structure to start with but that exact structure might not be what I’m after. I want to structure it in many more ways.
I’m going to make use of what I learn in… maybe no ways at all but I’m gonna do it anyhow haha!
- Make a Twitter Bot with search capabilities.
- Store Tweets in a database.
- Index them.
- Categorize the data.
- Learn and Enjoy!
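To make the "index them" step a little more concrete, here is a minimal sketch of an inverted index over tweet text, which is one common way of indexing and not necessarily what the project will end up using. The tweets dictionary and the function names are assumptions for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words, hashtags and mentions."""
    return re.findall(r"[a-z0-9#@_]+", text.lower())


def build_index(tweets):
    """Map each token to the set of tweet ids that contain it."""
    index = defaultdict(set)
    for tweet_id, text in tweets.items():
        for token in tokenize(text):
            index[token].add(tweet_id)
    return index


def search(index, query):
    """Return the ids of tweets containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    tweets = {
        1: "Raspberry Pi bot is alive",
        2: "My bot loves #python",
        3: "Python on the Pi",
    }
    index = build_index(tweets)
    print(sorted(search(index, "pi bot")))
```

Note that this tokenizer keeps the leading # and @, so a hashtag and the plain word are treated as different tokens. Whether that is desirable depends on how the data is meant to be categorized.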
I hope that I’ll learn an awful lot from doing this. Probably not directly from the data I gather but definitely in terms of skills. Plus everyone needs a project to keep them focused. Some of the elements of this have been on my project list for a long time, and now is as good a time as any to make some headway.