Adam O'Grady

Tweet Process Pipeline

Preamble

A long time ago I built a Twitter bot which allowed you to play a text adventure game by tweeting commands at it and receiving personalised responses. I’ve always been interested in doing more with this project, but there have been two major holdups:

  1. I’ve lacked personal writing inspiration (and found no interested writers) to help with content for the game.
  2. The original code is an absolute mess.

Eventually, during a skint section of my life, I had to get rid of the virtual machine that ran the bot, and it has remained dormant for years. However, I recently realised I have enough money and equipment to get it back up and running, and enough experience to build a better development setup this time.

Unfortunately, that’s where my troubles began. Twitter no longer maintains its old User Streams API, and the new options are either to poll the Mentions Timeline endpoint frequently and dedupe the tweets, or to set up a public web server for the Account Activity API and potentially pay for a Premium/Enterprise subscription.
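If the polling route wins out, the deduplication step can stay small. The following is a minimal sketch, assuming each tweet carries an `id_str` field as in the v1.1 Tweet object; `createDeduper` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: remembers tweet IDs already seen across polls.
function createDeduper() {
  const seen = new Set();
  return function onlyNew(tweets) {
    return tweets.filter((tweet) => {
      if (seen.has(tweet.id_str)) return false;
      seen.add(tweet.id_str);
      return true;
    });
  };
}

// Two overlapping polls: the tweet with id '2' appears in both.
const onlyNew = createDeduper();
const firstPoll = onlyNew([{ id_str: '1' }, { id_str: '2' }]);
const secondPoll = onlyNew([{ id_str: '2' }, { id_str: '3' }]);
// firstPoll keeps both tweets; secondPoll keeps only the tweet with id '3'.
```

The set grows without bound in this form, so a long-running bot would probably want to cap it or persist the most recent ID instead.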

Processing Tweet Objects

Twitter’s APIs, both the Mentions Timeline and the Account Activity Subscription, provide tweets using what they term a “Tweet object”, a stable representation of a Tweet that includes data such as:

  • The unique ID of the tweet
  • The text of the Tweet
  • Whether it’s a reply and what it’s replying to
  • Geotagging
  • A full “User object” of who made the tweet
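To make the list above concrete, here is a trimmed sketch of what such an object might look like. The field names follow the v1.1 Tweet object as I understand it, every value is made up for illustration, and `isReply` is a hypothetical helper.

```javascript
// Trimmed, made-up example of a v1.1-style Tweet object.
const exampleTweet = {
  id_str: '1000000000000000002',                    // unique ID of the tweet
  text: '@140charADV look around',                  // text of the Tweet
  in_reply_to_status_id_str: '1000000000000000001', // what it replies to, or null
  coordinates: null,                                // geotagging, when present
  user: {                                           // the "User object"
    id_str: '2000000000000000001',
    screen_name: 'example_player'
  }
};

// Hypothetical helper: a tweet counts as a reply when it points at another status.
function isReply(tweet) {
  return tweet.in_reply_to_status_id_str != null;
}
```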

Having a consistent schema for Tweets allows us to begin work on a framework for processing tweets, even if we don’t necessarily have the “listener” nailed down yet. In fact, if we build this well we should be able to use different Tweet listeners interchangeably, since all they need to do is call our pipeline when they receive a Tweet object.
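As a rough illustration of that interchangeability, here are two hypothetical listeners that differ in how they obtain tweets but hand them over in the same way. The names `createPollingListener` and `createWebhookListener` are invented for this sketch, and the `tweet_create_events` key is an assumption about the shape of the Account Activity webhook payload.

```javascript
// Hypothetical polling listener: `fetchMentions` is assumed to return an
// array of Tweet objects (for example from the Mentions Timeline).
function createPollingListener(fetchMentions) {
  return {
    poll(onTweet) {
      fetchMentions().forEach((tweet) => onTweet(tweet));
    }
  };
}

// Hypothetical webhook listener: pulls Tweet objects out of a payload.
function createWebhookListener() {
  return {
    handlePayload(payload, onTweet) {
      (payload.tweet_create_events || []).forEach((tweet) => onTweet(tweet));
    }
  };
}

// Stand-in for the pipeline: anything that accepts a single Tweet object.
const handled = [];
const onTweet = (tweet) => handled.push(tweet.id_str);

createPollingListener(() => [{ id_str: '1' }]).poll(onTweet);
createWebhookListener().handlePayload({ tweet_create_events: [{ id_str: '2' }] }, onTweet);
```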

Architecture

We want to accept a Tweet object and potentially run multiple different functions on it. By making sure each function only performs one activity we can have modular building blocks that can be swapped in and out. I’m going to co-opt a term so often used in web servers and call our functions “middleware”.

The pipeline will be called with an array consisting of all the middleware functions you want to execute, in order. Each piece of middleware will take the tweet object and a callback. Upon finishing its execution, the middleware calls the callback with no arguments (if it was successful) or with an error argument (if something went wrong). The callback will either log the error or execute the next middleware in the pipeline.
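The contract described above can be sketched with two small middleware functions, one that always succeeds and one that may fail. Both names are hypothetical and exist only to show the two ways of calling the callback.

```javascript
// Succeeds: signals completion by calling next() with no arguments.
function tagReceivedAt(tweet, next) {
  tweet.receivedAt = Date.now();
  next();
}

// May fail: passes an error to next(), which should stop the pipeline.
function requireText(tweet, next) {
  if (typeof tweet.text !== 'string' || tweet.text.trim() === '') {
    return next(new Error('Tweet has no text'));
  }
  next();
}
```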

Pipeline

I initially got stuck on this for a while, but came up with the following implementation:

// We declare and export a function that accepts an array of middleware
module.exports = function TweetProcessPipeline (middleware_array) {
  // This returns a function that will execute a reversed copy of the middleware stack
  return function runPipeline(tweet) {
    // slice() copies the array so reverse() does not mutate the caller's list
    executeMiddleware(tweet, middleware_array.slice().reverse());
  }
}

function executeMiddleware(tweet, stack) {
  // We grab the last middleware off the stack
  const middleware = stack.pop();
  // If there's no middleware, we end - pipeline complete
  if (middleware === undefined) {
    return;
  }
  // Otherwise we execute that middleware function and pass it the "next" function
  middleware(tweet, next);

  function next(err) {
    // If next is called with an argument, it logs it and ends the pipeline
    if (err) {
      console.log(err);
      return;
    }

    // Otherwise it recurses, executing the next middleware on the stack
    return executeMiddleware(tweet, stack);
  }
}

This recursive form allows us to execute each middleware in turn, while still being able to exit early if we encounter errors or have achieved our goal in earlier middleware.
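To see both kinds of early exit in action, here is a small demonstration. It carries a compact inline copy of the pipeline so that it runs on its own; the four middleware functions are hypothetical and only record that they were called.

```javascript
// Compact inline copy of the pipeline so this snippet is self-contained.
function TweetProcessPipeline(middlewareArray) {
  return function runPipeline(tweet) {
    executeMiddleware(tweet, middlewareArray.slice().reverse());
  };
}

function executeMiddleware(tweet, stack) {
  const middleware = stack.pop();
  if (middleware === undefined) return;
  middleware(tweet, function next(err) {
    if (err) { console.log(err); return; }
    executeMiddleware(tweet, stack);
  });
}

const calls = [];
const first = (tweet, next) => { calls.push('first'); next(); };
// Goal achieved: never calls next, so the pipeline quietly stops here.
const handled = (tweet, next) => { calls.push('handled'); };
// Error: calls next with an argument, which is logged and ends the run.
const failing = (tweet, next) => { calls.push('failing'); next(new Error('boom')); };
const last = (tweet, next) => { calls.push('last'); next(); };

TweetProcessPipeline([first, handled, last])({ text: 'look' });
TweetProcessPipeline([failing, last])({ text: 'look' });
// calls now holds 'first', 'handled', 'failing'; `last` never runs.
```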

Example Middleware

Of course, it makes more sense if we have example middleware to run:

module.exports = function MinimalistLogTweet(tweet, next) {
  console.log({
    user: tweet.user.screen_name,
    tweet: tweet.text
  });
  next();
}

This function simply logs the text of the Tweet and the screen name of the person who tweeted it.

The following example checks to make sure the incoming Tweet was not sent by the authenticated bot, to prevent it responding to itself and getting into an infinite loop:

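// Call SelfHandleChecker with the bot's handle and pass the middleware it
// returns to the pipeline construction array.
const SelfHandleChecker = function SelfHandleChecker (handle) {
  // Keep the handle in a closure variable rather than on `this`, which is
  // not bound to anything useful when the function is called without `new`
  const ownHandle = handle.toLowerCase();
  return function (tweet, next) {
    if (tweet.user.screen_name.toLowerCase() === ownHandle) {
      // Not calling next() ends the pipeline for this tweet
      console.log('Self-tweet detected, dropping tweet');
    } else {
      next();
    }
  };
};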

Putting It All Together

How do we execute the whole thing?

tweet_listener.getLastMention().then(TweetProcessPipeline([
  SelfHandleChecker('140charADV'),
  MinimalistLogTweet,
  RespondToTweet
]));

This code gets the last Twitter mention, then executes our pipeline of middleware.
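`RespondToTweet` is not shown above. One possible shape for it, as a sketch only: the factory name `createRespondToTweet` and the `sendReply(status, inReplyToId, callback)` signature are assumptions standing in for whichever Twitter client ends up being used, and the echoed reply text is a placeholder for real game output.

```javascript
// Hypothetical factory: `sendReply` is an assumed wrapper around a Twitter
// client, taking a Node-style callback that receives an error on failure.
function createRespondToTweet(sendReply) {
  return function RespondToTweet(tweet, next) {
    // Placeholder reply text; the real bot would describe the game world.
    const status = '@' + tweet.user.screen_name + ' You said: ' + tweet.text;
    sendReply(status, tweet.id_str, function (err) {
      if (err) return next(err); // a failed send stops the pipeline
      next();                    // a successful send moves on
    });
  };
}

// A fake sender makes the middleware easy to try without network access.
const sent = [];
const fakeSendReply = (status, inReplyToId, callback) => {
  sent.push({ status, inReplyToId });
  callback(null);
};
const RespondToTweet = createRespondToTweet(fakeSendReply);
```

Injecting the sender keeps the middleware itself free of API credentials and lets it be swapped for a fake like the one above during development.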

Uses

We plan to use this as part of our bot framework, where we process each incoming tweet, get the player’s account, then process the command they tweeted and respond with the updated world view. However, games and chatbots are not the only use for this. Other potential stages could include:

  • Sentiment analysis (for accounts and tweets)
  • Media downloading
  • Collating geographic information
  • Multi-stage analytics pipelines
  • Conversational tree graphing
  • Network studies

Next Steps

This is only the most naive version of the pipeline. Up next I want to work on the following:

  • Promise-based responses on pipeline completion
  • Publish it as its own module
  • Better error handling
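As a rough idea of where the first and last items might lead, here is a sketch of a Promise-returning variant. `PromisePipeline` is a hypothetical name; the middleware signature is unchanged, but errors reject the returned Promise instead of being logged.

```javascript
// Hypothetical Promise-based variant of the pipeline.
function PromisePipeline(middlewareArray) {
  return function runPipeline(tweet) {
    return new Promise((resolve, reject) => {
      const stack = middlewareArray.slice().reverse();
      (function step() {
        const middleware = stack.pop();
        // Every middleware called next(): resolve with the processed tweet.
        if (middleware === undefined) return resolve(tweet);
        try {
          middleware(tweet, (err) => (err ? reject(err) : step()));
        } catch (thrown) {
          // Synchronous exceptions also surface as rejections.
          reject(thrown);
        }
      })();
    });
  };
}
```

One trade-off in this form: a middleware that deliberately never calls `next` leaves the Promise pending forever, so a real version would likely need an explicit way to signal "stop here, successfully".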

For the greater @140charADV project there’s still a lot more to do. First, I want to get Tweet listeners working properly, using either the Mentions Timeline or the Activity Subscription (or both). Then I want to start working out how to handle and store accounts. I’m thinking of doing it all in JSON with an abstracted storage API so I can swap between file storage on my dev machine and various database backends.
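A very early sketch of what that abstraction could look like, assuming a tiny `get`/`set` interface keyed by account ID. The interface and both factory names are hypothetical; a database backend would only need to expose the same two methods (most likely asynchronously, which this synchronous sketch glosses over).

```javascript
const fs = require('fs');
const path = require('path');

// In-memory store: handy for tests. Accounts are kept as JSON strings so
// callers cannot mutate stored state by accident.
function createMemoryStore() {
  const accounts = new Map();
  return {
    get: (id) => (accounts.has(id) ? JSON.parse(accounts.get(id)) : null),
    set: (id, account) => { accounts.set(id, JSON.stringify(account)); }
  };
}

// File store: one JSON file per account inside `directory`.
function createFileStore(directory) {
  fs.mkdirSync(directory, { recursive: true });
  const fileFor = (id) => path.join(directory, encodeURIComponent(id) + '.json');
  return {
    get: (id) => (fs.existsSync(fileFor(id))
      ? JSON.parse(fs.readFileSync(fileFor(id), 'utf8'))
      : null),
    set: (id, account) => { fs.writeFileSync(fileFor(id), JSON.stringify(account)); }
  };
}
```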

After all that is sorted I can really start getting into the game engine itself. A long way off, but I’m excited.