Last updated on November 4, 2022
What is Thread Reader App?
While Twitter is a microblogging platform where each post can only be up to 280 characters long, it’s common practice to link together multiple posts to form a longform work, called a “thread.”
However, it’s very difficult to copy & paste the thread elsewhere, so archiving the thread via Twitter itself is a frustrating and tedious process.
Thread Reader App is a tool you can use to archive Twitter threads. Any thread you unroll via Thread Reader will have its own Thread Reader permalink, plus the option to download the unrolled thread as a PDF.
However, the Thread Reader HTML page is bulky, and copying & pasting can still be a struggle because clicking on any tweet will open it in a new window. To strip down the page to just the content you want to back up with streamlined HTML, you will have to do some processing and reformatting.
There are existing tools for automating the process, but if you want a semi-automated way to reformat pages without having to install developer tools, you can use the transformations on this page.
Prerequisites for reformatting Thread Reader pages
To follow the instructions on this page, you will need:
- To know how to save a page as HTML via your browser of choice. (This feature is usually available under the File menu on desktop or the Share menu on mobile.)
- A basic understanding of HTML.
- A text editor that can use regular expressions in its Replace function, such as BBEdit.
This tutorial doesn’t include information on how to transform media in tweets (images, video, audio) because I haven’t explored those cases yet. You will have to reformat those manually for now, but I plan to add transformations for them in the future.
How to unroll a Twitter thread
Reply to any tweet in a thread with @ThreadReaderApp unroll to archive the thread. The bot will respond with a link when it’s done compiling the tweets.
To ensure you receive the reply, make sure you’re following @ThreadReaderApp. (You should be able to receive a response without following the bot, but this makes it less likely responses get hidden because of message filters.)
On the Thread Reader App website
- Visit the Thread Reader App website.
- Enter the URL of a tweet in the thread you want to unroll into the search box.
- Press the Find Unroll button. If there is an existing unroll, it will load immediately. Otherwise, the app will take a moment to compile the tweets into a page.
- If there is an existing unroll, but it is incomplete, click on the “Force Refresh” link at the bottom of the unroll to retrieve new tweets.
How to save an unrolled thread
If you want to save the thread as a PDF, you can use Thread Reader App’s built-in PDF generator.
Otherwise, use your browser’s built-in Save function and make sure you save the page as an HTML file. This feature is usually available via the File menu on desktop and the Share menu on mobile.
How to reformat Thread Reader HTML output
Which sections to extract
Open the Thread Reader App HTML file.
The line numbers in the following code snippets refer to the file as it was structured on November 4, 2022. Line numbers may shift if the page coding changes.
Take note of how many tweets are in the thread, available on line 176:
<div class="thread-info"><a href="https://twitter.com/USER/status/TWEETID" class="time" data-time="TIME" title="Read on Twitter" target="_blank">DATE</a> <span class="dot2">•</span> ## tweets <span class="dot2">•</span> # min read </div>
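If you would rather not read the count by eye, a short script can pull it out of that line. This is an optional, minimal Python sketch and not part of the original workflow: the count_tweets helper, the sample line, and the numbers 12 and 3 are made up for illustration, and it assumes the thread-info markup still matches the snippet above.

```python
import re
from typing import Optional

# Hypothetical sample of the thread-info line. USER, TWEETID, TIME, and DATE
# are placeholders from the tutorial; the counts 12 and 3 are invented.
thread_info = (
    '<div class="thread-info"><a href="https://twitter.com/USER/status/TWEETID" '
    'class="time" data-time="TIME" title="Read on Twitter" target="_blank">DATE</a> '
    '<span class="dot2">•</span> 12 tweets <span class="dot2">•</span> 3 min read </div>'
)


def count_tweets(line: str) -> Optional[int]:
    """Return the number that precedes the word "tweets", or None if absent."""
    match = re.search(r"(\d+)\s+tweets?\b", line)
    return int(match.group(1)) if match else None


total = count_tweets(thread_info)
print(total)
```

Pass only the thread-info line to the helper; run against a whole page, the pattern could match unrelated text that happens to contain a number followed by “tweets.”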
There will be a lot of header data and a lot of footer data. Your tweets themselves start on line 192 of the page, beginning with the opening <div> tag of the first tweet.
Cut or copy all the <div>s up to your last tweet, which will be <div id="tweet_##">, where ## is the total on line 176. Make sure you include the closing </div> tag for the last tweet in the thread.
Create a new file and paste the <div>s there for processing.
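The cut-and-paste step could also be scripted. The sketch below is one hedged way to do it in Python, assuming the first tweet’s <div> carries id="tweet_1" and that every tweet ends with the permalink <sup> followed by </div>, as in the closing-tag pattern used later on this page. The extract_tweets helper and the sample page are hypothetical.

```python
# Marker that ends each tweet block, taken from the closing-tag Find string
# used later in this tutorial. Assumes the page structure has not changed.
TWEET_END = '<sup class="tw-permalink"><i class="fas fa-link"></i></sup>\n</div>'


def extract_tweets(page: str) -> str:
    """Return everything from the first tweet's <div> to the last tweet's </div>."""
    start = page.find('<div id="tweet_1"')
    end = page.rfind(TWEET_END)
    if start == -1 or end == -1:
        raise ValueError("tweet markup not found; the page structure may have changed")
    return page[start:end + len(TWEET_END)]


# Hypothetical, heavily trimmed stand-in for a saved Thread Reader page.
page = (
    "<header>site chrome</header>\n"
    '<div id="tweet_1" class="content-tweet allow-preview">\n'
    "First tweet\n"
    '<sup class="tw-permalink"><i class="fas fa-link"></i></sup>\n'
    "</div>\n"
    '<div id="tweet_2" class="content-tweet allow-preview">\n'
    "Second tweet\n"
    '<sup class="tw-permalink"><i class="fas fa-link"></i></sup>\n'
    "</div>\n"
    "<footer>more chrome</footer>\n"
)

tweets = extract_tweets(page)
print(tweets)
```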
Open your file editor’s Replace window. Sometimes Replace is included in the Find window, or there is a separate Find & Replace tool.
Disable case sensitivity and searching for entire words.
Make sure regular expressions are enabled. In BBEdit, this means making sure the Grep box is checked.
With your extracted HTML section open, enter the “Find” strings below in the Find box, and the “Replace with” strings in the Replace box. You can use the built-in copy button in the code snippets.
Then, press the Replace All button, which replaces every match at once. The plain “Replace” button transforms only the first match found.
Restructure as paragraphs
This first transformation strips away extra HTML and styling that’s used on the Thread Reader App site, which you most likely won’t need on your own site.
The <div> tag will be transformed into the <p> tag, allowing browsers to enable Reader Mode on your thread.
This transformation will also replace the hidden data with a human-readable link to the individual tweet, formatted as (tweet number/total number of tweets in thread), which will come before the tweet.
<div id="tweet_([0-9]+)" class="content-tweet allow-preview" data-controller="thread" data-action="click->thread#showTweet" data-screenname="([a-zA-Z0-9_]+)" data-tweet="([0-9]+)" dir="auto">\n
Make sure you replace ## in the following “Replace with” string with the total number of tweets in your thread, found on line 176:
<p><a href="https://twitter.com/\2/status/\3" target="_new" class="tweet-permalink">(\1/##)</a>
This transformation opens permalinks in a new window with target="_new" and adds class="tweet-permalink" in case you would like to style your links with CSS. If you don’t need those functions, you can remove them before you run the replacement.
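For readers who prefer a script to a text editor, the same find-and-replace can be approximated with Python’s re module. This is a sketch under assumptions, not the method described on this page: the open_paragraphs helper, the example_user screen name, the tweet ID, and the total of 12 are hypothetical, and re.IGNORECASE stands in for the editor’s disabled case sensitivity.

```python
import re

# The Find string from this section, split across lines for readability.
FIND_OPEN = (
    r'<div id="tweet_([0-9]+)" class="content-tweet allow-preview" '
    r'data-controller="thread" data-action="click->thread#showTweet" '
    r'data-screenname="([a-zA-Z0-9_]+)" data-tweet="([0-9]+)" dir="auto">\n'
)


def open_paragraphs(html: str, total: int) -> str:
    """Swap each tweet's opening <div> for <p> plus a (n/total) permalink."""
    replacement = (
        r'<p><a href="https://twitter.com/\2/status/\3" target="_new" '
        r'class="tweet-permalink">(\1/' + str(total) + r")</a>"
    )
    return re.sub(FIND_OPEN, replacement, html, flags=re.IGNORECASE)


# Hypothetical tweet opening; the screen name and ID are invented.
sample = (
    '<div id="tweet_1" class="content-tweet allow-preview" '
    'data-controller="thread" data-action="click->thread#showTweet" '
    'data-screenname="example_user" data-tweet="1234567890" dir="auto">\n'
    "First tweet text"
)

result = open_paragraphs(sample, total=12)
print(result)
```

Passing the total as an argument avoids editing ## by hand for every thread.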
You will also need to run the following transformation to properly close the tags. Find:
\n<sup class="tw-permalink"><i class="fas fa-link"></i></sup>\n</div>
Replace with:
</p>
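The closing-tag step can be sketched in Python the same way. The sketch assumes the Replace box holds a bare </p>, which is what the quote retweet pattern further down expects to find at the end of each tweet. The close_paragraphs helper and the sample text are hypothetical.

```python
import re

# The Find string from this step.
FIND_CLOSE = r'\n<sup class="tw-permalink"><i class="fas fa-link"></i></sup>\n</div>'
# Assumption: each tweet is closed with a plain </p>.
REPLACE_CLOSE = "</p>"


def close_paragraphs(html: str) -> str:
    """Drop the permalink icon and turn each tweet's </div> into </p>."""
    return re.sub(FIND_CLOSE, REPLACE_CLOSE, html, flags=re.IGNORECASE)


# Hypothetical tail end of one tweet.
sample = (
    "First tweet text\n"
    '<sup class="tw-permalink"><i class="fas fa-link"></i></sup>\n'
    "</div>"
)

closed = close_paragraphs(sample)
print(closed)
```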
You can stop here and skip to the “What to do with the output” section below. If you want to clean up the file a little more, you can perform the next transformations.
Reformat quote retweets
This transformation strips away unnecessary <span> wrappers and adds <blockquote> to preserve attribution. It also prefixes quote retweet links with “QRT:” for clarity. (NB: I have chosen <b> to indicate that the boldface is for legibility and not for extra emphasis.)
The transformation adds class="tweet-qrt" in case you would like to style the blockquotes with CSS, and target="_new" to open the associated link in a new window. If you don’t need either function, you can delete it.
<span class="entity-embed"><span class="twitter-player"><blockquote class="twitter-tweet" data-conversation="none" data-align="center" data-dnt="true"><a href="(https://twitter.com/[a-zA-Z0-9_]+/status/[0-9]+)">(https://twitter.com/[a-zA-Z0-9_]+/status/[0-9]+)</a></blockquote></span></span></p>
</p>\n<blockquote cite="\1" class="tweet-qrt"><b>QRT:</b> <a href="\1" target="_new">\2</a></blockquote>
You can leave the link as-is, or transform it into an embed via your platform’s tools.
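As a hedged illustration, here is the quote retweet transformation run through Python’s re module. It assumes the Find pattern has no stray opening bracket before each https, and that the tweet was already closed with </p> in the previous step. The sample URL and commentary text are invented.

```python
import re

# Find string for embedded quote retweets, split across lines for readability.
FIND_QRT = (
    r'<span class="entity-embed"><span class="twitter-player">'
    r'<blockquote class="twitter-tweet" data-conversation="none" '
    r'data-align="center" data-dnt="true">'
    r'<a href="(https://twitter.com/[a-zA-Z0-9_]+/status/[0-9]+)">'
    r"(https://twitter.com/[a-zA-Z0-9_]+/status/[0-9]+)</a>"
    r"</blockquote></span></span></p>"
)
# Replace string: close the paragraph first, then emit the blockquote.
REPLACE_QRT = (
    r'</p>\n<blockquote cite="\1" class="tweet-qrt"><b>QRT:</b> '
    r'<a href="\1" target="_new">\2</a></blockquote>'
)

# Hypothetical quoted-tweet URL and surrounding text.
url = "https://twitter.com/example_user/status/1234567890"
sample = (
    "<p>Some commentary "
    '<span class="entity-embed"><span class="twitter-player">'
    '<blockquote class="twitter-tweet" data-conversation="none" '
    'data-align="center" data-dnt="true">'
    f'<a href="{url}">{url}</a>'
    "</blockquote></span></span></p>"
)

result = re.sub(FIND_QRT, REPLACE_QRT, sample, flags=re.IGNORECASE)
print(result)
```

The commentary stays in its own paragraph, and the quoted tweet moves into a blockquote on the following line.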
Streamline link classes
This transformation removes unnecessary attributes from external links and sets them to open in a new window with
target="_new". If you don’t want links to open in a new window, leave the Replace box completely blank (i.e., no spaces or other invisible characters in the box).
This transformation renames the link class on mentions (“@username”) to make it consistent with the rest of the document. If you don’t need to style links with CSS and want to remove this class, leave the Replace box completely blank (i.e., no spaces or other invisible characters in the box).
What to do with the output
Your tweets should now be reformatted as paragraphs. You can finish the transformation in a couple of ways.
Copy HTML to your site
Most content management systems will allow you to paste HTML directly into a text editor to post. Make sure you paste the HTML into the Code Editor, HTML, or Plain Text view of the editor to ensure that classes and other attributes don’t get stripped out.
In WordPress, the pasted content will become a “Classic” block that you can then convert into Gutenberg blocks if you’d like.
Add head and body to create a standalone HTML document
You can encapsulate your tweets in HTML document structure to create a complete, static page that can be uploaded to a server, or saved as a backup that can display in any browser.
Add the following to the very beginning of your document, before your first tweet:
<!DOCTYPE html>
<html lang="en">
<head>
<title>Document title</title>
</head>
<body>
The <html lang="en"> tag specifies that the page is in English. If your tweets are in another language, you can select the appropriate language code from this HTML ISO language code reference page.
Replace the text between <title></title> with the title of your page, which will display in the top bar of your browser.
Add the following to the end of the document, after your last tweet:
</body>
</html>
You may now save the file as filename.html and open it in any browser.
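If you process many threads, the wrapping step can be automated too. The following Python sketch builds the same document structure around a fragment of reformatted tweets. The wrap_document helper, the sample paragraph, and the title are hypothetical, and the closing tags are assumed to be the standard </body> and </html>.

```python
from html import escape


def wrap_document(fragment: str, title: str, lang: str = "en") -> str:
    """Wrap reformatted tweets in a minimal standalone HTML document."""
    head = (
        "<!DOCTYPE html>\n"
        f'<html lang="{escape(lang)}">\n'
        "<head>\n"
        f"<title>{escape(title)}</title>\n"  # escape & and < in the title
        "</head>\n"
        "<body>\n"
    )
    tail = "\n</body>\n</html>\n"
    return head + fragment.strip() + tail


# Hypothetical fragment and title.
page = wrap_document("<p>First tweet</p>", "My archived thread")
print(page)

# To save the result, something like the following should work:
# from pathlib import Path
# Path("filename.html").write_text(page, encoding="utf-8")
```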
You should now be able to process Thread Reader App pages into streamlined HTML in just a couple of minutes per thread. If Thread Reader App changes the structure of its pages and these regular expressions no longer work, please feel free to contact me for updates by reporting an issue.