Python for Telegram OSINT: Automate Telegram Threat Intel Monitoring
Learn how to build a Python Telegram scraper using Telethon for OSINT and threat intelligence monitoring. Step-by-step guide with practical examples.
I recently looked into how Telegram CEO Durov’s arrest affected illicit crypto activity and raised the question of whether targeting developers is really the right path to fighting crime. While some users have migrated to privacy-focused platforms like Signal (my favorite), and certain threat actors have set up mirror channels on emerging apps like SimpleX, the broader landscape isn’t likely to change dramatically. Telegram remains a critical tool for financial crime investigations, OSINT, and threat intelligence gathering. So today, we’ll explore how to automate threat intelligence monitoring on Telegram using Python.
I’ll be using a Kali Linux virtual machine I set up for this process, but feel free to use whatever environment works best for you. If you’re unsure which setup to choose or want to learn how to create your own virtual machine, I’ve got you covered. Check out the step-by-step guide I published back in February 2024 for all the details.
To set up our Telegram crawler, we’ll be using the Telethon Python library. Before diving into the code, it’s important to note that if you’re using any library to automate actions on Telegram (or any other app), make sure you’re familiar with the app’s terms and conditions. It’s crucial not to violate any rules, as doing so can result in your account being banned. I’ll provide a link to Telegram’s Terms of Service, but it’s up to you to review them and ensure your application stays within those guidelines.
With that out of the way, let’s dive in and get started with the code.
Step-by-Step Guide to Scraping Telegram Channels Using Python and Telethon
Step 1: Obtain API ID and Hash
- Create a Telegram App
Before interacting with Telegram’s API, you’ll need an API ID and API hash. These are unique credentials required to authenticate your application.
- Go to my.telegram.org and log in with the phone number linked to your Telegram account.
- After you enter your phone number, a confirmation code will be sent directly to your Telegram app. It will not be sent as a text message.
- Once logged in, navigate to “API development tools.”
- Create a new application by filling in the required details, such as the app name and short name, and selecting the platform. The URL and description are optional.
- This process will generate your API ID and API Hash. Make sure to set these as environment variables to keep them secure and out of your code. It’s important to keep these credentials private to protect your application.
📝 Note: I had a tough time getting my API ID and API Hash at first. I was using a VPN along with a few proxies, and I kept running into an error message that eventually led to a “too many tries, try again later” warning. After waiting a few hours, I tried again — this time, I disconnected my VPN but continued using a proxy network. That did the trick, and I was able to successfully set up my application.
Check out my article below to learn how to stay on the right side of the law while conducting OSINT and threat intelligence investigations.
Step 2: Set Up the Environment
Now that we have credentials for interacting with Telegram programmatically, let’s move on to setting up our environment. Thankfully, this process is quite simple and won’t take much time. For more details, check out Telethon’s GitHub page or the official documentation: Telethon Docs.
- Create Your App Folder
First, let’s create a folder to store the app. I’m naming mine “telegram_app.” Once the folder is set up, we’ll navigate into it and get started from there.
- Install Python and Telethon
Next, ensure you have Python installed on your system by checking its version. Then use pip to install Telethon, a pure Python 3 library for interacting with the Telegram API.
python3 --version
pip install telethon
Step 3: Write the Telegram Scraping Script
To write the code, I’ll be using my preferred code editor, VS Code, but you can use whichever you like best. First, I’ll create a new file called telegram_crawler.py.
touch telegram_crawler.py
1. Import the Necessary Packages
In your newly created Python file, start by importing the modules needed to initialize the Telegram client with your API credentials. Right away, we will import the three items outlined below:
from telethon import TelegramClient
from telethon.tl.functions.channels import JoinChannelRequest
import os
2. Store API Credentials Safely
Instead of hard-coding these values into your script, it’s better to store them as environment variables for security reasons. That is why we imported the os module.
app_api_id = os.getenv('TELEGRAM_APP_API_ID')
app_api_hash = os.getenv('TELEGRAM_APP_API_HASH')
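One thing os.getenv() will not do is warn you when a variable is unset: it quietly returns None, and the failure only shows up later when the client is created. The sketch below is one way to fail early with a readable message. The helper name load_credentials is my own, not part of Telethon, and it assumes the same two variable names used above.

```python
import os


def load_credentials(env=os.environ):
    """Return (api_id, api_hash) from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of Telethon.
    """
    api_id = env.get("TELEGRAM_APP_API_ID")
    api_hash = env.get("TELEGRAM_APP_API_HASH")
    missing = [
        name
        for name, value in (
            ("TELEGRAM_APP_API_ID", api_id),
            ("TELEGRAM_APP_API_HASH", api_hash),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # The API ID shown on my.telegram.org is numeric, so convert it here.
    return int(api_id), api_hash
```

With this in place, the two assignments above could be replaced by app_api_id, app_api_hash = load_credentials(), after exporting both variables in the shell that runs the script.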
3. Set Up Your Telegram Client
Now we can initialize our Telegram client using the API credentials we created on my.telegram.org and stored as environment variables.
client = TelegramClient('session', app_api_id, app_api_hash)
4. Create an Asynchronous Function
Telethon works asynchronously, meaning you’ll need to define your scraping functions using Python’s async def syntax.
async def main():
    # Simple example: send a message to yourself
    await client.send_message('me', 'We are good to go!')
5. Run the Script
Use the client.loop.run_until_complete() method to run your asynchronous function.
with client:
    client.loop.run_until_complete(main())
If everything is set up correctly and the environment variables and API credentials are in order, this script will send a message to your own Telegram account.
My test run was successful, but as you can see in the image above, I was prompted to enter my phone number and the code sent directly to my Telegram app. Once I entered both, the script worked perfectly, and I received the “We are good to go!” test message.
Step 4: Join and Scrape Messages from Channels
Now that our client is initialized and the test message has been successfully received, we’re ready to take the next step. The focus now shifts to identifying the Telegram channels we want to monitor and scrape for useful data. There are several Telegram search engines available for finding channels of interest, and I’ll provide links to a few at the end. For this exercise, however, I’ll be using tlgrm.eu.
We’ll be automating the monitoring of Telegram CEO Durov’s channel, specifically flagging any messages where he mentions “privacy” or “law enforcement,” as these could be relevant to ongoing investigations. Of course, you’re welcome to choose different channel(s) or focus on ones you already have in mind, depending on your threat intel gathering needs. The flexibility of this setup allows you to tailor it to your specific investigation requirements.
Joining a Telegram Channel
So let’s join our channel by first defining an asynchronous function called join_channel. Its purpose is to programmatically join a specified Telegram channel using the JoinChannelRequest request from the Telethon library. The client is the initialized Telegram client, and channel_link is the URL of the channel you want to join.
We use await to send a request to Telegram’s API to join the channel. If successful, the script prints a confirmation message with the channel link in cyan text using colorama’s Fore.CYAN. If the attempt fails, the error is caught and displayed.
In the main() function, we specify the channel (Durov's, in this case) and call the join_channel() function, ensuring the client attempts to join the channel before proceeding to other tasks.
async def join_channel(client, channel_link):
    try:
        await client(JoinChannelRequest(channel_link))
        print(f"{Fore.CYAN}Joined channel: {channel_link}")
    except Exception as e:
        print(f"Failed to join channel: {e}")

async def main():
    # Define the Telegram channel link to monitor (in this case, Durov's channel)
    channel_link = 'https://t.me/Durov'
    await join_channel(client, channel_link)
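Catching a bare Exception keeps the script alive, but it hides why a join failed. One case worth treating separately is rate limiting: Telethon raises FloodWaitError, which carries a seconds attribute telling you how long to wait. Here is a minimal sketch of a retry wrapper. The name call_with_flood_wait and the limits are my own assumptions, and it detects the error by looking for a seconds attribute so the example runs without Telethon installed. In a real script you would likely catch telethon.errors.FloodWaitError explicitly instead.

```python
import asyncio


async def call_with_flood_wait(make_request, max_attempts=3, max_wait=300):
    """Await make_request(), sleeping and retrying when the error has `.seconds`.

    Hypothetical helper for illustration; not part of Telethon.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_request()
        except Exception as exc:
            wait = getattr(exc, "seconds", None)
            # Re-raise errors that are not flood waits, waits longer than
            # max_wait, and the failure of the final attempt.
            if wait is None or wait > max_wait or attempt == max_attempts:
                raise
            print(f"Rate limited; sleeping {wait}s before attempt {attempt + 1}")
            await asyncio.sleep(wait)
```

Inside join_channel, the request line could then become await call_with_flood_wait(lambda: client(JoinChannelRequest(channel_link))). Passing a lambda matters here because each retry needs a fresh request rather than an already awaited one.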
Retrieving and Displaying Messages from a Telegram Channel
Next, let’s add the functionality to retrieve and display messages from the Telegram channel we previously joined.
In the get_messages() function, we asynchronously iterate through messages in the specified Telegram channel using iter_messages(). We set a limit of 5 messages to retrieve. For each message that contains text, the script prints the text in blue, followed by a white divider for clarity.
The message color is handled by Fore.BLUE to distinguish the message text, and a line of hyphens separates each message for readability.
In the main() function, after joining the channel with join_channel(), we can follow up by calling get_messages() to fetch and display the recent messages from that channel.
async def join_channel(client, channel_link):
    try:
        await client(JoinChannelRequest(channel_link))
        print(f"{Fore.CYAN}Joined channel: {channel_link}")
    except Exception as e:
        print(f"Failed to join channel: {e}")

async def get_messages(client, channel, limit=5):
    async for message in client.iter_messages(channel, limit):
        if message.text:
            # Print the message text in blue, followed by a white divider
            print(Fore.BLUE + message.text)
            print(Fore.WHITE + '------------------------------------------------------------------------------------')

async def main():
    # Define the Telegram channel link to monitor (in this case, Durov's channel)
    channel_link = 'https://t.me/Durov'
    await join_channel(client, channel_link)
    await get_messages(client, channel_link)
If you want more than the raw text, you can call the .stringify() method on the message instead of reading message.text. Rather than printing just the text, this method returns a structured, readable dump of the message object, including metadata (message ID, date, sender, etc.) along with the actual message content.
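For later analysis, it can be handier to keep only a few of those metadata fields in a plain dictionary and write one JSON object per message. The sketch below shows the idea. The helper name message_to_record and the choice of fields are my own, and the SimpleNamespace object is only a stand-in so the example runs without a Telegram connection. The attributes it reads (id, date, sender_id, text) are ones Telethon exposes on its message objects.

```python
import json
from datetime import datetime, timezone
from types import SimpleNamespace


def message_to_record(message):
    """Reduce a message object to a small, JSON-friendly dictionary.

    Hypothetical helper for illustration; not part of Telethon.
    """
    return {
        "id": message.id,
        "date": message.date.isoformat() if message.date else None,
        "sender_id": message.sender_id,
        "text": message.text or "",
    }


# Stand-in for a Telethon message so the sketch runs offline.
sample = SimpleNamespace(
    id=42,
    date=datetime(2024, 9, 5, 12, 0, tzinfo=timezone.utc),
    sender_id=1001,
    text="Example post about privacy.",
)
line = json.dumps(message_to_record(sample), ensure_ascii=False)
print(line)
```

Inside get_messages(), you could call message_to_record(message) on each message and append the resulting JSON line to a file, which gives you a simple archive to search later.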
Finally, to make things a little easier to read and follow, we added the colorama library (installed with pip install colorama) to the script to introduce color-coded output in the terminal. For example, the channel join status is printed in cyan (Fore.CYAN), and the message content is displayed in blue (Fore.BLUE). This makes the output visually organized and easy to read, especially when dealing with multiple status updates and messages.
Step 5: Putting It All Together
That’s a wrap! Now you’ve got all the functionality combined into a single script. This version is just the beginning, but there’s plenty of opportunity to get creative and build on it further. Enjoy experimenting and expanding from here!
from telethon import TelegramClient
from telethon.tl.functions.channels import JoinChannelRequest
import os
from colorama import Fore  # add some color to the terminal output

# Read the API credentials from environment variables instead of hard-coding them
app_api_id = os.getenv('TELEGRAM_APP_API_ID')
app_api_hash = os.getenv('TELEGRAM_APP_API_HASH')

client = TelegramClient('session', app_api_id, app_api_hash)

async def join_channel(client, channel_link):
    try:
        await client(JoinChannelRequest(channel_link))
        print(f"{Fore.CYAN}Joined channel: {channel_link}")
    except Exception as e:
        print(f"Failed to join channel: {e}")

async def get_messages(client, channel, limit=5):
    async for message in client.iter_messages(channel, limit):
        if message.text:
            # Print the message text in blue, followed by a white divider
            print(Fore.BLUE + message.text)
            print(Fore.WHITE + '------------------------------------------------------------------------------------')

async def main():
    # Define the Telegram channel link to monitor (in this case, Durov's channel)
    channel_link = 'https://t.me/Durov'
    await join_channel(client, channel_link)
    await get_messages(client, channel_link)

with client:
    client.loop.run_until_complete(main())
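The combined script prints every message it fetches. To flag only the posts that mention “privacy” or “law enforcement”, as described in Step 4, one option is to filter inside the loop. The following is a sketch under a few assumptions: find_keywords and flag_messages are names I made up, matching is a plain case-insensitive substring check, and FakeClient exists only so the example runs offline. With a real session you would pass the TelegramClient instead.

```python
import asyncio
from types import SimpleNamespace

KEYWORDS = ("privacy", "law enforcement")


def find_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in text, ignoring case."""
    lowered = (text or "").lower()
    return [kw for kw in keywords if kw.lower() in lowered]


async def flag_messages(client, channel, keywords=KEYWORDS, limit=50):
    """Collect (message, matched_keywords) pairs from recent channel posts."""
    flagged = []
    async for message in client.iter_messages(channel, limit=limit):
        hits = find_keywords(message.text, keywords)
        if hits:
            flagged.append((message, hits))
    return flagged


class FakeClient:
    """Offline stand-in that mimics iter_messages, for demonstration only."""

    def __init__(self, texts):
        self._texts = texts

    async def iter_messages(self, channel, limit=None):
        for i, text in enumerate(self._texts[:limit], start=1):
            yield SimpleNamespace(id=i, text=text)


demo = FakeClient(["Privacy is a right.", "New sticker pack!", None])
results = asyncio.run(flag_messages(demo, "https://t.me/Durov"))
for message, hits in results:
    print(message.id, hits)
```

Substring matching is deliberately loose: “privacy” will also match “privacy-focused”. If that produces too much noise, a word-boundary regular expression is a reasonable next step.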
Step 6: Additional Features and Considerations
- Operational Security (OpSec):
When you’re scraping Telegram channels, maintaining strong OpSec is crucial. Make sure to use proxies if needed, stay away from scraping illegal or sensitive channels, and always comply with Telegram’s terms of service and the law.
- Handling Exceptions:
Don’t forget to add solid error handling to your script. You don’t want it to crash just because you can’t join a certain channel or access certain messages. A little extra effort here can save you a lot of headaches later.
- Event Triggers:
You can take your script to the next level by adding event triggers. For example, you could set it up to scrape messages containing specific keywords or to take further action when a particular event happens in a channel. This gives your monitoring setup more flexibility and intelligence.
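As a rough illustration of the event trigger idea, Telethon lets you register a handler with client.on() and the events.NewMessage event, so your code runs whenever a new post arrives instead of polling. The sketch below assumes a watchlist of lowercase terms. The names is_relevant and register_alerts are hypothetical, and the Telethon import sits inside the function only so the keyword check can be tried without the library installed.

```python
WATCHLIST = ("privacy", "law enforcement")


def is_relevant(text, watchlist=WATCHLIST):
    """True when any watchlist term appears in the text (case-insensitive)."""
    lowered = (text or "").lower()
    return any(term.lower() in lowered for term in watchlist)


def register_alerts(client, channel, watchlist=WATCHLIST):
    """Attach a handler that prints new channel posts matching the watchlist.

    Hypothetical helper for illustration; not part of Telethon.
    """
    # Imported here so is_relevant() can be exercised without Telethon installed.
    from telethon import events

    @client.on(events.NewMessage(chats=channel))
    async def on_new_post(event):
        if is_relevant(event.raw_text, watchlist):
            # Truncate long posts so the alert stays on one screen.
            print(f"[ALERT] message {event.message.id}: {event.raw_text[:200]}")

    return on_new_post
```

To use it, call register_alerts(client, channel_link) and then keep the script alive with client.run_until_disconnected() inside the with client: block, so the handler can fire as new messages arrive.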
Conclusion
By following these steps, you’ve now got yourself a working Telegram scraper built with Python and Telethon. This tool can be a game-changer, whether you’re gathering threat intelligence or just staying on top of important conversations in key channels. Don’t stop here — dive into the Telethon documentation and see what else you can do to make your scraper even more powerful. And if you’re still looking for channels to monitor, check out other search tools like Teleteg, Telegago, or IntelligenceX. There’s a lot out there to explore, and now you’ve got the tools to make it happen!
Explore Next
For a step-by-step guide on automating dark web monitoring with Python, check out the following article.
Discover how blockchain is transforming industries on the Blockchain Insights Hub. Follow me on Twitter for real-time updates on the intersection of blockchain and cybersecurity. Subscribe now to get my exclusive report on the top blockchain security threats of 2024. Dive deeper into my blockchain insights on Mirror.xyz.
Writing about cyber threat intelligence, OSINT, financial crime, and blockchain forensics. Follow me on Twitter for the latest insights.