
Python for Telegram OSINT: Automate Telegram Threat Intel Monitoring

 


Learn how to build a Python Telegram scraper using Telethon for OSINT and threat intelligence monitoring. Step-by-step guide with practical examples.

[Image: Telegram Crawler. Image created using DALL-E.]

I recently looked into how Telegram CEO Durov’s arrest affected illicit crypto activity and raised the question of whether targeting developers is really the right path to fighting crime. While some users have migrated to privacy-focused platforms like Signal (my favorite), and certain threat actors have set up mirror channels on emerging apps like SimpleX, the broader landscape isn’t likely to change dramatically. Telegram remains a critical tool for financial crime investigations, OSINT, and threat intelligence gathering. So today, we’ll explore how to automate threat intelligence monitoring on Telegram using Python.

I’ll be using a Kali Linux virtual machine I set up for this process, but feel free to use whatever environment works best for you. If you’re unsure which setup to choose or want to learn how to create your own virtual machine, I’ve got you covered. Check out the step-by-step guide I published back in February 2024 for all the details.

To set up our Telegram crawler, we’ll be using the Telethon Python library. Before diving into the code, it’s important to note that if you’re using any library to automate actions on Telegram (or any other app), make sure you’re familiar with the app’s terms and conditions. It’s crucial not to violate any rules, as doing so can result in your account being banned. I’ll provide a link to Telegram’s Terms of Service, but it’s up to you to review them and ensure your application stays within those guidelines.

[Image: Snippet from the Telethon documentation on application compliance with Telegram’s Terms of Service, including the methods for checking, accepting, or declining Terms of Service updates. Source: Telethon’s Documentation.]

With that out of the way, let’s dive in and get started with the code.

Step-by-Step Guide to Scraping Telegram Channels Using Python and Telethon

Step 1: Obtain API ID and Hash

  1. Create a Telegram App
    Before interacting with Telegram’s API, you’ll need an API ID and API hash. These are unique credentials required to authenticate your application.
  • Go to my.telegram.org and log in with the phone number linked to your Telegram account.
[Image: The my.telegram.org login page, asking for a phone number to manage apps or delete an account.]
  • After you enter your phone number, a confirmation code will be sent directly to your Telegram app. It will not be sent as a text message.
  • Once logged in, navigate to “API development tools.”
[Image: The Telegram Core account page with links to API development tools, account deletion, and logout.]
  • Create a new application by filling in the app name and short name and selecting the platform. The URL and description are optional.
[Image: Telegram’s “Create New Application” form with fields for app title, short name, URL, platform, and description.]
  • This process will generate your API ID and API hash. Set these as environment variables to keep them out of your code, and keep them private to protect your application.

📝 Note: I had a tough time getting my API ID and API Hash at first. I was using a VPN along with a few proxies, and I kept running into an error message that eventually led to a “too many tries, try again later” warning. After waiting a few hours, I tried again — this time, I disconnected my VPN but continued using a proxy network. That did the trick, and I was able to successfully set up my application.

Check out my article below to learn how to stay on the right side of the law while conducting OSINT and threat intelligence investigations.

Step 2: Set Up the Environment

Now that we have a way to interact with Telegram programmatically, let’s set up our environment. Thankfully, this process is simple and won’t take much time. For more details, see Telethon’s GitHub page or the official documentation: Telethon Docs.

  • Create Your App Folder
    First, let’s create a folder to store the app. I’m naming mine “telegram_app.” Once the folder is set up, we’ll navigate into it and get started from there.
[Image: Terminal showing the mkdir and cd commands used to create and enter the “telegram_app” directory.]
  • Install Python and Telethon
    Next, make sure Python is installed on your system. Then use the following commands to check the Python version and install Telethon, a pure Python 3 library for interacting with the Telegram API.

python3 --version
pip install telethon

[Image: Terminal session checking the Python version (3.10.12) and installing Telethon with pip install telethon.]

Step 3: Write the Telegram Scraping Script

To write the code, I’ll be using my preferred editor, VS Code, but you can use whichever you like best. First, I’ll create a new file called telegram_crawler.py.

touch telegram_crawler.py

[Image: Terminal command creating a Python file named “telegram_crawler.py” with touch.]
  1. Import the Necessary Packages
    In your newly created Python file, start by importing the modules needed to initialize the Telegram client with your API credentials. Right away, we will import the three names outlined below:

from telethon import TelegramClient
from telethon.tl.functions.channels import JoinChannelRequest
import os

  2. Store API Credentials Safely
    Instead of hard-coding these values into your script, it’s better to store them as environment variables for security reasons. That is why we imported the os module.

app_api_id = os.getenv('TELEGRAM_APP_API_ID')
app_api_hash = os.getenv('TELEGRAM_APP_API_HASH')
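
One thing to watch with os.getenv: it returns None when a variable is unset and a string otherwise, so a typo in a variable name only shows up later as a confusing client error. Below is a minimal sketch of a stricter loader. The helper name load_credentials is my own for illustration and is not part of Telethon; it assumes the two variable names used above.

```python
import os


def load_credentials(env=os.environ):
    """Read the Telegram API credentials from environment variables.

    Returns (api_id, api_hash) with api_id converted to int.
    Raises RuntimeError with a readable message if anything is missing.
    """
    raw_id = env.get("TELEGRAM_APP_API_ID")
    api_hash = env.get("TELEGRAM_APP_API_HASH")

    # Collect every missing name so the error reports all of them at once.
    missing = [
        name
        for name, value in (
            ("TELEGRAM_APP_API_ID", raw_id),
            ("TELEGRAM_APP_API_HASH", api_hash),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    try:
        api_id = int(raw_id)  # the API ID is numeric
    except ValueError:
        raise RuntimeError("TELEGRAM_APP_API_ID must be a number") from None

    return api_id, api_hash
```

With this in place, the two assignments could become app_api_id, app_api_hash = load_credentials(), and a missing variable fails immediately with a message that names it.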

  3. Set Up Your Telegram Client
    Now we can initialize our Telegram client using the API credentials we created on my.telegram.org and stored as environment variables.

client = TelegramClient('session', app_api_id, app_api_hash)

[Image: Python script in VS Code importing Telethon and creating a Telegram client from credentials stored as environment variables.]

  4. Create an Asynchronous Function
    Telethon works asynchronously, so you’ll need to define your scraping functions using Python’s async def syntax.

async def main():
    # Simple example: send a message to yourself
    await client.send_message('me', 'We are good to go!')

  5. Run the Script
    Use the client.loop.run_until_complete() method to run your asynchronous function.

with client:
    client.loop.run_until_complete(main())

[Image: Python script in VS Code that sets up a Telegram client with Telethon, defines an asynchronous function to send a message to yourself, and runs it using the client loop.]
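
If the async def and await pattern is new to you, it can help to try it without a network connection first. The sketch below uses only the standard library. FakeClient is a hypothetical stand-in I made up for illustration; it is not a Telethon class and only mimics the shape of send_message.

```python
import asyncio


class FakeClient:
    """Offline stand-in for a Telegram client (hypothetical, for illustration)."""

    def __init__(self):
        self.sent = []

    async def send_message(self, entity, text):
        await asyncio.sleep(0)  # hand control back to the event loop once
        self.sent.append((entity, text))


async def main(client):
    # 'await' pauses main() until the coroutine finishes.
    await client.send_message("me", "We are good to go!")


client = FakeClient()
asyncio.run(main(client))  # start an event loop and run main() to completion
print(client.sent)
```

The real script follows the same shape: main() is a coroutine, and something has to drive the event loop until it finishes.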

If everything is set up correctly and the environment variables and API credentials are in order, this script will send a message to your own Telegram account.

[Image: The received Telegram test message on the left and, on the right, the terminal prompting for login details, confirming the sign-in, and showing a reminder of Telegram’s Terms of Service.]

My test run was successful, but as you can see in the image above, I was prompted to enter my phone number and the code sent directly to my Telegram app. Once I entered both, the script worked perfectly, and I received the “We are good to go!” test message.

Step 4: Join and Scrape Messages from Channels

Now that our client is initialized and the test message has been successfully received, we’re ready to take the next step. The focus now shifts to identifying the Telegram channels we want to monitor and scrape for useful data. There are several Telegram search engines available for finding channels of interest, and I’ll provide links to a few at the end. For this exercise, however, I’ll be using tlgrm.eu.

[Image: The tlgrm.eu channel search page, with a search bar for names, descriptions, or keywords and category options for browsing.]

We’ll be automating the monitoring of Telegram CEO Durov’s channel, specifically flagging any messages where he mentions “privacy” or “law enforcement,” as these could be relevant to ongoing investigations. Of course, you’re welcome to choose different channel(s) or focus on ones you already have in mind, depending on your threat intel gathering needs. The flexibility of this setup allows you to tailor it to your specific investigation requirements.

[Image: Pavel Durov’s verified Telegram channel, “Du Rove’s Channel,” as a search result on tlgrm.eu, with options to view recent or popular posts.]
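
The keyword flagging described above (“privacy” or “law enforcement”) could be sketched as a small text filter that works on any message text, independent of the Telegram client. This is a minimal sketch under my own assumptions: the names KEYWORDS, build_pattern, and find_keywords are hypothetical, matching is case-insensitive, and only whole words or phrases count.

```python
import re

KEYWORDS = ("privacy", "law enforcement")


def build_pattern(keywords):
    """Compile one case-insensitive regex that matches any keyword or phrase."""
    # Inside a phrase, any run of whitespace (including a line break) is allowed.
    parts = [r"\s+".join(re.escape(word) for word in kw.split()) for kw in keywords]
    # \b keeps matches on word boundaries, so "privacy" won't match "myprivacyx".
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


def find_keywords(text, pattern):
    """Return the distinct keywords found in text, lowercased, in order of appearance."""
    found = []
    for match in pattern.finditer(text or ""):
        hit = " ".join(match.group(0).lower().split())  # normalize whitespace
        if hit not in found:
            found.append(hit)
    return found


pattern = build_pattern(KEYWORDS)
print(find_keywords("Privacy matters. Law\nenforcement requests rose.", pattern))
# prints ['privacy', 'law enforcement']
```

Once messages are being retrieved, each message’s text can be passed through find_keywords, and only messages with a non-empty result need a closer look.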

Joining a Telegram Channel

So let’s join our channel by first defining an asynchronous function called join_channel. Its purpose is to programmatically join a specified Telegram channel using the JoinChannelRequest request from the Telethon library. The client is the initialized Telegram client, and channel_link is the URL of the channel you want to join.

We use await to send the request to Telegram’s API. If it succeeds, the script prints a confirmation message with the channel link in cyan text using Fore.CYAN from the colorama library. If the attempt fails, the error is caught and displayed.

In the main() function, we specify the channel (Durov's, in this case) and call the join_channel() function, ensuring the client attempts to connect to the channel before proceeding to other tasks.

async def join_channel(client, channel_link):
    try:
        await client(JoinChannelRequest(channel_link))
        print(f"{Fore.CYAN}Joined channel: {channel_link}")
    except Exception as e:
        print(f"Failed to join channel: {e}")

async def main():
    # Define the Telegram channel link to monitor (in this case, Durov's channel)
    channel_link = 'https://t.me/Durov'
    await join_channel(client, channel_link)

[Image: The Telegram interface confirming the user has joined “Du Rove’s Channel,” alongside the Python script that joins it with JoinChannelRequest.]
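
Channel links copied from search engines come in several shapes (full URL, t.me/name, @name), so it can be handy to validate them before making a network call. Here is a minimal standard-library sketch. The helper channel_username is hypothetical and not part of Telethon, and it deliberately rejects invite links (t.me/+... or t.me/joinchat/...), since those are not public usernames.

```python
from urllib.parse import urlparse


def channel_username(link):
    """Extract a public channel username from a t.me link, @name, or bare name."""
    link = link.strip()
    if link.startswith("@"):
        return link[1:]
    if "/" not in link:
        return link  # already a bare username

    # urlparse only recognizes the host when a scheme is present.
    parsed = urlparse(link if "://" in link else "https://" + link)
    if parsed.netloc.lower() not in ("t.me", "telegram.me"):
        raise ValueError(f"Not a Telegram link: {link}")

    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) > 1 and parts[0] == "s":
        parts = parts[1:]  # t.me/s/<name> is the web preview of a public channel
    if not parts or parts[0].startswith("+") or parts[0] == "joinchat":
        raise ValueError(f"Not a public channel link: {link}")
    return parts[0]


print(channel_username("https://t.me/Durov"))  # prints Durov
```

Telethon can resolve these link formats itself; the point of the sketch is only to catch obviously malformed input early when processing a long list of channels.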

Retrieving and Displaying Messages from a Telegram Channel

Next, let’s add the functionality to retrieve and display messages from the Telegram channel we previously joined.

In the get_messages() function, we asynchronously iterate through messages in the specified Telegram channel using iter_messages(). We set a limit of 5 messages to retrieve. For each message, if it contains text, the script formats the message (in blue) and prints a divider (in white) for clarity.

The message formatting is handled by Fore.BLUE to distinguish the message text, and a line of hyphens provides separation between each message for readability.

In the main() function, after joining the channel with join_channel(), we could follow up by calling get_messages() to fetch and display the recent messages from that channel.

async def join_channel(client, channel_link):
    try:
        await client(JoinChannelRequest(channel_link))
        print(f"{Fore.CYAN}Joined channel: {channel_link}")
    except Exception as e:
        print(f"Failed to join channel: {e}")

async def get_messages(client, channel, limit=5):
    async for message in client.iter_messages(channel, limit=limit):
        if message.text:
            # Print the message text; use message.stringify() instead to
            # print the full message structure, including metadata
            print(Fore.BLUE + message.text)
            print(Fore.WHITE + '-' * 84)

async def main():
    # Define the Telegram channel link to monitor (in this case, Durov's channel)
    channel_link = 'https://t.me/Durov'
    await join_channel(client, channel_link)
    await get_messages(client, channel_link)

[Image: A Telegram message from “Du Rove’s Channel” next to the Python script that joins the channel and retrieves messages, with the terminal output below.]
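
Printing to the terminal is fine for a quick look, but for monitoring you will probably want to keep what you retrieve. A minimal sketch of turning messages into JSON Lines records follows. It assumes only the id, date, and text attributes that Telethon messages expose. The function names and the SimpleNamespace stand-ins are my own, so the example runs without a Telegram connection.

```python
import json
from datetime import datetime, timezone
from types import SimpleNamespace


def message_to_record(message, channel):
    """Pick the fields worth keeping from a message-like object."""
    return {
        "channel": channel,
        "id": message.id,
        "date": message.date.isoformat() if message.date else None,
        "text": message.text or "",
    }


def to_json_lines(messages, channel):
    """Serialize messages as JSON Lines: one JSON object per line."""
    return "\n".join(
        # ensure_ascii=False keeps non-English text readable in the output
        json.dumps(message_to_record(m, channel), ensure_ascii=False)
        for m in messages
    )


# Stand-in objects instead of real Telegram messages.
sample = [
    SimpleNamespace(id=1, date=datetime(2024, 9, 5, tzinfo=timezone.utc), text="Hello"),
    SimpleNamespace(id=2, date=None, text=None),
]
print(to_json_lines(sample, "Durov"))
```

Inside get_messages(), the same message_to_record() call could be applied to each message and the result appended to a file, which makes later searching and deduplication by message ID much easier.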

Swapping message.text for message.stringify() displays the message in a more structured format. Instead of printing just the raw text, this method provides detailed information about the message, such as metadata (message ID, date, sender, etc.) along with the actual message content.

[Image: A message from “Du Rove’s Channel” next to the script output after applying .stringify(), showing message details and metadata in a structured format.]

Finally, to make things a little easier to read and follow, we added the colorama library (installed with pip install colorama) to the script to introduce color-coded output in the terminal. For example, the channel join status is printed in cyan (Fore.CYAN), and the message content is displayed in blue (Fore.BLUE). This keeps the output visually organized, especially when dealing with multiple status updates and messages.

[Image: A Telegram message from “Du Rove’s Channel” on the left and, on the right, the script’s color-coded terminal output for status updates and message details.]

Step 5: Putting It All Together

That’s a wrap! Now you’ve got all the functionality combined into a single script. This version is just the beginning, but there’s plenty of opportunity to get creative and build on it further. Enjoy experimenting and expanding from here!

from telethon import TelegramClient
from telethon.tl.functions.channels import JoinChannelRequest
import os
from colorama import Fore, Back, Style  # add some color to the terminal output

# Read the API credentials from environment variables (see Step 3)
app_api_id = os.getenv('TELEGRAM_APP_API_ID')
app_api_hash = os.getenv('TELEGRAM_APP_API_HASH')

client = TelegramClient('session', app_api_id, app_api_hash)

async def join_channel(client, channel_link):
    try:
        await client(JoinChannelRequest(channel_link))
        print(f"{Fore.CYAN}Joined channel: {channel_link}")
    except Exception as e:
        print(f"Failed to join channel: {e}")

async def get_messages(client, channel, limit=5):
    async for message in client.iter_messages(channel, limit=limit):
        if message.text:
            # Print the message text; use message.stringify() instead to
            # print the full message structure, including metadata
            print(Fore.BLUE + message.text)
            print(Fore.WHITE + '-' * 84)

async def main():
    # Define the Telegram channel link to monitor (in this case, Durov's channel)
    channel_link = 'https://t.me/Durov'
    await join_channel(client, channel_link)
    await get_messages(client, channel_link)

# Earlier test version of main(): send a message to yourself
# async def main():
#     await client.send_message('me', 'We are good to go!')

with client:
    client.loop.run_until_complete(main())

Step 6: Additional Features and Considerations

  1. Operational Security (OpSec):
    When you’re scraping Telegram channels, maintaining strong OpSec is crucial. Make sure to use proxies if needed, stay away from scraping illegal or sensitive channels, and always comply with Telegram’s terms of service and the law.
  2. Handling Exceptions:
    Don’t forget to add solid error handling to your script. You don’t want it to crash just because you can’t join a certain channel or access certain messages. A little extra effort here can save you a lot of headaches later.
  3. Event Triggers:
    You can take your script to the next level by adding event triggers. For example, you could set it up to scrape messages containing specific keywords or to take further action when a particular event happens in a channel. This gives your monitoring setup more flexibility and intelligence.
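
For the error-handling point, one common pattern is to retry a failed call a few times with a growing delay instead of giving up on the first error. The sketch below is standard-library only. The helper with_retries is hypothetical, not a Telethon function. It also honors a seconds attribute on the exception if one is present, which is how Telethon’s FloodWaitError reports the wait Telegram asks for.

```python
import asyncio


async def with_retries(make_call, attempts=3, base_delay=1.0, sleep=asyncio.sleep):
    """Await make_call() up to `attempts` times, waiting longer after each failure.

    If the exception carries a `seconds` attribute, that wait is used instead
    of the exponential backoff delay.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except Exception as exc:  # narrow this to specific errors in real use
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            delay = getattr(exc, "seconds", None)
            if delay is None:
                delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay}s")
            await sleep(delay)
```

A call such as await with_retries(lambda: client(JoinChannelRequest(channel_link))) would then retry the raw join request. Note that join_channel() as written catches its own exceptions, so wrapping that function directly would never trigger a retry.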

Conclusion

By following these steps, you’ve now got yourself a working Telegram scraper built with Python and Telethon. This tool can be a game-changer, whether you’re gathering threat intelligence or just staying on top of important conversations in key channels. Don’t stop here: dive into the Telethon documentation and see what else you can do to make your scraper even more powerful. And if you’re still looking for channels to monitor, check out other search tools like Teleteg, Telegago, or IntelligenceX. There’s a lot out there to explore, and now you’ve got the tools to make it happen!

Explore Next

For a step-by-step guide on automating dark web monitoring with Python, check out the following article.


Ervin Zubic
OSINT Ambition

Writing about cyber threat intelligence, OSINT, financial crime, and blockchain forensics. Follow me on Twitter for the latest insights.

