COVID Information on Telegram Bot using Python, Git and Heroku

Telegram Bot for COVID Information

Okay, first things first. If you want to check out the bot, please download Telegram and then head to the bot directly.

There is a lot of information on the web about COVID-19, but I am primarily interested in two aspects:

  1. Statistics for each country, for example deaths, new cases, and recoveries.
  2. Categorized news, so that I can follow news on a specific topic, for example new cases in countries or progress on drugs and vaccines.

The Worldometer website provides amazing information on the first one. For the second, I found that NewsApi.org provides a great API (a maximum of 500 API calls per day) for free if your application is not used for commercial purposes.

Main Steps

The main steps to get such a bot running are the following:

  1. Set up a Telegram bot account. For details, go to this page and head directly to the section “How do I create a bot?”
  2. Create the Python application that regularly updates the Worldometer and NewsApi data.
  3. Push the code to Git and deploy it automatically. I used Heroku for deployment.

I won’t describe Step 1, as it is pretty straightforward and all the information on how to set up a bot is available on Telegram.org’s own website. Let us focus on Steps 2 and 3.

COVID application

The application needs to perform the following two tasks at regular intervals:

  • Update the statistics from Worldometer.
  • Update the news from NewsApi.

For Worldometer data, the process is pretty straightforward. I hit the website once every hour and update the information for all countries. Therefore, the numbers on the bot might lag slightly behind what the website shows, but I bet the difference will be small. I store the information as a Pandas dataframe.

The dataframe contains two rows for each country: yesterday’s statistics and current statistics. Whenever a user requests statistics by country name, the matching rows are read from the dataframe and fed into a template that produces a formatted message. For example, when I type Singapore on the bot, I get the following output. Pretty cool, huh!

Telegram bot showing country statistics

Serving this information using the bot api is pretty straightforward. Let us see what goes on inside.
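As a rough illustration of the lookup-and-template step, here is a minimal sketch. It is not the bot's actual code: plain Python dicts stand in for the Pandas dataframe so that the example has no dependencies, and the column names, the `format_country_stats` helper and all the numbers are placeholders invented for illustration.

```python
from string import Template

# Placeholder rows, two per country as described above: one snapshot for
# yesterday and one for today. The numbers are made up for illustration.
ROWS = [
    {"country": "Singapore", "day": "yesterday",
     "total_cases": 1000, "total_deaths": 10, "total_recovered": 400},
    {"country": "Singapore", "day": "today",
     "total_cases": 1100, "total_deaths": 11, "total_recovered": 450},
]

# Hypothetical message template; the real bot's wording is not shown in the post.
MESSAGE = Template(
    "$country\n"
    "Total cases: $total_cases ($cases_delta since yesterday)\n"
    "Total deaths: $total_deaths ($deaths_delta since yesterday)\n"
    "Total recovered: $total_recovered ($recovered_delta since yesterday)"
)


def format_country_stats(rows, country):
    """Return a formatted message for `country`, or None if it is unknown."""
    by_day = {r["day"]: r for r in rows
              if r["country"].lower() == country.lower()}
    if "today" not in by_day or "yesterday" not in by_day:
        return None
    today, yesterday = by_day["today"], by_day["yesterday"]
    fields = {"country": today["country"]}
    for key in ("total_cases", "total_deaths", "total_recovered"):
        fields[key] = today[key]
        # e.g. "total_cases" -> "cases_delta", rendered with an explicit sign
        delta_name = key.replace("total_", "") + "_delta"
        fields[delta_name] = "{:+d}".format(today[key] - yesterday[key])
    return MESSAGE.substitute(fields)


print(format_country_stats(ROWS, "singapore"))
```

Keeping both snapshots side by side is what makes the "since yesterday" differences cheap to compute at request time.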

Wait. Why does the function get_country_stats not directly use the country entered by the user? I realized the need for this when I started using the bot myself: I was making spelling mistakes. I typed sngapore (missing the i) and did not get any results. That seemed an easy problem to solve with a spelling corrector or fuzzy matching. I used SymSpell (an amazing toolkit for spelling correction) to find the closest match in the dictionary for whatever the user enters. After this, even when I mistakenly search for sngapore (Singapore) or indea (India), it still works. Magic!!
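To show the idea of closest-match lookup, here is a minimal sketch. Note that it does not use SymSpell: it swaps in `difflib.get_close_matches` from the Python standard library, which is slower on large dictionaries but needs no installation. The country list, the `closest_country` helper and the 0.6 cutoff are assumptions made for this example.

```python
import difflib

# Hypothetical list of country names known to the bot.
COUNTRIES = ["Singapore", "India", "Indonesia", "Italy", "Spain"]


def closest_country(user_input, countries=COUNTRIES, cutoff=0.6):
    """Return the known country closest to `user_input`, or None."""
    # Compare in lowercase, but return the properly capitalized name.
    lookup = {name.lower(): name for name in countries}
    matches = difflib.get_close_matches(
        user_input.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


print(closest_country("sngapore"))
print(closest_country("indea"))
```

Returning None for inputs that match nothing lets the bot reply with a "country not found" message instead of guessing.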

Now, let’s head to COVID-related news. I am very interested in what world leaders are saying about COVID and which vaccines are being developed. I realized I had to search with keywords on Google every time. Too much work!!

I wanted to find news and tag it automatically. Based on my previous experience, topic modelling, specifically Latent Dirichlet Allocation (LDA), seemed to be a great option: get the topic distributions of documents and group similar documents together. Then, when I look for documents on a specific topic, just show me the recent ones on that topic. Easy peasy!!

I took several news documents retrieved from NewsApi and ran LDA on the text (title + description [only the first few words]). A big advantage of LDA is that it is unsupervised, so I did not need to annotate anything. I used a setting of 5 topics. The 5 topics, with their most important words and the corresponding weights, are as follows:

(0, '0.026*"covid" + 0.026*"19" + 0.012*"positive" + 0.008*"tested" + 0.007*"test" + 0.006*"testing" + 0.006*"tests" + 0.006*"new" + 0.005*"cruise" + 0.005*"said"')

(1, '0.029*"cases" + 0.022*"new" + 0.020*"19" + 0.019*"covid" + 0.014*"news" + 0.009*"health" + 0.008*"positive" + 0.008*"death" + 0.008*"deaths" + 0.008*"state"')

(2, '0.023*"covid" + 0.023*"19" + 0.011*"pandemic" + 0.007*"people" + 0.006*"home" + 0.006*"help" + 0.006*"outbreak" + 0.006*"amid" + 0.004*"crisis" + 0.004*"fight"')

(3, '0.010*"pandemic" + 0.010*"businesses" + 0.008*"covid" + 0.008*"19" + 0.007*"amid" + 0.007*"business" + 0.007*"outbreak" + 0.006*"trump" + 0.006*"said" + 0.005*"relief"')

(4, '0.017*"covid" + 0.017*"19" + 0.014*"trump" + 0.011*"pandemic" + 0.009*"news" + 0.008*"outbreak" + 0.007*"spread" + 0.007*"world" + 0.006*"president" + 0.006*"health"')

Based on the highest-weighted words, I assigned a name to each topic:

  • COVID Progress + Research: Information on vaccine/medications, etc.
  • Events: Information on event cancellations, decisions, etc.
  • New Cases: New cases in countries.
  • COVID Economy & Related: Economic conditions, layoffs, etc.
  • COVID World news: World leaders talking COVID.

This is how the bot looks with the created labels.

The menu shown above has one button per topic. I also added a category, Popular 10 Covid News, that randomly selects 10 articles from the recently crawled news.
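A menu like this can be expressed as an inline keyboard in the Telegram Bot API. The sketch below only builds the `sendMessage` parameters with the standard library and does not send anything; the `callback_data` strings, the helper names and the `chat_id` are hypothetical, and the bot itself may well use a wrapper library instead.

```python
import json
import random

# Topic labels from the article; the callback_data values are hypothetical.
MENU = [
    ("COVID Progress + Research", "topic_research"),
    ("Events", "topic_events"),
    ("New Cases", "topic_new_cases"),
    ("COVID Economy & Related", "topic_economy"),
    ("COVID World news", "topic_world"),
    ("Popular 10 Covid News", "popular"),
]


def build_menu_markup(menu=MENU):
    """Build an InlineKeyboardMarkup object with one button per row."""
    return {
        "inline_keyboard": [
            [{"text": label, "callback_data": data}] for label, data in menu
        ]
    }


def build_send_message_payload(chat_id, text="Choose a topic:"):
    """Build the parameters for a sendMessage call to the Bot API."""
    return {
        "chat_id": chat_id,
        "text": text,
        # reply_markup is passed as a JSON-serialized object.
        "reply_markup": json.dumps(build_menu_markup()),
    }


def pick_popular(articles, k=10):
    """Randomly pick up to k of the recently crawled articles."""
    return random.sample(articles, min(k, len(articles)))


payload = build_send_message_payload(chat_id=12345)
print(payload["reply_markup"])
```

When a user taps a button, Telegram sends the bot a callback query carrying that button's `callback_data`, which is how the bot knows which topic was chosen.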

Clicking on any of the topics triggers a callback function that fetches recent documents on that topic. Each new news article is assigned the topic with the maximum contribution based on the word distributions. I also set a threshold so that documents without enough evidence for any topic do not show up. Relevance is important!
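The assignment rule can be illustrated with a deliberately simplified sketch. This is not real LDA inference: it just sums the printed word weights of each topic over the words of an article, turns the sums into shares, and keeps the top topic only if its share clears a threshold. The two topic strings are shortened versions of topics 1 and 3 from the list above; the helper names and the 0.6 threshold are assumptions.

```python
import re

# Shortened versions of topics 1 and 3 from the list above.
TOPIC_STRINGS = {
    1: '0.029*"cases" + 0.022*"new" + 0.008*"deaths"',
    3: '0.010*"businesses" + 0.007*"business" + 0.005*"relief"',
}


def parse_topic(topic_string):
    """Parse a printed topic string into a {word: weight} dict."""
    pairs = re.findall(r'([0-9.]+)\*"([^"]+)"', topic_string)
    return {word: float(weight) for weight, word in pairs}


def assign_topic(text, topics, threshold=0.6):
    """Return the id of the dominant topic, or None if none is dominant."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    scores = {
        topic_id: sum(weights.get(word, 0.0) for word in words)
        for topic_id, weights in topics.items()
    }
    total = sum(scores.values())
    if total == 0:
        return None  # no topic word appears in the text at all
    best = max(scores, key=scores.get)
    # Drop articles whose best topic does not clearly dominate.
    return best if scores[best] / total >= threshold else None


TOPICS = {tid: parse_topic(s) for tid, s in TOPIC_STRINGS.items()}
print(assign_topic("New cases and deaths reported in the state", TOPICS))
print(assign_topic("Weather forecast for the weekend", TOPICS))
```

Articles for which the function returns None are the ones "without enough evidence" and would simply not be shown under any topic.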

When I select the button on COVID Research, I receive the following articles. Pretty relevant 🙂

Telegram Bot showing articles on COVID Research

Such category-specific information is very helpful for me: I can now choose which topics to read about.

In addition to category-specific information, I have enabled search on news articles too. For example, using /search lysol to get information on Lysol (blink blink), I get pretty relevant news (really????) to read.
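A search command like this can be sketched as two small steps: pull the query out of the message, then filter the stored articles. The version below is a naive substring match written for illustration, not the bot's actual search; the helper names and the placeholder articles are invented, and only the `title` and `description` fields mentioned earlier are assumed.

```python
def parse_search_command(message_text):
    """Return the query from a '/search <query>' message, or None."""
    parts = message_text.strip().split(maxsplit=1)
    if len(parts) != 2 or parts[0].lower() != "/search":
        return None
    return parts[1].strip().lower()


def search_articles(articles, query, limit=5):
    """Return up to `limit` articles whose title or description has `query`."""
    hits = []
    for article in articles:
        # Either field may be missing or None, so fall back to "".
        text = (article.get("title") or "") + " " + (article.get("description") or "")
        if query in text.lower():
            hits.append(article)
    return hits[:limit]


# Placeholder articles, invented for illustration.
ARTICLES = [
    {"title": "Lysol maker issues a statement", "description": "Placeholder text."},
    {"title": "Vaccine trial update", "description": None},
]

query = parse_search_command("/search lysol")
print([a["title"] for a in search_articles(ARTICLES, query)])
```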

Telegram Bot showing articles on Lysol 😉

To fetch new news articles repeatedly, I used an amazing Python library.
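The idea of running a fetch job at a fixed interval can be sketched with the standard library alone. This is not the library used for the bot, just a minimal stand-in for the same pattern; `run_every` and `fetch_news` are hypothetical names, and the tiny interval is only there so the example finishes quickly.

```python
import time


def run_every(interval_seconds, job, max_runs=None, sleep=time.sleep):
    """Call `job` repeatedly, pausing `interval_seconds` between calls.

    `max_runs` limits the number of calls (None means run forever).
    Returns the number of times `job` was called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as exc:  # keep the loop alive if one run fails
            print("job failed:", exc)
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


fetched = []


def fetch_news():
    """Hypothetical placeholder for the call that downloads new articles."""
    fetched.append(time.time())


# For an hourly refresh this would be run_every(3600, fetch_news).
run_every(0.01, fetch_news, max_runs=3)
print(len(fetched))
```

Catching exceptions inside the loop matters on a long-running worker: one failed request should not stop all future updates.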

Deploying on Heroku

The deployment part is pretty straightforward. The GitHub repo can be linked directly to Heroku for continuous deployment. First, create an application on Heroku and set up automatic deploys for it. Then, in the root folder of the application, create a Procfile with only the following content, where telebot.py is the main Python file to run.

worker: python app/telebot.py

Heroku offers several types of dynos with different benefits, starting from a Free Tier. If you need a minimal level of tracking and metrics, a Hobby Dyno ($7 per month) is good to have.

I hope the above tutorial gave a decent overview on creating a Telegram bot with a certain functionality (that I find really helpful).

I plan to release the code soon!


COVID Information on Telegram Bot using Python, Git and Heroku was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/covid-information-on-telegram-bot-using-python-git-and-heroku-f65383348d84?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/covid-information-on-telegram-bot-using-python-git-and-heroku

Explaining Blackbox Machine Learning Models: Practical Application of SHAP

Train a “blackbox” GBM model on a real dataset and make it explainable with SHAP.

Originally from KDnuggets https://ift.tt/35A1pVk

source https://365datascience.weebly.com/the-best-data-science-blog-2020/explaining-blackbox-machine-learning-models-practical-application-of-shap

Best Coronavirus Projections Predictions Dashboards and Data Resources

Check out this curated collection of coronavirus-related projections, dashboards, visualizations, and data that we have encountered on the internet.

Originally from KDnuggets https://ift.tt/2A2ez1H

source https://365datascience.weebly.com/the-best-data-science-blog-2020/best-coronavirus-projections-predictions-dashboards-and-data-resources

KDnuggets News 20:n18 May 6: Five Cool Python Libraries for Data Science; NLP Recipes: Best Practices

5 cool Python libraries for Data Science; NLP Recipes: Best Practices and Examples; Deep Learning: The Free eBook; Demystifying the AI Infrastructure Stack; and more.

Originally from KDnuggets https://ift.tt/2L1vcNr

source https://365datascience.weebly.com/the-best-data-science-blog-2020/kdnuggets-news-20n18-may-6-five-cool-python-libraries-for-data-science-nlp-recipes-best-practices

Statistical Thinking for Industrial Problem Solving a free online statistics course

This online course is available – for free – to anyone interested in building practical skills in using data to solve problems better.

Originally from KDnuggets https://ift.tt/3baSY42

source https://365datascience.weebly.com/the-best-data-science-blog-2020/statistical-thinking-for-industrial-problem-solving-a-free-online-statistics-course6345382

Coding your first ever LSTM Network

source https://365datascience.weebly.com/the-best-data-science-blog-2020/coding-your-first-ever-lstm-network

Machine Learning Training Data Annotation Types for AI in News & Media

AI is helping the media industry automate more tasks and operate more efficiently. Using computer vision or NLP/NLU, AI in news media makes it possible for machines to recognize objects and language.

Cogito provides training data sets for AI in media and news, for developing AI models based on visual perception or language-based machine learning models.

Annotation for Face Recognition in Media

The media industry can use face recognition systems to detect the faces captured in images or videos while reporting on or covering important topics around the world.

The landmark annotation technique is used to train AI to detect or recognize such faces. To train a face detection model, large training data sets containing accurately annotated images of many different people are created.

Annotation of Visual Search in News & Media

While reporting a particular news story or topic, visual search technology can also be used to detect the objects in the picture frame. Once an AI model is trained to detect such objects, it becomes easier for the machine to spot them through computer vision. To train visual search algorithms, accurately annotated data sets of varied objects are created.

Annotation for Brand Recognition in Media Industry

The media industry can use machine learning models to detect popular brands, along with their names, logos and tags, and spot them in recorded videos or randomly taken photos. Bounding box annotation or polygon annotation can be used to annotate such brand names, making it easier for machines to recognize them in a crowd and notify the media person.

Cogito is an expert in varied image annotation techniques, so it can annotate such objects with the desired level of accuracy.

Text Annotation Sentiment Analysis & Language Processing

Similarly, text annotation is the technique used to annotate the text in a document to make it comprehensible to machines. Keywords are annotated with added metadata, key notes and other tags so that an entire sentence or document can be understood by language-based machine learning models.

Cogito provides high-quality text annotation in multiple languages for language processing, voice recognition and other needs.

NLP Annotation for Fake News Detection in News & Media

Using NLP (natural language processing), news and media agencies can detect fake news reported by unreliable sources that mislead people. When annotated language data is used to train AI models, they learn to detect such fake news and report it to others.

Cogito provides NLP annotation for language-based machine learning models, helping the news and media industry detect fake news in a timely manner without human help.

NER Annotation for Entity Extraction

Named entity recognition, or NER, is used to recognize popular names, known as entities, and their relationships with each other. Recognizing such names in a document and extracting the entities correctly using a machine learning model can help the news and media industry operate more efficiently.

Cogito provides NER services with a high level of accuracy. It can create large volumes of annotated NER data sets for machine learning training at an affordable cost.

Cogito is an expert in all types of data annotation techniques, including image annotation that labels different types of images with a high level of precision, making the object of interest recognizable to machines through computer vision.

It provides high-quality training data sets for different types of AI models in fields like healthcare, agriculture, media, retail, automotive, robotics, autonomous flying and various other untapped fields, so that successful AI models can be developed with the right training data.


Machine Learning Training Data Annotation Types for AI in News & Media was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/machine-learning-training-data-annotation-types-for-ai-in-news-media-99effdc8aa7c?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/machine-learning-training-data-annotation-types-for-ai-in-news-media

Cybersecurity has to match up with the growing online users due to COVID-19

The number of people working remotely has increased enormously. There is no doubt that the ease of working remotely has greatly benefited companies. Despite the lockdowns and travel restrictions, organizations in the USA and around the world are able to manage their operations only because their employees can work from home. The already desperate economic situation could worsen if employees were not able to do so. Remote working is thus turning out to be a blessing for businesses, industries, and the economies of countries around the world, especially in the current situation. However, everything comes with its own set of pros and cons. One of the most evident cons of more online users and more remote working is the increased possibility of cyber threats. In this article, we will explore how the cybersecurity industry is preparing to avoid cyber-attacks.

Increasing online users

As most of us are locked inside our homes, the use of digital space has increased enormously. People across the globe, not only in the States, are using online sites, applications, and media far more than before. From the increased number of people using online food delivery and other services to the growing number of online medical supply orders, a lot has led to the increased use of online media. People who never used online services are using them now, so the volume is rising immensely. We also see more activity on social media channels than before: users are more connected and more active on social platforms. Even the digital entertainment industry has seen a huge upsurge in users. All this has put huge pressure on the digital space, so cybersecurity techniques have to be up to the mark to ensure thorough security of the data.

Ensuring cybersecurity at home

Firstly, every user has to ensure the security of their own systems, gadgets, and applications. If you are an online user too, you have to make sure that your computers, phones, and smart devices are secure. Use certified cybersecurity programs to keep your data safe; antivirus software helps safeguard your data from theft and attack. Also, make sure that the gadgets and devices you use are strong enough to protect your data and online activities from attacks.

Businesses and remote workers have to be extra careful

However, businesses and employees who are working remotely are at the highest risk of cyber threats. Organizations need to have the strongest technology and the best programs in place to make sure that their business data is absolutely safe. Working remotely is quite convenient, but the risk is high.

Steps to ensure the security of the business data and operations

System Patching

Proper patching has the potential to mitigate the risks. Cybersecurity teams can shorten the patch cycles for vital systems. VPNs (Virtual Private Networks) play a major role in enhancing the security of the systems, and software IT outsourcing teams should consider end-point protection through proper patching. All this will reduce vulnerabilities. Cybersecurity teams keep releasing new patches that are capable of safeguarding the remote infrastructure; they just need more adoption and attention.

Maintenance and Management are the keys

We all know that the best line of defense against the pandemic is hygiene, such as washing hands. The same is true for IT systems: it is all about the maintenance and management of devices, programs, applications, sites, etc. Proper maintenance of the network is of the utmost importance. Everything related to your network has to be secure, and you should be prepared to manage the systems regularly. Make sure you take the necessary steps to secure every component of the network as well as the network as a whole. Consistency is important too.

Authentication

Authentication and access have to be managed carefully. Make sure that every user on the team has a strong password, and consider using multi-factor authentication. Almost all business applications and programs have to be protected with solid passwords. Device-based authentication is another of the best options to go for. Cybersecurity teams are coming up with more and more powerful authentication solutions for users. After all, if access and authentication are taken care of, security can be managed appropriately.

When we talk about cybersecurity matching up with the growing number of online users due to COVID-19, we mean the steps that cybersecurity teams, IT teams, and users can take to mitigate and avoid the risks.


Cybersecurity has to match up with the growing online users due to COVID-19 was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/cybersecurity-has-to-match-up-with-the-growing-online-users-due-to-covid-19-3e9c5a159fe6?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/cybersecurity-has-to-match-up-with-the-growing-online-users-due-to-covid-19
