NLP 101 - Simple Classification :: UNISTRA NLP/LLM 2024 — Natural Language Processing and Large Language Models

Political Tweets - Simple Classifier: A Refresher in NLP

Welcome to this tutorial, a refresher on Natural Language Processing (NLP)! In this project, we perform simple text classification to predict the political leaning of tweets. Our goal is to build a model that classifies tweets as either Republican or Democratic, using a Logistic Regression pipeline trained on preprocessed data.

What to Expect

Installation and use of essential NLP libraries such as tweet-preprocessor, imbalanced-learn, and gradio.
Data manipulation and text preprocessing with Python modules like pandas, numpy, spacy, and scikit-learn.
Creation and evaluation of a machine learning model using LogisticRegression and TfidfVectorizer.
Handling imbalanced datasets using RandomUnderSampler from imbalanced-learn.
Model explanation with eli5 and creation of a user-friendly interface with Gradio.
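
To make the imbalanced-data step more concrete, here is a minimal sketch of the idea behind random undersampling, written in plain Python. It is not imbalanced-learn's RandomUnderSampler; the function name `random_undersample` and the toy tweets are hypothetical and invented for illustration only. Every class is cut down to the size of the smallest class by drawing examples at random.

```python
import random
from collections import Counter, defaultdict


def random_undersample(texts, labels, seed=0):
    """Balance classes by randomly dropping examples from the larger ones.

    Hypothetical helper for illustration; not part of imbalanced-learn.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_label.values())

    balanced = []
    for label, items in by_label.items():
        for text in rng.sample(items, target):
            balanced.append((text, label))

    # Shuffle so the classes are not grouped together in the output.
    rng.shuffle(balanced)
    new_texts = [text for text, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_texts, new_labels


# Toy data, invented for this example: 7 tweets of one class, 3 of the other.
texts = [f"tweet {i}" for i in range(10)]
labels = ["Democratic"] * 7 + ["Republican"] * 3

balanced_texts, balanced_labels = random_undersample(texts, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Undersampling discards data from the majority class, so it is simplest to justify when the dataset is large enough that the remaining examples still cover the vocabulary well.
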

By the end of this tutorial, you will have a solid understanding of how to build a machine learning pipeline for text classification using Python libraries and how to create an interactive web application. Let’s dive in!
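
As a preview, the kind of pipeline described above can be sketched in a few lines with scikit-learn. This is only a sketch under stated assumptions: the four example tweets and the query sentence are hypothetical, invented for illustration, and the settings (default `TfidfVectorizer`, `max_iter=1000`) are placeholder choices rather than the ones used later in the tutorial.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, invented for illustration only.
texts = [
    "lower taxes and a smaller government",
    "secure the border and cut regulations",
    "expand healthcare coverage for families",
    "invest in clean energy and public schools",
]
labels = ["Republican", "Republican", "Democratic", "Democratic"]

# TF-IDF turns each tweet into a weighted bag-of-words vector;
# logistic regression then learns one weight per term.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

# Classify an unseen sentence and inspect the class probabilities.
query = ["cut taxes and regulations"]
predictions = model.predict(query)
probabilities = model.predict_proba(query)
print(predictions[0], probabilities[0])
```

Wrapping both steps in one pipeline means the vectorizer is fitted only on the training texts, and the same fitted object can later be passed to a tool such as Gradio as a single prediction function.
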
