Document Image Classification with Document Image Transformer (DiT)

tl;dr A step-by-step tutorial to automatically classify documents based on images of their contents. For example, automatically identify scientific papers or handwritten notes. Practical Machine Learning - Learn Step-by-Step to Train a Model A great way to learn is by going step-by-step through the process of training and evaluating the model. Hit the Open in Colab button below to launch a Jupyter Notebook in the cloud with a step-by-step walkthrough....

December 19, 2022 · 5 min · Eugene
Video Subtitling with OpenAI Whisper

tl;dr A step-by-step tutorial to automatically generate subtitles from a video using audio segmentation and OpenAI Whisper.

December 16, 2022 · 8 min · Eugene
Face Super Resolution with Real ESRGAN

tl;dr A step-by-step tutorial to upscale images with faces in the foreground using Real ESRGAN.

December 2, 2022 · 4 min · Eugene
Hate Speech Detection on Dynabench

Hate Speech Detection with Transformers

tl;dr A step-by-step tutorial to train a model that detects hate speech in text. The trained model has a BERT-based transformer architecture.

January 6, 2021 · 7 min · Eugene
Biology NER with BioBERT to Extract Diseases and Chemicals

Biology Named Entity Recognition with BioBERT

tl;dr A step-by-step tutorial to train a BioBERT model for named entity recognition (NER), extracting diseases and chemicals on the BioCreative V CDR task corpus. Our model ranks #3 and is within 0.6 percentage points of the state-of-the-art.

December 30, 2020 · 6 min · Eugene
Named Entity Recognition with BERT in Mandarin

tl;dr A step-by-step tutorial to train a state-of-the-art model with BERT for named entity recognition (NER) in Mandarin, 中文命名实体识别. Our model beats the state-of-the-art by 0.7 percentage points.

December 24, 2020 · 6 min · Eugene
Named Entity Recognition on Weibo in Mandarin

tl;dr A step-by-step tutorial to train a state-of-the-art model with flair and BERT for named entity recognition (NER) in Mandarin, 中文命名实体识别, on a Weibo dataset. Our model beats the state-of-the-art by 20+ percentage points.

December 24, 2020 · 6 min · Eugene
Sentiment Analysis in Mandarin on Food Delivery Reviews

Sentiment Analysis in Mandarin with XLNet

tl;dr A step-by-step tutorial to train a state-of-the-art model for sentiment analysis on Mandarin food delivery reviews using the XLNet architecture. We will use Google Colab’s free Jupyter Notebook in the cloud.

December 23, 2020 · 6 min · Eugene
Named Entity Recognition (NER) Model Using FLAIR

Train a Named Entity Recognition (NER) Model Using FLAIR

tl;dr A step-by-step tutorial to train a state-of-the-art model for named entity recognition (NER), the task of identifying persons, organizations and locations from a piece of text.

December 23, 2020 · 5 min · Eugene
Sentiment Analysis on Movie Reviews

Sentiment Analysis on Movie Reviews with XLNet

tl;dr A step-by-step tutorial to train a sentiment analysis model to classify the polarity of IMDb movie reviews with XLNet using a free Jupyter Notebook in the cloud. The IMDb Movie Reviews Dataset and XLNet: The Internet Movie Database (IMDb) movie reviews dataset has been a well-established benchmark for sentiment analysis since 2011. It’s probably the first large-ish (50,000 train+test), balanced sentiment analysis dataset, making it a very nice dataset for benchmarking on....

December 22, 2020 · 7 min · Eugene