TL;DR: We feature five of the top machine learning code repositories on GitHub from Singapore. The top 5 is made up of
popular implementations of state-of-the-art Computer Vision (CV) and Natural Language Processing (NLP) models, plus
a high-frequency trading project. The ranking is based on each repository's total star (stargazer) count.
A computer vision repository with code for training and evaluating a YOLOv3 model for object detection.
YOLO (You Only Look Once) is a state-of-the-art, real-time object detection model.
Its claim to fame is that it is extremely fast and accurate, and that you can trade off speed against accuracy without retraining
by changing the model size. Multi-GPU training is also implemented.
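To make the speed/accuracy trade-off concrete, here is a minimal sketch that assumes "model size" refers to the network input resolution, as in the original YOLOv3 release. It is not code from the repository, and the function names are hypothetical. YOLOv3 predicts on three grids with strides 32, 16 and 8 and uses three anchor boxes per grid cell, so a larger input produces more candidate boxes (typically more accurate, but slower).

```python
# Illustrative only: how YOLOv3's output size scales with input resolution.
# Assumption: "model size" means the square input resolution of the network.

STRIDES = (32, 16, 8)      # YOLOv3 detects at three scales
ANCHORS_PER_SCALE = 3      # three anchor boxes per grid cell


def yolo3_grid_sizes(input_size):
    """Return the side length of each detection grid for a square input."""
    if input_size % 32 != 0:
        raise ValueError("input size must be a multiple of 32")
    return [input_size // stride for stride in STRIDES]


def yolo3_num_predictions(input_size):
    """Return the total number of candidate boxes predicted per image."""
    return sum(ANCHORS_PER_SCALE * g * g for g in yolo3_grid_sizes(input_size))


if __name__ == "__main__":
    for size in (320, 416, 608):
        print(size, yolo3_grid_sizes(size), yolo3_num_predictions(size))
```

The same trained weights can be evaluated at any of these resolutions because the network is fully convolutional; only the amount of computation per image changes.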
3. Chinese Named Entity Recognition and Relation Extraction
An NLP repository that includes state-of-the-art deep learning methods for several tasks in Chinese (Mandarin, 中文):
named entity recognition (NER/实体识别), relation extraction (RE/关系提取) and word segmentation.
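As a rough illustration of what an NER model produces for Chinese text, the sketch below turns per-character tags into entity spans. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity); the repository's actual tag format is not stated here, and the function name and example sentence are hypothetical.

```python
# Illustrative only: decode character-level BIO tags into entity spans.
# Assumption: tags follow the BIO scheme, e.g. "B-PER", "I-PER", "O".


def bio_to_entities(chars, tags):
    """Return a list of (text, label, start, end) tuples, end exclusive."""
    entities = []
    start, label = None, None
    # A trailing "O" sentinel flushes an entity that ends at the last character.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag.startswith("I-") and label == tag[2:]:
            continue  # still inside the current entity
        if label is not None:
            entities.append(("".join(chars[start:i]), label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # A stray "I-" tag without a matching "B-" is ignored in this sketch.
    return entities


if __name__ == "__main__":
    sentence = list("李明住在新加坡")  # "Li Ming lives in Singapore"
    tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
    print(bio_to_entities(sentence, tags))
```

Relation extraction would then operate on pairs of these spans, for example classifying the relation between the person and the location.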
2. High-frequency Trading Model using the Interactive Brokers API
A high-frequency trading model in Python that uses the Interactive Brokers API and combines pairs trading with mean reversion.
It was last updated with v3.0 in June 2019.
The author describes the model as using statistical arbitrage and incorporating these methodologies:
Bootstrapping the model with historical data to derive usable strategy parameters
Resampling inhomogeneous time series to homogeneous time series
Selection of highly correlated tradable pairs
The ability to short one instrument and long the other
Using the volatility ratio to detect an up or down trend
Fair valuation of a security using beta, or the mean over some past interval
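Several of the steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the forward-fill resampling rule, and the z-score entry threshold are all hypothetical choices, and the volatility-ratio step is not covered.

```python
# Illustrative only: resampling, pair correlation, beta and a mean-reversion signal.
from math import sqrt
from statistics import mean, pstdev


def resample_last(ticks, start, end, step):
    """Turn irregular (timestamp, price) ticks into a regular series.

    Each grid point takes the last price seen at or before it (forward fill).
    """
    out, i, last = [], 0, None
    t = start
    while t <= end:
        while i < len(ticks) and ticks[i][0] <= t:
            last = ticks[i][1]
            i += 1
        out.append(last)
        t += step
    return out


def correlation(x, y):
    """Pearson correlation, used here to screen candidate pairs."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def beta(x, y):
    """Least-squares slope of y on x, used as the hedge ratio."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def pair_signal(x, y, entry_z=1.0):
    """Compare the latest spread with its historical mean (mean reversion)."""
    b = beta(x, y)
    spread = [yi - b * xi for xi, yi in zip(x, y)]
    z = (spread[-1] - mean(spread)) / pstdev(spread)
    if z > entry_z:
        return "short y / long x"   # y looks rich relative to x
    if z < -entry_z:
        return "long y / short x"   # y looks cheap relative to x
    return "flat"


if __name__ == "__main__":
    ticks = [(0, 10.0), (7, 10.5), (22, 11.0)]   # irregular timestamps
    print(resample_last(ticks, start=0, end=30, step=10))
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0, 4.1, 5.9, 8.0, 10.6]
    print(correlation(x, y), beta(x, y), pair_signal(x, y))
```

In practice the historical window, threshold and hedge ratio would be the "usable strategy parameters" derived during bootstrapping, then updated as new ticks arrive.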
A computer vision repository that started as an early PyTorch implementation (circa 2018, on PyTorch 0.4.1) of DeepLab-V3-Plus.
DeepLab is a series of semantic image segmentation models whose latest version, v3+, is state-of-the-art on the semantic segmentation task.
It can use Modified Aligned Xception or ResNet as its backbone.
The authors train DeepLab V3 Plus on the Pascal VOC 2012, SBD and Cityscapes datasets. Pre-trained models with ResNet,
MobileNet and DRN backbones are provided.