Microsoft currently holds 3436 public repositories, of which 94 are related to data science and machine learning.

Last Updated On: 09-07-20

Newly Added

| Name | Description | Language | Stars | License |
| --- | --- | --- | --- | --- |
| DeepSpeed | DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. | Python | 2685 | MIT License |
| hummingbird | Hummingbird compiles trained ML models into tensor computation for faster inference. | Python | 1484 | MIT License |
| tf2-gnn | TensorFlow 2 library implementing Graph Neural Networks | Python | 198 | MIT License |
| DisentangledFaceGAN | Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning (CVPR 2020 Oral) | Python | 158 | MIT License |
| transductive-vos.pytorch | A transductive approach for video object segmentation | Python | 73 | N/A |
| ptgnn | A PyTorch Graph Neural Network Library | Python | 72 | MIT License |
| DeBERTa | The implementation of DeBERTa | Python | 61 | MIT License |
| rat-sql | A relation-aware semantic parsing model from English to SQL | Python | 55 | MIT License |
| hyperspace | An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. | Scala | 50 | Apache License 2.0 |
| blackbox-smoothing | Provably defending pretrained classifiers including the Azure, Google, AWS, and Clarifai APIs | Jupyter Notebook | 48 | MIT License |
| FairMOT | This project provides an official implementation of our recent work on real-time multi-object tracking in videos. Previous works conduct object detection and tracking with two separate models, so they are very slow. In contrast, we propose a one-stage solution which does detection and tracking with a single network by elegantly solving the alignment problem. The resulting approach achieves groundbreaking results in terms of both accuracy and speed: (1) it ranks first among all the trackers on the MOT challenges; (2) it is significantly faster than previous state-of-the-art methods. In addition, it scales gracefully to handle a large number of objects. | Python | 42 | MIT License |
| sql-spark-connector | Apache Spark Connector for SQL Server and Azure SQL | Scala | 33 | Apache License 2.0 |
| statopt | Statistical adaptive stochastic optimization methods | Python | 29 | MIT License |
| inmt | Interactive Neural Machine Translation tool | Jupyter Notebook | 24 | MIT License |
| P.808 | This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR). | HTML | 22 | MIT License |
| archai | Reproducible Rapid Research for Neural Architecture Search (NAS) | Python | 22 | Other |
| onnxruntime-training-examples | Examples for using ONNX Runtime for model training. | Python | 21 | MIT License |
| MIMICS | MIMICS: A Large-Scale Data Collection for Search Clarification | N/A | 21 | MIT License |
| SGN | This is the implementation of the CVPR 2020 paper “Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition”. | Python | 19 | MIT License |
| spacy-ann-linker | spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking | Python | 16 | MIT License |
| vs-intellicode | VS IntelliCode GitHub Action used for IntelliCode CI tools, such as Model Training. | TypeScript | 15 | MIT License |
| ZipLine | Text clustering algorithm, implemented in .NET | C# | 13 | MIT License |
| topologic | A Python library for intelligently building networks and network embeddings, and for analyzing connected data. | Python | 12 | MIT License |
| reconner | ReconNER: debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data. | Python | 12 | MIT License |
| SmartKG | This project accepts Excel files as input, which contain the description of a Knowledge Graph (Vertexes and Edges), and converts them into an in-memory Graph Store. This project implements APIs to search/filter/get nodes and relations from the in-memory Knowledge Graph. This project also provides a dialog management framework and enables a chatbot based on its knowledge graph. | C# | 10 | MIT License |
| EA-VQ-VAE | This repo provides the code for the ACL 2020 paper "Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder" | Python | 10 | Other |
| MixingBoard | A Knowledgeable Stylized Integrated Text Generation Platform | Python | 9 | MIT License |
| AirSim-Drone-Racing-Lab | A framework for drone racing research, built on Microsoft AirSim. | Python | 8 | MIT License |
| roman | Python library for real-time control of a robotic manipulator | Python | 7 | MIT License |
| aml-acceleration-template | A template repository for quickly adopting Azure Machine Learning | Python | 7 | MIT License |
| Synapse-AI-Retail-Recommender | This Solution Accelerator is an end-to-end example of how to enable personalized customer experiences for retail scenarios by leveraging Azure Synapse Analytics, Azure Machine Learning Services, and other Azure Big Data services | TypeScript | 6 | MIT License |
| AcousticScatteringData | Synthetic exterior acoustic scattering data and sample parsing code. | MATLAB | 5 | N/A |
| FeatureBroker | A library for collecting features and performing inference of machine learning evaluations based on those features, useful especially in situations where the feature publishing software components are strongly decoupled from the software components that wish to exploit those features in machine learning models. | C++ | 5 | MIT License |
| sqlworkshops-k8stobdc | Kubernetes Course for SQL Server Big Data Clusters | N/A | 3 | Other |
| data-science-sandbox | Azure-hosted sandbox environment to enable third parties to collaborate on data science solutions over protected data sets | N/A | 2 | MIT License |
| data-contest-toolkit | A toolkit for conducting machine learning trials against confidential data | TypeScript | 2 | MIT License |
| MSMARCODocumentRanking | Document Ranking on MSMARCO | N/A | 2 | Other |
| sqlworkshops-sqlmlsvc | Workshop for SQL Server Machine Learning Services | Jupyter Notebook | 2 | Other |
| arcade-swarm-animation | Swarm animation for MakeCode Arcade | TypeScript | 2 | MIT License |
| ContextualSP | The official code of our paper "How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context" | Python | 2 | MIT License |
| News-Threads | The News Threads pipeline processes large volumes of document content, using machine learning to find derived text fragments and trace them to their original sources. The backend pipeline is written for Python. | Python | 2 | MIT License |
| vscode-simple-jupyter-notebook | Simple Jupyter notebook for exploration purposes | TypeScript | 2 | MIT License |
| Multi_Species_Bioacoustic_Classification | Multi-species bioacoustic classification using deep learning algorithms | Python | 1 | MIT License |
| sensus | Ensemble Named Entity Recognition with Consensus | N/A | 1 | MIT License |
| FRSGrabChallenge | Hello students! This Future Ready Skills Challenge provides a self-paced learning path for data science and machine learning. After completing foundational content, a challenge by Grab will be available for you to test your skills. | N/A | 1 | MIT License |
| automl_utils | Utilities to streamline the creation and evaluation of AutoML algorithms. | Python | 1 | Other |
| A-TALE-OF-THREE-CITIES | Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL and Azure Databricks, with visualization using ggplot2 and leaflet. Focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection. | R | 1 | MIT License |
| fhir-codegen | Tools for code generation based on the FHIR specification. | C# | 0 | MIT License |
| SynapseInPractice | Examples and Guidance on Using Azure Synapse Analytics to Tackle Real-World Data Science Problems | N/A | 0 | MIT License |
| aed-learn-data-science-videos | This repo will be the initial development of the Data Science Learn modules to go along with the CH9 video series. | N/A | 0 | MIT License |

Highly Rated

| Name | Description | Language | Stars | License |
| --- | --- | --- | --- | --- |
| CNTK | Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit | C++ | 16818 | Other |
| LightGBM | A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. | C++ | 11213 | MIT License |
| AirSim | Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research | C++ | 10325 | Other |
| ai-edu | AI education materials for Chinese students, teachers and IT professionals. | HTML | 7969 | Other |
| recommenders | Best Practices on Recommendation Systems | Jupyter Notebook | 7699 | MIT License |
| nni | An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning. | Python | 6452 | MIT License |
| botframework-sdk | Bot Framework provides the most comprehensive experience for building conversation applications. | Python | 6120 | MIT License |
| ailab | Experience, Learn and Code the latest breakthrough innovations with Microsoft AI | C# | 5601 | MIT License |
| nlp-recipes | Natural Language Processing Best Practices & Examples | Python | 4972 | MIT License |
| MMdnn | MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML. | Python | 4818 | MIT License |
| SandDance | Visually explore, understand, and present your data. | TypeScript | 4419 | MIT License |
| SPTAG | A distributed approximate nearest neighbor search (ANN) library which provides high quality vector index build, search and distributed online serving toolkits for large scale vector search scenarios. | C++ | 3733 | MIT License |
| ApplicationInspector | A source code analyzer built for surfacing features of interest and other characteristics to answer the question 'what's in it' using static analysis with a JSON-based rules engine. Ideal for scanning components before use or detecting feature-level changes. | C# | 3612 | MIT License |
| malmo | Project Malmo is a platform for Artificial Intelligence experimentation and research built on top of Minecraft. We aim to inspire a new generation of research into challenging new problems presented by this unique environment. | Java | 3550 | MIT License |
| Quantum | Microsoft Quantum Development Kit Samples | PowerShell | 2960 | MIT License |
| tensorwatch | Debugging, monitoring and visualization for Python Machine Learning and Data Science | Jupyter Notebook | 2867 | MIT License |
| computervision-recipes | Best Practices, code samples, and documentation for Computer Vision. | Jupyter Notebook | 2851 | MIT License |
| DMTK | Microsoft Distributed Machine Learning Toolkit | N/A | 2763 | MIT License |
| service-fabric | Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale. | C++ | 2749 | MIT License |
| QuantumKatas | Tutorials and programming exercises for learning Q# and quantum computing | Jupyter Notebook | 2645 | MIT License |
| VoTT | Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos. | TypeScript | 2637 | MIT License |
| onnxruntime | ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator | C++ | 2563 | MIT License |
| dowhy | DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. | Python | 2051 | MIT License |
| human-pose-estimation.pytorch | The project is an official implementation of our ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208) | Python | 1940 | MIT License |
| GraphEngine | Microsoft Graph Engine | C# | 1879 | Other |
| pai | Resource scheduling and cluster management for AI | JavaScript | 1757 | MIT License |
| AutonomousDrivingCookbook | Scenarios, tutorials and demos for Autonomous Driving | Jupyter Notebook | 1643 | MIT License |
| forecasting | Time Series Forecasting Best Practices & Examples | Python | 1315 | MIT License |
| NeuronBlocks | NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego | Python | 1315 | MIT License |
| unilm | UniLM - Unified Language Model Pre-training / Pre-training for NLP and Beyond | Python | 1197 | MIT License |
| BotFramework-WebChat | A highly-customizable web-based client for Azure Bot Services. | JavaScript | 1096 | MIT License |
| presidio | Context aware, pluggable and customizable data protection and anonymization service for text and images | Go | 1036 | MIT License |
| Mobius | C# and F# language binding and extensions to Apache Spark | C# | 924 | MIT License |
| rDSN | Robust Distributed System Nucleus (rDSN) is an open framework for quickly building and managing high performance and robust distributed systems. | C++ | 920 | MIT License |
| gated-graph-neural-network-samples | Sample Code for Gated Graph Neural Networks | Python | 896 | MIT License |
| EdgeML | This repository provides code for machine learning algorithms for edge devices developed at Microsoft Research India. | C++ | 856 | Other |
| DialoGPT | Large-scale pretraining for dialogue | Python | 850 | MIT License |
| EconML | ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x. | Jupyter Notebook | 849 | MIT License |
| TextWorld | TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games. | Python | 823 | Other |
| LightLDA | Scalable, fast, and lightweight system for large-scale topic modeling | C++ | 800 | Other |
| Multiverso | Parameter server framework for distributed machine learning | C++ | 753 | MIT License |
| azure-pipelines-image-generation | Azure Pipelines VM image generation for Microsoft-hosted CI/CD | PowerShell | 726 | MIT License |
| BuildXL | Microsoft Build Accelerator | C# | 628 | MIT License |
| MCW | Microsoft Cloud Workshop Project | N/A | 616 | Other |