Sijun He's Unsupervised Learning

Home

  • Mar 23, 2025 - nanoMoE: Extending NanoGPT with Mixture of Experts
  • Sep 18, 2023 - PaddleNLP: An End-to-End LLM toolkit delivering performance and compatibility
  • Jun 04, 2023 - Quantization Basics
  • Mar 11, 2023 - ChatGPT Series: Learning from Human Preferences
  • Feb 25, 2023 - ChatGPT Series: Chain-of-Thought Prompting
  • Jan 28, 2023 - ChatGPT Series: Instruction Finetuning and FLAN
  • Jan 20, 2023 - Macbook Pro GPU for ML?
  • Nov 06, 2022 - Hello Again, Baidu!
  • Nov 05, 2022 - Graph, Eager and JIT
  • Sep 13, 2022 - How to Train Transformers - Reading Notes on 【100亿模型计划】 (10 Billion Model Plan)
  • Jul 10, 2022 - A short Survey on Position Embeddings in Transformer models
  • Feb 19, 2022 - Sentence Embeddings and Similarity
  • Jan 18, 2022 - Hello, SenseTime!
  • Dec 26, 2021 - Reflecting on My Time at Twitter Cortex
  • Nov 20, 2020 - Building Content Representations at Twitter with Self-Supervised Learning
  • Aug 23, 2020 - Deep Neural Networks for YouTube Recommendations
  • Aug 17, 2020 - Wide & Deep Learning for Recommendation Systems
  • Dec 13, 2019 - Scaling NER at Twitter
  • Apr 23, 2019 - Kaggle: Learning from the Gendered Pronoun Resolution Challenge
  • Feb 19, 2019 - Kaggle: Flagging Insincere Questions on Quora
  • 52 post articles, 3 pages.

    © Sijun He's Unsupervised Learning 2025, Powered by Jekyll & TeXt Theme.