NVIDIA WEBINAR
Recent work on unsupervised language modeling, such as BERT and GPT-2, demonstrates that training large neural language models advances the state of the art in Natural Language Processing applications. However, memory constraints limit the size of models that can practically be trained on a single processor. Model parallelism allows us to train larger models by splitting the parameters across multiple processors, but designing such an approach that is both simple and efficient remains a challenge.
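The core idea behind splitting parameters across processors can be sketched in a few lines. Below is a minimal, hypothetical illustration in pure Python (not Megatron's actual implementation): a layer's weight matrix is partitioned column-wise across simulated workers, each worker computes its slice of the output, and the slices are concatenated, giving the same result as the unpartitioned layer while no single worker ever holds the full weight matrix.

```python
# Hypothetical sketch of intra-layer model parallelism: partition a
# weight matrix column-wise across workers so that no single device
# has to store all of the parameters. Sizes and names are illustrative.

def matmul(x, w):
    """Multiply a row vector x (list) by a matrix w (list of rows)."""
    cols = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(cols)]

def split_columns(w, parts):
    """Partition matrix w column-wise into `parts` equal shards."""
    cols = len(w[0])
    step = cols // parts
    return [[row[p * step:(p + 1) * step] for row in w] for p in range(parts)]

# A tiny 2x4 weight matrix and a length-2 input activation.
w = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
x = [1, 1]

# Serial result: one "device" holds the whole matrix.
full = matmul(x, w)

# Model-parallel result: each shard computes its slice of the output;
# the slices are then concatenated (an all-gather, in practice).
shards = split_columns(w, parts=2)
parallel = [y for shard in shards for y in matmul(x, shard)]

assert parallel == full  # both yield [6, 8, 10, 12]
```

In a real system the shards live on separate GPUs and the concatenation is a communication collective, which is where the efficiency challenge mentioned above comes in.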
Recently, NVIDIA Research launched Project Megatron to enable training state-of-the-art transformer language models with billions of parameters. Join this webinar to learn how NVIDIA researchers created Megatron, the largest transformer language model ever trained, with 8.3 billion parameters: 24x the size of BERT and 5.6x the size of GPT-2. Trained on 174 GB of text, the model establishes new state-of-the-art results on tasks such as LAMBADA, which tests the ability to model long-term dependencies.
Dr. Mohammad Shoeybi is a Senior Research Scientist in the Applied Deep Learning Research group at NVIDIA. His interests include Natural Language Processing (NLP) applications, unsupervised learning, and large-scale language modeling. Prior to NVIDIA, Mohammad worked at DeepMind and Baidu, leading efforts on deep learning for speech synthesis, text-to-speech, and recommender systems.
Date & Time: Wednesday, April 22, 2018