Paper Detail

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

NVIDIA, Aakshita Chandiramani, Aaron Blakeman, Abdullahi Olaoye, Abhibha Gupta, Abhilash Somasamudramath, Abhinav Khattar, Adeola Adesoba, Adi Renduchintala, Adil Asif, Aditya Agrawal, Aditya Vavre, Ahmad Kiswani, Aishwarya Padmakumar, Ajay Hotchandani, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Gronskiy, Alex Kondratenko, Alex Neefus, Alex Steiner, Alex Yang, Alexander Bukharin, Alexander Young, Ali Hatamizadeh, Ali Taghibakhshi, Alina Galiautdinova, Alisa Liu, Alok Kumar, Ameya Sunil Mahabaleshwarkar, Amir Klein, Amit Zuker, Amnon Geifman, Anahita Bhiwandiwalla, Ananth Subramaniam, Andrew Tao, Anjaney Shrivastava, Anjulie Agrusa, Ankur Srivastava, Ankur Verma, Ann Guan, Anna Shors, Annamalai Chockalingam, Anubhav Mandarwal, Aparnaa Ramani, Arham Mehta, Arti Jain, Arun Venkatesan, Asha Anoosheh, Ashwath Aithal, Ashwin Poojary, Asif Ahamed, Asit Mishra, Asli Sabanci Demiroz, Asma Kuriparambil Thekkumpate, Atefeh Sohrabizadeh, Avinash Kaur, Ayush Dattagupta, Barath Subramaniam Anandan, Bardiya Sadeghi, Barnaby Simkin, Ben Lanir, Benedikt Schifferer, Benjamin Chislett, Besmira Nushi, Bilal Kartal, Bill Thiede, Bita Darvish Rouhani, Bobby Chen, Boris Ginsburg, Brandon Norick, Branislav Kisacanin, Brian Yu, Bryan Catanzaro, Buvaneswari Mani, Carlo del Mundo, Chankyu Lee, Chanran Kim, Chantal Hwang, Chao Ni, Charles Wang, Charlie Truong, Cheng-Ping Hsieh, Chenhan Yu, Chenjie Luo, Cherie Wang, Chetan Mungekar, Chintan Patel, Chris Alexiuk, Chris Holguin, Chris Wing, Christian Munley, Christopher Parisien, Chuck Desai, Chunyang Sheng, Collin Neale, Cyril Meurillon, Dakshi Kumar, Dan Gil, Dan Su, Dane Corneil, Daniel Afrimi, Daniel Burkhardt Eliuth Triana, Daniel Egert, Daniel Fatade, Daniel Lo, Daniel Rohrer, Daniel Serebrenik, Daniil Sorokin, Daria Gitman, Daria Levy, Darko Stosic, David Edelsohn, David Messina, David Mosallanezhad, David Tamok, Deena Donia, Deepak Narayanan, Devin O'Kelly, Dheeraj Peri, Dhruv Nathawani, Di Wu, Dima 
Rekesh, Dina Yared, Divyanshu Kakwani, Dmitry Konyagin Brandon Tuttle, Dong Ahn, Dongfu Jiang, Dorrin Poorkay, Douglas O'Flaherty, Duncan Riach, Dusan Stosic, Dustin Van Stee, Edgar Minasyan, Edward Lin, Eileen Peters Long, Elad Segal, Elena Lantz, Elena Lewis, Ellie Evans, Elliott Ning, Eric Chung, Eric Harper, Eric Pham-Hung, Eric W. Tramel, Erick Galinkin, Erik Pounds, Esti Etrog, Evan Briones, Evan Wu, Evelina Bakhturina, Evgeny Tsykunov, Ewa Dobrowolska, Farshad Saberi Movahed, Farzan Memarian, Fay Wang, Fei Jia, Felipe Soares, Felipe Vieira Frujeri, Feng Chen, Fengguang Lin, Ferenc Galko, Fortuna Zhang, Frankie Siino, Frida Hou, Gantavya Bhatt, Gargi Prasad, Geethapriya Venkataramani, Geetika Gupta, George Armstrong, Gerald Shen, Giulio Borghesi, Gordana Neskovic, Gorkem Batmaz, Grace Lam, Grace Wu, Greg Pauloski, Greyson Davis, Grigor Nalbandyan, Guoming Zhang, Guy Farber, Guyue Huang, Haifeng Qian, Haran Kumar Shiv Kumar, Harry Kim, Harsh Sharma, Hayate Iso, Hayley Ross, Herbert Hum, Herman Sahota, Hexin Wang, Himanshu Soni, Hiren Upadhyay, Huy Nguyen, Iain Cunningham, Ido Galil, Ido Shahaf, Igino Padovani, Igor Gitman, Igor Shovkun, Ikroop Dhillon, Ilya Loshchilov, Ingrid Kelly, Itamar Schen, Itay Levy, Ivan Moshkov, Izik Golan, Izzy Putterman, Jain Tu, Jan Baczek, Jan Kautz, Jane Polak Scowcroft, Janica Rosenberg, Jared Casper, Jarrod Pflum, Jason Grant, Jason Sewall, Jatin Mitra, Jeffrey Glick, Jenny Chen, Jesse Oliver, Jiacheng Xu, Jiafan Zhu, Jialin Song, Jian Zhang, Jiaqi Zeng, Jie Lou, Jill Milton, Jim Chow, Jimmy Zhang, Jinhang Choi, Jining Huang, Jocelyn Huang, Joel Caruso, Joey Conway, Joey Guman, Johan Jatko, John Kamalu, Johnny Greco, Jonathan Cohen, Jonathan Raiman, Joseph Jennings, Joyjit Daw, Juan Yu, Julio Tapia, Junkeun Yi, Jupinder Parmar, Jyothi Achar, Kari Briski, Kartik Mattoo, Katherine Cheung, Katherine Luna, Keith Wyss, Kevin Shih, Kezhi Kong, Khanh Nguyen, Khushi Bhardwaj, Kirill Buryak, Kirthi Shankar Sivamani, Konstantinos 
Krommydas, Kris Murphy, Krishna C. Puvvada, Krzysztof Pawelec, Kumar Anik, Laikh Tewari, Laya Sleiman, Leo Du, Leon Derczynski, Li Ding, Lilach Ilan, Lingjie Wu, Lizzie Wei, Luis Vega, Lun Su, Maarten Van Segbroeck, Maer Rodrigues de Melo, Magaret Zhang, Mahan Fathi, Makesh Narsimhan Sreedhar, Makesh Sreedhar, Makesh Tarun Chandran, Manuel Reyes Gomez, Maor Ashkenazi, Marc Cuevas, Marc Romeijn, Margaret Zhang, Mark Cai, Mark Gabel, Markus Kliegl, Martyna Patelka, Maryam Moosaei, Matthew Varacalli, Matvei Novikov, Mauricio Ferrato, Mehrzad Samadi, Melissa Corpuz, Meng Xin, Mengdi Wang, Mengru Wang, Meredith Price, Micah Schaffer, Michael Andersch, Michael Boone, Michael Evans, Michael Z Wang, Miguel Martinez, Mikail Khona, Mike Chrzanowski, Mike Hollinger, Mingyuan Ma, Minseok Lee, Mohammad Dabbah, Mohammad Shoeybi, Mostofa Patwary, Nabin Mulepati, Nader Khalil, Najeeb Nabwani, Nancy Agarwal, Nanthini Balasubramaniam, Narimane Hennouni, Narsi Kodukula, Natalie Hereth, Nathaniel Pinckney, Nave Assaf, Negar Habibi, Nestor Qin, Neta Zmora, Netanel Haber, Nick Reamaroon, Nickson Quak, Nidhi Bhatia, Nikhil Jukar, Nikki Pope, Nikolai Ludwig, Nima Tajbakhsh, Nir Ailon, Nirmal Juluru, Nirmalya De, Nowel Pitt, Oleg Rybakov, Oleksii Hrinchuk, Oleksii Kuchaiev, Olivier Delalleau, Oluwatobi Olabiyi, Omer Ullman Argov, Omri Almog, Omri Puny, Oren Tropp, Otavio Padovani, Ouye Xie, Parth Chadha, Pasha Shamis, Paul Gibbons, Pavlo Molchanov, Peter Belcak, Peter Jin, Pinky Xu, Piotr Januszewski, Pooya Jannaty, Prachi Shevate, Pradeep Thalasta, Pranav Prashant Thombre, Prasoon Varshney, Prerana Gambhir, Pritam Gundecha, Przemek Tredak, Qing Miao, Qiyu Wan, Quan Tran Minh, Rabeeh Karimi Mahabadi, Rachel Oberman, Rachit Garg, Rahul Kandu, Raina Zhong, Ran El-Yaniv, Ran Zilberstein, Rasoul Shafipour, Renee Yao, Renjie Pi, Richard Mazzarese, Richard Wang, Rick Izzo, Ridhima Singla, Rima Shahbazyan, Rishabh Garg, Ritika Borkar, Ritu Gala, Riyad Islam, Robert Clark, Robert Hesse, Roger 
Waleffe, Rohit Varma Kalidindi, Rohit Watve, Roi Koren, Ron Fan, Ruchika Kharwar, Ruisi Cai, Ruoxi Zhang, Russell J. Hewett, Ryan Prenger, Ryan Timbrook, Ryota Egashira, Sadegh Mahdavi, Sagar Singh Ashutosh Joshi, Sahil Modi, Samuel Kriman, Sandeep Pombra, Sanjay Kariyappa, Sanjeev Satheesh, Santiago Pombo, Saori Kaji, Satish Pasumarthi, Saurav Mishra, Saurav Muralidharan, Scott Hara, Sean Narenthiran, Sebastian Rogawski, Seonjin Na, Seonmyeong Bak, Sepehr Sameni, Seth Poulos, Shahar Mor, Shantanu Acharya, Shaona Ghosh Adam Lord, Sharath Turuvekere Sreenivas, Shaun Kotek, Shaya Gharghabi, Shelby Thomas, Sheng-Chieh Lin, Shibani Likhite, Shiqing Fan, Shiyang Chen, Shreya Gopal, Shrimai Prabhumoye, Shubham Pachori, Shubham Toshniwal, Shuo Zhang, Shuoyang Ding, Shyam Renjith, Shyamala Prayaga, Siddhartha Jain, Simeng Sun, Sirisha Rella, Sirshak Das, Smita Ithape, Sneha Harishchandra S, Somshubra Majumdar, Soumye Singhal, Sri Harsha Singudasu, Sriharsha Niverty, Stas Sergienko, Stefana Gloginic, Stefania Alborghetti, Stephen Ge, Stephen McCullough, Sugam Dipak Devare, Suguna Varshini Velury, Sukrit Rao, Sumeet Kumar Barua, Sunny Gai, Suseella Panguluri, Sushil Koundinyan, Swathi Patnam, Sweta Priyadarshi, Swetha Bhendigeri, Syeda Nahida Akter, Sylendran Arunagiri, Tailling Yuan, Talor Abramovich, Tan Bui, Tan Yu, Terry Kong, Thanh Do, Thomas Gburek, Thorgane Marques, Tiffany Moore, Tijmen Blankevoort, Tim Moon, Timothy Ma, Tiyasa Mitra, Tomasz Grzegorzek, Tomer Asida, Tomer Bar Natan, Tomer Keren, Tomer Ronen, Traian Rebedea, Trenton Starkey, Tugrul Konuk, Twinkle Vashishth, Tyler Condensa, Udi Karpas, Ushnish De, Vahid Noorozi, Vahid Noroozi, Vanshil Atul Shah, Veena Vaidyanathan, Venkat Srinivasan, Venmugil Elango, Victor Cui, Vijay Korthikanti, Vikas Mehta, Virginia Adams, Virginia Wu, Vitaly Kurin, Vitaly Lavrukhin, Vladimir Anisimov, Wan Seo, Wanli Jiang, Wasi Uddin Ahmad, Wei Du, Wei Ping, Wei-Ming Chen, Wendy Quan, Wenliang Dai, Wenwen Gao, Will Jennings, 
William Zhang, Xiaowei Ren, Xiaowen Xin, Xin Li, Yang Yu, Yangyi Chen, Yaniv Galron, Yashaswi Karnati, Yejin Choi, Yev Meyer, Yi-Fu Wu, Yian Zhang, Ying Lin, Yonatan Geifman, Yonggan Fu, Yoshi Suhara, Youngeun Kwon, Yuan Zhang, Yuki Huang, Zach Moshe, Zhilin Wang, Zhiyu Cheng, Zhongbo Zhu, Zhuolin Yang, Zihan Liu, Zijia Chen, Zijie Yan, Zuhair Ahmed

huggingface Score 15.5

Published 2026-04-14 · First seen 2026-04-15

General AI

Abstract

We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention Mixture-of-Experts model. Nemotron 3 Super is the first model in the Nemotron 3 family to 1) be pre-trained in NVFP4, 2) leverage LatentMoE, a new Mixture-of-Experts architecture that optimizes for both accuracy per FLOP and accuracy per parameter, and 3) include MTP layers for inference acceleration through native speculative decoding. We pre-trained Nemotron 3 Super on 25 trillion tokens followed by post-training using supervised fine tuning (SFT) and reinforcement learning (RL). The final model supports up to 1M context length and achieves comparable accuracy on common benchmarks, while also achieving up to 2.2x and 7.5x higher inference throughput compared to GPT-OSS-120B and Qwen3.5-122B, respectively. Nemotron 3 Super datasets, along with the base, post-trained, and quantized checkpoints, are open-sourced on HuggingFace.

Workflow Status

Review status: pending
Role: unreviewed
Read priority: now
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.
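As a rough illustration of the note above, the Python sketch below merges the four named fields into one record of a JSONL file. Only the field names and the file name `papers.jsonl` come from this page. The `id` key used to find the record, the value shapes (list versus string), and the helper `enrich_jsonl` are assumptions for illustration, so check them against the actual schema first. The sample values are paraphrased from the abstract, except `next_action`, which is a placeholder.

```python
import json

# Field names come from this page; everything else here is an assumption.
NOTE_FIELDS = ("summary_sections", "why_relevant", "claim_impact", "next_action")


def enrich_jsonl(text, paper_id, notes, id_key="id"):
    """Return JSONL `text` with `notes` merged into the matching record.

    `id_key` is a hypothetical key name; the real schema may differ.
    """
    unknown = set(notes) - set(NOTE_FIELDS)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # drop blank lines
        record = json.loads(line)
        if record.get(id_key) == paper_id:
            record.update(notes)  # add or overwrite the note fields
        out.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(out) + "\n"


# Illustrative values; the value shapes (list vs. string) are assumed.
notes = {
    "summary_sections": ["120B-parameter (12B active) hybrid Mamba-Attention MoE"],
    "why_relevant": "Open checkpoints and datasets on HuggingFace",
    "claim_impact": "Up to 2.2x / 7.5x throughput vs. GPT-OSS-120B / Qwen3.5-122B",
    "next_action": "Read the full paper",
}

# A stand-in for one line of papers.jsonl (hypothetical record layout).
sample = '{"id": "2604.12374", "title": "Nemotron 3 Super"}\n'
enriched = enrich_jsonl(sample, "2604.12374", notes)
print(enriched)
```

To apply this to the real file, read `papers.jsonl` into a string, pass it through `enrich_jsonl`, and write the result back. Records whose identifier does not match are re-serialized but otherwise unchanged.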

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@misc{nvidia2026nemotron,
  title = {Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning},
  author = {NVIDIA and Aakshita Chandiramani and Aaron Blakeman and Abdullahi Olaoye and Abhibha Gupta and Abhilash Somasamudramath and Abhinav Khattar and Adeola Adesoba and Adi Renduchintala and Adil Asif and Aditya Agrawal and Aditya Vavre and Ahmad Kiswani and Aishwarya Padmakumar and Ajay Hotchandani and Akanksha Shukla and Akhiad Bercovich and Aleksander Ficek and Aleksandr Shaposhnikov and Alex Gronskiy and Alex Kondratenko and Alex Neefus and Alex Steiner and Alex Yang and Alexander Bukharin and Alexander Young and Ali Hatamizadeh and Ali Taghibakhshi and Alina Galiautdinova and Alisa Liu and Alok Kumar and Ameya Sunil Mahabaleshwarkar and Amir Klein and Amit Zuker and Amnon Geifman and Anahita Bhiwandiwalla and Ananth Subramaniam and Andrew Tao and Anjaney Shrivastava and Anjulie Agrusa and Ankur Srivastava and Ankur Verma and Ann Guan and Anna Shors and Annamalai Chockalingam and Anubhav Mandarwal and Aparnaa Ramani and Arham Mehta and Arti Jain and Arun Venkatesan and Asha Anoosheh and Ashwath Aithal and Ashwin Poojary and Asif Ahamed and Asit Mishra and Asli Sabanci Demiroz and Asma Kuriparambil Thekkumpate and Atefeh Sohrabizadeh and Avinash Kaur and Ayush Dattagupta and Barath Subramaniam Anandan and Bardiya Sadeghi and Barnaby Simkin and Ben Lanir and Benedikt Schifferer and Benjamin Chislett and Besmira Nushi and Bilal Kartal and Bill Thiede and Bita Darvish Rouhani and Bobby Chen and Boris Ginsburg and Brandon Norick and Branislav Kisacanin and Brian Yu and Bryan Catanzaro and Buvaneswari Mani and Carlo del Mundo and Chankyu Lee and Chanran Kim and Chantal Hwang and Chao Ni and Charles Wang and Charlie Truong and Cheng-Ping Hsieh and Chenhan Yu and Chenjie Luo and Cherie Wang and Chetan Mungekar and Chintan Patel and Chris Alexiuk and Chris Holguin and Chris Wing and Christian Munley and Christopher Parisien and Chuck Desai and Chunyang Sheng and Collin Neale and Cyril Meurillon and Dakshi Kumar and Dan Gil and Dan Su and Dane Corneil and Daniel 
Afrimi and Daniel Burkhardt Eliuth Triana and Daniel Egert and Daniel Fatade and Daniel Lo and Daniel Rohrer and Daniel Serebrenik and Daniil Sorokin and Daria Gitman and Daria Levy and Darko Stosic and David Edelsohn and David Messina and David Mosallanezhad and David Tamok and Deena Donia and Deepak Narayanan and Devin O'Kelly and Dheeraj Peri and Dhruv Nathawani and Di Wu and Dima Rekesh and Dina Yared and Divyanshu Kakwani and Dmitry Konyagin Brandon Tuttle and Dong Ahn and Dongfu Jiang and Dorrin Poorkay and Douglas O'Flaherty and Duncan Riach and Dusan Stosic and Dustin Van Stee and Edgar Minasyan and Edward Lin and Eileen Peters Long and Elad Segal and Elena Lantz and Elena Lewis and Ellie Evans and Elliott Ning and Eric Chung and Eric Harper and Eric Pham-Hung and Eric W. Tramel and Erick Galinkin and Erik Pounds and Esti Etrog and Evan Briones and Evan Wu and Evelina Bakhturina and Evgeny Tsykunov and Ewa Dobrowolska and Farshad Saberi Movahed and Farzan Memarian and Fay Wang and Fei Jia and Felipe Soares and Felipe Vieira Frujeri and Feng Chen and Fengguang Lin and Ferenc Galko and Fortuna Zhang and Frankie Siino and Frida Hou and Gantavya Bhatt and Gargi Prasad and Geethapriya Venkataramani and Geetika Gupta and George Armstrong and Gerald Shen and Giulio Borghesi and Gordana Neskovic and Gorkem Batmaz and Grace Lam and Grace Wu and Greg Pauloski and Greyson Davis and Grigor Nalbandyan and Guoming Zhang and Guy Farber and Guyue Huang and Haifeng Qian and Haran Kumar Shiv Kumar and Harry Kim and Harsh Sharma and Hayate Iso and Hayley Ross and Herbert Hum and Herman Sahota and Hexin Wang and Himanshu Soni and Hiren Upadhyay and Huy Nguyen and Iain Cunningham and Ido Galil and Ido Shahaf and Igino Padovani and Igor Gitman and Igor Shovkun and Ikroop Dhillon and Ilya Loshchilov and Ingrid Kelly and Itamar Schen and Itay Levy and Ivan Moshkov and Izik Golan and Izzy Putterman and Jain Tu and Jan Baczek and Jan Kautz and Jane Polak Scowcroft and Janica 
Rosenberg and Jared Casper and Jarrod Pflum and Jason Grant and Jason Sewall and Jatin Mitra and Jeffrey Glick and Jenny Chen and Jesse Oliver and Jiacheng Xu and Jiafan Zhu and Jialin Song and Jian Zhang and Jiaqi Zeng and Jie Lou and Jill Milton and Jim Chow and Jimmy Zhang and Jinhang Choi and Jining Huang and Jocelyn Huang and Joel Caruso and Joey Conway and Joey Guman and Johan Jatko and John Kamalu and Johnny Greco and Jonathan Cohen and Jonathan Raiman and Joseph Jennings and Joyjit Daw and Juan Yu and Julio Tapia and Junkeun Yi and Jupinder Parmar and Jyothi Achar and Kari Briski and Kartik Mattoo and Katherine Cheung and Katherine Luna and Keith Wyss and Kevin Shih and Kezhi Kong and Khanh Nguyen and Khushi Bhardwaj and Kirill Buryak and Kirthi Shankar Sivamani and Konstantinos Krommydas and Kris Murphy and Krishna C. Puvvada and Krzysztof Pawelec and Kumar Anik and Laikh Tewari and Laya Sleiman and Leo Du and Leon Derczynski and Li Ding and Lilach Ilan and Lingjie Wu and Lizzie Wei and Luis Vega and Lun Su and Maarten Van Segbroeck and Maer Rodrigues de Melo and Magaret Zhang and Mahan Fathi and Makesh Narsimhan Sreedhar and Makesh Sreedhar and Makesh Tarun Chandran and Manuel Reyes Gomez and Maor Ashkenazi and Marc Cuevas and Marc Romeijn and Margaret Zhang and Mark Cai and Mark Gabel and Markus Kliegl and Martyna Patelka and Maryam Moosaei and Matthew Varacalli and Matvei Novikov and Mauricio Ferrato and Mehrzad Samadi and Melissa Corpuz and Meng Xin and Mengdi Wang and Mengru Wang and Meredith Price and Micah Schaffer and Michael Andersch and Michael Boone and Michael Evans and Michael Z Wang and Miguel Martinez and Mikail Khona and Mike Chrzanowski and Mike Hollinger and Mingyuan Ma and Minseok Lee and Mohammad Dabbah and Mohammad Shoeybi and Mostofa Patwary and Nabin Mulepati and Nader Khalil and Najeeb Nabwani and Nancy Agarwal and Nanthini Balasubramaniam and Narimane Hennouni and Narsi Kodukula and Natalie Hereth and Nathaniel Pinckney and Nave 
Assaf and Negar Habibi and Nestor Qin and Neta Zmora and Netanel Haber and Nick Reamaroon and Nickson Quak and Nidhi Bhatia and Nikhil Jukar and Nikki Pope and Nikolai Ludwig and Nima Tajbakhsh and Nir Ailon and Nirmal Juluru and Nirmalya De and Nowel Pitt and Oleg Rybakov and Oleksii Hrinchuk and Oleksii Kuchaiev and Olivier Delalleau and Oluwatobi Olabiyi and Omer Ullman Argov and Omri Almog and Omri Puny and Oren Tropp and Otavio Padovani and Ouye Xie and Parth Chadha and Pasha Shamis and Paul Gibbons and Pavlo Molchanov and Peter Belcak and Peter Jin and Pinky Xu and Piotr Januszewski and Pooya Jannaty and Prachi Shevate and Pradeep Thalasta and Pranav Prashant Thombre and Prasoon Varshney and Prerana Gambhir and Pritam Gundecha and Przemek Tredak and Qing Miao and Qiyu Wan and Quan Tran Minh and Rabeeh Karimi Mahabadi and Rachel Oberman and Rachit Garg and Rahul Kandu and Raina Zhong and Ran El-Yaniv and Ran Zilberstein and Rasoul Shafipour and Renee Yao and Renjie Pi and Richard Mazzarese and Richard Wang and Rick Izzo and Ridhima Singla and Rima Shahbazyan and Rishabh Garg and Ritika Borkar and Ritu Gala and Riyad Islam and Robert Clark and Robert Hesse and Roger Waleffe and Rohit Varma Kalidindi and Rohit Watve and Roi Koren and Ron Fan and Ruchika Kharwar and Ruisi Cai and Ruoxi Zhang and Russell J. 
Hewett and Ryan Prenger and Ryan Timbrook and Ryota Egashira and Sadegh Mahdavi and Sagar Singh Ashutosh Joshi and Sahil Modi and Samuel Kriman and Sandeep Pombra and Sanjay Kariyappa and Sanjeev Satheesh and Santiago Pombo and Saori Kaji and Satish Pasumarthi and Saurav Mishra and Saurav Muralidharan and Scott Hara and Sean Narenthiran and Sebastian Rogawski and Seonjin Na and Seonmyeong Bak and Sepehr Sameni and Seth Poulos and Shahar Mor and Shantanu Acharya and Shaona Ghosh Adam Lord and Sharath Turuvekere Sreenivas and Shaun Kotek and Shaya Gharghabi and Shelby Thomas and Sheng-Chieh Lin and Shibani Likhite and Shiqing Fan and Shiyang Chen and Shreya Gopal and Shrimai Prabhumoye and Shubham Pachori and Shubham Toshniwal and Shuo Zhang and Shuoyang Ding and Shyam Renjith and Shyamala Prayaga and Siddhartha Jain and Simeng Sun and Sirisha Rella and Sirshak Das and Smita Ithape and Sneha Harishchandra S and Somshubra Majumdar and Soumye Singhal and Sri Harsha Singudasu and Sriharsha Niverty and Stas Sergienko and Stefana Gloginic and Stefania Alborghetti and Stephen Ge and Stephen McCullough and Sugam Dipak Devare and Suguna Varshini Velury and Sukrit Rao and Sumeet Kumar Barua and Sunny Gai and Suseella Panguluri and Sushil Koundinyan and Swathi Patnam and Sweta Priyadarshi and Swetha Bhendigeri and Syeda Nahida Akter and Sylendran Arunagiri and Tailling Yuan and Talor Abramovich and Tan Bui and Tan Yu and Terry Kong and Thanh Do and Thomas Gburek and Thorgane Marques and Tiffany Moore and Tijmen Blankevoort and Tim Moon and Timothy Ma and Tiyasa Mitra and Tomasz Grzegorzek and Tomer Asida and Tomer Bar Natan and Tomer Keren and Tomer Ronen and Traian Rebedea and Trenton Starkey and Tugrul Konuk and Twinkle Vashishth and Tyler Condensa and Udi Karpas and Ushnish De and Vahid Noorozi and Vahid Noroozi and Vanshil Atul Shah and Veena Vaidyanathan and Venkat Srinivasan and Venmugil Elango and Victor Cui and Vijay Korthikanti and Vikas Mehta and Virginia Adams and 
Virginia Wu and Vitaly Kurin and Vitaly Lavrukhin and Vladimir Anisimov and Wan Seo and Wanli Jiang and Wasi Uddin Ahmad and Wei Du and Wei Ping and Wei-Ming Chen and Wendy Quan and Wenliang Dai and Wenwen Gao and Will Jennings and William Zhang and Xiaowei Ren and Xiaowen Xin and Xin Li and Yang Yu and Yangyi Chen and Yaniv Galron and Yashaswi Karnati and Yejin Choi and Yev Meyer and Yi-Fu Wu and Yian Zhang and Ying Lin and Yonatan Geifman and Yonggan Fu and Yoshi Suhara and Youngeun Kwon and Yuan Zhang and Yuki Huang and Zach Moshe and Zhilin Wang and Zhiyu Cheng and Zhongbo Zhu and Zhuolin Yang and Zihan Liu and Zijia Chen and Zijie Yan and Zuhair Ahmed},
  year = {2026},
  abstract = {We describe the pre-training, post-training, and quantization of Nemotron 3 Super, a 120 billion (active 12 billion) parameter hybrid Mamba-Attention Mixture-of-Experts model. Nemotron 3 Super is the first model in the Nemotron 3 family to 1) be pre-trained in NVFP4, 2) leverage LatentMoE, a new Mixture-of-Experts architecture that optimizes for both accuracy per FLOP and accuracy per parameter, and 3) include MTP layers for inference acceleration through native speculative decoding. We pre-trai},
  url = {https://huggingface.co/papers/2604.12374},
  keywords = {hybrid Mamba-Attention Mixture-of-Experts, NVFP4, LatentMoE, Mixture-of-Experts, speculative decoding, supervised fine tuning, reinforcement learning, context length, inference throughput, huggingface daily},
  eprint = {2604.12374},
  archiveprefix = {arXiv},
}

Metadata

{}