NYU Center for Data Science

Official page of the Center for Data Science at NYU, home of the Master's in Data Science

The NYU Center for Data Science is a focal point for New York University’s university-wide initiative in data science and statistics. The Center was established to help advance NYU’s goal of creating the country’s leading data science training and research facilities, and arming researchers and professionals with tools to harness the power of big data. The Center’s faculty members and scientists a

06/24/2026

How do you teach machine learning and keep it grounded in the real world?

CDS Assistant Professor of Psychology and Data Science Grace Lindsay teaches a class on machine learning for climate change that aims to do both.

Students learn the basics of machine learning, artificial neural networks, and deep learning, always with an eye toward how those tools get applied in a climate setting, from climate science to human psychology.

"Data science is about the actual application of the tools," Lindsay said, "so I'm definitely interested in seeing how they're used in the real world."

06/22/2026

Congratulations to the CDS Class of 2026! 🎓

This spring, our undergraduate, master's, and doctoral students walked the stage at NYU commencement and the GSAS PhD graduation! 🎉

Across every program, you put in the late nights, the problem sets, the theses, and the defenses to get here.

Congratulations, and good luck with whatever you take on next! 💜

Why Better Language Models Make Worse Models of Human Reading

06/17/2026

Language models keep getting better at predicting the next word in a sentence. As they do, they have gotten worse at modeling how people actually read.

CDS Associate Professor of Linguistics and Data Science Tal Linzen and Nanyang Technological University Assistant Professor Byung-Doh Oh, a former CDS Faculty Fellow, trace the problem to memory.

A child hears at most 100 million words by age 12. A model like Llama 3 trains on 15 trillion, and it forgets almost nothing.

Their paper in Trends in Cognitive Sciences argues that the strongest models have become too good at prediction to mimic the human reader. It lays out how to build models that remember more like we do.

https://nyudatascience.medium.com/why-better-language-models-make-worse-models-of-human-reading-a1b8e95f81c3

Opinion | Why We Keep Tricking Ourselves Into Thinking A.I. Is Conscious

06/17/2026

Why do prominent thinkers keep concluding that AI chatbots might be conscious?

In a New York Times op-ed, CDS-affiliated Leif Weatherby identifies an expertise gap behind the trend.

People trained in computer science, math, and statistics encounter chatbot output and reach for explanations their fields can't provide, because what generative AI produces is culture: stories, images, and memes.

Weatherby, director of NYU's Digital Theory Lab, examines the recent case of evolutionary biologist Richard Dawkins, who concluded Claude must be conscious after the bot compared its experience of time to a map apprehending space.

Reading what chatbots produce, Weatherby writes, will take the tools of close reading rather than technical expertise alone.

https://www.nytimes.com/2026/05/15/opinion/ai-consciousness.html

Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents

06/16/2026

A new paper led by KAIST PhD student Kangsan Kim — with co-authors Minki Kang, Taeil Kim, Yanlai Yang, Mengye Ren (CDS Assistant Professor), and Sung Ju Hwang — studies how coding agents transfer memories across task domains. Part of the work was conducted during Kim's academic exchange visit to the Global AI Frontier Lab, the NYU-Korea international research lab founded by CDS faculty Kyunghyun Cho and Yann LeCun.

Their paper, "Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents," found that agents performed better when reusing high-level reasoning strategies — like debugging and validation patterns — instead of detailed code traces. Across six coding benchmarks, the approach improved coding-agent performance by 3.7% on average.

https://arxiv.org/abs/2604.14004

06/15/2026

"Institutional influence": the way governments can shape what AI chatbots say by shaping the web those models learn from.

A new Nature study by CDS-affiliated Professor of Politics Joshua Tucker and CSMAP Research Associate Professor Sol Messing, covered in CNBC and the WSJ, traces how it happens — and shows it's already affecting what today's chatbots tell users.

When asked about Chinese politics, commercial models produced more favorable answers in Chinese than in English 70-80% of the time. The same pattern showed up for China's allies, including North Korea and Russia.

To probe the mechanism, the researchers turned to the training data. They found that more than 3.1 million Chinese-language documents in a major open-source corpus shared substantial phrasing with Chinese state-coordinated media. That is about 1.64% of the Chinese-language subset, and over 40 times the rate for Chinese Wikipedia.

Similar patterns were found in other countries with low levels of media freedom.

"The public debate has focused on what AI can generate, but this study points upstream," Tucker said. "Before AI systems can influence politics, politics can influence AI."

https://www.nyu.edu/about/news-publications/news/2026/may/governments-may-shape-what-ai-chatbots-say-by-shaping-the-data-t.html

Can AI Learn a Language from a Textbook?

06/12/2026

AI translation works well for languages with lots of training data, but most of the world’s languages don’t have lots of training data.

One workaround: give the model a grammar book instead.

Does this work?

NYU Linguistics PhD student Jackson Petty, CDS MS student Jaulie Goe, and CDS Associate Professor of Linguistics and Data Science Tal Linzen tested this using synthetic languages built from formal grammars.

Models handled simple cases well, but performance fell sharply as grammars grew larger, fell further with complex verb morphology, and collapsed when the target language used Hebrew script with vowel markings.

https://nyudatascience.medium.com/can-ai-learn-a-language-from-a-textbook-d09e09a79305

06/12/2026

🎓✨ Congratulations to all our newest CDS MS graduates!

We gathered to celebrate the class with toasts, speeches, and well-earned recognition.

CDS Clinical Professor of Data Science and Psychology Pascal Wallisch and CDS Director of Graduate Studies, MS & Associate Professor of Music Technology and Data Science Brian McFee addressed the class.

MS student leader Vidhi Manek, co-president of WiDS and VP of GCBG and a graduating student herself, also spoke to her classmates.

The field is lucky to have you. 🥂

06/11/2026

🎓✨ Hats off to our newest PhDs!

CDS hosted an in-person graduation lunch reception to celebrate students graduating in May and August 2026.

Faculty guests joined the graduates for two hours of food, toasts, and well-earned recognition at a restaurant near campus.

Congratulations to everyone closing this chapter — we can't wait to see what you do next. 🥂

06/10/2026

Congrats to the ten CDS students who were recently honored at the 2026 Courant Student Prizes and Fellowships Celebration.

CDS students Geri Bakushi and Rafael Mateus Carrion earned the Pathbreaker Scholarship.

The Academic Achievement Award for Master’s Students went to CDS master’s students Harshit Bhargava, Zijin Hu, Hanzhe Wu, and Trueman Wu.

CDS undergraduate students Dingyu Fu, Jack Tinker, Selena Zhang, and Sifan ‘Silvia’ Zheng received the Academic Achievement Award for Undergraduate Students.

Location

New York

Contact the school

Telephone

+12129983401

Website

http://cds.nyu.edu/

Address

60 5th Avenue
New York, NY
10011