On Friday, February 9th, 2024, the SMU campus welcomed Pete H. Smith Jr., PhD, for a discussion of the cultural shortcomings of Large Language Model (LLM) technology and how we might work to better represent global values and traditions within it.
Dr. Smith received his PhD in foreign language education from the University of Texas at Austin. Before his doctoral work, he studied general arts and sciences and Russian language translation, then earned a master's degree in Slavic languages and literature. Today, he serves as the Chief Analytics and Data Officer at the University of Texas at Arlington, where he is a professor of modern languages and co-founder of the Learning Innovation and Networked Knowledge (LINK) research laboratory in learning analytics. His multidisciplinary background offers keen insight into the issue of cultural bias in LLM technology.
Dr. Smith begins his presentation by defining large language models, exemplified by well-known systems such as ChatGPT, BERT, and Google Translate, as neural network models that use large mathematical matrices to represent text as numerical data. Once text is represented numerically, these systems analyze it using predictive methods such as next-sentence prediction and masked-word prediction, in which the model calculates which words are most likely to complete or follow a given passage.
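To make masked-word prediction concrete, here is a minimal sketch using the Hugging Face transformers library and the bert-base-uncased checkpoint; the library, model, and example sentence are illustrative choices and were not part of the talk.

```python
# A minimal sketch of masked-word prediction, one of the predictive methods
# Dr. Smith describes. The library, model checkpoint, and sentence below are
# illustrative assumptions, not examples from the presentation.
from transformers import pipeline

# The fill-mask pipeline scores candidate words for the [MASK] position.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("A traditional wedding gift is a [MASK]."):
    # Each prediction carries the candidate word and its probability score.
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")
```

The candidates the model ranks highest reflect whatever associations dominate its training data, which is exactly where the cultural bias Dr. Smith describes enters the picture.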
Dr. Smith shares that, as this technology advances, each year adds billions of parameters to currently available models. Despite this, users still encounter frequent misinterpretations and factual inaccuracies in LLM-generated answers. These models often struggle with long-term recall and interpretability, and their answers are prone to cultural bias against minority groups.

Dr. Smith explains that these models develop in environments rife with bias. Because they are trained primarily on English-language text by companies that often lack the resources and scope to draw on an inclusive, contemporary breadth of data, bias is an unintentional yet inherent consequence of their design. The limited political and sociocultural contexts a model is familiar with are bound to sit at the center of its understanding and inform everything it takes in and produces, regardless of how far its input departs from its training data.
Dr. Smith relays the growing concern among experts, such as Safiya Noble, PhD, that this bias, coupled with the pervasiveness of these models, is a human-rights concern, describing the tendency toward bias and its consequences as algorithmic injustice. Others, such as Emily Bender, PhD, are quick to emphasize that models merely predict what should come next and cannot functionally mimic the inner workings of the human mind, although we treat them as if they can. Both women are key voices in the discourse surrounding LLMs.

The majority of LLMs (approximately 60%) are trained for an English-speaking audience; in response, many countries, such as Germany and Japan, are racing to develop their own versions of the technology. With this in mind, Dr. Smith explores a pivotal question: how do large language models interact with non-English data?
Not well, according to Dr. Smith. Research consistently shows that English-trained LLMs are slow to shift perspective and to appropriately interpret values that differ from those they were trained on. Even so, Dr. Smith notes that these models, which rely so heavily on data for training and advancement, become more open to new ideas as they are exposed to a wider range of diverse prompts.
Looking to the future, Dr. Smith believes that cultural prompting, the practice of crafting a prompt that encourages an LLM to consider specific cultural values when generating its response, has the potential to retrain existing models to reflect the sociocultural diversity of their users. As more countries ramp up the development of their own LLMs, questions remain about how best to formulate an inclusive and representative model for an ever-shifting world. Dr. Smith concluded his presentation by encouraging the audience to think critically about the cultural implications of LLMs, opening the floor to questions.
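For readers curious what cultural prompting might look like in practice, here is a minimal, hypothetical sketch: the same question is wrapped in an explicit cultural frame before being sent to a model. The function, wording, and example cultures are illustrative assumptions, not a method prescribed by Dr. Smith.

```python
# A minimal sketch of cultural prompting: the same question is prefaced with
# an explicit cultural frame so the model is nudged to answer from that
# perspective. The framing text and example cultures are illustrative only.
def cultural_prompt(question: str, culture: str) -> str:
    return (
        f"You are answering for a reader in {culture}. "
        f"Ground your answer in the values, customs, and everyday context "
        f"of that culture rather than a default English-speaking perspective.\n\n"
        f"Question: {question}"
    )

if __name__ == "__main__":
    base_question = "What is an appropriate gift to bring to a dinner host?"
    for culture in ["Japan", "Germany", "Nigeria"]:
        # The framed prompt can then be sent to any chat-style LLM.
        print(cultural_prompt(base_question, culture))
        print("-" * 40)
```

Comparing a model's responses with and without such framing is one simple way to see how strongly its default answers lean toward the cultural context of its training data.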
If you are interested in collaborating with peers in technology-enhanced learning, immersive learning, and AI/machine learning spaces, join us for our next TEIL Seminar on Friday, March 8th, from 12 p.m. to 1:30 p.m. in Harold Clark Simmons Hall, Room 116, and on Zoom. For more information, visit our website at www.smu.edu/teil.
Written by Ainsley Johnson, Research Assistant with the Center for Global Health Impact and the Institute for Leadership Impact.