As with any other information source, you need to evaluate the information you get from a generative AI tool.
Here are some questions to consider when evaluating output from generative AI tools:
You might notice that these questions are difficult (or sometimes even impossible) to answer when using generative AI tools. You will have to decide how this affects whether, and how, you use the information you get from these tools.
Beyond bias on individual points of fact, large language models reflect the language, attitudes, and perspectives of the creators of their training data. Thus, the style of language, the types of ideas expressed, and even the conclusions the LLM reaches reflect those creators, not a general "universal" human.
“White supremacist and misogynistic, ageist, etc., views are overrepresented in the training data, not only exceeding their prevalence in the general population but also setting up models trained on these datasets to further amplify biases and harms.”
Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”, p. 613.
This is true of gender and other demographic characteristics, and also of geography: the vast majority of training data comes from the Global North.
The image below, from a recent paper, shows the locations of place names found in the Llama-2-70b LLM. Europe, North America, and Asia are fairly well represented, while Africa and South America are nearly absent. As a result, the language model may reflect the attitudes and cultural assumptions of people in the well-represented regions far more consistently.
Gurnee, W., & Tegmark, M. (2023). Language Models Represent Space and Time (arXiv:2310.02207). arXiv. https://doi.org/10.48550/arXiv.2310.02207