Enhancing Visual Digital Display through Computer Vision Techniques

The UK screen industry is turning to computer vision to address a long-standing problem: measuring and improving diversity on screen. The technology can help quantify on-screen representation and inform how characters are portrayed, offering a more objective, scalable, and fine-grained evaluation across large volumes of screen content.

Computer vision techniques, such as object recognition, pose estimation, and character detection, can identify characters in scenes and extract relevant attributes like screen time, visual prominence, and interactions with other characters. For instance, algorithms that detect bounding boxes around people and estimate keypoints (body parts) enable detailed analysis of character poses and screen positioning. Combined with appropriate classification models, this data can be aggregated to measure the proportion of screen time given to characters from different demographic groups, defined for example by ethnicity, gender, or age.
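The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual method: the function name `screen_share`, the tuple layout of each detection, and the group labels "A" and "B" are all hypothetical, and the labels are assumed to come from a separate classifier or from manual annotation. Screen time is approximated as the share of frames in which a group appears, and visual prominence as the fraction of the frame covered by the bounding box.

```python
from collections import defaultdict


def screen_share(detections, frame_width, frame_height):
    """Aggregate per-frame person detections into per-group metrics.

    detections: iterable of (frame_index, group_label, (x1, y1, x2, y2)).
    The group labels are assumed to be supplied by an upstream classifier
    or by human annotators; this function only counts and averages.
    """
    frame_area = frame_width * frame_height
    frames_seen = defaultdict(set)       # group -> frames in which it appears
    area_fractions = defaultdict(list)   # group -> box area / frame area
    all_frames = set()

    for frame, group, (x1, y1, x2, y2) in detections:
        all_frames.add(frame)
        frames_seen[group].add(frame)
        box_area = max(0, x2 - x1) * max(0, y2 - y1)
        area_fractions[group].append(box_area / frame_area)

    # Denominator: frames that contain at least one detected person.
    total_frames = len(all_frames)
    return {
        group: {
            "screen_time_share": len(frames_seen[group]) / total_frames,
            "mean_prominence": sum(area_fractions[group]) / len(area_fractions[group]),
        }
        for group in frames_seen
    }


# Hypothetical detections for a four-frame clip at 100 x 100 pixels.
EXAMPLE = [
    (0, "A", (0, 0, 50, 100)),
    (0, "B", (50, 0, 100, 50)),
    (1, "A", (0, 0, 100, 100)),
    (2, "B", (0, 0, 25, 100)),
    (3, "A", (10, 10, 60, 60)),
]

metrics = screen_share(EXAMPLE, 100, 100)
for group, values in sorted(metrics.items()):
    print(group, values)
```

In this toy clip, group "A" appears in three of the four frames and group "B" in two. A real pipeline would also need to track individual characters across frames and to decide how to treat frames with no people in them, both of which are left out here.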

Moreover, vision-language models (VLMs) that integrate image understanding with textual context can enhance semantic analysis, such as identifying dialogue prominence or role importance for characters portrayed. This supports evaluating not only how often characters appear visually but also their narrative significance.

Beyond measurement, computer vision aids in improving diversity by providing transparent, interpretable metrics. Techniques like Grad-CAM visualize which parts of an image contribute most to identification or classification, helping stakeholders understand model decisions and address ethical concerns around bias and fairness in evaluations. Furthermore, combining vision with metadata and manual annotation helps ensure ethically responsible use and contextual accuracy when assessing diversity.
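To make the Grad-CAM idea concrete, the core computation can be sketched without any deep learning library. The sketch assumes that the feature maps of a convolutional layer and the gradients of the class score with respect to those maps have already been extracted from a model; the function name `grad_cam` and the toy numbers are illustrative only. Each channel is weighted by its average gradient, the weighted maps are summed, and negative values are clipped to zero. The final rescaling to the range 0 to 1 is a common display convention rather than part of the technique itself.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from precomputed layer outputs.

    activations: K feature maps, each an H x W nested list.
    gradients:   same shape; gradients[k][i][j] is the derivative of the
                 class score with respect to activations[k][i][j].
    Both inputs are assumed to come from a model's convolutional layer.
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of each channel's gradients.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Rescale to [0, 1] for display (skipped if the map is all zeros).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input: two channels of 2 x 2 feature maps.
toy_activations = [
    [[1, 0], [0, 2]],
    [[0, 3], [1, 0]],
]
toy_gradients = [
    [[1, 1], [1, 1]],      # channel 0 supports the class
    [[-1, -1], [-1, -1]],  # channel 1 counts against it
]

heatmap = grad_cam(toy_activations, toy_gradients)
print(heatmap)
```

In practice the heatmap is upsampled to the size of the input frame and overlaid on it, so a reviewer can see whether a classification was driven by the person in shot or by something irrelevant in the background.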

The UK screen industry can leverage these tools to create standardized, scalable frameworks for ongoing monitoring of on-screen diversity that are more reliable and less subjective than manual methods. This fosters accountability and enables targeted interventions to improve diverse representation and prominent portrayal of underrepresented groups within media content.

This revolutionary approach to diversity evaluation is the output of a pilot project conducted by Nesta in partnership with Learning on Screen. The project team, including Raphael Leung, Bartolomeo Meletti, Dr Cath Sleeman, Gabriel A. Hernández, and Gil Toffell, has been instrumental in bringing this research to light. Their work builds upon a presentation given by Raphael and Bartolomeo at the International Federation of Television Archives (FIAT/IFTA) Conference 2020 and a short paper by Raphael for the IJCAI 2021 AI for Social Good workshop.

As media continues to evolve, the UK screen industry is well placed to lead in promoting diversity and representation through the use of computer vision. The technology offers a promising response to a problem the industry has struggled with for a long time, making diversity evaluation more objective, more scalable, and more fine-grained than manual methods allow.

The same techniques have potential uses beyond screen content. Data extracted through computer vision can be used to evaluate the diversity of characters in educational materials, providing evidence for improving representation across demographic groups. The fashion and beauty industries can analyze their advertising campaigns for diversity to check whether representation is equitable and inclusive. Policy researchers can use data obtained from computer vision to study patterns and trends in how different skills and talent are represented across industries. In creative content development, computer vision can help AI systems better understand and portray diverse characters, which may improve the quality of stories and audience engagement. In the arts, artists can use computer vision to analyze their collections, identify underrepresented groups, and work toward more inclusive representation in their work. Outside media and the arts altogether, environmental organizations can use computer vision in research to monitor biodiversity and wildlife populations.