AI text-to-image generators have come a long, arguably troubling way in a very short period of time, but there’s one piece of human anatomy they still can’t quite grasp: hands. Speaking with BuzzFeed earlier this year, Amelia Winger-Bearskin, an artist and associate professor of AI and the arts at the University of Florida, explained that, until recently, AI programs largely haven’t been sure what, exactly, a “hand” is. “Hands, in images, are quite nuanced,” she said at the time. “They’re usually holding on to something. Or sometimes, they’re holding on to another person.” While there have been some advances in the past few months, there’s still sizable room for improvement.
Although that might sound odd at first, a glance at our appendages’ complexities quickly reveals why. Unless a model can nail numerous points of articulation, varieties of poses, skin wrinkles, veins, and countless other precise details, renderings of hands can rapidly devolve into an uncanny valley of weirdness and inaccuracy. What’s more, AI programs simply don’t have as many large, high-quality images of hands to learn from as they do faces and full bodies. But as AI still contends with this—often with extremely puzzling, ludicrous, and outright upsetting results—programmers at the University of Science and Technology of China in Hefei are working on a surprisingly straightforward solution: train an AI specifically to study and improve hand generation.
In a recently published research paper, the team details how they eschewed the more common diffusion approach to image generation in favor of what are known as neural radiance fields, or NeRFs. As New Scientist notes, this 3D modeling technique relies on neural networks, and has previously been used by both Google Research and Waymo to create seamless, large-scale cityscape models.
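To make the idea concrete, here is a toy sketch of what a radiance field does; this is not the paper's code, and every name and number in it is invented for illustration. A small network maps a 3D point and viewing direction to a color and a density, and a pixel's color comes from compositing samples along a camera ray:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny "network": one random hidden layer standing in
# for a trained model.
W1 = rng.normal(size=(6, 32)) * 0.5
W2 = rng.normal(size=(32, 4)) * 0.5

def field(points, view_dir):
    """Map (x, y, z) plus viewing direction to (r, g, b, density)."""
    x = np.concatenate([points, np.broadcast_to(view_dir, points.shape)], axis=-1)
    h = np.tanh(x @ W1)
    out = h @ W2
    rgb = 1 / (1 + np.exp(-out[..., :3]))   # colors squashed into [0, 1]
    sigma = np.log1p(np.exp(out[..., 3]))   # non-negative density
    return rgb, sigma

def render_ray(origin, direction, n_samples=64, near=0.0, far=2.0):
    """Volume-render one ray: alpha-composite colors weighted by density."""
    t = np.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction
    rgb, sigma = field(pts, direction)
    delta = (far - near) / n_samples
    alpha = 1 - np.exp(-sigma * delta)       # chance the ray "stops" at each sample
    trans = np.cumprod(1 - alpha + 1e-10)    # light surviving to each sample
    trans = np.concatenate([[1.0], trans[:-1]])
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)

color = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]))
```

Training a real NeRF means adjusting the network weights until rays rendered this way match the pixels of many photographs taken from known camera positions.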
“By introducing the hand mapping and ray composition strategy into [NeRF], we make it possible to naturally handle interaction contacts and complement the geometry and texture in rarely-observed areas for both hands,” reads a portion of the paper’s abstract, which adds that the team’s “HandNeRF” program works with both a single hand and two interacting hands. In this updated process, multi-view images of a hand or hands are first fed to an “off-the-shelf skeleton estimator” that parameterizes hand poses from the inside. The researchers then apply deformation fields via the HandNeRF program, which generates images of our upper appendages that are more lifelike in shape and surface.
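As a loose illustration of that pipeline, here is a minimal sketch; the function bodies, names, and array shapes below are invented stand-ins, not the authors' method. A pose estimate comes from the multi-view images, and a deformation field warps sampled 3D points from the posed hand back into one shared canonical space where the radiance field is evaluated:

```python
import numpy as np

def estimate_pose(images):
    """Stand-in for an off-the-shelf skeleton estimator: here it just
    collapses the views into a fake 3-number pose vector."""
    return np.tanh(np.mean(images, axis=(0, 1, 2)))

def deformation_field(points, pose):
    """Toy deformation: shift points by a pose-dependent offset so the
    posed hand maps into a canonical coordinate frame."""
    offset = 0.1 * pose[:3]
    return points - offset

multi_view = np.random.rand(4, 8, 8, 3)  # four tiny fake camera views
pose = estimate_pose(multi_view)
observed = np.random.rand(100, 3)        # sample points along camera rays
canonical = deformation_field(observed, pose)
```

The payoff of working in a canonical space is that every pose of the hand contributes training signal to the same underlying model, which is how rarely-observed regions can borrow geometry and texture from better-observed ones.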
Although NeRF imaging is difficult to train and can’t generate whole text-to-image results by itself, New Scientist also explains that combining it with diffusion tech could provide a novel path forward for AI image generation. Until then, however, most programmers will have to figure out ways to work around AI’s poor grasp—so to speak—of the human hand.