Source: http://www.mashable.com
OpenAI's newest generative AI system, Point-E, can create complex, original 3D models from text prompts, following DALL-E and ChatGPT, the earlier OpenAI releases that wowed the internet.
OpenAI—the artificial intelligence startup backed by Elon Musk and responsible for the popular DALL-E text-to-image generator—has announced its newest image-making system, which generates 3D point clouds directly from text prompts. For comparison, existing systems such as Google's DreamFusion can take several hours and multiple GPUs to produce a result that Point-E generates in just a minute or two.
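A point cloud is simply an unordered set of 3D points, each with coordinates and typically a color. The sketch below shows one minimal way to represent such an output; the point count and field names are illustrative assumptions, not Point-E's actual format.

```python
import numpy as np

# Illustrative point-cloud representation (not Point-E's real output format).
num_points = 4096

# XYZ coordinates, one row per point.
coords = np.random.rand(num_points, 3)

# One RGB color per point, as 8-bit channel values.
colors = np.random.randint(0, 256, size=(num_points, 3), dtype=np.uint8)

point_cloud = {"coords": coords, "colors": colors}
print(point_cloud["coords"].shape)  # (4096, 3)
```

Because a point cloud is just an array of points rather than a surface mesh, it is cheap to generate but usually needs a separate meshing step before use in games or film pipelines.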
A wide range of fields and endeavors make use of 3D modeling. Modern blockbuster films rely on it for their computer-generated imagery (CGI), as do video games and virtual and augmented reality (VR/AR) experiences.
OpenAI states that the text-to-image model was trained on a massive database of text and image pairs, so it can respond to a wide variety of prompts. The image-to-3D model, by contrast, was trained on a smaller dataset of images paired with 3D models; according to OpenAI, it drew on millions of 3D objects.
OpenAI goes on to explain what sets Point-E apart from other text-to-3D generative models. Models that build on pre-trained text-to-image models, for instance, require costly optimization procedures to produce each 3D representation, consuming substantial compute time and memory. Models trained on paired text and 3D data face scaling challenges of their own, since large paired datasets are scarce.
To create point clouds from text, Point-E uses a three-stage technique. The first stage generates a synthetic image from the text prompt. In the next stage, a coarse point cloud is produced from that synthetic image. The final stage creates a higher-resolution point cloud from the low-resolution one together with the synthetic image.
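The three stages above can be sketched as a simple data-flow pipeline. None of the functions below correspond to Point-E's actual API; they are hypothetical stubs whose shapes merely illustrate how each stage's output feeds the next (text → synthetic image → coarse cloud → upsampled cloud).

```python
import numpy as np

def text_to_image(prompt: str, size: int = 64) -> np.ndarray:
    """Stage 1 (stub): a text-conditioned model renders a synthetic view."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.random((size, size, 3))  # H x W x RGB

def image_to_coarse_cloud(image: np.ndarray, n_points: int = 1024) -> np.ndarray:
    """Stage 2 (stub): an image-conditioned model emits a low-res point cloud."""
    rng = np.random.default_rng(0)
    return rng.random((n_points, 3))  # N x XYZ

def upsample_cloud(coarse: np.ndarray, image: np.ndarray,
                   n_points: int = 4096) -> np.ndarray:
    """Stage 3 (stub): an upsampler densifies the coarse cloud,
    conditioned on both the coarse cloud and the synthetic image."""
    reps = n_points // coarse.shape[0]
    jitter = np.random.default_rng(1).normal(scale=0.01, size=(n_points, 3))
    return np.repeat(coarse, reps, axis=0) + jitter

image = text_to_image("a red traffic cone")
coarse = image_to_coarse_cloud(image)
final = upsample_cloud(coarse, image)
print(coarse.shape, final.shape)  # (1024, 3) (4096, 3)
```

Splitting generation into a cheap coarse pass plus an upsampling pass is what lets this style of pipeline avoid the per-sample optimization loop that slower text-to-3D systems rely on.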