Engineering:Synthography

Short description: Method of generating media using machine learning
A synthographic image created with a fine-tuned Stable Diffusion model and outpainting

Synthography is a proposed term for the generation of digital photographs using a generative adversarial network. City University of Hong Kong photography professor Elke Reinhuber coined the term in her 2021 conference paper "Synthography – An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation as a New Genre Beyond Photography," writing: "As soon as a generative adversarial network (GAN) is trained to create an image which resembles a photograph, I propose to better describe it with the term synthograph."[1] Synthography is distinct from other graphic creation and editing methods in that it relies on artificial intelligence art techniques, in particular text-to-image models, to generate synthetic media. It is commonly achieved through prompt engineering, in which text descriptions are supplied as input to create or edit a desired image.[2][3]

Text-to-image models, together with the algorithms and software built around them, are the principal tools of synthography; they are designed to turn human input into artificial intelligence art with technical proficiency. Synthography typically uses these models to synthesize new images derived from the training, validation, and test data sets on which the models were trained.
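
A minimal sketch of this workflow is shown below, using the open-source Hugging Face diffusers library as an assumed toolkit; the checkpoint name, prompt, and generation parameters are illustrative placeholders rather than a prescribed setup.

    # Sketch: prompt-driven image synthesis with a text-to-image diffusion model.
    # Assumes the "diffusers" and "torch" packages are installed and a GPU is
    # available; the checkpoint name and prompt are illustrative placeholders.
    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pretrained Stable Diffusion checkpoint.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    # The text prompt is the synthographer's primary creative control.
    prompt = "a studio product photograph of a ceramic coffee mug, soft window light"
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save("synthograph.png")

In this sketch the resulting image is a synthograph in Reinhuber's sense: it resembles a photograph but is synthesized entirely from the model's learned representation of its training data.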

Synthography refers to the method used, not the output itself. Outputs created specifically by generative machine learning models (as opposed to the broader category of artificial intelligence art) are referred to as synthographs.[1] Those who practice synthography are referred to as synthographers.[4][5] A synthographer harnesses linguistic composition, through prompts, to steer a generative model. In other cases, a synthographer fine-tunes a model on a dataset to expand its creative possibilities, as sketched below.
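
A hedged sketch of the fine-tuning route follows. It assumes that lightweight LoRA weights have already been trained on a custom dataset (for example with the diffusers training scripts) and shows how such weights might be applied at inference time; the weight path and prompt are hypothetical placeholders.

    # Sketch: applying fine-tuned LoRA weights to a base text-to-image model.
    # Assumes "./my_style_lora" contains LoRA weights trained separately on a
    # custom dataset; the path and prompt are hypothetical placeholders.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    # Bias generations toward the subject or style learned during fine-tuning.
    pipe.load_lora_weights("./my_style_lora")

    image = pipe("a portrait photograph in the fine-tuned style").images[0]
    image.save("fine_tuned_synthograph.png")

The same pattern applies when a full checkpoint, rather than a LoRA adapter, has been fine-tuned: the new checkpoint is simply passed to from_pretrained in place of the base model.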

Practical uses of synthography include AI-driven product shots, stock photography, and even magazine covers, with some predicting that synthography may be the future of photography.[6]

Etymology

Reinhuber chose the term "synthography" by combining the terms "synthetic" and "photography," proposing its use to describe images that use "the methodology of synthetic production" and that move "beyond the classic understanding of photography."[1]

History

The broad usage of text-to-image models began with OpenAI's publication of DALL-E in January 2021.[7] While DALL-E itself was not released to the public, the accompanying CLIP (Contrastive Language-Image Pre-training) model was open-sourced, which led to a succession of implementations pairing it with other generators such as generative adversarial networks and diffusion models.[8][9] The next major event, which led to a rise in the technique's popularity, was the release of DALL-E 2 in April 2022; after a gradual rollout as a private beta, it became public in July 2022. In August 2022, Stable Diffusion was open-sourced by Stability AI,[10] which fostered a community-led movement.[citation needed]

Difference between Synthography and Artificial Intelligence Art

Venn diagram depicting the relation of artificial intelligence (artificial intelligence art) to generative models (synthography)

Synthography is the method of creating synthetic media with generative models. Artificial intelligence art (which also spans music, cooking, and video game level design) is the output created using artificial intelligence, an increasingly broad category.[citation needed]

When Elke Reinhuber coined the term synthography in her paper "Synthography – An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation," she cited a "legitimation crisis" as the motivation for the term. Before generative models came into use, artificial intelligence algorithms were already present in media tools such as graphics editing software (e.g., content-aware fill, application of artistic styles, resolution enhancement) and DSLR and smartphone cameras (e.g., object recognition, in-camera focus stacking, low-light machine learning algorithms), all of which continue to undergo rapid development.[1]

Artificial intelligence is a superset of machine learning; machine learning is a superset of neural networks; and neural networks are a superset of generative models such as GANs (generative adversarial networks) and diffusion models. The relation between these is depicted in the Venn diagram shown here. Synthography specifically uses generative models, as popularized by software such as DALL-E, Midjourney, and Stable Diffusion.[citation needed]

References

  1. Reinhuber, Elke (2 December 2021). "Synthography – An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation as a New Genre Beyond Photography". Google Scholar. https://scholar.google.com/citations?view_op=view_citation&hl=en&user=cjLjVk8AAAAJ&citation_for_view=cjLjVk8AAAAJ:hC7cP41nSMkC. 
  2. Smith, Thomas (26 October 2022). "What is Synthography? An Interview With Mark Milstein - Synthetic Engineers". Synthetic Engineers. https://syntheticengineers.com/2022/10/26/what-is-synthography-an-interview-with-mark-milstein/. 
  3. Oosthuizen, Megan (20 December 2022). "Artist Shows Us What A Live-Action Movie Could Look Like". Fortress Entertainment. https://www.fortressofsolitude.co.za/ducktales-artist-shows-us-what-a-live-action-movie-could-look-like/. 
  4. Ango, Stephan (3 July 2022). "A Camera for Ideas". https://stephanango.com/synthography. 
  5. Growcoot, Matt (17 March 2023). "AI Photographers or 'Synthographers'". PetaPixel. https://petapixel.com/2023/03/17/are-ai-photographers-or-synthographers-a-thing-now/. 
  6. Katz, Neil (8 March 2023). "Synthography is the Future of Photography". Meteor. https://www.thisismeteor.com/synthography-kills-photography/. 
  7. Underwood, Ted (21 October 2021). "Mapping the latent spaces of culture". tedunderwood.com. https://tedunderwood.com/2021/10/21/latent-spaces-of-culture/. 
  8. Steinbrück, Alexa (3 August 2021). "VQGAN+CLIP - How does it work?". medium.com. https://alexasteinbruck.medium.com/vqgan-clip-how-does-it-work-210a5dca5e52. 
  9. Smith, Ethan. "A Traveler's Guide to the Latent Space". https://sweet-hall-e72.notion.site/A-Traveler-s-Guide-to-the-Latent-Space-85efba7e5e6a40e5bd3cae980f30235f#8ebdd89267cd42d0a1b8f3bd3297fd10. 
  10. Roose, Kevin (21 October 2022). "A Coming-Out Party for Generative A.I., Silicon Valley's New Craze". https://www.nytimes.com/2022/10/21/technology/generative-ai.html.