Why is AI painting advancing by leaps and bounds? From its history to the recent technical breakthroughs: the development of popular AI painting in one article

2022-09-21 14:18:08 see Metaverse

Note: this article originally appeared on the WeChat public account Web3天空之城 (ID: Web3SkyCity), author: 城主.
Reposted from 钛媒体 (TMTPost).

Preface: Some time ago the author was, quite by chance, stunned by the current level of AI painting (see the earlier article about an exceptional AI painting tool and the 234 beautiful futuristic warriors in armor created with it), and came to feel that today's rapid progress in AI painting may have far exceeded everyone's expectations. The causes and consequences, including the history of AI painting and the recent breakthroughs, are worth sorting out and sharing. Hence this article.

This article is divided into the following sections:

  1. 2022: the advance of AI painting
  2. The history of AI painting
  3. How AI painting advanced by leaps and bounds
  4. Top AI painting models head-to-head
  5. What the breakthrough in AI painting means for humanity


2022: the advance of AI painting


Since the start of this year, AI painting tools that automatically generate an image from a text description have suddenly sprung up one after another.

First came Disco Diffusion.

Disco Diffusion is an AI image generator that became popular in early February this year. It renders an image that matches keywords describing a scene.

In April this year, the well-known AI team OpenAI released its new model, DALL·E 2. The name combines the famous painter Dalí and the robot WALL-E, and it too generates good images from text descriptions.

For many readers, though, special attention to AI painting may have begun with the news about one particular AI-made work:

It is a digital oil painting generated with the AI painting service MidJourney. The user who generated it entered it in the art competition of the Colorado State Fair and won first place. Once this came to light, it set off a huge debate online.

AI painting technology is still evolving, and iterating so fast that "changing by the day" is a fair description. Even comparing AI painting from early this year with what exists now, the results are worlds apart.

At the start of the year, Disco Diffusion could generate some very atmospheric sketches but was basically unable to generate faces; just two months later, DALL-E 2 could already generate accurate facial features; and now the most powerful model, Stable Diffusion, has improved both the refinement of the image and the speed of generation by an order of magnitude.

AI painting is not a technology that appeared only in recent years, but since the start of this year the quality of AI output has improved at a pace visible to the naked eye, and generation time has dropped from about an hour at the start of the year to a little over ten seconds now.

What exactly happened behind this change? Let us first review the history of AI painting in full, and then make sense of the breakthroughs of the past year or so in AI painting technology, breakthroughs significant enough to go down in history.


The history of AI painting


AI painting may have appeared earlier than many people think.

Computers appeared in the 1960s, and as early as the 1970s an artist, Harold Cohen (a painter and professor at the University of California, San Diego), began building a computer program called "AARON" to make paintings. Unlike today's AI painting, which outputs digital works, AARON actually controlled a robotic arm to paint.

Harold kept improving AARON for decades, until his death. In the 1980s, AARON "mastered" the rendering of three-dimensional objects; in the 1990s, AARON became able to paint in multiple colors; reportedly, AARON is still creating to this day.

However, AARON's code is not open source, so the details of how it paints are unknown. One can guess, though, that AARON merely encodes, in a complex programmatic way, Harold's own understanding of painting. That would explain why, after decades of iteration, AARON could still only produce colorful abstract paintings, which is precisely Harold Cohen's own abstract, colorful style. Harold spent decades putting his own understanding and expression of art onto canvas through a program-guided robotic arm.

(Left: AARON and Harold Cohen. Right: a work created by AARON in 1992)

It is hard to say how intelligent AARON really is, but as the first program to paint automatically, and to paint on a real canvas at that, it deserves the title of ancestor of AI painting.

In 2006, a computer painting product similar to AARON appeared: The Painting Fool. It can look at a photo, extract block color information from it, and create using simulated real-world materials such as paint, pastels, or pencil.

Both of the examples above are relatively "classical" approaches to automatic computer painting, a bit like a toddler: there is some resemblance to the real thing, but in terms of intelligence they are quite rudimentary.

What we now call "AI painting" mostly refers to computer programs that draw automatically based on deep learning models. This approach actually developed relatively late.

In 2012, two famous AI experts at Google, Andrew Ng and Jeff Dean, ran an unprecedented experiment: together they used 16,000 CPUs to train what was then one of the world's largest deep learning networks, in order to guide a computer to draw a picture of a cat's face. They used 10 million cat-face images from YouTube, the 16,000 CPUs trained for a full 3 days, and the resulting model, excitingly, could generate a very blurry cat face.

Seen today, that model's training efficiency and output are hardly worth mentioning. But for AI research at the time, it was a breakthrough attempt that formally opened up the "brand-new" research direction of AI painting backed by deep learning models.

Here we go into a little technical detail: how hard is it for AI to paint using deep learning models, and why, in 2012, did days of training on a large computer cluster that was already quite modern produce only such a poor result?

Readers probably have a basic idea already: training a deep learning model is, put simply, the process of feeding in large amounts of externally labeled training data and iteratively adjusting the model's internal parameters so that each input yields its expected output.

So teaching AI to paint is the process of building training data from existing paintings, feeding it into the AI model, and iteratively adjusting the model's parameters.
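To make "iteratively adjusting the internal parameters" concrete, here is a minimal sketch in Python. It is not how any real painting model is trained: it assumes a toy model with just two parameters (`w` and `b`, names chosen for illustration) and labeled pairs generated from the rule y = 3x + 1, and it uses plain gradient descent as the adjustment rule.

```python
# Toy illustration of training: repeatedly nudge the parameters so that each
# labeled input produces something closer to its expected output.
def train(samples, steps=2000, lr=0.01):
    w, b = 0.0, 0.0  # the model's internal parameters, initially untrained
    n = len(samples)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in samples:
            error = (w * x + b) - y      # prediction minus expected output
            grad_w += 2 * error * x / n  # gradient of the mean squared error
            grad_b += 2 * error / n
        w -= lr * grad_w                 # adjust parameters against the gradient
        b -= lr * grad_b
    return w, b

# Labeled training data: inputs paired with the outputs we expect.
samples = [(x, 3.0 * x + 1.0) for x in range(-5, 6)]
w, b = train(samples)
```

A real image model follows the same loop in spirit, but with millions or billions of parameters instead of two, which is where the training cost described in this section comes from.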

How much information does a painting carry? First of all, width x height RGB pixels. The simplest starting point for teaching a computer to paint is to obtain an AI model whose output is a patterned combination of pixels.

But not every combination of RGB pixels is a painting; it may just be noise. A painting with rich texture and natural brushwork takes many strokes to complete, involving parameters such as the position, shape, and color of each stroke, and the number of parameter combinations is enormous. And the computational complexity of training a deep model grows sharply as the combinations of input parameters grow... You can see why this is not easy.

After Andrew Ng and Jeff Dean's pioneering cat-face generation model, AI scientists entered this new and challenging field one after another. In 2014, AI academia proposed a very important deep learning model: the famous generative adversarial network, GAN (Generative Adversarial Network).

As the name "adversarial generation" suggests, the core idea of this deep learning model is to pit two internal programs, a "generator" and a "discriminator", against each other and take the result once the two reach equilibrium.

GAN swept AI academia as soon as it appeared and has been widely applied in many fields. It also became the basic framework of many AI painting models, with the generator used to generate images and the discriminator used to judge image quality. The emergence of GAN greatly advanced AI painting.

However, AI painting with a basic GAN model has obvious flaws. One is weak control over the output: it tends to produce random images, whereas an AI artist's output should be stable. Another is that the generated images have relatively low resolution.

The resolution problem is manageable, but GAN has a dead knot when it comes to "creation", and that knot is precisely its own core feature: in the basic GAN architecture, the discriminator judges whether a generated image belongs to the same category as the other images it has been shown, which means that even in the best case the output is an imitation of existing works rather than something new...

Beyond generative adversarial networks, researchers also began using other kinds of deep learning models to try to teach AI to paint.

A well-known example is Deep Dream, an image tool Google released in 2015. Deep Dream produced a series of paintings that attracted a lot of attention, and Google even curated an exhibition of Deep Dream works.

Strictly speaking, though, Deep Dream is less AI painting than an advanced AI filter; its filter style is clear from a single look at its works.

Compared with Deep Dream, whose works may or may not count as embarrassing, something more solid from Google was a 2017 model trained on thousands of hand-drawn sketches, through which AI learned to draw simple line sketches. (Google, "A Neural Representation of Sketch Drawings")

One reason this model received so much attention is that Google open-sourced the code, so third-party developers could build interesting AI sketching applications on it. One online application, called "Draw Together with a Neural Network", lets you draw a few strokes and then has the AI complete the whole figure for you.

Notably, internet giants became the main force in research on AI painting models. Besides Google's work above, a well-known example is Creative Adversarial Networks (CAN), a new model released in July 2017 through a three-way collaboration among Facebook, Rutgers University, and the Art History Department of the College of Charleston. (Facebook, "CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms")

As its collected works show, CAN tries to output images that look like an artist's work and that are unique, rather than imitations of existing artworks.

The creativity shown in CAN's output shocked the researchers at the time, because the works looked quite similar to the abstract paintings popular in the art world. So the researchers organized a Turing-style test, asking viewers to guess whether the works were made by human artists or by artificial intelligence.

As a result, 53% of viewers believed CAN's AI artworks were made by human hands, the first time in history that a Turing test of this kind passed the halfway mark.

But CAN's AI painting was limited to abstract expression and, in terms of artistic rating, was still far from the level of a human master.

As for creating realistic or figurative paintings, that was simply out of reach.

In fact, even in early 2021, when OpenAI released its widely noticed DALL-E system, its painting level was only mediocre; a fox drawn by DALL-E, for example, is barely recognizable.

But notably, with DALL-E, AI began to acquire an important ability: creating from a text prompt!

Next, we return to the question raised at the start of this article. Readers may share the feeling that since the start of this year the level of AI painting has suddenly soared, a genuine leap in quality over earlier works, so that a single day away feels like three autumns.

When something this unusual happens, there must be a reason. What exactly happened? Let us take it step by step.


How AI painting advanced by leaps and bounds


Many sci-fi films and series have a scene like this: the protagonist says something to a particularly futuristic computer AI, and the AI generates a 3D image and presents it in front of the protagonist via VR, AR, or holographic projection.

Strip away the flashy visual effects, and the core capability is this: a human gives input in language, and the computer AI understands what the human expressed, generates a graphic image that meets the request, and shows it to the human.

Think about it carefully, and the most basic form of this capability is the concept of AI painting. (Of course, going from flat paintings to 3D generation is a step further, but compared with the difficulty of AI creating a meaningful painting out of thin air, automatically generating a 3D model from a 2D image is not a problem of the same order of magnitude.) So whether the control is by speech or by some more mysterious brain-wave interface, those cool sci-fi scenes actually describe a single AI capability: automatically turning a "language description" into an image through AI understanding. Since automatic speech-to-text recognition is already extremely mature, this is essentially a text-to-image AI painting process.

It is actually pretty cool: from a text description alone, with no reference image, AI can understand and automatically paint the corresponding content, and paint it better and better! Yesterday this felt a bit far off; now it has truly arrived in front of everyone.

How did all this happen?

The first thing to mention is the birth of a new model. The OpenAI team mentioned earlier open-sourced a new deep learning model, CLIP (Contrastive Language-Image Pre-Training), in January 2021. It is among the most advanced AI models for image classification today.

CLIP trains an AI to do two things at once: natural language understanding and computer vision analysis. It was designed as a powerful tool with a specific purpose, namely general-purpose image classification: CLIP can judge how well an image corresponds to a text prompt, for example matching an image of a cat with the word "cat".

CLIP's training process, put simply, uses already labeled "text-image" training data to train one model on the text and another model on the images, continually adjusting the internal parameters of both so that the feature values the two models output for a text and an image let a corresponding "text-image" pair be confirmed as a match by a simple check.
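The "simple check" at the end of that process can be sketched as follows. This is not CLIP's code or API: the feature vectors below are hand-written, hypothetical stand-ins for what the text and image models would output, and cosine similarity is assumed as the match score. The sketch only shows the matching step, not the training that makes the features line up.

```python
import math

def cosine(a, b):
    """Cosine similarity: close to 1.0 when two feature vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical feature values, standing in for the outputs of the two trained models.
text_features = {"cat": [0.9, 0.1, 0.0], "dog": [0.1, 0.9, 0.1]}
image_features = {"cat.jpg": [0.8, 0.2, 0.1], "dog.jpg": [0.2, 0.7, 0.2]}

def best_caption(image_name):
    """Return the text whose features best match the image's features."""
    img = image_features[image_name]
    return max(text_features, key=lambda t: cosine(text_features[t], img))
```

With these made-up numbers, `best_caption("cat.jpg")` picks "cat" and `best_caption("dog.jpg")` picks "dog". Training is what pushes the real feature values of matching pairs together and of mismatched pairs apart.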

Here is the key point. Others had tried training "text-image" matching models before, but the biggest difference with CLIP is that it scraped about 400 million "text-image" training pairs! With that volume of data, plus an extravagant amount of expensive training time, the CLIP model was finally forged.

Sharp readers will ask: who labeled that many "text-image" pairs? With hundreds of millions of pairs, if the text for each image had to be written by hand, the cost in time and labor would be astronomical. And this is the cleverest thing about CLIP: it uses images that are already widely scattered across the internet!

Images on the internet generally come with all kinds of text descriptions, such as titles, captions, and even user-entered tags, and these naturally become usable training samples. In this particularly clever way, CLIP's training completely avoided the most expensive and time-consuming manual labeling; put differently, internet users all over the world had already done the labeling in advance.

CLIP is powerful, but at first glance it does not seem to have anything to do with artistic creation.

But within days of CLIP's open-source release, some machine learning engineers realized the model could be used for more. Ryan Murdock, for example, figured out how to connect other AI to CLIP to build an AI image generator. As Ryan Murdock said in an interview: "After I played with it for a few days, I realized that I can generate images."

In the end he chose BigGAN, a variant of the GAN model, and published the code as the Colab notebook The Big Sleep.

(Note: Colab Notebook is a very convenient online Python notebook service for interactive programming provided by Google, backed by Google's cloud computing. Users with a little technical know-how can edit and run Python scripts and get the output in a notebook-like web interface. Importantly, these notebooks can be shared.)

The images The Big Sleep created were actually a bit weird and abstract, but it was a good start.

Then the Spanish player [email protected] built on this to release a CLIP+VQGAN version and tutorial, which spread widely on Twitter and drew intense attention from the AI research community and enthusiasts. The person behind that ID is Katherine Crowson, now well known as a computer data scientist.

Previously, generative tools like VQ-GAN could synthesize similar new images after being trained on a large number of images. However, as readers may recall from earlier, GAN-type models cannot by themselves generate a new image from a text prompt, and they are not good at creating genuinely new image content.

The idea of grafting CLIP onto a GAN to generate images is simple and clear:

Since CLIP can compute how well the feature values of any string of text match those of an image, one only needs to link this match check to the AI model responsible for generating images (here, VQ-GAN). The image-generating model then works backward to a feature value that produces a suitable image, one that passes the match check, and the result is a work that fits the text description.
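The "work backward" loop can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `generate` plays the role of an image generator such as VQ-GAN, `match_score` plays the role of CLIP's match check, and the search uses simple numerical gradients rather than whatever the actual notebooks do.

```python
def generate(latent):
    # Hypothetical stand-in for an image generator such as VQ-GAN: maps a
    # latent code to the "image features" that the matcher will look at.
    return [2.0 * latent[0], latent[0] + latent[1]]

def match_score(image_feat, text_feat):
    # Hypothetical stand-in for CLIP's match check: higher means a closer match.
    return -sum((i - t) ** 2 for i, t in zip(image_feat, text_feat))

def guided_generate(text_feat, steps=500, lr=0.05, eps=1e-5):
    """Adjust the generator's latent code until its output matches the text."""
    latent = [0.0, 0.0]
    for _ in range(steps):
        grad = []
        for k in range(len(latent)):
            up, down = list(latent), list(latent)
            up[k] += eps
            down[k] -= eps
            # Central difference: how the score changes if latent[k] moves.
            diff = match_score(generate(up), text_feat) - match_score(generate(down), text_feat)
            grad.append(diff / (2 * eps))
        # Move the latent code in the direction that improves the match.
        latent = [z + lr * g for z, g in zip(latent, grad)]
    return latent

target = [1.0, 2.0]  # made-up "text features" for some prompt
latent = guided_generate(target)
```

The point of the sketch is the division of labor: the matcher never draws anything, and the generator never reads text. The generator's input is simply nudged, step by step, toward whatever the matcher scores highest.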

Some consider CLIP+VQGAN the biggest innovation in AI art since Deep Dream in 2015. And the wonderful thing is that CLIP+VQGAN is ready to use for anyone who wants it. Following Katherine Crowson's online tutorial and Colab notebook, a slightly technical user can run the system within minutes.

Interestingly, as the previous chapter mentioned, at the same time (early 2021) the OpenAI team that open-sourced CLIP also released its own image generation engine, DALL-E. DALL-E also uses CLIP internally, but DALL-E is not open source!

So in terms of community influence and contribution, DALL-E cannot compare at all with the open-source release of CLIP+VQGAN. Of course, open-sourcing CLIP was already a great contribution to the community by OpenAI.

Speaking of open-source contributions, LAION must also be mentioned.

LAION is a global non-profit machine learning research organization. In March this year it released LAION-5B, currently the largest open-source cross-modal dataset, containing nearly 6 billion (5.85 billion) image-text pairs. It can be used to train any text-to-image generative model, and also to train models like CLIP that score how well a text and an image match; both kinds of model are now at the heart of AI image generation.

Besides providing this massive library of training material, LAION also trained an AI to score the images in LAION-5B on artistic sense and visual beauty, and collected the high-scoring images into a subset called LAION-Aesthetics.

In fact, the latest AI painting models, including Stable Diffusion, the king of AI painting models discussed later, were all trained on the high-quality LAION-Aesthetics dataset.

CLIP+VQGAN set off the trend of a whole new generation of AI image generation technology. The introductions of all open-source TTI (Text to Image) models now thank Katherine Crowson, a founder of this new generation of AI painting models.

Technical players began forming a community around CLIP+VQGAN; people kept optimizing and improving the code, and some Twitter accounts were dedicated to collecting and publishing AI paintings. The earliest practitioner, Ryan Murdock, was also recruited by Adobe as a machine learning algorithm engineer.

However, the players in this wave of AI painting were still mainly AI technology enthusiasts.

Although running CLIP+VQGAN on Colab notebooks has a relatively low barrier compared with deploying an AI development environment locally, it still means requesting a GPU on Colab, running code to call the AI and output images, and dealing with code errors from time to time. This is not something the general public, and especially art creators without a technical background, can do. And this is exactly why zero-barrier, foolproof paid AI creation services like MidJourney now shine so brightly.

But the exciting progress is far from over. Attentive readers will have noticed that the powerful CLIP+VQGAN combination was released early last year and spread within small circles, yet the mass popularity of AI painting, as stated at the outset, was only ignited early this year by the online service Disco Diffusion. More than half a year passed in between. What caused the delay?

One reason is that the image generation part of the CLIP+VQGAN model, that is, the GAN-type model, never produced fully satisfying results.

AI researchers turned their attention to another way of generating images.

Recall how a GAN works: its image output is the result of a compromise in the contest between the internal generator and discriminator.

But there is another approach: the diffusion model.

The word "diffusion" sounds lofty, but the basic principle is something everyone can understand: it is really just "denoising". Yes, the same automatic noise reduction familiar from phone cameras (especially for night shots). If this denoising computation is repeated over and over, could it, in the extreme case, restore a picture of pure noise to a clear image?

A human certainly cannot do this, and neither can a simple denoising program, but an AI that "guesses" as it denoises makes it feasible.

This is the basic idea of the diffusion model.
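The repeated "guess the noise, remove a little of it" loop can be sketched as below. This is a toy under loud assumptions, not a real diffusion model: `predict_noise` is a hypothetical stand-in for the trained network, and it cheats by knowing the clean signal, whereas a real model has to learn that guess from training data. A five-number list stands in for an image.

```python
import random

def predict_noise(noisy, clean):
    # Hypothetical stand-in for the trained network that "guesses" the noise.
    # A real diffusion model learns this guess; this toy version cheats by
    # comparing against the known clean signal.
    return [n - c for n, c in zip(noisy, clean)]

def denoise(clean, steps=50, strength=0.2, seed=0):
    """Start from pure noise and repeatedly remove a fraction of the guessed noise."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in clean]  # pure noise, no picture at all
    for _ in range(steps):
        noise = predict_noise(image, clean)
        # Remove only part of the guessed noise per step, as in iterative denoising.
        image = [p - strength * e for p, e in zip(image, noise)]
    return image

clean = [0.0, 0.5, 1.0, 0.5, 0.0]  # a tiny made-up "image"
restored = denoise(clean)
```

Each pass removes 20% of the remaining noise, so after 50 passes the result is very close to the clean signal. The many small passes are also why generation by this route is slow, a cost that comes up again below.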

Diffusion models are now increasingly influential in computer vision. They can synthesize visual data efficiently, they thoroughly beat GAN models at image generation, and they have shown great potential in other fields such as video generation and audio synthesis.

Disco Diffusion, the AI painting product that first became known to the public early this year, was the first practical AI painting product based on CLIP + Diffusion models.

But Disco Diffusion's shortcomings were still fairly obvious. Professional artist Stijn Windig tried Disco Diffusion repeatedly and concluded that it could not replace manual creation, for 2 core reasons:

  • Disco Diffusion cannot render specific details. The rendered image looks stunning at first glance, but a closer look shows that most of it is vague and general, short of commercial-grade detail.
  • Disco Diffusion's initial render takes hours, and refining details on top of a rendered image amounts to redrawing the whole picture, a process that takes more time and effort than painting by hand directly.

Still, Stijn Windig is optimistic about the development of AI painting. He feels that although using Disco Diffusion directly for commercial work is not yet feasible, it is very good as a source of inspiration: "...I find it works better as an idea generator. Give a text prompt, it returns some pictures that spark my imagination, and can be used as a sketch to paint on."

Technically speaking, the two pain points Stijn raised, 1) AI painting details are not deep enough and 2) rendering takes too long, both stem from an inherent shortcoming of diffusion models: the iterative reverse-denoising process that generates the image is slow, and the model computes in pixel space, which demands enormous compute time and memory and becomes extremely expensive when generating high-resolution images.

(Pixel space is a somewhat technical term; it simply means that the model computes directly on raw pixel information.)

So for a mass-market, application-level product, this kind of model cannot compute and bring out more image detail within a generation time users will accept. Even draft-level images cost Disco Diffusion hours.

Even so, the painting quality Disco Diffusion delivered crushed every earlier AI painting model and was already beyond what most ordinary people can paint. Stijn's criticism is simply a demand made from the high ground of professional human creation.

But Stijn probably never imagined that the two pain points he raised would, within just a few months, be solved almost perfectly by AI researchers!

And with that, drum roll please: Stable Diffusion, the most powerful AI painting model in the world today, finally makes its grand entrance!

Stable Diffusion began testing in July this year, and it solves the pain points above very well.

Compared with earlier diffusion models, Stable Diffusion essentially does one key thing: by a mathematical transformation, it moves the model's computation from pixel space down to a low-dimensional space called latent space while preserving as much detail as possible, and then does the heavy model training and image generation there.

How much impact did this "simple" change of approach have?

Compared with pixel-space diffusion models, latent-space diffusion models dramatically reduce memory and compute requirements. For example, Stable Diffusion uses a latent-space downsampling factor of 8. In plain terms, the image's height and width are each reduced by a factor of 8, so a 512x512 image becomes 64x64 in latent space, saving 8x8=64 times the memory!
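The arithmetic in that paragraph can be checked with a short Python sketch. The function names are illustrative only, and the calculation counts spatial positions as the article does; it deliberately ignores how many channels each position carries.

```python
def latent_shape(height, width, factor=8):
    """Height and width after reducing each side by the downsampling factor."""
    return height // factor, width // factor

def spatial_savings(height, width, factor=8):
    """How many times fewer spatial positions the latent grid has than the pixel grid."""
    latent_h, latent_w = latent_shape(height, width, factor)
    return (height * width) // (latent_h * latent_w)

shape = latent_shape(512, 512)       # (64, 64)
savings = spatial_savings(512, 512)  # 64
```

Because both sides shrink by the same factor, the saving grows with the square of that factor: a factor of 8 gives 64, and a factor of 4 would give only 16.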

This is why Stable Diffusion is both fast and good: it can quickly (in seconds) generate a richly detailed 512x512 image on nothing more than a consumer-grade 8GB 2060 graphics card!

Readers can do the simple arithmetic: without this spatial compression, achieving Stable Diffusion's seconds-level image generation would take a super graphics card with 8GB x 64 = 512GB of video memory. Going by how graphics hardware has developed, consumer cards will probably not reach that much memory for another 8-10 years.

One important algorithmic iteration by AI researchers has brought AI painting results that we might otherwise have enjoyed 10 years from now directly to every ordinary user's computer today!

So it is perfectly normal to be surprised by the progress of AI painting right now, because from last year to this year the technology really has seen one breakthrough after another: from CLIP, trained on massive numbers of internet images with no manual labeling needed; to the boom in grafting AI painting models triggered by CLIP's open-sourcing; then the discovery of diffusion models as a better image generation module; and finally the latent-space dimensionality reduction that solved diffusion models' huge consumption of time and memory... All of this comes faster than one can take in; it is fair to say that over this year, AI painting has changed by the day!

Through all this, the happiest people are AI technology enthusiasts and art creators. Everyone has watched AI painting, stagnant for years, rocket to the top. Without doubt, this is a highlight in the history of AI.

And for ordinary users, the greatest joy is of course using top painting AIs such as Stable Diffusion or MidJourney to generate professional-level paintings.

Interestingly, the birth of Stable Diffusion is also tied to the two pioneers mentioned earlier, Katherine Crowson and Ryan Murdock. They became core members of EleutherAI, a decentrally organized open-source AI research team. Though it calls itself a grassroots team, EleutherAI is currently the leading open-source team in both very large language models and AI image generation.

It was the EleutherAI core technical team that supported Stability.AI, an AI solutions provider founded in London, England. These idealists came together and, building on the latest breakthroughs in AI painting technology, launched today's most powerful AI painting model, Stable Diffusion. Importantly, Stable Diffusion was, as promised, fully open-sourced in August! This release moved AI researchers and AI enthusiasts around the world to tears. Once open-sourced, Stable Diffusion held first place on the GitHub trending list.

Stability.AI has fully lived up to the slogan on its homepage, "AI by the people, for the people", and deserves a big thumbs up.

The picture below was generated by the author running Stable Diffusion online. Thanks, open source! And the haloed Japanese-style man this AI generated is rather handsome :)

Top AI painting models head-to-head: Stable Diffusion vs. MidJourney

In an earlier article the author introduced MidJourney, an online AI painting tool whose greatest strengths are zero-barrier interaction and very good output. Creators need no technical background to paint conversationally with the MidJourney bot on Discord (well, in English only, of course).

In terms of output style, MidJourney has clearly been optimized for portraits. After heavy use, MidJourney's stylistic tendency also becomes obvious (this is the author's first-hand experience after spending several hundred dollars of compute on MidJourney across all kinds of subjects): put kindly, it is rather delicate and pretty; put otherwise, a little greasy.

Stable Diffusion's works are clearly more understated and more artistic.

Below are comparisons of AI works the author created on the two platforms from the same text descriptions. Readers can judge for themselves.

(Note: the generated paintings below are fully copyrighted; please credit the source if reproducing them separately.)

Stable Diffusion (left) vs. MidJourney (right):

Tree house

Dieselpunk city

World of Warcraft main city Orgrimmar

Armored wolf knight

Granblue Fantasy style comic girl

Romantic realism oil painting of a beauty (style reference: Daniel Gerhartz, American painter)

Old-town building maze with narrow walkways

Which style is better? To each their own.

Because it has been specifically optimized, MidJourney is more convenient for portraits or sweet, "sugar-water" style pictures. But after comparing many works, the author believes Stable Diffusion is still clearly superior, both in artistic expression and in the variety of its styles.

That said, MidJourney's iterations over the past few months are plain for all to see (it is a paid service, after all, so very profitable and well motivated), and with Stable Diffusion fully open source, its technical advantages can be expected to be absorbed into MidJourney soon. Meanwhile, Stable Diffusion's training is still ongoing, and we can very much look forward to future versions of the model going a step further too.

For all creators, this is great news.

What the breakthrough in AI painting means for humanity

In the AI field in 2022, text-based image generation models were the stars. It started with Disco Diffusion in February; in April, DALL-E 2 and MidJourney opened invite-only betas; in May and June, Google released two models, Imagen and Parti (no open beta, only papers, which feels a bit thin); and then at the end of July, Stable Diffusion burst onto the scene...

It is truly dazzling. No wonder the author marveled, in the previous article, at how AI painting had advanced by leaps and bounds while no one was watching. Indeed, over this year and a half AI painting has undergone revolutionary progress, arguably a breakthrough that will go down in history.

What will happen next in AI painting, or more broadly in AI-generated content (images, sound, video, 3D content, and so on), fills one with imagination and anticipation.

But without waiting for the future: having experienced the artistic heights that today's state-of-the-art AI painting models such as Stable Diffusion can reach, we can almost confirm that "imagination" and "creativity", two words once steeped in mysticism and humanity's last point of pride, can in fact be deconstructed by technology too.

For believers in the divine supremacy of the human soul, the creativity shown by today's AI painting models is a ruthless blow to that faith. So-called inspiration, creativity, and imagination, words full of divinity, are about to be (or already have been) ruthlessly slapped in the face by the powerful combination of supercomputing + big data + mathematical models.

In fact, a core idea of AI generative models like Stable Diffusion, and of many deep learning AI models, is to represent human-created content as a vector in some high- or low-dimensional mathematical space (more simply, a string of numbers). If this "content -> vector" conversion is designed well enough, then all human creations can be represented as some of the vectors in a certain mathematical space. The other vectors in this infinite mathematical space are the content that humans could in theory create but have not yet created. Through the reverse "vector -> content" conversion, that not-yet-created content is dug out by AI.

This is exactly what the latest AI painting models such as MidJourney and Stable Diffusion do. AI can be said to be creating new content, or it can be said to be a porter of new paintings: the new paintings AI produces have always objectively existed in the mathematical sense, and AI has merely restored them from mathematical space in a clever way.

"Writing is made by heaven; a skilled hand merely chances upon it."

The line fits perfectly here. That "heaven" is the infinite mathematical space; and that "hand" has changed from a human's to AI's.

Mathematics is the supreme law of the world :)

The "creativity" of the latest AI painting has begun to catch up with, or even nearly equal, that of humans, which may be a further blow to human dignity. Starting with AlphaGo in Go, the territory of human dignity in "intelligence" has been shrinking, and the breakthrough in AI painting has further shattered human dignity in "imagination" and "creativity". Perhaps not completely shattered, but already cracked all over and tottering.

The author has always held a neutral view of the development of science and technology: although we hope technology will make human life better, in fact, as with the invention of the nuclear bomb, the emergence of some technologies is neutral and may even be deadly. A super AI that completely replaces humans looks, in practice, more and more possible. Humans need to think about how, in the not-too-distant future, to keep dominance over the world when AI outruns us in every field.

As a friend rightly said, if AI finally learns to write code, and nothing seems to pose an inevitable barrier to that, then the story of the film The Terminator may be about to happen. If that is too pessimistic, humans should at least consider how to coexist with an AI world that surpasses all their intelligence and creativity.

Of course, from an optimistic angle, the future world will only be better: humans enter a unified or personal metaverse through AR/VR, the human master need only say a word, and an omnipotent AI assistant automatically generates content on demand, even directly generating stories, games, and virtual lives for humans to experience.

Is this a better Inception dream space, or a better Matrix? (laughs)

Either way, the breakthrough and transcendence in AI painting that we are witnessing today is the first step on this road of no return :)

To end with a digression. It has not appeared yet, but probably within two years we will be able to have AI generate a complete novel in a specified style, especially formulaic genre fiction such as the fantasy novels 斗破苍穹 (Battle Through the Heavens) and 凡人修仙传 (A Record of a Mortal's Journey to Immortality). One could also specify the length, the number of heroines, plot preferences, the level of tragedy and passion, even the level of xx, and have AI generate it with one click :)

This is not entirely a fantasy. Given that AI painting has developed at rocket speed this year, the author even feels this day is just around the corner.

No AI model can yet generate long-form literary content that is sufficiently engaging and logically coherent, but judging from the aggressive momentum of AI painting models, AI generation of high-quality genre literature in the near future is almost a certainty, and in theory there is little doubt.

Saying so may be a blow to web novelists who toil at their keyboards, but as a technology enthusiast and a fan of fantasy novels, the author still looks forward to that day... No more nagging for updates, and no more worrying about a serial author's writing form; better still, if the story goes somewhere you dislike halfway through, you can have the AI adjust the direction of the subsequent plot, regenerate, and keep reading...

If you are not convinced that such a day will come, we can agree to disagree and wait together.

Finally, here is a set the author generated with Stable Diffusion, the "Old-town building maze with narrow walkways" series: the details are completely different, the style is completely consistent, and the quality stays at full marks throughout. Looking at these exquisite AI works, the author has only one feeling: AI creation now has a "soul". Do readers feel the same? :)

copyright notice
Author: see Metaverse. Please include the original link when reprinting, thank you.
https://en.netfreeman.com/2022/264/202209211326213211.html
