"A machine can have a large memory, but it cannot think - unless we teach it." – Alan Turing. Today's AI models, such as GPT-4 or Llama, are built on huge datasets and complex mathematical structures. But what exactly lies behind them? In this article, we look at the key components needed to build a language model from scratch.
Large Language Models (LLMs) are neural networks trained on large amounts of text. Their strength lies in their ability to generate human-like text, summarize content, and write code. At the core of these models is the Transformer architecture, which lets them capture dependencies within a text and make context-aware predictions.
Quantized weights allow a model to be compressed further, which helps it fit within hardware limits. Knowledge distillation reduces model size as well: a large model transfers its knowledge to a more compact one. Pruning removes parameters that are not needed, resulting in a lean, efficient architecture with little or no loss of accuracy.
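To make the compression ideas above concrete, here is a minimal sketch in plain Python. It assumes symmetric per-tensor int8 quantization and magnitude-based pruning, which are common choices but not necessarily the ones the article has in mind. The function names and the sample weights are made up for illustration.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, fraction):
    """Set the given fraction of smallest-magnitude weights to zero."""
    k = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Hypothetical weights of a single small layer.
weights = [0.82, -0.03, 0.41, -1.27, 0.002, 0.65]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

pruned = prune_by_magnitude(weights, 0.5)

print("int8 values:", quantized)
print("largest rounding error:", max_error)
print("after pruning half of the weights:", pruned)
```

Each weight is stored in one byte instead of four, and the rounding error stays within half a quantization step. Whether accuracy really survives depends on the model and usually has to be measured after compression.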
Masked Language Modeling can be used to increase semantic depth. The model reconstructs incomplete passages of text and in doing so learns industry-specific terms. Next Word Prediction can likewise be applied to industry-specific technical language. Before a model can be trained, the text must be converted into a form a neural network can process, using tokenization, embeddings, and byte pair encoding.
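As an illustration of the byte pair encoding step mentioned above, the following sketch learns merge rules on a tiny corpus. It is a simplified, character-level version of the idea: production tokenizers work on bytes, handle word boundaries, and learn tens of thousands of merges. The corpus and all function names are assumptions for this example.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the adjacent symbol pair with the highest total frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Learn `num_merges` merge rules from a whitespace-separated corpus."""
    words = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words


merges, words = learn_bpe("low low low lower lowest", num_merges=2)
print("learned merges:", merges)
print("segmented words:", words)
```

After two merges the frequent word "low" has become a single token, while the rarer "lower" and "lowest" are split into "low" plus leftover characters. The resulting tokens would then be mapped to integer ids and looked up in an embedding table.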
To compensate for the lack of industry-specific training data, transfer learning and synthetic data augmentation are used. Supporting modules and optimized adjustments then adapt the model to the industry-specific data. A key element of Transformer models is the self-attention mechanism. Each token is weighted in relation to all other tokens in the sentence, which makes the context of a word clearer.
For example, in a sentence such as "The cat jumped onto the table because it was hungry", the word "it" refers to the cat. The model recognizes such connections by assigning an importance to each word, which helps it understand the context better. This mechanism lets the model learn complex dependencies and semantic meanings in a text.
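The weighting described above can be sketched as scaled dot-product attention in plain Python. This is a deliberately reduced version: it uses the token vectors directly as queries, keys, and values, whereas a real Transformer first applies learned projection matrices and runs several attention heads in parallel. The two-dimensional vectors are hand-picked assumptions, not learned embeddings.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equally sized vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-d vectors: "it" is placed close to "cat" on purpose.
tokens = ["cat", "table", "it"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

outputs, weights = self_attention(vectors, vectors, vectors)

for token, weight in zip(tokens, weights[2]):
    print(f"attention from 'it' to '{token}': {weight:.3f}")
```

With these vectors, "it" attends more strongly to "cat" than to "table". In a trained model that preference comes from the learned parameters rather than from hand-placed vectors.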
Pre-trained models are combined with in-house knowledge. This combination increases data diversity and enables high model quality even when local datasets are limited. The performance of AI models is evaluated with specific metrics: weighted F1 and perplexity measure the quality of text-processing tasks, while response time and error rate show how suitable a model is in practice.
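The two quality metrics named above can be computed by hand, which helps when interpreting them. The sketch below uses the standard definitions: perplexity is the exponential of the average negative log-probability the model assigned to the correct tokens, and weighted F1 is the per-class F1 score averaged by class frequency. The probabilities and labels are invented sample values.

```python
import math
from collections import Counter


def perplexity(token_probs):
    """exp of the mean negative log-probability of the correct tokens."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with each class weighted by its support."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        total += f1 * count
    return total / len(y_true)


# Probabilities a hypothetical model gave to each correct next token.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])

# Hypothetical labels from a two-class text classification task.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
score = weighted_f1(y_true, y_pred)

print("perplexity:", ppl)
print("weighted F1:", score)
```

A perplexity of 4 means the model was, on average, as uncertain as if it were choosing uniformly among four tokens. Lower is better for perplexity and higher is better for F1.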
Continuous adaptation to changing regulatory frameworks is achieved through constraint-based learning, which can, for example, build data protection guidelines directly into the AI model by using differential privacy. A flexible rule set and a domain-specific fine-tuning process make it possible to respond to new regulations quickly.
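The article does not say how differential privacy is applied. One widely used approach during training is to clip each example's gradient and add Gaussian noise before averaging, in the style of DP-SGD. The sketch below shows only that single step, under that assumption. It does not compute the resulting privacy budget, and the function names, gradients, and parameter values are illustrative.

```python
import math
import random


def clip(gradient, max_norm):
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in gradient]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [value / len(per_example_grads) for value in noisy]


rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical per-example gradients; the first one is unusually large.
grads = [[3.0, 4.0], [0.3, 0.4]]

noise_free = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)
private = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)

print("clipped mean without noise:", noise_free)
print("clipped mean with noise:", private)
```

Clipping limits how much any single training example can influence the update, and the noise masks what remains of that influence. Turning this into a formal privacy guarantee requires a separate accounting step that is left out here.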
The first step in training a language model is pre-training. The model is fed large amounts of unstructured text so that it learns general language patterns, sentence structures, and word meanings. During this process, the model tries to predict the next words in a sentence without focusing on any particular task. This gives it a kind of general language understanding.
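The next-word objective can be illustrated with a toy model that simply counts which word follows which. This is only a stand-in: an LLM learns the same kind of conditional probabilities with a Transformer trained by gradient descent over far longer contexts. The corpus and function names are assumptions for the example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the text."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for previous, following in zip(tokens, tokens[1:]):
        counts[previous][following] += 1
    return counts


def next_word_probs(counts, word):
    """Estimated probability of each possible next word."""
    following = counts.get(word, Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in following.items()}


def predict_next(counts, word):
    """Most likely next word, or None if the word was never seen."""
    probs = next_word_probs(counts, word)
    return max(probs, key=probs.get) if probs else None


corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)

print("probabilities after 'the':", next_word_probs(model, "the"))
print("prediction after 'the':", predict_next(model, "the"))
```

No labels are needed for this kind of training, because the text itself supplies the correct next word at every position. That is what makes very large unlabelled corpora usable for pre-training.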
Fine-tuning is the second step, in which the pre-trained model is specialized for a particular task. It is trained on smaller, more specific datasets, for example to answer customer questions, classify texts, or produce summaries. Fine-tuning helps ensure that the model gives accurate, context-appropriate answers within the defined application area.
Training an LLM requires a great deal of computing power. To make the process more efficient, various optimization methods can be used. Checkpointing, for example, lets you save model weights and load them again later, and published pre-trained parameters can be downloaded instead of training from scratch. LoRA (Low-Rank Adaptation) is also used to fine-tune models with little computational effort.
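The saving that LoRA offers comes from replacing a full weight update with the product of two small matrices. The sketch below follows the usual formulation, in which the frozen weight W is combined with a scaled low-rank update B·A and B starts at zero. The matrix sizes, the rank, and the helper names are assumptions chosen to keep the example small.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(W, A, B, alpha):
    """Frozen weight plus the scaled low-rank update: W + (alpha / r) * B A."""
    r = len(A)
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters of the adapter matrices A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


rng = random.Random(0)
d_out, d_in, r = 8, 8, 2

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
B = [[0.0] * r for _ in range(d_out)]                               # trainable, starts at zero

W_eff = lora_effective_weight(W, A, B, alpha=4)

print("unchanged at initialisation:", W_eff == W)
print("full parameters (4096 x 4096):", 4096 * 4096)
print("LoRA parameters at rank 8:", lora_param_count(4096, 4096, 8))
```

Because B starts at zero, the adapted model initially behaves exactly like the pre-trained one. For a 4096 by 4096 layer at rank 8, only 65,536 values are trained instead of roughly 16.8 million.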
An online learning loop is used for continuous development and for adapting to new findings and requirements. It continuously monitors model performance, analyzes new data and user feedback, and adjusts the model automatically when necessary. Data protection and efficiency are ensured through differential privacy techniques and the removal of unnecessary connections.
A purpose-built Python script can train a language model efficiently. It can also load external weights from a pre-trained model. The model is then optimized for a particular task by adapting it to task-specific data. Once training is complete, the script saves the updated weights so that they are available for later use.
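The workflow just described (load pre-trained weights, adapt them to task data, save the result) can be sketched with the standard library alone. A real script would use a deep learning framework and an actual language model. Here a two-parameter linear model, JSON files, and made-up data stand in for them, so every name and file in the example is hypothetical.

```python
import json
import os
import tempfile


def load_weights(path):
    """Read model weights from a JSON file."""
    with open(path) as f:
        return json.load(f)


def save_weights(weights, path):
    """Write model weights to a JSON file."""
    with open(path, "w") as f:
        json.dump(weights, f)


def mse(weights, data):
    """Mean squared error of the toy model y = w * x + b."""
    return sum((weights["w"] * x + weights["b"] - y) ** 2 for x, y in data) / len(data)


def fine_tune(weights, data, lr=0.05, epochs=200):
    """Adapt the weights to the task data with plain stochastic gradient descent."""
    w, b = weights["w"], weights["b"]
    for _ in range(epochs):
        for x, y in data:
            error = (w * x + b) - y
            w -= lr * error * x
            b -= lr * error
    return {"w": w, "b": b}


workdir = tempfile.mkdtemp()
pretrained_path = os.path.join(workdir, "pretrained.json")
tuned_path = os.path.join(workdir, "fine_tuned.json")

# Stand-in for a published checkpoint that would normally be downloaded.
save_weights({"w": 1.0, "b": 0.0}, pretrained_path)

# Hypothetical task data following y = 2x + 1.
task_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

pretrained = load_weights(pretrained_path)
tuned = fine_tune(pretrained, task_data)
save_weights(tuned, tuned_path)

print("error before fine-tuning:", mse(pretrained, task_data))
print("error after fine-tuning:", mse(tuned, task_data))
```

The three stages map directly onto what a framework-based script would do with checkpoint files. Only the model and the storage format would change.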
Language models have already transformed many industries, from customer service to content creation. Through targeted pre-training and fine-tuning, models can be adapted to a wide range of tasks. Those who develop a deep understanding of these processes can build their own tailored AI solutions and help shape technological progress.