"A machine can have a large memory, but it cannot think unless we teach it." – Alan Turing. Modern AI models such as GPT-4 or Llama are built on large datasets and complex mathematical structures. But what is actually behind them? In this article, we look at the key components needed to build a language model from scratch.
Large Language Models (LLMs) are neural networks trained on large amounts of text. Their strength lies in their ability to generate human-like text, summarize content and write code. At the core of these models is the Transformer architecture, which lets them capture dependencies within a text and make context-aware predictions.
Quantized weights allow a model to be compressed step by step, which helps overcome hardware limits. Knowledge distillation reduces model size: a large model transfers its knowledge to a more compact variant. Pruning removes parameters that contribute little, resulting in a lean, efficient architecture with little or no loss of accuracy.
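As a rough illustration of two of these compression ideas, the sketch below applies symmetric 8-bit quantization and magnitude pruning to a short list of weights. The function names, the example values and the pruning threshold are assumptions chosen for illustration; real toolchains work on whole tensors and usually calibrate these settings per layer.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


def prune_by_magnitude(weights, threshold):
    """Set weights whose absolute value is below the threshold to zero."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


# Made-up example weights (not taken from any real model).
weights = [0.42, -1.27, 0.003, 0.9, -0.05]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

pruned = prune_by_magnitude(weights, threshold=0.01)
```

Each weight is stored as a small integer plus one shared scale factor, so memory drops while the restored values stay within half a quantization step of the originals. Pruning here zeroes only the smallest weight.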
Masked Language Modelling can be used to increase semantic depth. The model reconstructs incomplete texts and in doing so learns domain-specific terms. Likewise, Next Word Prediction can be used for industry-specific technical language. Before a model can be trained, the text has to be converted into a form that neural networks can process, using tokenization, embeddings and byte pair encoding.
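To make byte pair encoding more concrete, here is a minimal sketch of how merge rules can be learned from a tiny corpus. It is a simplification under stated assumptions: it merges characters rather than raw bytes, it has no end-of-word marker, and the corpus and the helper names (`get_pair_counts`, `merge_pair`, `learn_bpe`) are made up for this example.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Repeatedly merge the most frequent adjacent pair of symbols."""
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words


merges, vocab_words = learn_bpe("low low low lower lowest", num_merges=2)
```

After two merges the frequent stem "low" has become a single token, while the rarer endings are still split into individual characters. That is the basic trade-off BPE makes between vocabulary size and sequence length.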
To compensate for a lack of industry-specific training data, transfer learning and synthetic data augmentation are used. Lean feedforward modules and optimized embeddings adapt to domain-specific data. A key feature of transformer models is the self-attention mechanism. Each token is weighted in relation to every other token in the sentence, which makes the context of a word clearer.
For example, in a sentence such as "The cat jumped onto the table because it was hungry", the word "it" refers to the cat. The model recognizes such connections by assigning an importance to each word. This helps it understand the context better. The attention mechanism enables the model to learn complex dependencies and semantic meanings within a text.
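The weighting described above can be sketched as scaled dot-product self-attention. This is a deliberately reduced version: it omits the learned query, key and value projections and the multiple heads of a real Transformer, and the two-dimensional embeddings are invented so that "it" lies close to "cat".

```python
import math


def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """x is a list of token vectors; returns (outputs, attention_weights)."""
    d = len(x[0])
    weights = []
    for q in x:
        # Score this token against every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    # Each output is a weighted average of all token vectors.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


tokens = ["cat", "table", "it"]
# Made-up embeddings: "it" is placed near "cat" on purpose.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

outputs, attn = self_attention(embeddings)
it_row = dict(zip(tokens, attn[2]))  # how strongly "it" attends to each token
```

With these toy vectors, the row for "it" gives "cat" more weight than "table". In a trained model the same effect comes from learned projections rather than hand-picked embeddings.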
Pre-trained models are combined with internal knowledge. This combination increases data diversity and makes high model quality possible despite limited local datasets. The performance of AI models is assessed with specific metrics: Weighted-F1 and Perplexity measure the quality of text processing tasks, while response time and error rate show in a transparent way how well the model works in practice.
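Both metrics are easy to compute by hand, which helps when interpreting them. The sketch below assumes perplexity is defined as the exponential of the average negative log-probability the model assigned to the correct tokens, and Weighted-F1 as the per-class F1 score averaged by class frequency. The labels and probabilities are invented.

```python
import math
from collections import Counter


def perplexity(token_probs):
    """exp of the mean negative log-probability of the correct tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with each class weighted by its support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += f1 * n
    return total / len(y_true)


# A model that always gives the correct token probability 0.25
# is as uncertain as a uniform choice among four options.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])

# Invented classification labels for a two-class task.
score = weighted_f1(["a", "a", "b"], ["a", "b", "b"])
```

Lower perplexity means the model is less surprised by the text. Weighted-F1 ranges from 0 to 1 and is less distorted by class imbalance than plain accuracy.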
Continuous adaptation to changing regulatory frameworks is achieved through constraint-based learning, which, for example, builds data protection guidelines into the AI model by means of differential privacy. An adaptable rule set and domain-specific fine-tuning processes make it possible to respond to new regulations flexibly and quickly.
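One common way to bring differential privacy into training is to clip each example's gradient and add Gaussian noise before averaging, as in DP-SGD. The sketch below shows only that single step. The gradients, `max_norm` and `noise_multiplier` values are assumptions for illustration, and the code does not track the resulting privacy budget, which a real deployment would need.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    # Noise standard deviation is proportional to the clipping norm.
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    return [v / len(clipped) for v in noisy]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Made-up per-example gradients with very different magnitudes.
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]

noisy_mean = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping limits how much any single training example can influence the update, and the noise masks what remains of that influence.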
The first step in training a language model is pre-training. The model is fed large amounts of unstructured text so that it learns general language patterns, sentence structures and word meanings. During this process, the model tries to predict the next words in a sentence without focusing on a specific task. This creates a kind of general language understanding.
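The next-word objective can be shown with a toy bigram model that simply counts which word follows which. This is only a stand-in for the idea: a real LLM learns these probabilities with a neural network over long contexts, and the corpus here is invented.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the text."""
    tokens = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Relative frequencies of the words seen after `prev`."""
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}


def avg_nll(counts, text):
    """Average negative log-likelihood of each next word.

    Only valid for text whose word pairs were all seen in training.
    """
    tokens = text.split()
    losses = [
        -math.log(next_word_probs(counts, prev)[nxt])
        for prev, nxt in zip(tokens, tokens[1:])
    ]
    return sum(losses) / len(losses)


corpus = "the cat sat on the mat the cat slept"
counts = train_bigram(corpus)

probs = next_word_probs(counts, "the")
prediction = max(probs, key=probs.get)  # most likely word after "the"
loss = avg_nll(counts, corpus)
```

Pre-training minimizes the same kind of loss: the average negative log-probability the model assigns to the word that actually comes next.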
Fine-tuning is the second step, in which the pre-trained model is specialized for a particular task. It is trained on smaller, more specific datasets, for example to answer customer queries, classify texts or write summaries. Fine-tuning ensures that the model gives accurate, context-appropriate answers within a defined application area.
Training an LLM requires a great deal of computing power. Various optimization methods can make the process more efficient. Checkpointing, for example, lets you save model weights and load them again later, and published pre-trained parameters can be downloaded instead of training from scratch. LoRA (Low-Rank Adaptation) is also used to fine-tune models with less computational effort.
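The saving from LoRA comes from training two small matrices instead of the full weight matrix. The sketch below computes the effective weight W + (alpha / r) · B · A with plain Python lists. The matrix sizes, the values in `a` and the `alpha` setting are made up for illustration, and no training loop is shown.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """W_eff = W + (alpha / r) * B @ A, with A of shape r x d_in and B of shape d_out x r."""
    r = len(a)
    delta = matmul(b, a)
    s = alpha / r
    return [[wij + s * dij for wij, dij in zip(wr, dr)] for wr, dr in zip(w, delta)]


d_out, d_in, r = 4, 4, 1

# Frozen pre-trained weight (an identity matrix as a stand-in).
w = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]
# Trainable low-rank factors. B starts at zero, so the model is unchanged at first.
a = [[0.1, 0.2, 0.3, 0.4]]
b = [[0.0] for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha=1.0)

full_params = d_out * d_in        # parameters updated by full fine-tuning
lora_params = r * (d_in + d_out)  # parameters updated by LoRA
```

Here LoRA trains 8 values instead of 16. The gap grows quickly with matrix size, because the LoRA count scales with the sum of the dimensions rather than their product.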
An online learning loop is used for continuous improvement and for adapting to new findings and requirements. It continuously monitors model performance, analyzes new data and user feedback, and automatically adjusts the model when needed. Data protection and efficiency are ensured through differential privacy techniques and the removal of unnecessary connections.
A purpose-built Python script can fine-tune a language model. It can also load external weights from a pre-trained model. The model is optimized for a particular task by adapting it to task-specific data. Once training is complete, the script saves the updated weights so that they are available for future use.
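The load, adapt and save workflow can be sketched with a toy model. Everything in this example is an assumption made for illustration: a two-parameter linear model stands in for a language model, JSON files stand in for checkpoint files, and the file names, data and hyperparameters are invented. A real script would typically use a deep learning framework instead.

```python
import json
import os
import tempfile


def load_weights(path):
    """Read model parameters from a JSON file."""
    with open(path) as f:
        return json.load(f)


def save_weights(path, weights):
    """Write model parameters to a JSON file."""
    with open(path, "w") as f:
        json.dump(weights, f)


def mse(weights, data):
    """Mean squared error of the linear model y = w * x + b."""
    return sum((weights["w"] * x + weights["b"] - y) ** 2 for x, y in data) / len(data)


def fine_tune(weights, data, lr=0.05, epochs=500):
    """Plain gradient descent on the mean squared error."""
    w, b = weights["w"], weights["b"]
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}


workdir = tempfile.mkdtemp()
pretrained_path = os.path.join(workdir, "pretrained.json")  # hypothetical file name
finetuned_path = os.path.join(workdir, "finetuned.json")    # hypothetical file name

# Stand-in for published pre-trained weights.
save_weights(pretrained_path, {"w": 1.0, "b": 0.0})

# Task-specific data following y = 2x + 1.
task_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

start = load_weights(pretrained_path)
tuned = fine_tune(start, task_data)
save_weights(finetuned_path, tuned)

loss_before = mse(start, task_data)
loss_after = mse(tuned, task_data)
```

The structure is the same at any scale: start from existing weights rather than random ones, train on a small task-specific dataset, and write the result to disk so it can be reloaded later.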
Language models have already transformed many industries, from customer service to content creation. Through targeted pre-training and fine-tuning, models can be adapted to a wide range of tasks. Those who develop a deeper understanding of these processes can build their own customized AI solutions and actively shape technological progress.