Google Translation API hacking

As part of its Google Cloud, Google offers the Google Translation API with a usage-based cost structure. There is also an undocumented API that can be used without a key, but it refuses to work after just a few requests. When you use the website translation feature of Google Chrome, it is noticeable that pages can be translated in very good quality without any apparent limit.


Apparently the advanced NMT model is already in use here. But which API does Google Chrome use internally to translate the content, and can this API also be addressed directly, even from the server side? To analyze network traffic, tools such as Wireshark or Telerik Fiddler, which can also inspect encrypted traffic, are recommended. But Chrome itself exposes the requests it sends for page translation, free of charge: they can easily be viewed with Chrome DevTools.

If you run a translation, capture the relevant POST request to https://translate.googleapis.com via "Copy > Copy as cURL (bash)" and execute it in a tool such as Postman, you can resend the request without any problems.

The meaning of the URL parameters is also largely obvious:

Key | Example value | Meaning
anno | 3 | Annotation mode (affects the return format)
client | te_lib | Client information (varies; via the web interface of Google Translate the value is "webapp"; affects the return format and the rate limiting)
format | html | String format (important for translating HTML tags)
v | 1.0 | Google Translate version number
key | AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw | API key (see below)
logld | vTE_20200210_00 | Protocol version
sl | de | Source language
tl | zu | Target language
sp | nmt | ML model
tc | 1 | unknown
sr | 1 | unknown
tk | 709408.812158 | Token (see below)
mode | 1 | unknown
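
To make the table more concrete, the query string could be assembled roughly as in the following Python sketch. It uses only the parameter names and example values from the table. The path `/translate_a/t` and the helper name `build_translate_url` are assumptions (the text names only the host), the `key` parameter is left out because the post reports further down that calls also work without it, and the `tk` value is the example token, which only fits the string it was generated for.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Host is taken from the text; the path is an assumption for illustration.
ENDPOINT = "https://translate.googleapis.com/translate_a/t"


def build_translate_url(source_lang: str, target_lang: str, token: str) -> str:
    """Assemble the request URL from the parameters listed in the table."""
    params = {
        "anno": "3",               # annotation mode
        "client": "te_lib",        # client value sent by Google Chrome
        "format": "html",          # needed when translating HTML tags
        "v": "1.0",
        "logld": "vTE_20200210_00",
        "sl": source_lang,
        "tl": target_lang,
        "sp": "nmt",               # ML model
        "tc": "1",
        "sr": "1",
        "tk": token,               # string-dependent token
        "mode": "1",
    }
    return ENDPOINT + "?" + urlencode(params)


url = build_translate_url("de", "zu", "709408.812158")
query = parse_qs(urlsplit(url).query)
print(query["client"], query["tk"])
```

The text to translate is not part of the URL; it travels in the POST body as the data field q.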

Some request headers are also set, but these can mostly be ignored. After manually deselecting all headers, including the user agent, an encoding problem shows up when special characters are entered (here when translating "Hello World").

If you re-enable the user agent (which generally does no harm), the API delivers UTF-8 encoded characters.

Are we there already, with all the information needed to use this API outside of Google Chrome? If you change the string to be translated (data field q of the POST request) from, for example, "Hello World" to "Hello World!", you get an error message.

We now translate this modified string again inside Google Chrome using the website translation feature and find that, in addition to the parameter q, the parameter tk has also changed (all other parameters stayed the same).

Obviously this is a string-dependent token whose structure is not easy to see through. When you start the website translation, the following files are loaded:

  • 1 CSS file: translateelement.css
  • 4 graphics: translate_24dp.png (2x), gen204 (2x)
  • 2 JS files: main_de.js, element_main.js

The two JavaScript files are obfuscated and minified. Tools such as JS Nice and de4js help to make these files more readable. To debug them live, we recommend the Chrome extension Requestly, which redirects remote files to local copies on the fly.

Now we can debug the code (CORS must first be enabled on the local server). The relevant code for generating the token seems to be hidden in this section of the file element_main.js:

b7739bf50b2edcf636c43a8f8910def9

Here the text is hashed with the help of some bit shifts. Unfortunately, one piece of the puzzle is still missing: in addition to the argument a (the text to be translated), a second argument b is passed to the function Bp(), a kind of seed that appears to change from time to time and that also flows into the hashing. But where does it come from? If we jump to the call of Bp(), we find the following code section:

b7739bf50b2edcf636c43a8f8910def9

The function Hq is declared beforehand as follows:

b7739bf50b2edcf636c43a8f8910def9

Here the deobfuscator left some garbage behind. After we replace String.fromCharCode('...') with the corresponding character strings, remove the obsolete a() and merge the function calls [c(), c()], the result is:

b7739bf50b2edcf636c43a8f8910def9

Or even simpler:

b7739bf50b2edcf636c43a8f8910def9

The function yq was previously defined as:

b7739bf50b2edcf636c43a8f8910def9

The seed seems to live in the global object google.translate._const._ctkk, which is available at runtime. But where is it set? In the other, previously loaded JS file main_de.js it is at least already available at the start. We add the following at the beginning:

b7739bf50b2edcf636c43a8f8910def9

In the console we do indeed get the current seed.

This leaves Google Chrome itself, which apparently provides the seed, as the last option. Fortunately, its source code (Chromium, including the Translate component) is open source and therefore publicly available. We pull the repository locally and find the call to the function TranslateScript::GetTranslateScriptURL in the file translate_script.cc in the folder components/translate/core/browser:

b7739bf50b2edcf636c43a8f8910def9

The variable holding the URL is hard-coded in the same file:

b7739bf50b2edcf636c43a8f8910def9

If we now examine the file element.js more closely (after deobfuscating it again), we find the hard-coded entry c._ctkk. The google.translate object is also set accordingly, and the loading of all relevant assets (which we already discovered earlier) is triggered:

b7739bf50b2edcf636c43a8f8910def9

That leaves the parameter key (with the value AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw). It appears to be a generic browser API key (which also turns up in some Google search results). In Chromium it is set in the file translate_url_util.cc in the folder components/translate/core/browser:

b7739bf50b2edcf636c43a8f8910def9

The key is generated in google_apis/google_api_keys.cc from a dummy value:

b7739bf50b2edcf636c43a8f8910def9

However, a test shows that the API calls work just the same without this key parameter. If you experiment with the API, you get status code 200 back on success. If you then run into a limit, you get status code 411 with the message "POST requests require a content-length header". It is therefore advisable to send this header (Postman sets it automatically as a temporary header).
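
A minimal Python sketch of that advice: it prepares the POST body and sets `Content-Length` explicitly, together with a user agent. Nothing is sent over the network. The helper names `prepare_post` and `describe_status`, the form content type and the user-agent string are illustrative assumptions; only the status codes 200 and 411 come from the text.

```python
from urllib.parse import urlencode


def prepare_post(text: str, user_agent: str = "Mozilla/5.0") -> tuple[bytes, dict[str, str]]:
    """Return the form-encoded body and the headers for the translation request."""
    body = urlencode({"q": text}).encode("utf-8")
    headers = {
        # Assumed content type for a form-encoded POST body.
        "Content-Type": "application/x-www-form-urlencoded",
        # Set explicitly, since the 411 response complains about its absence.
        "Content-Length": str(len(body)),
        # Placeholder value; the post only states that a user agent should be sent.
        "User-Agent": user_agent,
    }
    return body, headers


def describe_status(status: int) -> str:
    """Map the status codes mentioned in the post to a short description."""
    if status == 200:
        return "ok"
    if status == 411:
        return "limit reached, server asks for a Content-Length header"
    return "unexpected"


body, headers = prepare_post("Hello World!")
print(body, headers["Content-Length"], describe_status(411))
```

Most HTTP clients compute `Content-Length` on their own; setting it by hand mainly matters when the request is built manually.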

The return format of the translated strings is unusual when a single request contains several sentences. The individual sentences are wrapped in <i> and <b> HTML tags.

Also, Google Chrome does not send the entire HTML to the API; it leaves attribute values such as href out of the request (and sets indices instead, so that the tags can be matched up again on the client side later).

If you change the value of the POST key client from te_lib (Google Chrome) to webapp (the Google Translation website), you get the final translated string.

The problem is that you are far more likely to hit a rate limit this way than with te_lib (for comparison: with webapp the limit is reached after 40,000 characters, with te_lib there is no rate limiting). So we need to take a closer look at how Chrome parses the result. We find it here in element_main.js:

b7739bf50b2edcf636c43a8f8910def9

If you send the entire HTML code to the API, it leaves the attributes in place in the translated response. We therefore do not have to imitate the whole parsing behavior, but only need to extract the final translated string from the response. To do this, we build a small HTML tag parser that discards the outermost <i> tags including their content and removes the outermost <b> tags. With this knowledge (and after installing the dependencies with composer require fzaninotto/faker vielhuber/stringhelper) we can now build a server-side version of the translation API:

b7739bf50b2edcf636c43a8f8910def9
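
The tag handling described above can also be sketched on its own. The following Python example is an illustrative reimplementation, not the author's script: the function name `extract_translation`, the regular expression, the whitespace normalisation and the sample response are all assumptions. It drops every outermost <i> block together with its content, strips the outermost <b> tags and passes all other markup through unchanged.

```python
import re

# Group 1 is "/" for closing tags, group 2 is the tag name.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def extract_translation(response: str) -> str:
    out = []
    i_depth = 0  # > 0 while inside an outermost <i> block (content is discarded)
    b_depth = 0  # nesting depth of <b> tags; depth 1 is the outermost wrapper
    pos = 0
    for match in TAG_RE.finditer(response):
        if i_depth == 0:
            out.append(response[pos:match.start()])
        pos = match.end()
        closing = match.group(1) == "/"
        name = match.group(2).lower()

        if i_depth > 0:
            # Inside a discarded block: only track nested <i> to find its end.
            if name == "i":
                i_depth += -1 if closing else 1
            continue
        if name == "i" and not closing and b_depth == 0:
            i_depth = 1  # start of an outermost <i> block
            continue
        if name == "b":
            if closing:
                if b_depth > 1:
                    out.append(match.group(0))  # inner </b> belongs to the text
                b_depth = max(b_depth - 1, 0)
            else:
                b_depth += 1
                if b_depth > 1:
                    out.append(match.group(0))  # inner <b> belongs to the text
            continue
        out.append(match.group(0))  # any other tag is passed through

    if i_depth == 0:
        out.append(response[pos:])
    # Collapse whitespace left behind by removed blocks (an assumption).
    return re.sub(r"\s+", " ", "".join(out)).strip()


sample = "<i>Hallo Welt.</i> <b>Hello World.</b> <i>Wie geht es dir?</i> <b>How are you?</b>"
print(extract_translation(sample))
```

A depth counter is used instead of a plain search-and-replace so that <i> or <b> tags belonging to the translated markup itself survive inside the outer wrapper.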

The following are the results of an initial test carried out on five different systems with different bandwidths and IP addresses:

Characters | Characters per request | Duration | Error rate | Cost via official API
13.064.662 | ~250 | 03:36:17h | 0% | 237,78€
24.530.510 | ~250 | 11:09:13h | 0% | 446,46€
49.060.211 | ~250 | 20:39:10h | 0% | 892,90€
99.074.487 | ~1000 | 61:24:37h | 0% | 1803,16€
99.072.896 | ~1000 | 62:22:20h | 0% | 1803,13€
Σ 284.802.766 | Ø 550 | Σ 159:11:37h | 0% | Σ 5183,41€
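
The totals in the last row can be re-derived from the five runs. The short Python check below sums the character counts and durations from the table and, as an addition that is not in the original, derives the average throughput. The costs are left out because they depend on the official price per character, which the post does not state.

```python
# (characters, duration "HH:MM:SS") for the five test runs in the table
RUNS = [
    (13_064_662, "03:36:17"),
    (24_530_510, "11:09:13"),
    (49_060_211, "20:39:10"),
    (99_074_487, "61:24:37"),
    (99_072_896, "62:22:20"),
]


def to_seconds(duration: str) -> int:
    """Convert "HH:MM:SS" (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_duration(total_seconds: int) -> str:
    """Convert seconds back to "H:MM:SS"."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


total_chars = sum(chars for chars, _ in RUNS)
total_seconds = sum(to_seconds(duration) for _, duration in RUNS)

print(total_chars)                         # 284802766
print(to_duration(total_seconds))          # 159:11:37
print(round(total_chars / total_seconds))  # average characters per second
```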

Note: This blog post, including all scripts, was written for testing purposes only. Do not use the scripts in production; work with the official Google Translation API instead.
