Cleaning up Git history, part 2

Sensitive data or excessive storage consumption: there are good reasons to rewrite Git history. In a previous blog post, I explained how to remove files from Git history using BFG. BFG's weak point is its lack of support for full paths: it matches on names only, so you cannot remove one specific file or folder inside a subfolder from history. Time, then, to look for an alternative.


Besides git filter-branch, which is officially no longer recommended, git-filter-repo is one of the tools for cleaning up history. After a short installation, we first analyze the repository and find, for example, the largest folders in its history:

git filter-repo --analyze

The folder .git/filter-repo/analysis now contains a number of TXT files:

  • directories-all-sizes.txt
  • extensions-all-sizes.txt
  • path-all-sizes.txt
  • ...

The file directories-all-sizes.txt is worth a closer look:

=== All directories by reverse size ===

Format: unpacked size, packed size, date deleted, directory name

  4624417043 3796607988 <present> <toplevel>
  4475940396 3778033787 <present> wp-content
  4060236681 3694449320 <present> wp-content/uploads
   305163809   70576241 <present> wp-content/plugins
   123818107   15442735 <present> wp-includes
...
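
Skimming byte counts in that report gets tedious for larger repositories. The following is a minimal sketch, not part of git-filter-repo itself, that prints the first few data rows with the packed size converted to MiB. The helper name top_dirs is hypothetical, and the sketch assumes the four-column layout shown above.

```shell
# Print the first N data rows of a git-filter-repo directory report,
# with the packed size (second column) converted to MiB.
# Usage: top_dirs <report-file> [count]
top_dirs() {
  report=$1
  count=${2:-5}
  # Data rows start with two integers (unpacked size, packed size);
  # the title and "Format:" lines do not match and are skipped.
  awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
         # Column 3 is the deletion date; everything after it is the
         # directory name (re-joined in case it contains spaces).
         name = $4
         for (i = 5; i <= NF; i++) name = name " " $i
         printf "%8.1f MiB  %s\n", $2 / 1048576, name
       }' "$report" | head -n "$count"
}

# Example:
#   top_dirs .git/filter-repo/analysis/directories-all-sizes.txt 3
```

The rows are printed in the order the report lists them, so no extra sorting is done here.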

It often happens that data deleted from HEAD long ago still sits unnoticed in the history (for example, the WordPress media folder wp-content/uploads/, or accidentally pushed node_modules or vendor folders).

After the cleanup, git-filter-repo generally recommends pushing to a new, empty repository. The project lists many reasons why this makes sense and avoids a lot of problems. Still, you may want to push to the same repository, and that works too if you follow a few pointers.

Importantly, the major code hosting platforms GitHub and GitLab recommend different approaches, and these also differ from each other in places. On GitHub, for example, we remove wp-content/uploads/ from history with git-filter-repo using the following steps:

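mkdir tmp-repo
cd tmp-repo
git clone git@github.com:foo/bar.git .
# filter-repo removes the "origin" remote, so back up the config first
cp .git/config /tmp/config-backup
# drop wp-content/uploads/ from every commit
git filter-repo --invert-paths --path wp-content/uploads/
# option 1: same repo
  mv /tmp/config-backup .git/config
  git push origin --force --all
  # --all covers branches only; push the rewritten tags as well
  git push origin --force --tags
# option 2: new repo
  git remote add origin git@github.com:foo/bar-new.git
  git push origin --force --all
  git push origin --force --tags
cd ..
rm -rf tmp-repo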
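
Before force-pushing, it can be reassuring to confirm that the rewritten clone no longer references the folder anywhere. The following is a minimal sketch using plain git rev-list; the helper name path_gone_from_history is hypothetical and not part of git-filter-repo.

```shell
# Exit status 0 if no commit reachable from any ref touches the given path,
# 1 if at least one commit still does, 2 if git itself fails.
# Run inside the repository.
path_gone_from_history() {
  # --all walks every ref (branches, tags, ...);
  # --max-count=1 stops at the first commit that touches the path.
  first_hit=$(git rev-list --all --max-count=1 -- "$1") || return 2
  [ -z "$first_hit" ]
}

# Example, run between the filter-repo call and the push:
#   path_gone_from_history wp-content/uploads/ && echo "history is clean"
```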

Now we can also check the size on the remote (the size shown via the API and in the UI can take up to 24 hours to update). To do this, open the repository settings (if the repository belongs to an organization, you have to add your own account to that organization first). There we see the size:

GitHub: disk space before the cleanup
GitHub: disk space after the cleanup

The procedure is slightly different on GitLab:

mkdir tmp-repo
cd tmp-repo
# option 1: same repo
  # Settings > General > Advanced > Export project > download tar.gz file into tmp-repo
  tar xzf 20*.tar.gz
  git clone --bare --mirror project.bundle
  cd project.git
  git filter-repo --invert-paths --path wp-content/uploads/
  cp ./filter-repo/commit-map /tmp/commit-map-1
  # copying the commit-map has to be done after every single command from git filter-repo
  # you need the commit-map files later
  git remote remove origin
  git remote add origin git@gitlab.com:foo/bar.git
  # Settings > Repository > Protected branches >
  # enable "Allowed to force push to main/master"
  git push origin --force 'refs/heads/*'
  git push origin --force 'refs/tags/*'
  git push origin --force 'refs/replace/*'
  # Settings > Repository > Protected branches >
  # disable "Allowed to force push to main/master"
  date
  # wait 30 minutes (😱)
  date
  # Settings > Repository > upload /tmp/commit-map-X
  # leave project.git so that the final "cd .." ends up above tmp-repo
  cd ..
# option 2: new repo
  git clone git@gitlab.com:foo/bar.git .
  git filter-repo --invert-paths --path wp-content/uploads/
  git remote add origin git@gitlab.com:foo/bar-new.git
  # Settings > Repository > Protected branches >
  # enable "Allowed to force push to main/master"
  git push origin --force --all
  # --all covers branches only; push the rewritten tags as well
  git push origin --force --tags
  # Settings > Repository > Protected branches >
  # disable "Allowed to force push to main/master"
cd ..
rm -rf tmp-repo

After waiting another ~5 minutes, we can check the storage usage under Settings > Usage Quotas:

GitLab: disk space before the cleanup
GitLab: disk space after the cleanup

After the cleanup, it is important that all developers involved take part in the final steps: if a user now does a normal push from their own local copy, the large files end up in the central repository again. The following 3 options are therefore recommended:

  • "Poor man's fresh clone"
    • rm -rf .git && git clone xxx temp && mv temp/.git ./.git && rm -rf temp
    • For changed files (depending on the application): git checkout -- . or git add -A . && git commit -m "Push obscure file changes." && git push
  • "Start from scratch"
    • rm -rf repo && git clone xxx repo
  • "Ugly pull with rebase"
    • git pull -r
    • Here you still have the uncleaned history locally, but you generally do not accidentally overwrite the remote repository with the large local variant.

Given the current quotas (especially GitLab's new limits), it is certainly worth checking the size of your repository history and cleaning it up if necessary:

GitHub Free | GitLab Free
Max file size limit: 100 MB
Max repo size limit: 5,000 MB
Max repo count limit:
Max overall size limit: 5,000 MB
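
To get a rough feeling for where a local clone stands relative to such a limit, the output of git count-objects can be summed up. This is a minimal sketch under stated assumptions: the helper names repo_size_kib and over_limit are hypothetical, the limit is treated as MiB for simplicity, and the size a hosting platform reports can differ from the local number.

```shell
# Total on-disk size of a repository's object store in KiB
# (loose objects plus packs), as reported by `git count-objects -v`.
# Usage: repo_size_kib [repo-dir]
repo_size_kib() {
  git -C "${1:-.}" count-objects -v |
    awk '$1 == "size:" || $1 == "size-pack:" { total += $2 }
         END { print total + 0 }'
}

# Exit status 0 if a size in KiB exceeds a limit given in MB
# (assumption: the limit is interpreted as MiB, i.e. limit * 1024 KiB).
# Usage: over_limit <size-kib> <limit-mb>
over_limit() {
  [ "$1" -gt $(( $2 * 1024 )) ]
}

# Example, using the 5,000 MB figure from the table as the threshold:
#   over_limit "$(repo_size_kib)" 5000 && echo "time to clean up"
```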

Finally, it is also worth taking a look at free, self-hosted alternatives such as Gitea. With little effort, you can host your own Git instance on a very slim server (GUI secured via SSL, backups included, control via a powerful API), which is also very easy to administer and excels in terms of data protection. Here too, by the way, git-filter-repo can be used to slim down repositories:

mkdir tmp-repo
cd tmp-repo
git clone git@git.tld.com:foo/bar.git .
cp .git/config /tmp/config-backup
git filter-repo --invert-paths --path wp-content/uploads/
# option 1: same repo
  mv /tmp/config-backup .git/config
  git push origin --mirror
  # login on the remote command line and run in the repo-folder
  sudo -u git git reflog expire --expire=now --all
  sudo -u git git gc --aggressive --prune=now
  # if you face memory limit issues, modify the git configuration
  sudo -u git git config --global pack.windowMemory "100m"
  sudo -u git git config --global pack.packSizeLimit "100m"
  sudo -u git git config --global pack.threads "1"
  # if in web ui the size does not change, make a slight
  # modification to a file and push again normally
# option 2: new repo
  git remote add origin git@git.tld.com:foo/bar-new.git
  git push origin --force --all
  # --all covers branches only; push the rewritten tags as well
  git push origin --force --tags
cd ..
rm -rf tmp-repo

The command sudo -u git git gc --aggressive --prune=now is especially important here (a git gc run via cron otherwise uses a very long prune time of 2 weeks).
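
The 2 weeks correspond to Git's default for the gc.pruneExpire setting. The following is a minimal sketch for checking which value applies to a repository; the helper name effective_prune_expiry is hypothetical, and on a server like the one above it would be run as the git user.

```shell
# Print the prune expiry that `git gc` uses for a repository:
# the configured gc.pruneExpire, or Git's built-in default if it is unset.
# Usage: effective_prune_expiry [repo-dir]
effective_prune_expiry() {
  git -C "${1:-.}" config --get gc.pruneExpire || echo "2.weeks.ago"
}

# Example:
#   effective_prune_expiry /path/to/repo.git
```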
