---
title: "Mirroring git repositories with Gitea"
date: 2024-06-27T10:05:46-04:00
draft: true
tags:
    - gitea
    - curl
    - yq
categories:
    - development
---

[Gitea][1] is an awesome self-hosted git forge. I use its [pull-mirror][2]
feature to mirror many git repos (mostly from GitHub). In this post, I want to
share a few maintenance scripts I ran that talk to the [Gitea API][14] using
[yq][] and [curl][].

<!--more-->

# TODO

  * Change mirror interval
  * Disable actions by default for mirror repos
  * Use jo https://github.com/jpmens/jo
  * Get just the latest mirror and apply the settings

## Setup

### API token

Go to `https://gitea.balki.me/user/settings/applications` and generate a new
token with write permission for repositories.

# TODO
set base url
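
One way to avoid repeating the base URL and token in every command is a small
wrapper function. This is only a sketch: `GITEA_URL`, `GITEA_TOKEN` and
`gitea_api` are names I made up for illustration, not something Gitea or curl
provides.

```bash
# Hypothetical names: GITEA_URL, GITEA_TOKEN and gitea_api are illustrative only.
GITEA_URL="${GITEA_URL:-https://gitea.balki.me}"

# gitea_api PATH [extra curl args...]
# Calls $GITEA_URL/api/v1/PATH with the token header attached.
gitea_api() {
    local path="$1"
    shift
    # ${GITEA_URL%/} drops a trailing slash, ${path#/} drops a leading one.
    # ${GITEA_TOKEN:?...} aborts with a message if the token is not set.
    curl -sS "${GITEA_URL%/}/api/v1/${path#/}" \
        -H "accept: application/json" \
        -H "Authorization: token ${GITEA_TOKEN:?set GITEA_TOKEN first}" \
        "$@"
}

# Usage (needs network access and a valid token):
#   gitea_api "repos/search?limit=100&mode=mirror" -o mirror-repos.json
```

The rest of this post keeps the full URL and `$TOKEN` spelled out so each
command can be copied on its own.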

## Why Mirror?

1. Remote sites may disappear one day.
2. Better local code search. GitHub does not allow code search without signing
   in.
3. Tools such as vim plugins can fetch from the local URLs, which is better for
   privacy.
4. Set up notifications when a repository creates a new tag.

## Dumb crawlers problem

My Gitea [instance][3] is public and originally had all mirror repos public as
well. This caused a lot of network traffic from bots.

I created an [organization][4] without public visibility and made it own all
the mirror repos.

```
❯ yq --version
yq (https://github.com/mikefarah/yq/) version v4.44.1

❯ curl -V | head -c 11
curl 8.8.0 
```

## Hooks for notification

Gitea [hooks][5] can send a notification whenever a mirror picks up a new tag.

Let's first download the list of all mirror repos.

```bash
TOKEN=d88446542e844f4da4ba75bbb85bd694a71907b5

curl "https://gitea.balki.me/api/v1/repos/search?limit=100&mode=mirror" \
	-H "accept: application/json" \
	-H "Authorization: token $TOKEN" \
	-o mirror-repos.json
```

References

1. API doc: https://gitea.balki.me/api/swagger#/repository/repoSearch
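
The request above asks for up to 100 repos in one go. A Gitea server can cap
the API page size (the `MAX_RESPONSE_ITEMS` setting), so with more mirrors than
the cap, one request will not return everything. Below is a rough sketch of
walking the `page` parameter until a page comes back empty. `fetch_page` and
`all_mirror_names` are hypothetical helper names, and `limit=50` is an assumed
page size, not a value taken from my instance.

```bash
# Hypothetical helper: fetch one page of mirror repos as JSON.
# limit=50 is an assumption; use whatever page size your server allows.
fetch_page() {
    curl -sS "https://gitea.balki.me/api/v1/repos/search?limit=50&mode=mirror&page=$1" \
        -H "accept: application/json" \
        -H "Authorization: token $TOKEN"
}

# Print the full_name of every mirror repo, one per line.
# Stops at the first page that yields no names (or fails to parse).
all_mirror_names() {
    local page=1 names
    while names="$(fetch_page "$page" | yq -p json -r '.data[].full_name')" \
        && [ -n "$names" ]; do
        printf '%s\n' "$names"
        page=$((page + 1))
    done
}

# Usage (needs network access and $TOKEN):
#   all_mirror_names | while read -r repo; do echo "$repo"; done
```

The loops later in this post read names from `mirror-repos.json`; with a setup
like this they could read from `all_mirror_names` instead.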

Create a [hook][5] manually in one repo, then fetch it through the API:

```bash
curl -s "https://gitea.balki.me/api/v1/repos/MirrorWatch/snac2/hooks?page=1&limit=10" \
        -H "Authorization: token $TOKEN" | yq -P -oj
```

Sample Output:
```json
[
  {
    "id": 32,
    "type": "telegram",
    "branch_filter": "tag",
    "config": {
      "content_type": "json",
      "url": "https://api.telegram.org/bot1169894068:J1JVbV3f2vEQpdnPqFANfhjWZrFuUCJs1EW/sendMessage?chat_id=-1008910751069"
    },
    "events": [
      "create"
    ],
    "authorization_header": "",
    "active": true,
    "updated_at": "2024-06-20T20:54:33-04:00",
    "created_at": "2024-06-20T20:54:33-04:00"
  }
]
```

Now loop through all mirror repos and add the same webhook. Remove unwanted fields like `id` and `created_at` from the request body.

```bash
yq -r '.data[] | .full_name' mirror-repos.json | while read -r repo; do
	echo "$repo"

	curl "https://gitea.balki.me/api/v1/repos/$repo/hooks" \
		-H "Authorization: token $TOKEN" \
		--json @- <<-EOM
			{
				"active": true,
				"branch_filter": "tag",
				"config": {
					"content_type": "json",
					"url": "https://api.telegram.org/bot1169894068:J1JVbV3f2vEQpdnPqFANfhjWZrFuUCJs1EW/sendMessage?chat_id=-1008910751069"
				},
				"events": [
					"create"
				],
				"type": "telegram"
			}
		EOM

	echo "============"
done
```
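
The webhook URL contains the bot token, so it can be handy to keep it out of
the loop body. A minimal sketch, assuming the URL contains no `"` or `\`
characters (it is pasted into the JSON as-is): `hook_payload` and
`TELEGRAM_HOOK_URL` are names invented for this example.

```bash
# Hypothetical helper: print the webhook request body for a given URL.
# Assumes the URL needs no JSON escaping.
hook_payload() {
    local url="$1"
    cat <<EOF
{
  "active": true,
  "branch_filter": "tag",
  "config": { "content_type": "json", "url": "$url" },
  "events": ["create"],
  "type": "telegram"
}
EOF
}

# Usage inside the loop above (needs network access, $TOKEN and
# a TELEGRAM_HOOK_URL variable that you set yourself):
#   hook_payload "$TELEGRAM_HOOK_URL" |
#       curl "https://gitea.balki.me/api/v1/repos/$repo/hooks" \
#           -H "Authorization: token $TOKEN" --json @-
```

Because the body is piped into curl, curl reads the pipe rather than the list
of repo names that feeds the `while read` loop.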

## Fixing issue and PR links

Fix the issue URL setting in the first repo manually, as shown [here][7].

Get the JSON representation.

```bash
yq '.data[] | .external_tracker' mirror-repos.json | head
```

Sample output

```json
{
  "external_tracker_url": "https://github.com/caddyserver/caddy/issues",
  "external_tracker_format": "https://github.com/caddyserver/caddy/issues/{index}",
  "external_tracker_style": "numeric",
  "external_tracker_regexp_pattern": ""
}
```

Now loop through all repos and update them, touching only GitHub mirrors that
have not been updated yet (those that still report an `internal_tracker`).

```bash
yq -r '.data[] 
| select(.original_url == "*github*" and has("internal_tracker") )
| "\(.full_name) \(.original_url)"' mirror-repos.json | while read -r repo og; do
	echo "Repo is $repo and github origin url is $og"
	curl "https://gitea.balki.me/api/v1/repos/$repo" \
		-H "Authorization: token $TOKEN" \
		-X PATCH \
		--json @- <<-EOM
			{
				"external_tracker": {
					"external_tracker_url": "${og%.git}/issues",
					"external_tracker_format": "${og%.git}/issues/{index}",
					"external_tracker_style": "numeric",
					"external_tracker_regexp_pattern": ""
				}
			}
		EOM
done

# Disable Actions on every mirror repo; jo builds the {"has_actions":false} body
yq -r '.data[] | .full_name' mirror-repos.json | while read -r repo; do
	echo "$repo"
	jo has_actions=false | curl "https://gitea.balki.me/api/v1/repos/$repo" \
		-H "Authorization: token $TOKEN" \
		-X PATCH \
		--json @-
done
```
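
The tracker URLs in the first loop come from the `${og%.git}` parameter
expansion, which removes a trailing `.git` if there is one and otherwise leaves
the value alone. Here is a small standalone illustration; `tracker_urls` is a
hypothetical helper name used only here.

```bash
# Hypothetical helper: derive the two tracker URLs from a clone URL.
tracker_urls() {
    local og="$1"
    local base="${og%.git}" # drop one trailing ".git", if present
    printf '%s\n' "$base/issues" "$base/issues/{index}"
}

# Both calls print the same two lines:
#   https://github.com/caddyserver/caddy/issues
#   https://github.com/caddyserver/caddy/issues/{index}
tracker_urls "https://github.com/caddyserver/caddy.git"
tracker_urls "https://github.com/caddyserver/caddy"
```

A clone URL with a trailing slash is not handled by this expansion and would
need its own cleanup.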

### Doc links
* yq: [select][8], [has][9], [string interpolation][10]
* bash: [parameter expansion][11], [here-doc][12]
* curl: [`--json`][13]


[1]: https://github.com/go-gitea/gitea
[2]: https://docs.gitea.com/usage/repo-mirror#pulling-from-a-remote-repository
[3]: https://gitea.balki.me
[4]: https://docs.gitea.com/usage/permissions#organization-repository
[5]: https://docs.gitea.com/usage/webhooks
[6]: https://docs.gitea.com/development/api-usage
[7]: https://github.com/go-gitea/gitea/issues/18986
[8]: https://mikefarah.gitbook.io/yq/operators/select
[9]: https://mikefarah.gitbook.io/yq/operators/has#select-checking-for-existence-of-deep-paths
[10]: https://mikefarah.gitbook.io/yq/operators/string-operators#interpolation
[11]: https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
[12]: https://www.gnu.org/software/bash/manual/html_node/Redirections.html#Here-Documents
[13]: https://everything.curl.dev/http/post/json.html
[14]: https://gitea.balki.me/api/swagger#/repository/repoSearch
[yq]: https://github.com/mikefarah/yq
[curl]: https://curl.se