June 06, 2023

Sync GitHub Repos to Notion with Temporal Schedules

Loren Sands-Ramshaw


At our last company hackathon, I built an internal tool that syncs your GitHub org's list of repositories to a Notion database. It takes this list:

List of organization repositories on GitHub

and makes this database in Notion:

Notion DB with the same repositories

The database serves two purposes:

  1. Tracking which team owns each repo, and whom to contact if you have a question. A couple of questions that come up are:
  • "This sample repo is out of date—who is responsible for it?"
  • "A user asked about their issue/PR on this repo—which team is responsible for responding?"

The two manually updated columns we use for this are "Owner" with team multi-select and "POC" with Person multi-select.

  2. Tracking which GitHub teams or individuals have which roles on each repo. It's hard to view all that info on github.com, and having it all in one table makes it much easier. The columns we use for this are:
  • Teams admins
  • Individual admins
  • Teams with role: maintainer
  • Individuals with role: maintainer
  • Teams with role: write
  • Individuals with role: write
  • Teams with role: triage
  • Individuals with role: triage
  • Teams with role: read
  • Individuals with role: read

Since, by default, everyone in the org has write access to all repos, org members are excluded from the "Individuals with role: write" column to reduce noise.
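The grouping and filtering described above can be sketched roughly like this. It's a minimal illustration, not the code from the repo: the Role and Collaborator types and the individualsByRole helper are hypothetical names, and the role labels simply mirror the column names.

```typescript
// Hypothetical shapes for illustration; the real project's types may differ.
type Role = 'admin' | 'maintainer' | 'write' | 'triage' | 'read'

interface Collaborator {
  login: string
  role: Role
}

// Group individual collaborators by role. Org members are dropped from the
// "write" group because they already have write access by default.
function individualsByRole(
  collaborators: Collaborator[],
  orgMembers: Set<string>
): Record<Role, string[]> {
  const byRole: Record<Role, string[]> = {
    admin: [],
    maintainer: [],
    write: [],
    triage: [],
    read: [],
  }
  for (const { login, role } of collaborators) {
    if (role === 'write' && orgMembers.has(login)) continue
    byRole[role].push(login)
  }
  return byRole
}

const columns = individualsByRole(
  [
    { login: 'alice', role: 'admin' },
    { login: 'bob', role: 'write' }, // org member: filtered out
    { login: 'carol', role: 'write' }, // outside collaborator: kept
  ],
  new Set(['alice', 'bob'])
)
console.log(columns.write) // [ 'carol' ]
```

Only the write column is filtered, so an org member who is also an admin still shows up under "Individual admins".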

Implementation

Sync function

The one-way sync is implemented with TypeScript code (temporalio/github-repo-notion-sync) that fetches data from the GitHub API and then updates the Notion database via Notion's client library:

sync.ts

export async function syncGithubToNotion() {
  const [repos, teammates] = await Promise.all([getRepos(), getTeammates()])
  const withContributors = await addContributors(repos)
  const completeRepos = await addCollaboratorsAndTeams(
    withContributors,
    teammates
  )
  const transformedRepos = transformRepos(completeRepos)

  await updateNotion(transformedRepos)
}

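The syncGithubToNotion function:

  1. Fetches the org's repos and teammates from GitHub in parallel (getRepos, getTeammates)
  2. Adds contributors to each repo (addContributors)
  3. Adds collaborators and teams to each repo (addCollaboratorsAndTeams)
  4. Transforms the repo data (transformRepos)
  5. Updates the Notion database (updateNotion)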

Running on a schedule

syncGithubToNotion runs daily at 6am. The easiest way to set that up would have been a GitHub Action that runs on a schedule:

name: run daily at 6am

on:
  schedule:
    - cron: '0 6 * * *'

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - env:
          NOTION_TOKEN: ${{ secrets.NOTION_TOKEN }}
          GITHUB_TOKEN: ${{ secrets.GH_TOKEN }}
        run: npm start

However, GitHub Action crons aren't very reliable:

Theo and swyx tweets on reliability

So I used a new feature of Temporal called Schedules:

create-or-update-schedule.ts

await client.schedule.create({
  action: {
    type: 'startWorkflow',
    workflowType: syncGithubToNotion,
    taskQueue,
    workflowId: 'gh-notion-sync',
  },
  scheduleId: 'gh-notion-sync',
  spec: {
    calendars: [
      {
        comment: 'daily at 6am',
        hour: 6,
      },
    ],
  },
})
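To make "daily at 6am" concrete, here is a small sketch of when the next run would fall. It assumes the time is interpreted in UTC (which, as far as I know, is the default for both GitHub's cron trigger and a Temporal calendar spec when no time zone is set) and that unset minute and second fields default to 0. nextDailyRun is a hypothetical helper for illustration, not part of any SDK.

```typescript
// Hypothetical helper: next occurrence of hourUtc:00:00 (UTC) strictly after `now`.
function nextDailyRun(now: Date, hourUtc: number): Date {
  const next = new Date(
    Date.UTC(
      now.getUTCFullYear(),
      now.getUTCMonth(),
      now.getUTCDate(),
      hourUtc,
      0,
      0
    )
  )
  // If today's slot has already passed (or is exactly now), move to tomorrow.
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 1)
  }
  return next
}

console.log(nextDailyRun(new Date('2023-06-06T05:00:00Z'), 6).toISOString())
// 2023-06-06T06:00:00.000Z
console.log(nextDailyRun(new Date('2023-06-06T06:00:00Z'), 6).toISOString())
// 2023-06-07T06:00:00.000Z
```

If you want 6am in your local time instead, you'd need to set a time zone on the schedule rather than rely on the default.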

The Schedule reliably runs syncGithubToNotion daily at 6am. Since syncGithubToNotion is a durable function, if any part of it fails, that part is automatically retried until it succeeds, and then the function continues. It runs so reliably that even if the machine loses power on the third line of the function, the function will continue from the third line on another machine. Perhaps that's more reliability than I needed for this particular internal tool, but it was a good use case for Temporal Schedules and a good demo of durable functions, which you can use for your application's backend code if you want it to run extremely reliably! 😃 (For an intro and an explanation of how this works, see Building Reliable Distributed Systems in Node.)
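To give a feel for the "retried until it succeeds" part, here is a plain TypeScript sketch of retrying a single step. It is only an illustration of the behavior, not how Temporal implements it: retryUntilSuccess and flakyFetch are hypothetical names, and nothing here survives a process crash.

```typescript
// Retry one step until it succeeds, waiting a little between attempts.
// Plain TypeScript for illustration; not Temporal's implementation.
async function retryUntilSuccess<T>(
  step: () => Promise<T>,
  delayMs = 1000
): Promise<T> {
  for (;;) {
    try {
      return await step()
    } catch {
      await new Promise((resolve) => setTimeout(resolve, delayMs))
    }
  }
}

// Hypothetical flaky step: fails twice, then succeeds.
let attempts = 0
async function flakyFetch(): Promise<string> {
  attempts += 1
  if (attempts < 3) throw new Error('temporary failure')
  return 'repos'
}

const pending = retryUntilSuccess(flakyFetch, 10)
pending.then((value) => console.log(value, 'after', attempts, 'attempts'))
```

The difference with a durable function is that progress is persisted, so the retries and the rest of the function can pick up on another machine, which an in-memory loop like this one can't do.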


So that was my project 🤓. It's open source if you'd like to adapt it for your company. Let me know if you have any questions—I'd be happy to help out. I'm @lorendsr on Twitter and @Loren on our community Slack 💃.