feat: Add support for 🐍 Python objects to be updated in README
kvankova authored Nov 3, 2024
2 parents ba27426 + c662b57 commit 9b7eace
Showing 7 changed files with 332 additions and 55 deletions.
122 changes: 98 additions & 24 deletions README.md
@@ -5,19 +5,20 @@
## **Code Embedder**
Seamlessly update code snippets in your **README** files! 🔄📝🚀

[Description](#-description) • [How it works](#-how-it-works) • [Examples](#-examples) • [Setup](#-setup) • [Under the hood](#-under-the-hood)
[Description](#-description) • [How it works](#-how-it-works) • [Setup](#-setup) • [Examples](#-examples) • [Under the hood](#-under-the-hood)
</div>


## 📚 Description

**Code Embedder** is a GitHub Action that automatically updates code snippets in your markdown (`README`) files. It finds code blocks in your `README` that reference specific scripts, then replaces these blocks with the current content of those scripts. This keeps your documentation in sync with your code.

✨ **Key features**
### ✨ Key features
- 🔄 **Automatic synchronization**: Keep your `README` code examples up-to-date without manual intervention.
- 📝 **Section support**: Update specific sections of the script in the `README`.
- 🛠️ **Easy setup**: Simply add the action to your GitHub workflow and format your `README` code blocks.
- 🌐 **Language agnostic**: Works with any programming language or file type.
- 📝 **Section support**: Update only specific sections of the script in the `README`.
- 🧩 **Object support**: Update only specific objects (functions, classes) in the `README`. *The latest version supports only 🐍 Python objects (other languages to be added soon).*


By using **Code Embedder**, you can focus on writing and updating your actual code 💻, while letting the action take care of keeping your documentation current 📚🔄. This reduces the risk of outdated or incorrect code examples in your project documentation.

@@ -43,9 +44,46 @@ You must also add the following comment tags in the script file `path/to/script`
...
[Comment sign] code_embedder:section_name end
```
The comment sign is the one that is used in the script file, e.g. `#` for Python, or `//` for JavaScript. The `section_name` must be unique in the file, otherwise the action will not be able to identify the section.
The comment sign is the one that is used in the script file, e.g. `#` for Python, or `//` for JavaScript. The `section_name` must be unique in the file, otherwise the action will use the first section found.

### 🧩 **Object** updates
In the `README` (or other markdown) file, the object of the script is marked with the following tag:
````md
```language:path/to/script:object_name
```
````

> [!Note]
> The object name must match exactly the name of the object (function, class) in the script file. Currently, only 🐍 Python objects are supported.

> [!Note]
> If there is a section with the same name as any object, the object definition will be used in the `README` instead of the section. To avoid this, **use unique names for sections and objects!**

## 🔧 Setup
To use this action, you need to configure a yaml workflow file in `.github/workflows` folder (e.g. `.github/workflows/code-embedder.yaml`) with the following content:

```yaml
name: Code Embedder

on: pull_request

permissions:
  contents: write

jobs:
  code_embedder:
    name: "Code embedder"
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Run code embedder
        uses: kvankova/[email protected]
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

```

## 💡 Examples

@@ -112,34 +150,70 @@ print("Embedding successful")

With any changes to the section `A` in `main.py`, the code block section is updated in the `README` file with the next workflow run.

## 🔧 Setup
To use this action, you need to configure a yaml workflow file in `.github/workflows` folder (e.g. `.github/workflows/code-embedder.yaml`) with the following content:
### 🧩 Object update
The tag used for object update follows the same convention as the tag for section update, but you provide `object_name` instead of `section_name`. The object name can be a function name or a class name.

```yaml
name: Code Embedder
> [!Note]
> The `object_name` must match exactly the name of the object (function, class) in the script file, including the case. If you define class `Person` in the script, you must use `Person` as the object name in the `README`, not lowercase `person`.
on: pull_request
For example, let's say we have the following `README` file:
````md
# README

permissions:
contents: write
This is a readme.

jobs:
code_embedder:
name: "Code embedder"
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
Function `print_hello` is defined as follows:
```python:main.py:print_hello
```

- name: Run code embedder
uses: kvankova/[email protected]
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Class `Person` is defined as follows:
```python:main.py:Person
```
````

The `main.py` file contains the following code:
```python
...
def print_hello():
    print("Hello, world!")
...

class Person:
    def __init__(self, name):
        self.name = name
    def say_hello(self):
        print(f"Hello, {self.name}!")
...
```

Once the workflow runs, the code block section will be updated in the `README` file with the content of the function `print_hello` and class `Person` from the script located at `main.py` and pushed to the repository 🚀.

````md
# README

This is a readme.

Function `print_hello` is defined as follows:
```python:main.py:print_hello
def print_hello():
    print("Hello, world!")
```

Class `Person` is defined as follows:
```python:main.py:Person
class Person:
    def __init__(self, name):
        self.name = name
    def say_hello(self):
        print(f"Hello, {self.name}!")
```
````

With any changes to the function `print_hello` or class `Person` in `main.py`, the code block sections are updated in the `README` file with the next workflow run.


## 🔬 Under the hood
This action performs the following steps:
1. 🔎 Scans through the markdown (`README`) files to identify referenced script files (full script or section).
1. 🔎 Scans through the markdown (`README`) files to identify referenced script files (full script, section or 🐍 Python object).
1. 📝 Extracts the contents from those script files and updates the corresponding code blocks in the markdown (`README`) files.
1. 🚀 Commits and pushes the updated documentation back to the repository.
99 changes: 72 additions & 27 deletions src/script_content_reader.py
@@ -1,3 +1,4 @@
import ast
import re
from typing import Protocol

@@ -16,68 +17,112 @@ def __init__(self) -> None:
        self._section_end_regex = r".*code_embedder:.*end"

    def read(self, scripts: list[ScriptMetadata]) -> list[ScriptMetadata]:
        script_contents = self._read_full_script(scripts)
        return self._process_scripts(script_contents)
        scripts_with_full_contents = self._read_full_script(scripts)
        return self._process_scripts(scripts_with_full_contents)

    def _read_full_script(self, scripts: list[ScriptMetadata]) -> list[ScriptMetadata]:
        script_contents: list[ScriptMetadata] = []
        scripts_with_full_contents: list[ScriptMetadata] = []

        for script in scripts:
            try:
                with open(script.path) as script_file:
                    script.content = script_file.read()

                script_contents.append(script)
                scripts_with_full_contents.append(script)

            except FileNotFoundError:
                logger.error(f"Error: {script.path} not found. Skipping.")

        return script_contents
        return scripts_with_full_contents

    def _process_scripts(self, scripts: list[ScriptMetadata]) -> list[ScriptMetadata]:
        full_scripts = [script for script in scripts if not script.extraction_part]
        scripts_with_sections = [script for script in scripts if script.extraction_part]
        scripts_with_extraction_part = [script for script in scripts if script.extraction_part]

        if scripts_with_sections:
            scripts_with_sections = self._read_script_section(scripts_with_sections)
        if scripts_with_extraction_part:
            scripts_with_extraction_part = self._update_script_content_with_extraction_part(
                scripts_with_extraction_part
            )

        return full_scripts + scripts_with_sections
        return full_scripts + scripts_with_extraction_part

    def _read_script_section(self, scripts: list[ScriptMetadata]) -> list[ScriptMetadata]:
    def _update_script_content_with_extraction_part(
        self, scripts: list[ScriptMetadata]
    ) -> list[ScriptMetadata]:
        return [
            ScriptMetadata(
                path=script.path,
                extraction_part=script.extraction_part,
                readme_start=script.readme_start,
                readme_end=script.readme_end,
                content=self._extract_section(script),
                content=self._extract_part(script),
            )
            for script in scripts
        ]

    def _extract_section(self, script: ScriptMetadata) -> str:
    def _extract_part(self, script: ScriptMetadata) -> str:
        lines = script.content.split("\n")
        section_bounds = self._find_section_bounds(lines)

        if not section_bounds:
            logger.error(f"Section {script.extraction_part} not found in {script.path}")
        # Try extracting as object first, then fall back to section
        is_object, start, end = self._find_object_bounds(script)
        if is_object:
            return "\n".join(lines[start:end])

        # Extract section if not an object
        start, end = self._find_section_bounds(lines)
        if not self._validate_section_bounds(start, end, script):
            return ""

        start, end = section_bounds
        return "\n".join(lines[start:end])

    def _find_section_bounds(self, lines: list[str]) -> tuple[int, int] | None:
        section_start = None
        section_end = None
    def _validate_section_bounds(
        self, start: int | None, end: int | None, script: ScriptMetadata
    ) -> bool:
        if not start and not end:
            logger.error(
                f"Part {script.extraction_part} not found in {script.path}. Skipping."
            )
            return False

        if not start:
            logger.error(
                f"Start of section {script.extraction_part} not found in {script.path}. "
                "Skipping."
            )
            return False

        if not end:
            logger.error(
                f"End of section {script.extraction_part} not found in {script.path}. "
                "Skipping."
            )
            return False

        return True

    def _find_section_bounds(self, lines: list[str]) -> tuple[int | None, int | None]:
        for i, line in enumerate(lines):
            if re.search(self._section_start_regex, line):
                section_start = i + 1
                start = i + 1
            elif re.search(self._section_end_regex, line):
                section_end = i
                break

        if section_start is None or section_end is None:
            return None

        return section_start, section_end
                return start, i

        return None, None

    def _find_object_bounds(
        self, script: ScriptMetadata
    ) -> tuple[bool, int | None, int | None]:
        tree = ast.parse(script.content)

        for node in ast.walk(tree):
            if (
                isinstance(node, ast.FunctionDef)
                | isinstance(node, ast.AsyncFunctionDef)
                | isinstance(node, ast.ClassDef)
            ):
                if script.extraction_part == getattr(node, "name", None):
                    start = getattr(node, "lineno", None)
                    end = getattr(node, "end_lineno", None)
                    return True, start - 1 if start else None, end

        return False, None, None
16 changes: 16 additions & 0 deletions tests/data/example_python_objects.py
@@ -0,0 +1,16 @@
import re


# Function verifying an email is valid
def verify_email(email: str) -> bool:
    return re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email) is not None


class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    # String representation of the class
    def __str__(self) -> str:
        return f"Person(name={self.name}, age={self.age})"
23 changes: 23 additions & 0 deletions tests/data/expected_readme3.md
@@ -0,0 +1,23 @@
# README 3

This is a test README file for testing the code embedding process.

## Python objects

This section contains examples of Python objects.

```python:tests/data/example_python_objects.py:verify_email
def verify_email(email: str) -> bool:
    return re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$", email) is not None
```

```python:tests/data/example_python_objects.py:Person
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    # String representation of the class
    def __str__(self) -> str:
        return f"Person(name={self.name}, age={self.age})"
```
13 changes: 13 additions & 0 deletions tests/data/readme3.md
@@ -0,0 +1,13 @@
# README 3

This is a test README file for testing the code embedding process.

## Python objects

This section contains examples of Python objects.

```python:tests/data/example_python_objects.py:verify_email
```

```python:tests/data/example_python_objects.py:Person
```
4 changes: 3 additions & 1 deletion tests/test_code_embedding.py
@@ -8,11 +8,13 @@ def test_code_embedder(tmp_path) -> None:
        "tests/data/readme0.md",
        "tests/data/readme1.md",
        "tests/data/readme2.md",
        "tests/data/readme3.md",
    ]
    expected_paths = [
        "tests/data/expected_readme0.md",
        "tests/data/expected_readme1.md",
        "tests/data/expected_readme2.md",
        "tests/data/expected_readme3.md",
    ]

    # Create a temporary copy of the original file
@@ -36,4 +38,4 @@ def test_code_embedder(tmp_path) -> None:
        with open(temp_readme_path) as updated_file:
            updated_readme_content = updated_file.readlines()

        assert expected_readme_content == updated_readme_content
        assert updated_readme_content == expected_readme_content