Make AutoRAG to Monorepo #960

Draft
wants to merge 63 commits into
base: main
63 commits
3d706dc
just commit
bwook00 Nov 18, 2024
deb5d65
just commit
bwook00 Nov 18, 2024
e154268
Merge branch 'main' into Feature/#956
bwook00 Nov 18, 2024
b92172d
add the root directory
Nov 19, 2024
356c253
.gitignore in the autorag source folder
Nov 19, 2024
bc085e4
edit github actions
Nov 19, 2024
a9043fb
fix .env .gitignore
Nov 19, 2024
9a07a44
add root .gitignore
Nov 19, 2024
e2c08fc
set PYTHONPATH at test.yml
Nov 19, 2024
4ba97d3
change the name of the test_base.py
Nov 19, 2024
d9fa4b2
change the VERSION path at docs/conf.py
Nov 19, 2024
9d90aca
Add api to repository
Nov 19, 2024
6f12102
Add api to repository
Nov 19, 2024
8ff9c86
add autorag at pythonpath
Nov 19, 2024
b6c4232
edit gitignore for tracking projects folder
Nov 19, 2024
f32cec3
add README.md at projects folder for tracking projects folder
Nov 19, 2024
e9d5666
add autorag-frontend as git submodule
Nov 19, 2024
7d2b8cb
Do not run API test at github actions
Nov 19, 2024
51354ed
rename: update file path from api/projects/README.md to projects/READ…
Nov 19, 2024
1dd72fb
🚑 fix: Update .gitignore and add .dockerignore and Dockerfile
Nov 19, 2024
c7c0f9b
📝 docs: remove AutoRAG Workflow API documentation and related resources.
Nov 19, 2024
55249ea
✨ feat: Add description for tutorial_1 project
Nov 19, 2024
33cab10
🔧 chore: update .gitignore to exclude .DS_Store
Nov 19, 2024
191e499
🚑 fix: Update project naming convention in README and adjust requirem…
Nov 19, 2024
0b6bfc3
🚑 fix: Update ports and environment variables in docker-compose.yml t…
Nov 19, 2024
270638d
🚑 fix: Update schema.py with corrected field indentation and added 'p…
Nov 19, 2024
60cda5b
🚑 fix: Fix indentation in validate.py for decorator functions.
Nov 19, 2024
62a9eb6
🚑 fix: refactor authentication decorator in auth.py
Nov 19, 2024
c467580
🚑 fix: Correct get_new_trial_dir parameter naming and handle trial di…
Nov 19, 2024
dea7c09
🚑 fix: Corrected import formatting in qa_create.py and standardized f…
Nov 19, 2024
0473cb3
✨ feat: Add dashboard module to autorag package and implement async p…
Nov 19, 2024
5359b4d
🚑 fix: Refactor PandasTrialDB to handle trial operations more efficie…
Nov 19, 2024
b0b1970
move upload file endpoint
Nov 19, 2024
cc56006
turn evaluate_history.py workable again
Nov 19, 2024
06284b2
just reformat and edit ignore files
Nov 19, 2024
31db596
Merge branch 'main' into Feature/#959
Nov 20, 2024
e9a546c
working with uvicorn now
Nov 20, 2024
d15ae04
Merge pull request #966 from Marker-Inc-Korea/Feature/#965
hongsw Nov 20, 2024
81a2f63
Merge branch 'main' into Feature/#959
vkehfdl1 Nov 20, 2024
24db0d7
Add env variable to locate the project folder and resolve new pydanti…
vkehfdl1 Nov 23, 2024
61c2cb8
Add env variable endpoints for managing env variable (#975)
vkehfdl1 Nov 23, 2024
23260f4
upload multiple files at once using key 'files' (#981)
vkehfdl1 Nov 24, 2024
b4d0776
Merge branch 'main' into Feature/#959
Nov 24, 2024
7565f7c
[API] fix validate and evaluation api config, set_trial_config #984 (…
hongsw Nov 25, 2024
0558ab2
Make the default timezone at the API server to UTC (#992)
vkehfdl1 Nov 25, 2024
f4c664b
✨ feat: Add QA document generation task in trial_tasks.py and schema.…
hongsw Nov 25, 2024
cd530bd
Change the api port to 8000 (#1007)
vkehfdl1 Nov 26, 2024
9611073
artifacts/content GET endpoint for sending raw_data files (#1008)
vkehfdl1 Nov 26, 2024
21e9577
Change the API server that qa, chunk, and qa contains to the project_…
vkehfdl1 Nov 28, 2024
a5c977e
Merge main into Feature/#959
vkehfdl1 Nov 30, 2024
4a0cdfd
Working API with SQL DB (#1016)
vkehfdl1 Dec 1, 2024
1759488
API server refactor to celery with report, streamlit, and qaurt api s…
vkehfdl1 Dec 1, 2024
c048227
add ko version at requirements.txt (#1026)
vkehfdl1 Dec 2, 2024
42df1fc
Update frontend
vkehfdl1 Dec 3, 2024
476a510
Merge branch 'main' into Feature/#959
vkehfdl1 Dec 3, 2024
23f11ee
update for api compatibility
vkehfdl1 Dec 3, 2024
a76aa29
Merge branch 'main' into Feature/#959
vkehfdl1 Dec 4, 2024
02ff52a
update autorag-frontend
vkehfdl1 Dec 8, 2024
9f27257
Add parsed data get endpoint (#1041)
vkehfdl1 Dec 8, 2024
dc5432f
update AutoRAG version to 0.3.11rc3
vkehfdl1 Dec 9, 2024
f34b7d9
update AutoRAG version to 0.3.12
vkehfdl1 Dec 9, 2024
91cdc6c
update autorag frontend to the latest
vkehfdl1 Dec 9, 2024
27d9bdb
Enable the file extensions (data, html, etc.) (#1053)
vkehfdl1 Dec 17, 2024
4 changes: 4 additions & 0 deletions .github/workflows/docker-push.yml
@@ -4,6 +4,10 @@ on:
   push:
     branches: [ "main" ]

+defaults:
+  run:
+    working-directory: ./autorag
+
 env:
   DOCKER_REPO: "autoraghq/autorag"

4 changes: 4 additions & 0 deletions .github/workflows/gpu-docker-push.yml
@@ -4,6 +4,10 @@ on:
   push:
     branches: [ "main" ]

+defaults:
+  run:
+    working-directory: ./autorag
+
 env:
   DOCKER_REPO: "autoraghq/autorag"

4 changes: 4 additions & 0 deletions .github/workflows/publish.yml
@@ -5,6 +5,10 @@ on:
     branches:
       - main

+defaults:
+  run:
+    working-directory: ./autorag
+
 jobs:
   pypi-publish:
     name: upload release to PyPI
8 changes: 5 additions & 3 deletions .github/workflows/test.yml
@@ -40,7 +40,7 @@ jobs:
           sudo apt-get install tesseract-ocr
       - name: Install AutoRAG
         run: |
-          pip install -e '.[ko,dev,parse,ja]'
+          pip install -e './autorag[ko,dev,parse,ja]'
       - name: Install dependencies
         run: |
           pip install -r tests/requirements.txt
@@ -54,6 +54,8 @@ jobs:
           python3 -c "import nltk; nltk.download('averaged_perceptron_tagger_eng')"
       - name: delete tests package
         run: python3 tests/delete_tests.py
-      - name: Run tests
+      - name: Run AutoRAG tests
+        env:
+          PYTHONPATH: ${PYTHONPATH}:./autorag
         run: |
-          python3 -m pytest -o log_cli=true --log-cli-level=INFO -n auto tests/
+          python3 -m pytest -o log_cli=true --log-cli-level=INFO -n auto tests/autorag
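Note on the `PYTHONPATH` value in the test step: the `./autorag` entry is what lets the tests import the package from its new location. The sketch below shows how a `PYTHONPATH`-style string splits into search-path entries, assuming the `:` separator used on Linux runners. `split_pythonpath` and `existing_entries` are hypothetical helpers for illustration, not part of AutoRAG. As far as I understand, a workflow `env:` value is not shell-expanded, so `${PYTHONPATH}` probably arrives as a literal entry that names no directory and is simply skipped.

```python
import os


def split_pythonpath(value: str, sep: str = ":") -> list:
    """Split a PYTHONPATH-style string into its non-empty entries."""
    return [entry for entry in value.split(sep) if entry]


def existing_entries(value: str, base_dir: str, sep: str = ":") -> list:
    """Resolve entries against base_dir and keep only real directories."""
    resolved = []
    for entry in split_pythonpath(value, sep):
        path = os.path.normpath(os.path.join(base_dir, entry))
        if os.path.isdir(path):
            resolved.append(path)
    return resolved


# A literal "${PYTHONPATH}" entry does not name a directory, so it is dropped.
print(existing_entries("${PYTHONPATH}:.", os.getcwd()))
```

If the intent is to extend an existing `PYTHONPATH`, exporting it inside the `run:` script would let the shell do the expansion.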
10 changes: 6 additions & 4 deletions .gitignore
@@ -106,8 +106,10 @@ ipython_config.py
 #pdm.lock
 # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
 # in version control.
-# https://pdm.fming.dev/#use-with-ide
+# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
 .pdm.toml
+.pdm-python
+.pdm-build/

 # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
 __pypackages__/
@@ -158,10 +160,10 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 .idea/
-pytest.ini
 .DS_Store
-projects/tutorial_1
-!projects/tutorial_1/config.yaml
+pytest.ini
+projects
+test_projects

 # Visual Studio Code
 .vscode/
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "autorag-frontend"]
path = autorag-frontend
url = https://github.com/Auto-RAG/autorag-frontend.git
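For reviewers who want to check the submodule entry from a script, here is a minimal sketch that reads this `.gitmodules` content with Python's standard `configparser`. Git's config syntax is not exactly INI, so this is an approximation that happens to work for a simple file like this one. `read_submodules` is a hypothetical helper name.

```python
import configparser

# The .gitmodules content added in this PR.
GITMODULES = """\
[submodule "autorag-frontend"]
path = autorag-frontend
url = https://github.com/Auto-RAG/autorag-frontend.git
"""


def read_submodules(text: str) -> dict:
    """Map each submodule name to its key/value settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    submodules = {}
    for section in parser.sections():
        # Section names look like: submodule "autorag-frontend"
        kind, _, quoted = section.partition(" ")
        if kind != "submodule":
            continue
        submodules[quoted.strip('"')] = dict(parser[section])
    return submodules


modules = read_submodules(GITMODULES)
print(modules["autorag-frontend"]["path"])
```

After cloning, the submodule directory stays empty until `git submodule update --init` is run.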
2 changes: 1 addition & 1 deletion README.md
@@ -279,7 +279,7 @@ First, you need to set the config YAML file for your RAG optimization.

 We highly recommend using pre-made config YAML files for starter.

-- [Get Sample YAML](./sample_config/rag)
+- [Get Sample YAML](autorag/sample_config/rag)
 - [Sample YAML Guide](https://docs.auto-rag.com/optimization/sample_config.html)
 - [Make Custom YAML Guide](https://docs.auto-rag.com/optimization/custom_config.html)

90 changes: 90 additions & 0 deletions api/.dockerignore
@@ -0,0 +1,90 @@
# Git
.git
.gitignore
.gitattributes


# CI
.codeclimate.yml
.travis.yml
.taskcluster.yml

# Docker
docker-compose.yml
Dockerfile
.docker
.dockerignore

# Byte-compiled / optimized / DLL files
**/__pycache__/
**/*.py[cod]

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.cache
nosetests.xml
coverage.xml

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Virtual environment
.env
.venv/
venv/

# PyCharm
.idea

# Python mode for VIM
.ropeproject
**/.ropeproject

# Vim swap files
**/*.swp

# VS Code
.vscode/
projects/
2 changes: 2 additions & 0 deletions api/.env.dev
@@ -0,0 +1,2 @@
AUTORAG_API_ENV=dev
AUTORAG_WORK_DIR=../projects
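The two variables in `api/.env.dev` suggest that the API server finds its project folder through `AUTORAG_WORK_DIR`. A minimal sketch of parsing such a file and resolving the relative path follows. It assumes the path is interpreted relative to the `api/` directory, which this diff does not confirm, and `parse_env`, `resolve_work_dir` and the `/repo/api` location are all hypothetical.

```python
import os
from pathlib import Path

# The api/.env.dev content added in this PR.
ENV_DEV = """\
AUTORAG_API_ENV=dev
AUTORAG_WORK_DIR=../projects
"""


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_work_dir(env: dict, api_dir: Path) -> Path:
    """Resolve AUTORAG_WORK_DIR against api_dir (assumed base directory)."""
    return Path(os.path.normpath(api_dir / env["AUTORAG_WORK_DIR"]))


settings = parse_env(ENV_DEV)
work_dir = resolve_work_dir(settings, Path("/repo/api"))
print(work_dir)
```

Under that assumption, `../projects` points at the repository-level `projects` folder that the root `.gitignore` now excludes.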
File renamed without changes.
165 changes: 165 additions & 0 deletions api/.gitignore
@@ -0,0 +1,165 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
.DS_Store
projects
!projects/README.md
42 changes: 42 additions & 0 deletions api/Dockerfile
@@ -0,0 +1,42 @@
FROM python:3.10-slim

WORKDIR /app

# Install system dependencies + parsing dependencies
RUN apt-get update && apt-get install -y \
    python3-pip \
    build-essential \
    libmagic-dev \
    libgl1-mesa-dev \
    libglib2.0-0 \
    poppler-utils \
    tesseract-ocr \
    tesseract-ocr-eng \
    tesseract-ocr-kor \
    && rm -rf /var/lib/apt/lists/*

# Upgrade pip
RUN python -m pip install --upgrade pip

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install watchfiles pdf2image bert_score
# Install NLTK and download model
RUN pip install nltk && \
    python3 -c "import nltk; nltk.download('punkt_tab')" && \
    python3 -c "import nltk; nltk.download('averaged_perceptron_tagger_eng')"
ENV PYTHONPATH=/app
ENV PYTHONUNBUFFERED=1
# # Copy application code
# COPY . .

# Create directory for celery beat schedule
RUN mkdir -p /app/celerybeat

# Add entrypoint script
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh && \
    sed -i 's/\r$//' /entrypoint.sh  # Remove Windows line endings

ENTRYPOINT ["/entrypoint.sh"]
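On the `sed -i 's/\r$//'` step: a script saved with Windows line endings carries a trailing carriage return on its shebang line, which usually stops the container from starting. A minimal Python sketch of the same normalization is below. The sample script bytes are made up for illustration, since `entrypoint.sh` itself is not part of this excerpt.

```python
def strip_carriage_returns(script: bytes) -> bytes:
    """Drop one trailing CR from every line, mirroring the sed expression."""
    lines = script.split(b"\n")
    cleaned = [line[:-1] if line.endswith(b"\r") else line for line in lines]
    return b"\n".join(cleaned)


# Hypothetical script content with CRLF line endings.
sample = b'#!/bin/sh\r\nexec "$@"\r\n'
print(strip_carriage_returns(sample))
```

A `.gitattributes` rule that forces LF endings on `*.sh` files would be an alternative that fixes the file at checkout time instead of at build time.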