Make AutoRAG to Monorepo #960
Draft
vkehfdl1 wants to merge 63 commits into main from Feature/#959
Added various entries to ignore specific files and directories in both the root directory's .gitignore and the api directory's .dockerignore. Additionally, included a Dockerfile for building a Python 3.10-slim-based API image with specified dependencies and runtime configurations. A docker-compose.yml file was introduced to define services and networks for frontend and API components.
…ents This commit updates the project naming convention in the README file from "AutoRAG API Server" to "AutoRAG-API" for consistency. Additionally, it modifies the version requirement in the `requirements.txt` file for AutoRAG to be greater than or equal to 0.3.8 to ensure compatibility with the latest features.
…o use port 5001 instead of 5000
…ath' field in ParseRequest model
# Conflicts: # autorag/autorag/vectordb/couchbase.py
working with uvicorn now
hongsw previously approved these changes Nov 20, 2024
…c version issues (#971) Co-authored-by: jeffrey <[email protected]>
* add delete endpoint and change to .env based operations
* add api endpoint for gathering all env settings
* load env variable when start each task
* change GET /env to return everything (key & values)

---------

Co-authored-by: jeffrey <[email protected]>
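As a rough illustration of what ".env based operations" and returning "everything (key & values)" could look like, here is a minimal sketch using only the standard library. The function name `parse_env_text` and the sample keys are hypothetical and not taken from the PR; the actual API server may well use a dedicated library instead.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Hypothetical helper: turn the contents of a .env file into a dict."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a KEY=VALUE shape.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Illustrative input only; these keys are made up for the example.
sample = '# comment\nEXAMPLE_KEY=abc\nQUOTED="hello world"\nEMPTY=\n'
settings = parse_env_text(sample)
print(settings)
```

A handler for an endpoint such as `GET /env` could return a dictionary like `settings` directly, which matches the "key & values" wording in the commit message.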
Co-authored-by: jeffrey <[email protected]>
# Conflicts: # autorag/autorag/vectordb/qdrant.py
…987)
* feat: refactor SQL Trial DB from Pandas Trial DB, and Test code
* 🚑 fix: Set correct WORK_DIR based on environment variable
  - Updated the logic in app.py to properly set the `WORK_DIR` based on the environment variable `AUTORAG_API_ENV`. If the environment is 'dev', the `WORK_DIR` will be located at `"../projects"`, otherwise, it will be set to `"projects"`. Additionally, the `.env` file path is now correctly constructed using the determined `WORK_DIR` value.
* 🚑 fix: Update method to use model_validate_json in trial_dict['config'] assignment and update set_trial_config for trial_id with TrialConfig model dump JSON. Add get_all_config_ids and get_all_trial_ids SQL query functions.
* ✨ feat: Add CORS headers and handle OPTIONS requests
  This commit introduces the addition of CORS headers in every response and explicit handling of OPTIONS requests in the API server. Includes setting Access-Control-Allow-Origin, Access-Control-Allow-Credentials, Access-Control-Allow-Headers, and Access-Control-Allow-Methods based on the request origin.
* ✅ test: add test file for project creation with setup and cleanup fixtures, including logging configurations, environment setup, client creation, and project directory validation
* 🚑 fix: Remove unnecessary commented-out properties in Trial class
* 🚑 fix: Set correct WORK_DIR based on environment variable AUTORAG_WORK_DIR
* ♻️ refactor: Update code in app.py and schema.py for better handling of working directory and model configuration. Fix deprecated usage in test_app.py and enhance testing in test_trial_config.py.
* 📝 docs: update README with instructions for running using Docker Compose and monitoring options.
* ✨ feat: start parsing documents task with improved import handling
  This commit introduces changes to the document parsing task initiation. The import statement for `parse_documents` has been updated within the file. Additionally, the logic for initiating the parsing process has been streamlined and improved for better performance and handling of imports.
* ✅ test: add tests for project database operations such as initializing DB, setting/getting trials, updating trial configurations, and retrieving trial information by project or ID.
* ♻️ refactor: Improve database initialization in SQLiteProjectDB
  - Refactored the `_init_db` method to enhance database initialization.
  - Added logging and enhanced debugging statements for better clarity.
  - Now checks for the existence of the database file and its directory before initializing.
  - If the database file does not exist, it creates the necessary directory and tables.
  - Adjusted permissions for directories (777) and the database file (666) accordingly.
* 🚑 fix: correct chunking and parsing tasks in trial_tasks.py
* 🔧 chore: Update imports and debug logging level in app.py
  - Updated import statement in app.py to include chunk_documents from trial_tasks module.
  - Changed the logging level from INFO to DEBUG for more detailed logging information.
* ♻️ refactor: refactor parsing endpoint and improve error handling
  - Refactored the parsing endpoint to handle configuration data retrieval more efficiently.
  - Improved error handling to provide more informative error messages in case of missing data or failed tasks.
* 🚑 fix: Correct chunked data path and task handling in start_chunking function
* ✨ feat: Configure not to use uvloop, apply nest_asyncio, and correct import in app.py
  - Avoid using uvloop by setting asyncio event loop policy to DefaultEventLoopPolicy().
  - Apply nest_asyncio after that to prevent conflicts.
  - Change the import in app.py from `from database.project_db import SQLiteProjectDB` to the correct import.

  refactor: Update Celery configuration in celery_app.py
  - Adjust broker and backend URLs to use 'redis://redis:6379/0'.
  - Modify the timezone to 'Asia/Seoul' for better synchronization.
* 🚑 fix: Install system dependencies and pip, adjust Dockerfile for API service
  - Removed unnecessary comments related to installing pip as it's clear from the command itself
  - Added installation of 'watchfiles', setting PYTHONPATH and PYTHONUNBUFFERED environment variables
  - Created a directory for celery beat schedule and added an entrypoint script
  - Adjusted permissions for the entrypoint script and removed Windows line endings
  - Updated entrypoint to /entrypoint.sh in the API service section
  - Added environment variables for watching files, setting time zone, log level, and disabling Python output buffering
* 🔧 chore: update subproject commit reference in autorag-frontend
* 🔧 chore: add test_projects to .gitignore
* add new lines and fix .env.dev
* fix chunk_documents

---------

Co-authored-by: Seungwoo hong <Seungwoo hong [email protected]>
Co-authored-by: jeffrey <[email protected]>
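The first WORK_DIR fix in the commit message above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in app.py: the helper names `resolve_work_dir` and `env_file_path` are hypothetical, and only the `AUTORAG_API_ENV` rule ('dev' maps to `"../projects"`, anything else to `"projects"`) comes from the commit text. The later switch to `AUTORAG_WORK_DIR` is not modelled here because the message does not say how that variable is interpreted.

```python
import os
from typing import Mapping


def resolve_work_dir(environ: Mapping[str, str]) -> str:
    """Hypothetical helper: choose the projects directory from the environment."""
    # Per the commit message: 'dev' uses "../projects", otherwise "projects".
    if environ.get("AUTORAG_API_ENV") == "dev":
        return "../projects"
    return "projects"


def env_file_path(work_dir: str) -> str:
    """Hypothetical helper: the .env file lives inside the resolved WORK_DIR."""
    return os.path.join(work_dir, ".env")


# Passing a plain dict keeps the example independent of the real process environment.
work_dir = resolve_work_dir({"AUTORAG_API_ENV": "dev"})
print(work_dir, env_file_path(work_dir))
```

Taking the environment as a parameter rather than reading `os.environ` inside the function makes the rule easy to test; a real server would presumably call it with `os.environ`.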
* Change all datetime.now() to the timezone UTC
* properly working UTC timezone in the API server

---------

Co-authored-by: jeffrey <[email protected]>
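For context on the change described above: a bare `datetime.now()` returns a naive local-time value, while passing `timezone.utc` yields a timezone-aware UTC timestamp. The sketch below shows the general standard-library pattern; the helper name `utc_now` is hypothetical and the PR may apply the change differently.

```python
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Hypothetical helper: return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


now = utc_now()
# The ISO string carries an explicit +00:00 offset, so clients can parse it unambiguously.
print(now.isoformat())
```

Aware timestamps can be compared and serialized consistently regardless of the server's local time zone, which matters when the API runs inside a container with a different TZ setting.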
…py (#1005)
* ✨ feat: Add QA document generation task in trial_tasks.py and schema.py
  - Added a new field `qa_task_id` in the Trial schema to store the QA task ID.
  - Introduced `generate_qa_documents` shared task in `trial_tasks.py` for creating QA documents.
  - Updated imports and added `QACreationRequest` in `trial_tasks.py`.
  - Included function `run_qa_creation` in `generate_qa_documents` task for generating QA documents with status tracking and database updates.
* 🚑 fix: Return full trial config in get_trial_config
  Adjusts the return statement in `get_trial_config` to return the complete trial configuration instead of just the model dump.
* 🔧 chore: update subproject commit in autorag-frontend to 1434e797

---------

Co-authored-by: Seungwoo hong <Seungwoo hong [email protected]>
* Change the WORK_DIR setting
* send file directly
…id. (#1011)
* get all parsed documents and the parse is not relevant to the trial_id now
* add get chunk list at the API server
* chunk document at project view
* /parse POST with parse_name
* QA creation endpoint
* Refactor start_evaluate api endpoint
* if there is no .env, make one
* make to one api endpoint that retrieve file content /artifacts/content
* add /artifacts/content delete operation to delete the file
* upload korean filenames
* working parse with frontend
* working QA!
* validation back to normal, shout!
* checkpoint (working but no result at evaluation)
* Fix problem that trial_tasks.py cannot load the env
* Finally success!!!! Working evaluate and validate
…erver with streaming (#1021)
* working running dashboard
* working running and closing report
* working and closing the chat streamlit server
* working and closing the external api server port to 8100
* add parsed file get endpoint
* Add an "all_files" endpoint.
* change to the dynamic root directory
* enable uploading html and data file extensions
No description provided.