GitHub - Crowdliness/sheepdewg: This dewg will eventually learn how to herd the wild content from the entire crowd of AI research projects.

SheepDewg might eventually become an AI crowdshepherd which searches, connects, annotates and delivers reusable project artifacts.

Right now, SheepDewg is ONLY an early stage fork ... not a Python package for pip yet ... this cloned fork is just an IDEA for our new pup.

Quick links

To start off, you probably want to think about using AI2's Tango for your own collection of materials.

SheepDewg is in the PRE-Pre-pre-IDEATION stage ... way before becoming an alpha pup.

The eventual long-term intent of the SheepDewg project is to develop something that might become like WhiteFang, of fierce assistance on the trail, to help with the impossible task of staying current with the literature while also sniffing out the wolves and squirrels ... think of something perhaps like an RSS reader with an IPython environment for running code on your machine's GPU while furnishing some indication of how efficient a model is ... basically, just a lovable dewg for helping AI researchers stay somewhat current with the much larger crowd of people engaged in the rapidly developing world of AI research ... or even just one teeny-weeny portion of that larger world, such as the realm of transformer models.

Quick start

Create a Tango step:

# hello.py

from tango import step

@step()
def hello(name: str) -> str:
    message = f"Hello, {name}!"
    print(message)
    return message

And create a corresponding experiment configuration file:

// hello.jsonnet

{
  steps: {
    hello: {
      type: "hello",
      name: "World",
    }
  }
}

Then run the experiment using a local workspace to cache the result:

tango run hello.jsonnet -w /tmp/workspace

You'll see something like this in the output:

Starting new run expert-llama
● Starting step "hello"...
Hello, World!
✓ Finished step "hello"
✓ Finished run expert-llama

If you run this a second time the output will now look like this:

Starting new run open-crab
✓ Found output for step "hello" in cache...
✓ Finished run open-crab

You won't see "Hello, World!" this time because the result of the step was found in the cache, so it wasn't run again.
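To make the cache behavior above concrete, here is a minimal pure-Python sketch of the idea. It does not use Tango at all: `run_cached`, the pickle files, and the key format are hypothetical stand-ins invented for illustration, not Tango's actual workspace layout.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def hello(name: str) -> str:
    message = f"Hello, {name}!"
    print(message)
    return message


def run_cached(workspace: Path, step_name: str, func, **kwargs):
    """Return (result, was_cached) for func(**kwargs), reusing a stored result if present."""
    # Toy cache key (an assumption, not Tango's scheme): step name plus arguments.
    key = json.dumps({"step": step_name, "args": kwargs}, sort_keys=True)
    path = workspace / (hashlib.sha256(key.encode("utf-8")).hexdigest() + ".pkl")
    if path.exists():
        # Cache hit: load the stored result without calling the function.
        return pickle.loads(path.read_bytes()), True
    # Cache miss: run the step and store its result for next time.
    result = func(**kwargs)
    path.write_bytes(pickle.dumps(result))
    return result, False


workspace = Path(tempfile.mkdtemp())
first = run_cached(workspace, "hello", hello, name="World")   # prints "Hello, World!"
second = run_cached(workspace, "hello", hello, name="World")  # prints nothing
```

The second call returns the same message but never executes `hello`, which is why the greeting is missing from the second run's output.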

For a more detailed introduction check out the First Steps walk-through.

Installation

ai2-tango requires Python 3.8 or later.

Installing with `pip`

ai2-tango is available on PyPI. Just run

pip install ai2-tango

To install with a specific integration, such as torch for example, run

pip install 'ai2-tango[torch]'

To install with all integrations, run

pip install 'ai2-tango[all]'

Installing with `conda`

ai2-tango is available on conda-forge. You can install just the base package with

conda install tango -c conda-forge

You can pick and choose from the integrations with one of these:

conda install tango-datasets -c conda-forge
conda install tango-torch -c conda-forge
conda install tango-wandb -c conda-forge

You can also install everything:

conda install tango-all -c conda-forge

Even though ai2-tango itself is quite small, installing everything will pull in a lot of dependencies. Don't be surprised if this takes a while!

Installing from source

To install ai2-tango from source, first clone the repository:

git clone https://github.com/allenai/tango.git
cd tango

Then run

pip install -e '.[all]'

To install with only a specific integration, such as torch for example, run

pip install -e '.[torch]'

Or to install just the base tango library, you can run

pip install -e .

Checking your installation

Run

tango info

to check your installation.
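If you would rather check from inside Python, something like the following sketch may help. It uses only the standard library; the helper name `installed_version` is made up for this example, and it assumes the installed distribution is registered under the PyPI name `ai2-tango`.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# ai2-tango requires Python 3.8 or later.
python_ok = sys.version_info >= (3, 8)
# Assumption: the distribution metadata uses the PyPI name "ai2-tango".
tango_version = installed_version("ai2-tango")

print(f"Python >= 3.8: {python_ok}")
print(f"ai2-tango version: {tango_version or 'not installed'}")
```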

Docker image

You can build a Docker image suitable for tango projects by using the official Dockerfile as a starting point for your own Dockerfile, or you can simply use one of our prebuilt images as a base image in your Dockerfile. For example:

# Start from a prebuilt tango base image.
# You can choose the right tag from the available options here:
# https://github.com/allenai/tango/pkgs/container/tango/versions
FROM ghcr.io/allenai/tango:cuda11.3

# Install your project's additional requirements.
COPY requirements.txt .
RUN /opt/conda/bin/pip install --no-cache-dir -r requirements.txt

# Install source code.
# This instruction copies EVERYTHING in the current directory (build context),
# which may not be what you want. Consider using a ".dockerignore" file to
# exclude files and directories that you don't want on the image.
COPY . .

Make sure to choose the right base image for your use case depending on the version of tango you're using and the CUDA version that your host machine supports. You can see a list of all available image tags on GitHub.

FAQ

Why is the library named Sheepdewg?

It's a lovable pup. The starting motivation behind Sheepdewg, as a fork of Tango, is just to learn something about the larger process of using Python packages such as Sphinx or Jupyter to develop an AI literature classifier ... at first, we don't expect too much from the new Dewg, i.e., we are just tickled to no end to have the little guy and we have not even really started trying to teach our new herding puppy to behave himself ... but long before our Sheepdewg is useful, we will have learned more than a few things about what it's like to train a new puppy.

You probably want to start with the Tango CLI?

Think about how you can debug your own steps by running the tango command through pdb. For example:

python -m pdb -m tango run config.jsonnet

How is Tango different from Metaflow, Airflow, or redun?

We've found that existing DAG execution engines like these tools are great for production workflows but not as well suited for messy, collaborative research projects where code is changing constantly. AI2 Tango was built specifically for these kinds of research projects.

How does Tango's caching mechanism work?

AI2 Tango caches the results of steps based on the unique_id of the step. The unique_id is essentially a hash of all of the inputs to the step along with:

the step class's fully qualified name, and
the step class's VERSION class variable (an arbitrary string).

Unlike other workflow engines like redun, Tango does not take into account the source code of the class itself (other than its fully qualified name) because we've found that using a hash of the source code bytes is way too sensitive and less transparent for users. When you change the source code of your step in a meaningful way you can just manually change the VERSION class variable to indicate to Tango that the step has been updated.
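The idea can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Tango's real hashing code: `sketch_unique_id`, the JSON payload, and the `my_project.steps.Hello` name are all hypothetical.

```python
import hashlib
import json


def sketch_unique_id(qualified_name: str, version: str, inputs: dict) -> str:
    """Toy stand-in for a step's unique_id: a hash of name, VERSION, and inputs."""
    # The step's source code is deliberately NOT part of the payload.
    payload = json.dumps(
        {"name": qualified_name, "version": version, "inputs": inputs},
        sort_keys=True,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # Prefix with the short class name so ids stay readable.
    return f"{qualified_name.rsplit('.', 1)[-1]}-{digest[:16]}"


base = sketch_unique_id("my_project.steps.Hello", "001", {"name": "World"})
same = sketch_unique_id("my_project.steps.Hello", "001", {"name": "World"})
bumped = sketch_unique_id("my_project.steps.Hello", "002", {"name": "World"})
other_input = sketch_unique_id("my_project.steps.Hello", "001", {"name": "Dewg"})
```

Here `base` and `same` are identical, so a cached result would be reused, while bumping the version string or changing an input yields a different id and forces a fresh run. Editing the body of the step changes nothing unless the version is bumped as well.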

Team

SheepDewg is developed and maintained by the Crowdliness Herd, backed by Crowdliness.ORG. The mission of Crowdliness.ORG is to expand and enhance a more crowdly view of life ... to celebrate the joy of artistic, creative science by doing PRACTICAL learning projects ... that, first of all, teach us something about how the crowd is developing technology ... and, secondly, might actually inform how we contribute to humanity through free, extensible, open source development ... mostly, it's about the rambunctious, ad hoc crowdliness of open source development ... just doing it and making up the rules as we learn them is how we do social engineering ... Crowdliness is about encouraging more social experimentation, more interactive science and collaborative technological development ... the general aim of just doing it is to learn something, which might help us raise the tide of development, in terms of capability and knowledge, that can lift all boats.

License

SheepDewg is licensed under Apache 2.0.

